Hi James,

Yes, this makes sense. I've recommended the same approach to others before. It would be good to have this as part of Solr. One person (Jason) is working on adding more real-time search support to both Lucene and Solr.
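
For what it's worth, here is a minimal sketch of the pattern you describe: writes go to a small in-memory index, searches consult both tiers, and a periodic flush merges the in-memory tier into the main index. It is a toy model in plain Python, not Lucene or Solr code. `TieredIndex`, `ram`, `disk`, and `flush` are hypothetical names, and the two dictionaries only stand in for the RAMDirectory and FSDirectory indexes.

```python
from collections import defaultdict


class TieredIndex:
    """Toy model of a small in-memory index in front of a larger main index.

    Hypothetical class for illustration only; it does not use Lucene or Solr.
    """

    def __init__(self):
        self.ram = defaultdict(set)   # stands in for the RAMDirectory index
        self.disk = defaultdict(set)  # stands in for the FSDirectory index

    def add(self, doc_id, text):
        # Writes go to the in-memory tier only, so they are searchable at once.
        for term in text.lower().split():
            self.ram[term].add(doc_id)

    def search(self, term):
        # Searches are forked to both tiers and the results are merged.
        term = term.lower()
        return sorted(self.ram.get(term, set()) | self.disk.get(term, set()))

    def flush(self):
        # Merge the in-memory tier into the main index, then empty it.
        # This is the step the proposal assigns to IndexWriter.addIndexes.
        for term, doc_ids in self.ram.items():
            self.disk[term] |= doc_ids
        self.ram.clear()


index = TieredIndex()
index.add(1, "fresh data")
print(index.search("fresh"))  # document 1 is visible before any flush
index.flush()
index.add(2, "more fresh data")
print(index.search("fresh"))  # results come from both tiers
```

The sketch ignores deletes and updates. A real implementation would have to handle a document that exists in both tiers, and would need to keep searches consistent while a flush is in progress.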

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: James Brady <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, June 11, 2008 11:24:38 PM
> Subject: Strategy for presenting fresh data
>
> Hi,
> The product I'm working on requires new documents to be searchable
> very quickly (inside 60 seconds is my goal). The corpus is also going
> to grow very large, although it is perfectly partitionable by user.
>
> The approach I tried first was to have write-only masters and
> read-only slaves with data being replicated from one to another
> postCommit and postOptimise.
>
> This allowed new documents to be visible inside 5 minutes or so (until
> the indexes got so large that re-opening IndexSearchers took for ever,
> that is...), but still not good enough.
>
> Now, I am considering cutting out the commit / replicate / re-open
> cycle by augmenting Solr with a RAMDirectory per core.
>
> Your thoughts on the following approach would be much appreciated:
>
> Searches would be forked to both the RAMDirectory and FSDirectory,
> while writes would go to the RAMDirectory only. The RAMDirectory would
> be flushed back to the FSDirectory regularly, using
> IndexWriter.addIndexes (or addIndexesNoOptimise).
>
> Effectively, I'd be creating a searchable queue in front of a
> regularly committed and optimised conventional index.
>
> As this seems to be a useful pattern (and is mentioned tangentially in
> Lucene in Action), is there already support for this in Lucene?
>
> Thanks,
> James