Re: Strategy for presenting fresh data

Otis Gospodnetic Wed, 11 Jun 2008 20:50:30 -0700

Hi James,

Yes, this makes sense.  I've recommended doing the same to others before.  It 
would be good to have this be a part of Solr.  There is one person (named 
Jason) working on adding more real-time search support to both Lucene and Solr.



Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: James Brady <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, June 11, 2008 11:24:38 PM
> Subject: Strategy for presenting fresh data
> 
> Hi,
> The product I'm working on requires new documents to be searchable  
> very quickly (inside 60 seconds is my goal). The corpus is also going  
> to grow very large, although it is perfectly partitionable by user.
> 
> The approach I tried first was to have write-only masters and
> read-only slaves, with data being replicated from master to slave on
> postCommit and postOptimize.
> 
> This allowed new documents to be visible inside 5 minutes or so (until
> the indexes got so large that re-opening IndexSearchers took forever,
> that is...), but that is still not good enough.
> 
> Now, I am considering cutting out the commit / replicate / re-open  
> cycle by augmenting Solr with a RAMDirectory per core.
> 
> Your thoughts on the following approach would be much appreciated:
> 
> Searches would be forked to both the RAMDirectory and FSDirectory,  
> while writes would go to the RAMDirectory only. The RAMDirectory would  
> be flushed back to the FSDirectory regularly, using
> IndexWriter.addIndexes (or addIndexesNoOptimize).
> 
> Effectively, I'd be creating a searchable queue in front of a  
> regularly committed and optimised conventional index.
> 
> As this seems to be a useful pattern (and is mentioned tangentially in  
> Lucene in Action), is there already support for this in Lucene?
> 
> Thanks,
> James
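
The pattern James describes (a searchable queue in front of the main index) can be sketched roughly as below. This is a toy model in plain Java, not Lucene or Solr code: the class BufferedIndex, its methods, and the substring matching are all hypothetical stand-ins. Two in-memory maps play the roles of the RAMDirectory and the FSDirectory-backed index, and flush() stands in for the periodic IndexWriter.addIndexes call.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy model only: plain maps stand in for Lucene directories (hypothetical names). */
class BufferedIndex {
    // Stand-in for the large, regularly committed FSDirectory index: id -> text.
    private final Map<String, String> mainIndex = new LinkedHashMap<>();
    // Stand-in for the small RAMDirectory that receives all new writes.
    private final Map<String, String> ramBuffer = new LinkedHashMap<>();

    /** Writes go to the in-memory buffer only, so they are searchable at once. */
    public synchronized void add(String id, String text) {
        ramBuffer.put(id, text);
    }

    /** Searches are forked to both stores; a buffered copy shadows an older main copy. */
    public synchronized List<String> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : mainIndex.entrySet()) {
            boolean shadowed = ramBuffer.containsKey(e.getKey());
            if (!shadowed && matches(e.getValue(), needle)) {
                hits.add(e.getKey());
            }
        }
        for (Map.Entry<String, String> e : ramBuffer.entrySet()) {
            if (matches(e.getValue(), needle)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    /** Moves buffered documents into the main index; returns how many were moved. */
    public synchronized int flush() {
        int moved = ramBuffer.size();
        mainIndex.putAll(ramBuffer);
        ramBuffer.clear();
        return moved;
    }

    /** Number of documents currently waiting in the buffer. */
    public synchronized int bufferedCount() {
        return ramBuffer.size();
    }

    // Naive case-insensitive substring match, standing in for a real query.
    private static boolean matches(String text, String lowerCaseNeedle) {
        return text.toLowerCase(Locale.ROOT).contains(lowerCaseNeedle);
    }

    public static void main(String[] args) {
        BufferedIndex index = new BufferedIndex();
        index.add("doc1", "Fresh data should be searchable quickly");
        System.out.println(index.search("fresh")); // found before any flush
        index.flush();
        System.out.println(index.search("fresh")); // still found after the flush
    }
}
```

In a real setup the flush would run on a timer or once the buffer reaches some size, and the choice of interval trades buffer memory against how often the large index has to be reopened.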
