SV: SV: Integrating dynamic data into Lucene search/ranking

Marcus Falk Thu, 17 Jan 2008 05:27:32 -0800

In our solution we used a RAMDir for the newest incoming articles and a FSDir 
for older ones. Then we had a limit for the ramdir  like 10.000 documents when 
that limit were hit we used mergesegments to move the content from ramdir -> 
fsdir, actually we had to do some modification in the mergesegment method since 
it always seemed to do an optimize on the index after the merge, I have the 
code if u want it.

If you use RAMDir + FSDir you can use 2 indexserchers and one multisearcher on 
top. The indexsearcher that uses the small RAMDir can be rebinded quite often. 

/
Regards
M

-----Ursprungligt meddelande-----
Från: Andrzej Bialecki [mailto:[EMAIL PROTECTED] 
Skickat: den 17 januari 2008 10:55
Till: java-user@lucene.apache.org
Ämne: Re: SV: Integrating dynamic data into Lucene search/ranking

Tobias Lohr wrote:
> I'm not really sure, if this approach is possible for working in changes 
> every - let's say - 30 seconds!?

The conventional wisdom is to use RAMDirectory in such scenarios. I.e. 
you commit frequent updates to a RAMDirectory and frequently reopen its 
Searcher (which should be fast). Periodically, merge the RAMDirectory 
index with your on-disk index - you need to open a new IndexSearcher in 
the background, warm it up with the latest N queries, and when it's 
ready you swap searchers, i.e. you close the old one, purge the 
RAMDirectory (since it was synced to the on-disk index), and start using 
the new IndexSearcher.

And again, start accumulating new docs in the RAMDirectory, etc, etc ...

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

> Datum: Thu, 17 Jan 2008 05:35:13 +0100
> Von: "Marcus Falk" <[EMAIL PROTECTED]>
> An: java-user@lucene.apache.org, java-user@lucene.apache.org
> Betreff: SV: Integrating dynamic data into Lucene search/ranking

> We did this in our system, indexing a constant flow of news articles, 
> by doing as Otis described (reopened the indexsearcher)..
>  
> Every 3:d minute we are creating a new indexsearcher in the background 
> after this searcher has been created we are fireing some warm up 
> queries against it and after that we change the old searcher to point to the 
> new one.
> Works fine for us and we got large indexes (several millions of articles)... 
>  
> /Regards
> Marcus
>  
>  

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

SV: SV: Integrating dynamic data into Lucene search/ranking

Reply via email to