Thanks for your hint. If possible, I would like to take a look at the code;
the approach is interesting.

What would you say to this approach I have in mind:

- Having an additional, much smaller index where only the dynamic data resides,
refreshed every N seconds with incremental index updates.
- Documents in the additional index carry the same semantic "id" field, to
model a relation between the two indexes.
- A search actually runs against the index containing the searchable content,
but the sorting/ranking is done with a SortComparatorSource, which "extracts"
the dynamic information and calculates the score for the documents of the
content index (see the sketch below).
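
To make the last point concrete, here is a rough sketch of what I mean,
written against the current SortComparatorSource/ScoreDocComparator API.
The class name and the dynamicScores map (id -> current dynamic value,
rebuilt from the small index every N seconds) are only placeholders of
mine, and it assumes the "id" field is indexed as a single token so it
can be loaded through the FieldCache:

import java.io.IOException;
import java.util.Map;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.ScoreDocComparator;
import org.apache.lucene.search.SortComparatorSource;
import org.apache.lucene.search.SortField;

public class DynamicScoreComparatorSource implements SortComparatorSource {

    // id -> current dynamic value, swapped in after each refresh
    private final Map<String, Float> dynamicScores;

    public DynamicScoreComparatorSource(Map<String, Float> dynamicScores) {
        this.dynamicScores = dynamicScores;
    }

    public ScoreDocComparator newComparator(IndexReader reader, String fieldname)
            throws IOException {
        // load the "id" field of the content index once per reader
        final String[] ids = FieldCache.DEFAULT.getStrings(reader, fieldname);

        return new ScoreDocComparator() {
            public int compare(ScoreDoc a, ScoreDoc b) {
                // descending by dynamic value
                return Float.compare(valueFor(b), valueFor(a));
            }
            public Comparable sortValue(ScoreDoc d) {
                return new Float(valueFor(d));
            }
            public int sortType() {
                return SortField.CUSTOM;
            }
            private float valueFor(ScoreDoc d) {
                Float v = dynamicScores.get(ids[d.doc]);
                return v == null ? 0f : v.floatValue();
            }
        };
    }
}

It would be used roughly like this, so the content index itself never has
to be touched when the dynamic values change, only the map:

searcher.search(query,
    new Sort(new SortField("id", new DynamicScoreComparatorSource(currentScores))));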

What do you say?

-------- Original Message --------
> Date: Thu, 17 Jan 2008 14:26:53 +0100
> From: "Marcus Falk" <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Subject: SV: SV: Integrating dynamic data into Lucene search/ranking

> In our solution we used a RAMDir for the newest incoming articles and an
> FSDir for older ones. Then we had a limit for the RAMDir, like 10,000
> documents; when that limit was hit we used mergeSegments to move the content
> from the RAMDir to the FSDir. Actually we had to do some modification in the
> mergeSegments method, since it always seemed to do an optimize on the index
> after the merge. I have the code if you want it.
> 
> If you use a RAMDir + FSDir you can use two IndexSearchers and one
> MultiSearcher on top. The IndexSearcher that uses the small RAMDir can be
> rebound quite often.
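
(Just to check that I read this part correctly: something along these
lines? This is only my reading of it against the current API; the path is
a placeholder, both directories would already have to contain an index
built by an IndexWriter, and exception handling is omitted.)

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

RAMDirectory ramDir = new RAMDirectory();                        // newest articles
FSDirectory fsDir = FSDirectory.getDirectory("/path/to/index");  // older articles

IndexSearcher ramSearcher = new IndexSearcher(ramDir);  // rebound frequently
IndexSearcher fsSearcher  = new IndexSearcher(fsDir);   // long-lived
MultiSearcher searcher =
    new MultiSearcher(new Searchable[] { ramSearcher, fsSearcher });

// whenever the RAM index has changed, only the small searcher is rebuilt
ramSearcher.close();
ramSearcher = new IndexSearcher(ramDir);
searcher = new MultiSearcher(new Searchable[] { ramSearcher, fsSearcher });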
> 
> /
> Regards
> M
> 
> 
> -----Original Message-----
> From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] 
> Sent: 17 January 2008 10:55
> To: java-user@lucene.apache.org
> Subject: Re: SV: Integrating dynamic data into Lucene search/ranking
> 
> Tobias Lohr wrote:
> > I'm not really sure if this approach is feasible when changes come in
> > every - let's say - 30 seconds!?
> 
> The conventional wisdom is to use RAMDirectory in such scenarios. I.e. 
> you commit frequent updates to a RAMDirectory and frequently reopen its 
> Searcher (which should be fast). Periodically, merge the RAMDirectory 
> index with your on-disk index - you need to open a new IndexSearcher in 
> the background, warm it up with the latest N queries, and when it's 
> ready you swap searchers, i.e. you close the old one, purge the 
> RAMDirectory (since it was synced to the on-disk index), and start using 
> the new IndexSearcher.
> 
> And again, start accumulating new docs in the RAMDirectory, etc, etc ...
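
(My rough understanding of that merge / background warm-up / swap, as a
sketch against the 2.3 API. warmupQueries stands for whatever record of
recent queries one keeps, the caller purges the RAMDirectory afterwards,
and note that addIndexes(Directory[]) also optimizes the target index,
which matches what Marcus observed with mergeSegments.)

import java.io.IOException;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

// merge the RAM index to disk, warm a new searcher in the background,
// then swap it in; returns the searcher that queries should use from now on
static IndexSearcher mergeAndSwap(RAMDirectory ramDir, FSDirectory fsDir,
                                  IndexSearcher oldSearcher,
                                  List<Query> warmupQueries) throws IOException {
    // merge the RAMDirectory index into the on-disk index
    IndexWriter fsWriter = new IndexWriter(fsDir, new StandardAnalyzer(), false);
    fsWriter.addIndexes(new Directory[] { ramDir });
    fsWriter.close();

    // open the new searcher in the background and warm it with the latest N queries
    IndexSearcher fresh = new IndexSearcher(fsDir);
    for (Query q : warmupQueries) {
        fresh.search(q, null, 10);
    }

    // swap: close the old searcher; the caller then purges the RAMDirectory
    // (e.g. replaces it with a fresh one) and keeps accumulating new docs there
    oldSearcher.close();
    return fresh;
}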
> 
> -- 
> Best regards,
> Andrzej Bialecki     <><
>   ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
> 
> > Date: Thu, 17 Jan 2008 05:35:13 +0100
> > From: "Marcus Falk" <[EMAIL PROTECTED]>
> > To: java-user@lucene.apache.org, java-user@lucene.apache.org
> > Subject: SV: Integrating dynamic data into Lucene search/ranking
> 
> > We did this in our system, indexing a constant flow of news articles,
> > by doing as Otis described (reopening the IndexSearcher).
> >
> > Every third minute we create a new IndexSearcher in the background.
> > After this searcher has been created we fire some warm-up queries
> > against it, and after that we switch the old searcher to point to
> > the new one.
> > It works fine for us, and we have large indexes (several million
> > articles).
> >  
> > /Regards
> > Marcus
> >  
> >  
> 