Re: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward compatible)

Michael Sokolov Fri, 01 Mar 2013 04:42:00 -0800

On 2/28/2013 5:05 PM, Uwe Schindler wrote:

...  Collector instead of HitCollector (like your ancient Lucene from 2.4), you have to 
respect the new semantics that are *different* to old HitCollector. Collector works with 
low-level atomic readers (also in Lucene 3.x), the calls to the "collect(int)" 
method are *not* using global document IDs, so using an IndexReader from outside does not 
work and will never work - PERIOD: The document IDs are only *relative* to the atomic 
reader that was passed to the collector by setNextReader() before a sequence of collect() 
calls. To make global docIds out of it, you may use readerContext.docBase, but this is 
slower than using the low-level atomic reader.
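To make the relative numbering concrete, here is a minimal sketch that models it in plain Java rather than with the Lucene classes themselves. The names DocBaseSketch, GlobalIdCollector, setNextSegment and collectAll are hypothetical stand-ins; only the arithmetic (global ID = docBase + segment-relative ID, with each segment's docBase equal to the total size of the segments before it) is meant to mirror what the quoted text describes.

```java
import java.util.ArrayList;
import java.util.List;

/** Models per-segment collection without Lucene: each segment has a size and a docBase. */
final class DocBaseSketch {

    /** Hypothetical stand-in for a collector that records global doc IDs. */
    static final class GlobalIdCollector {
        private int docBase; // offset of the segment currently being collected
        final List<Integer> globalIds = new ArrayList<>();

        /** Plays the role of setNextReader(): remember the new segment's offset. */
        void setNextSegment(int docBase) {
            this.docBase = docBase;
        }

        /** Plays the role of collect(int): the argument is relative to the current segment. */
        void collect(int segmentRelativeDoc) {
            globalIds.add(docBase + segmentRelativeDoc);
        }
    }

    /** Feeds the collector each segment's hits in turn and returns the global IDs. */
    static List<Integer> collectAll(int[] segmentSizes, int[][] hitsPerSegment) {
        GlobalIdCollector collector = new GlobalIdCollector();
        int docBase = 0;
        for (int seg = 0; seg < segmentSizes.length; seg++) {
            collector.setNextSegment(docBase);
            for (int doc : hitsPerSegment[seg]) {
                collector.collect(doc);
            }
            docBase += segmentSizes[seg]; // the next segment starts after this one
        }
        return collector.globalIds;
    }

    public static void main(String[] args) {
        int[] sizes = {5, 3, 4};              // three segments with 5, 3 and 4 documents
        int[][] hits = {{0, 4}, {1}, {0, 3}}; // segment-relative IDs of the matches
        System.out.println(collectAll(sizes, hits)); // prints [0, 4, 6, 8, 11]
    }
}
```

Note that the relative ID 0 appears in two different segments and maps to two different global IDs (0 and 8), which is why looking up a collected ID in an outside top-level reader goes wrong unless the offset is added first.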

Uwe, thanks for this lucid explanation! I wonder if you wouldn't mind elaborating a bit on
the slowdown you refer to from using docBase to absolutize docIDs. I have a use case where
I need to pass control to my caller, allowing them to *pull* results - so I don't know how
many I will need. In the case where documents are returned in (docID) order, the code is
actually pretty straightforward: I iterate over the atomic readers and pull results from
each in turn. Are you saying that is slower because it prevents multi-threading, or is
there some other reason?
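For concreteness, the pull pattern described above might look roughly like the following. This is only a sketch in plain Java, not the actual code and not Lucene API: PullSketch and GlobalDocIterator are hypothetical names, and the per-segment hits are assumed to be already available as ascending arrays of segment-relative IDs. The caller decides how many results to consume, and segments are visited one after another in order.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

final class PullSketch {

    /** Lazily yields global doc IDs, segment by segment, in increasing order. */
    static final class GlobalDocIterator implements Iterator<Integer> {
        private final int[] segmentSizes;
        private final int[][] hitsPerSegment; // segment-relative IDs, ascending
        private int seg = 0;     // index of the current segment
        private int pos = 0;     // position within the current segment's hits
        private int docBase = 0; // offset of the current segment

        GlobalDocIterator(int[] segmentSizes, int[][] hitsPerSegment) {
            this.segmentSizes = segmentSizes;
            this.hitsPerSegment = hitsPerSegment;
        }

        @Override
        public boolean hasNext() {
            // Skip exhausted (or empty) segments, moving docBase past each one.
            while (seg < hitsPerSegment.length && pos >= hitsPerSegment[seg].length) {
                docBase += segmentSizes[seg];
                seg++;
                pos = 0;
            }
            return seg < hitsPerSegment.length;
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return docBase + hitsPerSegment[seg][pos++];
        }
    }

    public static void main(String[] args) {
        int[] sizes = {5, 3, 4};
        int[][] hits = {{0, 4}, {1}, {0, 3}};
        Iterator<Integer> it = new GlobalDocIterator(sizes, hits);
        // The caller pulls only as many results as it wants; here, the first three.
        for (int i = 0; i < 3 && it.hasNext(); i++) {
            System.out.println(it.next()); // prints 0, then 4, then 6
        }
    }
}
```

Written this way the traversal is strictly sequential, one segment at a time, which is the property the multi-threading question is about.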


-Mike

