Thank you, this looks exactly like what I need!

-Tod

On Aug 18, 2017, at 1:42 AM, Adrien Grand <jpou...@gmail.com> wrote:

You could write a collector wrapper (have a look at FilterCollector maybe) that 
throws a CollectionTerminatedException whenever more than X hits have been 
collected in total. It will likely stop in the middle of the first segment, and 
then terminate each remaining segment before anything is collected from it.

FYI you can throw a CollectionTerminatedException not only from the collect 
method but also from the getLeafCollector method, which lets you skip a 
segment entirely before even starting to look for matches.

We have such a collector in Elasticsearch, feel free to copy-paste it and adapt 
it to your needs if you want. It is licensed under ASL2: 
https://github.com/elastic/elasticsearch/blob/36a5cf8f35e5cbaa1ff857b5a5db8c02edc1a187/core/src/main/java/org/elasticsearch/search/query/EarlyTerminatingCollector.java
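
For illustration, here is a minimal sketch of the control flow described above. 
It is not the Elasticsearch collector and it does not use Lucene itself: 
CollectionTerminatedException, LeafCollector and Collector below are simplified 
stand-ins for the Lucene types of the same name, and LimitingCollector, search 
and the segment arrays are hypothetical names made up for this example. The 
search loop assumes the searcher catches the exception per segment and moves 
on to the next one, which is the behaviour the thread describes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: the nested types are simplified stand-ins for Lucene's classes,
// so that the example compiles and runs without Lucene on the classpath.
class EarlyTerminationSketch {

    /** Stand-in for Lucene's CollectionTerminatedException. */
    static class CollectionTerminatedException extends RuntimeException {}

    /** Stand-in for LeafCollector: receives the matching doc ids of one segment. */
    interface LeafCollector {
        void collect(int doc);
    }

    /** Stand-in for Collector: hands out one LeafCollector per segment. */
    interface Collector {
        LeafCollector getLeafCollector(int segment);
    }

    /** Hypothetical wrapper that stops once maxHits docs were collected in total. */
    static class LimitingCollector implements Collector {
        private final Collector in;
        private final int maxHits;
        private int collected = 0;

        LimitingCollector(Collector in, int maxHits) {
            this.in = in;
            this.maxHits = maxHits;
        }

        @Override
        public LeafCollector getLeafCollector(int segment) {
            if (collected >= maxHits) {
                // Limit already reached: skip this segment before matching starts.
                throw new CollectionTerminatedException();
            }
            LeafCollector leaf = in.getLeafCollector(segment);
            return doc -> {
                leaf.collect(doc);
                if (++collected >= maxHits) {
                    // Limit reached in the middle of a segment: stop collecting it.
                    throw new CollectionTerminatedException();
                }
            };
        }
    }

    /**
     * Mimics the per-segment loop of a searcher: the exception ends the
     * current segment only, and the loop then moves on to the next one.
     * Each inner array holds the matching doc ids of one segment.
     */
    static List<Integer> search(int[][] segments, int maxHits) {
        List<Integer> hits = new ArrayList<>();
        Collector sink = segment -> doc -> hits.add(doc);
        Collector limited = new LimitingCollector(sink, maxHits);
        for (int segment = 0; segment < segments.length; segment++) {
            try {
                LeafCollector leaf = limited.getLeafCollector(segment);
                for (int doc : segments[segment]) {
                    leaf.collect(doc);
                }
            } catch (CollectionTerminatedException e) {
                // Expected: this segment is done, continue with the next one.
            }
        }
        return hits;
    }
}
```

With three segments holding docs {0,1,2}, {3,4} and {5} and a limit of 4, the 
second segment is cut short after doc 3 and the third is skipped from 
getLeafCollector, so the hits collected so far are still returned. With real 
Lucene the same shape would presumably extend FilterCollector, as suggested 
above.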

On Thu, Aug 17, 2017 at 9:46 PM, Tod Olson <t...@uchicago.edu> wrote:
Hi everyone,

I'm modifying an existing application, which uses a Lucene SimpleCollector to 
return document ids and some other fields from a search. For various reasons, 
we now want to place an upper bound on the number of documents actually 
collected.

Is there a reasonable way to put a limit on the results returned by a 
SimpleCollector? Or do I need to change Collectors?

Based on the docs, I could keep a counter and raise a 
CollectionTerminatedException after N documents, but then the search moves on 
to the next leaf. I'd like to have the entire search terminate and return the 
collected documents.

Any assistance for a Lucene novice is greatly appreciated!

-Tod


Tod Olson <t...@uchicago.edu>
Systems Librarian
Interim Director for Integrated Library Systems
University of Chicago Library

