Re: "I think such a parameter should not exist on individual search methods since it's more of a global setting (i.e., I want my searches to be limited to 5 seconds, always, not just for a particular query). Right?"
I am not sure about this one. We had cases where one physical index served two logical indices with different requirements for clients, so having the timeout settable per Query is nice to have.

At the end of the day, with such a timeout you support Quality/Time compromise settings: "if you need all results, be ready to wait longer and set a longer timeout"; "if you need SOME results quickly, then reduce this timeout". That should ideally be the user's decision.

________________________________
From: Shai Erera <ser...@gmail.com>
To: java-dev@lucene.apache.org
Sent: Wednesday, 24 June, 2009 10:55:50
Subject: Re: Improving TimeLimitedCollector

But TimeLimitingCollector's logic is coded in its collect() method. The top scorer calls nextDoc() or advance() on all its sub-scorers, and only when a match is found does it call collect(). If we want the sub-scorers to check whether they should abort, we'd need to revamp (liked the word :)) TimeLimitingCollector into something like the CheckAbort that SegmentMerger uses. I.e., the top scorer would pass such an instance to its sub-scorers, which would call TimeLimit.check() or something similar, and if the time limit has expired this call would throw a TimeExceededException (like TLC).

We can enable this by adding another parameter to IndexSearcher that says whether searches should be limited by time, and what the time limit is. It would then instantiate that object and pass it to its Scorer and so on. I think such a parameter should not exist on individual search methods since it's more of a global setting (i.e., I want my searches to be limited to 5 seconds, always, not just for a particular query). Right?

Another option would be to add a setTimeout method on Query, which would use it when it constructs its Scorer. The shortcoming of this is that if I want to use someone else's query which did not implement setTimeout, then I'll need to build a TimeOutQueryWrapper that wraps a Query and implements the timeout logic, but that gets complicated.

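
A minimal sketch of the TimeLimit.check() idea described above may help make it concrete. Only the names TimeLimit, check() and TimeExceededException come from the message; every class below, including ArraySubScorer, is a hypothetical stand-in written for illustration and is not Lucene API.

```java
// Hypothetical sketch: none of these classes are Lucene API.

/** Thrown when the time budget is used up (name borrowed from the thread). */
class TimeExceededException extends RuntimeException {
    TimeExceededException(long allowedMillis, long elapsedMillis) {
        super("Elapsed " + elapsedMillis + " ms, allowed " + allowedMillis + " ms");
    }
}

/** A time budget that a top scorer could hand down to its sub-scorers. */
class TimeLimit {
    private final long startNanos;
    private final long allowedMillis;

    TimeLimit(long allowedMillis) {
        this.allowedMillis = allowedMillis;
        this.startNanos = System.nanoTime();
    }

    /** Throws TimeExceededException once more than allowedMillis have elapsed. */
    void check() {
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        if (elapsedMillis > allowedMillis) {
            throw new TimeExceededException(allowedMillis, elapsedMillis);
        }
    }
}

/** Stand-in for a sub-scorer: iterates over a sorted array of doc ids. */
class ArraySubScorer {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs;
    private final TimeLimit limit;
    private int pos = -1;

    ArraySubScorer(int[] docs, TimeLimit limit) {
        this.docs = docs;
        this.limit = limit;
    }

    /** Checks the time budget before advancing, so no match is needed to abort. */
    int nextDoc() {
        limit.check();
        pos++;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }
}
```

Because the check happens inside nextDoc() rather than in collect(), a slow sub-scorer that produces no matches would still be interrupted in this sketch. How often a real scorer should call check() is a separate cost question that the sketch does not address.
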
I think the Collector approach makes the most sense to me, since it's the only object I fully control in the search process. I cannot control Query implementations, and I cannot control the decisions made by IndexSearcher. But I can always wrap someone else's Collector with TLC and pass it to search().

Shai

On Wed, Jun 24, 2009 at 12:26 AM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:

As we're revamping collectors, weights, and scorers, perhaps we can push time limiting into the individual subscorers? Currently on a boolean query, we're timing out the query at the top level, which doesn't work well if the subqueries exceed the time limit.
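
For comparison, the Collector-wrapping approach discussed in this thread can be sketched roughly as follows. The types here (DocCollector, TimeLimitedDocCollector, SearchTimeoutException, CountingCollector) are hypothetical and only imitate the shape of a collector with a collect(int) method; they are not Lucene's Collector or TimeLimitingCollector.

```java
// Hypothetical sketch: these types only imitate the shape of a Lucene collector.

/** Minimal stand-in for a collector that receives matching doc ids. */
interface DocCollector {
    void collect(int doc);
}

/** Thrown by the wrapper when the time budget has been exceeded. */
class SearchTimeoutException extends RuntimeException {
    SearchTimeoutException(String message) {
        super(message);
    }
}

/** Wraps any DocCollector and enforces a time budget on each collect() call. */
class TimeLimitedDocCollector implements DocCollector {
    private final DocCollector delegate;
    private final long timeoutMillis;
    private final long startNanos;

    TimeLimitedDocCollector(DocCollector delegate, long timeoutMillis) {
        this.delegate = delegate;
        this.timeoutMillis = timeoutMillis;
        this.startNanos = System.nanoTime();
    }

    @Override
    public void collect(int doc) {
        // The check runs only when a match arrives, which is the limitation
        // raised in the thread: time spent finding no matches is never checked.
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        if (elapsedMillis > timeoutMillis) {
            throw new SearchTimeoutException(
                "Elapsed " + elapsedMillis + " ms, allowed " + timeoutMillis + " ms");
        }
        delegate.collect(doc);
    }
}

/** Example delegate: counts how many docs were collected. */
class CountingCollector implements DocCollector {
    int count;

    @Override
    public void collect(int doc) {
        count++;
    }
}
```

Since the wrapper is created per search, each query can be given its own timeout without any change to Query or IndexSearcher, at the cost of only noticing the timeout when a match is collected.
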