: Nutch recently added a search query timeout (NUTCH-308). Are there any : plans to add such functionality to the Lucene HitCollector directly? Or : is there some reason that this is a bad idea?
Quickly skimming the patch in that Issue, Nutch seems to have done what has been discussed previously on this list: using a HitCollector which throws a RuntimeException if a certain amount of time has elapsed. this is something anyone using the Lucene API can do as long as they use a HitCollector ... the Nutch impl seems to ctually spin up a seperate thread for each request rather then comparing timestamps after each doc is colelcted (and interesting choice that both frightens and excites me) a HitCollector like this could easily be "promoted" up in the Lucene code base ... the real question is would we want timeout info like this to be exposed in some of the simpler Searcher APIs (ie: that return TopDocs or Hits) and if so how do you signal that it's only a partial result set? : I'm using Solr which doesn't seem to support search timeouts. It seems : that it would make sense to add the feature at the Lucene level rather : than implement the feature in each derivative. even if it were added to the Lucene core, some fairly heavy questions would have to be answered before encorporating this into Solr: mainly what do do about Caching ... do you cache the partial results? do you attempt to continue searching after returning the partial results? the next time a request comes in and partial results are cached, do you try to pick up where you left off since you've got a threshold of time available? because of these issues, Solr may need a custom solution to this situation. -Hoss --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]