While reviewing Lucene 9.5 changes coming into Solr, I noticed some changes
relating to the ability to specify timeAllowed in Solr on a search query.
Solr uses both ExitableDirectoryReader and TimeLimitingCollector from
Lucene for this (complementary to each other).  Unfortunately, changes in
Lucene will make the cost of ExitableDirectoryReader wrapping happen for
all queries into Solr, even those not using timeAllowed.  Options to keep
EDR aren't good -- fork it basically.  Anecdotally, I think I've heard the
overhead is not trivial and my intuition thinks likewise.  Meanwhile,
Lucene 9.3 added a new TimeLimitingBulkScorer which even gets first class
integration into IndexSearcher which has a timeout.  It's been
incrementally improved, and I really like its approach, probable
performance, and simplicity.  It should be straightforward to integrate
this into SolrIndexSearcher and also only do so for queries specifying
timeAllowed.  I'm not sure TimeLimitingCollector offers much value to using
TLBS other than additional precision on timeAllowed at some cost to
unselective queries.

I think doing this should block Solr 9.2 using Lucene 9.5.  Alternatively,
someone might benchmark the state of things and see that things aren't so
bad as they may seem.  But that takes work too.

[1] QueryTimeout.isTimeoutEnabled is gone:
https://github.com/apache/lucene/pull/11954
[2] TimeLimitingBulkScorer in LUCENE-10151
https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/TimeLimitingBulkScorer.java

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

Reply via email to