[
https://issues.apache.org/jira/browse/LUCENE-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17530000#comment-17530000
]
Greg Miller commented on LUCENE-10544:
--------------------------------------
{quote}In my opinion, a better solution that has less overhead and would still
support cancelling such slow queries consists of leveraging
{{BulkScorer#score}} to score small-ish ranges of doc IDs at a time.
{quote}
+1. As a short-term solution, we've had success with a "timeout enforcing"
Query that enforces the timeout within the Scorer it provides, but this
approach has a number of flaws. Hooking into the BulkScorer makes sense, but it
needs some thought, as [~dpsharma] mentions, since Queries may (and do!)
provide their own BulkScorers in some cases (e.g., {{BooleanScorer}}).
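To make the quoted suggestion concrete, here is a minimal, Lucene-free sketch of scoring small doc ID windows and checking for a timeout between windows. Everything in it is an assumption for illustration: {{RangeScorer}} only imitates the shape of {{BulkScorer#score}} (score docs in [min, max), return the next candidate doc), and {{WindowedTimeoutScorer}}, {{SortedDocsScorer}}, {{ScoringTimeoutException}} and the window size are hypothetical names, not existing Lucene classes.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a bulk scorer; NOT Lucene's BulkScorer class.
// Contract assumed here: collect matches in [min, max) and return the next
// matching doc at or after max, or NO_MORE_DOCS when exhausted.
interface RangeScorer {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    int score(List<Integer> collected, int min, int max);
}

// Toy scorer over a sorted array of matching doc IDs.
final class SortedDocsScorer implements RangeScorer {
    private final int[] docs; // must be sorted ascending

    SortedDocsScorer(int[] docs) {
        this.docs = docs;
    }

    @Override
    public int score(List<Integer> collected, int min, int max) {
        for (int doc : docs) {
            if (doc >= max) {
                return doc; // first match beyond the current window
            }
            if (doc >= min) {
                collected.add(doc);
            }
        }
        return NO_MORE_DOCS;
    }
}

// Hypothetical exception type for this sketch.
final class ScoringTimeoutException extends RuntimeException {
    ScoringTimeoutException(String message) {
        super(message);
    }
}

// Wraps any RangeScorer and checks the timeout once per window, so the check
// overhead is paid per window rather than per document.
final class WindowedTimeoutScorer {
    private final RangeScorer in;
    private final BooleanSupplier shouldExit;
    private final int windowSize;

    WindowedTimeoutScorer(RangeScorer in, BooleanSupplier shouldExit, int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.in = in;
        this.shouldExit = shouldExit;
        this.windowSize = windowSize;
    }

    void scoreAll(List<Integer> collected) {
        int min = 0;
        while (min != RangeScorer.NO_MORE_DOCS) {
            if (shouldExit.getAsBoolean()) {
                throw new ScoringTimeoutException("timed out before doc " + min);
            }
            // Use long arithmetic so min + windowSize cannot overflow.
            int max = (int) Math.min((long) min + windowSize, RangeScorer.NO_MORE_DOCS);
            min = in.score(collected, min, max);
        }
    }
}
```

Because the wrapper delegates each window to the wrapped scorer, a query-specific bulk scorer would still do the actual scoring in this model; only the outer loop changes. Docs collected before the timeout remain in the list, so a caller has to decide whether partial results are acceptable.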
{quote}Long-term I'd like ExitableDirectoryReader and other tooling to handle
cancellation/timeout to become mostly implementation details, and have proper
support directly on IndexSearcher (LUCENE-10151).
{quote}
+1. For full disclosure, [~dpsharma] and I work together at Amazon and she is
working on LUCENE-10151. One idea is to use {{ExitableDirectoryReader}} as an
internal implementation detail of {{IndexSearcher}} to add first-class timeout
support. While we were debugging some prototype code, we ran into this issue
with {{ExitableDirectoryReader}} and I thought it warranted a spin-off issue
since it seems like something we might want to generally fix.
> Should ExitableTermsEnum wrap postings and impacts?
> ---------------------------------------------------
>
> Key: LUCENE-10544
> URL: https://issues.apache.org/jira/browse/LUCENE-10544
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/index
> Reporter: Greg Miller
> Priority: Major
>
> While looking into options for LUCENE-10151, I noticed that
> {{ExitableDirectoryReader}} doesn't actually do any timeout checking once you
> start iterating postings/impacts. It *does* create an {{ExitableTermsEnum}}
> wrapper when loading a {{TermsEnum}}, but that wrapper doesn't do
> anything to wrap postings or impacts. So timeouts will be enforced when
> moving to the "next" term, but not when iterating the postings/impacts
> associated with a term.
> I think we ought to wrap the postings/impacts as well with some form of
> timeout checking so timeouts can be enforced on long-running queries. I'm not
> sure why this wasn't done originally (back in 2014), but it was questioned
> back in 2020 on the original Jira SOLR-5986. Does anyone know of a good
> reason why we shouldn't enforce timeouts in this way?
> Related, we may also want to wrap things like {{seekExact}} and {{seekCeil}}
> given that only {{next}} is being wrapped currently.
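
The wrapping proposed in the issue description can be sketched without any Lucene dependency as follows. This is an illustrative model only: {{DocIterator}} imitates the {{nextDoc}}/{{advance}} shape of a postings iterator, the {{BooleanSupplier}} stands in for a timeout check such as {{QueryTimeout#shouldExit}}, and {{ExitableDocIterator}}, {{ArrayDocIterator}}, {{PostingsTimeoutException}} and the {{checkEvery}} sampling interval are hypothetical names and choices, not part of {{ExitableDirectoryReader}}.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a postings iterator; NOT a Lucene interface.
interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    int nextDoc();

    int advance(int target);
}

// Toy iterator over a sorted array of doc IDs.
final class ArrayDocIterator implements DocIterator {
    private final int[] docs; // must be sorted ascending
    private int idx = -1;

    ArrayDocIterator(int[] docs) {
        this.docs = docs;
    }

    @Override
    public int nextDoc() {
        if (idx < docs.length) {
            idx++;
        }
        return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    @Override
    public int advance(int target) {
        int doc;
        do {
            doc = nextDoc();
        } while (doc < target);
        return doc;
    }
}

// Hypothetical exception type for this sketch.
final class PostingsTimeoutException extends RuntimeException {
    PostingsTimeoutException(String message) {
        super(message);
    }
}

// Delegating wrapper that checks the timeout on every checkEvery-th call, so
// iteration over a single long postings list can be interrupted. Sampling is
// an assumption made here to keep the per-document overhead small.
final class ExitableDocIterator implements DocIterator {
    private final DocIterator in;
    private final BooleanSupplier shouldExit;
    private final int checkEvery;
    private int calls;

    ExitableDocIterator(DocIterator in, BooleanSupplier shouldExit, int checkEvery) {
        if (checkEvery <= 0) {
            throw new IllegalArgumentException("checkEvery must be positive");
        }
        this.in = in;
        this.shouldExit = shouldExit;
        this.checkEvery = checkEvery;
    }

    @Override
    public int nextDoc() {
        checkTimeout();
        return in.nextDoc();
    }

    @Override
    public int advance(int target) {
        checkTimeout();
        return in.advance(target);
    }

    private void checkTimeout() {
        if (calls++ % checkEvery == 0 && shouldExit.getAsBoolean()) {
            throw new PostingsTimeoutException("timed out while iterating postings");
        }
    }
}
```

The same delegation pattern would apply to {{seekExact}} and {{seekCeil}} on the terms side: check first, then forward the call. The trade-off is that a larger {{checkEvery}} lowers overhead but delays the point at which a timeout is noticed.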