[ https://issues.apache.org/jira/browse/LUCENE-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819158#comment-15819158 ]
Paul Elschot edited comment on LUCENE-7628 at 1/11/17 9:09 PM:
---------------------------------------------------------------
What if there were also something like a SpanAndMergeQuery that merges the
positions of its Spans when all of them are present in a document?
It could use an AndMergeSpans implemented as a subclass of the
DisjunctionSpans above.
> Add a getMatchingChildren() method to DisjunctionScorer
> -------------------------------------------------------
>
> Key: LUCENE-7628
> URL: https://issues.apache.org/jira/browse/LUCENE-7628
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Alan Woodward
> Assignee: Alan Woodward
> Priority: Minor
>
> This one is a bit convoluted, so bear with me...
> The luwak highlighter works by rewriting queries into their Span-equivalents,
> and then running them with a special Collector. At each matching doc, the
> highlighter gathers all the Spans objects positioned on the current doc and
> collects their positions using the SpanCollection API.
> Some queries can't be translated into Spans. For those queries that generate
> Scorers with ChildScorers, like BooleanQuery, we can call .getChildren() on
> the Scorer and see if any of them are SpanScorers, and for those that aren't
> we can call .getChildren() again and recurse down. For each child scorer, we
> check that it's positioned on the current document, so non-matching
> subscorers can be skipped.
> This all works correctly *except* in the case of a DisjunctionScorer where
> one of the children uses a two-phase iterator that has matched its
> approximation but not its refinement query. A SpanScorer in this situation
> will be correctly positioned on the current document, but its Spans will be
> in an undefined state, so the highlighter will either collect incorrect
> hits or throw an Exception that prevents hits from being collected from
> other subspans.
> We've tried various ways around this (including forking SpanNearQuery and
> adding a bunch of slow position checks to it that are used only by the
> highlighting code), but it turns out that the simplest fix is to add a new
> method to DisjunctionScorer that only returns the currently matching child
> Scorers. It's a bit of a hack, and it won't be used anywhere else, but it's
> a fairly small and contained hack.
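
To make the failure mode described above concrete, here is a minimal sketch in plain Java. It does not use Lucene: ToyScorer, collectNaive, collectMatching and example are hypothetical names invented for this illustration, and getMatchingChildren(int) only imitates the idea behind the proposed method, not its actual signature. Each toy scorer records the document its approximation is positioned on and whether the refinement step confirmed the match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: these types imitate the shape of a scorer tree and are NOT Lucene classes. */
class MatchingChildrenSketch {

    /** Hypothetical stand-in for a Scorer; leaves play the role of SpanScorers. */
    static final class ToyScorer {
        final String name;
        final int docID;          // document the approximation is positioned on
        final boolean confirmed;  // false: approximation matched, refinement did not
        final List<ToyScorer> children;

        ToyScorer(String name, int docID, boolean confirmed, ToyScorer... children) {
            this.name = name;
            this.docID = docID;
            this.confirmed = confirmed;
            this.children = Arrays.asList(children);
        }

        /** All children, whether or not they match the current document. */
        List<ToyScorer> getChildren() {
            return children;
        }

        /** Assumed behaviour: only children positioned on doc AND confirmed by their refinement. */
        List<ToyScorer> getMatchingChildren(int doc) {
            List<ToyScorer> matching = new ArrayList<>();
            for (ToyScorer child : children) {
                if (child.docID == doc && child.confirmed) {
                    matching.add(child);
                }
            }
            return matching;
        }
    }

    /** Recursion that only checks position: also visits unconfirmed children. */
    static void collectNaive(ToyScorer scorer, int doc, List<String> out) {
        if (scorer.docID != doc) {
            return; // skip subscorers positioned on another document
        }
        if (scorer.getChildren().isEmpty()) {
            out.add(scorer.name); // a leaf: its positions would be collected here
            return;
        }
        for (ToyScorer child : scorer.getChildren()) {
            collectNaive(child, doc, out);
        }
    }

    /** Recursion over matching children only; the root is assumed to match doc already. */
    static void collectMatching(ToyScorer scorer, int doc, List<String> out) {
        if (scorer.getChildren().isEmpty()) {
            out.add(scorer.name);
            return;
        }
        for (ToyScorer child : scorer.getMatchingChildren(doc)) {
            collectMatching(child, doc, out);
        }
    }

    /** A disjunction on doc 5 with one good child, one unconfirmed child, one child elsewhere. */
    static ToyScorer example() {
        return new ToyScorer("or", 5, true,
                new ToyScorer("spanTerm", 5, true),
                new ToyScorer("spanNear", 5, false), // approximation matched, refinement failed
                new ToyScorer("other", 9, true));    // positioned on a later document
    }
}
```

In this toy tree, the position-only recursion visits both spanTerm and spanNear for document 5, even though spanNear never confirmed its match; restricting the walk to matching children visits spanTerm only. In real Lucene code the confirmation state lives inside the two-phase iterator rather than in a boolean field, so the sketch shows only the shape of the fix.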