[ 
https://issues.apache.org/jira/browse/LUCENE-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819158#comment-15819158
 ] 

Paul Elschot edited comment on LUCENE-7628 at 1/11/17 9:09 PM:
---------------------------------------------------------------

And if there was also something like SpanAndMergeQuery that merges the Spans 
positions when all of them are present in a document?

This could have an AndMergeSpans as a subclass of DisjunctionSpans above.


was (Author: paul.elsc...@xs4all.nl):
And if there was also something like SpanAndMergeQuery that merges the Spans 
positions when all of them are present in a document?

> Add a getMatchingChildren() method to DisjunctionScorer
> -------------------------------------------------------
>
>                 Key: LUCENE-7628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7628
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Minor
>
> This one is a bit convoluted, so bear with me...
> The luwak highlighter works by rewriting queries into their Span-equivalents, 
> and then running them with a special Collector.  At each matching doc, the 
> highlighter gathers all the Spans objects positioned on the current doc and 
> collects their positions using the SpanCollection API.
> Some queries can't be translated into Spans.  For those queries that generate 
> Scorers with ChildScorers, like BooleanQuery, we can call .getChildren() on 
> the Scorer and see if any of them are SpanScorers, and for those that aren't 
> we can call .getChildren() again and recurse down.  For each child scorer, we 
> check that it's positioned on the current document, so non-matching 
> subscorers can be skipped.
> This all works correctly *except* in the case of a DisjunctionScorer where 
> one of the children is a two-phase iterator that has matched its 
> approximation, but not its refinement query.  A SpanScorer in this situation 
> will be correctly positioned on the current document, but its Spans will be 
> in an undefined state, meaning the highlighter will either collect incorrect 
> hits, or it will throw an Exception and prevent hits being collected from 
> other subspans.
> We've tried various ways around this (including forking SpanNearQuery and 
> adding a bunch of slow position checks to it that are used only by the 
> highlighting code), but it turns out that the simplest fix is to add a new 
> method to DisjunctionScorer that only returns the currently matching child 
> Scorers.  It's a bit of a hack, and it won't be used anywhere else, but it's 
> a fairly small and contained hack.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to