[ 
https://issues.apache.org/jira/browse/LUCENE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892972#comment-15892972
 ] 

Paul Elschot edited comment on LUCENE-7398 at 3/2/17 8:54 PM:
--------------------------------------------------------------

One way to view the problem is that when span end positions are used to 
determine the slop, it becomes impossible to determine an order for moving the 
subspans to their next positions.

So one direction out of this could be: use a NearSpans that determines the slop 
only from the start positions of the subspans. That leaves only the cases in 
which the subspans can start (and maybe also end) at the same position.
To make sure that all the subspans move forward after a match, we could move 
them all forward until after the current match, and while doing that also 
count/collect them for scoring/highlighting as long as they are within the 
match. That should solve the bug reported here, which is about a matching 
occurrence that is missed when scoring.

This means the required slop would be determined using only the starting 
positions of the subspans. Could this work?
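
To make the idea above concrete, here is a minimal sketch in plain Java. It 
does not use the Lucene Spans API: the class StartSlopNear, the Match type and 
the representation of each subspan as a sorted array of {start, end} pairs are 
all hypothetical. It also assumes one possible definition of a start-only slop 
for an ordered near, namely (lastStart - firstStart) - (n - 1), and it treats 
"within the match" as "start position inside the matched window".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: ordered near matching that looks only at start positions. */
public class StartSlopNear {

    /** One match: the window of start positions plus every span collected inside it. */
    public static final class Match {
        public final int firstStart;
        public final int lastStart;
        /** Each entry is {subspanIndex, start, end}. */
        public final List<int[]> collected = new ArrayList<>();

        Match(int firstStart, int lastStart) {
            this.firstStart = firstStart;
            this.lastStart = lastStart;
        }
    }

    /**
     * @param subSpans subSpans[i] holds the {start, end} pairs of subspan i, sorted by start
     * @param slop     assumed to bound (lastStart - firstStart) - (n - 1)
     */
    public static List<Match> matches(int[][][] subSpans, int slop) {
        int n = subSpans.length;
        int[] idx = new int[n]; // one forward-only cursor per subspan
        List<Match> result = new ArrayList<>();
        if (n == 0) {
            return result;
        }
        while (idx[0] < subSpans[0].length) {
            // Ordered alignment: each subspan must start after the previous one.
            for (int i = 1; i < n; i++) {
                while (idx[i] < subSpans[i].length
                        && subSpans[i][idx[i]][0] <= subSpans[i - 1][idx[i - 1]][0]) {
                    idx[i]++;
                }
                if (idx[i] == subSpans[i].length) {
                    return result; // a subspan is exhausted, no further match possible
                }
            }
            int first = subSpans[0][idx[0]][0];
            int last = subSpans[n - 1][idx[n - 1]][0];
            if (last - first - (n - 1) > slop) {
                idx[0]++; // window too wide: only the first subspan can shrink it
                continue;
            }
            // Match found. Move every subspan past the window, collecting all spans
            // that start inside it. Spans sharing a start position (for example the
            // alternatives of a SpanOr) are all collected here.
            Match match = new Match(first, last);
            for (int i = 0; i < n; i++) {
                while (idx[i] < subSpans[i].length && subSpans[i][idx[i]][0] <= last) {
                    int[] span = subSpans[i][idx[i]];
                    match.collected.add(new int[] {i, span[0], span[1]});
                    idx[i]++;
                }
            }
            result.add(match);
        }
        return result;
    }
}
```

For the text "coordinate gene mapping research" at positions 0 to 3, the 
subspans would be {{0,1}} for coordinate, {{1,2},{1,3}} for the gene / gene 
mapping alternatives, and {{3,4}} for research. Under the slop definition 
assumed here, that occurrence needs slop 1 rather than 0, and both alternatives 
starting at position 1 are collected in the same match.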





> Nested Span Queries are buggy
> -----------------------------
>
>                 Key: LUCENE-7398
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7398
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 5.5, 6.x
>            Reporter: Christoph Goller
>            Assignee: Alan Woodward
>            Priority: Critical
>         Attachments: LUCENE-7398-20160814.patch, LUCENE-7398-20160924.patch, 
> LUCENE-7398-20160925.patch, LUCENE-7398.patch, LUCENE-7398.patch, 
> LUCENE-7398.patch, TestSpanCollection.java
>
>
> Example for a nested SpanQuery that is not working:
> Document: Human Genome Organization , HUGO , is trying to coordinate gene 
> mapping research worldwide.
> Query: spanNear([body:coordinate, spanOr([spanNear([body:gene, body:mapping], 
> 0, true), body:gene]), body:research], 0, true)
> The query should match "coordinate gene mapping research" as well as 
> "coordinate gene research". It does not match "coordinate gene mapping 
> research" with Lucene 5.5 or 6.1; it did, however, match with Lucene 4.10.4. It 
> probably stopped working with the changes to SpanQueries in 5.3. I will 
> attach a unit test that shows the problem.


