[ 
https://issues.apache.org/jira/browse/LUCENE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823633#comment-15823633
 ] 

Artem Lukanin commented on LUCENE-7398:
---------------------------------------

This issue describes only part of the problem: the case where 2 SpanTerms are 
at the same position inside a SpanOr. The problem is more general and also 
occurs when the SpanOr alternatives have different positions. For example, 
these queries should find the text "aa bb fineness cc ee colority dd" but do 
not, because only the first of the 2 SpanOr alternatives is taken into account:

{code:java}
  @Test
  public void testNestedOrQuery3() throws IOException {
    SpanNearQuery snq = new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
        .addClause(
            new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
                .addClause(new SpanTermQuery(new Term(FIELD, "aa")))
                .addClause(new SpanOrQuery(
                    new SpanTermQuery(new Term(FIELD, "bb")),
                    new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
                        .addClause(new SpanTermQuery(new Term(FIELD, "cc")))
                        .addClause(new SpanTermQuery(new Term(FIELD, "ee")))
                        .setSlop(2)
                        .build()
                ))
                .setSlop(2)
                .build()
        )
        .addClause(new SpanTermQuery(new Term(FIELD, "dd")))
        .setSlop(2)
        .build();

    Spans spans = snq.createWeight(searcher, false)
        .getSpans(searcher.getIndexReader().leaves().get(0), SpanWeight.Postings.POSITIONS);
    assertEquals(8, spans.advance(8));
  }

  @Test
  public void testNestedOrQuery4() throws IOException {
    SpanNearQuery snq = new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
        .addClause(new SpanTermQuery(new Term(FIELD, "aa")))
        .addClause(new SpanOrQuery(
            new SpanTermQuery(new Term(FIELD, "bb")),
            SpanNearQuery.newOrderedNearQuery(FIELD)
                .addClause(new SpanTermQuery(new Term(FIELD, "cc")))
                .addClause(new SpanTermQuery(new Term(FIELD, "ee")))
                .setSlop(2)
                .build()
        ))
        .addClause(new SpanTermQuery(new Term(FIELD, "dd")))
        .setSlop(2)
        .build();

    Spans spans = snq.createWeight(searcher, false)
        .getSpans(searcher.getIndexReader().leaves().get(0), SpanWeight.Postings.POSITIONS);
    assertEquals(8, spans.advance(8));
  }
{code}
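
To make the expected behaviour concrete, here is a minimal, Lucene-free sketch 
of exhaustive span matching. It is not the Lucene implementation and not the 
patch: the class SpanModelSketch and its helpers term, or and orderedNear are 
hypothetical names, tokens are split on whitespace, and I assume that the slop 
of an ordered near match is the sum of the gaps between consecutive subspans. 
Under those assumptions the nested query from testNestedOrQuery3 matches the 
example text only when the second SpanOr alternative is tried as well.

```java
import java.util.ArrayList;
import java.util.List;

/** Brute-force model of span matching. Hypothetical helper, NOT Lucene code. */
final class SpanModelSketch {

  /** A query returns every matching span as {start, end}, end exclusive. */
  interface Query {
    List<int[]> spans(String[] tokens);
  }

  /** Matches each position whose token equals the given term. */
  static Query term(String t) {
    return tokens -> {
      List<int[]> out = new ArrayList<>();
      for (int i = 0; i < tokens.length; i++) {
        if (tokens[i].equals(t)) out.add(new int[] {i, i + 1});
      }
      return out;
    };
  }

  /** Union of the spans of all alternatives (none is dropped). */
  static Query or(Query... clauses) {
    return tokens -> {
      List<int[]> out = new ArrayList<>();
      for (Query q : clauses) out.addAll(q.spans(tokens));
      return out;
    };
  }

  /**
   * Ordered near: one span per clause, in order and non-overlapping.
   * Assumption: slop is the sum of the gaps between consecutive subspans.
   */
  static Query orderedNear(int slop, Query... clauses) {
    return tokens -> {
      // Partial matches are stored as {start, end, gapsUsedSoFar}.
      List<int[]> partial = new ArrayList<>();
      for (int[] s : clauses[0].spans(tokens)) partial.add(new int[] {s[0], s[1], 0});
      for (int i = 1; i < clauses.length; i++) {
        List<int[]> next = new ArrayList<>();
        for (int[] p : partial) {
          for (int[] s : clauses[i].spans(tokens)) {
            int gap = s[0] - p[1];
            if (gap >= 0 && p[2] + gap <= slop) {
              next.add(new int[] {p[0], s[1], p[2] + gap});
            }
          }
        }
        partial = next;
      }
      List<int[]> out = new ArrayList<>();
      for (int[] p : partial) out.add(new int[] {p[0], p[1]});
      return out;
    };
  }

  public static void main(String[] args) {
    String[] doc = "aa bb fineness cc ee colority dd".split(" ");
    Query ccEe = orderedNear(2, term("cc"), term("ee"));
    // Same shape as testNestedOrQuery3: near(near(aa, or(bb, near(cc, ee))), dd).
    Query nested = orderedNear(2,
        orderedNear(2, term("aa"), or(term("bb"), ccEe)),
        term("dd"));
    // What happens if only the first SpanOr alternative ("bb") is considered.
    Query firstAlternativeOnly = orderedNear(2,
        orderedNear(2, term("aa"), term("bb")),
        term("dd"));
    System.out.println("exhaustive: " + !nested.spans(doc).isEmpty());
    System.out.println("first alternative only: " + !firstAlternativeOnly.spans(doc).isEmpty());
  }
}
```

In this model the exhaustive query reports a match and the "first alternative 
only" query does not, because "dd" is 4 positions after the end of "aa bb" 
but only 1 position after the end of "aa ... cc ee".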

Also, the patch only works for a SpanNear with more than 2 subclauses. The 
same test rewritten with nested two-clause SpanNear queries does not work 
either:

{code:java}
  @Test
  public void testNestedOrQuery2() throws IOException {
    SpanNearQuery snq = new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
        .addClause(
            new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
                .addClause(new SpanTermQuery(new Term(FIELD, "coordinate")))
                .addClause(new SpanOrQuery(
                    new SpanTermQuery(new Term(FIELD, "gene")),
                    new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
                        .addClause(new SpanTermQuery(new Term(FIELD, "gene")))
                        .addClause(new SpanTermQuery(new Term(FIELD, "mapping")))
                        .build()
                ))
                .build()
        )
        .addClause(new SpanTermQuery(new Term(FIELD, "research")))
        .build();

    Spans spans = snq.createWeight(searcher, false)
        .getSpans(searcher.getIndexReader().leaves().get(0), SpanWeight.Postings.POSITIONS);
    assertEquals(4, spans.advance(4));
    assertEquals(5, spans.nextDoc());
  }
{code}

> Nested Span Queries are buggy
> -----------------------------
>
>                 Key: LUCENE-7398
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7398
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 5.5, 6.x
>            Reporter: Christoph Goller
>            Assignee: Alan Woodward
>            Priority: Critical
>         Attachments: LUCENE-7398-20160814.patch, LUCENE-7398-20160924.patch, 
> LUCENE-7398-20160925.patch, LUCENE-7398.patch, LUCENE-7398.patch, 
> LUCENE-7398.patch, TestSpanCollection.java
>
>
> Example for a nested SpanQuery that is not working:
> Document: Human Genome Organization , HUGO , is trying to coordinate gene 
> mapping research worldwide.
> Query: spanNear([body:coordinate, spanOr([spanNear([body:gene, body:mapping], 
> 0, true), body:gene]), body:research], 0, true)
> The query should match "coordinate gene mapping research" as well as 
> "coordinate gene research". It does not match "coordinate gene mapping 
> research" with Lucene 5.5 or 6.1, although it did match with Lucene 4.10.4. 
> It probably stopped working with the changes to SpanQueries in 5.3. I will 
> attach a unit test that shows the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
