[
https://issues.apache.org/jira/browse/LUCENE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823633#comment-15823633
]
Artem Lukanin commented on LUCENE-7398:
---------------------------------------
This issue describes only part of the problem: the case where 2 SpanTerms are
at the same position inside a SpanOr. There is a more general problem when the
SpanTerms are at different positions. For example, if I want to find the text
"aa bb fineness cc ee colority dd", the following queries will not find it,
because only the first of the 2 SpanTerms is taken into account:
{code:java}
@Test
public void testNestedOrQuery3() throws IOException {
  SpanNearQuery snq = new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
      .addClause(
          new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
              .addClause(new SpanTermQuery(new Term(FIELD, "aa")))
              .addClause(new SpanOrQuery(
                  new SpanTermQuery(new Term(FIELD, "bb")),
                  new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
                      .addClause(new SpanTermQuery(new Term(FIELD, "cc")))
                      .addClause(new SpanTermQuery(new Term(FIELD, "ee")))
                      .setSlop(2)
                      .build()
              ))
              .setSlop(2)
              .build()
      )
      .addClause(new SpanTermQuery(new Term(FIELD, "dd")))
      .setSlop(2)
      .build();
  Spans spans = snq.createWeight(searcher, false)
      .getSpans(searcher.getIndexReader().leaves().get(0), SpanWeight.Postings.POSITIONS);
  assertEquals(8, spans.advance(8));
}
@Test
public void testNestedOrQuery4() throws IOException {
  SpanNearQuery snq = new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
      .addClause(new SpanTermQuery(new Term(FIELD, "aa")))
      .addClause(new SpanOrQuery(
          new SpanTermQuery(new Term(FIELD, "bb")),
          SpanNearQuery.newOrderedNearQuery(FIELD)
              .addClause(new SpanTermQuery(new Term(FIELD, "cc")))
              .addClause(new SpanTermQuery(new Term(FIELD, "ee")))
              .setSlop(2)
              .build()
      ))
      .addClause(new SpanTermQuery(new Term(FIELD, "dd")))
      .setSlop(2)
      .build();
  Spans spans = snq.createWeight(searcher, false)
      .getSpans(searcher.getIndexReader().leaves().get(0), SpanWeight.Postings.POSITIONS);
  assertEquals(8, spans.advance(8));
}
{code}
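To make the expected behaviour of the first test concrete, here is a minimal
sketch in plain Java. It is not Lucene code and does not use the Lucene API.
It assumes a simplified model in which a span is a [start, end) position range
and an ordered two-clause near matches when the gap between the clauses is at
most the slop. The class SpanSketch, the helpers term, near and matchesNested,
and the onlyFirstAlternative flag are all hypothetical names introduced for
illustration. The sketch mirrors only the nested two-clause shape of
testNestedOrQuery3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpanSketch {
    // Spans are int[]{start, end} with an exclusive end position.

    // All positions at which a single term occurs in the token list.
    static List<int[]> term(List<String> tokens, String t) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).equals(t)) out.add(new int[] {i, i + 1});
        }
        return out;
    }

    // Simplified ordered two-clause near (an assumption, not Lucene's algorithm):
    // b must start at or after the end of a, with a gap of at most slop positions.
    static List<int[]> near(List<int[]> a, List<int[]> b, int slop) {
        List<int[]> out = new ArrayList<>();
        for (int[] x : a) {
            for (int[] y : b) {
                int gap = y[0] - x[1];
                if (gap >= 0 && gap <= slop) out.add(new int[] {x[0], y[1]});
            }
        }
        return out;
    }

    // Shape of testNestedOrQuery3: near(near(aa, or(bb, near(cc, ee))), dd), slop 2 at each level.
    // onlyFirstAlternative is a hypothetical switch that drops the second SpanOr clause,
    // imitating the symptom described above.
    static boolean matchesNested(String text, boolean onlyFirstAlternative) {
        List<String> tokens = Arrays.asList(text.split(" "));
        List<int[]> alternatives = new ArrayList<>(term(tokens, "bb"));
        if (!onlyFirstAlternative) {
            alternatives.addAll(near(term(tokens, "cc"), term(tokens, "ee"), 2));
        }
        List<int[]> inner = near(term(tokens, "aa"), alternatives, 2);
        return !near(inner, term(tokens, "dd"), 2).isEmpty();
    }

    public static void main(String[] args) {
        String text = "aa bb fineness cc ee colority dd";
        System.out.println(matchesNested(text, false)); // prints "true"
        System.out.println(matchesNested(text, true));  // prints "false"
    }
}
```

In this model "aa bb" ends at position 2, which leaves a gap of 4 before "dd"
at position 6, so the "bb" alternative alone cannot satisfy the outer slop of
2. "aa ... cc ee" ends at position 5, which leaves a gap of 1, so the text can
only match through the second SpanOr clause.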
Also, the patch only works for a SpanNear with more than 2 subclauses. The same
test rewritten with nested binary clauses does not pass either:
{code:java}
@Test
public void testNestedOrQuery2() throws IOException {
  SpanNearQuery snq = new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
      .addClause(
          new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
              .addClause(new SpanTermQuery(new Term(FIELD, "coordinate")))
              .addClause(new SpanOrQuery(
                  new SpanTermQuery(new Term(FIELD, "gene")),
                  new SpanNearQuery.Builder(FIELD, SpanNearQuery.MatchNear.ORDERED_LOOKAHEAD)
                      .addClause(new SpanTermQuery(new Term(FIELD, "gene")))
                      .addClause(new SpanTermQuery(new Term(FIELD, "mapping")))
                      .build()
              ))
              .build()
      )
      .addClause(new SpanTermQuery(new Term(FIELD, "research")))
      .build();
  Spans spans = snq.createWeight(searcher, false)
      .getSpans(searcher.getIndexReader().leaves().get(0), SpanWeight.Postings.POSITIONS);
  assertEquals(4, spans.advance(4));
  assertEquals(5, spans.nextDoc());
}
{code}
> Nested Span Queries are buggy
> -----------------------------
>
> Key: LUCENE-7398
> URL: https://issues.apache.org/jira/browse/LUCENE-7398
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/search
> Affects Versions: 5.5, 6.x
> Reporter: Christoph Goller
> Assignee: Alan Woodward
> Priority: Critical
> Attachments: LUCENE-7398-20160814.patch, LUCENE-7398-20160924.patch,
> LUCENE-7398-20160925.patch, LUCENE-7398.patch, LUCENE-7398.patch,
> LUCENE-7398.patch, TestSpanCollection.java
>
>
> Example for a nested SpanQuery that is not working:
> Document: Human Genome Organization , HUGO , is trying to coordinate gene
> mapping research worldwide.
> Query: spanNear([body:coordinate, spanOr([spanNear([body:gene, body:mapping],
> 0, true), body:gene]), body:research], 0, true)
> The query should match "coordinate gene mapping research" as well as
> "coordinate gene research". It does not match "coordinate gene mapping
> research" with Lucene 5.5 or 6.1, it did however match with Lucene 4.10.4. It
> probably stopped working with the changes on SpanQueries in 5.3. I will
> attach a unit test that shows the problem.