Hi,
"one two three four five six"
We are unable to match the above text using the query (small reproducer at the
bottom):
spanNear([spanNear([f:one, spanOr([f:two, f:three])], 1, true), f:five], 1,
true)
The human readable form is "one W~1 (two OR three) W~1 five", which reads like
("one" within 1 slop of "two" or "three") and within 1 slop of "five".
We think it should match as "<b>one</b> two <b>three</b> four <b>five</b>", but
it seems the inner spanNear sees "one two" as satisfying the criteria and does
not consider "one two three", which is required for an overall match. If we
increase the slops to 2, we do get a match. However, a slop of 1 looks
sufficient here: only "two" sits between "one" and "three", and only "four"
sits between "three" and "five".
Could this be a bug with SpanNearQuery?
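To make the slop arithmetic above concrete, here is a small standalone sketch.
It does not use Lucene and is not meant to describe how SpanNearQuery is
implemented. The names SlopSketch, Span, orderedNear, exhaustiveMatches and
firstInnerOnlyMatches are made up for illustration, and it assumes that the
slop of an ordered near is the number of positions between the end of one
clause and the start of the next.

```java
import java.util.ArrayList;
import java.util.List;

public class SlopSketch {
    // A span covers token positions [start, end).
    static final class Span {
        final int start;
        final int end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    // Every pairing of a left span followed by a right span whose gap is <= slop.
    static List<Span> orderedNear(List<Span> left, List<Span> right, int slop) {
        List<Span> out = new ArrayList<>();
        for (Span l : left) {
            for (Span r : right) {
                int gap = r.start - l.end;
                if (gap >= 0 && gap <= slop) {
                    out.add(new Span(l.start, r.end));
                }
            }
        }
        return out;
    }

    // A single-term span at the given token position.
    static List<Span> term(int pos) {
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(pos, pos + 1));
        return spans;
    }

    // Positions in "one two three four five six": one=0 two=1 three=2 four=3 five=4.
    static List<Span> innerMatches() {
        List<Span> twoOrThree = new ArrayList<>(term(1));
        twoOrThree.addAll(term(2));
        return orderedNear(term(0), twoOrThree, 1); // "one two" and "one two three"
    }

    // The outer near is offered every inner match.
    static int exhaustiveMatches() {
        return orderedNear(innerMatches(), term(4), 1).size();
    }

    // The outer near is offered only the first (shortest) inner match, "one two".
    static int firstInnerOnlyMatches() {
        return orderedNear(innerMatches().subList(0, 1), term(4), 1).size();
    }

    public static void main(String[] args) {
        System.out.println("exhaustive: " + exhaustiveMatches());
        System.out.println("first inner match only: " + firstInnerOnlyMatches());
    }
}
```

Under this toy model, offering every inner match to the outer near gives one
overall match ("one two three" followed by "five" with a gap of 1), while
offering only "one two" gives none, which is the behaviour we seem to observe.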
Thank you,
Kenny Wong
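import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanOrQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.RAMDirectory;

public class LuceneTest {
    public static void main(String[] args) throws Exception {
        // Index a single document containing the sample text.
        RAMDirectory mem = new RAMDirectory();
        IndexWriter writer = new IndexWriter(mem,
                new IndexWriterConfig(new WhitespaceAnalyzer()));
        try {
            Document doc = new Document();
            Field f = new TextField("f", "one two three four five six",
                    Store.NO);
            doc.add(f);
            writer.addDocument(doc);
        }
        finally {
            writer.close();
        }

        // "one W~1 (two OR three) W~1 five"
        SpanQuery q = newSpanNear(1,
                newSpanNear(1, newSpanTerm("one"), newSpanOr(newSpanTerm("two"),
                        newSpanTerm("three"))),
                newSpanTerm("five"));

        try (DirectoryReader reader = DirectoryReader.open(mem)) {
            TopDocs topDocs = new IndexSearcher(reader).search(q, 1);
            // We expect true here, but the document is not matched.
            System.out.println(1 == topDocs.totalHits);
        }
    }

    static SpanQuery newSpanTerm(String text) {
        return new SpanTermQuery(new Term("f", text));
    }

    // Ordered (inOrder = true) near query over the given clauses.
    static SpanQuery newSpanNear(int slop, SpanQuery... clauses) {
        return new SpanNearQuery(clauses, slop, true);
    }

    static SpanQuery newSpanOr(SpanQuery... clauses) {
        return new SpanOrQuery(clauses);
    }
}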