Hi,

"one two three four five six"

We are unable to match the above text using the query (small reproducer at the 
bottom):

    spanNear([spanNear([f:one, spanOr([f:two, f:three])], 1, true), f:five], 1, true)

The human-readable form is "one W~1 (two OR three) W~1 five", that is: 
("one" within a slop of 1 of "two" or "three"), and that whole match within a 
slop of 1 of "five".

We think it should match as "<b>one</b> two <b>three</b> four <b>five</b>", but 
it seems the inner spanNear sees "one two" as satisfying the criteria and does 
not consider "three", which is required for an overall match. If we increase 
the slops to 2, we do get a match. However, a slop of 1 looks sufficient here.
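
To spell out the arithmetic behind that expectation, here is a small 
stand-alone sketch in plain Java (no Lucene involved). The names SlopSketch, 
Span, gap, near and term are made up for illustration, and it assumes that the 
slop of an ordered near match is the number of positions skipped between the 
end of one span and the start of the next. It enumerates every inner candidate 
instead of stopping at the first one:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SlopSketch {

    /** Half-open range [start, end) over token positions (hypothetical helper type). */
    static final class Span {
        final int start, end;
        Span(int start, int end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + "," + end + ")"; }
    }

    /** Positions skipped between a and b; negative if b does not start after a ends. */
    static int gap(Span a, Span b) {
        return b.start - a.end;
    }

    /** Every ordered (left, right) pairing whose gap is within the slop. */
    static List<Span> near(List<Span> left, List<Span> right, int slop) {
        List<Span> out = new ArrayList<>();
        for (Span l : left) {
            for (Span r : right) {
                int g = gap(l, r);
                if (g >= 0 && g <= slop) {
                    out.add(new Span(l.start, r.end));
                }
            }
        }
        return out;
    }

    /** A single term occupying one position. */
    static Span term(int position) {
        return new Span(position, position + 1);
    }

    public static void main(String[] args) {
        // "one two three four five six" -> positions 0..5
        List<Span> one = Arrays.asList(term(0));
        List<Span> twoOrThree = Arrays.asList(term(1), term(2));
        List<Span> five = Arrays.asList(term(4));

        List<Span> inner = near(one, twoOrThree, 1);
        List<Span> outer = near(inner, five, 1);
        System.out.println("inner candidates: " + inner);
        System.out.println("outer matches:    " + outer);
    }
}
```

Under that assumption the inner clause has two candidates, [0,2) for "one two" 
and [0,3) for "one two three". Only the second is within 1 of "five" (gap 1); 
the first has a gap of 2, which would also explain why raising the slops to 2 
produces a match.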

Could this be a bug with SpanNearQuery?

Thank you,
Kenny Wong

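// Imports assume the Lucene 5.x-7.x package layout.
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanOrQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.RAMDirectory;

public class LuceneTest {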

    public static void main(String[] args) throws Exception {
        RAMDirectory mem = new RAMDirectory();
        IndexWriter writer = new IndexWriter(mem,
            new IndexWriterConfig(new WhitespaceAnalyzer()));
        try {
            Document doc = new Document();
            Field f = new TextField("f", "one two three four five six", Store.NO);
            doc.add(f);
            writer.addDocument(doc);
        }
        finally {
            writer.close();
        }

        SpanQuery q = newSpanNear(1,
            newSpanNear(1, newSpanTerm("one"),
                newSpanOr(newSpanTerm("two"), newSpanTerm("three"))),
            newSpanTerm("five"));

        try (DirectoryReader reader = DirectoryReader.open(mem)) {
            TopDocs topDocs = new IndexSearcher(reader).search(q, 1);
            System.out.println(1 == topDocs.totalHits);
        }
    }

    static SpanQuery newSpanTerm(String text) {
        return new SpanTermQuery(new Term("f", text));
    }

    static SpanQuery newSpanNear(int slop, SpanQuery... clauses) {
        return new SpanNearQuery(clauses, slop, true);
    }

    static SpanQuery newSpanOr(SpanQuery...clauses) {
        return new SpanOrQuery(clauses);
    }
}
