Query Expansion for Synonyms

Daniel Bigham Thu, 28 Apr 2016 08:27:07 -0700

I'm investigating various ways of supporting synonyms in Lucene.

One such approach that looks potentially interesting is to do a kind of "query expansion".

For example, if the user searches for "us 1888", one might expand the query as follows:


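    // Package locations as of Lucene 5.x/6.x.
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.spans.SpanNearQuery;
    import org.apache.lucene.search.spans.SpanOrQuery;
    import org.apache.lucene.search.spans.SpanQuery;
    import org.apache.lucene.search.spans.SpanTermQuery;

    // ("us" OR "united states") immediately followed by "1888"
    SpanNearQuery query =
        new SpanNearQuery(
            new SpanQuery[]
            {
                // Either the original term or its multi-word synonym
                new SpanOrQuery(
                    new SpanTermQuery(new Term("Plaintext", "us")),
                    new SpanNearQuery(
                        new SpanQuery[]
                        {
                            new SpanTermQuery(new Term("Plaintext", "united")),
                            new SpanTermQuery(new Term("Plaintext", "states"))
                        },
                        0,    // slop: terms must be adjacent
                        true  // inOrder: "united" must precede "states"
                    )
                ),
                new SpanTermQuery(new Term("Plaintext", "1888"))
            },
            0,    // slop: no gap between the expanded clause and "1888"
            true  // inOrder
        );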
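
To make the shape of the expansion concrete, here is a minimal sketch in plain Java (no Lucene dependency) of how the nested structure above could be generated from a synonym table. The class name, the SYNONYMS map, and the near(...)/or(...) string rendering are all assumptions made up for illustration; in a real implementation each or(...) would presumably become a SpanOrQuery and each near(...) a SpanNearQuery with slop 0 and inOrder = true.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: renders the expanded query as a string
// instead of building real Lucene query objects.
class SynonymExpansionSketch {

    // Hypothetical synonym table: token -> alternative phrasings,
    // each phrasing being a list of tokens.
    static final Map<String, List<List<String>>> SYNONYMS =
        Map.of("us", List.of(List.of("united", "states")));

    // An ordered, adjacent group of clauses; a single clause stays bare.
    static String near(List<String> parts) {
        return parts.size() == 1
            ? parts.get(0)
            : "near(" + String.join(", ", parts) + ")";
    }

    // A token with no synonyms is returned unchanged; otherwise it becomes
    // an "or" over the token itself and each synonym phrase.
    static String expandToken(String token) {
        List<List<String>> alternatives = SYNONYMS.get(token);
        if (alternatives == null) {
            return token;
        }
        List<String> options = new ArrayList<>();
        options.add(token);
        for (List<String> phrase : alternatives) {
            options.add(near(phrase));
        }
        return "or(" + String.join(", ", options) + ")";
    }

    // Splits the user query on whitespace and expands each token in order.
    static String expandQuery(String userQuery) {
        List<String> clauses = new ArrayList<>();
        for (String token : userQuery.trim().toLowerCase().split("\\s+")) {
            clauses.add(expandToken(token));
        }
        return near(clauses);
    }

    public static void main(String[] args) {
        System.out.println(expandQuery("us 1888"));
    }
}
```

For "us 1888" this renders near(or(us, near(united, states)), 1888), which mirrors the hand-written query above. Note that the number of leaf term clauses grows with every synonym phrase added, which is where the extra query cost would come from.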

A couple of questions:

- Is this approach in use within the community?
- Are there "gotchas" with this approach that make it undesirable?

I've done a few quick tests wrt query performance on a test index and found that a query can indeed take 10x longer if enough synonyms are used, but if the baseline search time is around 1 ms, then 10 ms is still plenty fast enough. (That said, my test was on a 70 MB index, so my 10 ms might turn into something nasty with a 7 GB index.)


