Nolan Lawson created SOLR-4381:
----------------------------------
Summary: Query-time multi-word synonym expansion
Key: SOLR-4381
URL: https://issues.apache.org/jira/browse/SOLR-4381
Project: Solr
Issue Type: Improvement
Reporter: Nolan Lawson
Priority: Minor
This is an issue that seems to come up perennially.
The [Solr
docs|http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory]
caution that index-time synonym expansion should be preferred to query-time
synonym expansion, due to the way multi-word synonyms are treated and how IDF
values can be boosted artificially. But query-time expansion should have huge
benefits, given that changes to the synonyms don't require re-indexing, the
index size stays the same, and the IDF values for the documents don't get
permanently altered.
The proposed solution is to move the synonym expansion logic out of the analysis
chain (either query- or index-time) and into a new QueryParser. See the
attached patch for an implementation.
The core Lucene functionality is untouched. Instead, the EDismaxQParser is
extended, and synonym expansion is done on-the-fly. Queries are parsed into a
lattice (i.e. all possible synonym combinations), while individual components
of the query are still handled by the EDismaxQParser itself.
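To make the lattice idea concrete, here is a minimal standalone sketch. It is not the code from the patch: the class {{SynonymLattice}}, the method {{expand}}, the {{maxPhraseLen}} parameter and the toy synonym map are all hypothetical, and no Solr or Lucene classes are used. It only shows how a multi-word match branches the query into alternative paths, each of which could then be handed to the underlying parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; names are assumptions, not taken from the patch.
class SynonymLattice {

    /** Returns every query variant obtained by optionally replacing matched phrases with a synonym. */
    public static Set<String> expand(List<String> tokens,
                                     Map<String, List<String>> synonyms,
                                     int maxPhraseLen) {
        Set<String> out = new LinkedHashSet<>();
        walk(tokens, 0, new ArrayList<>(), synonyms, maxPhraseLen, out);
        return out;
    }

    private static void walk(List<String> tokens, int pos, List<String> path,
                             Map<String, List<String>> synonyms,
                             int maxPhraseLen, Set<String> out) {
        if (pos == tokens.size()) {
            // Reached the end of the query: one complete path through the lattice.
            out.add(String.join(" ", path));
            return;
        }
        // Branch 1: keep the original token at this position.
        path.add(tokens.get(pos));
        walk(tokens, pos + 1, path, synonyms, maxPhraseLen, out);
        path.remove(path.size() - 1);

        // Branch 2: replace a phrase of 1..maxPhraseLen tokens starting here with each synonym.
        for (int len = 1; len <= maxPhraseLen && pos + len <= tokens.size(); len++) {
            String phrase = String.join(" ", tokens.subList(pos, pos + len));
            List<String> alternatives = synonyms.getOrDefault(phrase, Collections.<String>emptyList());
            for (String alt : alternatives) {
                path.add(alt);
                walk(tokens, pos + len, path, synonyms, maxPhraseLen, out);
                path.remove(path.size() - 1);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> synonyms = new HashMap<>();
        synonyms.put("canis familiaris", Arrays.asList("dog", "hound"));
        // Prints: [canis familiaris bite, dog bite, hound bite]
        System.out.println(expand(Arrays.asList("canis", "familiaris", "bite"), synonyms, 2));
    }
}
```

In this sketch each returned string stands for one path through the lattice. A real parser would presumably combine the paths into a single query rather than enumerate strings, and would need to bound the number of combinations for long queries.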
It's not an ideal solution by any stretch. But it's nice and self-contained, so
it invites experimentation and improvement. And I think it fits in well with
the merry band of misfit query parsers, like {{func}} and {{frange}}.
More details about this solution can be found in [this blog
post|http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/] and
[the GitHub page for the
code|https://github.com/healthonnet/hon-lucene-synonyms].
At the risk of tooting my own horn, I also think this patch sufficiently fixes
SOLR-3390 (highlighting problems with multi-word synonyms) and LUCENE-4499
(better support for multi-word synonyms).