Does it work for you if you change the order of your filters, putting the synonym expansion after the stemming? But be very careful here: this will also have consequences...
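
To show what I mean about filter order, here is a toy Python sketch. It does not use Solr or Lucene at all: `toy_stem`, `expand_synonyms`, `analyze`, and the `SYNONYMS` table (including the "missing" => "lost" entry) are hypothetical stand-ins I made up for illustration, so treat it as a rough model of the behaviour, not as how the real filters are implemented.

```python
# Toy model of an analysis chain. These are NOT the real Solr filters;
# every name and synonym entry here is a made-up stand-in.
SYNONYMS = {
    "mia": ["missing", "in", "action"],
    "missing": ["lost"],  # hypothetical entry, used to show a side effect
}


def toy_stem(token):
    # Crude suffix stripper that only mimics what a stemmer does to "missing".
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    return token


def stem_all(tokens):
    return [toy_stem(t) for t in tokens]


def expand_synonyms(tokens):
    # Single-pass replacement: each token is looked up once, not recursively.
    out = []
    for t in tokens:
        out.extend(SYNONYMS.get(t, [t]))
    return out


def analyze(text, filters):
    # Whitespace "tokenizer" plus lowercasing, then each filter in order.
    tokens = text.lower().split()
    for f in filters:
        tokens = f(tokens)
    return tokens


# Synonyms first: the expanded terms get stemmed ("miss in action").
synonyms_then_stem = analyze("MIA", [expand_synonyms, stem_all])

# Stemming first: the expanded terms are left alone ("missing in action").
stem_then_synonyms = analyze("MIA", [stem_all, expand_synonyms])

# One consequence: with stemming first, synonym rules keyed on an
# unstemmed form ("missing") never see a matching token.
rule_skipped = analyze("missing", [stem_all, expand_synonyms])
rule_applied = analyze("missing", [expand_synonyms, stem_all])
```

In this sketch the second ordering keeps "missing in action" intact, but the expanded terms are then no longer stemmed (so they may not line up with stemmed terms in the index), and any synonym entries have to be written against the stemmed forms to fire at all.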

Best
Erick

On Thu, Apr 4, 2013 at 3:59 AM, Okke Klein (JIRA) <[email protected]> wrote:
>
>     [ https://issues.apache.org/jira/browse/SOLR-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621944#comment-13621944 ]
>
> Okke Klein commented on SOLR-4381:
> ----------------------------------
>
> The terms that are being expanded by the solr.SynonymFilterFactory are also
> being stemmed. This is unwanted if you want to expand "MIA" to "missing in
> action" and not "miss in action". See [Github
> issue|https://github.com/healthonnet/hon-lucene-synonyms/issues/14] for
> details.
>
>
>> Query-time multi-word synonym expansion
>> ---------------------------------------
>>
>>                 Key: SOLR-4381
>>                 URL: https://issues.apache.org/jira/browse/SOLR-4381
>>             Project: Solr
>>          Issue Type: Improvement
>>          Components: query parsers
>>            Reporter: Nolan Lawson
>>            Priority: Minor
>>              Labels: multi-word, queryparser, synonyms
>>             Fix For: 4.3
>>
>>         Attachments: SOLR-4381-2.patch, SOLR-4381.patch
>>
>>
>> This is an issue that seems to come up perennially.
>> The [Solr
>> docs|http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory]
>> caution that index-time synonym expansion should be preferred to query-time
>> synonym expansion, due to the way multi-word synonyms are treated and how
>> IDF values can be boosted artificially. But query-time expansion should have
>> huge benefits, given that changes to the synonyms don't require re-indexing,
>> the index size stays the same, and the IDF values for the documents don't
>> get permanently altered.
>> The proposed solution is to move the synonym expansion logic from the
>> analysis chain (either query- or index-type) and into a new QueryParser.
>> See the attached patch for an implementation.
>> The core Lucene functionality is untouched. Instead, the EDismaxQParser is
>> extended, and synonym expansion is done on-the-fly. Queries are parsed into
>> a lattice (i.e. all possible synonym combinations), while individual
>> components of the query are still handled by the EDismaxQParser itself.
>> It's not an ideal solution by any stretch. But it's nice and self-contained,
>> so it invites experimentation and improvement. And I think it fits in well
>> with the merry band of misfit query parsers, like {{func}} and {{frange}}.
>> More details about this solution can be found in [this blog
>> post|http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/] and
>> [the Github page for the
>> code|https://github.com/healthonnet/hon-lucene-synonyms].
>> At the risk of tooting my own horn, I also think this patch sufficiently
>> fixes SOLR-3390 (highlighting problems with multi-word synonyms) and
>> LUCENE-4499 (better support for multi-word synonyms).
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
