Hello,

I'm looking for some guidance on solving the infamous index-time vs.
query-time multi-word synonym problem.  I'd like help understanding the
pieces and effort involved, and I'm also on the lookout for any
potential "man, it will take you forever, you'll have to do major Lucene
surgery" type of warnings.

I have never looked deeply into this problem, but my understanding is that
multi-word synonyms don't work at query-time because QueryParser(?) simply
breaks queries on spaces before analysis, which makes it impossible for
SynonymTokenFilter (?) to "see" the non-broken-up token sequence and do
synonym expansion.
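
To check that I have the mechanics right, here is a toy sketch of what I
think happens. It is plain Python, not Lucene code: the synonym rule and
every function name below are made up for illustration, and the real
QueryParser and synonym filter may differ in the details.

```python
# Toy model only: these helpers are hypothetical, not Lucene APIs.
SYNONYMS = {("new", "york", "city"): "nyc"}  # assumed multi-word rule


def synonym_filter(tokens):
    """Greedy longest-match over a token list; keeps the original
    tokens and appends the synonym after a matched phrase."""
    out = []
    max_len = max(len(key) for key in SYNONYMS)
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in SYNONYMS:
                out.extend(key)
                out.append(SYNONYMS[key])
                i += n
                break
        else:
            # No phrase starts here; pass the token through unchanged.
            out.append(tokens[i])
            i += 1
    return out


def analyze(text):
    """Stand-in for an analyzer chain: lowercase, tokenize, add synonyms."""
    return synonym_filter(text.lower().split())


def index_time(field_value):
    # The whole field value goes through analysis in one piece.
    return analyze(field_value)


def query_time(query):
    # The parser splits on whitespace first, so each chunk is analyzed
    # in isolation and the filter never sees more than one word.
    out = []
    for chunk in query.split():
        out.extend(analyze(chunk))
    return out


print(index_time("New York City hotels"))
print(query_time("New York City hotels"))
```

In this toy model the index-time path emits "nyc" alongside the original
tokens, while the query-time path never does, because the filter only ever
receives one word at a time.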

I think this is also documented on the Wiki.
Are there other pieces involved that I didn't mention, but should have?

The following are 3 different efforts I found:
https://issues.apache.org/jira/browse/LUCENE-4499
http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
http://www.ub.uni-bielefeld.de/~befehl/base/solr/eurovoc.html

Plus Jack's proposal:
http://search-lucene.com/m/Zkj0k15dDGP1

Do any of the above approaches sound like the right one, or at least a step
in the right direction, and stand a chance of being accepted?

Thanks,
Otis
