Hello, I'm looking for some guidance on solving the infamous index-time vs. query-time multi-word synonym problem. I'd like help understanding the pieces and effort involved, and I'm also on the lookout for any "man, it will take you forever, you'll have to do major Lucene surgery" type of warnings.
I've never looked deeply into this problem. My understanding is that multi-word synonyms don't work at query time because QueryParser(?) simply breaks queries on spaces, which makes it impossible for SynonymTokenFilter (?) to "see" the unbroken token sequence and do synonym expansion. I think this is also documented on the Wiki. Are there other pieces involved that I didn't mention, but should have?

The following are 3 different efforts I found:

https://issues.apache.org/jira/browse/LUCENE-4499
http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
http://www.ub.uni-bielefeld.de/~befehl/base/solr/eurovoc.html

Plus Jack's proposal:

http://search-lucene.com/m/Zkj0k15dDGP1

Do any of the above approaches sound like the right one, or at least a step in the right direction, with a chance of being accepted?

Thanks,
Otis
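
P.S. To make the problem as I understand it concrete, here is a toy sketch in plain Python. It does not use any Lucene or Solr classes: the synonym rule ("new york" => "nyc"), the `expand` helper, and the `index_time` / `query_time` functions are all made up for illustration, and the real analysis chain is certainly more involved. It only shows why splitting on whitespace before analysis hides a multi-word match.

```python
# Toy model only: no Lucene/Solr APIs are used here.
# Hypothetical multi-word synonym rule: "new york" => "nyc".
SYNONYMS = {("new", "york"): ["nyc"]}


def expand(tokens):
    """Return the tokens plus synonyms for any rule matching a token run."""
    out = []
    i = 0
    while i < len(tokens):
        for phrase, synonyms in SYNONYMS.items():
            if tuple(tokens[i:i + len(phrase)]) == phrase:
                out.extend(phrase)    # keep the original words
                out.extend(synonyms)  # add the synonym(s)
                i += len(phrase)
                break
        else:
            # No rule matched at this position: pass the token through.
            out.append(tokens[i])
            i += 1
    return out


def index_time(text):
    # The whole field value reaches the synonym step as one token stream.
    return expand(text.lower().split())


def query_time(query):
    # The query is split on whitespace first, so the synonym step
    # only ever sees one word at a time.
    out = []
    for chunk in query.split():
        out.extend(expand([chunk.lower()]))
    return out


print(index_time("New York pizza"))
print(query_time("New York pizza"))
```

In this sketch the index-time path sees "new" and "york" next to each other and adds "nyc", while the query-time path never gets the chance, because each word is analyzed on its own.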
