FWIW, multi-word synonym support is a side benefit of the query parsing
approach implemented by my team.
Here is how it looks:
https://docs.google.com/a/griddynamics.com/presentation/pub?id=1oifLFI0MiA3ZyXZWisHJVRK13P8cki5yCABvABPObKw&start=false&loop=false&delayms=3000#slide=id.g1006de00_2_34
On that slide, the frequent typo "fee people" has been replaced with the
correct brand.
The five preceding slides depict the approach. The main idea is to get rid of
good old Boolean retrieval and introduce our own notion of matching.
I can share more details if you wish.
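
To make the query-time problem from the quoted message below concrete, here is
a minimal Python sketch. It is not the implementation from the slides and it
does not use Lucene. The synonym table and both function names are
hypothetical, chosen only for illustration. It contrasts expanding each
whitespace-separated piece in isolation with matching over the whole token
sequence.

```python
from typing import Dict, List, Tuple

# Hypothetical synonym table (illustration only): each key is a token
# sequence, each value is the token sequence that replaces it.
SYNONYMS: Dict[Tuple[str, ...], List[str]] = {
    ("tv",): ["television"],
    ("united", "states"): ["usa"],
}


def expand_per_token(query: str) -> List[str]:
    """Split on whitespace first, then look up every piece on its own.

    A multi-word key can never match here, because the lookup only ever
    sees one token at a time.
    """
    result: List[str] = []
    for word in query.lower().split():
        result.extend(SYNONYMS.get((word,), [word]))
    return result


def expand_whole_sequence(query: str) -> List[str]:
    """Greedy longest-match lookup over the unbroken token sequence."""
    tokens = query.lower().split()
    longest_key = max(len(key) for key in SYNONYMS)
    result: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate window first, then shrink it.
        for size in range(min(longest_key, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + size])
            if key in SYNONYMS:
                result.extend(SYNONYMS[key])
                i += size
                break
        else:
            # No key starts at this position: keep the token unchanged.
            result.append(tokens[i])
            i += 1
    return result


if __name__ == "__main__":
    query = "tv united states"
    print(expand_per_token(query))
    print(expand_whole_sequence(query))
```

With this toy table, the per-token version rewrites only "tv", while the
whole-sequence version also collapses "united states", because it gets to see
both tokens together.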


On Tue, Jan 22, 2013 at 7:17 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> Hello,
>
> I'm looking for some guidance around solving the infamous index-time vs.
> query-time multi-word synonym problem.  Looking for help with understanding
> the pieces and effort involved, and also being on a lookout for any
> potential "man, it will take you forever, you'll have to do major Lucene
> surgery" type of warnings.
>
> I never looked deeply into this problem and my understanding is that
> multi-word synonyms don't work at query-time because QueryParser(?) simply
> breaks queries on spaces and thus makes it impossible for
> SynonymTokenFilter (?) to "see" the non-broken-up token sequence and do
> synonym expansion.
>
> I think this is also documented on the Wiki.
> Are there other pieces involved that I didn't mention, but should have?
>
> The following are 3 different efforts I found:
> https://issues.apache.org/jira/browse/LUCENE-4499
> http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
> http://www.ub.uni-bielefeld.de/~befehl/base/solr/eurovoc.html
>
> Plus Jack's proposal:
> http://search-lucene.com/m/Zkj0k15dDGP1
>
> Does any of the above approaches sound like the right one, or at least in
> the right direction, and stand a chance of being accepted?
>
> Thanks,
> Otis
>
>


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
 <mkhlud...@griddynamics.com>
