I prefer fuzzy search for misspellings. Solr does a very nice job with those, weighting them by the similarity to the matched term.
wunder

On Dec 12, 2012, at 4:45 PM, Jack Krupansky wrote:

> Another great use case for synonyms is misspellings. I saw one synonym list
> in which the top synonym was the phrase "dead mouse" (which doesn't look
> misspelled at all); I won't tell you what its "proper" synonym was, other
> than to say that it was VERY app/culture-dependent. It was also interesting
> because the user's original query phrase needed to be given a much lower
> weighting in order to find what the user was "likely" looking for.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Walter Underwood
> Sent: Wednesday, December 12, 2012 7:16 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Can a field with defined synonym be searched without the synonym?
>
> If you have tons of content, you can do selective reindexing. You only need
> to reindex the docs containing the new terms. If I add a synonym for
> "babysitter" and "baby sitter", then I can do a search for documents
> containing either of those, and only reindex those.
>
> Reverse weighting to even out the IDF would work, but it could be pretty
> tweaky. If one synonym is very rare, you put in a small weight, but then you
> index several documents with that term and it is overweighted.
>
> wunder
>
> On Dec 12, 2012, at 4:09 PM, Jack Krupansky wrote:
>
>> Sure, synonyms have lots of issues and choosing index vs. query is simply
>> picking your poison, but it all depends on your app and your data and your
>> user expectations, and you, the developer, have tools to moderate a lot of
>> these issues.
>>
>> Index-time synonyms have the problem (among others) that they cannot be
>> changed without reindexing.
>>
>> One technique is to simulate the query-time synonym filter expansion by
>> having your app preprocess user queries to expand to the OR of the synonyms
>> and then boost or de-boost the synonyms as makes sense for your app.
>>
>> For example,
>>
>> (tv^0.5 OR television^2.5 OR "boob tube"^0.0001)
>>
>> -- Jack Krupansky
>>
>> -----Original Message-----
>> From: Steve Rowe
>> Sent: Wednesday, December 12, 2012 5:28 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Can a field with defined synonym be searched without the
>> synonym?
>>
>> Hmm, I've gotten this very wrong :) - DisjunctionMaxQuery will operate
>> per-doc, so using it in the way I suggested will not allow for synonym IDF
>> leveling across documents. Also, scoring obviously includes more factors
>> than IDF.
>>
>> On Dec 12, 2012, at 5:18 PM, Steve Rowe <sar...@gmail.com> wrote:
>>
>>> But couldn't the IDF problem be fixed by applying the same IDF to all
>>> synonyms, e.g. via DisjunctionMaxQuery? (Maybe the ideal would be an
>>> average, not a max.)
>>>
>>> (E)dismax applies this query per-field, but AFAICT there is nothing
>>> stopping anybody (modulo query parser construction :) ) from using it on
>>> synonyms in the same field.
>>>
>>> Steve
>>>
>>> On Dec 12, 2012, at 12:50 PM, Walter Underwood <wun...@wunderwood.org>
>>> wrote:
>>>
>>>> Query parsers cannot fix the IDF problem or make query-time synonyms
>>>> faster. Query synonym expansion makes more search terms. More search terms
>>>> are more work at query time.
>>>>
>>>> The IDF problem is real; I've run up against it. The rarest variant of
>>>> the synonym has the highest score. This is probably the opposite of what
>>>> you want. For me, it was "TV" and "television". Documents with "TV" had
>>>> higher scores than those with "television".
>>>>
>>>> wunder
>>>>
>>>> On Dec 12, 2012, at 9:45 AM, Roman Chyla wrote:
>>>>
>>>>> @wunder
>>>>> It is a misconception (well, supported by that wiki description) that the
>>>>> query time synonym filter has these problems. It is actually the default
>>>>> parser that is causing these problems.
>>>>> Look at this if you still think that index time synonyms are a cure
>>>>> for all:
>>>>> https://issues.apache.org/jira/browse/LUCENE-4499
>>>>>
>>>>> @joe
>>>>> If you can use the flexible query parser (as linked in by @Swati) then all
>>>>> you need to do is to define a different field with a different tokenizer
>>>>> chain and then swap the field names before the analyzer processes the
>>>>> document (and then rewrite the field name back - for example, we have
>>>>> fields called "author" and "author_nosyn")
>>>>>
>>>>> roman
>>>>>
>>>>> On Wed, Dec 12, 2012 at 12:38 PM, Walter Underwood
>>>>> <wun...@wunderwood.org> wrote:
>>>>>
>>>>>> Query time synonyms have known problems. They are slower, cause incorrect
>>>>>> IDF, and don't work for phrase synonyms.
>>>>>>
>>>>>> Apply synonyms at index time and you will have none of those problems.
>>>>>>
>>>>>> See:
>>>>>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>>>>>>
>>>>>> wunder
>>>>>>
>>>>>> On Dec 12, 2012, at 9:34 AM, Swati Swoboda wrote:
>>>>>>
>>>>>>> Query-time analyzers are still applied, even if you include a string in
>>>>>>> quotes. Would you expect "foo" to not match "Foo" just because it's
>>>>>>> enclosed in quotes?
>>>>>>>
>>>>>>> Also look at this, someone who had similar requirements:
>>>>>>>
>>>>>>> http://lucene.472066.n3.nabble.com/Synonym-Filter-disable-at-query-time-td2919876.html
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: joe.cohe...@gmail.com [mailto:joe.cohe...@gmail.com]
>>>>>>> Sent: Wednesday, December 12, 2012 12:09 PM
>>>>>>> To: solr-user@lucene.apache.org
>>>>>>> Subject: Re: Can a field with defined synonym be searched without the
>>>>>>> synonym?
>>>>>>>
>>>>>>> I'm applying only query-time synonyms, so I have the original values
>>>>>>> stored and indexed.
>>>>>>> I would've expected that if I search a string in quotation marks, I'll
>>>>>>> get the exact match, without applying a synonym.
>>>>>>>
>>>>>>> any way to achieve that?
>>>>>>>
>>>>>>> Upayavira wrote
>>>>>>>> You can only search against terms that are stored in your index. If
>>>>>>>> you have applied index time synonyms, you can't remove them at query
>>>>>>>> time.
>>>>>>>>
>>>>>>>> You can, however, use copyField to clone an incoming field to another
>>>>>>>> field that doesn't use synonyms, and search against that field instead.
>>>>>>>>
>>>>>>>> Upayavira
>>>>>>>>
>>>>>>>> On Wed, Dec 12, 2012, at 04:26 PM, joe.cohen.m@ wrote:
>>>>>>>>> Hi
>>>>>>>>> I have a field type with a defined synonym.txt which retrieves both
>>>>>>>>> records with "home" and "house" when I search either one of them.
>>>>>>>>>
>>>>>>>>> I want to be able to search this field on the specific value that I
>>>>>>>>> enter, without the synonym filter.
>>>>>>>>>
>>>>>>>>> is it possible?
>>>>>>>>>
>>>>>>>>> thanks.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> View this message in context:
>>>>>>>>> http://lucene.472066.n3.nabble.com/Can-a-field-with-defined-synonym-be-searched-without-the-synonym-tp4026381.html
>>>>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://lucene.472066.n3.nabble.com/Can-a-field-with-defined-synonym-be-searched-without-the-synonym-tp4026381p4026405.html
>>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>>
>>>>>> --
>>>>>> Walter Underwood
>>>>>> wun...@wunderwood.org
>>>>
>>>> --
>>>> Walter Underwood
>>>> wun...@wunderwood.org
>
> --
> Walter Underwood
> wun...@wunderwood.org

--
Walter Underwood
wun...@wunderwood.org
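
Jack Krupansky's suggestion in this thread, having the application expand a user term into a boosted OR of its synonyms before the query reaches Solr, might be sketched in Python roughly as follows. This is only a sketch under stated assumptions: the SYNONYM_GROUPS table and the helpers quote_if_phrase, format_boost, and expand_term are hypothetical names made up for illustration, and the boosts are simply the example values from his message, not tuned numbers. It builds the query string only; sending it to Solr is left out.

```python
# Hypothetical synonym table: each group lists (variant, boost) pairs.
# The boosts are the illustrative values from the thread, not tuned numbers.
SYNONYM_GROUPS = [
    [("tv", 0.5), ("television", 2.5), ("boob tube", 0.0001)],
]


def quote_if_phrase(text):
    """Wrap multi-word variants in double quotes so they parse as phrases."""
    if " " in text:
        return '"' + text.replace('"', '\\"') + '"'
    return text


def format_boost(boost):
    """Render a boost as a plain decimal, avoiding exponent notation."""
    return f"{boost:.6f}".rstrip("0").rstrip(".")


def expand_term(term, groups=SYNONYM_GROUPS):
    """Return a boosted OR clause if the term is in a synonym group.

    Terms with no known synonyms are returned unchanged (quoted if they
    contain a space).
    """
    needle = term.lower()
    for group in groups:
        if any(variant == needle for variant, _ in group):
            clauses = [
                f"{quote_if_phrase(variant)}^{format_boost(boost)}"
                for variant, boost in group
            ]
            return "(" + " OR ".join(clauses) + ")"
    return quote_if_phrase(term)


print(expand_term("TV"))
# prints: (tv^0.5 OR television^2.5 OR "boob tube"^0.0001)
```

Because the expansion happens in the application, the synonym list and boosts can change without reindexing. As Walter points out above, the per-variant boosts only rebalance scores by hand; they do not remove the underlying IDF differences between variants.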