I appreciate that you want to know why you shouldn't use synonyms at query time. I couldn't find the following articles when I wrote my last response (I read them a while back and I have waaaaay too many bookmarks), but I finally found them:
http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/

--
Ivan

On Tue, Jul 22, 2014 at 11:03 AM, Ivan Brusic <[email protected]> wrote:

> A couple of reasons. The biggest issue is multi word synonyms since the
> query parser will tokenize the query before analysis is applied. Also,
> scoring could be affected and the results can be screwy. Here is a better
> write up:
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>
> --
> Ivan
>
> On Tue, Jul 22, 2014 at 10:47 AM, Daniel Yim <[email protected]> wrote:
>
>> Thank you! That solved the initial issue.
>>
>> Could you expand on why I would need two analyzers? I did what you asked,
>> but I am unsure of the reason behind it and would like to learn.
>>
>> Here are my updated settings:
>>
>> curl -XPUT "http://localhost:9200/personsearch" -d'
>> {
>>   "settings": {
>>     "index": {
>>       "analysis": {
>>         "analyzer": {
>>           "XYZSynAnalyzer": {
>>             "tokenizer": "whitespace",
>>             "filter": [
>>               "lowercase",
>>               "XYZSynFilter"
>>             ]
>>           },
>>           "MyAnalyzer": {
>>             "tokenizer": "standard",
>>             "filter": [
>>               "standard",
>>               "lowercase",
>>               "stop"
>>             ]
>>           }
>>         },
>>         "filter": {
>>           "XYZSynFilter": {
>>             "type": "synonym",
>>             "synonyms": [
>>               "aids, retrovirology"
>>             ]
>>           }
>>         }
>>       }
>>     }
>>   },
>>   "mappings": {
>>     "xyzemployee": {
>>       "_all": {
>>         "analyzer": "XYZSynAnalyzer"
>>       },
>>       "properties": {
>>         "firstName": {
>>           "type": "string"
>>         },
>>         "lastName": {
>>           "type": "string"
>>         },
>>         "middleName": {
>>           "type": "string",
>>           "include_in_all": false,
>>           "index": "not_analyzed"
>>         },
>>         "specialty": {
>>           "type": "string",
>>           "index_analyzer": "XYZSynAnalyzer",
>>           "search_analyzer": "MyAnalyzer"
>>         }
>>       }
>>     }
>>   }
>> }'
>>
>> On Tuesday, July 22, 2014 11:56:40 AM UTC-5, Ivan Brusic wrote:
>>
>>> Your issue is casing. You are only applying the synonym filter, which by
>>> default does not lowercase terms.
>>> You can either set ignore_case to true for the synonym filter or apply a
>>> lower case filter before the synonym. I prefer to use the latter approach
>>> since I prefer to have all my analyzed tokens lowercased.
>>>
>>> Also, you should only apply the synonym filter at index time. You would
>>> need to create two similar analyzers, one with the synonym filter and one
>>> without. You can set the different ones via index_analyzer and
>>> search_analyzer.
>>>
>>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
>>>
>>> Cheers,
>>>
>>> Ivan
>>>
>>> On Tue, Jul 22, 2014 at 9:33 AM, Daniel Yim <[email protected]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I am relatively new to elasticsearch and am having issues with getting
>>>> my synonym filter to work. Can you take a look at the settings and tell me
>>>> where I am going wrong?
>>>>
>>>> I am expecting the search for "aids" to match the search results if I
>>>> were to search for "retrovirology", but this is not happening.
>>>>
>>>> Thanks!
>>>>
>>>> curl -XDELETE "http://localhost:9200/personsearch"
>>>>
>>>> curl -XPUT "http://localhost:9200/personsearch" -d'
>>>> {
>>>>   "settings": {
>>>>     "index": {
>>>>       "analysis": {
>>>>         "analyzer": {
>>>>           "XYZSynAnalyzer": {
>>>>             "tokenizer": "standard",
>>>>             "filter": [
>>>>               "XYZSynFilter"
>>>>             ]
>>>>           }
>>>>         },
>>>>         "filter": {
>>>>           "XYZSynFilter": {
>>>>             "type": "synonym",
>>>>             "synonyms": [
>>>>               "aids, retrovirology"
>>>>             ]
>>>>           }
>>>>         }
>>>>       }
>>>>     }
>>>>   },
>>>>   "mappings": {
>>>>     "xyzemployee": {
>>>>       "_all": {
>>>>         "analyzer": "XYZSynAnalyzer"
>>>>       },
>>>>       "properties": {
>>>>         "firstName": {
>>>>           "type": "string"
>>>>         },
>>>>         "lastName": {
>>>>           "type": "string"
>>>>         },
>>>>         "middleName": {
>>>>           "type": "string",
>>>>           "include_in_all": false,
>>>>           "index": "not_analyzed"
>>>>         },
>>>>         "specialty": {
>>>>           "type": "string",
>>>>           "analyzer": "XYZSynAnalyzer"
>>>>         }
>>>>       }
>>>>     }
>>>>   }
>>>> }'
>>>>
>>>> curl -XPUT "http://localhost:9200/personsearch/xyzemployee/1" -d'
>>>> {
>>>>   "firstName": "Don",
>>>>   "middleName": "W.",
>>>>   "lastName": "White",
>>>>   "specialty": "Adult Retrovirology"
>>>> }'
>>>>
>>>> curl -XPUT "http://localhost:9200/personsearch/xyzemployee/2" -d'
>>>> {
>>>>   "firstName": "Terrance",
>>>>   "middleName": "G.",
>>>>   "lastName": "Gartner",
>>>>   "specialty": "Retrovirology"
>>>> }'
>>>>
>>>> curl -XPUT "http://localhost:9200/personsearch/xyzemployee/3" -d'
>>>> {
>>>>   "firstName": "Carter",
>>>>   "middleName": "L.",
>>>>   "lastName": "Taylor",
>>>>   "specialty": "Pediatric Retrovirology"
>>>> }'
>>>>
>>>> curl -XGET "http://localhost:9200/personsearch/xyzemployee/_search?pretty=true" -d'
>>>> {
>>>>   "query": {
>>>>     "match": {
>>>>       "specialty": "retrovirology"
>>>>     }
>>>>   }
>>>> }'
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
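For anyone who wants to see the casing problem from the thread in isolation, here is a rough Python sketch of an analysis chain. It is a toy model and not Elasticsearch code: `tokenize`, `synonym_filter`, and `analyze` are names made up for illustration, and the whitespace split only stands in for a real tokenizer. The synonym rule is the one from the settings in the thread.

```python
# Toy model of an analysis chain (NOT Elasticsearch code; all names are
# hypothetical). It shows why a case-sensitive synonym filter misses
# "Retrovirology" unless a lowercase step runs before it.

# The synonym group from the thread: "aids, retrovirology"
SYNONYM_GROUP = ["aids", "retrovirology"]
SYNONYMS = {term: SYNONYM_GROUP for term in SYNONYM_GROUP}


def tokenize(text):
    """Stand-in tokenizer: split on whitespace only."""
    return text.split()


def synonym_filter(tokens):
    """Expand a token to its synonym group on an exact, case-sensitive match."""
    expanded = []
    for token in tokens:
        expanded.extend(SYNONYMS.get(token, [token]))
    return expanded


def analyze(text, lowercase):
    """Tokenize, optionally lowercase, then apply the synonym filter."""
    tokens = tokenize(text)
    if lowercase:
        tokens = [token.lower() for token in tokens]
    return synonym_filter(tokens)


# Without lowercasing, "Retrovirology" never matches the rule.
without_lowercase = analyze("Adult Retrovirology", lowercase=False)
# With lowercasing first, the synonym group is emitted.
with_lowercase = analyze("Adult Retrovirology", lowercase=True)

print(without_lowercase)  # ['Adult', 'Retrovirology']
print(with_lowercase)     # ['adult', 'aids', 'retrovirology']
```

In the sketch the lowercase step runs before the synonym lookup, which mirrors the "lowercase filter before the synonym" option. Setting ignore_case on the filter is the other route mentioned above and is not modeled here.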

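The multi word synonym problem at query time can be sketched the same way. This is only an illustration of the behaviour described in the thread (the query is split into words before analysis runs), not how any particular query parser is implemented. The rule mapping "hiv medicine" to "retrovirology" is hypothetical and does not appear in the original settings, and `analyze`, `index_time`, and `query_time` are made-up names.

```python
# Toy illustration (NOT Elasticsearch or Lucene code) of why multi word
# synonyms can fail at query time when the query is split into words
# before the analyzer sees it.

# Hypothetical two-word rule, not part of the settings in the thread.
PHRASE_SYNONYMS = {("hiv", "medicine"): "retrovirology"}


def analyze(text):
    """Lowercase, split on whitespace, and collapse known two-word phrases."""
    tokens = text.lower().split()
    output = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASE_SYNONYMS:
            output.append(PHRASE_SYNONYMS[pair])
            i += 2  # both words of the phrase were consumed
        else:
            output.append(tokens[i])
            i += 1
    return output


def index_time(field_value):
    """At index time the whole field value is analyzed in one pass."""
    return analyze(field_value)


def query_time(query):
    """Model of a parser that splits on whitespace first, then analyzes
    each word on its own, so the phrase rule never sees both words."""
    terms = []
    for word in query.split():
        terms.extend(analyze(word))
    return terms


indexed_terms = index_time("HIV Medicine")
query_terms = query_time("HIV Medicine")

print(indexed_terms)  # ['retrovirology']
print(query_terms)    # ['hiv', 'medicine']
```

The same text produces different terms on the two sides, so the query terms never line up with what was indexed. That mismatch is the reason for keeping the synonym filter on the index side only.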