I appreciate that you want to know why you shouldn't use synonyms at
query time. I couldn't find the following articles when I wrote my last
response (I read them a while back and I have way too many bookmarks),
but I finally found them:

http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
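
To make the multi-word problem from my earlier reply concrete, here is a
toy Python model (not Lucene or Elasticsearch code). It assumes a query
parser that splits on whitespace before handing each piece to the
analyzer, and a made-up multi-word rule that is not from your settings.
All function names are mine, for illustration only.

```python
# Hypothetical rule: the phrase on the left also emits "aids".
SYNONYMS = {("acquired", "immune", "deficiency", "syndrome"): ["aids"]}


def synonym_filter(tokens):
    """Append a rule's synonyms after any run of tokens that matches it."""
    out = []
    i = 0
    while i < len(tokens):
        for phrase, synonyms in SYNONYMS.items():
            n = len(phrase)
            if tuple(tokens[i:i + n]) == phrase:
                out.extend(phrase)
                out.extend(synonyms)
                i += n
                break
        else:
            # No rule starts at this position: pass the token through.
            out.append(tokens[i])
            i += 1
    return out


def analyze(text):
    """Lowercase, split on whitespace, then apply the synonym filter."""
    return synonym_filter(text.lower().split())


def index_time(field_value):
    # The whole field value goes through the analyzer in one pass.
    return analyze(field_value)


def query_time(query):
    # Assumed parser behaviour: split on whitespace first, then analyze
    # each piece on its own, so the filter only ever sees single words.
    out = []
    for chunk in query.split():
        out.extend(analyze(chunk))
    return out


if __name__ == "__main__":
    text = "Acquired Immune Deficiency Syndrome"
    print(index_time(text))
    print(query_time(text))
```

In this model the index-time pass adds "aids" after the phrase, while the
query-time pass returns only the four original words, because the filter
never gets to see them together.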

-- 
Ivan


On Tue, Jul 22, 2014 at 11:03 AM, Ivan Brusic <[email protected]> wrote:

> A couple of reasons. The biggest issue is multi-word synonyms: the query
> parser splits the query on whitespace before analysis is applied, so the
> synonym filter never sees the whole phrase. Also, scoring can be affected
> and the results can look odd. Here is a better write-up:
>
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>
> --
> Ivan
>
>
> On Tue, Jul 22, 2014 at 10:47 AM, Daniel Yim <[email protected]> wrote:
>
>> Thank you! That solved the initial issue.
>>
>> Could you expand on why I would need two analyzers? I did what you asked,
>> but I am unsure of the reason behind it and would like to learn.
>>
>> Here are my updated settings:
>>
>> curl -XPUT "http://localhost:9200/personsearch" -d'
>> {
>>   "settings": {
>>     "index": {
>>       "analysis": {
>>         "analyzer": {
>>           "XYZSynAnalyzer": {
>>             "tokenizer": "whitespace",
>>             "filter": [
>>               "lowercase",
>>               "XYZSynFilter"
>>             ]
>>           },
>>           "MyAnalyzer": {
>>             "tokenizer": "standard",
>>             "filter": [
>>               "standard",
>>               "lowercase",
>>               "stop"
>>             ]
>>           }
>>         },
>>         "filter": {
>>           "XYZSynFilter": {
>>             "type": "synonym",
>>             "synonyms": [
>>               "aids, retrovirology"
>>             ]
>>           }
>>         }
>>       }
>>     }
>>   },
>>   "mappings": {
>>     "xyzemployee": {
>>       "_all": {
>>         "analyzer": "XYZSynAnalyzer"
>>       },
>>       "properties": {
>>         "firstName": {
>>           "type": "string"
>>         },
>>         "lastName": {
>>           "type": "string"
>>         },
>>         "middleName": {
>>           "type": "string",
>>           "include_in_all": false,
>>           "index": "not_analyzed"
>>         },
>>         "specialty": {
>>           "type": "string",
>>           "index_analyzer": "XYZSynAnalyzer",
>>           "search_analyzer": "MyAnalyzer"
>>         }
>>       }
>>     }
>>   }
>> }'
>>
>> On Tuesday, July 22, 2014 11:56:40 AM UTC-5, Ivan Brusic wrote:
>>
>>> Your issue is casing. You are only applying the synonym filter, which by
>>> default does not lowercase terms. You can either set ignore_case to true
>>> for the synonym filter or apply a lowercase filter before the synonym
>>> filter. I prefer the latter approach, since I like to have all my
>>> analyzed tokens lowercased.
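
A rough Python sketch of the casing problem, as a toy model rather than
real analyzer code. It assumes a case-sensitive synonym filter built from
the "aids, retrovirology" rule and uses plain whitespace splitting in
place of the real tokenizers; the function names are invented for
illustration.

```python
# The group from the settings: "aids, retrovirology".
RULE = {"aids", "retrovirology"}


def synonym_filter(tokens):
    """Case-sensitive: a token only expands if it matches a rule term exactly."""
    out = []
    for token in tokens:
        if token in RULE:
            out.extend(sorted(RULE))  # emit every term in the group
        else:
            out.append(token)
    return out


def analyze_without_lowercase(text):
    # Like the original analyzer: tokenize, then synonyms only.
    return synonym_filter(text.split())


def analyze_with_lowercase(text):
    # Like the fixed analyzer: lowercase first, then synonyms.
    return synonym_filter([token.lower() for token in text.split()])


if __name__ == "__main__":
    print(analyze_without_lowercase("Adult Retrovirology"))
    print(analyze_with_lowercase("Adult Retrovirology"))
```

Without the lowercase step, "Retrovirology" never matches the rule, so the
indexed tokens share nothing with the tokens produced for a lowercase
query such as "aids".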
>>>
>>> Also, you should only apply the synonym filter at index time. You would
>>> need to create two similar analyzers, one with the synonym filter and one
>>> without. You can set the different ones via index_analyzer and
>>> search_analyzer.
>>>
>>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
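
Here is a small Python sketch of the index-time-only idea, under loud
assumptions: a dictionary stands in for the inverted index, whitespace
splitting stands in for the tokenizers, and the search analyzer is
reduced to lowercasing. None of this is Elasticsearch code, and the names
are hypothetical.

```python
# The group from the settings: "aids, retrovirology".
RULE = {"aids", "retrovirology"}

# The "specialty" values of the three sample documents in this thread.
DOCS = {
    1: "Adult Retrovirology",
    2: "Retrovirology",
    3: "Pediatric Retrovirology",
}


def index_analyzer(text):
    """Lowercase, then expand synonyms (the analyzer used when indexing)."""
    out = []
    for token in text.lower().split():
        out.extend(sorted(RULE) if token in RULE else [token])
    return out


def search_analyzer(text):
    """Lowercase only: no synonym expansion on the query side."""
    return text.lower().split()


def build_index(docs):
    """Map each indexed term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in index_analyzer(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents matching any query term."""
    hits = set()
    for term in search_analyzer(query):
        hits |= index.get(term, set())
    return hits


if __name__ == "__main__":
    index = build_index(DOCS)
    print(search(index, "aids"))
    print(search(index, "pediatric"))
```

Because the synonyms are already stored with each document, the plain
query "aids" finds all three records without any expansion at search
time.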
>>>
>>> Cheers,
>>>
>>> Ivan
>>>
>>>
>>> On Tue, Jul 22, 2014 at 9:33 AM, Daniel Yim <[email protected]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I am relatively new to elasticsearch and am having issues with getting
>>>> my synonym filter to work. Can you take a look at the settings and tell me
>>>> where I am going wrong?
>>>>
>>>> I expect a search for "aids" to return the same results as a search
>>>> for "retrovirology", but this is not happening.
>>>>
>>>> Thanks!
>>>>
>>>> curl -XDELETE "http://localhost:9200/personsearch"
>>>>
>>>> curl -XPUT "http://localhost:9200/personsearch" -d'
>>>> {
>>>>   "settings": {
>>>>     "index": {
>>>>       "analysis": {
>>>>         "analyzer": {
>>>>           "XYZSynAnalyzer": {
>>>>             "tokenizer": "standard",
>>>>             "filter": [
>>>>               "XYZSynFilter"
>>>>             ]
>>>>           }
>>>>         },
>>>>         "filter": {
>>>>           "XYZSynFilter": {
>>>>             "type": "synonym",
>>>>             "synonyms": [
>>>>               "aids, retrovirology"
>>>>             ]
>>>>           }
>>>>         }
>>>>       }
>>>>     }
>>>>   },
>>>>   "mappings": {
>>>>     "xyzemployee": {
>>>>       "_all": {
>>>>         "analyzer": "XYZSynAnalyzer"
>>>>       },
>>>>       "properties": {
>>>>         "firstName": {
>>>>           "type": "string"
>>>>         },
>>>>         "lastName": {
>>>>           "type": "string"
>>>>         },
>>>>         "middleName": {
>>>>           "type": "string",
>>>>           "include_in_all": false,
>>>>           "index": "not_analyzed"
>>>>         },
>>>>         "specialty": {
>>>>           "type": "string",
>>>>           "analyzer": "XYZSynAnalyzer"
>>>>         }
>>>>       }
>>>>     }
>>>>   }
>>>> }'
>>>>
>>>> curl -XPUT "http://localhost:9200/personsearch/xyzemployee/1" -d'
>>>> {
>>>>   "firstName": "Don",
>>>>   "middleName": "W.",
>>>>   "lastName": "White",
>>>>   "specialty": "Adult Retrovirology"
>>>> }'
>>>>
>>>> curl -XPUT "http://localhost:9200/personsearch/xyzemployee/2" -d'
>>>> {
>>>>   "firstName": "Terrance",
>>>>   "middleName": "G.",
>>>>   "lastName": "Gartner",
>>>>   "specialty": "Retrovirology"
>>>> }'
>>>>
>>>> curl -XPUT "http://localhost:9200/personsearch/xyzemployee/3" -d'
>>>> {
>>>>   "firstName": "Carter",
>>>>   "middleName": "L.",
>>>>   "lastName": "Taylor",
>>>>   "specialty": "Pediatric Retrovirology"
>>>> }'
>>>>
>>>> curl -XGET "http://localhost:9200/personsearch/xyzemployee/_search?pretty=true" -d'
>>>> {
>>>>   "query": {
>>>>     "match": {
>>>>       "specialty": "retrovirology"
>>>>     }
>>>>   }
>>>> }'
>>>>
>>>>  --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>>
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/elasticsearch/2e227a33-d935-4d22-89fb-57b59358c89d%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>
>
>
