KeywordTokenizerFactory with SynonymFilterFactory
Hi,

I have the following two field types:

<fieldType name="tokenizer1" class="solr.TextField" sortMissingLast="true" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<fieldType name="tokenizer2" class="solr.TextField" sortMissingLast="true" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true"/>
  </analyzer>
</fieldType>

The problem I am seeing is this. If I have the following entry in the synonyms.txt file:

    helping hand => assistance

then issuing the query "helping hand" (with dismax) against the field analyzed with tokenizer1 returns the correct query (assistance), whereas no synonym mapping is applied for tokenizer2 (confirmed in the Solr admin panel).

Am I doing something wrong?

Thank you
RE: KeywordTokenizerFactory with SynonymFilterFactory
Try changing the tokenizer2 SynonymFilterFactory filter to this:

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>

By default, it seems that the filter uses WhitespaceTokenizer to parse the entries in the synonyms file.

-Michael
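
To illustrate why the tokenizerFactory setting matters here, the following is a toy Python model, not Solr's actual implementation. It assumes the rule is written with the `=>` mapping syntax, and all function names (`parse_rule`, `apply_synonyms`, and so on) are hypothetical helpers made up for this sketch. The idea it shows: if the synonyms file is split on whitespace, the rule expects the two tokens "helping" and "hand", but KeywordTokenizer emits the whole query as the single token "helping hand", so the rule never matches.

```python
def whitespace_tokenize(text):
    # Toy stand-in for whitespace tokenization: one token per word.
    return text.split()


def keyword_tokenize(text):
    # Toy stand-in for keyword tokenization: the whole input is one token.
    return [text]


def parse_rule(rule, tokenize):
    # Split "lhs => rhs" and tokenize each side with the given tokenizer.
    lhs, rhs = (side.strip() for side in rule.split("=>"))
    return tuple(tokenize(lhs)), tuple(tokenize(rhs))


def apply_synonyms(tokens, rules):
    # Replace any run of tokens equal to a rule's left side with its right side.
    out = []
    i = 0
    while i < len(tokens):
        for lhs, rhs in rules:
            if tuple(tokens[i:i + len(lhs)]) == lhs:
                out.extend(rhs)
                i += len(lhs)
                break
        else:
            out.append(tokens[i])
            i += 1
    return out


RULE = "helping hand => assistance"

# The query analyzer for tokenizer2 produces a single token.
query_tokens = keyword_tokenize("helping hand")

# Rule parsed on whitespace (the default behaviour described above).
default_rules = [parse_rule(RULE, whitespace_tokenize)]
# Rule parsed as a single keyword token (the suggested setting).
keyword_rules = [parse_rule(RULE, keyword_tokenize)]

no_match = apply_synonyms(query_tokens, default_rules)
match = apply_synonyms(query_tokens, keyword_rules)

print(no_match)  # ['helping hand']
print(match)     # ['assistance']
```

In this model the mapping only fires when the rule and the query are tokenized the same way, which is consistent with the behaviour reported for tokenizer1 and tokenizer2.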
Re: KeywordTokenizerFactory with SynonymFilterFactory
Thank you, Michael.

On Jun 16, 2012, at 6:40 PM, Michael Ryan wrote:

> Try changing the tokenizer2 SynonymFilterFactory filter to this:
>
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="false" expand="true" tokenizerFactory="solr.KeywordTokenizerFactory"/>
>
> By default, it seems that it uses WhitespaceTokenizer.
>
> -Michael