KeywordTokenizerFactory with SynonymFilterFactory

2012-06-16 Thread Peyman Faratin
Hi

I have the following 2 field types

<fieldType name="tokenizer1" class="solr.TextField" sortMissingLast="true"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="false" expand="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>


<fieldType name="tokenizer2" class="solr.TextField" sortMissingLast="true"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="false" expand="true"/>
  </analyzer>
</fieldType>

The problem I am seeing is that if I have an entry like this in the 
synonyms.txt file

helping hand => assistance

then issuing the query helping hand (with dismax) against the field analyzed 
with tokenizer1 returns the correct query (assistance), whereas no synonym 
mapping is applied for tokenizer2 (confirmed in the Solr admin panel).

Am I doing something wrong?

thank you




RE: KeywordTokenizerFactory with SynonymFilterFactory

2012-06-16 Thread Michael Ryan
Try changing the tokenizer2 SynonymFilterFactory filter to this:

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="false" expand="true"
        tokenizerFactory="solr.KeywordTokenizerFactory"/>

By default, it seems that SynonymFilterFactory uses WhitespaceTokenizer to 
parse the entries in synonyms.txt.

-Michael
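
To illustrate why the tokenizer used to parse synonyms.txt matters, here is 
a toy Python model of the matching step. It is only a sketch under stated 
assumptions, not Solr or Lucene code: whitespace_tokenize, keyword_tokenize, 
parse_rule and apply_rule are hypothetical helpers invented for this example, 
and lowercasing is left out because it does not change the outcome here.

```python
def whitespace_tokenize(text):
    """Toy stand-in for WhitespaceTokenizer: split on runs of whitespace."""
    return text.split()


def keyword_tokenize(text):
    """Toy stand-in for KeywordTokenizer: the whole input is one token."""
    return [text]


def parse_rule(line, tokenize):
    """Parse 'lhs => rhs' into (lhs_tokens, rhs_tokens) with the given tokenizer."""
    lhs, rhs = (side.strip() for side in line.split("=>"))
    return tuple(tokenize(lhs)), tokenize(rhs)


def apply_rule(tokens, rule):
    """Replace each occurrence of the rule's left-hand token sequence."""
    lhs, rhs = rule
    out, i = [], 0
    while i < len(tokens):
        if tuple(tokens[i:i + len(lhs)]) == lhs:
            out.extend(rhs)
            i += len(lhs)
        else:
            out.append(tokens[i])
            i += 1
    return out


RULE = "helping hand => assistance"
QUERY = "helping hand"

# tokenizer1: query and rule are both split on whitespace, so they line up.
tokenizer1 = apply_rule(whitespace_tokenize(QUERY),
                        parse_rule(RULE, whitespace_tokenize))

# tokenizer2 as posted: the query is one token, but the rule's left-hand
# side was parsed into two tokens, so nothing matches.
tokenizer2_before = apply_rule(keyword_tokenize(QUERY),
                               parse_rule(RULE, whitespace_tokenize))

# tokenizer2 with the rule parsed as a single token (the suggested fix).
tokenizer2_after = apply_rule(keyword_tokenize(QUERY),
                              parse_rule(RULE, keyword_tokenize))

print(tokenizer1, tokenizer2_before, tokenizer2_after)
```

In this model the single token produced for the tokenizer2 field can never 
equal the two-token left-hand side of the rule, which matches the behaviour 
reported above; parsing the rule with the same tokenizer as the field removes 
the mismatch.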


Re: KeywordTokenizerFactory with SynonymFilterFactory

2012-06-16 Thread Peyman Faratin
thank you Michael.

On Jun 16, 2012, at 6:40 PM, Michael Ryan wrote:

> Try changing the tokenizer2 SynonymFilterFactory filter to this:
>
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>         ignoreCase="false" expand="true"
>         tokenizerFactory="solr.KeywordTokenizerFactory"/>
>
> By default, it seems that it uses WhitespaceTokenizer.
>
> -Michael