> Thanks, this helps.
> But our synonym file has some 16,000 sets of synonyms.
That's a lot. Can you give some examples?

> - the individual synonyms in your synonym file should be in
> a form as if they were sent through the tokenizers which
> come before the SynonymFilterFactory.

Exactly. The order of the filters is very important (as is the choice of Tokenizer and CharFilter). For example, if you have a StemFilter before the SynonymFilter, then your syn.txt should contain stemmed synonyms.

IMO

Absalom\, Absalom!, William Faulkner

is an ugly entry.

absalom absalom, william faulkner

is a beautiful entry.

> Would it be possible for Solr to apply the Tokenizer in use
> while reading the synonym file? Then the user would only
> need the original synonym file, and there could not be a
> conflict.

The purpose of a tokenizer is to break free-form text into words (tokens). Maybe you can try solr.MappingCharFilterFactory with a mapping.txt containing

"absalom absalom" => "william faulkner"

But I am not sure if it is a good idea. I am also not sure about its case sensitivity.
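
With 16,000 sets you probably do not want to clean the file by hand. Below is a rough Python sketch of a one-off script that rewrites a comma-separated synonym line into the "beautiful" form above. It is only a stand-in for your real analysis chain: it assumes the chain amounts to lowercasing and dropping punctuation, it does no stemming, and it does not handle explicit "=>" mappings or comment lines. The function names normalize_entry and normalize_line are made up for this example and are not part of Solr.

```python
import re


def normalize_entry(entry: str) -> str:
    """Approximate a lowercase + punctuation-stripping analysis of one term."""
    # Turn the synonym-file comma escape (\,) into a plain separator.
    text = entry.replace("\\,", " ")
    # Lowercase, then keep only runs of letters and digits. This is an
    # assumption about the analysis chain, not Solr's actual tokenizer.
    tokens = re.findall(r"[^\W_]+", text.lower())
    return " ".join(tokens)


def normalize_line(line: str) -> str:
    """Normalize every term of one comma-separated synonym set."""
    # Split only on commas that are NOT escaped with a backslash.
    terms = re.split(r"(?<!\\),", line)
    cleaned = [normalize_entry(term) for term in terms]
    # Drop terms that became empty after stripping punctuation.
    return ", ".join(term for term in cleaned if term)


if __name__ == "__main__":
    print(normalize_line(r"Absalom\, Absalom!, William Faulkner"))
    # prints: absalom absalom, william faulkner
```

If your chain has a stemmer or other filters before the SynonymFilter, the output of such a script would still not match what the filter sees, so check a few entries on the analysis page of the Solr admin UI before trusting it.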