Thanks, this helps. But our synonym file has some 16,000 sets of synonyms.
Should the wiki warn users?

- WhitespaceTokenizerFactory with synonyms at indexing will not expand synonyms in text like "... synonym[punctuation mark] ..."
- the individual synonyms in your synonym file should be in the form they would have after passing through the tokenizers that come before the SynonymFilterFactory

With a WhitespaceTokenizerFactory:

    Flaubert's Parrot, Julian Barnes
    A History of the World in 10½ Chapters, Julian Barnes
    England\, England, Julian Barnes
    Arthur & George, Julian Barnes
    Absalom\, Absalom!, William Faulkner
    k-nearest neighbors algorithm, k-NN, k nn

With a StandardTokenizerFactory:

    Flaubert's Parrot, Julian Barnes
    A History of the World in 10 Chapters, Julian Barnes
    England England, Julian Barnes
    Arthur George, Julian Barnes
    Absalom Absalom, William Faulkner
    k nearest neighbors algorithm, k-NN, k nn, knn

This means that when you change the TokenizerFactory you might also have to change your synonym file. But the change may be irreversible (you can't reconstruct the first version from the second one).

Would it be possible for Solr to apply the Tokenizer in use while reading the synonym file? Then the user would only need the original synonym file, and there could not be a conflict.

regards
geert

> > You lose the WordDelimiterFilterFactory functionality:
> >
> > Syn.txt has: ADC, HIV-dementie
> > Search on "ADC" doesn't find document with "HIV-dementie".
>
> synonym filter can handle multi word synonyms. Replace Syn.txt to
> Syn.txt has: ADC, HIV dementie
>
> And search on "ADC" will find document with "HIV-dementie".
>
> hope this helps.
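
PS: until something like that exists, a synonym file of this size could be pre-processed with a small script instead of being edited by hand. What follows is only a rough sketch under stated assumptions: the function names are hypothetical, the regular expression is a crude ASCII-only stand-in and not Solr's StandardTokenizer, and it only handles plain comma-separated lines (no "=>" mappings, no comment lines, no lowercasing).

```python
import re

# Crude ASCII-only token pattern: runs of letters/digits, keeping inner
# apostrophes ("Flaubert's"). This is an assumption for illustration and
# does NOT reproduce the behaviour of Solr's StandardTokenizer.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")

# Split on commas that are not escaped with a backslash ("\,").
UNESCAPED_COMMA_RE = re.compile(r"(?<!\\),")


def normalize_entry(entry):
    """Tokenize one synonym and re-join its tokens with single spaces."""
    text = entry.replace("\\,", ",")  # an escaped comma is a literal comma
    return " ".join(TOKEN_RE.findall(text))


def normalize_synonym_line(line):
    """Normalize every synonym on one comma-separated line of the file."""
    entries = UNESCAPED_COMMA_RE.split(line)
    normalized = [normalize_entry(entry) for entry in entries]
    # Drop entries that consist of punctuation only and end up empty.
    return ", ".join(entry for entry in normalized if entry)


if __name__ == "__main__":
    samples = [
        r"England\, England, Julian Barnes",
        "Arthur & George, Julian Barnes",
        "ADC, HIV-dementie",
    ]
    for sample in samples:
        print(normalize_synonym_line(sample))
```

The original file stays the master copy and the normalized file is regenerated whenever the tokenizer changes, so the irreversibility mentioned above no longer matters.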