Hmmmm... The Analyzer shows me *almost* what I am expecting to see. When I turn on the verbose output with debug info, I can see exactly what is going on, which is great. Thanks for the tip.
What's happening (for most of my test cases) is that some of the synonyms are multiple words (and it's a big synonym list), and then the word delimiter creates even more terms. The analyzer finds a match on individual words (the highlighted words), but the query engine builds a more complex query.

Consider:

a document with the text "the quick brown fox jumps over the lazy dog" in a "body" field of type "text", as in the schema mentioned above,

a synonym list like:

    dog,canine,mut,domestic dog,barker wretch,dog hound,dog,pooch,doggy

and a query for the word "dog".

The analyzer creates two terms, like this:

    Term position 1: dog,canin,mut,domest,barker,wretch,hound,pooch,doggi
    Term position 2: dog

(Here, the synonym "domestic dog" for "dog" creates two tokens: "domestic" and "dog".)

It also highlights the word dog in the query, so the analyzer can find it.

The query is parsed into:

    MultiPhraseQuery(text:"(dog canin mut domest barker wretch hound pooch doggi) dog")

This only matches a document with "dog dog" or "canine dog" or "domestic dog" (etc.) in it. If these words are separated, e.g. "a canine is a kind of dog", then we get no match! :(

Why does a two-word synonym require a two-word match for all synonyms?

I was also hoping that the synonym list might be one-way, i.e. dog expands to hound but not wretch in the example above. Is there a way to do this too? (That might be a story for another thread.)

Thanks,
Matt

--
View this message in context: http://www.nabble.com/Synonyms-list-breaks-solr-tp18401710p18405876.html
Sent from the Solr - User mailing list archive at Nabble.com.
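To make the matching behaviour described above concrete, here is a small toy model in Python. It is only a sketch of the idea, not Lucene's actual implementation: it assumes zero slop (terms must be adjacent), it assumes the document tokens have already been stemmed the same way as the query terms (the stemmed forms below are written by hand), and the function name multi_phrase_match is made up for illustration.

```python
def multi_phrase_match(doc_tokens, positions):
    """Toy model of a zero-slop multi-phrase match.

    positions is a list of sets; positions[i] holds the terms allowed at
    query position i. The document matches only if some run of consecutive
    tokens has token i inside positions[i] for every i.
    """
    n = len(positions)
    for start in range(len(doc_tokens) - n + 1):
        if all(doc_tokens[start + i] in positions[i] for i in range(n)):
            return True
    return False


# Stemmed terms copied from the parsed query in the post.
position_1 = {"dog", "canin", "mut", "domest", "barker",
              "wretch", "hound", "pooch", "doggi"}
position_2 = {"dog"}
query = [position_1, position_2]

# Hand-stemmed document tokens (an assumption, for illustration only).
adjacent = "domest dog".split()
separated = "a canin is a kind of dog".split()
original = "the quick brown fox jump over the lazi dog".split()

print(multi_phrase_match(adjacent, query))   # True
print(multi_phrase_match(separated, query))  # False
print(multi_phrase_match(original, query))   # False
```

In this model the second position must always be "dog" and must directly follow a term from the first position, which is why the separated text and even the original "lazy dog" document fail to match.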