Hello,

I'm working on a project that involves search in Japanese and uses
synonyms. The Japanese tokenizer creates an analysis graph, but the
SynonymGraphFilter states it cannot take a graph as input. In a few
tests I've seen it produce some unusual output when given a graph as
input. The SynonymFilter is marked deprecated, and its documentation
points out that it doesn't handle multiple synonym paths correctly.

My question is: what is the 'correct' way to handle synonyms with Japanese
in Lucene? Should the graph be flattened before the SynonymGraphFilter,
then flattened again after? This seems extra lossy. Is the correct answer
to make SynonymGraphFilter accept graphs as input? Is there another option
that I'm missing?

Thanks,
Geoff
