Best practices for Solr highlighter for CJK

Tom Burton-West Wed, 02 Jan 2013 10:52:14 -0800

Hello all,

What are the best practices for setting up the highlighter to work with CJK?
We are using the ICUTokenizer with the CJKBigramFilter, so overlapping
bigrams are what is actually searched. However, the highlighter seems to
highlight only the first of any two overlapping bigrams. For example, ABC is
searched as AB BC, but only AB gets highlighted even when the matching string
is ABC. (Here A, B, and C stand for Chinese characters: 大亚湾 is searched as
大亚 亚湾, but only 大亚 is highlighted rather than 大亚湾.)
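
To make the symptom concrete, here is a minimal Python sketch of what I would
expect versus what I see. It does not use Solr or Lucene at all: the helper
names (bigrams_with_offsets, merge_spans, highlight) and the <em> tags are
made up for illustration, and the "observed" case just hard-codes the first
bigram to mimic the output described above.

```python
def bigrams_with_offsets(text):
    """Return overlapping character bigrams as (term, start, end) tuples."""
    return [(text[i:i + 2], i, i + 2) for i in range(len(text) - 1)]


def merge_spans(spans):
    """Merge overlapping or touching (start, end) spans into maximal runs."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Span overlaps (or touches) the previous one: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def highlight(text, spans, pre="<em>", post="</em>"):
    """Wrap each merged span of text in pre/post markers."""
    out, pos = [], 0
    for start, end in merge_spans(spans):
        out.append(text[pos:start])
        out.append(pre + text[start:end] + post)
        pos = end
    out.append(text[pos:])
    return "".join(out)


text = "大亚湾"
tokens = bigrams_with_offsets(text)  # 大亚 at (0, 2), 亚湾 at (1, 3)

# What I would expect: both matching bigrams contribute, so the spans merge.
desired = highlight(text, [(start, end) for _, start, end in tokens])

# What I see (simulated): only the first bigram's offsets are used.
observed = highlight(text, [(tokens[0][1], tokens[0][2])])

print(desired)   # <em>大亚湾</em>
print(observed)  # <em>大亚</em>湾
```

In other words, the offsets of the two bigrams overlap by one character, and
the output I get looks as if the second span were dropped instead of merged
with the first.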


Is there some highlighting parameter that might fix this?

Tom Burton-West
