On Thu, Jan 24, 2013 at 10:53 AM, Jerome Lanneluc <jerome_lanne...@fr.ibm.com> wrote:
> It looks like my attachment was lost. It referred to
> org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer.
>

I think this analyzer will not properly tokenize text outside of the BMP: it pretty much only works for Simplified Chinese text (e.g. characters from the GB2312 range).
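
If it helps, one way to check whether your input contains any characters outside the BMP is to walk the string by code point instead of by char. This is only a rough sketch using plain JDK calls; the class name BmpCheck and the method countSupplementary are made up for illustration and are not part of Lucene.

```java
public class BmpCheck {

    /** Counts code points above U+FFFF, i.e. outside the Basic Multilingual Plane. */
    public static int countSupplementary(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                count++;
            }
            // Advance by 1 char for BMP code points, 2 for surrogate pairs.
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        // U+20000 is a CJK Extension B ideograph; in UTF-16 it is a surrogate pair.
        String sample = "中文" + new String(Character.toChars(0x20000));
        System.out.println("chars:       " + sample.length());
        System.out.println("code points: " + sample.codePointCount(0, sample.length()));
        System.out.println("outside BMP: " + countSupplementary(sample));
    }
}
```

If the count is non-zero for your documents, those are the characters I would expect to cause trouble, since they fall outside what the analyzer was built to handle.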