----- Reply message -----
From: "Shawn Heisey" <s...@elyograg.org>
To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
Subject: ICUTokenizer acting very strangely with oriental characters
Date: Tue, Aug 12, 2014 19:00

See the original message on this thread for full details.  Some
additional information:

This happens on versions 4.6.1, 4.7.2, and 4.9.0.  Here is a screenshot
showing the analysis problem in more detail.  The first line you can see
is the ICUTokenizer.

https://www.dropbox.com/s/9wbi7lz77ivya9j/ICUTokenizer-wrong-analysis.png

The original field value was:

20世紀の100人;ポートレートアーカイブス;政治家・軍人;政治家・指導者・軍人;[政 治],100peopeof20century,pploftwentycentury,pploftwentycentury

Thanks,
Shawn
