On 23/09/2015 16:23, Alexandre Rafalovitch wrote:
You may find the following articles interesting:
http://discovery-grindstone.blogspot.ca/2014/01/searching-in-solr-analyzing-results-and.html
(a whole epic journey)
https://dzone.com/articles/indexing-chinese-solr

The latter article is great and we drew on it when helping a recent client with Chinese indexing. However, if you do use Paoding, bear in mind that it has few, if any, tests and all the comments are in Chinese. We recently found a problem with it (it breaks the Lucene highlighters) and have submitted a patch: http://git.oschina.net/zhzhenqin/paoding-analysis/issues/1

Cheers

Charlie

Regards,
    Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 23 September 2015 at 10:41, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
Hi,

I would like to check: will StandardTokenizerFactory work well for indexing
both English and Chinese (bilingual) documents, or do we need tokenizers
that are customised for Chinese (e.g. HMMChineseTokenizerFactory)?
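
For reference, a rough sketch of how the two options in question might be declared as Solr field types. The field type names (text_std_cjk, text_zh_hmm) and the filter chains are illustrative assumptions, not a tested recommendation. StandardTokenizer emits each Han character as its own token, so it is usually paired with CJKBigramFilterFactory; HMMChineseTokenizerFactory segments Simplified Chinese into words and, as far as I know, needs the analysis-extras contrib (smartcn) jars on the classpath.

```xml
<!-- Option 1 (assumed name): StandardTokenizer splits Han text into single
     characters; CJKBigramFilter then joins adjacent ones into bigrams. -->
<fieldType name="text_std_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Option 2 (assumed name): HMM-based word segmentation for Simplified
     Chinese; English words in the same field are lowercased as usual. -->
<fieldType name="text_zh_hmm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.HMMChineseTokenizerFactory"/>
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Running the same bilingual sample through both types on the Analysis screen of the Solr admin UI is probably the quickest way to compare the tokens each one produces.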


Regards,
Edwin


--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk
