>-
>Uwe Schindler
>H.-H.-Meier-Allee 63, D-28213 Bremen
>http://www.thetaphi.de
>eMail: u...@thetaphi.de
>
>
>> -Original Message-
>> From: Wayne Xin [mailto:wayne_...@hotmail.com]
>> Sent: Friday, August 14, 2015 8:44 PM
> To: java-user@lucene.apache.org
> Subject: Re: getting full english word from tokenizing with
> SmartChineseAnalyzer
>
Thanks Michael. That works well. Not sure why SmartChineseAnalyzer is
final; otherwise we could override createComponents().
New output:
女 单 方面 王 适 娴 second seed 和 头号 种子 卫冕 冠军 西班牙 选手 马 林
first seed 同 处 1 4 区 3 号
种子 李 雪 芮 和 韩国 选手 korean player 成 池 铉 处在 2 4 区 不过 成 池 铉
先 要 过 日本 小将
japanese player
The easiest thing to do is to create your own analyzer, copy and paste the
code from org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer into it,
and remove the line in createComponents(String fieldName, Reader reader)
that says:

    result = new PorterStemFilter(result);
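
For anyone finding this thread later, here is a rough sketch of what such a copy might look like. It assumes Lucene 4.8 to 4.10 (the createComponents(String, Reader) signature quoted above) with the lucene-analyzers-smartcn jar on the classpath. The class name UnstemmedSmartChineseAnalyzer and the tokenize() helper are made up for illustration. It also leaves out the stop word filtering that the real SmartChineseAnalyzer applies, so punctuation will likely come through as tokens. Treat it as a starting point, not as the actual SmartChineseAnalyzer source.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.cn.smart.HMMChineseTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical name: a SmartChineseAnalyzer-like analyzer without the
// PorterStemFilter step, so English words are not stemmed.
final class UnstemmedSmartChineseAnalyzer extends Analyzer {

  @Override
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    // Same tokenizer SmartChineseAnalyzer uses on Lucene 4.8+ (assumption:
    // the Reader-taking constructor of the 4.x line is available).
    Tokenizer tokenizer = new HMMChineseTokenizer(reader);
    // No PorterStemFilter and, for brevity, no StopFilter either.
    return new TokenStreamComponents(tokenizer);
  }

  // Illustrative helper: run the analyzer over a string and collect the terms.
  static List<String> tokenize(String text) throws IOException {
    List<String> terms = new ArrayList<String>();
    try (Analyzer analyzer = new UnstemmedSmartChineseAnalyzer();
         TokenStream ts = analyzer.tokenStream("field", text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        terms.add(term.toString());
      }
      ts.end();
    }
    return terms;
  }
}
```

Calling UnstemmedSmartChineseAnalyzer.tokenize("日本 小将 japanese player") should then return the English words whole rather than as Porter stems. On Lucene 5.x the tokenizer constructor and createComponents() no longer take a Reader, so the two signatures would need adjusting there.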
On Fri, Aug 14,