Yes, the StandardAnalyzer interprets each Chinese character as one word. Better analyzers for Chinese are listed here: http://hi.baidu.com/lewutian/blog/item/ca61060a06914b1394ca6b25.html
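You can see the single-character tokenization by printing what the analyzer emits for the phrase from your mail. This is only a minimal sketch, assuming Lucene 3.1+ (where CharTermAttribute and the Version-based constructor are available); on 3.0 you would use TermAttribute instead:

import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class TokenizeDemo {
    // Print each token the analyzer produces for the given text.
    static void printTokens(Analyzer analyzer, String text) throws Exception {
        TokenStream ts = analyzer.tokenStream("content", new StringReader(text));
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        ts.reset();
        while (ts.incrementToken()) {
            System.out.print("[" + term.toString() + "] ");
        }
        ts.end();
        ts.close();
        System.out.println();
    }

    public static void main(String[] args) throws Exception {
        // StandardAnalyzer emits one token per Chinese character:
        // [佛] [山] [东] [方] [书] [城]
        printTokens(new StandardAnalyzer(Version.LUCENE_31), "佛山东方书城");
    }
}

That per-character tokenization is exactly why your unspaced query finds nothing while the query with blanks between the characters does.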
On July 1, 2010 at 5:19 PM, Kolhoff, Jacqueline - ENCOWAY <kolh...@encoway.de> wrote:
>
> Hi!
>
> We are using Lucene in our project to search through information objects,
> which works fine. For indexing we use the StandardAnalyzer.
> Now we have to support the Chinese language. I found out that Chinese
> words and characters are correctly saved in the index, but the query to
> search for them does not work. Example: in English the query is “text”,
> which we parse to “*text*”. If we search for Chinese words / phrases like
> “佛山东方书城”, the query is “*佛山东方书城*”, but there are no search results. If the
> query places blanks between the single characters / symbols like this
> “*佛 山 东 方 书 城*”, we get results. Does the StandardAnalyzer interpret each
> Chinese character as one word? What are the best practices for this case?
> Shall we use another analyzer (a Chinese analyzer)? Or is it better to
> replace the query parser in this case?
>
> Regards,
> Jacqueline.
>
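To make the unspaced query from the quoted mail match, the key point is to index and parse the query with the same Chinese-aware analyzer instead of wrapping the input in wildcards. A minimal sketch, assuming Lucene 3.1 and the contrib SmartChineseAnalyzer (CJKAnalyzer would work similarly); the field name "content" and the in-memory RAMDirectory are just for illustration:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class ChineseSearchDemo {
    public static void main(String[] args) throws Exception {
        Analyzer analyzer = new SmartChineseAnalyzer(Version.LUCENE_31);
        RAMDirectory dir = new RAMDirectory();

        // Index a document containing the phrase, using the same analyzer
        // that will later be used for query parsing.
        IndexWriter writer = new IndexWriter(dir,
                new IndexWriterConfig(Version.LUCENE_31, analyzer));
        Document doc = new Document();
        doc.add(new Field("content", "佛山东方书城", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.close();

        // Parse the unspaced user input with the same analyzer; no wildcards
        // and no manually inserted blanks are needed.
        QueryParser parser = new QueryParser(Version.LUCENE_31, "content", analyzer);
        Query query = parser.parse("佛山东方书城");

        IndexSearcher searcher = new IndexSearcher(dir);
        TopDocs hits = searcher.search(query, 10);
        System.out.println("hits: " + hits.totalHits);
        searcher.close();
    }
}

Because the query text is segmented into the same terms that were written to the index, the plain query 佛山东方书城 returns results without any changes to the query parser itself.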