Hi Tian & Wu,

I believe Nutch now supports CJK bi-gram segmentation.
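
To show what bi-gram segmentation means in practice, here is a minimal sketch in Java. It is not the actual Nutch or Lucene code: the class name CjkBigramSketch and the method tokenize are made up for illustration, and I am assuming that Han, Hiragana, Katakana and Hangul characters are the ones that get paired up, while other letters and digits are kept as whole lowercased words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the analyzer shipped with Nutch or Lucene.
final class CjkBigramSketch {

    // Assumption: these four Unicode scripts are treated as "CJK".
    static boolean isCjk(int cp) {
        Character.UnicodeScript s = Character.UnicodeScript.of(cp);
        return s == Character.UnicodeScript.HAN
            || s == Character.UnicodeScript.HIRAGANA
            || s == Character.UnicodeScript.KATAKANA
            || s == Character.UnicodeScript.HANGUL;
    }

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int[] cps = text.codePoints().toArray();
        int i = 0;
        while (i < cps.length) {
            if (isCjk(cps[i])) {
                // Collect a run of consecutive CJK characters.
                int start = i;
                while (i < cps.length && isCjk(cps[i])) {
                    i++;
                }
                if (i - start == 1) {
                    // A lone CJK character becomes a single-character token.
                    tokens.add(new String(cps, start, 1));
                } else {
                    // Emit every overlapping pair of adjacent characters.
                    for (int j = start; j + 1 < i; j++) {
                        tokens.add(new String(cps, j, 2));
                    }
                }
            } else if (Character.isLetterOrDigit(cps[i])) {
                // Non-CJK letters and digits are grouped into one lowercased word.
                int start = i;
                while (i < cps.length && !isCjk(cps[i])
                        && Character.isLetterOrDigit(cps[i])) {
                    i++;
                }
                tokens.add(new String(cps, start, i - start).toLowerCase(Locale.ROOT));
            } else {
                // Whitespace and punctuation only separate tokens.
                i++;
            }
        }
        return tokens;
    }
}
```

With this sketch, tokenize("中文搜索") yields the tokens 中文, 文搜 and 搜索. Because the pairs overlap, no dictionary is needed, at the cost of indexing some pairs that are not real words.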

/Jack

On 5/25/05, Transbuerg Tian <[EMAIL PROTECTED]> wrote:
> Hi wufuheng,
> 
> First:
> if you are using Lucene or Nutch to index Chinese content,
> I recommend WebLucene; you can find more information at
> http://www.chedong.com .
> Second:
> CJK sentence splitting is quite different. For Chinese, the best-known
> tool is ICTCLAS; you can search for it on Google.
>
> I have also written a Chinese sentence splitter, in both Java and C#.
> 
> You can get it at: http://www.domolo.com/tec/index.htm
> or email me at: [EMAIL PROTECTED]
>
> I hope this helps.
> 
> Transbuerg Tian
> Beijing, China
> http://www.domolo.com
> 
> 
> 
> 
> 2005/5/24, wu fuheng <[EMAIL PROTECTED]>:
> >
> > Dear all,
> > I think Nutch is a good wrapper for Lucene and comes with a good crawler.
> > Now I want to build a Chinese/Japanese/Korean language search
> > application. Should I start from Lucene or Nutch? How does Nutch
> > support CJK applications?
> > Sincerely yours,
> > Simon
> >
> 
>
