Hi Wu Fuheng,

First: if you are using Lucene or Nutch to index Chinese content, I recommend WebLucene. You can find more information at http://www.chedong.com .

Second: splitting CJK text is quite different from splitting English, because words are not separated by spaces. For Chinese, the best-known tool is ICTCLAS, which you can find through Google. I have also written a Chinese sentence splitter, in both Java and C#. You can get it at http://www.domolo.com/tec/index.htm or by writing to [EMAIL PROTECTED].

I hope this helps.

transbuerg tian
Beijing, China
http://www.domolo.com

2005/5/24, wu fuheng <[EMAIL PROTECTED]>:
> Dear all,
> I think Nutch is a good wrapper for Lucene and with a good crawler.
> Now if I want to build some Chinese/Japan/Korean Language search
> application. Should I start from Lucene or Nutch? How Nutch does
> support CJK application?
> Sincerely your,
> Simon
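
To illustrate why CJK text needs its own splitting step, here is a minimal sketch of overlapping-bigram tokenization, a common dictionary-free fallback for Chinese. It is not ICTCLAS and not the splitter offered in this message. The class name CjkBigrams and the helper isHan are hypothetical names chosen for the example, and it assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of Lucene, Nutch, WebLucene, or ICTCLAS.
class CjkBigrams {

    // True for code points in the Han script (Chinese characters, kanji, hanja).
    static boolean isHan(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN;
    }

    // Emits overlapping two-character tokens for each run of Han characters.
    // A run of length one is emitted as a single-character token.
    // Non-Han characters (Latin letters, digits, punctuation) are skipped.
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        int[] cps = text.codePoints().toArray();
        int i = 0;
        while (i < cps.length) {
            if (!isHan(cps[i])) {
                i++;
                continue;
            }
            int start = i;
            while (i < cps.length && isHan(cps[i])) {
                i++;
            }
            // The Han run covers the half-open range [start, i).
            if (i - start == 1) {
                tokens.add(new String(cps, start, 1));
            }
            for (int j = start; j + 1 < i; j++) {
                tokens.add(new String(cps, j, 2));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(bigrams("中文搜索"));
    }
}
```

Bigrams need no dictionary, at the cost of indexing some pairs that are not real words. A dictionary-based or statistical segmenter such as ICTCLAS would instead try to find the actual word boundaries.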
