I was browing NutchAnalysis.jj and found that
Hungul Syllables (U+AC00 ... U+D7AF; U+xxxx means
a Unicode character of the hex value xxxx) are not
part of LETTER or CJK class.  This seems to me that
Nutch cannot handle Korean documents at all.

Is anybody successfully using Nutch for Korean?

-kuro


-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid0944&bid$1720&dat1642
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general

Reply via email to