I was browsing NutchAnalysis.jj and found that Hangul Syllables (U+AC00 ... U+D7AF; U+xxxx means a Unicode character with the hex value xxxx) are not part of the LETTER or CJK class. It seems to me that Nutch cannot handle Korean documents at all.
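For anyone who wants to check which characters fall in that range, here is a minimal Java sketch. It is not Nutch code: the class HangulCheck and the method isHangulSyllable are names made up for illustration, and it only uses the JDK's Character.UnicodeBlock lookup, which says nothing about how NutchAnalysis.jj actually tokenizes.

```java
public class HangulCheck {
    // Hypothetical helper, not part of Nutch: returns true if the code point
    // lies in the Unicode "Hangul Syllables" block (U+AC00 ... U+D7AF).
    static boolean isHangulSyllable(int codePoint) {
        return Character.UnicodeBlock.of(codePoint)
                == Character.UnicodeBlock.HANGUL_SYLLABLES;
    }

    public static void main(String[] args) {
        // "\uD55C\uAE00" is the Korean word "Hangul"; the rest is plain ASCII.
        String sample = "\uD55C\uAE00 Nutch";
        sample.codePoints().forEach(cp ->
            System.out.printf("U+%04X %s%n", cp,
                isHangulSyllable(cp) ? "Hangul syllable" : "other"));
    }
}
```

Running it on a sample of Korean text shows which code points a tokenizer would need LETTER or CJK (or a new class) to cover.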
Is anybody successfully using Nutch for Korean? -kuro
