> So all that's needed is a Japanese, Chinese, and Korean text corpus to
> "train" the identifier?  Can the LanguageIdentifier properly handle
> multi-byte character sets?

In theory, yes, but I have not yet tested it.
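
A rough sketch of why it should work in theory, assuming the identifier builds its n-gram profiles from decoded characters rather than from raw bytes. In that case a multi-byte character set only matters at decode time. The function names below are hypothetical and are not the LanguageIdentifier API; this is an illustration, not the actual implementation.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count character n-grams over decoded text (code points, not bytes)."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of n characters across the text.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def profile_from_bytes(raw, encoding):
    """Hypothetical helper: decode first, then build the n-gram profile."""
    return char_ngrams(raw.decode(encoding))


sample = "日本語のテキスト"

# The same text in two different multi-byte encodings gives the same
# profile, because the n-grams are taken after decoding.
utf8_profile = profile_from_bytes(sample.encode("utf-8"), "utf-8")
sjis_profile = profile_from_bytes(sample.encode("shift_jis"), "shift_jis")
print(utf8_profile == sjis_profile)
```

Whether profiles trained this way separate Japanese, Chinese, and Korean well in practice is the part that still needs testing.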

Jérôme


--
http://motrech.free.fr/
http://www.frutch.org/
