https://issues.apache.org/bugzilla/show_bug.cgi?id=47726



--- Comment #3 from Peter S. Housel <hou...@acm.org> 2009-08-31 10:19:51 PDT ---
(In reply to comment #2)
> The Unicode UAX#14 indicates that proper line breaking for the Thai language
> involves morphological analysis in order to determine word boundaries. The
> standard considered this as too complex and left it to the "higher levels
> of processing".
> The libthai project (http://linux.thai.net/projects/libthai) produces open
> source software for this purpose, written in C/C++, which is used by Mozilla,
> Gnome applications and other OSS. Apparently, Java applications aren't as
> easily supported, yet.

The com.ibm.icu.text.ThaiBreakIterator class in recent versions of ICU4J can
supposedly perform this word-boundary analysis. It uses a bundled dictionary
of Thai words to locate valid break points.
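
For illustration, here is a minimal sketch of how candidate line-break
positions can be collected with a break iterator. It uses the JDK's
java.text.BreakIterator rather than ICU4J so that it runs without extra
jars; that substitution is an assumption made for this sketch, not something
tested against FOP. As far as I know, com.ibm.icu.text.BreakIterator offers
the same getLineInstance/setText/first/next pattern, so changing the import
should be most of the work once ICU4J is on the classpath. The class and
method names (LineBreakSketch, breakPositions) are hypothetical, and whether
Thai text is actually split at word boundaries depends on the dictionary
data available to the runtime.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, for illustration only.
class LineBreakSketch {

    // Returns every boundary offset the line-break iterator reports for
    // the text, including offset 0 and the end of the text.
    static List<Integer> breakPositions(String text, Locale locale) {
        BreakIterator it = BreakIterator.getLineInstance(locale);
        it.setText(text);
        List<Integer> positions = new ArrayList<>();
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            positions.add(pos);
        }
        return positions;
    }

    public static void main(String[] args) {
        // "Thai language" written in Thai script, as Unicode escapes so the
        // source file stays ASCII-only.
        String thai = "\u0e20\u0e32\u0e29\u0e32\u0e44\u0e17\u0e22";
        System.out.println(breakPositions(thai, Locale.forLanguageTag("th")));
    }
}
```

A layout engine would treat the returned offsets as permitted break
opportunities and choose among them when filling a line.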

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
