Hi,

From: srintuar26 <[EMAIL PROTECTED]>
Subject: Re: supporting XIM
Date: Sun, 30 Mar 2003 19:25:41 -0500

> > - Tcl/Tk's XIM support is unstable even now. (Every time I try to
> >   input Japanese, it sticks). When I read Tcl/Tk's roadmap in
> >   version 8.0 age, I was really surprised that XIM support (essential
> >   for CJK, as you know) is very low priority.
>
> eh, XIM needs to be dropped imo. From personal observation, building
> tools such as XIM and IIIMF which are integrated into the X server is
> the wrong way to go, and GTK+ input methods seem to work much better.

Why is it wrong? Anyway, CJK people have been waiting for years. No
more vaporware. Note that Tcl/Tk-based software which needs text
input is not usable at all because of this problem.

> > - Text line wrapping. Chinese and Japanese (not Korean) don't use
> >   whitespace between "words".
>
> Ooh, that makes me curious: is there a good discussion of how to
> line-break Japanese text? I wonder how browsers are doing it...

The absence of spaces in Chinese and Japanese causes problems for
text search systems such as mnoGoSearch. The mnoGoSearch developer
team now seems to be thinking about using ChaSen to analyze Japanese
text (though ChaSen doesn't support Chinese). Also, I cannot imagine
a Japanese dictionary for ispell.

A line break in Japanese can be placed almost anywhere, except before
several symbols (like kuten and touten, which are like the period and
comma in English sentences). Japanese sentences also often contain
Latin letters (for example, there are many companies whose names are
written in Latin letters, like SONY, NEC, and so on) and whitespace.
Note that an LF code in the original Japanese text must not be
regarded as a space (don't insert a space when connecting Japanese
lines).

However, Thai is much more difficult. It doesn't use whitespace
between words, but line breaks must fall on word boundaries. This
means that a Thai dictionary is needed to achieve correct
line-breaking for Thai.

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
