On Tue, Feb 26, 2002 at 09:42:25AM +0900, Tomohiro KUBOTA wrote:
> > Kanji appear to be getting collated, however:
> > 
> > 05:13pm [EMAIL PROTECTED]/2 [~] sort
> > 日本
> > 綺麗
> > 日本
> > (eof)
> > 日本
> > 日本
> > 綺麗
> > 
> > (I couldn't tell if that's the correct collation order, but it's clear
> > they're being reordered, where the hiragana above are not.)
> 
> It is impossible to collate Kanji with simple functions such
> as strcoll(), because in most cases a single Kanji has several
> readings depending on context (or word).  (This is the case for
> Japanese.)  (It is technically all but impossible; it would need a
> natural language understanding algorithm.)

I'm not concerned about the collation order of Kanji.  (It's probably
useful that there be one, even if it's just UCS order, to allow e.g.
"sort | uniq".)  There does seem to be collation for Kanji; I showed
this to contrast it with the hiragana.

The question was, why aren't katakana and hiragana getting collated?  As
far as I can tell, they should be.
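
For anyone who wants to check this without going through sort(1), here is
a minimal sketch in Python.  It compares plain code point order against
the order given by the locale's LC_COLLATE rules, using locale.strxfrm()
(the Python wrapper around the C strxfrm/wcsxfrm family, which is the
same machinery as strcoll()).  The helper names and the sample strings
are my own, purely for illustration; whether a given locale actually
reorders kana depends on the locale data installed, so I make no claim
about what it prints.

```python
import locale


def sort_by_codepoint(words):
    """Sort by raw Unicode code point order (no locale involved)."""
    return sorted(words)


def sort_by_locale(words):
    """Sort according to the current LC_COLLATE setting.

    locale.strxfrm() transforms each string so that ordinary comparison
    of the results follows the locale's collation rules.
    """
    return sorted(words, key=locale.strxfrm)


def is_reordered(words):
    """True if the locale order differs from code point order."""
    return sort_by_locale(words) != sort_by_codepoint(words)


if __name__ == "__main__":
    try:
        # "" means: take the collation locale from the environment
        # (LC_ALL, LC_COLLATE, LANG).
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        # The locale named in the environment is not installed;
        # fall back to the "C" locale.
        locale.setlocale(locale.LC_COLLATE, "C")

    # Hypothetical sample input: mixed hiragana and katakana.
    samples = ["ア", "あ", "カ", "か"]
    print("code point order:", sort_by_codepoint(samples))
    print("locale order:    ", sort_by_locale(samples))
    print("reordered by locale:", is_reordered(samples))
```

If the two orders come out identical for kana but differ for Kanji, that
would match what I'm seeing with sort.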

-- 
Glenn Maynard
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
