On Tue, Feb 26, 2002 at 09:42:25AM +0900, Tomohiro KUBOTA wrote:
> > Kanji appear to be getting collated, however:
> >
> > 05:13pm [EMAIL PROTECTED]/2 [~] sort
> > 日本
> > 綺麗
> > 日本
> > (eof)
> > 日本
> > 日本
> > 綺麗
> >
> > (I couldn't tell if that's the correct collation order, but it's clear
> > they're being reordered, where the hiragana above are not.)
>
> It is impossible to collate Kanji by using simple functions such
> as strcoll(), because one Kanji has several readings depending on
> context (or word) in most cases.  (This is Japanese case).
> (It is technically virtually impossible.  It will need natural
> language understanding algorithm.)
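
As a rough illustration of the two orderings at issue in the quoted exchange, here is a minimal Python sketch that sorts lines either by plain code point (UCS) order or through the C library's strcoll() via Python's locale module.  The helper names sort_lines and uniq are made up for this example, and nothing here claims to reproduce what sort(1) or any particular locale actually does with kana or Kanji.

```python
import locale
from functools import cmp_to_key


def sort_lines(lines, locale_name=None):
    """Sort lines by code point, or by a named locale's collation rules.

    Hypothetical helper: with locale_name=None the order is plain code
    point (UCS) order; otherwise the comparison is delegated to strcoll().
    """
    if locale_name is None:
        return sorted(lines)
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        # Raises locale.Error if the named locale is not installed.
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(lines, key=cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


def uniq(sorted_lines):
    """Drop adjacent duplicates, in the manner of uniq(1)."""
    out = []
    for line in sorted_lines:
        if not out or out[-1] != line:
            out.append(line)
    return out


if __name__ == "__main__":
    sample = ["日本", "綺麗", "日本"]
    print(sort_lines(sample))        # code point order
    print(uniq(sort_lines(sample)))  # the "sort | uniq" case
```

Passing a locale name such as "ja_JP.UTF-8" as the second argument (assuming that locale is installed) would show how that locale's strcoll() orders the same input; the result depends entirely on the locale data on the system.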

I'm not concerned about the collation order of Kanji.  (It's probably
useful that there be one, even if it's just UCS order, to allow e.g.
"sort | uniq".)  There does seem to be collation for Kanji; I showed this
only to contrast it with hiragana.

The question was: why aren't katakana and hiragana getting collated?  As
far as I can tell, they should be.

--
Glenn Maynard
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
