On Mon, 3 Dec 2012 19:34:01 +0100 ольга крыжановская wrote:
> Glenn, as context, see http://www.unicode.org/faq/collation.html and
> http://www.unicode.org/reports/tr10/ (Unicode Technical Standard #10:
> Unicode Collation Algorithm).
> I think, because the libc locale modules do implement the collation
> sorting, the question is, do the UTF-8 locales implement the Unicode
> standard collation algorithm?
> This might be a question for Dr. Fink.
>
> On Mon, Dec 3, 2012 at 6:25 PM, Cedric Blancher
> <[email protected]> wrote:
> > On 3 December 2012 18:16, Glenn Fowler <[email protected]> wrote:

[ stuff elided ]

> >> I can't fathom reliable usage of -C in portable scripts
> >
> > Can you fathom reliable usage of tr -C when the locale is using UTF-8
> > encoding and follows Unicode standard conventions, i.e. the Unicode
> > standard collation order?

"portable" in this context means "for all possible inputs"
which includes all locales, among them locales with different codesets
and collation orders
so with a fixed codeset and collation order I can fathom it, but it's not portable

perhaps someone can critique my understanding of codeset vs collation
I believe these to be independent

$ LC_COLLATE=C sort <<<$'a\nA\nb\nB'
A
B
a
b
$ LC_COLLATE=en_US sort <<<$'a\nA\nb\nB'
a
A
b
B

so, w.r.t. posix, how does the "standard collation order of a codeset"
have any bearing on LC_COLLATE, which seems to be a language/locale issue
rather than a codeset issue

_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers
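
The two sort runs above depend on an en_US locale being installed. As a rough sketch of the same point that needs only the C locale, the script below orders the same four ASCII lines in two ways: plain byte order, and a case-folded order built with sort -f. The variable names are made up for illustration, and sort -f is a stand-in for a language-style collation, not what en_US actually does.

```shell
#!/bin/sh
# Same four lines, same codeset (ASCII bytes), two different orderings.
input='a
A
b
B'

# Byte-value order, which is what the C locale collation gives:
# every uppercase ASCII letter sorts before every lowercase one.
codepoint_order=$(printf '%s\n' "$input" | LC_ALL=C sort)

# Case-folded order over the same bytes: -f folds lowercase to uppercase
# for the comparison, and lines that then compare equal fall back to
# byte order, so "A" lands directly before "a".
folded_order=$(printf '%s\n' "$input" | LC_ALL=C sort -f)

# Unquoted expansion joins the sorted lines with single spaces.
printf 'byte order:   %s\n' "$(echo $codepoint_order)"
printf 'folded order: %s\n' "$(echo $folded_order)"
```

Nothing about the encoding changes between the two runs, only the comparison rule, which is the sense in which codeset and collation look independent here.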
