On Mon, 3 Dec 2012 19:34:01 +0100 ольга крыжановская wrote:
> Glenn, as context, see http://www.unicode.org/faq/collation.html and
> http://www.unicode.org/reports/tr10/ (Unicode Technical Standard #10:
> Unicode Collation Algorithm).

> I think, because the libc locale modules do implement the collation
> sorting, the question is, do the UTF-8 locales implement the Unicode
> standard collation algorithm?
> This might be a question for Dr. Fink.

> On Mon, Dec 3, 2012 at 6:25 PM, Cedric Blancher
> <[email protected]> wrote:
> > On 3 December 2012 18:16, Glenn Fowler <[email protected]> wrote:

[ stuff elided ]

> >> I can't fathom reliable usage of -C in portable scripts
> >
> > Can you fathom reliable usage of tr -C when the locale is using UTF-8
> > encoding and follows Unicode standard conventions, i.e. the Unicode
> > standard collation order?

"portable" in this context means "for all possible inputs", which includes
every locale, whatever its codeset and collation order

so with a fixed codeset and collation order I can fathom it, but it's not portable

perhaps someone can critique my understanding of codeset vs collation:
I believe these to be independent

        $ LC_COLLATE=C sort <<<$'a\nA\nb\nB'      
        A
        B
        a
        b
        $ LC_COLLATE=en_US sort <<<$'a\nA\nb\nB'  
        a
        A
        b
        B

so, w.r.t. POSIX, how does the "standard collation order of a codeset" have any
bearing on LC_COLLATE, which seems to be a language/locale issue rather than
a codeset issue?
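
for anyone who wants to reproduce the comparison above without the ksh-style
<<< here-string, here is a minimal sketch. the helper name sort_with is
hypothetical, and it assumes nothing about which locales are installed: only
the C result is predictable everywhere, and locale names (en_US, en_US.UTF-8,
en_US.utf8) vary between systems.

```shell
# Sort the same four lines under a given collation locale.
# LC_ALL is emptied so that LC_COLLATE is actually consulted;
# a non-empty LC_ALL would override it.
sort_with() {
    printf '%s\n' a A b B | LC_ALL= LC_COLLATE="$1" sort
}

# The C (POSIX) locale collates by byte value, so uppercase sorts first.
sort_with C

# List candidate locales before relying on one (names differ per system).
locale -a 2>/dev/null | grep -i '^en_US' || echo "no en_US locale found"
```

calling sort_with with one of the listed names (for example en_US.UTF-8, if
present) should show the collation order changing while the input bytes stay
the same. if the named locale is not installed, sort will typically fall back
to C collation rather than fail.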

_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers
