Hi Ken,

> > (3) assume charset=utf-8 (maybe allow this to be overridden in
> > profile)
>
> We already do (1) and (2).  (3) is the problem.  Other people who have
> thoughts on this topic are free to weigh in.  Personally, I believe
> that if you're doing LANG=C, you shouldn't be dealing with any 8-bit
> characters at all.  Isn't that what that means?

Agreed.  I eventually moved from LC_ALL=C to LANG=en_GB.utf8 and it
isn't too painful these days.  GNU grep and others have worked on the
performance hit they initially had.  For those times when I do want a
command, e.g. sort(1), to run in the C locale, I use:

    $ cat ~/bin/C
    #! /bin/sh

    # LC_ALL has precedence over LANG according to POSIX[1], but we may as
    # well stamp out any traces by setting LANG too.
    # 1.  The Open Group Base Specifications, Ch. 8 Environment Variables.

    LC_ALL=C LANG=C exec -- "$@"
    $
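
As a rough sketch of what the wrapper buys you, here is the same idea
as a shell function so the example is self-contained.  The function
form and the sample input are only illustrative assumptions; the
script above is what I actually use.  In the C locale sort(1) compares
bytes, so uppercase ASCII sorts before lowercase.

```sh
#!/bin/sh
# Hypothetical function equivalent of the ~/bin/C script, defined here
# only so that the example runs on its own.
C() { LC_ALL=C LANG=C "$@"; }

# In the C locale sort(1) orders by byte value: 'A' and 'B' (0x41, 0x42)
# come before 'a' and 'b' (0x61, 0x62).
sorted=$(printf '%s\n' b A a B | C sort)
printf '%s\n' "$sorted"
```

With the script installed on PATH the equivalent is just `C sort file'.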

BTW, WRT spotting multi-byte UTF-8 encoding, I don't think that's a
goer.  Valid UTF-8 and valid GB2312 can share the same byte sequences,
especially if it's just the odd `£' or `拢' in ASCII text; both of
those are the bytes 0xC2 0xA3.
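
To make the ambiguity concrete, here is a small sketch that decodes
the same two bytes, 0xC2 0xA3, under both encodings.  It assumes an
iconv(1) that accepts the codeset names UTF-8 and GB2312; POSIX leaves
those names to the implementation, so they may be spelt differently on
your system.

```sh
#!/bin/sh
# The two bytes 0xC2 0xA3, written as octal escapes for printf(1).
bytes='\302\243'

# Decode as UTF-8.  iconv exits non-zero if the input is not valid in
# the source encoding, so success here means the bytes are valid UTF-8.
as_utf8=$(printf "$bytes" | iconv -f UTF-8 -t UTF-8) &&
    echo "valid UTF-8:  $as_utf8"

# Decode the very same bytes as GB2312, re-encoding the result as
# UTF-8 so that it can be shown on a UTF-8 terminal.
as_gb=$(printf "$bytes" | iconv -f GB2312 -t UTF-8) &&
    echo "valid GB2312: $as_gb"
```

Both conversions succeed, so the validity of the bytes alone can't
tell you which charset the sender meant.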

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy

_______________________________________________
Nmh-workers mailing list
Nmh-workers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
