Re: AUTHORS list and the C locale on Mac OS X

Reece Dunn Wed, 10 Nov 2010 15:01:45 -0800

On 10 November 2010 22:45, Ken Thomases <k...@codeweavers.com> wrote:
> On Nov 10, 2010, at 2:27 PM, Hin-Tak Leung wrote:
>
>> --- On Wed, 10/11/10, Ken Thomases <k...@codeweavers.com> wrote:
>>
>>> Are you sure about that?  Checking on a couple of
>>> Linux systems here, the "locale" command reports:
>>>
>>> $ locale
>>> LANG=en_US.UTF-8
>>> LC_CTYPE="en_US.UTF-8"
>>> ...
>>
>> mine (fedora x86_64) does the utf8 thing:
>>
>> # locale
>> LANG=en_GB.utf8
>> LC_CTYPE="en_GB.utf8"
>> ...
>>
>> so there is some truth in the reporter's assertion - what it means is that 
>> it varies between different linux'es!!!
>
> I should have been clearer.  The output just reflects your environment.  So, 
> you have LANG set to en_GB.utf8.  I had LANG set to en_US.UTF-8.  My only 
> point was to say that the "UTF-8" form is acceptable.  It was not to suggest 
> that "utf8" is not, nor that one or the other is a standard.
>
> The real question is: does the Linux C library accept 'UTF-8' in the 
> environment variables?  I believe it does, which is useful because that's 
> what Mac OS X requires.  (It doesn't accept "utf8".)
>
> For example, the following reports just fine on some Linux systems here:
>
> LC_ALL=en_GB.UTF-8 locale
>
> As does your case:
>
> LC_ALL=en_GB.utf8 locale
>
> But the following both produce some diagnostics indicating that the C library 
> is choking on the value:
>
> LC_ALL=en_GB.bogus locale
> LC_ALL=en_GB.UTF-9 locale
>
> I take this to mean it's a legitimate test of whether a value is valid.  
> Further, it indicates that (at least some) Linuxes take either form.


I'm getting the same behaviour (Ubuntu 10.10) -- LC_ALL accepts either
utf8 or UTF-8 for en_GB, en_IE, etc. The caveat here is that the
primary locale needs to exist (and presumably needs to have a UTF-8
variant present).

That is, as I don't have a French locale (fr_FR) installed on my
machine, the following reports errors:

LC_ALL=fr_FR.UTF-8 locale

This means that systems that don't have the English locale installed
(en_US or en_GB, whichever is chosen) will still fail.
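
As a rough sketch of the check Ken describes above, the function below
treats any diagnostic on stderr from `locale` as a sign that the C
library rejected the value. The name locale_is_usable is made up for
illustration, and the approach assumes glibc-style behaviour where an
unusable value produces a warning; other platforms may stay silent.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if `locale` runs without diagnostics
# when LC_ALL is set to the given value. Assumes the C library reports
# an unusable value on stderr, as observed on the Linux systems above.
locale_is_usable() {
    # Keep stderr, discard stdout; `env` confines the setting to the child.
    err=$(env LC_ALL="$1" locale 2>&1 >/dev/null)
    [ -z "$err" ]
}
```

With this, `locale_is_usable fr_FR.UTF-8` would fail on my machine,
while `locale_is_usable en_GB.utf8` would succeed.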

What is wrong with iterating over the content of `locale -a` or
`locale -a | grep -F utf8` to find a UTF-8 based locale? Or even:

LC_ALL=`locale -a | grep -F utf8 | head -n 1` sed ... authors.c
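
A slightly more defensive sketch of the same idea: `locale -a` spells
the codeset as "utf8" on the Linux systems in this thread, but other
systems may list it as "UTF-8", so the match below is case-insensitive
and allows an optional hyphen. pick_utf8_locale is a hypothetical name;
it reads locale names on stdin so it can be tried without depending on
what is installed.

```shell
#!/bin/sh
# Hypothetical helper: read locale names on stdin (one per line) and
# print the first one whose codeset is UTF-8, in either spelling.
pick_utf8_locale() {
    # Match ".utf8" or ".UTF-8" at the end of the name, or just before
    # an "@modifier" suffix.
    grep -i -E '\.utf-?8(@|$)' | head -n 1
}
```

For example, `locale -a | pick_utf8_locale` prints the first match, or
nothing if no UTF-8 locale is installed, in which case the caller needs
a fallback.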

- Reece
