Ned Deily added the comment:
I've looked at this a bit, primarily on OS X 10.9 Mavericks, although I expect
mostly similar behavior on earlier recent releases of OS X. On 10.9, the setting
of locale variables is done by whatever program is used to launch a shell. I
looked at the behavior of the built-in Terminal.app, the third-party
iTerm2.app, the MacPorts distribution of xterm, and the built-in sshd. By
default, the latter two do not set any locale env variables. Both Terminal.app
and iTerm2.app set either LANG or LC_CTYPE based on the user's settings for
"Region" and "Preferred Language" in the "System Preferences" -> "Language &
Region" control panel. Three examples:
1. "Region" = "United States", "Preferred Language" = "English":
-> LANG=en_US.UTF-8
2. "Region" = "Germany", "Preferred Language" = "German"
-> LANG=de_DE.UTF-8
3. "Region" = "Germany", "Preferred Language" = "English"
-> LC_CTYPE="UTF-8"
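To make the three cases concrete, here is a minimal sketch of how a program
started from such a shell could work out which variable ends up governing
LC_CTYPE. It assumes the usual POSIX precedence (LC_ALL, then the category
variable, then LANG, with an empty value treated as unset). The helper name
effective_ctype is made up for this illustration and is not part of the locale
module, and the "C" fallback is an assumption, since the default is
implementation-defined.

```python
import os

def effective_ctype(environ=None):
    """Return the (variable, value) pair that determines LC_CTYPE."""
    env = os.environ if environ is None else environ
    # POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # unset and empty are treated alike
            return name, value
    # Nothing is set: assume the "C" locale (hypothetical fallback).
    return None, "C"

# The three Terminal.app examples above, expressed as environments.
for env in ({"LANG": "en_US.UTF-8"},
            {"LANG": "de_DE.UTF-8"},
            {"LC_CTYPE": "UTF-8"}):
    print(effective_ctype(env))
```

Only the third environment yields a bare codeset name with no language or
region part.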
So it is almost certainly the last case that is under discussion here. Whether
or not that is a bug is not as clear as it might seem at first. BSD
implementations of locale differ from the GNU/Linux version. Both FreeBSD and
OS X define a "UTF-8" locale that has only one locale category defined in it:
LC_CTYPE. It appears to be a fallback locale used when there is no applicable
region / language combination, in this case no "en_DE*" locales.
$ ls /usr/share/locale/UTF*
LC_CTYPE
Compare with the en_US* locales:
$ ls /usr/share/locale/en_US*
/usr/share/locale/en_US:
LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME
/usr/share/locale/en_US.ISO8859-1:
LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME
/usr/share/locale/en_US.ISO8859-15:
LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME
/usr/share/locale/en_US.US-ASCII:
LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME
/usr/share/locale/en_US.UTF-8:
LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY LC_NUMERIC LC_TIME
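The same comparison can be scripted. The following is a rough sketch, assuming
the locale data lives under /usr/share/locale as in the listings above (other
systems may keep it elsewhere); locale_categories is a hypothetical helper
name, not an existing API.

```python
from pathlib import Path

def locale_categories(name, root="/usr/share/locale"):
    """List the LC_* entries present in one locale directory."""
    directory = Path(root) / name
    if not directory.is_dir():
        # No such locale directory, e.g. a region/language pair with no data.
        return []
    # Entries may be files, directories or symlinks; only the names matter here.
    return sorted(p.name for p in directory.iterdir()
                  if p.name.startswith("LC_"))

if __name__ == "__main__":
    for name in ("UTF-8", "en_US.UTF-8", "en_DE.UTF-8"):
        print(name, locale_categories(name))
```

The output depends entirely on what is installed on the machine where it runs.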
Now as I read the current POSIX standard, there is nothing wrong with this.
AFAICT, the standard places no restriction on the format of locale names; in
particular, it does not mandate that they conform to RFC 1766 or its
successors. Further, the standard provides for implementation-specific locales
(other than the mandatory "POSIX" aka "C" locale) and some platforms provide
tools to create custom locales, e.g. mklocale(1) on FreeBSD and OS X,
localedef(1) on GNU/Linux. So I wonder if the locale module should really be
imposing its own restrictions on locale names as it does currently.
From IEEE Std 1003.1, 2013 Edition:
"The capability to specify additional locales to those provided by an
implementation is optional, denoted by the _POSIX2_LOCALEDEF symbol. If the
option is not supported, only implementation-supplied locales are available.
Such locales shall be documented using the format specified in this section.
[...] The locale definition file shall contain one or more locale category
source definitions, and shall not contain more than one definition for the same
locale category. [...] In the event that some of the information for a locale
category, as specified in this volume of POSIX.1-2008, is missing from the
locale source definition, the behavior of that category, if it is referenced,
is unspecified."
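One way to separate what the platform accepts from what the locale module's
own name handling accepts is to ask the C library directly through
locale.setlocale(). This is only a sketch: c_library_accepts is a made-up
helper name, and the answer for a name like "UTF-8" depends on the platform,
so no result is claimed here.

```python
import locale

def c_library_accepts(name, category=locale.LC_CTYPE):
    """Return True if setlocale() accepts `name` for `category`."""
    saved = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, name)
    except locale.Error:
        return False
    else:
        return True
    finally:
        # Always put the previous setting back so the probe has no side effects.
        locale.setlocale(category, saved)

if __name__ == "__main__":
    for name in ("C", "UTF-8", "en_US.UTF-8"):
        print(name, c_library_accepts(name))
```

Because setlocale() changes process-wide state, a probe like this is best kept
out of multithreaded code.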
There is a further complication for OS X. Apple provides a richer native API
for locales, CFLocale (and its Cocoa equivalent, NSLocale). So some nuances
may get lost in the imperfect mapping between CFLocale and the conventional
LC_* environment variables and between them and Python. We could look at
trying to support the native APIs as well.
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07
https://developer.apple.com/library/mac/documentation/CoreFoundation/Conceptual/CFLocales/CFLocales.html
https://developer.apple.com/library/mac/documentation/CoreFoundation/Reference/CFLocaleRef/Reference/reference.html
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue18378>
_______________________________________