Eric Blake wrote:
> Still, if we are advocating mixed locale
> execution, we MUST ensure sane defaults for ALL of the LC_* variables.

In theory all the LC_* variables, yes. In practice, only those locale categories
that the program actually uses (e.g. test-quotearg.c), which rarely go
beyond LC_CTYPE, LC_NUMERIC, LC_COLLATE.

But whether that makes 3 variables that you have to set/unset, or 12,
is irrelevant. My point is that it's simpler, both conceptually and
in practice, to define a self-contained locale specification (whether
it takes 15 characters or 200 doesn't matter) than to define half of
it and then stumble across issues.
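
To illustrate what I mean by a self-contained specification, here is a
rough sketch in Python. The helper name self_contained_locale_env and its
parameters are made up for this example and are not part of gnulib; the
list of 12 categories assumes glibc (POSIX itself defines only the first 6).
It drops LC_ALL, which would otherwise override every per-category setting,
and then sets each category explicitly, so that nothing is inherited from
the caller's environment.

```python
# The six locale categories that POSIX defines.
POSIX_CATEGORIES = [
    "LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
    "LC_MONETARY", "LC_NUMERIC", "LC_TIME",
]
# Additional categories known to glibc (assumption: setting them is
# harmless on systems that ignore them).
GLIBC_EXTRA_CATEGORIES = [
    "LC_ADDRESS", "LC_IDENTIFICATION", "LC_MEASUREMENT",
    "LC_NAME", "LC_PAPER", "LC_TELEPHONE",
]
ALL_CATEGORIES = POSIX_CATEGORIES + GLIBC_EXTRA_CATEGORIES


def self_contained_locale_env(base_env, default="C", overrides=None):
    """Return a copy of base_env with every locale variable set explicitly.

    Hypothetical helper, for illustration only.
    """
    env = dict(base_env)
    # LC_ALL takes precedence over all LC_* variables, so it must go.
    env.pop("LC_ALL", None)
    # LANGUAGE is a GNU gettext extension that affects message lookup.
    env.pop("LANGUAGE", None)
    env["LANG"] = default
    for name in ALL_CATEGORIES:
        env[name] = default
    for name, value in (overrides or {}).items():
        if name not in ALL_CATEGORIES:
            raise ValueError("not a locale category: " + name)
        env[name] = value
    return env
```

A test driver could then pass the result as the env argument of
subprocess.run, for example with overrides={"LC_CTYPE": "C.UTF-8"} on
platforms where that locale exists.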

> We already have problems with Python refusing to import UTF-8 data in
> LC_ALL=C environments (which is arguably a bug in python, since POSIX
> says locale "C" is 8-bit clean and therefore cannot cause encoding
> errors).  C.UTF-8 does not exist everywhere, but does appear to shut up
> the python problem when mixed with LANG=C, at least on the platforms
> where the problems are encountered in the first place.

So, there is not even a universally good value of LC_ALL!
C programs linked against libc need one set of variables; Python programs
another set; Java programs another set, etc.

I am in favour of putting common "default code" or fallbacks into gnulib,
when there is no doubt that
  1. it is the correct default,
  2. it actually significantly helps the projects that use gnulib.
This is not the case here:
  1. You have shown that LC_ALL=C or LANG=C is not optimal for Python.
  2. The self-contained locale specification fits in 200 characters.

Bruno

