Hi,
Remi Locherer wrote on Sat, Mar 16, 2019 at 11:26:54PM +0100:
> On Sat, Mar 16, 2019 at 09:48:21PM +0000, Stuart Henderson wrote:
>> On 2019/03/16 22:12, Remi Locherer wrote:
>>> Index: lang/python/python.port.mk
>>> ===================================================================
>>> RCS file: /cvs/ports/lang/python/python.port.mk,v
>>> retrieving revision 1.100
>>> diff -u -p -r1.100 python.port.mk
>>> --- lang/python/python.port.mk 4 Dec 2018 05:57:31 -0000 1.100
>>> +++ lang/python/python.port.mk 16 Mar 2019 20:40:34 -0000
>>> @@ -150,6 +150,7 @@ CONFIGURE_ENV += PYTHON="${MODPY_BIN}"
>>> CONFIGURE_ENV += ac_cv_prog_PYTHON="${MODPY_BIN}" \
>>> ac_cv_path_PYTHON="${MODPY_BIN}"
>>> .endif
>>> +TEST_ENV += LC_CTYPE=C.UTF-8
>> Do we actually support LC_CTYPE=C.UTF-8?
We do not recommend it; the locale(1) manual page only recommends
either leaving LC_* unset or setting LC_CTYPE=en_US.UTF-8.
But we do support it, just like we support LC_CTYPE=FooBar.UTF-8.
The locale(1) manual page says:
  If the value of LC_CTYPE ends in ".UTF-8", programs in the OpenBSD
  base system ignore the beginning of it, treating for example
  zh_CN.UTF-8 exactly like en_US.UTF-8.  Programs from packages(7)
  may however make a difference.
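
The rule quoted above can be modelled in a few lines of Python. This is only a sketch of the behaviour the manual page describes, not the actual libc code; the function name base_system_ctype is hypothetical, and mapping every other value to "C" is an assumption made to keep the example small.

```python
def base_system_ctype(lc_ctype: str) -> str:
    """Model the suffix rule quoted from locale(1).

    Hypothetical helper for illustration only; it is not how the
    OpenBSD C library is implemented.
    """
    # Only the ".UTF-8" suffix matters; the part before it is ignored.
    if lc_ctype.endswith(".UTF-8"):
        return "UTF-8"
    # Assumption of this sketch: everything else behaves like "C".
    return "C"


# All three names select the same character encoding under this rule.
for name in ("en_US.UTF-8", "zh_CN.UTF-8", "C.UTF-8"):
    print(name, "->", base_system_ctype(name))
```

Under this model, C.UTF-8 and FooBar.UTF-8 are indistinguishable from en_US.UTF-8, which matches the statement that the base system treats them alike.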
Theoretically, we might stop supporting LC_CTYPE=FooBar.UTF-8 at some
point in the future, though i do not expect that.  But i don't think
we could ever stop supporting
LC_CTYPE=C.UTF-8. It seems too widespread in practice, and removing
support for it would no doubt break more than one thing in ports.
> According to "locale -a" we do.
The -a option of locale(1) is a scam. It should never be used as
an argument for or against anything. Its only purpose is to appease
ports that insist on inspecting it, but even for that purpose, it
is far from perfect.
> I proposed C.UTF-8 because I think this is what python prefers
> after skimming over https://www.python.org/dev/peps/pep-0538/ .
If that's what Python folks like, i think there is nothing wrong with
using it; on OpenBSD, it's just an alias for LC_CTYPE=en_US.UTF-8.
On other operating systems, it may or may not work, exactly like
LC_CTYPE=en_US.UTF-8 may or may not work elsewhere. I have seen
systems where LC_CTYPE=en_US.UTF-8 works and LC_CTYPE=C.UTF-8 does
not and vice versa; i think i have even seen systems where neither
works. Locale names are simply not standardized - except for "C"
and "POSIX".
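
Since the names are not portable, the only reliable way to find out whether one works on a given system is to try setting it. A minimal sketch in Python follows, using the standard locale module; the helper name locale_works is hypothetical, and which UTF-8 names succeed depends entirely on the system it runs on.

```python
import locale


def locale_works(name: str) -> bool:
    """Return True if LC_CTYPE can be set to `name` on this system.

    Hypothetical helper for illustration only.
    """
    # Calling setlocale() without a value queries the current setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, name)
    except locale.Error:
        return False
    else:
        return True
    finally:
        # Restore the previous setting whatever the outcome was.
        locale.setlocale(locale.LC_CTYPE, saved)


for name in ("C", "POSIX", "C.UTF-8", "en_US.UTF-8"):
    print(name, locale_works(name))
```

Only the results for "C" and "POSIX" can be predicted in advance; the other two lines of output may differ from one operating system to the next.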
I didn't speak up earlier because the last time i did a substantial
programming project in Python was about a decade ago, so i'm no
longer qualified to OK or to object to Python patches...
Yours,
Ingo