[issue31900] localeconv() should decode numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding

2018-11-28 Thread STINNER Victor
STINNER Victor added the comment: The initial bug has been fixed, so I am closing the issue. -- resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

2018-11-28 Thread STINNER Victor
STINNER Victor added the comment: See also bpo-28604: localeconv() doesn't support LC_MONETARY encoding different than LC_CTYPE encoding.

2018-10-17 Thread STINNER Victor
STINNER Victor added the comment: Victor: > The technical issue here is that the libc has no "stateless" function to > process bytes and text with one specific locale. Andreas Schwab: > That's not true. There is a rich set of *_l functions that take a locale_t > object and operate on that

2018-01-28 Thread Andreas Schwab
Andreas Schwab added the comment: > The technical issue here is that the libc has no "stateless" function to > process bytes and text with one specific locale. That's not true. There is a rich set of *_l functions that take a locale_t object and operate on that

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: New changeset 5f959c4f9eca404b8bc4bc6348fed27c4b907b89 by Victor Stinner in branch '3.6': [3.6] bpo-31900: Fix localeconv() encoding for LC_NUMERIC (#4174) (#5192)

2018-01-15 Thread STINNER Victor
Change by STINNER Victor: pull_requests: +5046

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: lc_numeric.py contains a typo; the fixed lc_numeric2.py should be used instead to test my PR 5191, which fixes decimal.Decimal. -- Added file: https://bugs.python.org/file47386/lc_numeric2.py

2018-01-15 Thread STINNER Victor
Change by STINNER Victor: pull_requests: +5045

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: New changeset cb064fc2321ce8673fe365e9ef60445a27657f54 by Victor Stinner in branch 'master': bpo-31900: Fix localeconv() encoding for LC_NUMERIC (#4174)

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: On macOS 10.13.2, I failed to find any non-ASCII decimal_point or thousands_sep in localeconv(). I wrote a script to find all non-ASCII data in all locales: https://github.com/vstinner/misc/blob/master/python/all_locales.py
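A search like the one described above could be sketched as follows. This is not the linked all_locales.py script; the helper names non_ascii_fields and scan are hypothetical, and the list of locale names to try is left to the caller (for example, the output of `locale -a`).

```python
import locale


def non_ascii_fields(conv):
    """Return the string fields of a localeconv()-style dict that
    contain at least one non-ASCII character."""
    return {
        key: value
        for key, value in conv.items()
        if isinstance(value, str) and not value.isascii()
    }


def scan(locale_names):
    """Try each locale name and collect its non-ASCII localeconv() fields.

    Locales that are not installed are silently skipped. The original
    locale is restored before returning.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query only, used to restore
    found = {}
    try:
        for name in locale_names:
            try:
                locale.setlocale(locale.LC_ALL, name)
            except locale.Error:
                continue  # locale not available on this system
            hits = non_ascii_fields(locale.localeconv())
            if hits:
                found[name] = hits
    finally:
        locale.setlocale(locale.LC_ALL, saved)
    return found


if __name__ == "__main__":
    # Hypothetical candidates; replace with the locales installed locally.
    for name, hits in scan(["C", "ar_SA.UTF-8", "es_MX.utf8"]).items():
        print(name, {key: ascii(value) for key, value in hits.items()})
```

Setting LC_ALL for each candidate keeps LC_CTYPE and LC_NUMERIC consistent, so the scan itself should not depend on the decoding bug discussed in this issue.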

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: Test on Linux (Fedora 27, glibc 2.26): locale.setlocale(locale.LC_ALL, "fr_FR") locale.setlocale(locale.LC_NUMERIC, "es_MX.utf8") It works as expected, result: decimal_point: '.' thousands_sep: '\u2009' Python 3.6 returns mojibake:
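The test above can be reproduced with a small helper. This is a minimal sketch, not the lc_numeric.py script attached to the issue; the function name numeric_separators is hypothetical, and the result depends on which locales are installed and on the Python version in use.

```python
import locale


def numeric_separators(ctype_locale, numeric_locale):
    """Set LC_ALL to one locale, then LC_NUMERIC to another, and return
    (decimal_point, thousands_sep) as reported by locale.localeconv().

    Returns None if either locale is not available on this system.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query only, used to restore
    try:
        locale.setlocale(locale.LC_ALL, ctype_locale)
        locale.setlocale(locale.LC_NUMERIC, numeric_locale)
        conv = locale.localeconv()
        return conv["decimal_point"], conv["thousands_sep"]
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == "__main__":
    result = numeric_separators("fr_FR", "es_MX.utf8")
    if result is None:
        print("fr_FR and/or es_MX.utf8 is not installed")
    else:
        decimal_point, thousands_sep = result
        print("decimal_point:", ascii(decimal_point))
        print("thousands_sep:", ascii(thousands_sep))
```

Printing with ascii() makes a mis-decoded separator visible even when the terminal encoding would otherwise hide it.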

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: I tested localeconv() with PR 4174 on FreeBSD: -- locale.setlocale(locale.LC_ALL, "C") locale.setlocale(locale.LC_NUMERIC, "ar_SA.UTF-8") -- It works as expected, result: -- decimal_point: '\u066b' thousands_sep: '\u066c' -- Compare

2018-01-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Sounds like a good compromise :-)

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: > I would not consider this a bug in Python, but rather in the locale settings > passed to setlocale(). For the past 10 years, I have repeated to every single user I met that "Python 3 is right, your system setup is wrong". But that's a waste of

2018-01-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Indeed. The major problem with all libc locale functions is that they are not thread safe. The GIL does help a bit in protecting against corrupted data, though.

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: The technical issue here is that the libc has no "stateless" function to process bytes and text with one specific locale. All functions rely on the *current* locales. To decode byte strings, we use mbstowcs(), and this function
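Because decoding follows the current LC_CTYPE locale, one way to read LC_NUMERIC data correctly is to switch LC_CTYPE for the duration of the call. The sketch below illustrates that idea at the Python level only; it is not the C code of the actual patch, and the name ctype_follows_numeric is hypothetical. It changes process-wide state, so it is not safe to use while other threads depend on the locale.

```python
import contextlib
import locale


@contextlib.contextmanager
def ctype_follows_numeric():
    """Temporarily set LC_CTYPE to the current LC_NUMERIC locale.

    The previous LC_CTYPE value is restored on exit, even on error.
    """
    old_ctype = locale.setlocale(locale.LC_CTYPE)  # query current value
    numeric = locale.setlocale(locale.LC_NUMERIC)  # query current value
    changed = False
    try:
        if numeric != old_ctype:
            locale.setlocale(locale.LC_CTYPE, numeric)
            changed = True
        yield
    finally:
        if changed:
            locale.setlocale(locale.LC_CTYPE, old_ctype)


if __name__ == "__main__":
    with ctype_follows_numeric():
        conv = locale.localeconv()
    print(ascii(conv["decimal_point"]), ascii(conv["thousands_sep"]))
```

On a Python version that already contains the fix for this issue, the wrapper should be unnecessary; it is shown only to make the dependency on the current LC_CTYPE explicit.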

2018-01-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Ok, it seems that the C setlocale() itself does not follow the conventions set forth for environment variables: http://pubs.opengroup.org/onlinepubs/7908799/xsh/setlocale.html (see the example at the bottom) So the behavior shown by

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: Example of Fedora 27 and Python 3.6: vstinner@apu$ env -i LC_NUMERIC=uk_UA.koi8u python3 -c 'import locale; print(locale.setlocale(locale.LC_ALL, "")); print(locale.getpreferredencoding(),

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: Marc-Andre Lemburg: "If you first set LC_ALL and then one of the other categories such as LC_NUMERIC, locale C functions will still use the LC_ALL setting for everything. LC_NUMERIC does not override the LC_ALL setting." The root of

2018-01-15 Thread Stefan Krah
Stefan Krah added the comment: On Mon, Jan 15, 2018 at 12:37:28PM +, Marc-Andre Lemburg wrote: > If you first set LC_ALL and then one of the other categories such as > LC_NUMERIC, locale C functions will still use the LC_ALL setting for > everything. LC_NUMERIC does

2018-01-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: I just wanted to note that the description and title may cause a wrong interpretation of what should happen: If you first set LC_ALL and then one of the other categories such as LC_NUMERIC, locale C functions will still use the LC_ALL

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: > Please confirm the bug without having LC_ALL or LANG set. lc_numeric.py uses: locale.setlocale(locale.LC_ALL, "fr_FR") Are you talking about that? What is the problem with this configuration? I'm sure that there is a bug :-)

2018-01-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Just FYI: LC_ALL has precedence over all other more specific LC_* settings: http://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html http://man7.org/linux/man-pages/man7/locale.7.html Please confirm the bug without having LC_ALL or
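The precedence rule for environment variables can be modelled in a few lines. This is a rough sketch of the lookup order described in the linked pages (LC_ALL, then the specific LC_* variable, then LANG); the function name effective_locale is hypothetical, and the final fallback to "C" is an assumption, since the default is implementation-defined. It models only the environment lookup that applies when a program calls setlocale(category, ""), not explicit setlocale() calls with a locale name.

```python
import os


def effective_locale(category, environ=None):
    """Return the locale name the environment selects for a category.

    category is a variable name such as "LC_NUMERIC" or "LC_CTYPE".
    Unset or empty variables are skipped.
    """
    env = os.environ if environ is None else environ
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # assumed default when nothing is set


if __name__ == "__main__":
    print(effective_locale("LC_NUMERIC"))
    print(effective_locale("LC_CTYPE"))
```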

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: Oops, lc_numeric.py contains a typo: d = decimal.Decimal(1234); print("Decimal.__format__: %a" % f"{i:n}") => it should be f"{d:n}"
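With the typo corrected, the relevant lines might look like the following. Only the two quoted statements come from the message; the variable i and the int line are assumptions added so the snippet runs on its own. The output depends on the current LC_NUMERIC locale.

```python
import decimal

i = 1234                    # assumed: an int formatted with the same "n" type
d = decimal.Decimal(1234)

# "n" formats the number using the separators of the current LC_NUMERIC locale.
print("int.__format__: %a" % f"{i:n}")
print("Decimal.__format__: %a" % f"{d:n}")  # was f"{i:n}" in lc_numeric.py
```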

2018-01-15 Thread STINNER Victor
STINNER Victor added the comment: Update: I pushed a large change to fix locale encodings in bpo-29240: commit 7ed7aead9503102d2ed316175f198104e0cd674c.

2018-01-10 Thread STINNER Victor
STINNER Victor added the comment: I completed my change. It now fixes locale.localeconv(), str.format() for int, float, complex and decimal.Decimal: vstinner@apu$ ./python lc_numeric.py LC_CTYPE: ('fr_FR', 'ISO8859-1') LC_NUMERIC: ('es_MX', 'UTF-8') decimal_point:

2017-12-18 Thread STINNER Victor
STINNER Victor added the comment: Oh. Another Python function is impacted by the bug, str.format: $ env -i python3 -c 'import locale; locale.setlocale(locale.LC_ALL, "fr_FR"); locale.setlocale(locale.LC_NUMERIC, "es_MX.utf8"); print(ascii(f"{1000:n}"))'

2017-11-29 Thread Charalampos Stratakis
Charalampos Stratakis added the comment: Pinging here. Is there some way I can help to move the issue forward?

2017-10-30 Thread STINNER Victor
STINNER Victor added the comment: inconsistent_locale_encodings.py from the closed issue #7442 is interesting, so I am copying it here. -- Added file: https://bugs.python.org/file47246/inconsistent_locale_encodings.py

2017-10-30 Thread STINNER Victor
STINNER Victor added the comment: Oh wow, this bug is older than I expected :-) I added support for non-ASCII thousands separators in 2012: https://bugs.python.org/issue13706#msg151733

2017-10-30 Thread Stefan Krah
Stefan Krah added the comment: Same as #7442, I think. -- nosy: +skrah

2017-10-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: This is a duplicate of issue28604. See also issue25812. -- nosy: +serhiy.storchaka

2017-10-30 Thread STINNER Victor
Change by STINNER Victor: title: localeconv() should decide numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding -> localeconv() should decode numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding