[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-03-27 Thread STINNER Victor
Changes by STINNER Victor: pull_requests: +757

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-03-27 Thread STINNER Victor
Changes by STINNER Victor: pull_requests: -15

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-14 Thread Eryk Sun
Eryk Sun added the comment:
> it should be replaced with sys.getfilesystemencodeerrors()
> to support UTF-8 Strict mode.
I did that in the patch for issue 28188. The focus of the patch is to add bytes support on Windows for os.putenv and os.environb, but I also tried to maximize consistency (

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-13 Thread STINNER Victor
STINNER Victor added the comment: Oh, I just noticed that os.environ uses the hardcoded error handler "surrogateescape": it should be replaced with sys.getfilesystemencodeerrors() to support UTF-8 Strict mode.
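To illustrate the point, here is a minimal sketch contrasting a hardcoded handler with one taken from sys.getfilesystemencodeerrors() (available since Python 3.6). The helper names decode_env_hardcoded and decode_env_configurable are hypothetical; they are not taken from os.py or from any patch attached to this issue.

```python
import sys


def decode_env_hardcoded(raw: bytes, encoding: str = "utf-8") -> str:
    # Hypothetical helper: the error handler is fixed, whatever the mode.
    return raw.decode(encoding, "surrogateescape")


def decode_env_configurable(raw: bytes, encoding: str = "utf-8", errors=None) -> str:
    # Hypothetical helper: ask the interpreter which handler it uses for
    # OS data, unless the caller passes one explicitly.
    if errors is None:
        errors = sys.getfilesystemencodeerrors()
    return raw.decode(encoding, errors)


raw = b"caf\xff"  # 0xff is never valid in UTF-8

# The hardcoded variant always smuggles the bad byte through as a surrogate.
print(ascii(decode_env_hardcoded(raw)))

# With a strict handler the same value is rejected instead.
try:
    decode_env_configurable(raw, errors="strict")
except UnicodeDecodeError as exc:
    print("strict handler rejects the value:", exc.reason)
```

A caller that always passes "surrogateescape" cannot follow a stricter configuration, which is the concern raised in the comment.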

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-12 Thread INADA Naoki
INADA Naoki added the comment: How about having locale.getpreferredencoding() return 'utf-8' in UTF-8 mode?
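A quick way to observe what an interpreter actually reports is sketched below. It assumes the flag is exposed as sys.flags.utf8_mode, which is the name used in released Python 3.7 and later; the draft patch in this thread names it sys.flags.utf8mode, so getattr() is used to tolerate a missing attribute. The function name describe_encodings is hypothetical.

```python
import locale
import sys


def describe_encodings() -> dict:
    # Collect the values relevant to the question in the comment above.
    return {
        # None when the running interpreter has no such flag.
        "utf8_mode": getattr(sys.flags, "utf8_mode", None),
        # False: do not call setlocale() as a side effect.
        "preferred": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
    }


info = describe_encodings()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running it once normally and once with -X utf8 shows whether the preferred encoding follows the mode on a given build.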

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-12 Thread STINNER Victor
STINNER Victor added the comment: encodings.py: enhanced version of pep540_cli.py that also reports the locale and filesystem encodings. It is a script to test the implementation of PEP 540 (and PEP 538). Added file: http://bugs.python.org/file46274/encodings.py

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-12 Thread STINNER Victor
STINNER Victor added the comment: Patch version 4:
* Handle PYTHONLEGACYWINDOWSFSENCODING: this env var now disables the UTF-8 mode and takes priority over -X utf8 and PYTHONUTF8
* Add a unit test on the PYTHONUTF8 env var and the -E command line option
* Add a unit test on the POSIX locale
* Fix initst
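The priority rule in the first bullet can be modelled in a few lines. This is only a toy model of what the comment describes, not CPython's startup code: the function name resolve_utf8_mode is hypothetical, the ordering between -X utf8 and PYTHONUTF8 is an assumption, and the POSIX locale handling mentioned in the list is left out entirely.

```python
from typing import Mapping, Optional


def resolve_utf8_mode(
    x_option: Optional[bool],
    environ: Mapping[str, str],
    ignore_environment: bool = False,
) -> bool:
    """Toy model: decide whether UTF-8 mode is on.

    x_option is True/False when -X utf8 was given, None otherwise.
    ignore_environment models the -E command line option.
    """
    # -E hides every PYTHON* variable from the decision.
    env = {} if ignore_environment else environ

    # Per the comment above, this variable disables UTF-8 mode and wins
    # over both -X utf8 and PYTHONUTF8.
    if env.get("PYTHONLEGACYWINDOWSFSENCODING"):
        return False

    # Assumption: the command line option beats the environment variable.
    if x_option is not None:
        return x_option

    if "PYTHONUTF8" in env:
        return env["PYTHONUTF8"] == "1"

    return False


print(resolve_utf8_mode(True, {"PYTHONLEGACYWINDOWSFSENCODING": "1"}))
print(resolve_utf8_mode(None, {"PYTHONUTF8": "1"}))
print(resolve_utf8_mode(None, {"PYTHONUTF8": "1"}, ignore_environment=True))
```

Keeping the decision in one pure function makes each bullet of the patch notes directly testable.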

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-12 Thread STINNER Victor
STINNER Victor added the comment: Hum, test_utf8mode lacks a unit test on the -E command line option: PYTHONUTF8 should be ignored if -E is used.
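A check along these lines could look like the sketch below. It is not the test from test_utf8mode; the helper name utf8_mode_in_child is hypothetical, and it assumes the flag is readable as sys.flags.utf8_mode (the name in released Python 3.7 and later). Rather than asserting an absolute value, which depends on the locale of the machine, it compares two child interpreters that differ only in PYTHONUTF8.

```python
import os
import subprocess
import sys


def utf8_mode_in_child(extra_args, env_overrides):
    # Start a child interpreter and report its UTF-8 mode flag as text.
    env = dict(os.environ)
    env.pop("PYTHONUTF8", None)  # start from a known state
    env.update(env_overrides)
    code = "import sys; print(getattr(sys.flags, 'utf8_mode', None))"
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# With -E, setting PYTHONUTF8 must not change the outcome.
baseline = utf8_mode_in_child(["-E"], {})
with_var = utf8_mode_in_child(["-E"], {"PYTHONUTF8": "1"})
print("ignored under -E:", with_var == baseline)
```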

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-12 Thread Chi Hsuan Yen
Changes by Chi Hsuan Yen: nosy: +Chi Hsuan Yen

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-12 Thread STINNER Victor
STINNER Victor added the comment:
> Can -X utf8 option be processed before Py_Main()?
I'm trying to implement that, but it's hard to factor out the shared code. I will probably have to duplicate the code handling -E, -X utf8, PYTHONMALLOC and PYTHONUTF8 for wchar_t* (UCS4 or UTF-16) and char* (bytes).

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-11 Thread INADA Naoki
INADA Naoki added the comment:
> Hum, pep540-3.patch doesn't work if the locale encoding is different than
> ASCII and UTF-8. argv must be reencoded:
I want to skip reencoding. In UTF-8 mode, arbitrary bytes on the command line (e.g. a broken filename passed by xargs) should be able to roundtrip by UTF-8/

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-11 Thread STINNER Victor
STINNER Victor added the comment: I only tested the PEP 540 implementation on Linux. The PEP and its implementation should be adjusted for Windows, especially for Windows-only env vars like PYTHONLEGACYWINDOWSFSENCODING. Changes may also be needed for Mac OS X and Android, which always use UTF

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-11 Thread STINNER Victor
STINNER Victor added the comment: Hum, pep540-3.patch doesn't work if the locale encoding is different than ASCII and UTF-8. argv must be reencoded:
$ LC_ALL=fr_FR ./python -X utf8 -c 'import sys; print(ascii(sys.argv))' $(echo -ne "\xff")
['-c', '\xff']
The result should not depend on the lo
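The output above is what decoding the byte with a single-byte locale encoding would produce. The sketch below reproduces both outcomes in pure Python, assuming that the fr_FR locale maps to ISO-8859-1 (typical on glibc systems, but an assumption here, since the thread does not name the encoding).

```python
raw_arg = b"\xff"  # the byte produced by: echo -ne "\xff"

# Decoding with the assumed locale encoding gives a real character (U+00FF).
as_latin1 = raw_arg.decode("iso-8859-1")

# Decoding as UTF-8 with surrogateescape keeps the byte as a lone surrogate.
as_utf8 = raw_arg.decode("utf-8", "surrogateescape")

print(ascii(as_latin1))  # '\xff'
print(ascii(as_utf8))    # '\udcff'

# Only the surrogateescape form encodes back to the original byte as UTF-8.
print(as_utf8.encode("utf-8", "surrogateescape") == raw_arg)  # True
print(as_latin1.encode("utf-8") == raw_arg)                   # False
```

The two decoded strings differ, which is why the value of sys.argv ends up depending on the locale unless argv is decoded consistently.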

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-11 Thread STINNER Victor
STINNER Victor added the comment: Oops, I introduced an obvious bug in my latest refactoring. It's now fixed in patch version 3: pep540-3.patch. Added file: http://bugs.python.org/file46263/pep540-3.patch

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-11 Thread STINNER Victor
STINNER Victor added the comment: pep540-2.patch: Patch version 2, updated to the latest version of PEP 540. It has no remaining FIXME/TODO and has more unit tests. The main change is that the strict mode no longer uses the strict error handler for OS data, but keeps surrogateescape. See the PEP for the rat

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-11 Thread INADA Naoki
Changes by INADA Naoki: nosy: +inada.naoki

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-11 Thread STINNER Victor
STINNER Victor added the comment: Examples with pep540_cli.py.
Python 3.5:
$ python3 pep540_cli.py
sys.argv: ['pep540_cli.py']
stdin: UTF-8/strict
stdout: UTF-8/strict
stderr: UTF-8/backslashreplace
open(): UTF-8/strict
$ LC_ALL=C python3 pep540_cli.py
sys.argv: ['pep540_cli.py']
stdin: ANSI
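The attached pep540_cli.py is not reproduced in this archive. The following is a rough sketch of a script that prints a report in the same encoding/errors format; the function name stream_report is hypothetical, and the open() row is measured by opening os.devnull in text mode rather than taken from the original script.

```python
import os
import sys


def stream_report() -> dict:
    # Build "encoding/errors" strings for the standard streams and open().
    report = {}
    for name in ("stdin", "stdout", "stderr"):
        stream = getattr(sys, name)
        if stream is None:
            # The stream can be missing, e.g. in some GUI or embedded setups.
            report[name] = "unavailable"
            continue
        encoding = getattr(stream, "encoding", None)
        errors = getattr(stream, "errors", None)
        report[name] = f"{encoding}/{errors}"

    # Open a harmless file with default arguments to see what open() picks.
    with open(os.devnull) as handle:
        report["open()"] = f"{handle.encoding}/{handle.errors}"
    return report


print("sys.argv:", sys.argv)
for key, value in stream_report().items():
    print(f"{key}: {value}")
```

Running it under different LC_ALL values, with and without -X utf8, gives a side-by-side view similar to the examples in the comment.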

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-11 Thread STINNER Victor
STINNER Victor added the comment: pep540.patch: first draft
Changes:
* Add sys.flags.utf8mode
* Add -X utf8 command line option
* Add PYTHONUTF8 environment variable
* sys.stdin, sys.stdout and sys.stderr encoding and errors are modified in UTF-8 mode
* open() default encoding and errors is mo

[issue29240] Implementation of the PEP 540: Add a new UTF-8 mode

2017-01-11 Thread STINNER Victor
New submission from STINNER Victor: This issue tracks the implementation of PEP 540. The attached pep540_cli.py script can be used to play with it.
components: Interpreter Core, Library (Lib), Unicode
files: pep540_cli.py
messages: 285214
nosy: ezio.melotti, haypo
priority: normal
pu