On 30Aug2016 1611, Victor Stinner wrote:
2016-08-30 23:51 GMT+02:00 Victor Stinner <victor.stin...@gmail.com>:
As I already wrote once, my problem is also that I simply have no idea how
much Python 3 code uses bytes filenames. For example, does it concern more
than 25% of py3 modules on PyPI, or less than 5%?
I made a very quick test on Windows using a modified Python raising an
exception on bytes path.
First of all, setuptools fails. It's a kind of blocker issue :-) I
quickly fixed it (only one line needs to be modified).
I tried to run Twisted unit tests (python -m twisted.trial twisted) of
Twisted 16.4. I got a lot of exceptions on bytes path from the
twisted/python/filepath.py module, but also from
twisted/trial/util.py. It looks like these modules are doing their
best to convert all paths to... bytes. I had to modify more than 5
methods just to be able to start running unit tests.
Quick result: setuptools and Twisted rely on bytes path. Dropping
bytes path support on Windows breaks these modules.
It also means that these modules don't support the full Unicode range
on Windows on Python 3.5.
Thanks. That's a good idea (certainly better than mine, which was to go
reading code...)
I haven't looked into setuptools, but Twisted appears to be correctly
using sys.getfilesystemencoding() when they coerce to bytes, which means
the proposed change will simply allow the full Unicode range when paths
are encoded.
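To make that concrete, here is a rough sketch of the coercion pattern. It is not Twisted's actual code: coerce_to_bytes is a made-up helper, cp1252 is only a stand-in for a legacy ANSI code page, and UTF-8 is assumed to be the encoding under the proposed change.

```python
import sys

def coerce_to_bytes(path, encoding=None):
    """Hypothetical helper: encode a str path, defaulting to the
    file system encoding; bytes are passed through unchanged."""
    if isinstance(path, bytes):
        return path
    return path.encode(encoding or sys.getfilesystemencoding())

name = "\u4e2d\u6587.txt"  # characters that cp1252 cannot represent

# Stand-in for a legacy code page: unencodable characters are replaced,
# so the original name cannot be recovered from the bytes.
lossy = name.encode("cp1252", errors="replace")

# Assumed encoding after the change: the name survives a round trip.
utf8 = coerce_to_bytes(name, "utf-8")
assert utf8.decode("utf-8") == name
```

Code written this way needs no modification: it picks up whatever sys.getfilesystemencoding() returns, so a wider encoding just means more names survive the trip through bytes.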
However, if there are places where bytes are not transcoded when they
should be, *then* there will be new issues. I wonder if we can quickly
test whether that happens (e.g. use the file system encoding to "taint"
the path somehow - special prefix? - so we can raise if bytes that
haven't been correctly encoded at some point are passed in).
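A very rough sketch of that tainting idea, in pure Python rather than in the interpreter itself. The marker value and both function names are invented for illustration, and a real experiment would need the check inside the os module's path handling.

```python
import os

# Hypothetical marker. It contains NUL bytes, which cannot appear in a
# real path, so it cannot collide with a legitimate file name.
MARK = b"\0fs\0"

def marked_fsencode(path):
    """Encode with the file system encoding and tag the result."""
    return MARK + os.fsencode(path)

def check_bytes_path(path):
    """Return the usable path, raising if bytes arrive without the tag."""
    if isinstance(path, bytes):
        if not path.startswith(MARK):
            raise ValueError(
                "bytes path not produced by the file system encoding: %r"
                % (path,))
        return path[len(MARK):]
    return path  # str paths are passed through unchanged

# Bytes produced by the tagging encoder are accepted...
ok = check_bytes_path(marked_fsencode("spam.txt"))

# ...while bytes encoded some other way are rejected.
try:
    check_bytes_path("spam.txt".encode("utf-8"))
    rejected = False
except ValueError:
    rejected = True
```

This only catches bytes that bypass the tagging encoder entirely. Bytes that are tagged and then sliced or joined by the caller would lose or misplace the marker, so a real test would produce some false alarms.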
Some of my other searching revealed occasional correct use of
sys.getfilesystemencoding() and a decent number of uses as a fallback when
other encodings are not available. It's very hard to search for code that
passes bytes to the os module without checking they are in the right encoding.
This is why I argue that the beta period is the best opportunity to
check, and why we're better off flipping the switch now and flipping it
back if it all goes horribly wrong. The alternative is a *very*
labour-intensive exercise that I doubt we can muster.