On Mon, 11 Apr 2016 at 14:11 Ethan Furman <et...@stoneleaf.us> wrote:
> On 04/11/2016 01:42 PM, Victor Stinner wrote:
> > 2016-04-11 21:00 GMT+02:00 Brett Cannon:
> >> I'm -0 on allowing __fspath__ to return bytes, but we can see what
> >> others think.
> >
> > With PEP 383, a bytes filename can be stored as str using the
> > surrogateescape error handler. So DirEntry can convert a bytes path to
> > str using os.fsdecode().
>
> I am far from a Unicode expert, but if I understand this correctly you
> are proposing that DirEntry.__whatever__ can always return a str using
> the surrogateescape (SE) method.
>
> However, before this SE string can be used, it would need to be
> converted back to bytes, and with the same SE method, yes? And this has
> already been implemented in the stdlib?
>
> So my concern in such a case is what happens if we pass this SE string
> somewhere else: a UTF-8 file, or over a socket, or into a database?
> Does this have issues that we wouldn't face if we just used bytes?

This is my worry as well, and it is why I have not proposed this kind of
universal normalizing of bytes paths using os.fsdecode() with
surrogateescape. Doing this sort of thing at the system boundary, and
documenting it as such, as PEP 383 proposed, makes a bit more sense: the
expectation is more controlled and there is a clear input boundary.
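To make the concern concrete, here is a minimal sketch of the round trip and of what can go wrong once such a string leaves the file-system boundary. The file name `raw` is a made-up example, and the codec is spelled out as UTF-8 with surrogateescape on the assumption that this matches what os.fsdecode() does on a POSIX system with a UTF-8 locale:

```python
# Hypothetical file name containing a byte (0xff) that is not valid UTF-8.
raw = b"caf\xff.txt"

# PEP 383: each undecodable byte becomes a lone surrogate (0xff -> U+DCFF).
# The codec is written out explicitly so the result does not depend on
# the platform's file-system encoding.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same error handler restores the original bytes.
round_trip = name.encode("utf-8", "surrogateescape")

# A strict encoder (e.g. writing to a UTF-8 file or socket) rejects
# the lone surrogate instead of silently passing it through.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

So the SE string survives only as long as every consumer encodes it with the same error handler; a plain bytes path would not carry that hidden requirement.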
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com