On 15 April 2016 at 00:01, Random832 <random...@fastmail.com> wrote:
> On Thu, Apr 14, 2016, at 09:50, Chris Angelico wrote:
>> Adding integers and floats is considered "safe" because most people's
>> use of floats completely compasses their use of ints. (You'll get
>> OverflowError if it can't be represented.) But float and Decimal are
>> considered "unsafe":
>>
>> >>> 1.5 + decimal.Decimal("1.5")
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> TypeError: unsupported operand type(s) for +: 'float' and 'decimal.Decimal'
>>
>> This is more what's happening here. Floats and Decimals can represent
>> similar sorts of things, but with enough incompatibilities that you
>> can't simply merge them.
>
> And what such incompatibilities exist between bytes and str for the
> purpose of representing file paths? At the end of the day, there's
> exactly one answer to "what file on disk this represents (or would
> represent if it existed)".

Bytes paths on Windows are encoded as mbcs for use with the bytes-based
(ANSI code page) Windows APIs, and hence don't support the full range of
characters that str does. The colloquial shorthand for that is "bytes
paths don't work properly on Windows" (the more strictly accurate
description is "bytes paths only work correctly on Windows if every code
point in the path can be encoded using the 'mbcs' codec").

Even on *nix, os.fsencode may fail outright if the system is configured
to use a non-universal encoding, while os.fsdecode may pollute the
resulting string with surrogate-escaped characters.

Regardless of platform, if somebody hands you *mixed* bytes and str
data, the appropriate default reaction is to complain about it rather
than assume they meant one or the other. That complaint may take one of
two forms:

- for a high level, platform independent API, bytes should just be
  rejected outright
- for a low level API with input type dependent behaviour, the input
  should be rejected as ambiguous - the API doesn't know whether the str
  behaviour or the bytes behaviour is the intended one

pathlib falls into the first category - it just rejects bytes as input.
os.path.join falls into the second category - all str is fine, and all
bytes is fine, but mixing them fails.

However, once somebody reaches for the coercion APIs (fsdecode and
fsencode), they're now *explicitly* telling the interpreter what they
want, since there's no ambiguity about the possible return types from
those functions.

In relation to Victor's comment about this being complex code to show to
a novice:

    os.path.join(*map(os.fsdecode, ("str", b"bytes")))

I agree, but also think that's a good reason for people to switch to
teaching novices pathlib rather than os.path, and letting them discover
the underlying libraries as required by the code and examples they
encounter.

Cheers,
Nick.

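The behaviours described above can be sketched as follows. This is only a
minimal illustration, assuming CPython 3; the helper name join_outcome and
the sample path components are hypothetical and chosen purely for the
example.

```python
import os
from pathlib import PurePosixPath


def join_outcome(*parts):
    """Return the joined path, or the name of the exception raised.

    Hypothetical helper, used only to show the outcomes side by side.
    """
    try:
        return os.path.join(*parts)
    except TypeError as exc:
        return type(exc).__name__


# Low level API: all str is fine, and all bytes is fine.
all_str = join_outcome("dir", "file.txt")
all_bytes = join_outcome(b"dir", b"file.txt")

# Mixing str and bytes is rejected as ambiguous.
mixed = join_outcome("dir", b"file.txt")

# Explicit coercion removes the ambiguity: os.fsdecode always returns str.
coerced = os.path.join(*map(os.fsdecode, ("dir", b"file.txt")))

# High level API: pathlib rejects bytes outright.
try:
    PurePosixPath(b"dir")
    pathlib_outcome = "accepted"
except TypeError:
    pathlib_outcome = "TypeError"
```

Only the mixed call and the pathlib call fail, and both do so with a
TypeError rather than guessing which type was intended.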
-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev