On Mon, 18 Apr 2016 at 12:26 Stephen J. Turnbull <step...@xemacs.org> wrote:
> Brett Cannon writes: > > > If we continue with the "str is an encoding of file paths", > > It's not. It's a representation, but not an encoding. In Python 3, > encoding means a representation of a character string using bytes. > It's using "encoding" generically for "representation" that makes your > head hurt. > Well, it makes *your* head hurt; for me it helped clarify some things. :) > > > you can then build from "bytes is an encoding of str" to get a > > pyramid of file path encodings: Path -> str -> bytes. I don't think > > this is in any way a controversial view. > > Perhaps not. But it's not particularly useful. ;-) Here's the > pyramid I think about: > > Path > / \ > / \ > V V > str <-> bytes > > That is, str and bytes are interchangeable *without* any knowledge of > paths, which are on a higher level of complexity and abstraction. > Although in pathlib, there's an assumption that paths are serialized > to str which is (implicitly) serialized to bytes when talking to the > OS, this is not necessarily true for other structured path classes, in > particular it is not true for DirEntry (which is a "enhanced > degenerate" path containing only one path segment but also other > useful information about the filesystem object addressed) > > I haven't looked at Antipathy, but I would guess from Ethan's > promotion of bytes paths and concern with efficiency that "bytes > antipaths" do *not* "go through" str to get to bytes, they already are > bytes (in the sense of class inheritance). > > > But that's when I realized that adding __fspath__ support to > os.fsdecode() > > and os.fsencode(), they become more coercion functions rather than > > encoding/decoding functions. It also means that os.fspath() has a place > > when you want to say "I only want to encode a file path to str" and > avoid > > the decode bit that os.fsdecode() would do > > I don't understand what you're trying to say here. fsdecode currently > does not promise to decode anything, because it's polymorphic, > accepting str and bytes. fsdecode and fsencode already *are* coercion > functions. > And they will continue to be coercion functions. My point is that since they coerce there is no way to use them in a way to dictate that you don't want any str/bytes encoding/decoding to occur without checking the arguments going into the function (i.e. "no guessing about encodings, please"). By providing os.fspath() I can say that I do not, under any circumstances, want someone to guess at the encoding some bytes path is under to get me a string and instead I want to start and end entirely in a world of strings. IOW os.fspath() lets me work in such a way that the instant bytes are introduced into my code for file paths it triggers a TypeError. > > It's this kind of semantic confusion and broken nomenclature that is > *why* I dislike these polymorphic functions and objects so much. It > is impossible to reason correctly about them. We're stuck with > invoking "practicality" and muddling through. And the names mislead > even experienced Pythonistas. > Yep, we are stuck with the names unless you want to propose a new name and deprecate the old one.
_______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com