Re: [Python-Dev] When should pathlib stop being provisional?
Fair enough, I stand corrected on both points.

On 07/04/2016 18:13, Zachary Ware wrote:
> On Thu, Apr 7, 2016 at 5:50 AM, Michel Desmoulin wrote:
>> Path objects don't have splitext() and don't allow "string" / path.
>> Those are the ones bugging me the most.
>
> >>> import pathlib
> >>> p = '/some/test' / pathlib.Path('path') / 'file_with.ext'
> >>> p
> PosixPath('/some/test/path/file_with.ext')
> >>> p.parent, p.stem, p.suffix
> (PosixPath('/some/test/path'), 'file_with', '.ext')

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] When should pathlib stop being provisional?
On Thu, Apr 7, 2016 at 3:50 AM, Michel Desmoulin wrote:
> Path objects don't have splitext()

That is useful -- let's add it. (And others, if need be.)

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR              (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

chris.bar...@noaa.gov
Re: [Python-Dev] When should pathlib stop being provisional?
On Thu, Apr 7, 2016 at 5:50 AM, Michel Desmoulin wrote:
> Path objects don't have splitext() and don't allow "string" / path.
> Those are the ones bugging me the most.

>>> import pathlib
>>> p = '/some/test' / pathlib.Path('path') / 'file_with.ext'
>>> p
PosixPath('/some/test/path/file_with.ext')
>>> p.parent, p.stem, p.suffix
(PosixPath('/some/test/path'), 'file_with', '.ext')

--
Zach
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/07/2016 03:50 AM, Michel Desmoulin wrote:
> Path objects don't have splitext() and don't allow "string" / path.
> Those are the ones bugging me the most.

--> Path('README.md')
--> p = Path('README.md')
# PosixPath('README.md')
--> '/home/ethan' / p
# PosixPath('/home/ethan/README.md')
--> p.splitext()
Traceback (most recent call last):
  File "", line 1, in
AttributeError: 'PosixPath' object has no attribute 'splitext'

So, yeah, no .splitext().

--
~Ethan~
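For readers who want the behaviour of os.path.splitext() on a path object, here is a minimal sketch built only from members shown earlier in the thread (`suffix`) plus `with_suffix()`. The helper name `splitext` is hypothetical and is not part of pathlib; the sample path is the one from Zach's message.

```python
from pathlib import PurePosixPath


def splitext(path):
    """Hypothetical helper: split a path into (root, ext), like os.path.splitext."""
    # with_suffix('') drops the last suffix; path.suffix is that suffix (or '').
    return path.with_suffix(''), path.suffix


root, ext = splitext(PurePosixPath('/some/test/path/file_with.ext'))
```

Unlike os.path.splitext(), this sketch returns the root as a path object rather than a string, which may or may not be what a caller wants.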
Re: [Python-Dev] When should pathlib stop being provisional?
On 06/04/2016 22:47, Sven R. Kunze wrote:
> On 06.04.2016 07:00, Guido van Rossum wrote:
>> On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman wrote:
>>> [...] we can't do:
>>>
>>>     app_root = Path(...)
>>>     config = app_root/'settings.cfg'
>>>     with open(config) as blah:
>>>         # whatever
>>>
>>> It feels like instead of addressing this basic disconnect, the answer
>>> has instead been: add that to pathlib! Which works great -- until a
>>> user or a library gets this path object and tries to use something
>>> from os on it.
>>
>> I agree that asking for config.open() isn't the right answer here
>> (even if it happens to work).
>
> How come?
>
>> But in this example, once 3.5.2 is out,
>> the solution would be to use open(config.path), and that will also
>> work when passing it to a library. Is it still unacceptable then?
>
> I think so. Although in this example I would prefer the shorter
> config.open alternative as I am lazy.
>
> I still cannot remember what the concrete issue was why we dropped
> pathlib the same day we gave it a try. It was something really stupid
> and although I hoped to reduce the size of the code, it was less
> readable. But it was not the path->str issue but something more mundane.
> It was something that forced us to use os[.path] as Path didn't provide
> something equivalent. Cannot remember.

Path objects don't have splitext() and don't allow "string" / path.
Those are the ones bugging me the most.

> Best,
> Sven
Re: [Python-Dev] When should pathlib stop being provisional?
As a simple user, I find that pathlib simplifies playing with paths. A lot of things are easy to do. For example, path / "subfile" is so useful.

I also have a subclass of pathlib.Path on GitHub that makes it easy to search for files and directories.

So keep pathlib alive!

On 6 Apr 2016 13:06, "Paul Moore" wrote:
> On 6 April 2016 at 00:45, Guido van Rossum wrote:
>> This does sound like it's the crucial issue, and it is worth writing
>> up clearly the pros and cons. Let's draft those lists in a thread
>> (this one's fine) and then add them to the PEP. We can then decide to:
>>
>> - keep the status quo
>> - change PurePath to inherit from str
>> - decide it's never going to be settled and kill pathlib.py
>>
>> (And yes, I'm dead serious about the latter, rather Solomonic option.)
>
> By the way, even if there's no solution that satisfies everyone to the
> "inherit from str" question, I'd still be unhappy if pathlib
> disappeared from the stdlib. It's useful for quick admin scripts that
> don't justify an external dependency. Those typically do quite a bit
> of path manipulation, and as such benefit from the improved API of
> pathlib over os.path.
>
> +1 on making (and documenting) a final decision on the "inherit from
> str" question
> -1 on removing pathlib just because that decision might not satisfy
> everyone
>
> Paul
Re: [Python-Dev] When should pathlib stop being provisional?
On Apr 6, 2016 14:00, "Barry Warsaw" wrote:
> Aside from the name of the attribute (though I'm partial to __path__),

Ahem, pkg.__path__.

-eric
Re: [Python-Dev] When should pathlib stop being provisional?
> On Apr 5, 2016, at 3:55 PM, Guido van Rossum wrote:
>
> It's been provisional since 3.4. I think if it is still there in 3.6.0
> it should be considered no longer provisional. But this may indeed be
> a test case for the ultimate fate of provisional modules -- should we
> remove it?

I lean slightly towards removal. Having worked through the API when it was first released, I find it to be highly forgettable (i.e. I have to re-read the docs each time I've revisited it).

While I haven't seen any uptake in real code, there are occasional questions about it on StackOverflow, so we do know that there is at least some interest. I'm not sure that it needs to live in the standard library though.

Raymond
Re: [Python-Dev] When should pathlib stop being provisional?
Nick Coghlan wrote:
> I'd missed the existing precedent in DirEntry.path, so simply taking
> that and running with it sounds good to me.

It's not quite the same thing, though. DirEntry.path takes something that is not a path (a DirEntry instance) and gives you a path representing it, so the name makes sense.

But a Path instance is already "a path", so Path.path is weird. Path.str would make more sense.

--
Greg
Re: [Python-Dev] When should pathlib stop being provisional?
Yeah, sure. But it was more like this on a single line:

    os.missing1(str(our_path.something1)) *** os.missing2(str(our_path.something1)) *** os.missing1(str(our_path.something1))

And then it started to get messy because you need to work on a single long line or you need to open more than one line.

It was a simple thing actually. Like repeating the same calls to pathlib just because we need to switch to os.path.

I will ask my colleague if he remembers or if we can recover the code tomorrow...

Best,
Sven

NOTE to myself: getting old, need to write down everything

On 06.04.2016 23:03, Ethan Furman wrote:
> On 04/06/2016 01:47 PM, Sven R. Kunze wrote:
>> I still cannot remember what the concrete issue was why we dropped
>> pathlib the same day we gave it a try. It was something really stupid
>> and although I hoped to reduce the size of the code, it was less
>> readable. But it was not the path->str issue but something more
>> mundane. It was something that forced us to use os[.path] as Path
>> didn't provide something equivalent. Cannot remember.
>
> I'm willing to guess that if you had been able to just call
> os.whatever(your_path_obj) it would have been at most a minor
> annoyance.
>
> --
> ~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
On Wed, 6 Apr 2016 at 14:03 Wes Turner wrote:
> On Apr 6, 2016 12:47 PM, "Brett Cannon" wrote:
>> On Wed, 6 Apr 2016 at 10:41 Wes Turner wrote:
>>> * +1 for __path__, __fspath__
>>>   (though I don't know what each does)
>>
>> Returns a string representing a file system path.
>
> Why two methods? __uripath__?
>
> (scheme, host (port), path, query, fragment) so, not __uripath__
>
> what would be the difference between __path__ and __fspath__?

There is no difference; we're trying to choose a name.

>>> * why not Text(basestring / bytestring) and pathlib.Path(Text)?
>>
>> See the points about next() vs __next__()
>
> Path(b'123') / u'456'
>
> similarly,
> Path(b'123') / UTF8 / UTF16

As other people pointed out on the other thread, while bytes paths do exist, we don't want to promote them as they are a mess to work with.

-Brett

>>> * are there examples of cases where this cannot be?
>>
>> I don't understand what you think "cannot be".
>
> What one recommends (path.py(str) / str(pathlib.Path()) + getattr) is
> distinct from what any given programmer chooses to do with their code.
>
>>> * if not, +1 for subclassing str/Text
>>>
>>> * where are the examples of method collisions between the str
>>>   interface and the pathlib.Path interface?
>>
>> There aren't any and that's partially why some people wanted the str
>> subclass to begin with.
>>
>> Please consider this thread a str-subclass-free zone. This line of
>> discussion is to flesh out the proposal for a path protocol as a
>> proposal against subclassing str, not to settle the whole discussion
>> outright. If you want to continue to debate the subclassing-str side
>> of this please use the other thread.
>
> this seems to be a sudden, arbitrary distinction.
>
> are these proposals necessarily disjoint?
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/06/2016 01:47 PM, Sven R. Kunze wrote:
> I still cannot remember what the concrete issue was why we dropped
> pathlib the same day we gave it a try. It was something really stupid
> and although I hoped to reduce the size of the code, it was less
> readable. But it was not the path->str issue but something more mundane.
> It was something that forced us to use os[.path] as Path didn't provide
> something equivalent. Cannot remember.

I'm willing to guess that if you had been able to just call os.whatever(your_path_obj) it would have been at most a minor annoyance.

--
~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
On Apr 6, 2016 12:47 PM, "Brett Cannon" wrote:
> On Wed, 6 Apr 2016 at 10:41 Wes Turner wrote:
>> * +1 for __path__, __fspath__
>>   (though I don't know what each does)
>
> Returns a string representing a file system path.

Why two methods? __uripath__?

(scheme, host (port), path, query, fragment) so, not __uripath__

what would be the difference between __path__ and __fspath__?

>> * why not Text(basestring / bytestring) and pathlib.Path(Text)?
>
> See the points about next() vs __next__()

Path(b'123') / u'456'

similarly,
Path(b'123') / UTF8 / UTF16

>> * are there examples of cases where this cannot be?
>
> I don't understand what you think "cannot be".

What one recommends (path.py(str) / str(pathlib.Path()) + getattr) is distinct from what any given programmer chooses to do with their code.

>> * if not, +1 for subclassing str/Text
>>
>> * where are the examples of method collisions between the str
>>   interface and the pathlib.Path interface?
>
> There aren't any and that's partially why some people wanted the str
> subclass to begin with.
>
> Please consider this thread a str-subclass-free zone. This line of
> discussion is to flesh out the proposal for a path protocol as a
> proposal against subclassing str, not to settle the whole discussion
> outright. If you want to continue to debate the subclassing-str side
> of this please use the other thread.

this seems to be a sudden, arbitrary distinction.

are these proposals necessarily disjoint?

so, adding getattr(path, '__path__', path) to stdlib and other code is going to prevent which edge cases (before os.path.normpath()* anyway) for which benefit?

when do I do getattr(path, '__fspath__', path)?
> -Brett
>
>> * str.__div__ is nonsensical
>> * pathlib.Path.__div__ is super-useful

ah, not .__add__() but .append()

I suppose the request here is for the cases which would be prevented (that we need to learn to look for)

>> On Apr 6, 2016 10:10 AM, "Ethan Furman" wrote:
>>> On 04/05/2016 11:57 PM, Nick Coghlan wrote:
>>>> On 6 April 2016 at 16:53, Nathaniel Smith wrote:
>>>>> On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan wrote:
>>>>>> I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
>>>>>
>>>>> This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-)
>>>>>
>>>>> But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
>>>>
>>>> sys.path, for example.
>>>>
>>>> That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort.
>>>
>>> If we decide upon __fspath__ (or __path__) I will do the work on pathlib and scandir to add those attributes.
>>>
>>> --
>>> ~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
On 06.04.2016 07:00, Guido van Rossum wrote:
> On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman wrote:
>> [...] we can't do:
>>
>>     app_root = Path(...)
>>     config = app_root/'settings.cfg'
>>     with open(config) as blah:
>>         # whatever
>>
>> It feels like instead of addressing this basic disconnect, the answer
>> has instead been: add that to pathlib! Which works great -- until a
>> user or a library gets this path object and tries to use something
>> from os on it.
>
> I agree that asking for config.open() isn't the right answer here
> (even if it happens to work).

How come?

> But in this example, once 3.5.2 is out,
> the solution would be to use open(config.path), and that will also
> work when passing it to a library. Is it still unacceptable then?

I think so. Although in this example I would prefer the shorter config.open alternative as I am lazy.

I still cannot remember what the concrete issue was why we dropped pathlib the same day we gave it a try. It was something really stupid and although I hoped to reduce the size of the code, it was less readable. But it was not the path->str issue but something more mundane. It was something that forced us to use os[.path] as Path didn't provide something equivalent. Cannot remember.

Best,
Sven
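The two workarounds under discussion can be compared side by side. This is only a sketch: the temporary directory and the file contents are assumptions made so the snippet runs on its own, while `settings.cfg` and the variable names come from the quoted example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    app_root = Path(tmp)                # stand-in for the real application root
    config = app_root / 'settings.cfg'
    config.write_text('[main]\n')       # assumed contents, for illustration only

    # Explicit cast: works whether or not open() accepts path objects.
    with open(str(config)) as fh:
        via_str = fh.read()

    # Method on the path object itself, the "add that to pathlib" route.
    with config.open() as fh:
        via_method = fh.read()
```

Both reads return the same text; the disagreement in the thread is about which spelling callers and third-party libraries should be expected to use.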
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/05/2016 11:53 PM, Nathaniel Smith wrote:
> On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan wrote:
>> I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
>
> This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-).
>
> But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.

A very good point, thank you.

--
~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/06/2016 02:41 AM, Antoine Pitrou wrote:
> On a concrete point, inheriting str would make the API a horrible,
> confusing, dangerous mess mixing regular string semantics
> (concatenation with +, for example, or indexing) with path-specific
> semantics and various grey areas (should .split() have path semantics
> or str semantics? what is the rule and how are people supposed to
> remember it?).

While I agree in principle...

> (of course, for PHP or Javascript programmers it may not sound like a
> problem. Let "adding" two IP addresses return the concatenation of
> their string representations...)

Like if I had a subnet of '192.168' and a host of '.11.16' and adding them together gave you '192.168.11.16'? (yeah, a bit weak)

Or, more appropriately: a path of '/home/ethan/mystuff' + '_bak' so I can make a copy? Actually, that would be

    stuff = pathlib.Path('/home/ethan/mystuff')  # no issue here
    backup_stuff = stuff.with_name(stuff.name + '_bak')  # eww

Sure, you can make the argument that `with_suffix('.bak')` is cleaner, but it is not up to the stdlib to micromanage my code.

Oh, and I do not consort with PHP, and only do so with Javascript when forced.

--
~Ethan~
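To make the difference between the two spellings concrete, here is a small sketch using the path from the message above. It uses PurePosixPath (an assumption, chosen so the result does not depend on the platform) instead of pathlib.Path.

```python
from pathlib import PurePosixPath

stuff = PurePosixPath('/home/ethan/mystuff')

# Append to the final component's name, as in the message above.
backup_stuff = stuff.with_name(stuff.name + '_bak')

# The with_suffix() alternative produces an extension instead, so the
# two spellings do not yield the same file name.
backup_ext = stuff.with_suffix('.bak')
```

Here `backup_stuff` ends in `mystuff_bak` while `backup_ext` ends in `mystuff.bak`, so the choice is about the desired name as well as about style.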
Re: [Python-Dev] When should pathlib stop being provisional?
On Apr 06, 2016, at 12:44 PM, Nick Coghlan wrote:

>The next challenge would then be to make a list of APIs to be updated
>for 3.6 to implicitly accept "rich path" objects via the agreed
>convention, with pathlib.PurePath used as a test class:
>
>* open()
>* codecs.open() (et al)
>* io.*
>* os.path.*
>* other os functions
>* shutil.*
>* tempfile.*
>* shelve.*
>* csv.*

Aside from the name of the attribute (though I'm partial to __path__), I think this would go a long way toward making path objects nicer to work with. And right, it doesn't have to be 100% but this would be a big improvement.

Cheers,
-Barry
Re: [Python-Dev] When should pathlib stop being provisional?
On Apr 05, 2016, at 09:29 PM, Ethan Furman wrote:

>We should either remove it or make the rest of the stdlib work with it.
>Currently, pathlib.*Paths are second-class citizens, and working with them is
>not significantly better than working with os.path.* simply because we have
>to cast to str every time we want to deal with any other part of the stdlib.

This. I've tried to use them in a couple of projects and in many ways pathlib objects are nice to work with. But rarely can they be used exclusively. There are just too many other packages and APIs that use os.path and the two do not interoperate very well. That makes practical use of pathlib objects just too unwieldy for project-wide adoption.

I don't know if inheriting them from str would fix this problem. I'm +0 on removing the provisional status of pathlib and in trying to figure out ways for them to work better with other libraries (both stdlib and 3rd party) that will continue to be os.path based for the foreseeable future.

Cheers,
-Barry
Re: [Python-Dev] When should pathlib stop being provisional?
On Wednesday, April 06, 2016 07:39, Steven D'Aprano wrote:
> > How well does that apply to path/__path__?
>
> I think it's potentially the same. Possibly there are fewer existing
> uses of "obj.path" out there which conflict with this use, but there's
> at least one in the std lib: sys.path.

Somewhat ironically, also os.

>>> import os.path
>>> getattr(os, "path")
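The collision described above is easy to reproduce. The sketch below assumes a naive duck-typed lookup (the function name `naive_path` is hypothetical) that trusts any attribute called `path`, and applies it to the two standard-library objects mentioned in this message.

```python
import os
import sys


def naive_path(obj):
    """Hypothetical duck-typed lookup: trust any attribute named 'path'."""
    return getattr(obj, 'path', obj)


from_sys = naive_path(sys)  # the module search list, not a file system path
from_os = naive_path(os)    # the os.path module itself
```

Neither result is a string naming a file, which is the kind of accidental match a more explicitly named attribute is meant to avoid.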
Re: [Python-Dev] When should pathlib stop being provisional?
On Wed, 6 Apr 2016 at 10:41 Wes Turner wrote:
> * +1 for __path__, __fspath__
>   (though I don't know what each does)

Returns a string representing a file system path.

> * why not Text(basestring / bytestring) and pathlib.Path(Text)?

See the points about next() vs __next__()

> * are there examples of cases where this cannot be?

I don't understand what you think "cannot be".

> * if not, +1 for subclassing str/Text
>
> * where are the examples of method collisions between the str
>   interface and the pathlib.Path interface?

There aren't any and that's partially why some people wanted the str subclass to begin with.

Please consider this thread a str-subclass-free zone. This line of discussion is to flesh out the proposal for a path protocol as a proposal against subclassing str, not to settle the whole discussion outright. If you want to continue to debate the subclassing-str side of this please use the other thread.

-Brett

> * str.__div__ is nonsensical
> * pathlib.Path.__div__ is super-useful
>
> On Apr 6, 2016 10:10 AM, "Ethan Furman" wrote:
>> On 04/05/2016 11:57 PM, Nick Coghlan wrote:
>>> On 6 April 2016 at 16:53, Nathaniel Smith wrote:
>>>> On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan wrote:
>>>>> I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
>>>>
>>>> This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-)
>>>>
>>>> But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
>>>
>>> sys.path, for example.
>>>
>>> That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort.
>>
>> If we decide upon __fspath__ (or __path__) I will do the work on pathlib and scandir to add those attributes.
>>
>> --
>> ~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
* +1 for __path__, __fspath__
  (though I don't know what each does)

* why not Text(basestring / bytestring) and pathlib.Path(Text)?

  * are there examples of cases where this cannot be?

  * if not, +1 for subclassing str/Text

  * where are the examples of method collisions between the str interface and the pathlib.Path interface?

    * str.__div__ is nonsensical
    * pathlib.Path.__div__ is super-useful

On Apr 6, 2016 10:10 AM, "Ethan Furman" wrote:
> On 04/05/2016 11:57 PM, Nick Coghlan wrote:
>> On 6 April 2016 at 16:53, Nathaniel Smith wrote:
>>> On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan wrote:
>>>> I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.
>>>
>>> This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-)
>>>
>>> But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol.
>>
>> sys.path, for example.
>>
>> That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort.
>
> If we decide upon __fspath__ (or __path__) I will do the work on pathlib and scandir to add those attributes.
>
> --
> ~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/05/2016 11:57 PM, Nick Coghlan wrote:
> On 6 April 2016 at 16:53, Nathaniel Smith wrote:
>> On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan wrote:
>>> I'd missed the existing precedent in DirEntry.path, so simply taking
>>> that and running with it sounds good to me.
>>
>> This makes me twitch slightly, because NumPy has had a whole set of
>> problems due to the ancient and minimally-considered decision to
>> assume a bunch of ad hoc non-namespaced method names fulfilled some
>> protocol -- like all .sum methods will have a signature that's
>> compatible with numpy's, and if an object has a .log method then
>> surely that computes the logarithm (what else in computing could "log"
>> possibly refer to?), etc. This experience may or may not be relevant,
>> I'm not sure -- sometimes these kinds of twitches are good guides to
>> intuition, and sometimes they are just knee-jerk responses to an old
>> and irrelevant problem :-)
>>
>> But you might want to at least think about
>> how common it might be to have existing objects with unrelated
>> attributes that happen to be called "path", and the bizarro problems
>> that might be caused if someone accidentally passes one of them to a
>> function that expects all .path attributes to be instances of this new
>> protocol.
>
> sys.path, for example.
>
> That's why I'd actually prefer the implicit conversion protocol to be
> the more explicitly named "__fspath__", with suitable "__fspath__ =
> path" assignments added to DirEntry and pathlib. However, I'm also not
> offering to actually *do* the work here, and the casting vote goes to
> the folks pursuing the implementation effort.

If we decide upon __fspath__ (or __path__) I will do the work on pathlib and scandir to add those attributes.

--
~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/06/2016 02:50 AM, Antoine Pitrou wrote:
> Ethan Furman writes:
>>> Not sure about os.path.*. The purpose of os.path module is manipulating
>>> string paths. From the perspective of pathlib it can look lower level.
>>
>> The point is that a function that receives a "path" object (whether str
>> or Path) shouldn't have to care: it should be able to call os.path.split
>> on the thing it received and get back a usable answer.
>
> pathlib should already replicate the useful parts of os.path. That was
> the design goal after all.

Yes it does, and very well.

> So this is like saying you want a Python file or socket object to be
> accepted by os.read(). In the rare case where you want that, you call
> the .fileno() method explicitly. The equivalent for Path objects is to
> look up the .path attribute explicitly.

Unfortunately for Path objects there is already a well-established ecosystem for dealing with paths as strings, and it currently breaks when passed a Path object. This is a high barrier to entry. Having the stdlib support Path objects would lower that barrier significantly.

--
~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
On Apr 6, 2016 07:44, "Steven D'Aprano"wrote: > > On Wed, Apr 06, 2016 at 11:30:32AM +0200, Petr Viktorin wrote: > > > Python was in a similar situation with the .next method on iterators, > > which changed to __next__ in Python 3. PEP 3114 (which explains this > > change) says: > > > > > Code that nowhere contains an explicit call to a next method can > > > nonetheless be silently affected by the presence of such > > > a method. Therefore, this PEP proposes that iterators should have > > > a __next__ method instead of a next method (with no change in > > > semantics). > > > > How well does that apply to path/__path__? > > I think it's potentially the same. Possibly there are fewer existing > uses of "obj.path" out there which conflict with this use, but there's > at least one in the std lib: sys.path. > > > > That PEP also introduced the next() builtin. This suggests that a > > protocol with __path__/__fspath__ would need a corresponding > > path()/fspath() builtin. > > Not necessarily. Take a look at (say) dir(object()) and you'll see a few > dunders that don't correspond to built-ins: > > __reduce__ and __reduce_ex__ are used by pickle; > __sizeof__ is used by sys.getsizeof; > __subclasshook__ is used by the ABC system; > > Another example is __trunc__ used by math.trunc(). > > So any such fspath function should stand on its own as a useful > feature, not just because there's a dunder method __fspath__. An even more precise analogy is provided by __index__, whose semantics are to provide safe casting to integer (the name is a historical accident), as opposed to __int__'s tendency to cast things to integer willy-nilly, including things that really shouldn't be silently accepted as integers. Basically __index__ is to __int__ as __(fs)path__ would be to __str__. There's an operator.index but no builtins.index. 
-n
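The __index__/__int__ analogy above can be checked in the interpreter. This is a minimal sketch; the Celsius class is a hypothetical example that is not part of the thread, and only operator.index() and int() are real APIs here.

```python
import operator


class Celsius:
    """Hypothetical type that can be converted to int but is not an integer."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __int__(self):
        # int() will happily use this, even though a temperature is not an index.
        return int(self.degrees)


t = Celsius(21.5)
as_int = int(t)  # the permissive cast succeeds

try:
    operator.index(t)  # no __index__ defined, so the strict cast is refused
    index_refused = False
except TypeError:
    index_refused = True

try:
    operator.index(3.7)  # floats are accepted by int() but not by operator.index()
    float_refused = False
except TypeError:
    float_refused = True
```

By the same reasoning, a dedicated path dunder would only accept objects that explicitly declare themselves to be paths, whereas str() accepts anything.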
Re: [Python-Dev] When should pathlib stop being provisional?
On Wed, Apr 06, 2016 at 11:30:32AM +0200, Petr Viktorin wrote: > Python was in a similar situation with the .next method on iterators, > which changed to __next__ in Python 3. PEP 3114 (which explains this > change) says: > > > Code that nowhere contains an explicit call to a next method can > > nonetheless be silently affected by the presence of such > > a method. Therefore, this PEP proposes that iterators should have > > a __next__ method instead of a next method (with no change in > > semantics). > > How well does that apply to path/__path__? I think it's potentially the same. Possibly there are fewer existing uses of "obj.path" out there which conflict with this use, but there's at least one in the std lib: sys.path. > That PEP also introduced the next() builtin. This suggests that a > protocol with __path__/__fspath__ would need a corresponding > path()/fspath() builtin. Not necessarily. Take a look at (say) dir(object()) and you'll see a few dunders that don't correspond to built-ins: __reduce__ and __reduce_ex__ are used by pickle; __sizeof__ is used by sys.getsizeof; __subclasshook__ is used by the ABC system; Another example is __trunc__ used by math.trunc(). So any such fspath function should stand on its own as a useful feature, not just because there's a dunder method __fspath__. -- Steve ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
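As an illustration of a dunder that is reached through a module-level function rather than a builtin, here is a small sketch using __trunc__ and math.trunc(). The Money class is a hypothetical example, not something from the thread.

```python
import math


class Money:
    """Hypothetical value type; the only special method it defines is __trunc__."""

    def __init__(self, amount):
        self.amount = amount

    def __trunc__(self):
        # math.trunc() calls this method; there is no trunc() builtin.
        return int(self.amount)


whole = math.trunc(Money(12.99))
```

A hypothetical fspath() function could live in a module in the same way, without needing to be a builtin.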
Re: [Python-Dev] When should pathlib stop being provisional?
On 6 April 2016 at 00:45, Guido van Rossum wrote:
> This does sound like it's the crucial issue, and it is worth writing
> up clearly the pros and cons. Let's draft those lists in a thread
> (this one's fine) and then add them to the PEP. We can then decide to:
>
> - keep the status quo
> - change PurePath to inherit from str
> - decide it's never going to be settled and kill pathlib.py
>
> (And yes, I'm dead serious about the latter, rather Solomonic option.)

By the way, even if there's no solution that satisfies everyone to the "inherit from str" question, I'd still be unhappy if pathlib disappeared from the stdlib. It's useful for quick admin scripts that don't justify an external dependency. Those typically do quite a bit of path manipulation, and as such benefit from the improved API of pathlib over os.path.

+1 on making (and documenting) a final decision on the "inherit from str" question
-1 on removing pathlib just because that decision might not satisfy everyone

Paul
Re: [Python-Dev] When should pathlib stop being provisional?
On Tue, Apr 05, 2016 at 11:53:05PM -0700, Nathaniel Smith wrote:

> This makes me twitch slightly, because NumPy has had a whole set of
> problems due to the ancient and minimally-considered decision to
> assume a bunch of ad hoc non-namespaced method names fulfilled some
> protocol -- like all .sum methods will have a signature that's
> compatible with numpy's, and if an object has a .log method then
> surely that computes the logarithm (what else in computing could "log"
> possibly refer to?), etc.

It's the down-side of duck-typing. It's all well and good accepting anything with a quack method, but not everything is that straightforward:

artist.draw()
gunslinger.draw()

I think that file system paths are important enough, and tricky enough, to justify their own protocol. I like Nick's suggestion of a special dunder method for converting path-like objects into paths, without the problems that str(x) has, or the risk of assuming that anything with a .path attribute refers to a file system path. (maze.path, garden.path, career.path perhaps?)

--
Steve
Re: [Python-Dev] When should pathlib stop being provisional?
Ethan Furman writes:
> > Not sure about os.path.*. The purpose of os.path module is manipulating
> > string paths. From the perspective of pathlib it can look lower level.
>
> The point is that a function that receives a "path" object (whether str
> or Path) shouldn't have to care: it should be able to call os.path.split
> on the thing it received and get back a usable answer.

pathlib should already replicate the useful parts of os.path. That was the design goal after all.

So this is like saying you want a Python file or socket object to be accepted by os.read(). In the rare case where you want that, you call the .fileno() method explicitly. The equivalent for Path objects is to look up the .path attribute explicitly.

Regards

Antoine.
Re: [Python-Dev] When should pathlib stop being provisional?
Nick Coghlan writes:
>
> sys.path, for example.
>
> That's why I'd actually prefer the implicit conversion protocol to be
> the more explicitly named "__fspath__", with suitable "__fspath__ =
> path" assignments added to DirEntry and pathlib.

That was my preference as well.

> However, I'm also not
> offering to actually *do* the work here, and the casting vote goes to
> the folks pursuing the implementation effort.

Indeed.

Regards

Antoine.
Re: [Python-Dev] When should pathlib stop being provisional?
Brett Cannon writes:
>
> :) I figured. I was close myself until I decided to be the "not
> inheriting from str is a sane decision" camp because people weren't
> understanding where the design decision probably came from, hence
> http://www.snarky.ca/why-pathlib-path-doesn-t-inherit-from-str

That's a good write-up, thank you. Paths don't have to inherit str any more than IP addresses or any other thing that happens to be passed as a string in traditional APIs.

On a concrete point, inheriting str would make the API a horrible, confusing, dangerous mess mixing regular string semantics (concatenation with +, for example, or indexing) with path-specific semantics and various grey areas (should .split() have path semantics or str semantics? what is the rule and how are people supposed to remember it?).

(of course, for PHP or Javascript programmers it may not sound like a problem. Let "adding" two IP addresses return the concatenation of their string representations...)

Regards

Antoine.
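The mixed semantics described above can be sketched with a toy str subclass. StrPath is a hypothetical class written only for this illustration; it is not pathlib and not any of the third-party path libraries mentioned in the thread.

```python
class StrPath(str):
    """Hypothetical str-derived path type, for illustration only."""

    def __truediv__(self, other):
        # Path-style joining, layered on top of the inherited str behaviour.
        return StrPath(self.rstrip("/") + "/" + str(other))


p = StrPath("/srv/data") / "report.txt"

joined = p + ".bak"   # inherited str.__add__: the result is a plain str, not a StrPath
parts = p.split(".")  # inherited str.split: splits on ".", not on path components
first = p[0]          # indexing yields a character, not a path component
```

Every inherited str method has to be either overridden with path semantics or left with string semantics, and the user has to remember which is which.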
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/06/2016 08:53 AM, Nathaniel Smith wrote: > On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlanwrote: >> On 6 April 2016 at 15:57, Serhiy Storchaka wrote: >>> On 06.04.16 05:44, Nick Coghlan wrote: The most promising option for that is probably "getattr(path, 'path', path)", since the "path" attribute is being added to pathlib, and the given idiom can be readily adopted in Python 2/3 compatible code (since normal strings and any other object without a "path" attribute are passed through unchanged). Alternatively, since it's a protocol, double-underscores on the property name may be appropriate (i.e. "getattr(path, '__path__', path)") >>> >>> This was already discussed. Current conclusion is using the "path" >>> attribute. See http://bugs.python.org/issue22570 . >> >> I'd missed the existing precedent in DirEntry.path, so simply taking >> that and running with it sounds good to me. > > This makes me twitch slightly, because NumPy has had a whole set of > problems due to the ancient and minimally-considered decision to > assume a bunch of ad hoc non-namespaced method names fulfilled some > protocol -- like all .sum methods will have a signature that's > compatible with numpy's, and if an object has a .log method then > surely that computes the logarithm (what else in computing could "log" > possibly refer to?), etc. This experience may or may not be relevant, > I'm not sure -- sometimes these kinds of twitches are good guides to > intuition, and sometimes they are just knee-jerk responses to an old > and irrelevant problem :-). But you might want to at least think about > how common it might be to have existing objects with unrelated > attributes that happen to be called "path", and the bizarro problems > that might be caused if someone accidentally passes one of them to a > function that expects all .path attributes to be instances of this new > protocol. 
> > -n > Python was in a similar situation with the .next method on iterators, which changed to __next__ in Python 3. PEP 3114 (which explains this change) says: > Code that nowhere contains an explicit call to a next method can > nonetheless be silently affected by the presence of such > a method. Therefore, this PEP proposes that iterators should have > a __next__ method instead of a next method (with no change in > semantics). How well does that apply to path/__path__? That PEP also introduced the next() builtin. This suggests that a protocol with __path__/__fspath__ would need a corresponding path()/fspath() builtin. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] When should pathlib stop being provisional?
On 6 April 2016 at 06:00, Guido van Rossumwrote: > On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman wrote: >> [...] we can't do: >> >> app_root = Path(...) >> config = app_root/'settings.cfg' >> with open(config) as blah: >> # whatever >> >> It feels like instead of addressing this basic disconnect, the answer has >> instead been: add that to pathlib! Which works great -- until a user or a >> library gets this path object and tries to use something from os on it. > > I agree that asking for config.open() isn't the right answer here > (even if it happens to work). But in this example, once 3.5.2 is out, > the solution would be to use open(config.path), and that will also > work when passing it to a library. Is it still unacceptable then? My sense is that this will remain unacceptable to those people who have a problem here. The issue is not so much the ugliness of the code (in spite of the fact that this is what people focus on) but rather the disconnect between the mental model people have and the reality of the code they have to write. The basic idea behind pathlib.Path objects is that they represent a *path*. And when you call open, you should pass it a path. So (the argument goes) why should you have to convert the path you have (a Path object) to pass it to a function (like open) that requires a path argument? Making stdlib functions work with Path objects would fix a lot of the conceptual difficulties here. And it would also mean that (thanks to duck typing) a lot of 3rd party code would work without change, further alleviating the issue. But ultimately, there will still be code that needs changing to be aware of Path objects. The change is simple enough (patharg = str(patharg), or the getattr('path') approach) but it's a change in mental model (this time by library authors) and the benefit of the change is not sufficiently obvious. Inheriting from str is the commonly-proposed solution, because in practical terms it works. 
But it does so by mixing layers of abstraction in a way that is difficult to explain to someone who thinks of a "path" as an abstract object rather than as a (text? byte?) string. Ultimately, all that's happening is that the burden of keeping the abstractions separate is placed on the design, rather than being explicit in the code. But while I have no evidence that this is a problem, it does leave me with a nagging feeling that it "seems similar to the bytes/text issue". My feelings: - I'd *like* to push for the cleaner separation of abstractions that a "pure" Path object provides. - It does need library writers (and in particular the stdlib) to "buy into" the model and make changes to support Path objects - I don't have a huge problem with using str(p) or p.path as a workaround during the transition, but that's from the POV of throwaway scripting. I'm not sure I'd be so happy using the workaround in code that would need to be supported for a long time. - I'd rather compromise on principles than abandon the idea of a stdlib Path object - In practical terms, inheriting from str is probably fine. At least evidence from 3rd party path libraries indicates so. Paul ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] When should pathlib stop being provisional?
On Apr 6, 2016 1:26 AM, "Chris Angelico"wrote: > > On Wed, Apr 6, 2016 at 3:37 PM, Stephen J. Turnbull wrote: > > Chris Angelico writes: > > > > > Outside of deliberate tests, we don't create files on our disks > > > whose names are strings of random bytes; > > > > Wishful thinking. First, names made of control characters have often > > been deliberately used by miscreants to conceal their warez. Second, > > in some systems it's all too easy to create paths with components in > > different locales (the place I've seen it most frequently is in NFS > > mounts). I think that's much less true today, but perhaps that's only > > because my employer figured out that it was much less pain if system > > paths were pure ASCII so that it mostly didn't matter what encoding > > users chose for their subtrees. > > Control characters are still characters, though. You can take a > bytestring consisting of byte values less than 32, decode it as UTF-8, > and have a series of codepoints to work with. > > If your employer has "solved" the problem by restricting system paths > to ASCII, that's a fine solution for a single system with a single > ASCII-compatible encoding; a better solution is to mandate UTF-8 as > the file system encoding, as that's what most people are expecting > anyway. > > > It remains important to be able to handle nearly arbitrary bytestrings > > in file names as far as I can see. Please note that 100 million > > Japanese and 1 billion Chinese by and large still prefer their > > homegrown encodings (plural!!) to Unicode, while many systems are now > > defaulting filenames to UTF-8. There's plenty of room remaining for > > copying bytestrings to arguments of open and friends. > > Why exactly do they prefer these other encodings? Are they > representing characters that Unicode doesn't contain? 
If so, we have a > fundamental problem (no Python program is going to be able to cope > with these, without a third party library or some stupid mess of local > code); if not, you can always represent it as Unicode and encode it as > UTF-8 when it reaches the file system. Re-encoding is something that's > easy when you treat something as text, and impossible when you treat > it as bytes. > > So far, you're still actually agreeing with me: paths are *text*, but > sometimes we don't know the encoding (and that's a problem to be > solved). re: bytestring, unicode, encodings after e.g. os.path.split / Path.split: from "[Python-ideas] Type hints for text/binary data in Python 2+3 code" https://mail.python.org/pipermail/python-ideas/2016-March/038869.html >> would/will it be possible to use Typing.Text as a base class for even-more abstract string types https://mail.python.org/pipermail/python-ideas/2016-March/039016.html >> * Text.encoding >> * Text.lang (urn:ietf:rfc:3066) ... forgot to CC: >> * https://tools.ietf.org/html/rfc5646 "Tags for Identifying Languages" urn:ietf:rfc:5646 is this (Path) a narrower case of string types (#strypes), because after transformations we want to preserve string metadata like e.g encoding? I'd vote for * adding DirEntry.__path__ as a proxy to DirEntry.path * standardizing on __path__ (over .path) * because this operation *is* fundamentally similar to e.g. __str__ * operator.path pathify, pathifize > > ChrisA > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/wes.turner%40gmail.com ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
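Two of the claims quoted above from Chris Angelico can be checked directly: bytes below 32 decode as UTF-8, and a name in a legacy encoding can be re-encoded once it is treated as text. This is a small sketch; Shift_JIS and the file name are assumed examples and do not come from the thread.

```python
# Bytes below 32 are ASCII control characters, so they decode cleanly as UTF-8.
raw = bytes(range(1, 32))
text = raw.decode("utf-8")

# A name stored in a legacy encoding can be re-encoded once it is text.
# Shift_JIS is an assumed example of a "homegrown" encoding.
legacy_bytes = "日本語.txt".encode("shift_jis")
name = legacy_bytes.decode("shift_jis")
utf8_bytes = name.encode("utf-8")

# Without knowing the original encoding, the same bytes are not valid UTF-8.
try:
    legacy_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
```

The last step is the case Stephen Turnbull is pointing at: when the encoding is unknown, the bytes cannot be turned into text reliably.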
Re: [Python-Dev] When should pathlib stop being provisional?
On 6 April 2016 at 16:53, Nathaniel Smithwrote: > On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlan wrote: >> I'd missed the existing precedent in DirEntry.path, so simply taking >> that and running with it sounds good to me. > > This makes me twitch slightly, because NumPy has had a whole set of > problems due to the ancient and minimally-considered decision to > assume a bunch of ad hoc non-namespaced method names fulfilled some > protocol -- like all .sum methods will have a signature that's > compatible with numpy's, and if an object has a .log method then > surely that computes the logarithm (what else in computing could "log" > possibly refer to?), etc. This experience may or may not be relevant, > I'm not sure -- sometimes these kinds of twitches are good guides to > intuition, and sometimes they are just knee-jerk responses to an old > and irrelevant problem :-) > > But you might want to at least think about > how common it might be to have existing objects with unrelated > attributes that happen to be called "path", and the bizarro problems > that might be caused if someone accidentally passes one of them to a > function that expects all .path attributes to be instances of this new > protocol. sys.path, for example. That's why I'd actually prefer the implicit conversion protocol to be the more explicitly named "__fspath__", with suitable "__fspath__ = path" assignments added to DirEntry and pathlib. However, I'm also not offering to actually *do* the work here, and the casting vote goes to the folks pursuing the implementation effort. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
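A rough sketch of how a dedicated "__fspath__" protocol could avoid the sys.path style of collision follows. The fspath() helper and the Entry and Maze classes are hypothetical names introduced for this example, and "__fspath__" is modelled here as a method, which is only one possible reading of the proposal (the message above suggests a plain attribute alias).

```python
def fspath(obj):
    """Hypothetical helper: return the str/bytes path for obj.

    str and bytes pass through unchanged; any other object has to opt in
    by defining __fspath__.
    """
    if isinstance(obj, (str, bytes)):
        return obj
    try:
        method = type(obj).__fspath__  # looked up on the type, like other dunders
    except AttributeError:
        raise TypeError(
            "expected str, bytes or an object with __fspath__, not "
            + type(obj).__name__
        ) from None
    return method(obj)


class Entry:
    """Hypothetical path-like object, standing in for DirEntry or a Path."""

    def __init__(self, path):
        self.path = path

    def __fspath__(self):
        return self.path


class Maze:
    """Unrelated object that merely happens to have a .path attribute."""

    path = ["left", "left", "right"]
```

With this shape, fspath(Entry("/tmp/x")) returns the string, while fspath(Maze()) raises TypeError instead of silently handing a list to a file system call.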
Re: [Python-Dev] When should pathlib stop being provisional?
On Tue, Apr 5, 2016 at 11:29 PM, Nick Coghlanwrote: > On 6 April 2016 at 15:57, Serhiy Storchaka wrote: >> On 06.04.16 05:44, Nick Coghlan wrote: >>> >>> The most promising option for that is probably "getattr(path, 'path', >>> path)", since the "path" attribute is being added to pathlib, and the >>> given idiom can be readily adopted in Python 2/3 compatible code >>> (since normal strings and any other object without a "path" attribute >>> are passed through unchanged). Alternatively, since it's a protocol, >>> double-underscores on the property name may be appropriate (i.e. >>> "getattr(path, '__path__', path)") >> >> This was already discussed. Current conclusion is using the "path" >> attribute. See http://bugs.python.org/issue22570 . > > I'd missed the existing precedent in DirEntry.path, so simply taking > that and running with it sounds good to me. This makes me twitch slightly, because NumPy has had a whole set of problems due to the ancient and minimally-considered decision to assume a bunch of ad hoc non-namespaced method names fulfilled some protocol -- like all .sum methods will have a signature that's compatible with numpy's, and if an object has a .log method then surely that computes the logarithm (what else in computing could "log" possibly refer to?), etc. This experience may or may not be relevant, I'm not sure -- sometimes these kinds of twitches are good guides to intuition, and sometimes they are just knee-jerk responses to an old and irrelevant problem :-). But you might want to at least think about how common it might be to have existing objects with unrelated attributes that happen to be called "path", and the bizarro problems that might be caused if someone accidentally passes one of them to a function that expects all .path attributes to be instances of this new protocol. -n -- Nathaniel J. 
Smith -- https://vorpus.org
Re: [Python-Dev] When should pathlib stop being provisional?
On 6 April 2016 at 16:25, Ethan Furman wrote:
> On 04/05/2016 10:50 PM, Serhiy Storchaka wrote:
>> On 06.04.16 05:44, Nick Coghlan wrote:
>>> The next challenge would then be to make a list of APIs to be updated
>>> for 3.6 to implicitly accept "rich path" objects via the agreed
>>> convention, with pathlib.PurePath used as a test class:
>>>
>>> * open()
>>> * codecs.open() (et al)
>>> * io.*
>>> * os.path.*
>>> * other os functions
>>> * shutil.*
>>> * tempfile.*
>>> * shelve.*
>>> * csv.*
>>
>> Not sure about os.path.*. The purpose of os.path module is manipulating
>> string paths. From the perspective of pathlib it can look lower level.
>
> The point is that a function that receives a "path" object (whether str or
> Path) shouldn't have to care: it should be able to call os.path.split on the
> thing it received and get back a usable answer.

I actually think it makes sense to pursue this question in a test driven manner: create "test_pathlib_support" as a new test case, start passing pathlib.PurePath instances to a relatively high level API like shutil, and see what low level interfaces need to be updated to accept filesystem path objects (in addition to strings) in order to make that work.

If shutil can be updated to support pathlib with changes solely at the io and os module layer, then that bodes well for transparently enabling support in 3rd party APIs as well.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
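A minimal sketch of what such a test-driven probe might look like is below. The class and method names are hypothetical, and the example uses a concrete pathlib.Path rather than PurePath because copying a file touches the disk. It assumes an interpreter in which shutil already accepts path objects; on one that does not, the failure would show which low-level call needs updating.

```python
import pathlib
import shutil
import tempfile
import unittest


class TestPathlibSupport(unittest.TestCase):
    def test_shutil_copy_accepts_path_objects(self):
        with tempfile.TemporaryDirectory() as tmp:
            src = pathlib.Path(tmp) / "src.txt"
            dst = pathlib.Path(tmp) / "dst.txt"
            src.write_text("hello")
            # Path objects are passed directly, with no str() conversion.
            shutil.copy(src, dst)
            self.assertEqual(dst.read_text(), "hello")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPathlibSupport)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```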
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/05/2016 10:00 PM, Guido van Rossum wrote:
> On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman wrote:
>> [...] we can't do:
>>
>>     app_root = Path(...)
>>     config = app_root/'settings.cfg'
>>     with open(config) as blah:
>>         # whatever
>>
>> It feels like instead of addressing this basic disconnect, the answer has
>> instead been: add that to pathlib! Which works great -- until a user or a
>> library gets this path object and tries to use something from os on it.
>
> I agree that asking for config.open() isn't the right answer here
> (even if it happens to work). But in this example, once 3.5.2 is out,
> the solution would be to use open(config.path), and that will also
> work when passing it to a library. Is it still unacceptable then?

On the one hand that is definitely more palatable. On the other hand it doesn't address having the stdlib itself directly support Path. On the gripping hand this feels reminiscent of the arguments over bytes vs unicode, but without any of the "This is why unicode is better!" bits.

Why is pathlib better than plain strings?

- attribute access to different parts such as the dirname, the filename, the extension (suffix)
- easy access to on-disk answers such as .exists(), .stat(), .chdir
- easy creation/modification of Path objects

What problem is it solving that makes the pain worth dealing with?

- no idea

This is an especially important point considering the str-derived Path libraries already out there that have the same advantages as pathlib, but none of the pain.

--
~Ethan~
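The "attribute access" and "easy creation/modification" points in the list above look roughly like this in practice. This is a small sketch using PurePosixPath so that it does not depend on the host OS or on any files existing; the path itself is a made-up example.

```python
import pathlib

# PurePosixPath keeps the example independent of the host OS and the disk.
config = pathlib.PurePosixPath("/srv/app") / "settings.cfg"

dirname = config.parent    # the containing directory, itself a path object
filename = config.name     # final component, as a str
stem = config.stem         # final component without its suffix
extension = config.suffix  # the suffix, including the leading dot

# Deriving a new path is a method call that returns a new object.
backup = config.with_suffix(".bak")
as_text = str(backup)      # explicit conversion for APIs that only take strings
```

The explicit str() on the last line is the friction being discussed: it is needed whenever the receiving API only understands strings.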
Re: [Python-Dev] When should pathlib stop being provisional?
On 6 April 2016 at 15:57, Serhiy Storchaka wrote:
> On 06.04.16 05:44, Nick Coghlan wrote:
>>
>> The most promising option for that is probably "getattr(path, 'path',
>> path)", since the "path" attribute is being added to pathlib, and the
>> given idiom can be readily adopted in Python 2/3 compatible code
>> (since normal strings and any other object without a "path" attribute
>> are passed through unchanged). Alternatively, since it's a protocol,
>> double-underscores on the property name may be appropriate (i.e.
>> "getattr(path, '__path__', path)")
>
> This was already discussed. Current conclusion is using the "path"
> attribute. See http://bugs.python.org/issue22570 .

I'd missed the existing precedent in DirEntry.path, so simply taking that and running with it sounds good to me.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
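The getattr idiom quoted above can be sketched as follows. RichPath, Garden and as_fs_path() are hypothetical names made up for this example; only the getattr(obj, "path", obj) expression comes from the thread.

```python
class RichPath:
    """Hypothetical rich path object exposing its string form as .path."""

    def __init__(self, path):
        self.path = path


class Garden:
    """Unrelated object whose .path attribute is not a file system path."""

    path = ("gate", "pond", "shed")


def as_fs_path(obj):
    # The idiom from the thread: objects without .path pass through unchanged.
    return getattr(obj, "path", obj)


from_str = as_fs_path("/etc/hosts")
from_rich = as_fs_path(RichPath("/etc/hosts"))
from_garden = as_fs_path(Garden())  # silently accepted, although it is not a path
```

The first two calls show why the idiom is convenient for Python 2/3 compatible code; the third shows the kind of accidental match that an unnamespaced attribute name allows.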
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/05/2016 10:50 PM, Serhiy Storchaka wrote:
> On 06.04.16 05:44, Nick Coghlan wrote:
>> The next challenge would then be to make a list of APIs to be updated
>> for 3.6 to implicitly accept "rich path" objects via the agreed
>> convention, with pathlib.PurePath used as a test class:
>>
>> * open()
>> * codecs.open() (et al)
>> * io.*
>> * os.path.*
>> * other os functions
>> * shutil.*
>> * tempfile.*
>> * shelve.*
>> * csv.*
>
> Not sure about os.path.*. The purpose of os.path module is manipulating
> string paths. From the perspective of pathlib it can look lower level.

The point is that a function that receives a "path" object (whether str or Path) shouldn't have to care: it should be able to call os.path.split on the thing it received and get back a usable answer.

--
~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
On Wed, Apr 6, 2016 at 3:37 PM, Stephen J. Turnbull wrote:
> Chris Angelico writes:
>
> > Outside of deliberate tests, we don't create files on our disks
> > whose names are strings of random bytes;
>
> Wishful thinking. First, names made of control characters have often
> been deliberately used by miscreants to conceal their warez. Second,
> in some systems it's all too easy to create paths with components in
> different locales (the place I've seen it most frequently is in NFS
> mounts). I think that's much less true today, but perhaps that's only
> because my employer figured out that it was much less pain if system
> paths were pure ASCII so that it mostly didn't matter what encoding
> users chose for their subtrees.

Control characters are still characters, though. You can take a
bytestring consisting of byte values less than 32, decode it as UTF-8,
and have a series of codepoints to work with. If your employer has
"solved" the problem by restricting system paths to ASCII, that's a
fine solution for a single system with a single ASCII-compatible
encoding; a better solution is to mandate UTF-8 as the file system
encoding, as that's what most people are expecting anyway.

> It remains important to be able to handle nearly arbitrary bytestrings
> in file names as far as I can see. Please note that 100 million
> Japanese and 1 billion Chinese by and large still prefer their
> homegrown encodings (plural!!) to Unicode, while many systems are now
> defaulting filenames to UTF-8. There's plenty of room remaining for
> copying bytestrings to arguments of open and friends.

Why exactly do they prefer these other encodings? Are they
representing characters that Unicode doesn't contain? If so, we have a
fundamental problem (no Python program is going to be able to cope
with these, without a third party library or some stupid mess of local
code); if not, you can always represent it as Unicode and encode it as
UTF-8 when it reaches the file system.

Re-encoding is something that's easy when you treat something as text,
and impossible when you treat it as bytes. So far, you're still
actually agreeing with me: paths are *text*, but sometimes we don't
know the encoding (and that's a problem to be solved).

ChrisA
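[Editor's illustration] The re-encoding point can be sketched in a few lines of Python. This is only a minimal sketch: the file name and the choice of Shift_JIS as the "homegrown" encoding are assumptions made for the example, not something taken from the thread.

```python
# Hypothetical file name; Shift_JIS stands in for a legacy local encoding.
name = "日本語.txt"

legacy_bytes = name.encode("shift_jis")

# Treated as text: decode with the known encoding, then re-encode as UTF-8.
utf8_bytes = legacy_bytes.decode("shift_jis").encode("utf-8")
print(utf8_bytes.decode("utf-8") == name)

# Treated as opaque bytes: without knowing the encoding, the same bytes
# cannot simply be read as UTF-8.
try:
    legacy_bytes.decode("utf-8")
    decodable_as_utf8 = True
except UnicodeDecodeError:
    decodable_as_utf8 = False
print("valid UTF-8 as-is:", decodable_as_utf8)

# Control characters, by contrast, decode without trouble.
control_text = bytes(range(1, 32)).decode("utf-8")
print(len(control_text))
```

The conversion only works because the source encoding is known, which is the "problem to be solved" mentioned above.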
Re: [Python-Dev] When should pathlib stop being provisional?
On 6 April 2016 at 15:59, Serhiy Storchaka wrote:
> On 06.04.16 08:52, Greg Ewing wrote:
>>
>> Nick Coghlan wrote:
>>>
>>> The most promising option for that is probably "getattr(path, 'path',
>>> path)",
>>
>>
>> Is there something seriously wrong with str(path)?
>
> What if path is None or bytes?

Or an int, float, list, dict, or arbitrary other object.

To be more explicit, the problem isn't what happens when the API doing
"str(path)" internally is used correctly, it's what happens when it's
used incorrectly: you end up proceeding with a nonsense string as your
path name, rather than failing early with TypeError or AttributeError.

Doing "getattr(path, 'path', path)" instead means that in the error
case (i.e. no "path" attribute), any existing argument checking is
still triggered normally.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
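[Editor's illustration] The failure mode described above can be shown with a small sketch. The RichPath class and the two helper functions are hypothetical names invented for this example; only the getattr(path, 'path', path) idiom comes from the thread.

```python
import os.path


class RichPath:
    """Hypothetical rich path object exposing its text form as ``.path``."""

    def __init__(self, path):
        self.path = path


def basename_via_str(path):
    # str() accepts anything, so bad input silently becomes a file name.
    return os.path.basename(str(path))


def basename_via_getattr(path):
    # Objects without a "path" attribute pass through unchanged, so the
    # argument checking inside os.path.basename() still runs.
    return os.path.basename(getattr(path, "path", path))


print(basename_via_str(None))                        # the text "None"
print(basename_via_getattr(RichPath("/etc/hosts")))  # unwrapped normally

try:
    basename_via_getattr(None)
    rejected = False
except (TypeError, AttributeError):
    # Which of the two is raised depends on the Python version.
    rejected = True
print("None rejected early:", rejected)
```

With str(), a caller who passes None by mistake ends up working with a file literally named "None"; with the getattr idiom the mistake surfaces at the call site.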
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/05/2016 10:40 PM, Stephen J. Turnbull wrote:
> Ethan Furman writes:
>> No, Stephen, that is not what this is about.
>
> Wrong Steven. Spelling matters in email too.

Yes, it absolutely does. My apologies.

> -1 Not good enough. I wouldn't do it that often that "ugly" overrides
> the reasoning Brett presented [...]
>
> But we don't object to (de)serializing dicts to (from) str (as JSON
> or pickle).

Amusingly enough, I don't have to deal with serializing dicts. :)

However, as a comparison: imagine you had to transform your dict to
JSON every time some function wanted a dict as input. And had to
transform returned JSON strings into dicts.

> I think Path vs. string is similarly different to justify saying so
> (especially when treating user input). [...] Thus, strings that look
> like paths (as strings) actually will have multiple internal
> representations, similarly to the way that a dict can have multiple
> serializations.

I don't follow. When dealing with the file system one passes a string*
representing the path of the object one wants -- pretty much the same
string that was passed in to Path.

--
~Ethan~

* or bytes, but the same sameness, really.
Re: [Python-Dev] When should pathlib stop being provisional?
On 06.04.16 08:52, Greg Ewing wrote:
> Nick Coghlan wrote:
>> The most promising option for that is probably "getattr(path, 'path',
>> path)",
>
> Is there something seriously wrong with str(path)?

What if path is None or bytes?
Re: [Python-Dev] When should pathlib stop being provisional?
On 06.04.16 05:44, Nick Coghlan wrote:
> The most promising option for that is probably "getattr(path, 'path',
> path)", since the "path" attribute is being added to pathlib, and the
> given idiom can be readily adopted in Python 2/3 compatible code
> (since normal strings and any other object without a "path" attribute
> are passed through unchanged). Alternatively, since it's a protocol,
> double-underscores on the property name may be appropriate (i.e.
> "getattr(path, '__path__', path)")

This was already discussed. Current conclusion is using the "path"
attribute. See http://bugs.python.org/issue22570 .
Re: [Python-Dev] When should pathlib stop being provisional?
Nick Coghlan wrote:
> The most promising option for that is probably "getattr(path, 'path',
> path)",

Is there something seriously wrong with str(path)?

--
Greg
Re: [Python-Dev] When should pathlib stop being provisional?
On 06.04.16 05:44, Nick Coghlan wrote:
> The next challenge would then be to make a list of APIs to be updated
> for 3.6 to implicitly accept "rich path" objects via the agreed
> convention, with pathlib.PurePath used as a test class:
>
> * open()
> * codecs.open() (et al)
> * io.*
> * os.path.*
> * other os functions
> * shutil.*
> * tempfile.*
> * shelve.*
> * csv.*

Not sure about os.path.*. The purpose of the os.path module is
manipulating string paths. From the perspective of pathlib it can look
lower level.

Supporting pathlib.Path will complicate and slow down os.path
functions (they are already more complex and slower than they were in
Python 2). Since os.path functions are often called several times in a
loop, their performance is important.

On the other hand, some Path methods are more efficient than os.path
functions, and Path-specialized code at a higher level can be
preferable.
Re: [Python-Dev] When should pathlib stop being provisional?
On 6 April 2016 at 15:03, Guido van Rossum wrote:
> On Tue, Apr 5, 2016 at 7:44 PM, Nick Coghlan wrote:
>> Option 4: define a rich-object-to-text path serialisation convention,
>
> Unfortunately that sounds like a classic "serious programming"
> solution (objects, abstractions, serialization, all big important
> words :-).

Yeah, my choice of phrasing made the idea sound more complicated than
it is. The actual change would be to add the following to some Python
standard library APIs that accept a filesystem path as an argument:

    arg = getattr(arg, "path", arg)

and the C API based equivalent to some C modules. (With the main
bike-sheddable part being whether to use the generic "path" or
something more explicit like "__fspath__" for the property name, since
pathlib can readily support either/both of them, and "__fspath__"
would be in line with the "os.fsencode" and "os.fsdecode"
abbreviations)

The key goal of this approach would be to make it so that most third
party libraries would "just work" with path objects if they were
already using os.path and other standard library APIs for path
manipulation (rather than using string methods directly), while still
avoiding the type confusion that comes from inheriting directly from
str.

From a testing perspective, it would arguably make sense to tackle it
as a separate "test_path_protocol" test case that checked pathlib
compatibility with the APIs of interest, simply to avoid adding a
pathlib dependency to all those module tests.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
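[Editor's illustration] A rough sketch of what the one-line change and a separate "test_path_protocol"-style test case might look like. Everything named here (FakeRichPath, file_size, TestPathProtocol) is hypothetical and stands in for real stdlib APIs and for pathlib; it is not the implementation that was discussed or adopted.

```python
import os
import tempfile
import unittest


class FakeRichPath:
    """Hypothetical stand-in for a rich path object with a ``path`` attribute."""

    def __init__(self, path):
        self.path = path


def file_size(arg):
    # The proposed one-liner: unwrap rich path objects, leave plain
    # strings (and everything else) untouched.
    arg = getattr(arg, "path", arg)
    return os.path.getsize(arg)


class TestPathProtocol(unittest.TestCase):
    def test_str_and_rich_path_agree(self):
        with tempfile.TemporaryDirectory() as tmp:
            name = os.path.join(tmp, "data.bin")
            with open(name, "wb") as f:
                f.write(b"12345")
            self.assertEqual(file_size(name), 5)
            self.assertEqual(file_size(FakeRichPath(name)), 5)


# Run the single test case explicitly rather than via unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPathProtocol)
result = unittest.TextTestRunner().run(suite)
```

Keeping the stand-in class inside the dedicated test module means the module under test never has to import pathlib itself, which is the dependency concern raised above.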
Re: [Python-Dev] When should pathlib stop being provisional?
Ethan Furman writes:

 > No, Stephen, that is not what this is about.

Wrong Steven. Spelling matters in email too. And he's more worth
paying attention to than I am. But I'll have my say anyway. ;-)

 > This is about the ugliness of code with str(path) this and
 > str(path) that

-1 Not good enough. I wouldn't do it that often that "ugly" overrides
the reasoning Brett presented, and if you do, I bet one or two
personal helpers would clean up 95% of your cases.

But see Nick's comment that "str(var)" is too permissive. I'll have
to think about that, but my first take is he's right, and we need to
do something about making use of Path more straightforward within the
stdlib. Whatever that is, preferably would make life easier for 3rd
party usage too, of course.

Is error-checking within Path sufficiently robust in the light of "too
permissive"? (I don't know exactly what I mean by that, but something
like if "str(var_purporting_to_be_Path)" is too permissive, are we
sure that "str(really_is_Path_var)" is "safe"? Apparently we haven't
had a lot of beta testing.)

 > and let's not forget the Path(this_returned_string) and
 > Path(that_returned_string),

But we don't object to (de)serializing dicts to (from) str (as JSON or
pickle). I think Path vs. string is similarly different to justify
saying so (especially when treating user input).

Note, too, that based on discussion in that thread it seems likely
that Path is likely to be inappropriate as an internal representation
of URL.RFC3986.Path. Thus, strings that look like paths (as strings)
actually will have multiple internal representations, similarly to the
way that a dict can have multiple serializations. If representation
transformation is not invertible, EIBTI says we need the
"boilerplate".

YMMV, but that's my take.
Re: [Python-Dev] When should pathlib stop being provisional?
Chris Angelico writes:

 > Outside of deliberate tests, we don't create files on our disks
 > whose names are strings of random bytes;

Wishful thinking. First, names made of control characters have often
been deliberately used by miscreants to conceal their warez. Second,
in some systems it's all too easy to create paths with components in
different locales (the place I've seen it most frequently is in NFS
mounts). I think that's much less true today, but perhaps that's only
because my employer figured out that it was much less pain if system
paths were pure ASCII so that it mostly didn't matter what encoding
users chose for their subtrees.

It remains important to be able to handle nearly arbitrary bytestrings
in file names as far as I can see. Please note that 100 million
Japanese and 1 billion Chinese by and large still prefer their
homegrown encodings (plural!!) to Unicode, while many systems are now
defaulting filenames to UTF-8. There's plenty of room remaining for
copying bytestrings to arguments of open and friends.
Re: [Python-Dev] When should pathlib stop being provisional?
On 06.04.16 01:41, Brett Cannon wrote:
> After a rather extensive discussion on python-ideas about
> pathlib.PurePath not inheriting from str, another point that came up
> was that the use of pathlib has been rather light. Unfortunately even
> the stdlib doesn't really use pathlib because it's currently marked as
> provisional (or at least that's why I haven't tried to use it where
> possible in importlib).
>
> Do we have a plan of what is required to remove the provisional label
> from pathlib?

The behavior of the Path.resolve() method should probably be changed,
which would break backward compatibility. There is an open issue about
this.
Re: [Python-Dev] When should pathlib stop being provisional?
On Tue, Apr 5, 2016 at 7:44 PM, Nick Coghlan wrote:
> Option 4: define a rich-object-to-text path serialisation convention,

Unfortunately that sounds like a classic "serious programming"
solution (objects, abstractions, serialization, all big important
words :-).

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] When should pathlib stop being provisional?
On Tue, Apr 5, 2016 at 9:29 PM, Ethan Furman wrote:
> [...] we can't do:
>
>     app_root = Path(...)
>     config = app_root/'settings.cfg'
>     with open(config) as blah:
>         # whatever
>
> It feels like instead of addressing this basic disconnect, the answer has
> instead been: add that to pathlib! Which works great -- until a user or a
> library gets this path object and tries to use something from os on it.

I agree that asking for config.open() isn't the right answer here
(even if it happens to work). But in this example, once 3.5.2 is out,
the solution would be to use open(config.path), and that will also
work when passing it to a library. Is it still unacceptable then?

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] When should pathlib stop being provisional?
On 6 April 2016 at 13:06, Alexander Walters wrote:
> I think the naysayers would be satisfied with an object that... while not
> str or bytes or a derived class of either... acted like str when it had to.
> Is that possible without deriving from str or bytes?

Only if the consuming code explicitly casts with "str()", and that's
*too* permissive for most use cases (since __str__ and the __repr__
fallback are completely inappropriate as a "convert to a text
representation of a filesystem path" command).

A "__text__" protocol for non-lossy conversions to str would arguably
be feasible, but its scope goes way beyond what's needed for a "rich
path object" conversion protocol. Implementing that model in the
general case would require something more akin to
https://www.python.org/dev/peps/pep-0357/, which added __index__ as a
guaranteed-non-lossy conversion from other types to a builtin integer,
allowing non-builtin integers to be accepted for things like slicing
and sequence repetition, without inadvertently also accepting
non-integral types like builtin floats.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
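[Editor's illustration] For readers unfamiliar with the __index__ precedent mentioned above, here is a minimal sketch of how it behaves. The Offset class is a hypothetical example type; __index__ itself is the real protocol added by PEP 357.

```python
class Offset:
    """Hypothetical integer-like type that does not inherit from int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Lossless conversion to a builtin int, as defined by PEP 357.
        return self._value


letters = ["a", "b", "c", "d"]

print(letters[Offset(1)])    # indexing
print(letters[:Offset(2)])   # slicing
print("ab" * Offset(3))      # sequence repetition

try:
    letters[1.0]
    float_accepted = True
except TypeError:
    # Floats do not define __index__, so they are still rejected.
    float_accepted = False
print("float accepted as index:", float_accepted)
```

The analogy to paths is that a dedicated conversion hook lets consumers accept "things that really are X" without falling back on a cast as permissive as int() or str().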
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/05/2016 07:40 PM, Steven D'Aprano wrote:
> On Tue, Apr 05, 2016 at 11:47:32PM +, Brett Cannon wrote:
>> To me it seems to basically be a question of whether people can be
>> patient during a transition and embrace pathlib over time or if they
>> will simply refuse to add support in libraries and refuse to use
>> `getattr(path, 'path', path)` or `str(path)` in the mean time.
>
> Wait, what? Is that what the whole fuss is about? That some people
> refuse to call str(path) when passing a path object to a function that
> expects a string?

No, Stephen, that is not what this is about.

This is about the ugliness of code with str(path) this and str(path)
that and let's not forget the Path(this_returned_string) and
Path(that_returned_string), not to mention the frustrations of
forgetting to cast a str to Path or a Path to str.

It's about the horror of boiler-plate infecting our otherwise
beautiful Python code.

--
~Ethan~
Re: [Python-Dev] When should pathlib stop being provisional?
On 04/05/2016 03:55 PM, Guido van Rossum wrote:
> It's been provisional since 3.4. I think if it is still there in 3.6.0
> it should be considered no longer provisional. But this may indeed be
> a test case for the ultimate fate of provisional modules -- should we
> remove it?

We should either remove it or make the rest of the stdlib work with
it. Currently, pathlib.*Paths are second-class citizens, and working
with them is not significantly better than working with os.path.*
simply because we have to cast to str every time we want to deal with
any other part of the stdlib.

> Would making it inherit from str cause most hostility to disappear?

I don't think that is necessary. The hostility (of which I have some)
is because we can't do:

    app_root = Path(...)
    config = app_root/'settings.cfg'
    with open(config) as blah:
        # whatever

It feels like instead of addressing this basic disconnect, the answer
has instead been: add that to pathlib! Which works great -- until a
user or a library gets this path object and tries to use something
from os on it.

To come at this from a different angle: Python now has Enum; it is
arguable that Path is more important, or at least much more useful.
We have IntEnum whose sole purpose in life is to make it possible to
(mostly) seamlessly work with the stdlib and other libraries where
ints are being used to represent enumerations; and in pathlib we have
. . . absolutely nothing. We have the promise of great things and
wonderful usability, but in reality we have just as much pain as
before -- or more if we forget to str(path) somewhere.

I said that pathlib.Path does not need to inherit from str, and I
still think that; however, to be a good stepping stone / transitional
library I think the pathlib backport does need to have its Paths
inherit from str.

--
~Ethan~
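[Editor's illustration] The IntEnum comparison can be made concrete with a short sketch. The two enum classes below are hypothetical names chosen for the example; io.BytesIO.seek() is simply one stdlib API that expects a plain int.

```python
import enum
import io


class Whence(enum.IntEnum):
    """Hypothetical IntEnum; its members are real ints."""
    START = 0
    CURRENT = 1


class PlainWhence(enum.Enum):
    """Same values, but a plain Enum with no int behaviour."""
    START = 0
    CURRENT = 1


buf = io.BytesIO(b"abcdef")

# The IntEnum member is accepted wherever an int is expected.
buf.seek(2, Whence.START)
first = buf.read(1)
print(first)

# The plain Enum member is rejected, much as a Path was rejected by open().
try:
    buf.seek(2, PlainWhence.START)
    plain_accepted = True
except TypeError:
    plain_accepted = False
print("plain Enum accepted:", plain_accepted)
```

This is the "seamless" interoperability the message says pathlib lacked at the time: IntEnum gets it through inheritance from int, whereas the thread is debating how Path could get it without inheriting from str.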
Re: [Python-Dev] When should pathlib stop being provisional?
On Wed, Apr 6, 2016 at 12:51 PM, Steven D'Aprano wrote:
> On Wed, Apr 06, 2016 at 10:02:30AM +1000, Chris Angelico wrote:
>
>> My personal view on the text/bytes debate is that a path is
>> fundamentally a human concept, and consists therefore of text. The
>> fact that some file systems store (at the low level) bytes and some
>> store (I think) UTF-16 code units should be immaterial; path
>> components exist for people. We can smuggle unrecognized bytes around,
>> but ultimately, those bytes came from characters at some point - we
>> just don't know the encoding. So a Path object has no relationship
>> with bytes, only with str.
>
> That might be usually true in practice, but it is incorrect in
> principle. Paths in POSIX systems like Linux are fundamentally
> byte-strings with only two restrictions: \0 and \x2f are forbidden.

That's the file system level. But more fundamentally than that, a path
exists so that humans can refer to files. That's why they have
*names*, not just dirent numbers. We could assign dirent number -1 to
mean "parent directory", and then represent everything with tuples of
directory entries. Follow the chain and you get an inode. Absolute
paths would start with an inode (the root directory being inode 2) and
proceed with dirents thereafter. Maybe we'd need a pseudo-inode to
mean "current directory". Should we do paths like this? No way! Much
better to have either "/home/rosuav/cpython/python" or (P.ROOT,
"home", "rosuav", "cpython", "python") to represent them, because they
exist for the human.

The POSIX file system rules aren't insignificant, but my point is that
every byte value seen in a file name was once representing a
character. Outside of deliberate tests, we don't create files on our
disks whose names are strings of random bytes; the normal use of a
file system is to store files that a human has named. Hence my
recommendation that a Path object be tied to str, but *not* to bytes.

> The fact that paths in Linux mostly happen to look like English words
> (often heavily abbreviated) is a historical accident. The file system
> itself supported paths containing (say) \xff even back in the days when
> text was pure US-ASCII and bytes over \x7f had no textual meaning, and
> these days paths still support sequences of bytes that have no human
> meaning in any encoding.
>
> I don't know if this makes the tiniest lick of difference for Pathlib. I
> would be perfectly content if we stuck with the design decision that
> Pathlib can only represent paths representable as Unicode strings, and
> left weird POSIX filenames to the legacy byte-string interface.

I'd prefer to keep the surrogateescape compatibility hack with U+DC00
to U+DCFF being used to smuggle bytes around. That means that every
path can be represented as a Unicode string, with only minor loss of
functionality (imagine a path with only a single character that can't
be decoded - chances are a human can figure out what the file is), but
it still strongly pushes to a Unicode interpretation of the path.

An *actual* byte-string interface (such as os.listdir and friends
support) would be completely outside of anything involving Pathlib. If
you give bytes, you'll get bytes. And I'd deprecate that once Path
objects are more broadly accepted.

ChrisA
Re: [Python-Dev] When should pathlib stop being provisional?
On 4/5/2016 22:44, Nick Coghlan wrote:
> Option 4: define a rich-object-to-text path serialisation convention,
> as paths are not conceptually the same as arbitrary strings

Just as a nit to pick, it is perfectly acceptable for hypothetical
path objects to raise when someone tries to shoehorn them into acting
like arbitrary strings - open() will gladly halt and catch fire if you
try to pass the text of War and Peace as an argument.

I think the naysayers would be satisfied with an object that... while
not str or bytes or a derived class of either... acted like str when
it had to. Is that possible without deriving from str or bytes?
Re: [Python-Dev] When should pathlib stop being provisional?
On Wed, Apr 06, 2016 at 10:02:30AM +1000, Chris Angelico wrote:

> My personal view on the text/bytes debate is that a path is
> fundamentally a human concept, and consists therefore of text. The
> fact that some file systems store (at the low level) bytes and some
> store (I think) UTF-16 code units should be immaterial; path
> components exist for people. We can smuggle unrecognized bytes around,
> but ultimately, those bytes came from characters at some point - we
> just don't know the encoding. So a Path object has no relationship
> with bytes, only with str.

That might be usually true in practice, but it is incorrect in
principle. Paths in POSIX systems like Linux are fundamentally
byte-strings with only two restrictions: \0 and \x2f are forbidden.

The fact that paths in Linux mostly happen to look like English words
(often heavily abbreviated) is a historical accident. The file system
itself supported paths containing (say) \xff even back in the days when
text was pure US-ASCII and bytes over \x7f had no textual meaning, and
these days paths still support sequences of bytes that have no human
meaning in any encoding.

I don't know if this makes the tiniest lick of difference for Pathlib. I
would be perfectly content if we stuck with the design decision that
Pathlib can only represent paths representable as Unicode strings, and
left weird POSIX filenames to the legacy byte-string interface.

--
Steve
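[Editor's illustration] One existing mechanism for carrying such undecodable bytes through a str is the "surrogateescape" error handler from PEP 383. The sketch below uses a made-up file name containing \xff and assumes UTF-8 as the encoding; on POSIX systems os.fsdecode() and os.fsencode() apply the same handler with the file system encoding.

```python
raw = b"report-\xff.txt"   # hypothetical name; \xff is not valid UTF-8

# Each undecodable byte 0x80..0xFF becomes a lone surrogate in the
# range U+DC80..U+DCFF, so the result is still a str.
text = raw.decode("utf-8", "surrogateescape")
print(ascii(text))

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("utf-8", "surrogateescape")
print(round_tripped == raw)

# A strict encode refuses the smuggled surrogate, so the name cannot be
# mistaken for ordinary, well-formed text.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
print("strict encode allowed:", strict_ok)
```

Under this scheme a str-only Path can still name a file like the \xff example above, at the cost of strings that are not valid Unicode text in the strict sense.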
Re: [Python-Dev] When should pathlib stop being provisional?
On 6 April 2016 at 09:45, Guido van Rossum wrote:
> On Tue, Apr 5, 2016 at 4:13 PM, Chris Angelico wrote:
>> On Wed, Apr 6, 2016 at 9:08 AM, Alexander Walters
>> wrote:
>>> * pathlib should be improved (specifically by making it inherit from str)
>>
>> I'd like to see this specific change settled on in the PEP, actually.
>> There are some arguments on both sides, and some hybrid solutions
>> being proposed, and it looks to be an important enough issue to people
>> for there to be an answer somewhere. It seems to come down to a
>> sloppiness vs strictness concern, I think, but I'm not sure.
>
> This does sound like it's the crucial issue, and it is worth writing
> up clearly the pros and cons. Let's draft those lists in a thread
> (this one's fine) and then add them to the PEP. We can then decide to:
>
> - keep the status quo
> - change PurePath to inherit from str
> - decide it's never going to be settled and kill pathlib.py

Option 4: define a rich-object-to-text path serialisation convention,
as paths are not conceptually the same as arbitrary strings, and we
can define a new protocol accepted by builtins and standard library
modules, while third parties can't

The most promising option for that is probably "getattr(path, 'path',
path)", since the "path" attribute is being added to pathlib, and the
given idiom can be readily adopted in Python 2/3 compatible code
(since normal strings and any other object without a "path" attribute
are passed through unchanged). Alternatively, since it's a protocol,
double-underscores on the property name may be appropriate (i.e.
"getattr(path, '__path__', path)")

The next challenge would then be to make a list of APIs to be updated
for 3.6 to implicitly accept "rich path" objects via the agreed
convention, with pathlib.PurePath used as a test class:

* open()
* codecs.open() (et al)
* io.*
* os.path.*
* other os functions
* shutil.*
* tempfile.*
* shelve.*
* csv.*

The list wouldn't necessarily need to be 100% comprehensive (similar
to the rollout of context management, "support rich path objects in
API " may appear as future RFEs), but it should be comprehensive
enough for rich path objects to mostly "just work" with other APIs
that aren't specifically limiting their inputs to str objects
(although using lower level APIs may force a conversion to the lower
level plain text representation as a side-effect).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] When should pathlib stop being provisional?
I haven't really been following this discussion, but a couple of
comments...

On Tue, Apr 05, 2016 at 11:47:32PM +, Brett Cannon wrote:

> http://www.snarky.ca/why-pathlib-path-doesn-t-inherit-from-str

Nice write-up, thanks.

[...]
> To me it seems to basically be a question of whether people can be patient
> during a transition and embrace pathlib over time or if they will simply
> refuse to add support in libraries and refuse to use `getattr(path, 'path',
> path)` or `str(path)` in the mean time.

Wait, what? Is that what the whole fuss is about? That some people
refuse to call str(path) when passing a path object to a function that
expects a string?

Really? That's it? The mind boggles.

--
Steve
Re: [Python-Dev] When should pathlib stop being provisional?
On 4/5/2016 7:45 PM, Guido van Rossum wrote:
> This does sound like it's the crucial issue, and it is worth writing
> up clearly the pros and cons. Let's draft those lists in a thread
> (this one's fine) and then add them to the PEP. We can then decide to:
>
> - keep the status quo
> - change PurePath to inherit from str
> - decide it's never going to be settled and kill pathlib.py
>
> (And yes, I'm dead serious about the latter, rather Solomonic option.)

My sense of the discussion was that some people think that the
new-in-upcoming 3.5.2 PurePath.path should serve as a substitute for
inheriting from str. In particular, it should make it easy for
stringpath functions to also accept path objects.

--
Terry Jan Reedy
Re: [Python-Dev] When should pathlib stop being provisional?
On Wed, Apr 6, 2016 at 9:45 AM, Guido van Rossum wrote:
> On Tue, Apr 5, 2016 at 4:13 PM, Chris Angelico wrote:
>> On Wed, Apr 6, 2016 at 9:08 AM, Alexander Walters wrote:
>>> * pathlib should be improved (specifically by making it inherit from str)
>>
>> I'd like to see this specific change settled on in the PEP, actually.
>> There are some arguments on both sides, and some hybrid solutions
>> being proposed, and it looks to be an important enough issue to people
>> for there to be an answer somewhere. It seems to come down to a
>> sloppiness vs strictness concern, I think, but I'm not sure.
>
> This does sound like it's the crucial issue, and it is worth writing
> up clearly the pros and cons. Let's draft those lists in a thread
> (this one's fine) and then add them to the PEP. We can then decide to:
>
> - keep the status quo
> - change PurePath to inherit from str
> - decide it's never going to be settled and kill pathlib.py
>
> (And yes, I'm dead serious about the latter, rather Solomonic option.)

Summarizing from memory to get things started.

Inheriting from str makes it easier for code to support pathlib
without really caring about the details.

NOT inheriting from str forces code to be aware that it's working with
a path, in the same way that text and bytes are fundamentally
different things, and the Unicode string doesn't inherit from the byte
string, nor vice versa.

If a few crucial built-in functions support Path objects (notably
open() and a handful of os.* functions), the bulk of stdlib support
will be easy (sometimes trivial) to implement.

Paths are [or are not] fundamentally different from strings. <-- argued point

Paths might be backed by Unicode text, and might be backed by bytes.
Should a Path be able to be implicitly constructed from either?

Should there be some sort of "Path literal"? <-- possibly a completely
separate question, to be resolved after this one

How should .. be handled? Can you canonicalize a Path?

Can Path handle URIs as well as file system paths?

My personal view on the text/bytes debate is that a path is
fundamentally a human concept, and consists therefore of text. The
fact that some file systems store (at the low level) bytes and some
store (I think) UTF-16 code units should be immaterial; path
components exist for people. We can smuggle unrecognized bytes around,
but ultimately, those bytes came from characters at some point - we
just don't know the encoding. So a Path object has no relationship
with bytes, only with str.

Whether a Path is fundamentally "a text string that uses slashes to
separate components" or "a tuple of path components" is up for debate.
Both make a lot of sense, and I'm somewhat inclined to the latter
view; it allows for other forms of path component, such as an open
directory (for statat/openat etc), or a special thing representing
"current directory" or "root directory".

ChrisA
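To make the two views of a path concrete, here is a minimal sketch using pathlib as it exists today. The example path is made up for illustration, and PurePosixPath is chosen only so the behaviour does not depend on the platform. It shows that a pure path is not a str, that the "tuple of components" view is available through .parts, and that ".." is kept rather than collapsed.

```python
from pathlib import PurePosixPath

# Hypothetical example path; pure paths never touch the filesystem.
p = PurePosixPath("/some/test/../path/file_with.ext")

# A pure path does not inherit from str; convert explicitly when text is needed.
is_str = isinstance(p, str)
as_text = str(p)

# The "tuple of path components" view of the same path.
components = p.parts

# ".." is preserved as a component: without consulting the filesystem
# (symlinks), collapsing it could change what the path refers to.
rebuilt = PurePosixPath(*components)

print(is_str, components, rebuilt == p)
```

Rebuilding the path from its components yields an equal object, which suggests the string view and the component view are interchangeable for pure paths.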
Re: [Python-Dev] When should pathlib stop being provisional?
On Tue, 5 Apr 2016 at 15:55 Guido van Rossum wrote:

> It's been provisional since 3.4. I think if it is still there in 3.6.0
> it should be considered no longer provisional. But this may indeed be
> a test case for the ultimate fate of provisional modules -- should we
> remove it?
>
> I have to admit I got tired of the discussions and muted them all.

:) I figured. I was close myself until I decided to join the "not
inheriting from str is a sane decision" camp because people weren't
understanding where the design decision probably came from, hence
http://www.snarky.ca/why-pathlib-path-doesn-t-inherit-from-str .

> Personally I am not worried about the light use (I always expected it
> would take a long time to get adoption)

Ditto. My expectation/hope is that once we stop having it be
provisional and we start using it in the stdlib then usage will pick
up, especially if libraries pick up the `getattr(path, 'path', path)`
idiom as an easy transition technique until they decide to drop
support for str-based paths. The main motivation of this email is
actually to have newcomers to the sprints at PyCon US sprint on adding
support for pathlib (after we add "path-like object" to the glossary
to say something like "a `str` object or an object that has a `path`
attribute that itself is a `str`").

> but I am worried about the
> hostility towards the module. My last/only comment in the discussion
> was about there possibly being a dichotomy between people who use
> Python for scripting and those who use it to write more substantial
> programs (I'm trying not to judge one group more important than
> another -- I'm just observing there seem to be these two groups). But
> I didn't stick around long enough to watch for responses to this idea.

Nope, no response (as Alexander pointed out).

> Would making it inherit from str cause most hostility to disappear?

Probably. Most people were upset with pathlib because they couldn't
use it immediately with all of the third-party libraries out there on
top of the stdlib because adoption has been so low. Now if we make a
concerted effort to accept pathlib in the stdlib then this may be the
kick in the pants that it takes to start getting people to accept it
externally and the transition band-aid of inheriting from str may not
be needed.

To me it seems to basically be a question of whether people can be
patient during a transition and embrace pathlib over time or if they
will simply refuse to add support in libraries and refuse to use
`getattr(path, 'path', path)` or `str(path)` in the meantime.
Personally, if we can wait out the Python 3 transition I have no issue
waiting on a transition like this that has no backward-compatibility
issues and has a one-liner solution for adding shallow support (and
thus is ripe for quick patches to projects).

After the whole str thing the only other major topic was coming up
with some easier way to produce pathlib.Path instances (e.g. the
p-string suggestion). Nothing concrete that reached consensus really
came of those discussions, though (I think that may have been where
your scripting/substantial programming comment came from).

> I'm sure there was a discussion about this when PEP 428 was originally
> proposed, and I recall I was strongly in the camp of "it should not
> inherit from str", but unfortunately the PEP has no mention of this
> discussion or even the stated reason.

https://www.python.org/dev/peps/pep-0428/#no-confusion-with-builtins
is the best you get in the PEP.

-Brett

> --Guido
>
> On Tue, Apr 5, 2016 at 3:41 PM, Brett Cannon wrote:
> > After a rather extensive discussion on python-ideas about
> > pathlib.PurePath not inheriting from str, another point that came up
> > was that the use of pathlib has been rather light. Unfortunately even
> > the stdlib doesn't really use pathlib because it's currently marked
> > as provisional (or at least that's why I haven't tried to use it
> > where possible in importlib).
> >
> > Do we have a plan of what is required to remove the provisional label
> > from pathlib?
>
> --
> --Guido van Rossum (python.org/~guido)
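The `getattr(path, 'path', path)` idiom mentioned above can be sketched as follows. This is only an illustration under stated assumptions: `as_str_path` and `HypotheticalPathLike` are made-up names, and the `path` attribute is the convention proposed in this thread rather than something the sketch relies on pathlib itself providing. For pathlib objects it falls back on `str(path)`, the other one-liner named above.

```python
from pathlib import PurePath, PurePosixPath


class HypotheticalPathLike:
    """Made-up object following the proposed convention: a str `path` attribute."""

    def __init__(self, path):
        self.path = path


def as_str_path(path):
    """Return a str for a str, an object with a `path` attribute, or a pathlib path."""
    # The transition idiom: use the `path` attribute if present, else the object itself.
    path = getattr(path, "path", path)
    # Fallback for pathlib objects, which convert cleanly with str().
    if isinstance(path, PurePath):
        path = str(path)
    return path


print(as_str_path("some/file.txt"))
print(as_str_path(HypotheticalPathLike("some/file.txt")))
print(as_str_path(PurePosixPath("some/file.txt")))
```

A library could call such a helper once at the top of each public function and keep the rest of its code str-based, which is what makes the support "shallow".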
Re: [Python-Dev] When should pathlib stop being provisional?
On Tue, Apr 5, 2016 at 4:13 PM, Chris Angelico wrote:
> On Wed, Apr 6, 2016 at 9:08 AM, Alexander Walters wrote:
>> * pathlib should be improved (specifically by making it inherit from str)
>
> I'd like to see this specific change settled on in the PEP, actually.
> There are some arguments on both sides, and some hybrid solutions
> being proposed, and it looks to be an important enough issue to people
> for there to be an answer somewhere. It seems to come down to a
> sloppiness vs strictness concern, I think, but I'm not sure.

This does sound like it's the crucial issue, and it is worth writing
up clearly the pros and cons. Let's draft those lists in a thread
(this one's fine) and then add them to the PEP. We can then decide to:

- keep the status quo
- change PurePath to inherit from str
- decide it's never going to be settled and kill pathlib.py

(And yes, I'm dead serious about the latter, rather Solomonic option.)

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] When should pathlib stop being provisional?
On Wed, Apr 6, 2016 at 9:08 AM, Alexander Walters wrote:
> * pathlib should be improved (specifically by making it inherit from str)

I'd like to see this specific change settled on in the PEP, actually.
There are some arguments on both sides, and some hybrid solutions
being proposed, and it looks to be an important enough issue to people
for there to be an answer somewhere. It seems to come down to a
sloppiness vs strictness concern, I think, but I'm not sure.

ChrisA
Re: [Python-Dev] When should pathlib stop being provisional?
On 4/5/2016 18:55, Guido van Rossum wrote:
> My last/only comment in the discussion was about there possibly being
> a dichotomy between people who use Python for scripting and those who
> use it to write more substantial programs (I'm trying not to judge one
> group more important than another -- I'm just observing there seem to
> be these two groups). But I didn't stick around long enough to watch
> for responses to this idea.

This was all but ignored. The opinions mentioned in the thread,
without throwing my opinion behind any of them, were:

* pathlib should be improved (specifically by making it inherit from str)
* the stdlib should be made to deal with pathlib without changing pathlib
* pathlib is redundant to third party modules which work better
* the continued existence of pathlib was briefly discussed

You can insert the never-ending arguments for and against each of
those points in your head - none of them were particularly convincing
(in that I don't think anyone changed their position).

The split between utility scripting and application development was
not really discussed.
Re: [Python-Dev] When should pathlib stop being provisional?
I think the provisional status can be safely lifted now. Even though
pathlib hasn't seen that much use, there have been enough reports and
discussion since its acceptance that I think the API has proven it's
sane for general use.

(as for importlib, pathlib might have too many dependencies for sane
bootstrapping)

Regards

Antoine.

On 06/04/2016 at 00:41, Brett Cannon wrote:
> After a rather extensive discussion on python-ideas about
> pathlib.PurePath not inheriting from str, another point that came up was
> that the use of pathlib has been rather light. Unfortunately even the
> stdlib doesn't really use pathlib because it's currently marked as
> provisional (or at least that's why I haven't tried to use it where
> possible in importlib).
>
> Do we have a plan of what is required to remove the provisional label
> from pathlib?
Re: [Python-Dev] When should pathlib stop being provisional?
It's been provisional since 3.4. I think if it is still there in 3.6.0
it should be considered no longer provisional. But this may indeed be
a test case for the ultimate fate of provisional modules -- should we
remove it?

I have to admit I got tired of the discussions and muted them all.
Personally I am not worried about the light use (I always expected it
would take a long time to get adoption) but I am worried about the
hostility towards the module. My last/only comment in the discussion
was about there possibly being a dichotomy between people who use
Python for scripting and those who use it to write more substantial
programs (I'm trying not to judge one group more important than
another -- I'm just observing there seem to be these two groups). But
I didn't stick around long enough to watch for responses to this idea.

Would making it inherit from str cause most hostility to disappear?
I'm sure there was a discussion about this when PEP 428 was originally
proposed, and I recall I was strongly in the camp of "it should not
inherit from str", but unfortunately the PEP has no mention of this
discussion or even the stated reason.

--Guido

On Tue, Apr 5, 2016 at 3:41 PM, Brett Cannon wrote:
> After a rather extensive discussion on python-ideas about pathlib.PurePath
> not inheriting from str, another point that came up was that the use of
> pathlib has been rather light. Unfortunately even the stdlib doesn't really
> use pathlib because it's currently marked as provisional (or at least that's
> why I haven't tried to use it where possible in importlib).
>
> Do we have a plan of what is required to remove the provisional label from
> pathlib?

--
--Guido van Rossum (python.org/~guido)
[Python-Dev] When should pathlib stop being provisional?
After a rather extensive discussion on python-ideas about
pathlib.PurePath not inheriting from str, another point that came up
was that the use of pathlib has been rather light. Unfortunately even
the stdlib doesn't really use pathlib because it's currently marked as
provisional (or at least that's why I haven't tried to use it where
possible in importlib).

Do we have a plan of what is required to remove the provisional label
from pathlib?