Re: [Python-Dev] proposed os.fspath() change
On Wed, Jun 15, 2016 at 11:00 PM, Ethan Furmanwrote: > On 06/15/2016 12:24 PM, Koos Zevenhoven wrote: >> >> And the other question could be turned into whether to make str and >> bytes also PathLike in __subclasshook__. > > No, for two reasons. > > - most str's and bytes' are not paths; True. Well, at least most str and bytes objects are not *meant* to be used as paths, even if they could be. > - PathLike indicates a rich-path object, which str's and bytes' are not. This does not count as a reason. If this were called pathlib.PathABC, I would definitely agree [1]. But since this is called os.PathLike, I'm not quite as sure. Anyway, including str and bytes is more of a type hinting issue. And since type hints will in also act as documentation, the naming of types is becoming more important. -- Koos [1] No, I'm not proposing moving this to pathlib > -- > ~Ethan~ > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com -- + Koos Zevenhoven + http://twitter.com/k7hoven + ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
On 06/15/2016 12:24 PM, Koos Zevenhoven wrote: On Wed, Jun 15, 2016 at 10:15 PM, Brett Cannon wrote: ABCs like os.PathLike can override __subclasshook__ so that registration isn't required (see https://hg.python.org/cpython/file/default/Lib/os.py#l1136). So registration is definitely good to do to be explicit that you're trying to meet an ABC, but it isn't strictly required. And the other question could be turned into whether to make str and bytes also PathLike in __subclasshook__. No, for two reasons. - most str's and bytes' are not paths; - PathLike indicates a rich-path object, which str's and bytes' are not. -- ~Ethan~ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
PEP 519 updated: https://hg.python.org/peps/rev/92feff129ee4 On Wed, 15 Jun 2016 at 11:44 Brett Cannonwrote: > On Wed, 15 Jun 2016 at 11:39 Guido van Rossum wrote: > >> OK, so let's add a check on the return of __fspath__() and keep the check >> on path-like or string/bytes. >> > > I'll update the PEP. > > Ethan, do you want to leave a note on the os.fspath() issue to update the > code and go through where we've used os.fspath() to see where we can cut > out redundant type checks? > > >> --Guido (mobile) >> On Jun 15, 2016 11:29 AM, "Nick Coghlan" wrote: >> >>> On 15 June 2016 at 10:59, Brett Cannon wrote: >>> > >>> > >>> > On Wed, 15 Jun 2016 at 09:48 Guido van Rossum >>> wrote: >>> >> >>> >> These are really two separate proposals. >>> >> >>> >> I'm okay with checking the return value of calling obj.__fspath__; >>> that's >>> >> an error in the object anyways, and it doesn't matter much whether we >>> do >>> >> this or not (though when approving the PEP I considered this and >>> decided not >>> >> to insert a check for this). But it doesn't affect your example, does >>> it? I >>> >> guess it's easier to raise now and change the API in the future to >>> avoid >>> >> raising in this case (if we find that raising is undesirable) than >>> the other >>> >> way around, so I'm +0 on this. >>> > >>> > +0 from me as well. I know in some code in the stdlib that has been >>> ported >>> > which prior to adding support was explicitly checking for str/bytes >>> this >>> > will eliminate its own checking (obviously not a motivating factor as >>> it's >>> > pretty minor). >>> >>> I'd like a strong assertion that the return value of os.fspath() is a >>> plausible filesystem path representation (so either bytes or str), and >>> *not* some other kind of object that can also be used for accessing >>> the filesystem (like a file descriptor or an IO stream) >>> >>> >> The other proposal (passing anything that's not understood right >>> through) >>> >> is more interesting and your use case is somewhat compelling. >>> Catching the >>> >> exception coming out of os.fspath() would certainly be much messier. >>> The >>> >> question remaining is whether, when this behavior is not desired >>> (e.g. when >>> >> the caller of os.fspath() just wants a string that it can pass to >>> open()), >>> >> the condition of passing that's neither a string not supports >>> __fspath__ >>> >> still produces an understandable error. I'm not sure that that's the >>> case. >>> >> E.g. open() accepts file descriptors in addition to paths, but I'm >>> not sure >>> >> that accepting an integer is a good idea in most cases -- it either >>> gives a >>> >> mystery "Bad file descriptor" error or starts reading/writing some >>> random >>> >> system file, which it then closes once the stream is closed. >>> > >>> > The FD issue of magically passing through an int was also a concern >>> when >>> > Ethan brought this up in an issue on the tracker. My argument is that >>> FDs >>> > are not file paths and so shouldn't magically pass through if we're >>> going to >>> > type-check anything or claim os.fspath() only works with paths (FDs are >>> > already open file objects). So in my view either we go ahead and >>> type-check >>> > the return value of __fspath__() and thus restrict everything coming >>> out of >>> > os.fspath() to Union[str, bytes] or we don't type check anything and be >>> > consistent that os.fspath() simply does is call __fspath__() if >>> present. >>> > >>> > And just because I'm thinking about it, I would special-case the FDs, >>> not >>> > os.PathLike (clearer why you care and faster as it skips the override >>> of >>> > __subclasshook__): >>> > >>> > # Can be a single-line ternary operator if preferred. >>> > if not isinstance(filename, int): >>> > filename = os.fspath(filename) >>> >>> Note that the LZMA case Ethan cites is one where the code accepts >>> either an already opened file-like object *or* a path-like object, and >>> does different things based on which it receives. >>> >>> In that scenario, rather than introducing an unconditional "filename = >>> os.fspath(filename)" before the current logic, it makes more sense to >>> me to change the current logic to use the new protocol check rather >>> than a strict typecheck on str/bytes: >>> >>> if isinstance(filename, os.PathLike): # Changed line >>> filename = os.fspath(filename)# New line >>> if "b" not in mode: >>> mode += "b" >>> self._fp = builtins.open(filename, mode) >>> self._closefp = True >>> self._mode = mode_code >>> elif hasattr(filename, "read") or hasattr(filename, "write"): >>> self._fp = filename >>> self._mode = mode_code >>> else: >>> raise TypeError( >>> "filename must be a path-like or file-like object" >>> ) >>> >>> I *don't* think it makes sense
Re: [Python-Dev] proposed os.fspath() change
On Wed, Jun 15, 2016 at 10:15 PM, Brett Cannonwrote: > > > On Wed, 15 Jun 2016 at 12:12 Koos Zevenhoven wrote: >> >> >> if isinstance(filename, os.PathLike): >> >> By the way, regarding the line of code above, is there a convention >> regarding whether implementing some protocol/interface requires >> registering with (or inheriting from) the appropriate ABC for it to >> work in all situations. IOW, in this case, is it sufficient to >> implement __fspath__ to make your type pathlike? Is there a conscious >> trend towards requiring the ABC? > > > ABCs like os.PathLike can override __subclasshook__ so that registration > isn't required (see > https://hg.python.org/cpython/file/default/Lib/os.py#l1136). So registration > is definitely good to do to be explicit that you're trying to meet an ABC, > but it isn't strictly required. Ok I suppose that's fine, so I propose we update the ABC part in the PEP with __subclasshook__. And the other question could be turned into whether to make str and bytes also PathLike in __subclasshook__. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven + ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
On 06/15/2016 12:10 PM, Koos Zevenhoven wrote: if isinstance(filename, os.PathLike): By the way, regarding the line of code above, is there a convention regarding whether implementing some protocol/interface requires registering with (or inheriting from) the appropriate ABC for it to work in all situations. IOW, in this case, is it sufficient to implement __fspath__ to make your type pathlike? Is there a conscious trend towards requiring the ABC? The ABC is not required, simply having the __fspath__ attribute is enough. Of course, to actually work that attribute should be a function that returns a str or bytes object. ;) -- ~Ethan~ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
On Wed, 15 Jun 2016 at 12:12 Koos Zevenhovenwrote: > >> if isinstance(filename, os.PathLike): > > By the way, regarding the line of code above, is there a convention > regarding whether implementing some protocol/interface requires > registering with (or inheriting from) the appropriate ABC for it to > work in all situations. IOW, in this case, is it sufficient to > implement __fspath__ to make your type pathlike? Is there a conscious > trend towards requiring the ABC? > ABCs like os.PathLike can override __subclasshook__ so that registration isn't required (see https://hg.python.org/cpython/file/default/Lib/os.py#l1136). So registration is definitely good to do to be explicit that you're trying to meet an ABC, but it isn't strictly required. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
>> if isinstance(filename, os.PathLike): By the way, regarding the line of code above, is there a convention regarding whether implementing some protocol/interface requires registering with (or inheriting from) the appropriate ABC for it to work in all situations. IOW, in this case, is it sufficient to implement __fspath__ to make your type pathlike? Is there a conscious trend towards requiring the ABC? -- Koos ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
On 06/15/2016 11:44 AM, Brett Cannon wrote: On Wed, 15 Jun 2016 at 11:39 Guido van Rossum wrote: OK, so let's add a check on the return of __fspath__() and keep the check on path-like or string/bytes. I'll update the PEP. Ethan, do you want to leave a note on the os.fspath() issue to update the code and go through where we've used os.fspath() to see where we can cut out redundant type checks? Will do. I didn't see this subthread before my last post, so unless you agree with those other changes feel free to ignore it. ;) -- ~Ethan~ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
On Wed, Jun 15, 2016 at 9:29 PM, Nick Coghlanwrote: > On 15 June 2016 at 10:59, Brett Cannon wrote: >> >> >> On Wed, 15 Jun 2016 at 09:48 Guido van Rossum wrote: >>> >>> These are really two separate proposals. >>> >>> I'm okay with checking the return value of calling obj.__fspath__; that's >>> an error in the object anyways, and it doesn't matter much whether we do >>> this or not (though when approving the PEP I considered this and decided not >>> to insert a check for this). But it doesn't affect your example, does it? I >>> guess it's easier to raise now and change the API in the future to avoid >>> raising in this case (if we find that raising is undesirable) than the other >>> way around, so I'm +0 on this. >> >> +0 from me as well. I know in some code in the stdlib that has been ported >> which prior to adding support was explicitly checking for str/bytes this >> will eliminate its own checking (obviously not a motivating factor as it's >> pretty minor). > > I'd like a strong assertion that the return value of os.fspath() is a > plausible filesystem path representation (so either bytes or str), and > *not* some other kind of object that can also be used for accessing > the filesystem (like a file descriptor or an IO stream) I agree, so I'm -0.5 on passing through any object (at least by default). >>> The other proposal (passing anything that's not understood right through) >>> is more interesting and your use case is somewhat compelling. Catching the >>> exception coming out of os.fspath() would certainly be much messier. The >>> question remaining is whether, when this behavior is not desired (e.g. when >>> the caller of os.fspath() just wants a string that it can pass to open()), >>> the condition of passing that's neither a string not supports __fspath__ >>> still produces an understandable error. I'm not sure that that's the case. >>> E.g. open() accepts file descriptors in addition to paths, but I'm not sure >>> that accepting an integer is a good idea in most cases -- it either gives a >>> mystery "Bad file descriptor" error or starts reading/writing some random >>> system file, which it then closes once the stream is closed. >> >> The FD issue of magically passing through an int was also a concern when >> Ethan brought this up in an issue on the tracker. My argument is that FDs >> are not file paths and so shouldn't magically pass through if we're going to >> type-check anything or claim os.fspath() only works with paths (FDs are >> already open file objects). So in my view either we go ahead and type-check >> the return value of __fspath__() and thus restrict everything coming out of >> os.fspath() to Union[str, bytes] or we don't type check anything and be >> consistent that os.fspath() simply does is call __fspath__() if present. >> >> And just because I'm thinking about it, I would special-case the FDs, not >> os.PathLike (clearer why you care and faster as it skips the override of >> __subclasshook__): >> >> # Can be a single-line ternary operator if preferred. >> if not isinstance(filename, int): >> filename = os.fspath(filename) > > Note that the LZMA case Ethan cites is one where the code accepts > either an already opened file-like object *or* a path-like object, and > does different things based on which it receives. > > In that scenario, rather than introducing an unconditional "filename = > os.fspath(filename)" before the current logic, it makes more sense to > me to change the current logic to use the new protocol check rather > than a strict typecheck on str/bytes: > > if isinstance(filename, os.PathLike): # Changed line > filename = os.fspath(filename)# New line You are making one of my earlier points here, thanks ;). The point is that the name PathLike sounds like it would mean anything path-like, except that os.PathLike does not include str and bytes. And I still think the naming should be a little different. So that would be (os.Pathlike, str, bytes) instead of just os.PathLike. > if "b" not in mode: > mode += "b" > self._fp = builtins.open(filename, mode) > self._closefp = True > self._mode = mode_code > elif hasattr(filename, "read") or hasattr(filename, "write"): > self._fp = filename > self._mode = mode_code > else: > raise TypeError( > "filename must be a path-like or file-like object" > ) > > I *don't* think it makes sense to weaken the guarantees on os.fspath > to let it propagate non-path-like objects. > > Cheers, > Nick. > > -- > Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com -- + Koos Zevenhoven +
Re: [Python-Dev] proposed os.fspath() change
On 06/15/2016 10:59 AM, Brett Cannon wrote: On Wed, 15 Jun 2016 at 09:48 Guido van Rossum wrote: These are really two separate proposals. I'm okay with checking the return value of calling obj.__fspath__; that's an error in the object anyways, and it doesn't matter much whether we do this or not (though when approving the PEP I considered this and decided not to insert a check for this). But it doesn't affect your example, does it? I guess it's easier to raise now and change the API in the future to avoid raising in this case (if we find that raising is undesirable) than the other way around, so I'm +0 on this. +0 from me as well. I know in some code in the stdlib that has been ported which prior to adding support was explicitly checking for str/bytes this will eliminate its own checking (obviously not a motivating factor as it's pretty minor). If we accept both parts of this proposal the checking will have to stay in place as the original argument may not have been bytes, str, nor os.PathLike. The other proposal (passing anything that's not understood right through) is more interesting and your use case is somewhat compelling. Catching the exception coming out of os.fspath() would certainly be much messier. The question remaining is whether, when this behavior is not desired (e.g. when the caller of os.fspath() just wants a string that it can pass to open()), the condition of passing that's neither a string not supports __fspath__ still produces an understandable error. This is no different than before os.fspath() existed -- if the function wasn't checking that the "filename" was a str but just used it as-is, then whatever strange, possibly-hard-to-debug error they would get now is the same as what they would have gotten before. I'm not sure that that's the case. E.g. open() accepts file descriptors in addition to paths, but I'm not sure that accepting an integer is a good idea in most cases -- it either gives a mystery "Bad file descriptor" error or starts reading/writing some random system file, which it then closes once the stream is closed. My vision of os.fspath() is simply to reduce rich-path objects to their component str or bytes representation, and pass anything else through. The advantage: - if os.open accepts str/bytes/fd it can prep the argument by calling os.fspath() and then do it's argument checking all in one place; - if lzma accepts bytes/str/filelike-obj it can prep its argument by calling os.fspath() and then do it's argument checking all in one place - if Path accepts str/os.PathLike it can prep it's argument(s) with os.fspath() and then do its argument checking all in one place. The FD issue of magically passing through an int was also a concern when Ethan brought this up in an issue on the tracker. My argument is that FDs are not file paths and so shouldn't magically pass through if we're going to type-check anything or claim os.fspath() only works with paths (FDs are already open file objects). So in my view either we go ahead and type-check the return value of __fspath__() and thus restrict everything coming out of os.fspath() to Union[str, bytes] or we don't type check anything and be consistent that os.fspath() simply does is call __fspath__() if present. This is better than what os.fspath() currently does as it has all the advantages listed above, but why is checking the output of __fspath__ incompatible with not checking anything else? And just because I'm thinking about it, I would special-case the FDs, not os.PathLike (clearer why you care and faster as it skips the override of __subclasshook__): # Can be a single-line ternary operator if preferred. if not isinstance(filename, int): filename = os.fspath(filename) That example will not do the right thing in the lzma case. -- ~Ethan~ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
OK, so let's add a check on the return of __fspath__() and keep the check on path-like or string/bytes. --Guido (mobile) On Jun 15, 2016 11:29 AM, "Nick Coghlan"wrote: > On 15 June 2016 at 10:59, Brett Cannon wrote: > > > > > > On Wed, 15 Jun 2016 at 09:48 Guido van Rossum wrote: > >> > >> These are really two separate proposals. > >> > >> I'm okay with checking the return value of calling obj.__fspath__; > that's > >> an error in the object anyways, and it doesn't matter much whether we do > >> this or not (though when approving the PEP I considered this and > decided not > >> to insert a check for this). But it doesn't affect your example, does > it? I > >> guess it's easier to raise now and change the API in the future to avoid > >> raising in this case (if we find that raising is undesirable) than the > other > >> way around, so I'm +0 on this. > > > > +0 from me as well. I know in some code in the stdlib that has been > ported > > which prior to adding support was explicitly checking for str/bytes this > > will eliminate its own checking (obviously not a motivating factor as > it's > > pretty minor). > > I'd like a strong assertion that the return value of os.fspath() is a > plausible filesystem path representation (so either bytes or str), and > *not* some other kind of object that can also be used for accessing > the filesystem (like a file descriptor or an IO stream) > > >> The other proposal (passing anything that's not understood right > through) > >> is more interesting and your use case is somewhat compelling. Catching > the > >> exception coming out of os.fspath() would certainly be much messier. The > >> question remaining is whether, when this behavior is not desired (e.g. > when > >> the caller of os.fspath() just wants a string that it can pass to > open()), > >> the condition of passing that's neither a string not supports __fspath__ > >> still produces an understandable error. I'm not sure that that's the > case. > >> E.g. open() accepts file descriptors in addition to paths, but I'm not > sure > >> that accepting an integer is a good idea in most cases -- it either > gives a > >> mystery "Bad file descriptor" error or starts reading/writing some > random > >> system file, which it then closes once the stream is closed. > > > > The FD issue of magically passing through an int was also a concern when > > Ethan brought this up in an issue on the tracker. My argument is that FDs > > are not file paths and so shouldn't magically pass through if we're > going to > > type-check anything or claim os.fspath() only works with paths (FDs are > > already open file objects). So in my view either we go ahead and > type-check > > the return value of __fspath__() and thus restrict everything coming out > of > > os.fspath() to Union[str, bytes] or we don't type check anything and be > > consistent that os.fspath() simply does is call __fspath__() if present. > > > > And just because I'm thinking about it, I would special-case the FDs, > not > > os.PathLike (clearer why you care and faster as it skips the override of > > __subclasshook__): > > > > # Can be a single-line ternary operator if preferred. > > if not isinstance(filename, int): > > filename = os.fspath(filename) > > Note that the LZMA case Ethan cites is one where the code accepts > either an already opened file-like object *or* a path-like object, and > does different things based on which it receives. > > In that scenario, rather than introducing an unconditional "filename = > os.fspath(filename)" before the current logic, it makes more sense to > me to change the current logic to use the new protocol check rather > than a strict typecheck on str/bytes: > > if isinstance(filename, os.PathLike): # Changed line > filename = os.fspath(filename)# New line > if "b" not in mode: > mode += "b" > self._fp = builtins.open(filename, mode) > self._closefp = True > self._mode = mode_code > elif hasattr(filename, "read") or hasattr(filename, "write"): > self._fp = filename > self._mode = mode_code > else: > raise TypeError( > "filename must be a path-like or file-like object" > ) > > I *don't* think it makes sense to weaken the guarantees on os.fspath > to let it propagate non-path-like objects. > > Cheers, > Nick. > > -- > Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia > ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
On Wed, 15 Jun 2016 at 11:39 Guido van Rossumwrote: > OK, so let's add a check on the return of __fspath__() and keep the check > on path-like or string/bytes. > I'll update the PEP. Ethan, do you want to leave a note on the os.fspath() issue to update the code and go through where we've used os.fspath() to see where we can cut out redundant type checks? > --Guido (mobile) > On Jun 15, 2016 11:29 AM, "Nick Coghlan" wrote: > >> On 15 June 2016 at 10:59, Brett Cannon wrote: >> > >> > >> > On Wed, 15 Jun 2016 at 09:48 Guido van Rossum wrote: >> >> >> >> These are really two separate proposals. >> >> >> >> I'm okay with checking the return value of calling obj.__fspath__; >> that's >> >> an error in the object anyways, and it doesn't matter much whether we >> do >> >> this or not (though when approving the PEP I considered this and >> decided not >> >> to insert a check for this). But it doesn't affect your example, does >> it? I >> >> guess it's easier to raise now and change the API in the future to >> avoid >> >> raising in this case (if we find that raising is undesirable) than the >> other >> >> way around, so I'm +0 on this. >> > >> > +0 from me as well. I know in some code in the stdlib that has been >> ported >> > which prior to adding support was explicitly checking for str/bytes this >> > will eliminate its own checking (obviously not a motivating factor as >> it's >> > pretty minor). >> >> I'd like a strong assertion that the return value of os.fspath() is a >> plausible filesystem path representation (so either bytes or str), and >> *not* some other kind of object that can also be used for accessing >> the filesystem (like a file descriptor or an IO stream) >> >> >> The other proposal (passing anything that's not understood right >> through) >> >> is more interesting and your use case is somewhat compelling. Catching >> the >> >> exception coming out of os.fspath() would certainly be much messier. >> The >> >> question remaining is whether, when this behavior is not desired (e.g. >> when >> >> the caller of os.fspath() just wants a string that it can pass to >> open()), >> >> the condition of passing that's neither a string not supports >> __fspath__ >> >> still produces an understandable error. I'm not sure that that's the >> case. >> >> E.g. open() accepts file descriptors in addition to paths, but I'm not >> sure >> >> that accepting an integer is a good idea in most cases -- it either >> gives a >> >> mystery "Bad file descriptor" error or starts reading/writing some >> random >> >> system file, which it then closes once the stream is closed. >> > >> > The FD issue of magically passing through an int was also a concern when >> > Ethan brought this up in an issue on the tracker. My argument is that >> FDs >> > are not file paths and so shouldn't magically pass through if we're >> going to >> > type-check anything or claim os.fspath() only works with paths (FDs are >> > already open file objects). So in my view either we go ahead and >> type-check >> > the return value of __fspath__() and thus restrict everything coming >> out of >> > os.fspath() to Union[str, bytes] or we don't type check anything and be >> > consistent that os.fspath() simply does is call __fspath__() if present. >> > >> > And just because I'm thinking about it, I would special-case the FDs, >> not >> > os.PathLike (clearer why you care and faster as it skips the override of >> > __subclasshook__): >> > >> > # Can be a single-line ternary operator if preferred. >> > if not isinstance(filename, int): >> > filename = os.fspath(filename) >> >> Note that the LZMA case Ethan cites is one where the code accepts >> either an already opened file-like object *or* a path-like object, and >> does different things based on which it receives. >> >> In that scenario, rather than introducing an unconditional "filename = >> os.fspath(filename)" before the current logic, it makes more sense to >> me to change the current logic to use the new protocol check rather >> than a strict typecheck on str/bytes: >> >> if isinstance(filename, os.PathLike): # Changed line >> filename = os.fspath(filename)# New line >> if "b" not in mode: >> mode += "b" >> self._fp = builtins.open(filename, mode) >> self._closefp = True >> self._mode = mode_code >> elif hasattr(filename, "read") or hasattr(filename, "write"): >> self._fp = filename >> self._mode = mode_code >> else: >> raise TypeError( >> "filename must be a path-like or file-like object" >> ) >> >> I *don't* think it makes sense to weaken the guarantees on os.fspath >> to let it propagate non-path-like objects. >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia >> > ___ Python-Dev mailing list
Re: [Python-Dev] proposed os.fspath() change
On 15 June 2016 at 10:59, Brett Cannonwrote: > > > On Wed, 15 Jun 2016 at 09:48 Guido van Rossum wrote: >> >> These are really two separate proposals. >> >> I'm okay with checking the return value of calling obj.__fspath__; that's >> an error in the object anyways, and it doesn't matter much whether we do >> this or not (though when approving the PEP I considered this and decided not >> to insert a check for this). But it doesn't affect your example, does it? I >> guess it's easier to raise now and change the API in the future to avoid >> raising in this case (if we find that raising is undesirable) than the other >> way around, so I'm +0 on this. > > +0 from me as well. I know in some code in the stdlib that has been ported > which prior to adding support was explicitly checking for str/bytes this > will eliminate its own checking (obviously not a motivating factor as it's > pretty minor). I'd like a strong assertion that the return value of os.fspath() is a plausible filesystem path representation (so either bytes or str), and *not* some other kind of object that can also be used for accessing the filesystem (like a file descriptor or an IO stream) >> The other proposal (passing anything that's not understood right through) >> is more interesting and your use case is somewhat compelling. Catching the >> exception coming out of os.fspath() would certainly be much messier. The >> question remaining is whether, when this behavior is not desired (e.g. when >> the caller of os.fspath() just wants a string that it can pass to open()), >> the condition of passing that's neither a string not supports __fspath__ >> still produces an understandable error. I'm not sure that that's the case. >> E.g. open() accepts file descriptors in addition to paths, but I'm not sure >> that accepting an integer is a good idea in most cases -- it either gives a >> mystery "Bad file descriptor" error or starts reading/writing some random >> system file, which it then closes once the stream is closed. > > The FD issue of magically passing through an int was also a concern when > Ethan brought this up in an issue on the tracker. My argument is that FDs > are not file paths and so shouldn't magically pass through if we're going to > type-check anything or claim os.fspath() only works with paths (FDs are > already open file objects). So in my view either we go ahead and type-check > the return value of __fspath__() and thus restrict everything coming out of > os.fspath() to Union[str, bytes] or we don't type check anything and be > consistent that os.fspath() simply does is call __fspath__() if present. > > And just because I'm thinking about it, I would special-case the FDs, not > os.PathLike (clearer why you care and faster as it skips the override of > __subclasshook__): > > # Can be a single-line ternary operator if preferred. > if not isinstance(filename, int): > filename = os.fspath(filename) Note that the LZMA case Ethan cites is one where the code accepts either an already opened file-like object *or* a path-like object, and does different things based on which it receives. In that scenario, rather than introducing an unconditional "filename = os.fspath(filename)" before the current logic, it makes more sense to me to change the current logic to use the new protocol check rather than a strict typecheck on str/bytes: if isinstance(filename, os.PathLike): # Changed line filename = os.fspath(filename)# New line if "b" not in mode: mode += "b" self._fp = builtins.open(filename, mode) self._closefp = True self._mode = mode_code elif hasattr(filename, "read") or hasattr(filename, "write"): self._fp = filename self._mode = mode_code else: raise TypeError( "filename must be a path-like or file-like object" ) I *don't* think it makes sense to weaken the guarantees on os.fspath to let it propagate non-path-like objects. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] proposed os.fspath() change
My proposal at the point of the first PEP draft solved both of these issues. That version of the fspath function passed anything right through that was an instance of the keyword-only `type_constraint`. If not, it would ask __fspath__, and before returning the result, it would check that __fspath__ returned an instance of `type_constraint` and otherwise raise a TypeError. `type_constraint=object` would then have given the behavior you want. I always wanted fspath to spare the caller from all the instance checking (most of which it does even now). The main problem with setting type_constraint to something broader than (str, bytes) is that then that parameter would affect the return type of the function, which would at least complicate the type hinting issue. Mypy might now support things like @overload def fspath(path: T, type_constraint: Type[T] = (str, bytes)) -> T: ... but then again, isinstance and Union are not compatible (for a reason?), and PEP484 for a reason does not allow tuples like (str, bytes) in place of Unions. Anyway, if we were to go back to this behavior, we would need to decide whether to officially allow a wider type constraint or whether to leave that to Stack Overflow, so to speak. -- Koos On Wed, Jun 15, 2016 at 7:46 PM, Guido van Rossumwrote: > These are really two separate proposals. > > I'm okay with checking the return value of calling obj.__fspath__; that's an > error in the object anyways, and it doesn't matter much whether we do this > or not (though when approving the PEP I considered this and decided not to > insert a check for this). But it doesn't affect your example, does it? I > guess it's easier to raise now and change the API in the future to avoid > raising in this case (if we find that raising is undesirable) than the other > way around, so I'm +0 on this. > > The other proposal (passing anything that's not understood right through) is > more interesting and your use case is somewhat compelling. Catching the > exception coming out of os.fspath() would certainly be much messier. The > question remaining is whether, when this behavior is not desired (e.g. when > the caller of os.fspath() just wants a string that it can pass to open()), > the condition of passing that's neither a string not supports __fspath__ > still produces an understandable error. I'm not sure that that's the case. > E.g. open() accepts file descriptors in addition to paths, but I'm not sure > that accepting an integer is a good idea in most cases -- it either gives a > mystery "Bad file descriptor" error or starts reading/writing some random > system file, which it then closes once the stream is closed. > > On Wed, Jun 15, 2016 at 9:12 AM, Ethan Furman wrote: >> >> I would like to make a change to os.fspath(). >> >> Specifically, os.fspath() currently raises an exception if something >> besides str, bytes, or os.PathLike is passed in, but makes no checks >> if an os.PathLike object returns something besides a str or bytes. >> >> I would like to change that to the opposite: if a non-os.PathLike is >> passed in, return it unchanged (so no change for str and bytes); if >> an os.PathLike object returns something that is not a str nor bytes, >> raise. >> >> An example of the difference in the lzma file: >> >> Current code (has not been upgraded to use os.fspath() yet) >> --- >> >> if isinstance(filename, (str, bytes)): >> if "b" not in mode: >> mode += "b" >> self._fp = builtins.open(filename, mode) >> self._closefp = True >> self._mode = mode_code >> elif hasattr(filename, "read") or hasattr(filename, "write"): >> self._fp = filename >> self._mode = mode_code >> else: >> raise TypeError( >> "filename must be a str or bytes object, or a file" >> ) >> >> Code change if using upgraded os.fspath() (placed before above stanza): >> >> filename = os.fspath(filename) >> >> Code change with current os.fspath() (ditto): >> >> if isinstance(filename, os.PathLike): >> filename = os.fspath(filename) >> >> My intention with the os.fspath() function was to minimize boiler-plate >> code and make PathLike objects easy and painless to support; having to >> discover if any given parameter is PathLike before calling os.fspath() >> on it is, IMHO, just the opposite. >> >> There is also precedent for having a __dunder__ check the return type: >> >> --> class Huh: >> ... def __int__(self): >> ... return 'string' >> ... def __index__(self): >> ... return b'bytestring' >> ... def __bool__(self): >> ... return 'true-ish' >> ... >> --> h = Huh() >> >> --> int(h) >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: __int__ returned non-int (type str) >> >> --> ''[h] >> Traceback (most recent call last):
Re: [Python-Dev] proposed os.fspath() change
On Wed, 15 Jun 2016 at 09:48 Guido van Rossumwrote: > These are really two separate proposals. > > I'm okay with checking the return value of calling obj.__fspath__; that's > an error in the object anyways, and it doesn't matter much whether we do > this or not (though when approving the PEP I considered this and decided > not to insert a check for this). But it doesn't affect your example, does > it? I guess it's easier to raise now and change the API in the future to > avoid raising in this case (if we find that raising is undesirable) than > the other way around, so I'm +0 on this. > +0 from me as well. I know in some code in the stdlib that has been ported which prior to adding support was explicitly checking for str/bytes this will eliminate its own checking (obviously not a motivating factor as it's pretty minor). > > The other proposal (passing anything that's not understood right through) > is more interesting and your use case is somewhat compelling. Catching the > exception coming out of os.fspath() would certainly be much messier. The > question remaining is whether, when this behavior is not desired (e.g. when > the caller of os.fspath() just wants a string that it can pass to open()), > the condition of passing that's neither a string not supports __fspath__ > still produces an understandable error. I'm not sure that that's the case. > E.g. open() accepts file descriptors in addition to paths, but I'm not sure > that accepting an integer is a good idea in most cases -- it either gives a > mystery "Bad file descriptor" error or starts reading/writing some random > system file, which it then closes once the stream is closed. > The FD issue of magically passing through an int was also a concern when Ethan brought this up in an issue on the tracker. My argument is that FDs are not file paths and so shouldn't magically pass through if we're going to type-check anything or claim os.fspath() only works with paths (FDs are already open file objects). So in my view either we go ahead and type-check the return value of __fspath__() and thus restrict everything coming out of os.fspath() to Union[str, bytes] or we don't type check anything and be consistent that os.fspath() simply does is call __fspath__() if present. And just because I'm thinking about it, I would special-case the FDs, not os.PathLike (clearer why you care and faster as it skips the override of __subclasshook__): # Can be a single-line ternary operator if preferred. if not isinstance(filename, int): filename = os.fspath(filename) > On Wed, Jun 15, 2016 at 9:12 AM, Ethan Furman wrote: > >> I would like to make a change to os.fspath(). >> >> Specifically, os.fspath() currently raises an exception if something >> besides str, bytes, or os.PathLike is passed in, but makes no checks >> if an os.PathLike object returns something besides a str or bytes. >> >> I would like to change that to the opposite: if a non-os.PathLike is >> passed in, return it unchanged (so no change for str and bytes); if >> an os.PathLike object returns something that is not a str nor bytes, >> raise. >> >> An example of the difference in the lzma file: >> >> Current code (has not been upgraded to use os.fspath() yet) >> --- >> >> if isinstance(filename, (str, bytes)): >> if "b" not in mode: >> mode += "b" >> self._fp = builtins.open(filename, mode) >> self._closefp = True >> self._mode = mode_code >> elif hasattr(filename, "read") or hasattr(filename, "write"): >> self._fp = filename >> self._mode = mode_code >> else: >> raise TypeError( >> "filename must be a str or bytes object, or a file" >> ) >> >> Code change if using upgraded os.fspath() (placed before above stanza): >> >> filename = os.fspath(filename) >> >> Code change with current os.fspath() (ditto): >> >> if isinstance(filename, os.PathLike): >> filename = os.fspath(filename) >> >> My intention with the os.fspath() function was to minimize boiler-plate >> code and make PathLike objects easy and painless to support; having to >> discover if any given parameter is PathLike before calling os.fspath() >> on it is, IMHO, just the opposite. >> >> There is also precedent for having a __dunder__ check the return type: >> >> --> class Huh: >> ... def __int__(self): >> ... return 'string' >> ... def __index__(self): >> ... return b'bytestring' >> ... def __bool__(self): >> ... return 'true-ish' >> ... >> --> h = Huh() >> >> --> int(h) >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: __int__ returned non-int (type str) >> >> --> ''[h] >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: __index__ returned non-int (type bytes) >> >> --> bool(h)
Re: [Python-Dev] proposed os.fspath() change
These are really two separate proposals. I'm okay with checking the return value of calling obj.__fspath__; that's an error in the object anyways, and it doesn't matter much whether we do this or not (though when approving the PEP I considered this and decided not to insert a check for this). But it doesn't affect your example, does it? I guess it's easier to raise now and change the API in the future to avoid raising in this case (if we find that raising is undesirable) than the other way around, so I'm +0 on this. The other proposal (passing anything that's not understood right through) is more interesting and your use case is somewhat compelling. Catching the exception coming out of os.fspath() would certainly be much messier. The question remaining is whether, when this behavior is not desired (e.g. when the caller of os.fspath() just wants a string that it can pass to open()), the condition of passing that's neither a string not supports __fspath__ still produces an understandable error. I'm not sure that that's the case. E.g. open() accepts file descriptors in addition to paths, but I'm not sure that accepting an integer is a good idea in most cases -- it either gives a mystery "Bad file descriptor" error or starts reading/writing some random system file, which it then closes once the stream is closed. On Wed, Jun 15, 2016 at 9:12 AM, Ethan Furmanwrote: > I would like to make a change to os.fspath(). > > Specifically, os.fspath() currently raises an exception if something > besides str, bytes, or os.PathLike is passed in, but makes no checks > if an os.PathLike object returns something besides a str or bytes. > > I would like to change that to the opposite: if a non-os.PathLike is > passed in, return it unchanged (so no change for str and bytes); if > an os.PathLike object returns something that is not a str nor bytes, > raise. > > An example of the difference in the lzma file: > > Current code (has not been upgraded to use os.fspath() yet) > --- > > if isinstance(filename, (str, bytes)): > if "b" not in mode: > mode += "b" > self._fp = builtins.open(filename, mode) > self._closefp = True > self._mode = mode_code > elif hasattr(filename, "read") or hasattr(filename, "write"): > self._fp = filename > self._mode = mode_code > else: > raise TypeError( > "filename must be a str or bytes object, or a file" > ) > > Code change if using upgraded os.fspath() (placed before above stanza): > > filename = os.fspath(filename) > > Code change with current os.fspath() (ditto): > > if isinstance(filename, os.PathLike): > filename = os.fspath(filename) > > My intention with the os.fspath() function was to minimize boiler-plate > code and make PathLike objects easy and painless to support; having to > discover if any given parameter is PathLike before calling os.fspath() > on it is, IMHO, just the opposite. > > There is also precedent for having a __dunder__ check the return type: > > --> class Huh: > ... def __int__(self): > ... return 'string' > ... def __index__(self): > ... return b'bytestring' > ... def __bool__(self): > ... return 'true-ish' > ... > --> h = Huh() > > --> int(h) > Traceback (most recent call last): > File "", line 1, in > TypeError: __int__ returned non-int (type str) > > --> ''[h] > Traceback (most recent call last): > File "", line 1, in > TypeError: __index__ returned non-int (type bytes) > > --> bool(h) > Traceback (most recent call last): > File "", line 1, in > TypeError: __bool__ should return bool, returned str > > Arguments in favor or against? > > -- > ~Ethan~ > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] proposed os.fspath() change
I would like to make a change to os.fspath(). Specifically, os.fspath() currently raises an exception if something besides str, bytes, or os.PathLike is passed in, but makes no checks if an os.PathLike object returns something besides a str or bytes. I would like to change that to the opposite: if a non-os.PathLike is passed in, return it unchanged (so no change for str and bytes); if an os.PathLike object returns something that is not a str nor bytes, raise. An example of the difference in the lzma file: Current code (has not been upgraded to use os.fspath() yet) --- if isinstance(filename, (str, bytes)): if "b" not in mode: mode += "b" self._fp = builtins.open(filename, mode) self._closefp = True self._mode = mode_code elif hasattr(filename, "read") or hasattr(filename, "write"): self._fp = filename self._mode = mode_code else: raise TypeError( "filename must be a str or bytes object, or a file" ) Code change if using upgraded os.fspath() (placed before above stanza): filename = os.fspath(filename) Code change with current os.fspath() (ditto): if isinstance(filename, os.PathLike): filename = os.fspath(filename) My intention with the os.fspath() function was to minimize boiler-plate code and make PathLike objects easy and painless to support; having to discover if any given parameter is PathLike before calling os.fspath() on it is, IMHO, just the opposite. There is also precedent for having a __dunder__ check the return type: --> class Huh: ... def __int__(self): ... return 'string' ... def __index__(self): ... return b'bytestring' ... def __bool__(self): ... return 'true-ish' ... --> h = Huh() --> int(h) Traceback (most recent call last): File "", line 1, in TypeError: __int__ returned non-int (type str) --> ''[h] Traceback (most recent call last): File "", line 1, in TypeError: __index__ returned non-int (type bytes) --> bool(h) Traceback (most recent call last): File "", line 1, in TypeError: __bool__ should return bool, returned str Arguments in favor or against? -- ~Ethan~ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com