Re: [Python-Dev] ctypes compatibility with 2.3

2016-05-11 Thread Thomas Heller

On 10.05.2016 at 19:39, Brett Cannon wrote:

On Tue, 10 May 2016 at 01:18 Martin Panter wrote:

I am working on , to fix shell
injection problems with ctypes.util.find_library(). The proposal for
Python 3 is to change os.popen(shell-script) calls to use
subprocess.Popen().
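
To illustrate why that change helps, here is a minimal sketch, not the actual ctypes.util code. It assumes a hypothetical helper, echo_argument(), that stands in for the external tool find_library() would run; the point is only that an argument list handed to subprocess.Popen() is never parsed by a shell, so shell metacharacters in a library name stay inert.

```python
import subprocess
import sys


def echo_argument(untrusted):
    """Run a child process with `untrusted` as a single argv entry."""
    # Hypothetical stand-in for the external tool that find_library() runs.
    # The command is a list, so no shell ever interprets `untrusted`.
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out.decode().strip()


# With os.popen("tool " + name), the text after ';' would run as a second
# shell command. Here it is passed through as plain data.
hostile = "m; echo INJECTED"
result = echo_argument(hostile)
print(result)
```

The child process receives the hostile string unchanged as one argument, which is the behaviour the proposed subprocess.Popen() conversion relies on.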

However the Python 2.7 version of the module has a comment which says
“This file should be kept compatible with Python 2.3, see PEP 291.”
Looking at , it is not
clear why we have to maintain this compatibility. My best guess is
that there may be an external ctypes package that people want(ed) to
keep compatible with 2.3, and also keep synchronized with 2.7.


That's correct and the maintainer is/was Thomas Heller who I have cc'ed
to see if he's okay with lifting the restriction.


For me it is totally ok to lift this restriction.

Thomas

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] ctypes compatibility with 2.3

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 04:35 Thomas Heller  wrote:

> On 10.05.2016 at 19:39, Brett Cannon wrote:
> >
> > On Tue, 10 May 2016 at 01:18 Martin Panter wrote:
> >
> > I am working on , to fix shell
> > injection problems with ctypes.util.find_library(). The proposal for
> > Python 3 is to change os.popen(shell-script) calls to use
> > subprocess.Popen().
> >
> > However the Python 2.7 version of the module has a comment which says
> > “This file should be kept compatible with Python 2.3, see PEP 291.”
> > Looking at , it is not
> > clear why we have to maintain this compatibility. My best guess is
> > that there may be an external ctypes package that people want(ed) to
> > keep compatible with 2.3, and also keep synchronized with 2.7.
> >
> >
> > That's correct and the maintainer is/was Thomas Heller who I have cc'ed
> > to see if he's okay with lifting the restriction.
>
> For me it is totally ok to lift this restriction.
>

Great! I'll also update PEP 291.


Re: [Python-Dev] ctypes compatibility with 2.3

2016-05-11 Thread Thomas Heller

On 11.05.2016 at 18:04, Brett Cannon wrote:

On Wed, 11 May 2016 at 04:35 Thomas Heller wrote:

For me it is totally ok to lift this restriction.


Great! I'll also update PEP 291.


Cool.  While you're at it, the compatibility restriction for 
modulefinder could also be lifted.



Re: [Python-Dev] ctypes compatibility with 2.3

2016-05-11 Thread Meador Inge
On Wed, May 11, 2016 at 11:07 AM, Thomas Heller  wrote:

> Cool.  While you're at it, the compatibility restriction for modulefinder
> could also be lifted.


+1

The question of modulefinder actually came up recently*:

http://bugs.python.org/issue26881

-- Meador

* Posting here for reference.  Thomas already knows this as he is on the
issue26881 watch list :-)


Re: [Python-Dev] ctypes compatibility with 2.3

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 09:07 Thomas Heller  wrote:

> On 11.05.2016 at 18:04, Brett Cannon wrote:
> >
> > On Wed, 11 May 2016 at 04:35 Thomas Heller wrote:
> >
> > For me it is totally ok to lift this restriction.
> >
> >
> > Great! I'll also update PEP 291.
>
> Cool.  While you're at it, the compatibility restriction for
> modulefinder could also be lifted.
>

Will do.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-05-11 Thread Brett Cannon
Is there anything holding up PEP 515 at this point in terms of acceptance
or implementation?

On Sat, 19 Mar 2016 at 11:56 Guido van Rossum  wrote:

> All that sounds fine!
>
> On Sat, Mar 19, 2016 at 11:28 AM, Stefan Krah  wrote:
> > Guido van Rossum writes:
> >> So should the preprocessing step just be s.replace('_', ''), or should
> >> it reject underscores that don't follow the rules from the PEP
> >> (perhaps augmented so they follow the spirit of the PEP and the letter
> >> of the IBM spec)?
> >>
> >> Honestly I think it's also fine if specifying this exactly is left out
> >> of the PEP, and handled by whoever adds this to Decimal. Having a PEP
> >> to work from for the language spec and core builtins (int(), float(),
> >> complex()) is more important.
> >
> > I'd keep it simple for Decimal: Remove left and right whitespace (we're
> > already doing this), then remove underscores from the remaining string
> > (which must not contain any further whitespace), then use the IBM
> > grammar.
> >
> >
> > We could add a clause to the PEP that only those strings that follow
> > the spirit of the PEP are guaranteed to be accepted in the future.
> >
> >
> > One reason for keeping it simple is that I would not like to slow down
> > string conversion, but thinking about two grammars is also a problem --
> > part of the string conversion in libmpdec is modeled in ACL2, which
> > would be invalidated or at least complicated with two grammars.
> >
> >
> >
> > Stefan Krah
>
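
For reference, the simple Decimal preprocessing Stefan describes in the quoted text above could be sketched roughly as follows. This is only an illustration under stated assumptions: decimal_from_underscored() is a hypothetical helper name, and the real parsing in libmpdec happens in C against the IBM grammar, not in Python.

```python
from decimal import Decimal


def decimal_from_underscored(text):
    """Strip outer whitespace, drop underscores, then parse as usual."""
    # Hypothetical helper; it accepts any underscore placement, which is
    # the "keep it simple" option rather than the stricter PEP 515 rules.
    stripped = text.strip()
    if any(ch.isspace() for ch in stripped):
        raise ValueError("inner whitespace is not allowed: %r" % text)
    # Decimal() applies the normal numeric-string grammar to what is left.
    return Decimal(stripped.replace("_", ""))


value = decimal_from_underscored("  1_000_000.25 ")
print(value)
```

Because every underscore is removed before parsing, only one grammar is involved, which matches the concern about not needing a second grammar for string conversion.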


[Python-Dev] file system path protocol PEP

2016-05-11 Thread Brett Cannon
**deep, calming breath**

Here is the PEP for __fspath__(). The draft lives at
https://github.com/brettcannon/path-pep so feel free to send me PRs for
spelling mistakes, grammatical errors, etc.

-

PEP: NNN
Title: Adding a file system path protocol
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-May-2016
Post-History: 11-May-2016


Abstract
========

This PEP proposes a protocol for classes which represent a file system
path to be able to provide a ``str`` or ``bytes`` representation.
Changes to Python's standard library are also proposed to utilize this
protocol where appropriate to facilitate the use of path objects where
historically only ``str`` and/or ``bytes`` file system paths are
accepted. The goal is to allow users to use the representation of a
file system path that's easiest for them now as they migrate towards
using path objects in the future.


Rationale
=========

Historically in Python, file system paths have been represented as
strings or bytes. This choice of representation has stemmed from C's
own decision to represent file system paths as
``const char *`` [#libc-open]_. While that is a totally serviceable
format to use for file system paths, it's not necessarily optimal. At
issue is the fact that while all file system paths can be represented
as strings or bytes, not all strings or bytes represent a file system
path. This can lead to issues where, e.g., any string duck-types to a
file system path whether it actually represents a path or not.

To help elevate the representation of file system paths from their
representation as strings and bytes to a more appropriate object
representation, the pathlib module [#pathlib]_ was provisionally
introduced in Python 3.4 through PEP 428. While considered by some as
an improvement over strings and bytes for file system paths, it has
suffered from a lack of adoption. Typically the key issue listed
for the low adoption rate has been the lack of support in the standard
library. This lack of support required users of pathlib to manually
convert path objects to strings by calling ``str(path)`` which many
found error-prone.

One issue in converting path objects to strings comes from
the fact that the only generic way to get a string representation of the
path was to pass the object to ``str()``. This can pose a
problem when done blindly as nearly all Python objects have some
string representation whether they are a path or not, e.g.
``str(None)`` will give a result that
``builtins.open()`` [#builtins-open]_ will happily use to create a new
file.
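
A minimal sketch of the problem described above, and of how a dedicated protocol avoids it. It assumes Python 3.6 or later, where the ``os.fspath()`` function shipped; nothing here creates a file::

```python
import os

# Every object has some str() form, so str() cannot tell paths from
# non-paths: this silently produces a usable-looking file name.
as_text = str(None)

# os.fspath() only accepts str, bytes, or objects with __fspath__(),
# so the same mistake fails loudly instead.
try:
    os.fspath(None)
    rejected = False
except TypeError:
    rejected = True

print(as_text, rejected)
```

Passing ``as_text`` to ``open(..., 'w')`` would create a file literally named ``None``, which is the accident the protocol is meant to prevent.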

Exacerbating this whole situation is the
``DirEntry`` object [#os-direntry]_. While path objects have a
representation that can be extracted using ``str()``, ``DirEntry``
objects expose a ``path`` attribute instead. Having no common
interface between path objects, ``DirEntry``, and any other
third-party path library had become an issue. A solution that allowed
any path-representing object to declare that it was a path and a way
to extract a low-level representation that all path objects could
support was desired.

This PEP then proposes to introduce a new protocol to be followed by
objects which represent file system paths. Providing a protocol allows
for clear signalling of what objects represent file system paths as
well as a way to extract a lower-level representation that can be used
with older APIs which only support strings or bytes.

Discussions regarding path objects that led to this PEP can be found
in multiple threads on the python-ideas mailing list archive
[#python-ideas-archive]_ for the months of March and April 2016 and on
the python-dev mailing list archives [#python-dev-archive]_ during
April 2016.


Proposal
========

This proposal is split into two parts. One part is the proposal of a
protocol for objects to declare and provide support for exposing a
file system path representation. The other part is changes to Python's
standard library to support the new protocol. These changes will also
have the pathlib module drop its provisional status.


Protocol
--------

The following abstract base class defines the protocol for an object
to be considered a path object::

    import abc
    import typing as t


    class PathLike(abc.ABC):

        """Abstract base class for implementing the file system path
        protocol."""

        @abc.abstractmethod
        def __fspath__(self) -> t.Union[str, bytes]:
            """Return the file system path representation of the object."""
            raise NotImplementedError


Objects representing file system paths will implement the
``__fspath__()`` method which will return the ``str`` or ``bytes``
representation of the path. The ``str`` representation is the
preferred low-level path representation as it is human-readable and
what people historically represent paths as.
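
As an illustration of the protocol from the implementer's side, here is a minimal sketch. ``ConfigFile`` is a hypothetical class invented for this example, and the code assumes Python 3.6 or later, where ``os.fspath()`` and ``os.PathLike`` are available::

```python
import os


class ConfigFile:
    """Hypothetical path-like object; not part of the PEP or the stdlib."""

    def __init__(self, directory, name):
        self._directory = directory
        self._name = name

    def __fspath__(self):
        # Return the preferred str representation of the path.
        return os.path.join(self._directory, self._name)


cfg = ConfigFile("etc", "app.ini")
as_text = os.fspath(cfg)
print(as_text)
```

The class does not need to inherit from the ABC; defining ``__fspath__()`` is enough for ``isinstance(cfg, os.PathLike)`` to be true.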


Standard library changes
------------------------

It is expected that most APIs in Python's standard librar

Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-05-11 Thread Guido van Rossum
If the authors are happy I'll accept it right away.

(I vaguely recall there's another PEP that's ready for pronouncement -- but
which one?)

On Wed, May 11, 2016 at 9:34 AM, Brett Cannon  wrote:

> Is there anything holding up PEP 515 at this point in terms of acceptance
> or implementation?
>


-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 09:47 Guido van Rossum  wrote:

> If the authors are happy I'll accept it right away.
>
> (I vaguely recall there's another PEP that's ready for pronouncement --
> but which one?)
>

PEP 509 is the only one I can think of.

-Brett




Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Koos Zevenhoven
**another deep, calming breath**

On Wed, May 11, 2016 at 7:43 PM, Brett Cannon  wrote:
> Open Issues
> ===========
>
> Should os.fspath() return bytes?
> --------------------------------
>

In most cases, it of course should not. The section (or the title) does
not represent my view on the topic, but bytes paths are not going away
any time soon, so this requires consideration.

Below is a copy-paste of my comment in the discussion on my pull
request ( https://github.com/brettcannon/path-pep/pull/2 ). It is
about whether os.fspath should look like

def fspath(path, *, type_constraint=str): ...

This was already discussed before, receiving positive reactions. It
would be an extension to what is in the draft Brett just posted. Note
that the return type, however, does not depend on `type_constraint`.
It fully depends on `path`. The constraint only determines when an
exception is raised instead of returning a value.  When using
str-based paths normally, one would do

os.fspath(path)

Calling it with

os.fspath(path, type_constraint=(str, bytes))

would turn off rejection of bytes paths (and be consistent with the
rest of `os` and `os.path` where functions accept both types). But the
default value for the keyword argument would be a good reminder that
str-based paths should be used by default. This str-constraint is also
present in the current drafted version, but it is not optional or
visible in the signature.

(See the diffs of my pull request for an example implementation of this.)
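
For readers who do not want to open the diff, the semantics described above could be sketched roughly like this. It is a standalone illustration, not the code from the pull request, and it is not the os.fspath() that exists in the stdlib, which takes no such keyword argument.

```python
def fspath(path, *, type_constraint=str):
    """Return the low-level form of `path`, or raise TypeError.

    The return type depends only on `path`; `type_constraint` only
    decides which results are rejected.
    """
    # Objects implementing the protocol are unwrapped first.
    if hasattr(path, "__fspath__"):
        path = path.__fspath__()
    if not isinstance(path, type_constraint):
        raise TypeError("expected %r, got %s"
                        % (type_constraint, type(path).__name__))
    return path


default_result = fspath("spam.txt")
relaxed_result = fspath(b"spam.txt", type_constraint=(str, bytes))
print(default_result, relaxed_result)
```

With the default, fspath(b"spam.txt") raises TypeError, while the relaxed call returns the bytes object unchanged.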

So below is the copy-paste from the pull request discussion that I promised:

"""
Nobody wants to promote using `bytes` for paths without a proper
reason. However, `os.fspath(path, type_constraint=str)` is already a
compromise from the current `os.*` convention (of symmetrically
supporting both `str` and `bytes`) towards enforcing `str`-based
paths.

As you know, the reason for using `os.fspath` is to switch to a
lower-level representation of paths as strings. When you use it, you
are already deciding that you are lower-level and want to examine or
manipulate path strings manually. I think the `type_constraint=str`
keyword-only argument (with the default) is a good way to remind the
'user' that `str` is the way to go unless you know what you are doing
and, to `bytes` users, that `os.fspath` (by default) rejects bytes.

Looking at the discussions on python-dev, one notices that the active
people in the discussions were mostly in favor of exposing the
bytes-supporting version in one way or the other. Some were even
against bytes-rejecting versions.

Most people (Python programmers), especially in the long term, should
not be using `os.fspath` a lot. If they want to print a path, they
should simply do so: `print(path_obj)`, or `print(f"The path is:
{path_obj}")`. However, there will always be people that for whatever
reason want to convert a pathlib object into a string, and without
even consulting the docs, they may write `str(path_obj)`. And that's
not the end of the world. It's not any worse than `str(integer)` for
instance. After all, if the main point is to convert something into a
generic string, the function should probably be called `str`, not
`fspath`. But if the idea is to convert any path object into something
that `os` and similar code internally understand, then `os.fspath`
sounds fine to me. So I would not go as far as 'never use
`str(path)`'. What we don't want is things like `open(str(path),
'w')`. I know you agree, because you wrote that yourself. But once
this protocol is available, I see no reason why people would
voluntarily wrap their path objects in `str(...)`, so I don't see that
as a problem in the future (thanks to this PEP). So, especially in the
long term, I don't expect `fspath` to be an everyday tool for people.
Even in shorter term, people don't usually need it, because the stdlib
will already support pathlib objects.

Since it is not an everyday tool, I think it's ok to have that extra
keyword-only argument. Even if it were an everyday tool, I can see the
keyword-only argument in the signature as a useful thing in a
documentation sense.

[As a side note, I could imagine someone convincing me that `fspath`
should *always* accept both `str`- and `bytes`-based paths, because
that would simplify it and align it with the rest of `os`. This would
not be any worse than the status quo in terms of early failures. That
would, however, still not detect accidental use of bytes paths (when
one forgets to decode them from the encoding they were received in).]
"""

-- Koos


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-05-11 Thread Guido van Rossum
On Wed, May 11, 2016 at 10:28 AM, Brett Cannon  wrote:

>
>
> On Wed, 11 May 2016 at 09:47 Guido van Rossum  wrote:
>
>> If the authors are happy I'll accept it right away.
>>
>> (I vaguely recall there's another PEP that's ready for pronouncement --
>> but which one?)
>>
>
> PEP 509 is the only one I can think of.
>

That's in limbo pending conclusive proof (through benchmarks) that at least
one of Yury's patches that depends on it makes a big enough difference.

Which IIUC itself is in limbo pending the wordcode changes (we're doing
that right?).

-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] Slow downloads from python.org

2016-05-11 Thread Dima Tisnek
Sorry, this is probably the wrong place to ask, but is it only me?
I can't get more than 40KB/s downloading from python.org


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 10:49 Guido van Rossum  wrote:

> On Wed, May 11, 2016 at 10:28 AM, Brett Cannon  wrote:
>
>>
>>
>> On Wed, 11 May 2016 at 09:47 Guido van Rossum  wrote:
>>
>>> If the authors are happy I'll accept it right away.
>>>
>>> (I vaguely recall there's another PEP that's ready for pronouncement --
>>> but which one?)
>>>
>>
>> PEP 509 is the only one I can think of.
>>
>
> That's in limbo pending conclusive proof (through benchmarks) that at
> least one of Yury's patches that depends on it makes a big enough
> difference.
>
> Which IIUC itself is in limbo pending the wordcode changes (we're doing
> that right?).
>

Yes. Last I checked the author of the patch was waiting on a review from
Serhiy.

-Brett




Re: [Python-Dev] Slow downloads from python.org

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 10:56 Dima Tisnek  wrote:

> Sorry, this is probably the wrong place to ask, but is it only me?
> I can't get more than 40KB/s downloading from python.org


It's just you or the problem has passed; just downloaded much faster than
40KB/s.


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Brett Cannon
A quick comment about sending me fixes. While I do appreciate them, sending
them as a pull request is much easier for me as (a) I don't have to hunt
the changes down in the text, and (b) you will see the fixes others have
done already to the PEP and I then don't have to figure out what changes
have not already been fixed. And honestly, reading the PEP in its rendered
format on GitHub is easier IMO than in the text format unless you have
something specific to respond to (and even if you do, you can copy and
paste the relevant bits into an email reply).


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Serhiy Storchaka

On 11.05.16 19:43, Brett Cannon wrote:

os.path
'''''''

The various path-manipulation functions of ``os.path`` [#os-path]_
will be updated to accept path objects. For polymorphic functions that
accept both bytes and strings, they will be updated to simply use
code very much similar to
``path.__fspath__() if hasattr(path, '__fspath__') else path``. This
will allow for their pre-existing type-checking code to continue to
function.


I'm afraid that this will hurt performance. Some os.path functions are
used in tight loops and are heavily optimized, and adding support for the
path protocol could have a visible negative effect.


I suggest implementing the other changes first and then looking at whether
it is worth adding path protocol support to the os.path functions.
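
One way to put a number on that concern would be a micro-benchmark along these lines. This is only a sketch: plain() and with_protocol() are hypothetical stand-ins rather than real os.path functions, and the timings depend entirely on the machine and interpreter, so no result is claimed here.

```python
import timeit


def plain(path):
    """Stand-in for an os.path function without protocol support."""
    return path


def with_protocol(path):
    """Same stand-in, with the check quoted from the PEP added."""
    return path.__fspath__() if hasattr(path, '__fspath__') else path


# Time both variants on an ordinary str path, the common case that
# would pay for the extra hasattr() lookup.
baseline = timeit.timeit(lambda: plain("/tmp/spam"), number=100000)
checked = timeit.timeit(lambda: with_protocol("/tmp/spam"), number=100000)
print("baseline: %.4fs, with check: %.4fs" % (baseline, checked))
```

Comparing the two figures for the functions that actually sit in tight loops would show whether the overhead is visible in practice.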





Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Ethan Furman

On 05/11/2016 01:44 PM, Serhiy Storchaka wrote:


>> os.path
>> '''
>>
>> The various path-manipulation functions of ``os.path`` [#os-path]_
>> will be updated to accept path objects. For polymorphic functions that
>> accept both bytes and strings, they will be updated to simply use
>> code very much similar to
>> ``path.__fspath__() if hasattr(path, '__fspath__') else path``. This
>> will allow for their pre-existing type-checking code to continue to
>> function.
>
> I am afraid that this will hurt performance. Some os.path functions are
> used in tight loops and are heavily optimized, so adding support for the
> path protocol could have a visible negative effect.


Do you have an example of os.path functions being used in a tight loop?

--
~Ethan~



Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Koos Zevenhoven
On Wed, May 11, 2016 at 11:04 PM, Brett Cannon  wrote:
> A quick comment about sending me fixes. While I do appreciate them, sending
> them as a pull request is much easier for me as (a) I don't have to hunt the
> changes down in the text, and (b) you will see the fixes others have done
> already to the PEP and I then don't have to figure out what changes have not
> already been fixed. And honestly, reading the PEP in its rendered format on
> GitHub is easier IMO than in the text format unless you have something
> specific to respond to (and even if you do, you can copy and paste the
> relevant bits into an email reply).
>

Personally, I find it more important to settle the open issues first,
so I will be looking for typos at a later stage. Besides, more typos
may be added in the process. I previously left some typos uncorrected
in an attempt to keep the diffs of my commits cleaner. By the way, to see
what additions and wordings I have suggested, the right way is to look
at my PR(s).  I might send more PRs to clarify open issues etc., but
this will largely be based on my previous commits.

-- Koos


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Ethan Furman

On 05/11/2016 01:51 PM, Ethan Furman wrote:

> On 05/11/2016 01:44 PM, Serhiy Storchaka wrote:
>
>>> os.path
>>> '''
>>>
>>> The various path-manipulation functions of ``os.path`` [#os-path]_
>>> will be updated to accept path objects. For polymorphic functions that
>>> accept both bytes and strings, they will be updated to simply use
>>> code very much similar to
>>> ``path.__fspath__() if hasattr(path, '__fspath__') else path``. This
>>> will allow for their pre-existing type-checking code to continue to
>>> function.
>>
>> I am afraid that this will hurt performance. Some os.path functions are
>> used in tight loops and are heavily optimized, so adding support for the
>> path protocol could have a visible negative effect.
>
> Do you have an example of os.path functions being used in a tight loop?


Also, the C code for fspath can check types first and take the fast path 
if bytes/str are passed in, only falling back to the __fspath__ protocol 
if something else was passed in -- which should make any performance 
hits negligible.
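
The ordering described above (type check first, protocol lookup only as a fallback) can be sketched in Python. This is a rough illustration under stated assumptions, not the C implementation being discussed: ``fspath_fast`` and ``DemoPath`` are hypothetical names, and the error message wording is invented.

```python
def fspath_fast(path):
    """Sketch: check the type first, fall back to __fspath__ only if needed."""
    # Fast path: str/bytes are returned without any attribute lookup.
    if isinstance(path, (str, bytes)):
        return path
    # Slow path: look the method up on the type, as is usual for special methods.
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError('expected str, bytes or an object with __fspath__, '
                        'not ' + type(path).__name__) from None
    return method(path)


class DemoPath:
    """Made-up path-like class for the example."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw
```

With this ordering the common case of a str or bytes argument costs a single isinstance() call, and objects that support neither get an explicit TypeError instead of failing later.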


--
~Ethan~



Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 12:15 AM, Ethan Furman  wrote:
> On 05/11/2016 01:51 PM, Ethan Furman wrote:
>>
>> On 05/11/2016 01:44 PM, Serhiy Storchaka wrote:
>
>
>>>> os.path
>>>> '''
>>>>
>>>> The various path-manipulation functions of ``os.path`` [#os-path]_
>>>> will be updated to accept path objects. For polymorphic functions that
>>>> accept both bytes and strings, they will be updated to simply use
>>>> code very much similar to
>>>> ``path.__fspath__() if hasattr(path, '__fspath__') else path``. This
>>>> will allow for their pre-existing type-checking code to continue to
>>>> function.
>>>
>>>
>>> I am afraid that this will hurt performance. Some os.path functions are
>>> used in tight loops and are heavily optimized, so adding support for the
>>> path protocol could have a visible negative effect.
>>
>>
>> Do you have an example of os.path functions being used in a tight loop?
>

I'd be interested in this too.

>
> Also, the C code for fspath can check types first and take the fast path if
> bytes/str are passed in, only falling back to the __fspath__ protocol if
> something else was passed in -- which should make any performance hits
> negligible.

My suggestion for the *Python version* already does this too (again,
see my PR). This is a win-win, as it also improves error messages (as
suggested by Nick in the earlier discussions).

-- Koos

>
>
> --
> ~Ethan~
>


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Nikolaus Rath
On May 11 2016, Brett Cannon  wrote:
> This PEP proposes a protocol for classes which represent a file system
> path to be able to provide a ``str`` or ``bytes`` representation.
[...]

As I said before, to me this seems like a lot of effort for a very
specific use-case. So let me put forward two hypothetical scenarios to
better understand your position:

- A new module for URL handling is added to the standard library (or
  urllib is suitably extended). There is a proposal to add a new
  protocol that allows classes to provide a ``str`` or ``bytes``
  representation of URLs.

- A new (third-party) library for natural language processing arises
  that exposes a specific class for representing audio data. Existing
  language processing code just uses bytes objects. To ease transition
  and interoperability, it is proposed to add a new protocol for classes
  that represent audio data to provide a bytes representation.

Do you think you would be in favor of adding these protocols to
the stdlib/language reference as well? If not, what's the crucial
difference to file system paths?


Thanks,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 12:28 AM, Nikolaus Rath  wrote:
> On May 11 2016, Brett Cannon  wrote:
>> This PEP proposes a protocol for classes which represent a file system
>> path to be able to provide a ``str`` or ``bytes`` representation.
> [...]
>
> As I said before, to me this seems like a lot of effort for a very
> specific use-case. So let me put forward two hypothetical scenarios to
> better understand your position:

I think you are touching important points.

> - A new module for URL handling is added to the standard library (or
>   urllib is suitably extended). There is a proposal to add a new
>   protocol that allows classes to provide a ``str`` or ``bytes``
>   representation of URLs.

This reminds me of the thread I recently started on python-ideas about
extending the concept of paths to URLs. I don't know if you are
referring to something like that or not.

Anyway, it would be important to know whether the str or bytes
representation is to be used as a file system path or a URL, so that
would need to be a separate protocol. But everyone would have the
experience from these discussions, so hopefully less discussion then
:). (By the way, this is one reason to have 'fs' in the name of the
__fspath__ method, although I have not mentioned it before.)

>
> - A new (third-party) library for natural language processing arises
>   that exposes a specific class for representing audio data. Existing
>   language processing code just uses bytes objects. To ease transition
>   and interoperability, it is proposed to add a new protocol for classes
>   that represent audio data to provide a bytes representation.
>
> Do you think you would be in favor of adding these protocols to
> the stdlib/language reference as well? If not, what's the crucial
> difference to file system paths?
>

File system paths are very fundamental, and will probably be used in
the context of your natural language example too.

-- Koos

>
> Thanks,
> -Nikolaus


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Ethan Furman

On 05/11/2016 02:28 PM, Nikolaus Rath wrote:

> On May 11 2016, Brett Cannon wrote:
>
>> This PEP proposes a protocol for classes which represent a file system
>> path to be able to provide a ``str`` or ``bytes`` representation.
>
> [...]
>
> As I said before, to me this seems like a lot of effort for a very
> specific use-case. So let me put forward two hypothetical scenarios to
> better understand your position:
>
> - A new module for URL handling is added to the standard library (or
>   urllib is suitably extended). There is a proposal to add a new
>   protocol that allows classes to provide a ``str`` or ``bytes``
>   representation of URLs.
>
> - A new (third-party) library for natural language processing arises
>   that exposes a specific class for representing audio data. Existing
>   language processing code just uses bytes objects. To ease transition
>   and interoperability, it is proposed to add a new protocol for classes
>   that represent audio data to provide a bytes representation.
>
> Do you think you would be in favor of adding these protocols to
> the stdlib/language reference as well? If not, what's the crucial
> difference to file system paths?


I think a crucial reason for this work is to unify the stdlib: we 
currently have four (?) different things that can be or represent a 
file-system path:


- str
- bytes
- DirEntry
- Path

Half of those objects don't work well with the rest of the standard library.

As for your second example, the protocol already exists: it's called 
__bytes__.  ;)
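
As a small illustration of that existing protocol, the sketch below shows a class exposing its data through __bytes__. ``AudioClip`` is a hypothetical class invented for this example and does not come from any real library.

```python
class AudioClip:
    """Made-up container for raw audio samples (illustration only)."""

    def __init__(self, samples):
        # Store the samples as an immutable bytes object.
        self._samples = bytes(samples)

    def __bytes__(self):
        # bytes(clip) calls this method, so legacy bytes-only code can
        # accept the object after a single explicit conversion.
        return self._samples


clip = AudioClip([0, 127, 255])
legacy_input = bytes(clip)
```

Older code that only understands bytes can then be fed ``bytes(clip)`` without knowing anything about the class itself.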


--
~Ethan~


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 14:29 Nikolaus Rath  wrote:

> On May 11 2016, Brett Cannon  wrote:
> > This PEP proposes a protocol for classes which represent a file system
> > path to be able to provide a ``str`` or ``bytes`` representation.
> [...]
>
> As I said before, to me this seems like a lot of effort for a very
> specific use-case. So let me put forward two hypothetical scenarios to
> better understand your position:
>
> - A new module for URL handling is added to the standard library (or
>   urllib is suitably extended). There is a proposal to add a new
>   protocol that allows classes to provide a ``str`` or ``bytes``
>   representation of URLs.
>
> - A new (third-party) library for natural language processing arises
>   that exposes a specific class for representing audio data. Existing
>   language processing code just uses bytes objects. To ease transition
>   and interoperability, it is proposed to add a new protocol for classes
>   that represent audio data to provide a bytes representation.
>
> Do you think you would be in favor of adding these protocols to
> the stdlib/language reference as well?


Maybe for URLs, not for audio data (at least not in the stdlib; community
can do what they want).


> If not, what's the crucial
> difference to file system paths?
>

Nearly everyone uses file system paths on a regular basis; URLs are used
less, but still by a good number of people. Very few people work with
audio data.


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 13:45 Serhiy Storchaka  wrote:

> On 11.05.16 19:43, Brett Cannon wrote:
> > os.path
> > '''
> >
> > The various path-manipulation functions of ``os.path`` [#os-path]_
> > will be updated to accept path objects. For polymorphic functions that
> > accept both bytes and strings, they will be updated to simply use
> > code very much similar to
> > ``path.__fspath__() if  hasattr(path, '__fspath__') else path``. This
> > will allow for their pre-existing type-checking code to continue to
> > function.
>
> I am afraid that this will hurt performance. Some os.path functions are
> used in tight loops and are heavily optimized, so adding support for the
> path protocol could have a visible negative effect.
>

As others have asked, what specific examples do you have that os.path is
used in a tight loop w/o any I/O that would overwhelm the performance?


>
> I suggest implementing the other changes first and then looking at
> whether it is worth adding path protocol support to the os.path functions.
>

I see this whole discussion breaking down into a few groups which changes
what gets done upfront and what might be done farther down the line:

   1. Maximum acceptance: do whatever we can to make all representations of
   paths just work, which means making all places working with a path in the
   stdlib accept path objects, str, and bytes.
   2. Safely use path objects: __fspath__() is there to signal an object is
   a file system path and to get back a lower-level representation so people
   stop calling str() on everything, providing some interface signaling that
   someone doesn't misuse an object as a path and only changing path
   consumption APIs -- e.g. open() -- and not path manipulation APIs -- e.g.
   os.path -- in the stdlib.
   3. It ain't worth it: those that would rather just skip all of this and
   drop pathlib from the stdlib.

Ethan and Koos are in group #1 and I'm personally in group #2 but I tried
to compromise somewhat and find a middle ground in the PEP with the level
of changes in the stdlib but being more restrictive with os.fspath(). If I
were doing a pure group #2 PEP I would drop os.path changes and make
os.fspath() do what Ethan and Koos have suggested and simply pass through
without checks whatever path.__fspath__() returned if the argument wasn't
str or bytes.


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Ethan Furman

On 05/11/2016 03:13 PM, Brett Cannon wrote:


> If [...] I would drop os.path changes and make os.fspath() do what
> Ethan and Koos have suggested and simply pass through without checks
> whatever path.__fspath__() returned if the argument wasn't str or bytes.


Not to derail the conversation too much, as I know we're all getting 
burned out on the topic, but that last bit is not accurate: my druthers 
are to have __fspath__ be able to return str /or/ bytes, and if anything 
else comes from the object in question an exception must be raised. 
Maybe a word got lost between your thoughts and your fingers -- happens 
to me all the time.  :)
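
The preference stated above (str or bytes may come back from __fspath__, anything else raises) can be sketched as follows. This is an illustration only: ``checked_fspath`` and the three demo classes are hypothetical names, and the exception message is invented.

```python
def checked_fspath(obj):
    """Call obj.__fspath__() and insist on a str or bytes result."""
    result = obj.__fspath__()
    if not isinstance(result, (str, bytes)):
        raise TypeError('__fspath__() returned {}, expected str or bytes'
                        .format(type(result).__name__))
    return result


class StrPath:
    def __fspath__(self):
        return 'spam.txt'


class BytesPath:
    def __fspath__(self):
        return b'spam.txt'


class BrokenPath:
    def __fspath__(self):
        return 42  # neither str nor bytes, so checked_fspath() rejects it
```

The isinstance() check on the return value is the part that a pure pass-through version would omit.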


--
~Ethan~


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 1:13 AM, Brett Cannon  wrote:
>
>
> On Wed, 11 May 2016 at 13:45 Serhiy Storchaka  wrote:
>>
>> On 11.05.16 19:43, Brett Cannon wrote:
>> > os.path
>> > '''
>> >
>> > The various path-manipulation functions of ``os.path`` [#os-path]_
>> > will be updated to accept path objects. For polymorphic functions that
>> > accept both bytes and strings, they will be updated to simply use
>> > code very much similar to
>> > ``path.__fspath__() if  hasattr(path, '__fspath__') else path``. This
>> > will allow for their pre-existing type-checking code to continue to
>> > function.
>>
>> I am afraid that this will hurt performance. Some os.path functions are
>> used in tight loops and are heavily optimized, so adding support for the
>> path protocol could have a visible negative effect.
>
>
> As others have asked, what specific examples do you have that os.path is
> used in a tight loop w/o any I/O that would overwhelm the performance?
>
>>
>>
>> I suggest implementing the other changes first and then looking at
>> whether it is worth adding path protocol support to the os.path functions.
>
>
> I see this whole discussion breaking down into a few groups which changes
> what gets done upfront and what might be done farther down the line:
>
> 1. Maximum acceptance: do whatever we can to make all representations of paths
> just work, which means making all places working with a path in the stdlib
> accept path objects, str, and bytes.

Since you are putting me in this camp, there is at least one thing you
are wrong about. I don't want all places that work with a path to
accept bytes. Only those that already do so, including os/os.path. And
yes, I think the stdlib should show a good example in accepting path
types (especially those provided in the stdlib itself).

Whether Ethan is fully in camp 1, I don't know. Not that I think he
would be any closer to the other camps, though.

> 2. Safely use path objects: __fspath__() is there to signal an object is a
> file system path and to get back a lower-level representation so people stop
> calling str() on everything, providing some interface signaling that someone
> doesn't misuse an object as a path and only changing path consumption APIs
> -- e.g. open() -- and not path manipulation APIs -- e.g. os.path -- in the
> stdlib.
>
> 3. It ain't worth it: those that would rather just skip all of this and drop
> pathlib from the stdlib.
>
> Ethan and Koos are in group #1 and I'm personally in group #2 but I tried to
> compromise somewhat and find a middle ground in the PEP with the level of
> changes in the stdlib but being more restrictive with os.fspath(). If I were
> doing a pure group #2 PEP I would drop os.path changes and make os.fspath()
> do what Ethan and Koos have suggested and simply pass through without checks
> whatever path.__fspath__() returned if the argument wasn't str or bytes.
>

Related to this, based on the earlier discussions, I had the
impression that you were largely in the same camp as me. In fact, I
thought you had politely left some things out of the PEP draft so I
could fill them in. It turned out I was wrong about that, because you
didn't merge them.

-- Koos



Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 2:05 AM, Ethan Furman  wrote:
> On 05/11/2016 03:13 PM, Brett Cannon wrote:
>
>> If [...] I would drop os.path changes and make os.fspath() do what
>
>> Ethan and Koos have suggested and simply pass through without checks
>>
>> whatever path.__fspath__() returned if the argument wasn't str or bytes.
>
>
> Not to derail the conversation too much, as I know we're all getting burned
> out on the topic, but that last bit is not accurate: my druthers are to have
> __fspath__ be able to return str /or/ bytes, and if anything else comes from
> the object in question an exception must be raised. Maybe a word got lost
> between your thoughts and your fingers -- happens to me all the time.  :)

Yes. This would also be equivalent to my fspath(path,
type_constraint=(str,bytes)). And if the compromise I mentioned about
the rejecting (by default or optionally) is lifted, the keyword
argument would not be needed.  I might be ok with throwing away the
isinstance check on the return value of __fspath__() if it has
significant impact on performance in realistic cases (with DirEntry
most likely, I suppose), but I doubt it.
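
The actual signature lives in the PR mentioned above and is not reproduced in this thread, so the following is only a guess at the shape of such a function. ``fspath_with_constraint`` is a hypothetical name, and the default of accepting only str is an assumption made to mirror the "reject bytes by default" compromise.

```python
def fspath_with_constraint(path, type_constraint=str):
    """Sketch of an fspath() variant with a configurable accepted type.

    type_constraint may be a single type or a tuple of types, exactly as
    accepted by isinstance(). The default (str only) is an assumption.
    """
    if isinstance(path, type_constraint):
        return path
    try:
        result = type(path).__fspath__(path)
    except AttributeError:
        raise TypeError('unsupported path type: '
                        + type(path).__name__) from None
    # The isinstance check on the return value discussed in the message.
    if not isinstance(result, type_constraint):
        raise TypeError('__fspath__() returned ' + type(result).__name__)
    return result
```

Calling it with ``type_constraint=(str, bytes)`` gives the behaviour Ethan describes; if bytes were always accepted, the keyword argument would become unnecessary.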

-- Koos

>
> --
> ~Ethan~
>


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 2:49 AM, Koos Zevenhoven  wrote:
> On Thu, May 12, 2016 at 2:05 AM, Ethan Furman  wrote:
>> On 05/11/2016 03:13 PM, Brett Cannon wrote:
>>
>>> If [...] I would drop os.path changes and make os.fspath() do what
>>
>>> Ethan and Koos have suggested and simply pass through without checks
>>>
>>> whatever path.__fspath__() returned if the argument wasn't str or bytes.
>>
>>
>> Not to derail the conversation too much, as I know we're all getting burned
>> out on the topic, but that last bit is not accurate: my druthers are to have
>> __fspath__ be able to return str /or/ bytes, and if anything else comes from
>> the object in question an exception must be raised. Maybe a word got lost
>> between your thoughts and your fingers -- happens to me all the time.  :)
>
> Yes. This would also be equivalent to my fspath(path,
> type_constraint=(str,bytes)). And if the compromise I mentioned about
> the rejecting (by default or optionally) is lifted, the keyword

the rejecting -> rejecting bytes

Good night.

-- Koos


> argument would not be needed.  I might be ok with throwing away the
> isinstance check on the return value of __fspath__() if it has
> significant impact on performance in realistic cases (with DirEntry
> most likely, I suppose), but I doubt it.
>
> -- Koos
>
>>
>> --
>> ~Ethan~
>>


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 2:53 AM, Koos Zevenhoven  wrote:
> On Thu, May 12, 2016 at 2:49 AM, Koos Zevenhoven  wrote:
>> On Thu, May 12, 2016 at 2:05 AM, Ethan Furman  wrote:
>>> On 05/11/2016 03:13 PM, Brett Cannon wrote:
>>>
>>>> If [...] I would drop os.path changes and make os.fspath() do what
>>>
>>>> Ethan and Koos have suggested and simply pass through without checks
>>>>
>>>> whatever path.__fspath__() returned if the argument wasn't str or bytes.
>>>
>>>
>>> Not to derail the conversation too much, as I know we're all getting burned
>>> out on the topic, but that last bit is not accurate: my druthers are to have
>>> __fspath__ be able to return str /or/ bytes, and if anything else comes from
>>> the object in question an exception must be raised. Maybe a word got lost
>>> between your thoughts and your fingers -- happens to me all the time.  :)
>>
>> Yes. This would also be equivalent to my fspath(path,
>> type_constraint=(str,bytes)). And if the compromise I mentioned about
>> the rejecting (by default or optionally) is lifted, the keyword
>
> the rejecting -> rejecting bytes
>
>> argument would not be needed.  I might be ok with throwing away the
>> isinstance check on the return value of __fspath__() if it has
>> significant impact on performance in realistic cases (with DirEntry
>> most likely, I suppose), but I doubt it.
>>
>> -- Koos
>>

I will send a pull request about this tomorrow.

-- Koos

>>>
>>> --
>>> ~Ethan~
>>>


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Arthur Darcet
On 11 May 2016 at 22:51, Ethan Furman  wrote:

> On 05/11/2016 01:44 PM, Serhiy Storchaka wrote:
>
>>> os.path
>>> '''
>>>
>>> The various path-manipulation functions of ``os.path`` [#os-path]_
>>> will be updated to accept path objects. For polymorphic functions that
>>> accept both bytes and strings, they will be updated to simply use
>>> code very much similar to
>>> ``path.__fspath__() if  hasattr(path, '__fspath__') else path``. This
>>> will allow for their pre-existing type-checking code to continue to
>>> function.
>>>
>>
>> I am afraid that this will hurt performance. Some os.path functions are
>> used in tight loops and are heavily optimized, so adding support for the
>> path protocol could have a visible negative effect.
>>
>
> Do you have an example of os.path functions being used in a tight loop?
>

os.path.getmtime could be used in a tight loop, to sync directories with a
lot of files for instance.

% python3 -m timeit -s "import os.path; p = 'out'" "hasattr(p, '__fspath__'), os.path.getmtime(p)"
100000 loops, best of 3: 2.67 usec per loop
% python3 -m timeit -s "import os.path; p = 'out'" "isinstance(p, (str, bytes)), os.path.getmtime(p)"
100000 loops, best of 3: 2.45 usec per loop
% python3 -m timeit -s "import os.path; p = 'out'" "os.path.getmtime(p)"
100000 loops, best of 3: 2.02 usec per loop

A 25% overhead is a lot IMO.

An isinstance check prior to the hasattr might be a way to mitigate this a
bit (but it doesn't help much).
Granted, this example could be optimised by calling os.stat directly, which
is not in os.path, but it is still worth considering.
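
For anyone who wants to repeat the comparison above from a script rather than the shell, here is a minimal sketch using the timeit module. The function name ``time_variants`` and the temporary file are assumptions for the example; absolute numbers depend on the machine and file system, so none are shown.

```python
import os
import tempfile
import timeit


def time_variants(number=20000, repeat=3):
    """Return the best per-call time in microseconds for each variant."""
    statements = {
        'baseline': "os.path.getmtime(p)",
        'isinstance': "isinstance(p, (str, bytes)), os.path.getmtime(p)",
        'hasattr': "hasattr(p, '__fspath__'), os.path.getmtime(p)",
    }
    with tempfile.TemporaryDirectory() as tmpdir:
        p = os.path.join(tmpdir, 'out')
        with open(p, 'w'):
            pass  # create an empty file so getmtime() has something to stat
        namespace = {'os': os, 'p': p}
        results = {}
        for name, stmt in statements.items():
            timings = timeit.repeat(stmt, globals=namespace,
                                    repeat=repeat, number=number)
            results[name] = min(timings) / number * 1e6
        return results


if __name__ == '__main__':
    for name, usec in time_variants().items():
        print('{:<12}{:.2f} usec per call'.format(name, usec))
```

Taking the minimum over several repeats follows the usual timeit advice, since higher values mostly reflect interference from other processes.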


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Ethan Furman

On 05/11/2016 05:13 PM, Arthur Darcet wrote:


> os.path.getmtime could be used in a tight loop, to sync directories with
> a lot of files for instance.
>
> % python3 -m timeit -s "import os.path; p = 'out'" "hasattr(p, '__fspath__'), os.path.getmtime(p)"
> 100000 loops, best of 3: 2.67 usec per loop
> % python3 -m timeit -s "import os.path; p = 'out'" "isinstance(p, (str, bytes)), os.path.getmtime(p)"
> 100000 loops, best of 3: 2.45 usec per loop
> % python3 -m timeit -s "import os.path; p = 'out'" "os.path.getmtime(p)"
> 100000 loops, best of 3: 2.02 usec per loop
>
> A 25% overhead is a lot IMO.


I don't think those results are very informative, since 
os.path.getmtime() accepts both str and bytes, so it must already have 
the str/bytes check.


--
~Ethan~



Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 16:08 Koos Zevenhoven  wrote:

> On Thu, May 12, 2016 at 1:13 AM, Brett Cannon  wrote:
> >
> >
> > On Wed, 11 May 2016 at 13:45 Serhiy Storchaka wrote:
> >>
> >> On 11.05.16 19:43, Brett Cannon wrote:
> >> > os.path
> >> > '''
> >> >
> >> > The various path-manipulation functions of ``os.path`` [#os-path]_
> >> > will be updated to accept path objects. For polymorphic functions that
> >> > accept both bytes and strings, they will be updated to simply use
> >> > code very much similar to
> >> > ``path.__fspath__() if  hasattr(path, '__fspath__') else path``. This
> >> > will allow for their pre-existing type-checking code to continue to
> >> > function.
> >>
> >> I am afraid that this will hurt performance. Some os.path functions are
> >> used in tight loops and are heavily optimized, so adding support for the
> >> path protocol could have a visible negative effect.
> >
> >
> > As others have asked, what specific examples do you have that os.path is
> > used in a tight loop w/o any I/O that would overwhelm the performance?
> >
> >>
> >>
> >> I suggest implementing the other changes first and then looking at
> >> whether it is worth adding path protocol support to the os.path functions.
> >
> >
> > I see this whole discussion breaking down into a few groups which changes
> > what gets done upfront and what might be done farther down the line:
> >
> > 1. Maximum acceptance: do whatever we can to make all representations of
> > paths just work, which means making all places working with a path in the
> > stdlib accept path objects, str, and bytes.
>
> Since you are putting me in this camp, there is at least one thing you
> are wrong about. I don't want all places that work with a path to
> accept bytes. Only those that already do so, including os/os.path. And
> yes, I think the stdlib should show a good example in accepting path
> types (especially those provided in the stdlib itself).
>

That's actually what I meant. I'm not advocating widening the APIs that
accept bytes at all.

-Brett


>
> Whether Ethan is fully in camp 1, I don't know. Not that I think he
> would be any closer to the other camps, though.
>
> > 2. Safely use path objects: __fspath__() is there to signal an object is
> > a file system path and to get back a lower-level representation so people
> > stop calling str() on everything, providing some interface signaling that
> > someone doesn't misuse an object as a path and only changing path
> > consumption APIs -- e.g. open() -- and not path manipulation APIs -- e.g.
> > os.path -- in the stdlib.
> >
> > 3. It ain't worth it: those that would rather just skip all of this and
> > drop pathlib from the stdlib.
> >
> > Ethan and Koos are in group #1 and I'm personally in group #2 but I tried
> > to compromise somewhat and find a middle ground in the PEP with the level
> > of changes in the stdlib but being more restrictive with os.fspath(). If I
> > were doing a pure group #2 PEP I would drop os.path changes and make
> > os.fspath() do what Ethan and Koos have suggested and simply pass through
> > without checks whatever path.__fspath__() returned if the argument wasn't
> > str or bytes.
> >
>
> Related to this, based on the earlier discussions, I had the
> impression that you were largely in the same camp as me. In fact, I
> thought you had politely left some things out of the PEP draft so I
> could fill them in. It turned out I was wrong about that, because you
> didn't merge them.


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 15:13 Brett Cannon wrote:

> On Wed, 11 May 2016 at 13:45 Serhiy Storchaka wrote:
>
>> On 11.05.16 19:43, Brett Cannon wrote:
>> > os.path
>> > '''
>> >
>> > The various path-manipulation functions of ``os.path`` [#os-path]_
>> > will be updated to accept path objects. For polymorphic functions that
>> > accept both bytes and strings, they will be updated to simply use
>> > code very much similar to
>> > ``path.__fspath__() if  hasattr(path, '__fspath__') else path``. This
>> > will allow for their pre-existing type-checking code to continue to
>> > function.
>>
>> I am afraid that this will hurt performance. Some os.path functions are
>> used in tight loops and are heavily optimized, so adding support for the
>> path protocol can have a visible negative effect.
>>
>
> As others have asked, what specific examples do you have that os.path is
> used in a tight loop w/o any I/O that would overwhelm the performance?
>
>
>>
>> I suggest implementing the other changes first and then seeing whether it
>> is worth adding support for the path protocol to the os.path functions.
>>
>
> I see this whole discussion breaking down into a few groups which changes
> what gets done upfront and what might be done farther down the line:
>
>    1. Maximum acceptance: do whatever we can to make all representations
>    of paths just work, which means making all places working with a path in
>    the stdlib accept path objects, str, and bytes.
>    2. Safely use path objects: __fspath__() is there to signal an object
>    is a file system path and to get back a lower-level representation so
>    people stop calling str() on everything, providing some interface signaling
>    that someone doesn't misuse an object as a path and only changing path
>    consumption APIs -- e.g. open() -- and not path manipulation APIs -- e.g.
>    os.path -- in the stdlib.
>    3. It ain't worth it: those that would rather just skip all of this
>    and drop pathlib from the stdlib.
>
> Ethan and Koos are in group #1 and I'm personally in group #2 but I tried
> to compromise somewhat and find a middle ground in the PEP with the level
> of changes in the stdlib but being more restrictive with os.fspath(). If I
> were doing a pure group #2 PEP I would drop os.path changes and make
> os.fspath() do what Ethan and Koos have suggested and simply pass through
> without checks whatever path.__fspath__() returned if the argument wasn't
> str or bytes.
>

I should mention there's also a side group who really wanted either to
minimize bytes usage in all of this or not to have a polymorphic
os.fspath(), which is why the PEP defines that function to return only
strings.

IOW the PEP is already an attempt at compromise and if we can't find a
consensus somehow I'll update the PEP to directly reflect my personal
preferences and let others write their own competing PEPs if they prefer
(but I think it's in everyone's best interests if that's a last resort).
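
To make the two os.fspath() behaviours discussed above concrete, here is a
minimal sketch. It is not the PEP's implementation: DemoPath,
fspath_strict() and fspath_passthrough() are hypothetical names, and
rejecting everything that does not end up as str with TypeError in the
strict variant is an assumption, since this message only says the function
returns strings.

```python
class DemoPath:
    """Hypothetical path object, used only for this illustration."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


def fspath_strict(path):
    """Restrictive variant: the result is always str, otherwise TypeError."""
    if isinstance(path, str):
        return path
    if hasattr(path, '__fspath__'):
        result = path.__fspath__()
        if isinstance(result, str):
            return result
    # Assumption: anything that is not (or does not yield) str is rejected.
    raise TypeError('expected str or a path object returning str, got %r'
                    % type(path).__name__)


def fspath_passthrough(path):
    """Permissive variant: return whatever __fspath__() gives, unchecked."""
    if isinstance(path, (str, bytes)):
        return path
    return path.__fspath__()
```

With these definitions, a path object whose __fspath__() returns bytes is
rejected by the strict variant but handed through unchanged by the
permissive one, which is the trade-off the groups above disagree about.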


Re: [Python-Dev] ctypes compatibility with 2.3

2016-05-11 Thread Brett Cannon
On Wed, 11 May 2016 at 09:28 Brett Cannon wrote:

> On Wed, 11 May 2016 at 09:07 Thomas Heller wrote:
>
>> On 11.05.2016 at 18:04, Brett Cannon wrote:
>> >
>> >
>> > On Wed, 11 May 2016 at 04:35 Thomas Heller wrote:
>> >
>> > On 10.05.2016 at 19:39, Brett Cannon wrote:
>> > >
>> > >
>> > > On Tue, 10 May 2016 at 01:18 Martin Panter wrote:
>> > >
>> > > I am working on , to fix shell
>> > > injection problems with ctypes.util.find_library(). The proposal for
>> > > Python 3 is to change os.popen(shell-script) calls to use
>> > > subprocess.Popen().
>> > >
>> > > However the Python 2.7 version of the module has a comment which says
>> > > “This file should be kept compatible with Python 2.3, see PEP 291.”
>> > > Looking at , it is not
>> > > clear why we have to maintain this compatibility. My best guess is
>> > > that there may be an external ctypes package that people want(ed) to
>> > > keep compatible with 2.3, and also keep synchronized with 2.7.
>> > >
>> > >
>> > > That's correct and the maintainer is/was Thomas Heller who I have cc'ed
>> > > to see if he's okay with lifting the restriction.
>> >
>> > For me it is totally ok to lift this restriction.
>> >
>> >
>> > Great! I'll also update PEP 291.
>>
>> Cool.  While you're at it, the compatibility restriction for
>> modulefinder could also be lifted.
>>
>
> Will do.
>

PEP 291 no longer lists any restrictions on ctypes or modulefinder.


Re: [Python-Dev] ctypes compatibility with 2.3

2016-05-11 Thread Martin Panter
On 12 May 2016 at 01:05, Brett Cannon wrote:
>
>
> On Wed, 11 May 2016 at 09:28 Brett Cannon wrote:
>>
>> On Wed, 11 May 2016 at 09:07 Thomas Heller wrote:
>>>
>>> On 11.05.2016 at 18:04, Brett Cannon wrote:
>>> >
>>> >
>>> > On Wed, 11 May 2016 at 04:35 Thomas Heller wrote:
>>> > For me it is totally ok to lift this restriction.
>>> >
>>> >
>>> > Great! I'll also update PEP 291.
>>>
>>> Cool.  While you're at it, the compatibility restriction for
>>> modulefinder could also be lifted.
>>
>>
>> Will do.
>
>
> PEP 291 no longer lists any restrictions on ctypes or modulefinder.

Thanks everyone for your responses. I will look at removing the
notices in the code when I get a chance. That would probably involve
reverting
https://hg.python.org/cpython/rev/381a72ab5fb8
and also the modulefinder.py comment.

There are also these commits that could be backported:
https://hg.python.org/cpython/rev/0980034adaa7 (ctypes)
https://hg.python.org/cpython/diff/627db59031be/Lib/modulefinder.py
but it might be safer just to leave the compatibility code there,
perhaps with a clarifying comment.


Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Serhiy Storchaka

On 11.05.16 23:51, Ethan Furman wrote:
> On 05/11/2016 01:44 PM, Serhiy Storchaka wrote:
>> I am afraid that this will hurt performance. Some os.path functions are
>> used in tight loops and are heavily optimized, so adding support for the
>> path protocol can have a visible negative effect.
>
> Do you have an example of os.path functions being used in a tight loop?

posixpath.realpath(), os.walk(), and glob.glob() call split() and join()
for every path component. dirname() and basename() are also called
often. I am not counting functions like islink() and isfile(), since they
just pass the argument to the underlying stat function and don't need
conversion.
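
One way to put a number on this concern is a micro-benchmark along the
following lines. This is only a sketch: join_with_protocol() is a
hypothetical wrapper that applies the hasattr() check quoted from the PEP
draft before delegating to os.path.join(), and the timings depend entirely
on the machine and interpreter. On an interpreter whose os.path already
supports the protocol, the baseline includes a similar check, so the
difference then only shows the cost of one extra check.

```python
import os.path
import timeit


def join_with_protocol(a, *p):
    """Hypothetical wrapper: check every argument for __fspath__ first."""
    a = a.__fspath__() if hasattr(a, '__fspath__') else a
    p = [x.__fspath__() if hasattr(x, '__fspath__') else x for x in p]
    return os.path.join(a, *p)


def measure(number=20000):
    """Return (plain, wrapped) timings in seconds for `number` calls."""
    plain = timeit.timeit(lambda: os.path.join('usr', 'lib'),
                          number=number)
    wrapped = timeit.timeit(lambda: join_with_protocol('usr', 'lib'),
                            number=number)
    return plain, wrapped


if __name__ == '__main__':
    plain, wrapped = measure()
    print('plain join:   %.4f s' % plain)
    print('wrapped join: %.4f s' % wrapped)
```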





Re: [Python-Dev] file system path protocol PEP

2016-05-11 Thread Serhiy Storchaka

On 12.05.16 01:13, Brett Cannon wrote:
> On Wed, 11 May 2016 at 13:45 Serhiy Storchaka wrote:
>
>> On 11.05.16 19:43, Brett Cannon wrote:
>> > os.path
>> > '''
>> >
>> > The various path-manipulation functions of ``os.path`` [#os-path]_
>> > will be updated to accept path objects. For polymorphic functions that
>> > accept both bytes and strings, they will be updated to simply use
>> > code very much similar to
>> > ``path.__fspath__() if  hasattr(path, '__fspath__') else path``. This
>> > will allow for their pre-existing type-checking code to continue to
>> > function.
>>
>> I am afraid that this will hurt performance. Some os.path functions are
>> used in tight loops and are heavily optimized, so adding support for the
>> path protocol can have a visible negative effect.
>
> As others have asked, what specific examples do you have that os.path is
> used in a tight loop w/o any I/O that would overwhelm the performance?


Most examples do some I/O (like os.lstat()): posixpath.realpath(),
os.walk(), glob.glob(). But os.walk(), for example, was sped up
significantly by using os.scandir(), and it would be sad to make it slower
again. os.path is used in a number of files, sometimes in loops, sometimes
indirectly. It is hard to find all examples.


Functions such as glob.glob() call split() and join() for every
component, but they also use string or bytes operations on paths. So
they need to convert the argument to str or bytes before starting the
iteration, and then always call os.path functions with str or bytes only.
An additional conversion in every os.path function is redundant. I suppose
most other high-level functions that manipulate paths in a loop should
also convert their arguments once at the start and don't need support for
the path protocol in os.path functions.
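
The "convert once at the start" pattern can be sketched as follows. The
helper names _to_str_or_bytes() and split_all() are hypothetical and not
part of the stdlib; the point is only that the loop calls os.path.split()
with plain str or bytes, so no per-call protocol check is needed.

```python
import os.path


def _to_str_or_bytes(path):
    """Convert a path object to str or bytes exactly once."""
    if isinstance(path, (str, bytes)):
        return path
    if hasattr(path, '__fspath__'):
        return path.__fspath__()
    raise TypeError('not a path: %r' % (path,))


def split_all(path):
    """Split a path into its components, converting the argument once."""
    path = _to_str_or_bytes(path)
    parts = []
    while True:
        head, tail = os.path.split(path)
        if head == path:
            # Nothing more to split off (root directory or empty path).
            if head:
                parts.append(head)
            break
        if tail:
            parts.append(tail)
        if not head:
            break
        path = head
    parts.reverse()
    return parts
```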



> I see this whole discussion breaking down into a few groups which
> changes what gets done upfront and what might be done farther down the line:
>
>  1. Maximum acceptance: do whatever we can to make all representations of
>     paths just work, which means making all places working with a path
>     in the stdlib accept path objects, str, and bytes.
>  2. Safely use path objects: __fspath__() is there to signal an object
>     is a file system path and to get back a lower-level representation
>     so people stop calling str() on everything, providing some interface
>     signaling that someone doesn't misuse an object as a path and only
>     changing path consumption APIs -- e.g. open() -- and not path
>     manipulation APIs -- e.g. os.path -- in the stdlib.
>  3. It ain't worth it: those that would rather just skip all of this and
>     drop pathlib from the stdlib.
>
> Ethan and Koos are in group #1 and I'm personally in group #2 but I
> tried to compromise somewhat and find a middle ground in the PEP with
> the level of changes in the stdlib but being more restrictive with
> os.fspath(). If I were doing a pure group #2 PEP I would drop os.path
> changes and make os.fspath() do what Ethan and Koos have suggested and
> simply pass through without checks whatever path.__fspath__() returned
> if the argument wasn't str or bytes.


I'm for adding conversions to the C-implemented path-consuming APIs, and
maybe to high-level path manipulation functions like os.walk(), but
leaving the low-level APIs of os.path, fnmatch and glob unchanged.




Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-05-11 Thread Georg Brandl
I'm happy with the latest version.

Georg

On 05/11/2016 06:46 PM, Guido van Rossum wrote:
> If the authors are happy I'll accept it right away.
> 
> (I vaguely recall there's another PEP that's ready for pronouncement -- but
> which one?)
> 
> On Wed, May 11, 2016 at 9:34 AM, Brett Cannon wrote:
> 
> Is there anything holding up PEP 515 at this point in terms of acceptance or
> implementation?
> 
> On Sat, 19 Mar 2016 at 11:56 Guido van Rossum wrote:
> 
> All that sounds fine!
> 
> On Sat, Mar 19, 2016 at 11:28 AM, Stefan Krah wrote:
> > Guido van Rossum writes:
> >> So should the preprocessing step just be s.replace('_', ''), or should
> >> it reject underscores that don't follow the rules from the PEP
> >> (perhaps augmented so they follow the spirit of the PEP and the letter
> >> of the IBM spec)?
> >>
> >> Honestly I think it's also fine if specifying this exactly is left out
> >> of the PEP, and handled by whoever adds this to Decimal. Having a PEP
> >> to work from for the language spec and core builtins (int(), float(),
> >> complex()) is more important.
> >
> > I'd keep it simple for Decimal: Remove left and right whitespace (we're
> > already doing this), then remove underscores from the remaining string
> > (which must not contain any further whitespace), then use the IBM grammar.
> >
> >
> > We could add a clause to the PEP that only those strings that follow
> > the spirit of the PEP are guaranteed to be accepted in the future.
> >
> >
> > One reason for keeping it simple is that I would not like to slow down
> > string conversion, but thinking about two grammars is also a problem --
> > part of the string conversion in libmpdec is modeled in ACL2, which
> > would be invalidated or at least complicated with two grammars.
> >
> >
> >
> > Stefan Krah
> >
> 
> 
> 
> --
> --Guido van Rossum (python.org/~guido)
> 
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)
> 
> 
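
The simple preprocessing Stefan describes in the quoted message (strip the
surrounding whitespace, drop underscores, then hand the rest to the
existing grammar) could look roughly like this in pure Python. This is a
sketch only, not the libmpdec or decimal module implementation, and
parse_decimal_with_underscores() is a hypothetical name.

```python
from decimal import Decimal


def parse_decimal_with_underscores(text):
    """Strip outer whitespace, drop underscores, then parse as Decimal."""
    stripped = text.strip()
    # The remaining string must not contain any further whitespace.
    if any(ch.isspace() for ch in stripped):
        raise ValueError('whitespace inside numeric string: %r' % text)
    # Underscores are removed without checking where they appear, which
    # is the "keep it simple" option discussed above.
    return Decimal(stripped.replace('_', ''))
```

Note that this accepts strings such as '1__0' that the stricter rules for
literals would reject, which is exactly the question Guido raises.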

