Re: [Python-Dev] Discussion overload

2016-06-19 Thread Kevin Ollivier
Hi Nick,



On 6/17/16, 6:12 PM, "Nick Coghlan"  wrote:

>On 16 June 2016 at 19:00, Kevin Ollivier  wrote:
>> Hi Guido,
>>
>> From:  on behalf of Guido van Rossum
>> 
>> Reply-To: 
>> Date: Thursday, June 16, 2016 at 5:27 PM
>> To: Kevin Ollivier 
>> Cc: Python Dev 
>> Subject: Re: [Python-Dev] Discussion overload
>>
>> Hi Kevin,
>>
>> I often feel the same way. Are you using GMail? It combines related messages
>> in threads and lets you mute threads. I often use this feature so I can
>> manage my inbox. (I presume other mailers have the same features, but I
>> don't know if all of them do.) There are also many people who read the list
>> on a website, e.g. gmane. (Though I think that sometimes the delays incurred
>> there add to the noise -- e.g. when a decision is reached on the list
>> sometimes people keep responding to earlier threads.)
>>
>>
>> I fear I did quite a poor job of making my point. :( I've been on open
>> source mailing lists since the late 90s, so I've learned strategies for
>> dealing with mailing list overload. I've got my mail folders, my mail rules,
>> etc. Having been on many mailing lists over the years, I've seen many
>> productive discussions and many unproductive ones, and over time you start
>> to see patterns. You also see what happens to those communities over time.
>
>This is one of the major reasons we have the option of escalating
>things to the PEP process (and that's currently in train for
>os.urandom), as well as the SIGs for when folks really need to dig
>into topics that risk incurring a relatively low signal-to-noise
>ratio on python-dev. It's also why python-ideas was turned into a
>separate list, since folks without the time for more speculative
>discussions and brainstorming can safely ignore it, while remaining
>confident that any ideas considered interesting enough for further
>review will be brought to python-dev's attention.
>
>But yes, one of the more significant design errors I've made with the
>contextlib API was due to just such a draining pile-on by folks that
>weren't happy the original name wasn't a 100% accurate description of
>the underlying mechanics (even though it was an accurate description
>of the intended use case), and "people yelling at you on project
>communication channels without doing adequate research first" is the
>number one reason we see otherwise happily engaged core developers
>decide to find something else to do with their time.

Yeah, the sad truth is that when you start having these problems, it's the good 
people that leave. The key though is not to treat this as some unsolvable 
problem, which honestly is what I've seen many projects do. :( My guess is that 
once these issues are addressed, at least some of the people who left would be 
willing to give it another try. 

I had written a couple paragraphs about some different tools and approaches 
that might help with that, but I think Guido's got the right idea by taking it 
off-list to determine the best way to move forward first.

Regards,

Kevin


>The challenge and art in community management in that context is
>balancing telling both old and new list participants "It's OK to ask
>'Why is this so?', as sometimes the answer is that there isn't a good
>reason and we may want to change it" and "Learn to be a good peer
>manager, and avoid behaving like a micro-managing autocrat that chases
>away experienced contributors".

>Cheers,
>Nick.
>
>-- 
>Nick Coghlan   |   [email protected]   |   Brisbane, Australia

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] security SIG?

2016-06-19 Thread Nick Coghlan
On 18 June 2016 at 10:36, Ethan Furman  wrote:
> One of the big advantages of a SIG is the much reduced pool of participants,
> and that those participants are usually interested in forward progress.  It
> would also be helpful to have a single person both champion and act as
> buffer for the proposals (not necessarily the same person each time).  I am
> reminded of the matrix-multiply PEP brought forward by Nathaniel a few
> months ago -- the proposal was researched outside of py-dev, presented to
> py-dev when ready, Nathaniel acted as the gateway between py-dev and those
> that wanted/needed the change, the discussion stayed (pretty much) on track,
> and it felt like the whole thing was very smooth.  (If it was somebody else,
> my apologies for my terrible memory! ;)
>
> To sum up:  I think it would be a good idea.

I'm coming around to this point of view as well. import-sig, for
example, is a very low traffic SIG, but I think it serves three key
useful purposes:

- it clearly indicates that import is a specialist topic with
additional considerations to take into account that may not be obvious
to developers touching the import system for the first time
- it provides a forum to collaboratively craft explanations of
proposed changes that should make sense to folks that *aren't*
specialists
- anyone that wants to become an "import system expert" can join the
SIG and learn from the intermittent discussions of proposed changes

distutils-sig is an example at the other end of the scale - while
distutils-sig and python-dev subscribers aren't a disjoint set, those
of us that fall into the intersection are a clear minority on both
lists, and can act as representatives of the interests of the other
group when needed.

As far as names go, my vote would be for "paranoia-sig" - it nicely
avoids any risk of folks submitting security bugs there instead of to
the PSRT, and "We're professionally paranoid, so you don't need to be"
is an apt description of good security sensitive API design in a
general purpose language like Python :)

Cheers,
Nick.

P.S. Hopefully we could get some of the Python Cryptographic Authority
folks to sign up, just as distutils-sig is a point of collaboration
between python-dev and PyPA. "Secure software design in Python" covers
a lot more than just the standard library, since in many cases you
really want to reach beyond the standard library and grab something
like cryptography or passlib, or delegate the problem to a domain
specific framework like Django or the relevant components of the Flask
or Pyramid ecosystems.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] security SIG?

2016-06-19 Thread Ethan Furman

On 06/19/2016 12:39 PM, Nick Coghlan wrote:

On 18 June 2016 at 10:36, Ethan Furman wrote:



To sum up:  I think it would be a good idea.


I'm coming around to this point of view as well. import-sig, for
example, is a very low traffic SIG, but I think it serves three key
useful purposes:

- it clearly indicates that import is a specialist topic with
additional considerations to take into account that may not be obvious
to developers touching the import system for the first time
- it provides a forum to collaboratively craft explanations of
proposed changes that should make sense to folks that *aren't*
specialists
- anyone that wants to become an "import system expert" can join the
SIG and learn from the intermittent discussions of proposed changes


[...]


As far as names go, my vote would be for "paranoia-sig" - it nicely
avoids any risk of folks submitting security bugs there instead of to
the PSRT, and "We're professionally paranoid, so you don't need to be"
is an apt description of good security sensitive API design in a
general purpose language like Python :)


Heh.  I like it.  If no one comes up with any other names I'll get the 
SIG requested mid-week-ish.


--
~Ethan~



Re: [Python-Dev] security SIG?

2016-06-19 Thread Guido van Rossum
I think it's fine to have this SIG. I could see it going different ways in
terms of discussions and membership, but it's definitely worth a try. I
don't like clever names, and I very much doubt that it'll be mistaken for
an address to report sensitive issues, so I think it should just be
security-sig. (The sensitive-issues people are usually paranoid enough to
check before they post; the script kiddies reporting python.org "issues"
probably will get a faster and more appropriate response from the
security-sig.)

So let's just do it.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] frame evaluation API PEP

2016-06-19 Thread Brett Cannon
On Sat, 18 Jun 2016 at 21:49 Guido van Rossum  wrote:

> Hi Brett,
>
> I've got a few questions about the specific design. Probably you know the
> answers, it would be nice to have them in the PEP.
>

Once you're happy with my answers I'll update the PEP.


>
> First, why not have a global hook? What does a hook per interpreter give
> you? Would even finer granularity buy anything?
>

We initially considered a per-code object hook, but we figured it was
unnecessary to have that level of control, especially since people like
Numba have gotten away with not needing it for this long (although I
suspect that's because they are a decorator so they can just return an
object that overrides __call__()). We didn't think that a global one was
appropriate as different workloads may call for different
JITs/debuggers/etc. and there is no guarantee that you are executing every
interpreter with the same workload. Plus we figured people might simply
import their JIT of choice and as a side-effect set the hook, and since
imports are a per-interpreter thing that seemed to suggest the granularity
of interpreters.
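
For anyone unfamiliar with the decorator route mentioned above, here is a
minimal sketch of it. The names `jit` and `CompiledFunction` are made up for
illustration and are not Numba's actual API; a real JIT would dispatch to
compiled code where this sketch simply calls the original function.

```python
import functools


class CompiledFunction:
    """Hypothetical stand-in for the object a JIT decorator might return."""

    def __init__(self, func):
        # Copy __name__, __doc__, etc. so the wrapper looks like the original.
        functools.update_wrapper(self, func)
        self._func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):
        # A real JIT would run compiled machine code here. This sketch only
        # counts the call and falls back to the normal interpreter.
        self.calls += 1
        return self._func(*args, **kwargs)


def jit(func):
    """Hypothetical decorator: replace the function with a callable object."""
    return CompiledFunction(func)


@jit
def add(a, b):
    return a + b
```

Because the decorator owns the call path, it needs no hook in the interpreter
at all, which is presumably why a per-code-object hook was never required.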

IOW it seemed to be more in line with sys.settrace() than some global thing
for the process.


>
> Next, I'm a bit (but no more than a bit) concerned about the extra 8 bytes
> per code object, especially since for most people this is just waste
> (assuming most people won't be using Pyjion or Numba). Could it be a
> compile-time feature (requiring recompilation of CPython but not
> extensions)?
>

Probably. It does water down potential usage thanks to needing a special
build. If the decision is "special build or not", I would simply pull out
this part of the proposal as I wouldn't want to add a flag that influences
what is or is not possible for an interpreter.


> Could you figure out some other way to store per-code-object data? It
> seems you considered this but decided that the co_extra field was simpler
> and faster; I'm basically pushing a little harder on this. Of course most
> of the PEP would disappear without this feature; the extra interpreter
> field is fine.
>

Dino and I thought of two potential alternatives, neither of which we have
taken the time to implement and benchmark. One is to simply have a hash
table of memory addresses to JIT data that is kept on the JIT side of
things. Obviously it would be nice to avoid the overhead of a hash table
lookup on every function call. This also doesn't help minimize memory when
the code object gets GC'ed.
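
A rough Python-level sketch of that first alternative, assuming the table is
keyed on `id()` of the code object (which is its memory address in CPython).
The names `jit_data`, `attach` and `lookup` are hypothetical, and a real JIT
would do this in C:

```python
# Hypothetical side table kept by the JIT: id(code object) -> JIT data.
jit_data = {}


def attach(func, data):
    """Record JIT data for func's code object."""
    jit_data[id(func.__code__)] = data


def lookup(func):
    """Return the JIT data for func, or None; one hash lookup per call."""
    return jit_data.get(id(func.__code__))


def square(x):
    return x * x


def cube(x):
    return x * x * x


attach(square, {"compiled": True})
```

Nothing here removes an entry when its code object goes away, so the table can
outlive the code it describes, and a reused address could even match a stale
entry. That is the cleanup problem described above.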

The other potential solution we came up with was to use weakrefs. I have
not looked into the details, but we were thinking that if we registered the
JIT data object as a weakref on the code object, couldn't we iterate
through the weakrefs attached to the code object to look for the JIT data
object, and then get the reference that way? It would let us avoid a more
expensive hash table lookup if we assume most code objects won't have a
weakref on it (assuming weakrefs are stored in a list), and it gives us the
proper cleanup semantics we want by getting the weakref cleanup callback
execution to make sure we decref the JIT data object appropriately. But as
I said, I have not looked into the feasibility of this at all to know if
I'm remembering the weakref implementation details correctly.


>
> Finally, there are some error messages from pep2html.py:
> https://www.python.org/dev/peps/pep-0523/#copyright
>

All fixed in
https://github.com/python/peps/commit/6929f850a5af07e51d0163558a5fe8d6b85dccfe .

-Brett


>
>
> --Guido
>
> On Fri, Jun 17, 2016 at 7:58 PM, Brett Cannon  wrote:
>
>> I have taken PEP 523 for this:
>> https://github.com/python/peps/blob/master/pep-0523.txt .
>>
>> I'm waiting until Guido gets back from vacation, at which point I'll ask
>> for a pronouncement or assignment of a BDFL delegate.
>>
>> On Fri, 3 Jun 2016 at 14:37 Brett Cannon  wrote:
>>
>>> For those of you who follow python-ideas or were at the PyCon US 2016
>>> language summit, you have already seen/heard about this PEP. For those of
>>> you who don't fall into either of those categories, this PEP proposed a
>>> frame evaluation API for CPython. The motivating example of this work has
>>> been Pyjion, the experimental CPython JIT Dino Viehland and I have been
>>> working on in our spare time at Microsoft. The API also works for
>>> debugging, though, as already demonstrated by Google having added a very
>>> similar API internally for debugging purposes.
>>>
>>> The PEP is pasted in below and also available in rendered form at
>>> https://github.com/Microsoft/Pyjion/blob/master/pep.rst (I will assign
>>> myself a PEP # once discussion is finished as it's easier to work in git
>>> for this for the rich rendering of the in-progress PEP).
>>>
>>> I should mention that the difference from python-ideas and the language
>>> summit in the PEP are the listed support from Google's use of a very
>>> similar API as well as clarifying the co_extra field on code objects
>>> doesn't change their 

Re: [Python-Dev] frame evaluation API PEP

2016-06-19 Thread MRAB

On 2016-06-20 02:29, Brett Cannon wrote:


On Sat, 18 Jun 2016 at 21:49 Guido van Rossum wrote:


[snip]


Could you figure out some other way to store per-code-object data?
It seems you considered this but decided that the co_extra field was
simpler and faster; I'm basically pushing a little harder on this.
Of course most of the PEP would disappear without this feature; the
extra interpreter field is fine.

Dino and I thought of two potential alternatives, neither of which we
have taken the time to implement and benchmark. One is to simply have a
hash table of memory addresses to JIT data that is kept on the JIT side
of things. Obviously it would be nice to avoid the overhead of a hash
table lookup on every function call. This also doesn't help minimize
memory when the code object gets GC'ed.


[snip]
If you had a flag in co_flags that said whether it should look in the 
hash table, then that might reduce the overhead.
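
Roughly what I have in mind, modelled in Python. `FakeCode` stands in for a
code object and `CO_HAS_JIT_DATA` is a made-up flag bit, not an existing
co_flags value; the counter is only there to show when the table is consulted.

```python
from dataclasses import dataclass

CO_HAS_JIT_DATA = 0x1  # hypothetical flag bit, not a real CPython co_flags value


@dataclass(eq=False)
class FakeCode:
    """Stand-in for a code object; only the fields this sketch needs."""
    name: str
    co_flags: int = 0


jit_data = {}            # id(code) -> JIT data
stats = {"lookups": 0}   # counts how often the hash table is actually hit


def attach(code, data):
    """Store JIT data and set the flag so later calls know to look."""
    code.co_flags |= CO_HAS_JIT_DATA
    jit_data[id(code)] = data


def get_jit_data(code):
    """Cheap bit test first; the hash lookup happens only for flagged code."""
    if not code.co_flags & CO_HAS_JIT_DATA:
        return None
    stats["lookups"] += 1
    return jit_data[id(code)]
```

Code objects that no JIT has touched pay only for the bit test.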




Re: [Python-Dev] frame evaluation API PEP

2016-06-19 Thread Guido van Rossum
On Sun, Jun 19, 2016 at 6:29 PM, Brett Cannon  wrote:

>
>
> On Sat, 18 Jun 2016 at 21:49 Guido van Rossum  wrote:
>
>> Hi Brett,
>>
>> I've got a few questions about the specific design. Probably you know the
>> answers, it would be nice to have them in the PEP.
>>
>
> Once you're happy with my answers I'll update the PEP.
>

Soon!


>
>
>>
>> First, why not have a global hook? What does a hook per interpreter give
>> you? Would even finer granularity buy anything?
>>
>
> We initially considered a per-code object hook, but we figured it was
> unnecessary to have that level of control, especially since people like
> Numba have gotten away with not needing it for this long (although I
> suspect that's because they are a decorator so they can just return an
> object that overrides __call__()).
>

So they do it at the function object level?


> We didn't think that a global one was appropriate as different workloads
> may call for different JITs/debuggers/etc. and there is no guarantee that
> you are executing every interpreter with the same workload. Plus we figured
> people might simply import their JIT of choice and as a side-effect set the
> hook, and since imports are a per-interpreter thing that seemed to suggest
> the granularity of interpreters.
>

I like import as the argument here.


>
> IOW it seemed to be more in line with sys.settrace() than some global
> thing for the process.
>
>
>>
>> Next, I'm a bit (but no more than a bit) concerned about the extra 8
>> bytes per code object, especially since for most people this is just waste
>> (assuming most people won't be using Pyjion or Numba). Could it be a
>> compile-time feature (requiring recompilation of CPython but not
>> extensions)?
>>
>
> Probably. It does water down potential usage thanks to needing a special
> build. If the decision is "special build or not", I would simply pull out
> this part of the proposal as I wouldn't want to add a flag that influences
> what is or is not possible for an interpreter.
>

MRAB's response made me think of a possible approach: the co_extra field
could be the very last field of the PyCodeObject struct and only present if
a certain flag is set in co_flags. This is similar to a trick used by X11
(I know, it's long ago :-).
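
To make the layout idea concrete, here is a byte-level model using the struct
module. It is only an illustration of "trailing field present when a flag is
set"; the 4-byte header, the `HAS_EXTRA` bit and the function names are all
assumptions and say nothing about the real PyCodeObject layout.

```python
import struct

HAS_EXTRA = 0x1                # hypothetical flag bit
HEADER = struct.Struct("<I")   # stand-in for the fixed fields: just the flags
EXTRA = struct.Struct("<Q")    # the optional 8-byte trailing slot


def pack_record(flags, extra=None):
    """Serialize a record; the trailing slot exists only when extra is given."""
    if extra is None:
        return HEADER.pack(flags & ~HAS_EXTRA)
    return HEADER.pack(flags | HAS_EXTRA) + EXTRA.pack(extra)


def read_extra(buf):
    """Return the trailing value, or None when the flag says it is absent."""
    (flags,) = HEADER.unpack_from(buf, 0)
    if not flags & HAS_EXTRA:
        return None
    (extra,) = EXTRA.unpack_from(buf, HEADER.size)
    return extra
```

Records without the flag are 8 bytes shorter, so only code objects that use
the field would pay for it.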

>
>
>> Could you figure out some other way to store per-code-object data? It
>> seems you considered this but decided that the co_extra field was simpler
>> and faster; I'm basically pushing a little harder on this. Of course most
>> of the PEP would disappear without this feature; the extra interpreter
>> field is fine.
>>
>
> Dino and I thought of two potential alternatives, neither of which we have
> taken the time to implement and benchmark. One is to simply have a hash
> table of memory addresses to JIT data that is kept on the JIT side of
> things. Obviously it would be nice to avoid the overhead of a hash table
> lookup on every function call. This also doesn't help minimize memory when
> the code object gets GC'ed.
>

I guess the prospect of the extra hash lookup per call isn't great given
that this is about perf...

>
> The other potential solution we came up with was to use weakrefs. I have
> not looked into the details, but we were thinking that if we registered the
> JIT data object as a weakref on the code object, couldn't we iterate
> through the weakrefs attached to the code object to look for the JIT data
> object, and then get the reference that way? It would let us avoid a more
> expensive hash table lookup if we assume most code objects won't have a
> weakref on it (assuming weakrefs are stored in a list), and it gives us the
> proper cleanup semantics we want by getting the weakref cleanup callback
> execution to make sure we decref the JIT data object appropriately. But as
> I said, I have not looked into the feasibility of this at all to know if
> I'm remembering the weakref implementation details correctly.
>

That would be even slower than the hash table lookup, and unbounded. So
let's not go there.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] frame evaluation API PEP

2016-06-19 Thread Mark Shannon



On 19/06/16 18:29, Brett Cannon wrote:



On Sat, 18 Jun 2016 at 21:49 Guido van Rossum wrote:

Hi Brett,

I've got a few questions about the specific design. Probably you
know the answers, it would be nice to have them in the PEP.


Once you're happy with my answers I'll update the PEP.


First, why not have a global hook? What does a hook per interpreter
give you? Would even finer granularity buy anything?


We initially considered a per-code object hook, but we figured it was
unnecessary to have that level of control, especially since people like
Numba have gotten away with not needing it for this long (although I
suspect that's because they are a decorator so they can just return an
object that overrides __call__()). We didn't think that a global one was
appropriate as different workloads may call for different
JITs/debuggers/etc. and there is no guarantee that you are executing
every interpreter with the same workload. Plus we figured people might
simply import their JIT of choice and as a side-effect set the hook, and
since imports are a per-interpreter thing that seemed to suggest the
granularity of interpreters.

IOW it seemed to be more in line with sys.settrace() than some global
thing for the process.


Next, I'm a bit (but no more than a bit) concerned about the extra 8
bytes per code object, especially since for most people this is just
waste (assuming most people won't be using Pyjion or Numba). Could
it be a compile-time feature (requiring recompilation of CPython but
not extensions)?


Probably. It does water down potential usage thanks to needing a special
build. If the decision is "special build or not", I would simply pull
out this part of the proposal as I wouldn't want to add a flag that
influences what is or is not possible for an interpreter.

Could you figure out some other way to store per-code-object data?
It seems you considered this but decided that the co_extra field was
simpler and faster; I'm basically pushing a little harder on this.
Of course most of the PEP would disappear without this feature; the
extra interpreter field is fine.


Dino and I thought of two potential alternatives, neither of which we
have taken the time to implement and benchmark. One is to simply have a
hash table of memory addresses to JIT data that is kept on the JIT side
of things. Obviously it would be nice to avoid the overhead of a hash
table lookup on every function call. This also doesn't help minimize
memory when the code object gets GC'ed.


Hash lookups aren't that slow. If you combine it with the custom flags 
suggested by MRAB, then you would only suffer the lookup penalty when 
actually entering the special interpreter.

You can use a weakref callback to ensure things get GC'd properly.

Also, if there is a special extra field on code-object, then everyone 
will want to use it. How do you handle clashes?




The other potential solution we came up with was to use weakrefs. I have
not looked into the details, but we were thinking that if we registered
the JIT data object as a weakref on the code object, couldn't we iterate
through the weakrefs attached to the code object to look for the JIT
data object, and then get the reference that way? It would let us avoid
a more expensive hash table lookup if we assume most code objects won't
have a weakref on it (assuming weakrefs are stored in a list), and it
gives us the proper cleanup semantics we want by getting the weakref
cleanup callback execution to make sure we decref the JIT data object
appropriately. But as I said, I have not looked into the feasibility of
this at all to know if I'm remembering the weakref implementation
details correctly.


Finally, there are some error messages from pep2html.py:
https://www.python.org/dev/peps/pep-0523/#copyright


All fixed in
https://github.com/python/peps/commit/6929f850a5af07e51d0163558a5fe8d6b85dccfe .

-Brett



--Guido

On Fri, Jun 17, 2016 at 7:58 PM, Brett Cannon wrote:

I have taken PEP 523 for this:
https://github.com/python/peps/blob/master/pep-0523.txt .

I'm waiting until Guido gets back from vacation, at which point
I'll ask for a pronouncement or assignment of a BDFL delegate.

On Fri, 3 Jun 2016 at 14:37 Brett Cannon wrote:

For those of you who follow python-ideas or were at the
PyCon US 2016 language summit, you have already seen/heard
about this PEP. For those of you who don't fall into either
of those categories, this PEP proposed a frame evaluation
API for CPython. The motivating example of this work has
been Pyjion, the experimental CPython JIT Dino Viehland and
I have been working on in our spare time at Microsoft. The
API also works for debugging, though, as already
demonstrated by

Re: [Python-Dev] security SIG?

2016-06-19 Thread Ethan Furman

On 06/19/2016 03:51 PM, Guido van Rossum wrote:


I think it's fine to have this SIG. I could see it going different ways
in terms of discussions and membership, but it's definitely worth a try.
I don't like clever names, and I very much doubt that it'll be mistaken
for an address to report sensitive issues, so I think it should just be
security-sig. (The sensitive-issues people are usually paranoid enough
to check before they post; the script kiddies reporting python.org
"issues" probably will get a faster and more appropriate response from
the security-sig.)

So let's just do it.


Started the process of creating "security-sig".

--
~Ethan~
