Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-09-07 Thread Guido van Rossum
Folks,

At the sprint both Victor and Yury have petitioned me to accept this
PEP. I now agree. Let's do it! PEP 509 is hereby officially accepted.
(Some implementation details have to be sorted out, but I need to
unblock Victor before the sprint is over.)

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-18 Thread Jim J. Jewett
On Fri, Apr 15, 2016 at 7:31 PM, Victor Stinner
 wrote:
> .2016-04-15 23:45 GMT+02:00 Jim J. Jewett :
...
>> I just worry that you may end up closing off even better optimizations
>> later, if you make too many promises about exactly how you will do
>> which ones.

>> Today, dict only cares about ==, and you (reasonably) think that full
>> == isn't always worth running ... but when it comes to which tests
>> *are* worth running, I'm not confident that the answers won't change
>> over the years.

> I checked, currently there is no unit test for a==b, only for a is b.
> I will add add a test for a==b but a is not b, and ensure that the
> version is increased.

Again, why?  Why not just say "If an object is replaced by something
equal to itself, the version_tag may not be changed.  While the
initial heuristics are simply to check for identity but not full
equality, this may change in future releases."

>> For example, if I know that my dict values are all 4-digit integers,
>> can I write:
>>
>> d[k]  = d[k] + 0
>>
>> and be assured that the version_tag will bump?  Or is that something
>> that a future optimizer might optimize out?

> Hum, I will try to clarify that.

I would prefer that you clarify it to say that while the initial patch
doesn't optimize that out, a future optimizer might.

> The problem with storing an identifier (a pointer in C) with no strong
> reference is when the object is destroyed, a new object can likely get
> the same identifier. So it's likely that "dict[key] is old_value_id"
> can be true even if dict[key] is now a new object.

Yes, but it shouldn't actually be destroyed until it is removed from
the dict, which should change version_tag, so that there will be no
need to compare it.

> Do you want to modify the PEP 509 to fix this issue? Or you don't
> understand why the PEP 509 cannot be used to fix the issue? I'm
> lost...

I believe it *does* fix the issue in some (but not all) cases.

-jJ
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Victor Stinner
.2016-04-15 23:45 GMT+02:00 Jim J. Jewett :
>> It's an useful property. For example, let's say that you have a guard
>> on globals()['value']. The guard is created with value=3. An unit test
>> replaces the value with 50, but then restore the value to its previous
>> value (3). Later, the guard is checked to decide if an optimization
>> can be used.
>
>> If the dict version is increased, you need a lookup. If the dict
>> version is not increased, the guard is cheap.
>
> I would expect the version to be increased twice, and therefore to
> require a lookup.  Are you suggesting that unittest should provide an
> example of resetting the version back to the original value when it
> cleans up after itself?

Sorry, as I wrote in another email that I was wrong. If you modify the
value, the version is increased. The discussed case is really a corner
case: the version does not change if the key is set again to exactly
the same value.

d[key] = value
d[key] = value

It's just that it's cheap to implement it :-)


>> In C, it's very cheap to implement the test "new_value == old_value",
>> it just compares two pointers.
>
> Yeah, I understand that it is likely a win in terms of performance,
> and a good way to start off (given that you're willing to do the
> work).
>
> I just worry that you may end up closing off even better optimizations
> later, if you make too many promises about exactly how you will do
> which ones.
>
> Today, dict only cares about ==, and you (reasonably) think that full
> == isn't always worth running ... but when it comes to which tests
> *are* worth running, I'm not confident that the answers won't change
> over the years.

I checked, currently there is no unit test for a==b, only for a is b.
I will add add a test for a==b but a is not b, and ensure that the
version is increased.


>>> [2A] Do you want to promise that replacing a value with a
>>> non-identical object *will* trigger a version_tag update *even*
>>> if the objects are equal?
>
>> It's already written in the PEP:
>
> I read that as a description of what the code does, rather than a spec
> for what it should do... so it isn't clear whether I could count on
> that remaining true.
>
> For example, if I know that my dict values are all 4-digit integers,
> can I write:
>
> d[k]  = d[k] + 0
>
> and be assured that the version_tag will bump?  Or is that something
> that a future optimizer might optimize out?

Hum, I will try to clarify that.


>>> (4)  Please be explicit about the locking around version++; it
>>> is enough to say that the relevant methods already need to hold
>>> the GIL (assuming that is true).
>
>> I don't think that it's important to mention it in the PEP. It's more
>> an implementation detail. The version can be protected by atomic
>> operations.
>
> Now I'm the one arguing from a specific implementation.  :D
>
> My thought was that any sort of locking (including atomic operations)
> is slow, but if the GIL is already held, then there is no *extra*
> locking cost. (Well, a slightly longer hold on the lock, but...)

Hum, since the PEP clarify targets CPython, I will simply described
its implementation, so explain that the GIL ensures that version++ is
atomic.


>>> On the one hand, you never need a strong reference to the value;
>>> if it has been collected, then it has obviously been removed from
>>> the dict and should trigger a change even with per-dict.
>>
>> Let's say that you watch the key1 of a dict. The key2 is modified, it
>> increases the version. Later, you test the guard: to check if the key1
>> was modified, you need to lookup the key and compare the value. You
>> need the value to compare it.
>
> And the value for key1 is still there, so you can.

Sorry, how do you want to compare that dict[key1] value didn't change,
using the value identifier? dict[key1] is old_value_id?

The problem with storing an identifier (a pointer in C) with no strong
reference is when the object is destroyed, a new object can likely get
the same identifier. So it's likely that "dict[key] is old_value_id"
can be true even if dict[key] is now a new object.


> The only reason you would notice that the key2 value had gone away is
> if you also care about key2 -- in which case the cached value is out
> of date, regardless of what specific value it used to hold.

I don't understand, technically, what do you mean by "out of date" for
an object?


>> If the dictionary values are modified during the loop, the dict
>> version is increased. But it's allowed to modify values when you
>> iterate on *keys*.
>
> Sure.  So?
>
> I see three cases:
>
> (A)  I don't care that the collection changed.  The python
> implementation might, but I don't.  (So no bug even today.)

I'm sorry, I don't understand your description. What do you mean by
"collection"? It's different if you modify dict *keys*, or dict
*values*, or both.

Serhiy opened an issue because he wants to raise an exception if keys
are modified while you iterate 

Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Jim J. Jewett
On Fri, Apr 15, 2016 at 4:41 PM, Victor Stinner
 wrote:
> 2016-04-15 19:54 GMT+02:00 Jim J. Jewett :

>> (2)  Why *promise* not to update the version_tag when replacing a
>> value with itself?

> It's an useful property. For example, let's say that you have a guard
> on globals()['value']. The guard is created with value=3. An unit test
> replaces the value with 50, but then restore the value to its previous
> value (3). Later, the guard is checked to decide if an optimization
> can be used.

> If the dict version is increased, you need a lookup. If the dict
> version is not increased, the guard is cheap.

I would expect the version to be increased twice, and therefore to
require a lookup.  Are you suggesting that unittest should provide an
example of resetting the version back to the original value when it
cleans up after itself?

> In C, it's very cheap to implement the test "new_value == old_value",
> it just compares two pointers.

Yeah, I understand that it is likely a win in terms of performance,
and a good way to start off (given that you're willing to do the
work).

I just worry that you may end up closing off even better optimizations
later, if you make too many promises about exactly how you will do
which ones.

Today, dict only cares about ==, and you (reasonably) think that full
== isn't always worth running ... but when it comes to which tests
*are* worth running, I'm not confident that the answers won't change
over the years.

>> [2A] Do you want to promise that replacing a value with a
>> non-identical object *will* trigger a version_tag update *even*
>> if the objects are equal?

> It's already written in the PEP:

I read that as a description of what the code does, rather than a spec
for what it should do... so it isn't clear whether I could count on
that remaining true.

For example, if I know that my dict values are all 4-digit integers,
can I write:

d[k]  = d[k] + 0

and be assured that the version_tag will bump?  Or is that something
that a future optimizer might optimize out?

>> (3)  It is worth being explicit on whether empty dicts can share
>> a version_tag of 0.  If this PEP is about dict content, then that
>> seems fine, and it may well be worth optimizing dict creation.

> This is not part of the PEP yet. I'm not sure that I will modify the
> PEP to use the version 0 for empty dictionaries. Antoine doesn't seem
> to be convinced :-)

True.  But do note that "not hitting the global counter an extra time
for every dict creation" is a more compelling reason than "we could
speed up dict.clear(), sometimes".


>> (4)  Please be explicit about the locking around version++; it
>> is enough to say that the relevant methods already need to hold
>> the GIL (assuming that is true).

> I don't think that it's important to mention it in the PEP. It's more
> an implementation detail. The version can be protected by atomic
> operations.

Now I'm the one arguing from a specific implementation.  :D

My thought was that any sort of locking (including atomic operations)
is slow, but if the GIL is already held, then there is no *extra*
locking cost. (Well, a slightly longer hold on the lock, but...)

>> (5)  I'm not sure I understand the arguments around a per-entry
>> version.

>> On the one hand, you never need a strong reference to the value;
>> if it has been collected, then it has obviously been removed from
>> the dict and should trigger a change even with per-dict.
>
> Let's say that you watch the key1 of a dict. The key2 is modified, it
> increases the version. Later, you test the guard: to check if the key1
> was modified, you need to lookup the key and compare the value. You
> need the value to compare it.

And the value for key1 is still there, so you can.

The only reason you would notice that the key2 value had gone away is
if you also care about key2 -- in which case the cached value is out
of date, regardless of what specific value it used to hold.

>> (6)  I'm also not sure why version_tag *doesn't* solve the problem
>> of dicts that fool the iteration guards by mutating without changing
>> size ( https://bugs.python.org/issue19332 ) ... are you just saying
>> that the iterator views aren't allowed to rely on the version-tag
>> remaining stable, because replacing a value (as opposed to a
>> key-value pair) is allowed?

> If the dictionary values are modified during the loop, the dict
> version is increased. But it's allowed to modify values when you
> iterate on *keys*.

Sure.  So?

I see three cases:

(A)  I don't care that the collection changed.  The python
implementation might, but I don't.  (So no bug even today.)

(B)  I want to process exactly the collection that I started with.  If
some of the values get replaced, then I want to complain, even if
python doesn't.  version_tag is what I want.

(C)  I want to process exactly the original keys, but go ahead and use
updated values.  The bug still bites, but ... I don't think 

Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Victor Stinner
Hi,

FYI I updated the implementation of the PEP 509:
https://bugs.python.org/issue26058


2016-04-15 11:01 GMT+02:00 Antoine Pitrou :
> Why do this? It's a nice property that two dicts always have different
> version tags, and now you're killing this property for... no obvious
> reason?
>
> Do you really think dict.clear() is in need of micro-optimizing a
> couple CPU cycles away?

So, I played with Armin's idea. I confirm that it works for my use
case, guards on dict keys. It should also work on Yury's use case.

Antoine is right, it's really a micro-optimization. It shouldn't help
much for the integer overflow (which is not a real issue in practice).

I propose to leave the PEP unchanged to keep the nice property of
unique identifier for empty dictionaries. It can help for future use
cases.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Victor Stinner
2016-04-15 23:16 GMT+02:00 Ethan Furman :
>> It's an useful property. For example, let's say that you have a guard
>> on globals()['value']. The guard is created with value=3. An unit test
>> replaces the value with 50, but then restore the value to its previous
>> value (3). Later, the guard is checked to decide if an optimization
>> can be used.
>
> I don't understand -- shouldn't the version be incremented with the value
> was replaced with 50, and again when re-replaced with 3?

Oh wait, I'm tired and you are right.

Not increasing the value only helps on this code:

dict[key] = value
dict[key] = value  # version doesn't change


>> If the dictionary values are modified during the loop, the dict
>> version is increased. But it's allowed to modify values when you
>> iterate on *keys*.
>
> I don't understand.  Could you provide a small example?

For example, this loop is fine:

for key in dict:
   dict[key] = None

In this loop, the dict version is increased at each loop iteration.

For iter(dict), the check prevents a crash.

The following example raises a RuntimeError("dictionary changed size
during iteration"):

d={1:2}
for k in d:
d[k+1] = None

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Victor Stinner
2016-04-15 23:07 GMT+02:00 Random832 :
> Why is iterating over items different from iterating over keys?
>
> in other words, why do I have to write:
>
> for k in dict:
> v = dict[k]
> ...do some stuff...
> dict[k] = something
>
> rather than
>
> for k, v in dict.items():
> ...do some stuff...
> dict[k] = something
>
> It's not clear why the latter is something you want to prevent.

Hum, I think that you misunderstood what should be prevented. Please
see https://bugs.python.org/issue19332

Sorry, I don't know well this issue. I just know that sadly the PEP
509 doesn't help to fix this issue. Maybe it's not worth to mention
it...

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Ethan Furman

On 04/15/2016 01:41 PM, Victor Stinner wrote:

2016-04-15 19:54 GMT+02:00 Jim J. Jewett:



(2)  Why *promise* not to update the version_tag when replacing a
value with itself?


It's an useful property. For example, let's say that you have a guard
on globals()['value']. The guard is created with value=3. An unit test
replaces the value with 50, but then restore the value to its previous
value (3). Later, the guard is checked to decide if an optimization
can be used.


I don't understand -- shouldn't the version be incremented with the 
value was replaced with 50, and again when re-replaced with 3?




(6)  I'm also not sure why version_tag *doesn't* solve the problem
of dicts that fool the iteration guards by mutating without changing
size ( https://bugs.python.org/issue19332 ) ... are you just saying
that the iterator views aren't allowed to rely on the version-tag
remaining stable, because replacing a value (as opposed to a
key-value pair) is allowed?


If the dictionary values are modified during the loop, the dict
version is increased. But it's allowed to modify values when you
iterate on *keys*.


I don't understand.  Could you provide a small example?

--
~Ethan~

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Random832
On Fri, Apr 15, 2016, at 16:41, Victor Stinner wrote:
> If the dictionary values are modified during the loop, the dict
> version is increased. But it's allowed to modify values when you
> iterate on *keys*.

Why is iterating over items different from iterating over keys?

in other words, why do I have to write:

for k in dict:
v = dict[k]
...do some stuff...
dict[k] = something

rather than

for k, v in dict.items():
...do some stuff...
dict[k] = something


It's not clear why the latter is something you want to prevent.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Victor Stinner
2016-04-15 19:54 GMT+02:00 Jim J. Jewett :
> (1)  Meta Question:  If this is really only for CPython, then is
> "Standards Track" the right classification?

Yes, I think so. It doesn't seem to be an Informal nor a Process:
https://www.python.org/dev/peps/pep-0001/#pep-types


> (2)  Why *promise* not to update the version_tag when replacing a
> value with itself?

It's an useful property. For example, let's say that you have a guard
on globals()['value']. The guard is created with value=3. An unit test
replaces the value with 50, but then restore the value to its previous
value (3). Later, the guard is checked to decide if an optimization
can be used.

If the dict version is increased, you need a lookup. If the dict
version is not increased, the guard is cheap.

In C, it's very cheap to implement the test "new_value == old_value",
it just compares two pointers.

If an overhead is visible, I can drop it from the PEP, and implement
the check in the guard.


>  Isn't that the sort of quality-of-implementation
> issue that got pushed to a note for objects that happen to be
> represented as singletons, such as small integers or ASCII chars?

I prefer to require this property.


> [2A] Do you want to promise that replacing a value with a
> non-identical object *will* trigger a version_tag update *even*
> if the objects are equal?

It's already written in the PEP:

"The version is not incremented if an existing key is set to the same
value. For efficiency, values are compared by their identity:
new_value is old_value , not by their content: new_value == old_value
."


> (3)  It is worth being explicit on whether empty dicts can share
> a version_tag of 0.  If this PEP is about dict content, then that
> seems fine, and it may well be worth optimizing dict creation.

This is not part of the PEP yet. I'm not sure that I will modify the
PEP to use the version 0 for empty dictionaries. Antoine doesn't seem
to be convinced :-)


> (4)  Please be explicit about the locking around version++; it
> is enough to say that the relevant methods already need to hold
> the GIL (assuming that is true).

I don't think that it's important to mention it in the PEP. It's more
an implementation detail. The version can be protected by atomic
operations.


> (5)  I'm not sure I understand the arguments around a per-entry
> version.

It doesn't matter since I don't want this option :-)


> On the one hand, you never need a strong reference to the value;
> if it has been collected, then it has obviously been removed from
> the dict and should trigger a change even with per-dict.

Let's say that you watch the key1 of a dict. The key2 is modified, it
increases the version. Later, you test the guard: to check if the key1
was modified, you need to lookup the key and compare the value. You
need the value to compare it.

> On the other hand, I'm not sure per-entry would really allow
> finer-grained guards to avoid lookups; just because an entry hasn't
> been modified doesn't prove it hasn't been moved to another location,
> perhaps by replacing a dummy in a slot it would have preferred.

The main advantage of per-entry version is to avoid the strong
reference to values.

According to my tests, the drawbacks are too important to take this
option. I prefer a simple version per dictionary.


> (6)  I'm also not sure why version_tag *doesn't* solve the problem
> of dicts that fool the iteration guards by mutating without changing
> size ( https://bugs.python.org/issue19332 ) ... are you just saying
> that the iterator views aren't allowed to rely on the version-tag
> remaining stable, because replacing a value (as opposed to a
> key-value pair) is allowed?

If the dictionary values are modified during the loop, the dict
version is increased. But it's allowed to modify values when you
iterate on *keys*.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Oscar Benjamin
On 15 April 2016 at 18:54, Jim J. Jewett  wrote:
>
> [2A] Do you want to promise that replacing a value with a
> non-identical object *will* trigger a version_tag update *even*
> if the objects are equal?
>
> I would vote no, but I realize backwards-compatibility may create
> such a promise implicitly.

It needs to trigger a version update. Equality  doesn't guarantee any
kind of equivalence in Python. It's not even guaranteed that a==b will
come to the same value if evaluated twice in a row.

An example:

>>> from fractions import Fraction as F
>>> F(1) == 1
True
>>> d = globals()
>>> d['a'] = F(1)
>>> a.limit_denominator()
Fraction(1, 1)
>>> d['a'] = 1
>>> a.limit_denominator()
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'int' object has no attribute 'limit_denominator'

--
Oscar
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Jim J. Jewett

On Thu Apr 14 11:19:42 EDT 2016, Victor Stinner posted the latest
draft of PEP 509; dict version_tag

(1)  Meta Question:  If this is really only for CPython, then is
"Standards Track" the right classification?

(2)  Why *promise* not to update the version_tag when replacing a
value with itself?  Isn't that the sort of quality-of-implementation
issue that got pushed to a note for objects that happen to be
represented as singletons, such as small integers or ASCII chars?

I think it is a helpful optimization, and worth documenting ... I
just think it should be at the layer of "this particular patch",
rather than something that sounds like part of the contract.

e.g.,

... The global version is also incremented and copied to the
dictionary version at each dictionary change.  The following
dict methods can trigger changes:

* ``clear()`` 
* ``pop(key)``
* ``popitem()`` 
* ``setdefault(key, value)`` 
* ``__detitem__(key)`` 
* ``__setitem__(key, value)`` 
* ``update(...)``

.. note::  As a quality of implementation issue, the actual patch
does not increment the version_tag when it can prove that there
was no actual change.  For example, clear() on an already-empty
dict will not trigger a version_tag change, nor will updating a
dict with itself, since the values will be unchanged.  For efficiency,
the analysis considers only object identity (not equality) when
deciding whether to increment the version_tag.

[2A] Do you want to promise that replacing a value with a
non-identical object *will* trigger a version_tag update *even*
if the objects are equal?

I would vote no, but I realize backwards-compatibility may create
such a promise implicitly.

(3)  It is worth being explicit on whether empty dicts can share
a version_tag of 0.  If this PEP is about dict content, then that
seems fine, and it may well be worth optimizing dict creation.

There are times when it is important to keep the same empty dict;
I can't think of any use cases where it is important to verify
that some *other* code has done so, *and* I can't get a reference
to the correct dict for an identity check.

(4)  Please be explicit about the locking around version++; it
is enough to say that the relevant methods already need to hold
the GIL (assuming that is true).

(5)  I'm not sure I understand the arguments around a per-entry
version.

On the one hand, you never need a strong reference to the value;
if it has been collected, then it has obviously been removed from
the dict and should trigger a change even with per-dict.

On the other hand, I'm not sure per-entry would really allow
finer-grained guards to avoid lookups; just because an entry hasn't
been modified doesn't prove it hasn't been moved to another location,
perhaps by replacing a dummy in a slot it would have preferred.

(6)  I'm also not sure why version_tag *doesn't* solve the problem
of dicts that fool the iteration guards by mutating without changing
size ( https://bugs.python.org/issue19332 ) ... are you just saying
that the iterator views aren't allowed to rely on the version-tag
remaining stable, because replacing a value (as opposed to a
key-value pair) is allowed?

I had always viewed the failing iterators as a supporting-this-case-
makes-the-code-too-slow-and-ugly limitation, rather than a data
integrity check.  When I do care about the data not changing,
(an exposed variant of) version_tag is as likely to be what I want as
a hypothetical keys_version_tag would be. 

-jJ

--

If there are still threading problems with my replies, please
email me with details, so that I can try to resolve them.  -jJ
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Victor Stinner
2016-04-15 11:01 GMT+02:00 Antoine Pitrou :
> Victor Stinner  gmail.com> writes:
>> You're right that incrementing the global version is useless for these
>> specific cases, and using the version 0 should work. It only matters
>> that the version (version? version tag?) is different.
>
> Why do this? It's a nice property that two dicts always have different
> version tags, and now you're killing this property for... no obvious
> reason?

I guess that the reason is to reduce *a little bit* the risk of
integer overflow (especially the bug when a guard doesn't see a change
between new_version = old_version % 2**64).

> Do you really think dict.clear() is in need of micro-optimizing a
> couple CPU cycles away?

The advantage of having a different version for empty dict is to be
able to use the version to check that they are different. Using the
dictionary pointer is not enough, since it's common that a new
dictionary gets the address of a previously destroyed dictionary. This
case can be avoided if you keep dictionaries alive by keeping a strong
reference, but there are good reasons to not keep a strong reference.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Stefan Behnel
Victor Stinner schrieb am 15.04.2016 um 10:20:
> Le vendredi 15 avril 2016, Stefan Behnel a écrit :
> 
>> How can that be achieved? If the tag is just a sequentially growing number,
>> creating two dicts and applying one operation to the first one should give
>> both the same version tag, right?
>>
> 
> Armin didn't propose to get ride of the global version.
> 
> a = dict() # version = 0
> b = dict() # version = 0
> a['key'] = 'value' # version = 300
> b['key'] = 'value' # version = 301

Ah, sorry, should have read the PEP more closely. It's *always* the global
version that gets incremented. Then yes, that's a safe point of distinction
for dicts and their status.

Stefan


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Antoine Pitrou
Victor Stinner  gmail.com> writes:
> 
> Hi,
> 
> 2016-04-14 22:42 GMT+02:00 Armin Rigo  tunes.org>:
> > Hi Victor,
> >
> > On 14 April 2016 at 17:19, Victor Stinner 
gmail.com> wrote:
> >> Each time a dictionary is created, the global
> >> version is incremented and the dictionary version is initialized to the
> >> global version.
> >
> > A detail, but why not set the version tag of new empty dictionaries to
> > zero, always?   Same after a clear().  This would satisfy the
> > condition: equality of the version tag is supposed to mean "the
> > dictionary content is precisely the same".
> 
> You're right that incrementing the global version is useless for these
> specific cases, and using the version 0 should work. It only matters
> that the version (version? version tag?) is different.

Why do this? It's a nice property that two dicts always have different
version tags, and now you're killing this property for... no obvious
reason?

Do you really think dict.clear() is in need of micro-optimizing a
couple CPU cycles away?

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-15 Thread Victor Stinner
Le vendredi 15 avril 2016, Stefan Behnel  a écrit :

> How can that be achieved? If the tag is just a sequentially growing number,
> creating two dicts and applying one operation to the first one should give
> both the same version tag, right?
>

Armin didn't propose to get ride of the global version.

a = dict() # version = 0
b = dict() # version = 0
a['key'] = 'value' # version = 300
b['key'] = 'value' # version = 301

Victor
PS: It looks like the iPad Gmail app foces me to use HTML, I don't know how
to use plain text :-/
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Stefan Behnel
Victor Stinner schrieb am 15.04.2016 um 00:33:
> 2016-04-15 0:22 GMT+02:00 Brett Cannon:
>> And even if it was GIL-free you do run the risk of two dicts ending up at
>> the same version # by simply mutating the same number of times if the
>> counters were per-dict instead of process-wide.
> 
> For some optimizations, it is not needed to check if the dictionary
> was replaced, or you check it directly. So it doesn't matter to have
> the same version with the same number of operations.
> 
> For the use case of Yury's optimization, having a globally unique
> version tag makes the guard much cheaper, and the guard must check
> that the dictionary was not replaced.

How can that be achieved? If the tag is just a sequentially growing number,
creating two dicts and applying one operation to the first one should give
both the same version tag, right?

Stefan


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Brett Cannon
On Thu, Apr 14, 2016, 17:14 MRAB  wrote:

> On 2016-04-14 21:42, Armin Rigo wrote:
> > Hi Victor,
> >
> > On 14 April 2016 at 17:19, Victor Stinner 
> wrote:
> >> Each time a dictionary is created, the global
> >> version is incremented and the dictionary version is initialized to the
> >> global version.
> >
> > A detail, but why not set the version tag of new empty dictionaries to
> > zero, always?   Same after a clear().  This would satisfy the
> > condition: equality of the version tag is supposed to mean "the
> > dictionary content is precisely the same".
> >
> If you did that, wouldn't it then be possible to replace an empty dict
> with another empty dict with you noticing?


If you meant to say "without" then yes.


Would that matter?
>

Nope because this is about versioining content, so having identical/empty
content compare equal is fine.

-brett


> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/brett%40python.org
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread MRAB

On 2016-04-14 21:42, Armin Rigo wrote:

Hi Victor,

On 14 April 2016 at 17:19, Victor Stinner  wrote:

Each time a dictionary is created, the global
version is incremented and the dictionary version is initialized to the
global version.


A detail, but why not set the version tag of new empty dictionaries to
zero, always?   Same after a clear().  This would satisfy the
condition: equality of the version tag is supposed to mean "the
dictionary content is precisely the same".

If you did that, wouldn't it then be possible to replace an empty dict 
with another empty dict with you noticing? Would that matter?


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Yury Selivanov



On 2016-04-14 4:42 PM, Armin Rigo wrote:

Hi Victor,

On 14 April 2016 at 17:19, Victor Stinner  wrote:

Each time a dictionary is created, the global
version is incremented and the dictionary version is initialized to the
global version.

A detail, but why not set the version tag of new empty dictionaries to
zero, always?   Same after a clear().  This would satisfy the
condition: equality of the version tag is supposed to mean "the
dictionary content is precisely the same".


So

{}.version_tag == {}.version_tag == 0
{'a':1}.version_tag != {'a':1}.version_tag

right?

For my patches I need globally unique version tags
(making an exception for empty dicts is OK).

Yury

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Glenn Linderman

On 4/14/2016 3:33 PM, Victor Stinner wrote:

When we will be able to get ride of the GIL for the dict type, we will
probably be able to get an atomic "global_version++" for 64-bit
integer. Right now, I don't think that an atomic int64++ is available
on 32-bit archs.
By the time we get an atomic increment for 64-bit integer, we'll be 
wanting it for 128-bit...
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Victor Stinner
2016-04-15 0:22 GMT+02:00 Brett Cannon :
> And even if it was GIL-free you do run the risk of two dicts ending up at
> the same version # by simply mutating the same number of times if the
> counters were per-dict instead of process-wide.

For some optimizations, it is not needed to check if the dictionary
was replaced, or you check it directly. So it doesn't matter to have
the same version with the same number of operations.

For the use case of Yury's optimization, having a globally unique
version tag makes the guard much cheaper, and the guard must check
that the dictionary was not replaced.

IMHO it's cheap enough to make the version globally unique. I don't
see any technical drawback of having a globally unique version. It
doesn't make the integer overflow much more likely. We are still
talking about many years before an overflow occurs.

--

When we will be able to get ride of the GIL for the dict type, we will
probably be able to get an atomic "global_version++" for 64-bit
integer. Right now, I don't think that an atomic int64++ is available
on 32-bit archs.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Brett Cannon
On Thu, 14 Apr 2016 at 15:14 Victor Stinner 
wrote:

> 2016-04-14 23:29 GMT+02:00 Barry Warsaw :
> > I can see why you might want a global version number, but not doing so
> would
> > eliminate an implicit reliance on the GIL, or in a GIL-less
> implementation
> >  a lock around incrementing the global version number.
>
> It's not like the builtin dict type is going to become GIL-free... So
> I think that it's ok to use a global version.
>
> A very few know that, but the GIL has some advantages sometimes...
>

And even if it was GIL-free you do run the risk of two dicts ending up at
the same version # by simply mutating the same number of times if the
counters were per-dict instead of process-wide.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Victor Stinner
2016-04-14 23:29 GMT+02:00 Barry Warsaw :
> I can see why you might want a global version number, but not doing so would
> eliminate an implicit reliance on the GIL, or in a GIL-less implementation
>  a lock around incrementing the global version number.

It's not like the builtin dict type is going to become GIL-free... So
I think that it's ok to use a global version.

A very few know that, but the GIL has some advantages sometimes...

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Barry Warsaw
On Apr 14, 2016, at 11:17 PM, Victor Stinner wrote:

>You're right that incrementing the global version is useless for these
>specific cases, and using the version 0 should work. It only matters
>that the version (version? version tag?) is different.
>
>I will play with that. If I don't see any issue, I will update the PEP.
>
>It's more an implementation detail, but it may help to mention it in the PEP.

I can see why you might want a global version number, but not doing so would
eliminate an implicit reliance on the GIL, or in a GIL-less implementation
 a lock around incrementing the global version number.

-Barry
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Victor Stinner
2016-04-14 22:50 GMT+02:00 Barry Warsaw :
> Although I'm not totally convinced, I won't continue to object.  You've
> provided some performance numbers in the PEP even without FAT, and you aren't
> exposing the API to Python, so it's not a burden being imposed on other
> implementations.

Cool!

Ah right, the PEP evolved since its first version sent to
python-ideas. I didn't recall the full context of the discussion. The
PEP is now more complete and it has more known (future) use cases ;-)
(now maybe also Cython?)

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Victor Stinner
Hi,

2016-04-14 22:42 GMT+02:00 Armin Rigo :
> Hi Victor,
>
> On 14 April 2016 at 17:19, Victor Stinner  wrote:
>> Each time a dictionary is created, the global
>> version is incremented and the dictionary version is initialized to the
>> global version.
>
> A detail, but why not set the version tag of new empty dictionaries to
> zero, always?   Same after a clear().  This would satisfy the
> condition: equality of the version tag is supposed to mean "the
> dictionary content is precisely the same".

You're right that incrementing the global version is useless for these
specific cases, and using the version 0 should work. It only matters
that the version (version? version tag?) is different.

I will play with that. If I don't see any issue, I will update the PEP.

It's more an implementation detail, but it may help to mention it in the PEP.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Émanuel Barry
> From Armin Rigo
> Sent: Thursday, April 14, 2016 4:42 PM
> To: Victor Stinner
> Cc: Python Dev
> Subject: Re: [Python-Dev] RFC: PEP 509: Add a private version to dict
> 
> Hi Victor,
> 
> On 14 April 2016 at 17:19, Victor Stinner <victor.stin...@gmail.com> wrote:
> > Each time a dictionary is created, the global
> > version is incremented and the dictionary version is initialized to the
> > global version.
> 
> A detail, but why not set the version tag of new empty dictionaries to
> zero, always?   Same after a clear().  This would satisfy the
> condition: equality of the version tag is supposed to mean "the
> dictionary content is precisely the same".

>From Victor's original post:

"Globally unique identifier is a requirement for Yury's patch
optimizing method calls ( https://bugs.python.org/issue26110 ). It
allows to check for free if the dictionary was replaced."

I think it's a good design idea, and there's no chance that this counter will 
ever overflow (I think Victor is using 64-bit unsigned integer). I don't think 
there's really any drawback to using a global vs per-dict counter (but Victor 
is better placed to answer that :))

-Emanuel

~Ducks lay where no programmer has ever been~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Barry Warsaw
On Apr 14, 2016, at 09:49 PM, Victor Stinner wrote:

>It would be nice to hear Barry Warsow who was opposed to the PEP in
>january. He wanted to wait until FAT Python was proven to really be faster,
>which is still not case right now. (I mean that I didnt't run seriously
>benchmarks, but early macro benchmarks are not really promising, only micro
>benchmarks. I expect better results when the implemenation will be more
>complete.)

Although I'm not totally convinced, I won't continue to object.  You've
provided some performance numbers in the PEP even without FAT, and you aren't
exposing the API to Python, so it's not a burden being imposed on other
implementations.

Cheers,
-Barry


pgpf0nC005Wjr.pgp
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Armin Rigo
Hi Victor,

On 14 April 2016 at 17:19, Victor Stinner  wrote:
> Each time a dictionary is created, the global
> version is incremented and the dictionary version is initialized to the
> global version.

A detail, but why not set the version tag of new empty dictionaries to
zero, always?   Same after a clear().  This would satisfy the
condition: equality of the version tag is supposed to mean "the
dictionary content is precisely the same".


A bientôt,

Armin.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Stefan Behnel
Victor Stinner schrieb am 14.04.2016 um 21:56:
> Which kind of usage do you see in Cython?

Mainly caching, I guess. We could avoid global/module name lookups in some
cases, especially inside of loops.


> Off-topic (PEP 510):
> 
> I really want to experiment automatic generation of Cython code from the
> Python using profiling to discover function parameters types. Then use the
> PEP 510 to attach the fast Cython code to a Python function, but fallback
> to bytecode if the types are different. See the example of builtin
> functions in the PEP:
> https://www.python.org/dev/peps/pep-0510/#using-builtin-function
> 
> Before having something fully automated, we can use some manual steps, like
> annotate manually function types, compile manually the code, etc.

Sounds like Cython's "Fused Types" could help here:

http://docs.cython.org/src/userguide/fusedtypes.html

It's essentially a generic functions implementation and you get a dispatch
either at compile time or runtime, depending on where (Python/Cython) and
how you call a function.

Stefan


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Victor Stinner
Which kind of usage do you see in Cython?

Off-topic (PEP 510):

I really want to experiment automatic generation of Cython code from the
Python using profiling to discover function parameters types. Then use the
PEP 510 to attach the fast Cython code to a Python function, but fallback
to bytecode if the types are different. See the example of builtin
functions in the PEP:
https://www.python.org/dev/peps/pep-0510/#using-builtin-function

Before having something fully automated, we can use some manual steps, like
annotate manually function types, compile manually the code, etc.

Victor

Le jeudi 14 avril 2016, Stefan Behnel  a écrit :

> +1 from me, too. I'm sure we can make some use of this in Cython.
>
> Stefan
>
>
> Victor Stinner schrieb am 14.04.2016 um 17:19:
> > PEP: 509
> > Title: Add a private version to dict
>
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org 
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Victor Stinner
It would be nice to hear Barry Warsow who was opposed to the PEP in
january. He wanted to wait until FAT Python was proven to really be faster,
which is still not case right now. (I mean that I didnt't run seriously
benchmarks, but early macro benchmarks are not really promising, only micro
benchmarks. I expect better results when the implemenation will be more
complete.)

The main change since january is that Yury wrote a patch making method
calls using the PEP.
https://mail.python.org/pipermail/python-dev/2016-January/142772.html

Victor

Le jeudi 14 avril 2016, Guido van Rossum  a écrit :

> I'll wait a day before formally pronouncing to see if any objections
> are made, but it looks good to me.
>
> On Thu, Apr 14, 2016 at 8:19 AM, Victor Stinner
> > wrote:
> > Hi,
> >
> > I updated my PEP 509 to make the dictionary version globally unique.
> > With *two* use cases of this PEP (Yury's method call patch and my FAT
> > Python project), I think that the PEP is now ready to be accepted.
> >
> > Globally unique identifier is a requirement for Yury's patch
> > optimizing method calls ( https://bugs.python.org/issue26110 ). It
> > allows to check for free if the dictionary was replaced.
> >
> > I also renamed the ma_version field to ma_version_tag.
> >
> > HTML version:
> > https://www.python.org/dev/peps/pep-0509/
> >
> > Victor
> >
> >
> > PEP: 509
> > Title: Add a private version to dict
> > Version: $Revision$
> > Last-Modified: $Date$
> > Author: Victor Stinner >
> > Status: Draft
> > Type: Standards Track
> > Content-Type: text/x-rst
> > Created: 4-January-2016
> > Python-Version: 3.6
> >
> >
> > Abstract
> > 
> >
> > Add a new private version to the builtin ``dict`` type, incremented at
> > each dictionary creation and at each dictionary change, to implement
> > fast guards on namespaces.
> >
> >
> > Rationale
> > =
> >
> > In Python, the builtin ``dict`` type is used by many instructions. For
> > example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the
> > global namespace, or in the builtins namespace (two dict lookups).
> > Python uses ``dict`` for the builtins namespace, globals namespace, type
> > namespaces, instance namespaces, etc. The local namespace (namespace of
> > a function) is usually optimized to an array, but it can be a dict too.
> >
> > Python is hard to optimize because almost everything is mutable: builtin
> > functions, function code, global variables, local variables, ... can be
> > modified at runtime. Implementing optimizations respecting the Python
> > semantics requires to detect when "something changes": we will call
> > these checks "guards".
> >
> > The speedup of optimizations depends on the speed of guard checks. This
> > PEP proposes to add a version to dictionaries to implement fast guards
> > on namespaces.
> >
> > Dictionary lookups can be skipped if the version does not change which
> > is the common case for most namespaces. Since the version is globally
> > unique, the version is also enough to check if the namespace dictionary
> > was not replaced with a new dictionary. The performance of a guard does
> > not depend on the number of watched dictionary entries, complexity of
> > O(1), if the dictionary version does not change.
> >
> > Example of optimization: copy the value of a global variable to function
> > constants.  This optimization requires a guard on the global variable to
> > check if it was modified. If the variable is modified, the variable must
> > be loaded at runtime when the function is called, instead of using the
> > constant.
> >
> > See the `PEP 510 -- Specialized functions with guards
> > `_ for the concrete usage of
> > guards to specialize functions and for the rationale on Python static
> > optimizers.
> >
> >
> > Guard example
> > =
> >
> > Pseudo-code of an fast guard to check if a dictionary entry was modified
> > (created, updated or deleted) using an hypothetical
> > ``dict_get_version(dict)`` function::
> >
> > UNSET = object()
> >
> > class GuardDictKey:
> > def __init__(self, dict, key):
> > self.dict = dict
> > self.key = key
> > self.value = dict.get(key, UNSET)
> > self.version = dict_get_version(dict)
> >
> > def check(self):
> > """Return True if the dictionary entry did not changed
> > and the dictionary was not replaced."""
> >
> > # read the version of the dict structure
> > version = dict_get_version(self.dict)
> > if version == self.version:
> > # Fast-path: dictionary lookup avoided
> > return True
> >
> > # lookup in the dictionary
> > value = self.dict.get(self.key, UNSET)
> > if value is self.value:
> > # another key was 

Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Guido van Rossum
I'll wait a day before formally pronouncing to see if any objections
are made, but it looks good to me.

On Thu, Apr 14, 2016 at 8:19 AM, Victor Stinner
 wrote:
> Hi,
>
> I updated my PEP 509 to make the dictionary version globally unique.
> With *two* use cases of this PEP (Yury's method call patch and my FAT
> Python project), I think that the PEP is now ready to be accepted.
>
> Globally unique identifier is a requirement for Yury's patch
> optimizing method calls ( https://bugs.python.org/issue26110 ). It
> allows to check for free if the dictionary was replaced.
>
> I also renamed the ma_version field to ma_version_tag.
>
> HTML version:
> https://www.python.org/dev/peps/pep-0509/
>
> Victor
>
>
> PEP: 509
> Title: Add a private version to dict
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 4-January-2016
> Python-Version: 3.6
>
>
> Abstract
> 
>
> Add a new private version to the builtin ``dict`` type, incremented at
> each dictionary creation and at each dictionary change, to implement
> fast guards on namespaces.
>
>
> Rationale
> =
>
> In Python, the builtin ``dict`` type is used by many instructions. For
> example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the
> global namespace, or in the builtins namespace (two dict lookups).
> Python uses ``dict`` for the builtins namespace, globals namespace, type
> namespaces, instance namespaces, etc. The local namespace (namespace of
> a function) is usually optimized to an array, but it can be a dict too.
>
> Python is hard to optimize because almost everything is mutable: builtin
> functions, function code, global variables, local variables, ... can be
> modified at runtime. Implementing optimizations respecting the Python
> semantics requires to detect when "something changes": we will call
> these checks "guards".
>
> The speedup of optimizations depends on the speed of guard checks. This
> PEP proposes to add a version to dictionaries to implement fast guards
> on namespaces.
>
> Dictionary lookups can be skipped if the version does not change which
> is the common case for most namespaces. Since the version is globally
> unique, the version is also enough to check if the namespace dictionary
> was not replaced with a new dictionary. The performance of a guard does
> not depend on the number of watched dictionary entries, complexity of
> O(1), if the dictionary version does not change.
>
> Example of optimization: copy the value of a global variable to function
> constants.  This optimization requires a guard on the global variable to
> check if it was modified. If the variable is modified, the variable must
> be loaded at runtime when the function is called, instead of using the
> constant.
>
> See the `PEP 510 -- Specialized functions with guards
> `_ for the concrete usage of
> guards to specialize functions and for the rationale on Python static
> optimizers.
>
>
> Guard example
> =
>
> Pseudo-code of an fast guard to check if a dictionary entry was modified
> (created, updated or deleted) using an hypothetical
> ``dict_get_version(dict)`` function::
>
> UNSET = object()
>
> class GuardDictKey:
> def __init__(self, dict, key):
> self.dict = dict
> self.key = key
> self.value = dict.get(key, UNSET)
> self.version = dict_get_version(dict)
>
> def check(self):
> """Return True if the dictionary entry did not changed
> and the dictionary was not replaced."""
>
> # read the version of the dict structure
> version = dict_get_version(self.dict)
> if version == self.version:
> # Fast-path: dictionary lookup avoided
> return True
>
> # lookup in the dictionary
> value = self.dict.get(self.key, UNSET)
> if value is self.value:
> # another key was modified:
> # cache the new dictionary version
> self.version = version
> return True
>
> # the key was modified
> return False
>
>
> Usage of the dict version
> =
>
> Speedup method calls 1.2x
> -
>
> Yury Selivanov wrote a `patch to optimize method calls
> `_. The patch depends on the
> `implement per-opcode cache in ceval
> `_ patch which requires dictionary
> versions to invalidate the cache if the globals dictionary or the
> builtins dictionary has been modified.
>
> The cache also requires that the dictionary version is globally unique.
> It is possible to define a function in a namespace and call it
> in a different namespace: using ``exec()`` with the *globals* 

Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Stefan Behnel
+1 from me, too. I'm sure we can make some use of this in Cython.

Stefan


Victor Stinner schrieb am 14.04.2016 um 17:19:
> PEP: 509
> Title: Add a private version to dict


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Victor Stinner
2016-04-14 18:28 GMT+02:00 Brett Cannon :
> +1 from me!

Thanks.

> A couple of grammar/typo suggestions below.

Fixed. (Yes, I want to use unsigned type, so PY_UINT64_T.)

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Brett Cannon
+1 from me!

A couple of grammar/typo suggestions below.

On Thu, 14 Apr 2016 at 08:20 Victor Stinner 
wrote:

> Hi,
>
> I updated my PEP 509 to make the dictionary version globally unique.
> With *two* use cases of this PEP (Yury's method call patch and my FAT
> Python project), I think that the PEP is now ready to be accepted.
>
> Globally unique identifier is a requirement for Yury's patch
> optimizing method calls ( https://bugs.python.org/issue26110 ). It
> allows to check for free if the dictionary was replaced.
>
> I also renamed the ma_version field to ma_version_tag.
>
> HTML version:
> https://www.python.org/dev/peps/pep-0509/
>
> Victor
>
>
> PEP: 509
> Title: Add a private version to dict
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 4-January-2016
> Python-Version: 3.6
>
>
> Abstract
> 
>
> Add a new private version to the builtin ``dict`` type, incremented at
> each dictionary creation and at each dictionary change, to implement
> fast guards on namespaces.
>
>
> Rationale
> =
>
> In Python, the builtin ``dict`` type is used by many instructions. For
> example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the
> global namespace, or in the builtins namespace (two dict lookups).
> Python uses ``dict`` for the builtins namespace, globals namespace, type
> namespaces, instance namespaces, etc. The local namespace (namespace of
> a function) is usually optimized to an array, but it can be a dict too.
>
> Python is hard to optimize because almost everything is mutable: builtin
> functions, function code, global variables, local variables, ... can be
> modified at runtime. Implementing optimizations respecting the Python
> semantics requires to detect when "something changes": we will call
> these checks "guards".
>
> The speedup of optimizations depends on the speed of guard checks. This
> PEP proposes to add a version to dictionaries to implement fast guards
> on namespaces.
>
> Dictionary lookups can be skipped if the version does not change which
> is the common case for most namespaces. Since the version is globally
> unique, the version is also enough to check if the namespace dictionary
> was not replaced with a new dictionary. The performance of a guard does
> not depend on the number of watched dictionary entries, complexity of
> O(1), if the dictionary version does not change.
>
> Example of optimization: copy the value of a global variable to function
> constants.  This optimization requires a guard on the global variable to
> check if it was modified. If the variable is modified, the variable must
> be loaded at runtime when the function is called, instead of using the
> constant.
>
> See the `PEP 510 -- Specialized functions with guards
> `_ for the concrete usage of
> guards to specialize functions and for the rationale on Python static
> optimizers.
>
>
> Guard example
> =
>
> Pseudo-code of an fast guard to check if a dictionary entry was modified
> (created, updated or deleted) using an hypothetical
> ``dict_get_version(dict)`` function::
>
> UNSET = object()
>
> class GuardDictKey:
> def __init__(self, dict, key):
> self.dict = dict
> self.key = key
> self.value = dict.get(key, UNSET)
> self.version = dict_get_version(dict)
>
> def check(self):
> """Return True if the dictionary entry did not changed
> and the dictionary was not replaced."""
>

"did not change"


>
> # read the version of the dict structure
> version = dict_get_version(self.dict)
> if version == self.version:
> # Fast-path: dictionary lookup avoided
> return True
>
> # lookup in the dictionary
> value = self.dict.get(self.key, UNSET)
> if value is self.value:
> # another key was modified:
> # cache the new dictionary version
> self.version = version
> return True
>
> # the key was modified
> return False
>
>
> Usage of the dict version
> =
>
> Speedup method calls 1.2x
> -
>
> Yury Selivanov wrote a `patch to optimize method calls
> `_. The patch depends on the
> `implement per-opcode cache in ceval
> `_ patch which requires dictionary
> versions to invalidate the cache if the globals dictionary or the
> builtins dictionary has been modified.
>
> The cache also requires that the dictionary version is globally unique.
> It is possible to define a function in a namespace and call it
> in a different namespace: using ``exec()`` with the *globals* parameter
> for example. In this 

[Python-Dev] RFC: PEP 509: Add a private version to dict

2016-04-14 Thread Victor Stinner
Hi,

I updated my PEP 509 to make the dictionary version globally unique.
With *two* use cases of this PEP (Yury's method call patch and my FAT
Python project), I think that the PEP is now ready to be accepted.

Globally unique identifier is a requirement for Yury's patch
optimizing method calls ( https://bugs.python.org/issue26110 ). It
allows to check for free if the dictionary was replaced.

I also renamed the ma_version field to ma_version_tag.

HTML version:
https://www.python.org/dev/peps/pep-0509/

Victor


PEP: 509
Title: Add a private version to dict
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6


Abstract


Add a new private version to the builtin ``dict`` type, incremented at
each dictionary creation and at each dictionary change, to implement
fast guards on namespaces.


Rationale
=

In Python, the builtin ``dict`` type is used by many instructions. For
example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the
global namespace, or in the builtins namespace (two dict lookups).
Python uses ``dict`` for the builtins namespace, globals namespace, type
namespaces, instance namespaces, etc. The local namespace (namespace of
a function) is usually optimized to an array, but it can be a dict too.

Python is hard to optimize because almost everything is mutable: builtin
functions, function code, global variables, local variables, ... can be
modified at runtime. Implementing optimizations respecting the Python
semantics requires to detect when "something changes": we will call
these checks "guards".

The speedup of optimizations depends on the speed of guard checks. This
PEP proposes to add a version to dictionaries to implement fast guards
on namespaces.

Dictionary lookups can be skipped if the version does not change which
is the common case for most namespaces. Since the version is globally
unique, the version is also enough to check if the namespace dictionary
was not replaced with a new dictionary. The performance of a guard does
not depend on the number of watched dictionary entries, complexity of
O(1), if the dictionary version does not change.

Example of optimization: copy the value of a global variable to function
constants.  This optimization requires a guard on the global variable to
check if it was modified. If the variable is modified, the variable must
be loaded at runtime when the function is called, instead of using the
constant.

See the `PEP 510 -- Specialized functions with guards
`_ for the concrete usage of
guards to specialize functions and for the rationale on Python static
optimizers.


Guard example
=

Pseudo-code of an fast guard to check if a dictionary entry was modified
(created, updated or deleted) using an hypothetical
``dict_get_version(dict)`` function::

UNSET = object()

class GuardDictKey:
def __init__(self, dict, key):
self.dict = dict
self.key = key
self.value = dict.get(key, UNSET)
self.version = dict_get_version(dict)

def check(self):
"""Return True if the dictionary entry did not changed
and the dictionary was not replaced."""

# read the version of the dict structure
version = dict_get_version(self.dict)
if version == self.version:
# Fast-path: dictionary lookup avoided
return True

# lookup in the dictionary
value = self.dict.get(self.key, UNSET)
if value is self.value:
# another key was modified:
# cache the new dictionary version
self.version = version
return True

# the key was modified
return False


Usage of the dict version
=

Speedup method calls 1.2x
-

Yury Selivanov wrote a `patch to optimize method calls
`_. The patch depends on the
`implement per-opcode cache in ceval
`_ patch which requires dictionary
versions to invalidate the cache if the globals dictionary or the
builtins dictionary has been modified.

The cache also requires that the dictionary version is globally unique.
It is possible to define a function in a namespace and call it
in a different namespace: using ``exec()`` with the *globals* parameter
for example. In this case, the globals dictionary was changed and the
cache must be invalidated.


Specialized functions using guards
--

The `PEP 510 -- Specialized functions with guards
`_ proposes an API to support
specialized functions with guards. It allows to implement static
optimizers for Python without breaking the Python