Re: [Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?

2017-01-15 Thread Armin Rigo
Hi,

Sorry to reply in this old thread.  We just noticed this on #pypy:

On 22 October 2016 at 05:32, Nick Coghlan wrote:
> The weakref-before-__del__ ordering change in
> https://www.python.org/dev/peps/pep-0442/#disposal-of-cyclic-isolates
> only applies to cyclic garbage collection, so for normal refcount
> driven object cleanup in CPython, the __del__ still happens first:
>
> >>> class C:
> ...     def __del__(self):
> ...         print("__del__ called")
> ...
> >>> c = C()
> >>> import weakref
> >>> def cb():
> ...     print("weakref callback called")
> ...
> >>> weakref.finalize(c, cb)
> 
> >>> del c
> __del__ called
> weakref callback called

Nick, it seems you're suggesting that before PEP 442 (which landed in
CPython 3.4) the __del__ happened before the weakref-clearing operation
as well.  That's not the case: before CPython 3.4, weakrefs were always
cleared first.  The situation became muddier in 3.4, where, in the
absence of reference cycles, weakrefs are usually cleared after the
__del__ is called (so it's a backward-incompatible change).  If there
are reference cycles, then the weakref is cleared before the __del__
is called.

This can be shown in your example by replacing "weakref.finalize(c,
cb)" with an old-style "wr = weakref.ref(c, cb)" (with cb changed to
accept the weakref as its single argument).  Then CPython <= 3.3
and >= 3.4 print the two lines in opposite order.
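
A minimal sketch of that variant, for anyone who wants to try it on their own interpreter. The names `run_demo` and `events` are made up for illustration; the events are collected in a list rather than printed so the order is easy to inspect. Per the discussion above, CPython 3.4 and later should record `__del__` first when no reference cycle is involved, and other implementations may differ.

```python
import weakref


class C:
    def __init__(self, events):
        self.events = events

    def __del__(self):
        self.events.append("__del__")


def run_demo():
    events = []

    def cb(ref):
        # Old-style weakref callbacks receive the weakref object itself.
        events.append("weakref callback")

    c = C(events)
    wr = weakref.ref(c, cb)  # keep wr alive, or the callback never fires
    del c                    # refcount drops to zero; no cycle is involved
    return events


if __name__ == "__main__":
    print(run_demo())
```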


A bientôt,

Armin.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?

2016-10-22 Thread Nathaniel Smith
On Sat, Oct 22, 2016 at 3:01 AM, Nick Coghlan wrote:
> On 22 October 2016 at 16:05, Nathaniel Smith wrote:
>> On Fri, Oct 21, 2016 at 8:32 PM, Nick Coghlan wrote:
>> But PEP 442 already broke all that :-). Now weakref callbacks can
>> happen before __del__, and they can happen on objects that are about
>> to be resurrected.
>
> Right, but the resurrection can still only happen *in* __del__, so the
> interpreter doesn't need to deal with the case where it happens in a
> weakref callback instead - that's where the freedom to do the
> callbacks and the __del__ in either order comes from.

I think we're probably on the same page here, but to be clear, my
point is that right now the resurrection logic seems to be (a) run
some arbitrary Python code (__del__), (b) run a second check to see if
a resurrection occurred (and the details of that check depend on
whether the object is part of a cyclic isolate). Since these two
phases are already decoupled from each other, it shouldn't cause any
particular difficulty for the interpreter if we add weakref callbacks
to the "run arbitrary code" phase. If we wanted to.
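
As a small illustration of the two phases described above, here is a sketch of resurrection from `__del__` on CPython 3.4+. The names `Phoenix`, `graveyard` and `calls` are hypothetical. It relies on the PEP 442 behavior that a finalizer runs at most once for a GC-tracked object, so the second time the object becomes unreachable it is simply deallocated.

```python
graveyard = []  # hypothetical module-level list used to resurrect objects
calls = []      # records each time __del__ runs


class Phoenix:
    def __del__(self):
        calls.append("__del__")
        graveyard.append(self)  # new strong reference: the object is resurrected


def run_demo():
    calls.clear()
    graveyard.clear()
    p = Phoenix()
    del p                                # (a) __del__ runs, (b) resurrection is detected
    resurrected = len(graveyard) == 1    # the object survived its first teardown
    graveyard.clear()                    # drop the last reference again
    return resurrected, list(calls)      # __del__ is not expected to run a second time


if __name__ == "__main__":
    print(run_demo())
```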

>> There remains one obscure corner case where multiple resurrection is
>> possible, because the resurrection-prevention flag doesn't exist on
>> non-GC objects, so you'd still be able to take new weakrefs to those.
>> But in that case __del__ can already do multiple resurrections, and
>> some fellow named Nick Coghlan seemed to think that was okay back in
>> 2013 [1], so probably it's not too bad ;-).
>>
>> [1] https://mail.python.org/pipermail/python-dev/2013-June/126850.html
>
> Right, that still doesn't bother me.
>
>>> Changing that to support resurrecting the object so it can be passed
>>> into the callback without the callback itself holding a strong
>>> reference means losing the main "reasoning about software" benefit
>>> that weakref callbacks offer: they currently can't resurrect the
>>> object they relate to (since they never receive a strong reference to
>>> it), so it nominally doesn't matter if the interpreter calls them
>>> before or after that object has been entirely cleaned up.
>>
>> I guess I'm missing the importance of this -- does the interpreter
>> gain some particular benefit from having flexibility about when to
>> fire weakref callbacks? Obviously it has to pick one in practice.
>
> Sorry, my attempted clarification of one practical implication made it
> look like I was defining the phrase I had in quotes. However, the
> "reasoning about software" benefit I see is "If you don't define
> __del__, you don't need to worry about object resurrection, as it's
> categorically impossible when only using weakref callbacks".
> Interpreter implementors are just one set of beneficiaries of that
> simplification - everyone writing weakref callbacks qualifies as well.

I do like invariants, but I'm having trouble seeing why this one is
super valuable. I mean, if your object doesn't define __del__, then
it's also impossible to distinguish between a weakref causing
resurrection and a strong reference that prevents the object from
being collected in the first place. And certainly it's harmless in the
use case I have in mind, where normally the weakref would be created
in the object's __init__ anyway :-).

> However, if you're happy defining __del__ methods, then PEP 442 means
> you can already inject lazy cyclic cleanup that supports resurrection:
>
> >>> class Target:
> ...     pass
> ...
> >>> class Resurrector:
> ...     def __init__(self, target):
> ...         _self_ref = "_resurrector_{:d}".format(id(self))
> ...         self.target = target
> ...         setattr(target, _self_ref, self)
> ...     def __del__(self):
> ...         globals()["resurrected"] = self.target
> ...
> >>> obj = Target()
> >>> Resurrector(obj)
> <__main__.Resurrector object at 0x7f42f8ae34e0>
> >>> del obj
> >>> resurrected
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'resurrected' is not defined
> >>> import gc
> >>> gc.collect(); gc.collect(); gc.collect()
> 6
> 4
> 0
> >>> resurrected
> <__main__.Target object at 0x7f42f8ae3438>
>
> Given that, I don't see a lot of benefit in making weakref callbacks
> harder to reason about when __del__ + attribute injection already
> makes this possible.

That's a cute trick :-). But it does have one major downside compared
to allowing weakref callbacks to access the object normally. With
weakrefs you don't interfere with when the object is normally
collected, and in particular for objects that aren't part of cycles,
they're still collected promptly (on CPython). Here every object
becomes part of a cycle, so objects that would otherwise be collected
promptly won't be.
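
A quick way to see the difference on CPython, as a sketch with made-up names (`Node`, `is_collected`): an object outside any cycle dies as soon as its last reference goes away, while an object that is part of a cycle lingers until the cyclic collector runs. Automatic collection is disabled during the check so it cannot blur the result.

```python
import gc
import weakref


class Node:
    pass


def is_collected(make_cycle):
    """Return (dead right after del, dead after an explicit gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector out of the picture
    try:
        obj = Node()
        if make_cycle:
            obj.me = obj  # self-reference: the object is now part of a cycle
        ref = weakref.ref(obj)
        del obj
        dead_after_del = ref() is None
        gc.collect()
        dead_after_gc = ref() is None
        return dead_after_del, dead_after_gc
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print("no cycle:", is_collected(False))
    print("cycle:   ", is_collected(True))
```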

(Remember that the reason I started thinking about this was that I was
wondering if we could have a nice API for the 

Re: [Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?

2016-10-22 Thread Nick Coghlan
On 22 October 2016 at 16:05, Nathaniel Smith wrote:
> On Fri, Oct 21, 2016 at 8:32 PM, Nick Coghlan wrote:
> But PEP 442 already broke all that :-). Now weakref callbacks can
> happen before __del__, and they can happen on objects that are about
> to be resurrected.

Right, but the resurrection can still only happen *in* __del__, so the
interpreter doesn't need to deal with the case where it happens in a
weakref callback instead - that's where the freedom to do the
callbacks and the __del__ in either order comes from.

> There remains one obscure corner case where multiple resurrection is
> possible, because the resurrection-prevention flag doesn't exist on
> non-GC objects, so you'd still be able to take new weakrefs to those.
> But in that case __del__ can already do multiple resurrections, and
> some fellow named Nick Coghlan seemed to think that was okay back in
> 2013 [1], so probably it's not too bad ;-).
>
> [1] https://mail.python.org/pipermail/python-dev/2013-June/126850.html

Right, that still doesn't bother me.

>> Changing that to support resurrecting the object so it can be passed
>> into the callback without the callback itself holding a strong
>> reference means losing the main "reasoning about software" benefit
>> that weakref callbacks offer: they currently can't resurrect the
>> object they relate to (since they never receive a strong reference to
>> it), so it nominally doesn't matter if the interpreter calls them
>> before or after that object has been entirely cleaned up.
>
> I guess I'm missing the importance of this -- does the interpreter
> gain some particular benefit from having flexibility about when to
> fire weakref callbacks? Obviously it has to pick one in practice.

Sorry, my attempted clarification of one practical implication made it
look like I was defining the phrase I had in quotes. However, the
"reasoning about software" benefit I see is "If you don't define
__del__, you don't need to worry about object resurrection, as it's
categorically impossible when only using weakref callbacks".
Interpreter implementors are just one set of beneficiaries of that
simplification - everyone writing weakref callbacks qualifies as well.

However, if you're happy defining __del__ methods, then PEP 442 means
you can already inject lazy cyclic cleanup that supports resurrection:

>>> class Target:
...     pass
...
>>> class Resurrector:
...     def __init__(self, target):
...         _self_ref = "_resurrector_{:d}".format(id(self))
...         self.target = target
...         setattr(target, _self_ref, self)
...     def __del__(self):
...         globals()["resurrected"] = self.target
...
>>> obj = Target()
>>> Resurrector(obj)
<__main__.Resurrector object at 0x7f42f8ae34e0>
>>> del obj
>>> resurrected
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'resurrected' is not defined
>>> import gc
>>> gc.collect(); gc.collect(); gc.collect()
6
4
0
>>> resurrected
<__main__.Target object at 0x7f42f8ae3438>

Given that, I don't see a lot of benefit in making weakref callbacks
harder to reason about when __del__ + attribute injection already
makes this possible.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?

2016-10-22 Thread Nathaniel Smith
On Fri, Oct 21, 2016 at 8:32 PM, Nick Coghlan wrote:
> On 21 October 2016 at 17:09, Nathaniel Smith wrote:
>> But that was 2.4. In the meantime, of course, PEP 442 fixed it so
>> that finalizers and weakrefs mix just fine. In fact, weakref callbacks
>> are now run *before* __del__ methods [2], so clearly it's now okay for
>> arbitrary code to touch the objects during that phase of the GC -- at
>> least in principle.
>>
>> So what I'm wondering is, would anything terrible happen if we started
>> passing still-live weakrefs into weakref callbacks, and then clearing
>> them afterwards?
>
> The weakref-before-__del__ ordering change in
> https://www.python.org/dev/peps/pep-0442/#disposal-of-cyclic-isolates
> only applies to cyclic garbage collection, so for normal refcount
> driven object cleanup in CPython, the __del__ still happens first:
>
> >>> class C:
> ...     def __del__(self):
> ...         print("__del__ called")
> ...
> >>> c = C()
> >>> import weakref
> >>> def cb():
> ...     print("weakref callback called")
> ...
> >>> weakref.finalize(c, cb)
> 
> >>> del c
> __del__ called
> weakref callback called

Ah, interesting! And in the old days this was of course the right way
to do it, because until __del__ has completed it's possible that the
object will get resurrected, and you don't want to clear the weakref
until you're certain that it's dead.

But PEP 442 already broke all that :-). Now weakref callbacks can
happen before __del__, and they can happen on objects that are about
to be resurrected. So if we wanted to pursue this then it seems like
it would make sense to standardize on the following sequence for
object teardown:

0) object becomes collectible (either refcount == 0 or it's part of a
   cyclic isolate)
1) weakref callbacks fire
2) weakrefs are cleared (unconditionally, so we keep the rule that any
   given weakref fires at most once, even if the object is resurrected)
3) if _PyGC_REFS_MASK_FINALIZED isn't set, __del__ fires, and then
   _PyGC_REFS_MASK_FINALIZED is set
4) check for resurrection
5) deallocate the object

On further thought, this does still introduce one new edge case, which
is that even if we keep the guarantee that no individual weakref can
fire more than once, it's possible for *new* weakrefs to be registered
after resurrection, so it becomes possible for an object to be
resurrected multiple times. (Currently, resurrection can only happen
once, because __del__ is disabled on resurrected objects and weakrefs
can't resurrect at all.) I'm not actually sure that this is even a
problem, but in any case it's easy to fix by making a rule that you
can't take a weakref to an object whose _PyGC_REFS_MASK_FINALIZED flag
is already set, plus adjust the teardown sequence to be:

0) object becomes collectible (either refcount == 0 or it's part of a
   cyclic isolate)
1) if _PyGC_REFS_MASK_FINALIZED is set, then go to step 7. Otherwise:
2) set _PyGC_REFS_MASK_FINALIZED
3) weakref callbacks fire
4) weakrefs are cleared (unconditionally)
5) __del__ fires
6) check for resurrection
7) deallocate the object

There remains one obscure corner case where multiple resurrection is
possible, because the resurrection-prevention flag doesn't exist on
non-GC objects, so you'd still be able to take new weakrefs to those.
But in that case __del__ can already do multiple resurrections, and
some fellow named Nick Coghlan seemed to think that was okay back in
2013 [1], so probably it's not too bad ;-).

[1] https://mail.python.org/pipermail/python-dev/2013-June/126850.html

> This means the main problem with a strong reference being reachable
> from the weakref callback object remains: if the callback itself is
> reachable, then the original object is reachable, and you don't have a
> collectible cycle anymore.
>
> >>> c = C()
> >>> def cb2(obj):
> ...     print("weakref callback called with object reference")
> ...
> >>> weakref.finalize(c, cb2, c)
> 
> >>> del c
> >>>
>
> Changing that to support resurrecting the object so it can be passed
> into the callback without the callback itself holding a strong
> reference means losing the main "reasoning about software" benefit
> that weakref callbacks offer: they currently can't resurrect the
> object they relate to (since they never receive a strong reference to
> it), so it nominally doesn't matter if the interpreter calls them
> before or after that object has been entirely cleaned up.

I guess I'm missing the importance of this -- does the interpreter
gain some particular benefit from having flexibility about when to
fire weakref callbacks? Obviously it has to pick one in practice.

(The async use case that got me thinking about this is, of course,
exactly one where we would want a weakref callback to resurrect the
object it refers to. Only once, though.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

Re: [Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?

2016-10-21 Thread Nick Coghlan
On 21 October 2016 at 17:09, Nathaniel Smith wrote:
> But that was 2.4. In the meantime, of course, PEP 442 fixed it so
> that finalizers and weakrefs mix just fine. In fact, weakref callbacks
> are now run *before* __del__ methods [2], so clearly it's now okay for
> arbitrary code to touch the objects during that phase of the GC -- at
> least in principle.
>
> So what I'm wondering is, would anything terrible happen if we started
> passing still-live weakrefs into weakref callbacks, and then clearing
> them afterwards?

The weakref-before-__del__ ordering change in
https://www.python.org/dev/peps/pep-0442/#disposal-of-cyclic-isolates
only applies to cyclic garbage collection, so for normal refcount
driven object cleanup in CPython, the __del__ still happens first:

>>> class C:
...     def __del__(self):
...         print("__del__ called")
...
>>> c = C()
>>> import weakref
>>> def cb():
...     print("weakref callback called")
...
>>> weakref.finalize(c, cb)

>>> del c
__del__ called
weakref callback called

This means the main problem with a strong reference being reachable
from the weakref callback object remains: if the callback itself is
reachable, then the original object is reachable, and you don't have a
collectible cycle anymore.

>>> c = C()
>>> def cb2(obj):
...     print("weakref callback called with object reference")
...
>>> weakref.finalize(c, cb2, c)

>>> del c
>>>

Changing that to support resurrecting the object so it can be passed
into the callback without the callback itself holding a strong
reference means losing the main "reasoning about software" benefit
that weakref callbacks offer: they currently can't resurrect the
object they relate to (since they never receive a strong reference to
it), so it nominally doesn't matter if the interpreter calls them
before or after that object has been entirely cleaned up.
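
The "strong reference keeps the object alive" effect can be checked directly. This is a sketch with illustrative names (`run_demo`, `called`); it uses `finalize.alive` and `finalize.detach()` from the standard `weakref` module, and detaches the finalizer at the end so nothing is left registered for interpreter exit.

```python
import gc
import weakref


class C:
    pass


def run_demo():
    called = []
    c = C()
    ref = weakref.ref(c)
    # Passing c as an argument makes the finalizer hold a strong reference to it.
    fin = weakref.finalize(c, called.append, c)
    del c
    gc.collect()
    still_alive = ref() is not None and fin.alive
    fin.detach()   # unregister the finalizer (and its strong reference) without calling it
    gc.collect()
    return still_alive, called, ref() is None


if __name__ == "__main__":
    print(run_demo())
```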

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


[Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?

2016-10-21 Thread Nathaniel Smith
Hi all,

It's an old feature of the weakref API that you can define an
arbitrary callback to be invoked when the referenced object dies, and
that when this callback is invoked, it gets handed the weakref wrapper
object -- BUT, only after it's been cleared, so that the callback
can't access the originally referenced object. (I.e., this callback
will never raise: def callback(ref): assert ref() is None.)
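
A minimal sketch of that behavior (the names `Thing`, `run_demo` and `seen` are illustrative): the callback records what it gets back from calling the weakref, which should be None because the reference has already been cleared by the time the callback runs.

```python
import weakref


class Thing:
    pass


def run_demo():
    seen = []

    def callback(ref):
        # Record what the callback can still see of the referent.
        seen.append(ref())

    obj = Thing()
    wr = weakref.ref(obj, callback)  # keep wr alive so the callback can fire
    del obj
    return seen


if __name__ == "__main__":
    print(run_demo())
```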

AFAICT the original motivation for this seems to have been that if the weakref
callback could get at the object, then the weakref callback would
effectively be another finalizer like __del__, and finalizers and
reference cycles don't mix, so weakref callbacks can't be finalizers.
There's a long document from the 2.4 days about all the terrible
things that could happen if arbitrary code like callbacks could get
unfettered access to cyclic isolates at weakref cleanup time [1].

But that was 2.4. In the meantime, of course, PEP 442 fixed it so
that finalizers and weakrefs mix just fine. In fact, weakref callbacks
are now run *before* __del__ methods [2], so clearly it's now okay for
arbitrary code to touch the objects during that phase of the GC -- at
least in principle.

So what I'm wondering is, would anything terrible happen if we started
passing still-live weakrefs into weakref callbacks, and then clearing
them afterwards? (i.e. making step 1 of the PEP 442 cleanup order be
"run callbacks and then clear weakrefs", instead of the current "clear
weakrefs and then run callbacks"). I skimmed through the PEP 442
discussion, and AFAICT the rationale for keeping the old weakref
behavior was just that no-one could be bothered to mess with it [3].

[The motivation for my question is partly curiosity, and partly that
in the discussion about how to handle GC for async objects, it
occurred to me that it might be very nice if arbitrary classes that
needed access to the event loop during cleanup could do something like

  def __init__(self, ...):
      loop = asyncio.get_event_loop()
      loop.gc_register(self)

  # automatically called by the loop when I am GC'ed; async equivalent
  # of __del__
  async def aclose(self):
      ...

Right now something *sort* of like this is possible but it requires a
much more cumbersome API, where every class would have to implement
logic to fetch a cleanup callback from the loop, store it, and then
call it from its __del__ method -- like how PEP 525 does it. Delaying
weakref clearing would make this simpler API possible.]
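
For comparison, the more cumbersome style that works today might look roughly like the sketch below. This is not the proposed `gc_register` API (which does not exist); `Resource`, `closed` and `run_demo` are hypothetical names. Each class remembers its loop and, from `__del__`, schedules its own `aclose()` on that loop, which resurrects the object until the cleanup task finishes. It assumes the object is created while the loop is running and that the loop keeps running long enough for the task to complete.

```python
import asyncio

closed = []  # hypothetical log of resources whose async cleanup has run


class Resource:
    def __init__(self, name):
        self.name = name
        self._loop = asyncio.get_running_loop()  # must be created inside the loop

    async def aclose(self):
        await asyncio.sleep(0)  # stand-in for real asynchronous cleanup work
        closed.append(self.name)

    def __del__(self):
        # The coroutine keeps a reference to self, so the object is resurrected
        # until the cleanup task has finished.
        if not self._loop.is_closed():
            self._loop.call_soon_threadsafe(self._loop.create_task, self.aclose())


async def main():
    closed.clear()
    r = Resource("r1")
    del r                      # triggers __del__, which schedules aclose()
    await asyncio.sleep(0.01)  # give the loop time to run the cleanup task
    return list(closed)


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

Every class wanting this behavior has to repeat the `__del__` boilerplate, which is the cost the simpler API above is meant to remove.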

-n

[1] https://github.com/python/cpython/blob/master/Modules/gc_weakref.txt
[2] https://www.python.org/dev/peps/pep-0442/#id7
[3] https://mail.python.org/pipermail/python-dev/2013-May/126592.html

-- 
Nathaniel J. Smith -- https://vorpus.org