Re: [Python-Dev] PEP 442: Safe object finalization

2013-06-03 Thread Antoine Pitrou
On Sun, 2 Jun 2013 19:27:49 -0700
Benjamin Peterson benja...@python.org wrote:
 2013/5/18 Antoine Pitrou solip...@pitrou.net:
 
  Hello,
 
  I would like to submit the following PEP for discussion and evaluation.
 
 Will the API of the gc module be at all affected? I assume nothing
 will just be printed for DEBUG_UNCOLLECTABLE.

Objects with tp_del may still exist (third-party extensions perhaps).

 Maybe there should be a
 way to discover when a cycle is resurrected?

Is it more important than discovering when a non-cycle is resurrected?

Regards

Antoine.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 442: Safe object finalization

2013-06-03 Thread Maciej Fijalkowski
On Sat, May 18, 2013 at 10:33 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Sat, 18 May 2013 16:22:55 +0200
 Armin Rigo ar...@tunes.org wrote:
 Hi Antoine,

 On Sat, May 18, 2013 at 3:45 PM, Antoine Pitrou solip...@pitrou.net wrote:
  How is this done?  I don't see a clear way to determine it by looking
  only at the objects in the CI, given that arbitrary modifications of
  the object graph may have occurred.
 
  The same way a generation is traversed, but restricted to the CI.
 
  First the gc_refs field of each CI object is initialized to its
  ob_refcnt (again).
 
  Then, tp_traverse is called on each CI object, and each visited
  CI object has its gc_refs decremented. This subtracts CI-internal
  references from the gc_refs fields.
 
  At the end of the traversal, if all CI objects have their gc_refs equal
  to 0, then the CI has no external reference to it and can be cleared.
  If at least one CI object has non-zero gc_refs, the CI cannot be
  cleared.

 Ok, indeed.  Then you really should call finalizers only once: in case
 one of the finalizers in a cycle did a trivial change like I
 described, the algorithm above will conservatively assume the cycle
 should be kept alive.  At the next GC collection we must not call the
 finalizer again, because it's likely to just do a similar trivial
 change.

 Well, the finalizer will only be called if the resurrected object is
 dereferenced again; otherwise the object won't be considered by the GC.
 So, this will only happen if someone keeps trying to destroy a
 resurrected object.

 Calling finalizers only once is fine with me, but it would be a change
 in behaviour; I don't know if it may break existing code.

 (for example, say someone is using __del__ to manage a freelist)

 Regards

 Antoine.

PyPy already only ever calls finalizers once. If you resurrect an object,
it'll be alive, but its finalizer will not be called again. We
discussed a few changes a while ago and we decided (I think it was even
debated on python-dev) that even such behavior is correct:

* you have a reference cycle A <-> B, C references A. C references itself.

* you collect the stuff. We do topological order, so C finalizer is
called first (they're only undefined inside a cycle)

* then A and B finalizers are called in undefined order, even if C
finalizer would resurrect C.

* no more finalizers for those objects are called

I'm not sure if it's cool for CPython or not to do such changes

Cheers,
fijal


Re: [Python-Dev] PEP 442: Safe object finalization

2013-06-02 Thread Benjamin Peterson
2013/5/18 Antoine Pitrou solip...@pitrou.net:
 Calling finalizers only once is fine with me, but it would be a change
 in behaviour; I don't know if it may break existing code.

I agree with Armin that this is better behavior. (Most significantly, it is
consistent with weakrefs.)


 (for example, say someone is using __del__ to manage a freelist)

Do you know if it breaks any of the projects you tested it with?


--
Regards,
Benjamin


Re: [Python-Dev] PEP 442: Safe object finalization

2013-06-02 Thread Benjamin Peterson
2013/5/18 Antoine Pitrou solip...@pitrou.net:

 Hello,

 I would like to submit the following PEP for discussion and evaluation.

Will the API of the gc module be at all affected? I assume nothing
will just be printed for DEBUG_UNCOLLECTABLE. Maybe there should be a
way to discover when a cycle is resurrected?


--
Regards,
Benjamin


[Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Antoine Pitrou

Hello,

I would like to submit the following PEP for discussion and evaluation.

Regards

Antoine.



PEP: 442
Title: Safe object finalization
Version: $Revision$
Last-Modified: $Date$
Author: Antoine Pitrou solip...@pitrou.net
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2013-05-18
Python-Version: 3.4
Post-History:
Resolution: TBD


Abstract
========

This PEP proposes to deal with the current limitations of object
finalization.  The goal is to be able to define and run finalizers
for any object, regardless of their position in the object graph.

This PEP doesn't call for any change in Python code.  Objects
with existing finalizers will benefit automatically.


Definitions
===========

Reference
    A directional link from an object to another.  The target of the
    reference is kept alive by the reference, as long as the source is
    itself alive and the reference isn't cleared.

Weak reference
    A directional link from an object to another, which doesn't keep
    alive its target.  This PEP focusses on non-weak references.

Reference cycle
    A cyclic subgraph of directional links between objects, which keeps
    those objects from being collected in a pure reference-counting
    scheme.

Cyclic isolate (CI)
    A reference cycle in which no object is referenced from outside the
    cycle *and* whose objects are still in a usable, non-broken state:
    they can access each other from their respective finalizers.

Cyclic garbage collector (GC)
    A device able to detect cyclic isolates and turn them into cyclic
    trash.  Objects in cyclic trash are eventually disposed of by
    the natural effect of the references being cleared and their
    reference counts dropping to zero.

Cyclic trash (CT)
    A reference cycle, or former reference cycle, in which no object
    is referenced from outside the cycle *and* whose objects have
    started being cleared by the GC.  Objects in cyclic trash are
    potential zombies; if they are accessed by Python code, the symptoms
    can vary from weird AttributeErrors to crashes.

Zombie / broken object
    An object part of cyclic trash.  The term stresses that the object
    is not safe: its outgoing references may have been cleared, or one
    of the objects it references may be zombie.  Therefore,
    it should not be accessed by arbitrary code (such as finalizers).

Finalizer
    A function or method called when an object is intended to be
    disposed of.  The finalizer can access the object and release any
    resource held by the object (for example mutexes or file
    descriptors).  An example is a ``__del__`` method.

Resurrection
    The process by which a finalizer creates a new reference to an
    object in a CI.  This can happen as a quirky but supported
    side-effect of ``__del__`` methods.


Impact
======

While this PEP discusses CPython-specific implementation details, the
change in finalization semantics is expected to affect the Python
ecosystem as a whole.  In particular, this PEP obsoletes the current
guideline that objects with a ``__del__`` method should not be part of
a reference cycle.


Benefits
========

The primary benefits of this PEP regard objects with finalizers, such
as objects with a ``__del__`` method and generators with a ``finally``
block.  Those objects can now be reclaimed when they are part of a
reference cycle.

The PEP also paves the way for further benefits:

* The module shutdown procedure may not need to set global variables to
  None anymore.  This could solve a well-known class of irritating
  issues.

The PEP doesn't change the semantics of:

* Weak references caught in reference cycles.

* C extension types with a custom ``tp_dealloc`` function.


Description
===========

Reference-counted disposal
--------------------------

In normal reference-counted disposal, an object's finalizer is called
just before the object is deallocated.  If the finalizer resurrects
the object, deallocation is aborted.

*However*, if the object was already finalized, then the finalizer isn't
called.  This prevents us from finalizing zombies (see below).

Disposal of cyclic isolates
---------------------------

Cyclic isolates are first detected by the garbage collector, and then
disposed of.  The detection phase doesn't change and won't be described
here.  Disposal of a CI traditionally works in the following order:

1. Weakrefs to CI objects are cleared, and their callbacks called. At
   this point, the objects are still safe to use.

2. The CI becomes a CT as the GC systematically breaks all
   known references inside it (using the ``tp_clear`` function).

3. Nothing.  All CT objects should have been disposed of in step 2
   (as a side-effect of clearing references); this collection is
   finished.

This PEP proposes to turn CI disposal into the following sequence (new
steps are in bold):

1. Weakrefs to CI objects are cleared, and their callbacks called. At
   this point, the objects are still safe to 

Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Nick Coghlan
On Sat, May 18, 2013 at 6:59 PM, Antoine Pitrou solip...@pitrou.net wrote:
 Resurrection
 The process by which a finalizer creates a new reference to an
 object in a CI.  This can happen as a quirky but supported
 side-effect of ``__del__`` methods.

I really like the PEP overall, but could we at least get the option to
have cases of object resurrection spit out a warning? And a clear
rationale for not turning on such a warning by default?

Cheers,
Nick.

--
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Antoine Pitrou
On Sat, 18 May 2013 21:05:48 +1000
Nick Coghlan ncogh...@gmail.com wrote:
 On Sat, May 18, 2013 at 6:59 PM, Antoine Pitrou solip...@pitrou.net wrote:
  Resurrection
  The process by which a finalizer creates a new reference to an
  object in a CI.  This can happen as a quirky but supported
  side-effect of ``__del__`` methods.
 
 I really like the PEP overall, but could we at least get the option to
 have cases of object resurrection spit out a warning? And a clear
 rationale for not turning on such a warning by default?

Where would you put the option?
As for the rationale, it's simply compatibility: resurrection works
without warnings right now :)

Regards

Antoine.


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Nick Coghlan
On Sat, May 18, 2013 at 9:46 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Sat, 18 May 2013 21:05:48 +1000
 Nick Coghlan ncogh...@gmail.com wrote:
 On Sat, May 18, 2013 at 6:59 PM, Antoine Pitrou solip...@pitrou.net wrote:
  Resurrection
  The process by which a finalizer creates a new reference to an
  object in a CI.  This can happen as a quirky but supported
  side-effect of ``__del__`` methods.

 I really like the PEP overall, but could we at least get the option to
 have cases of object resurrection spit out a warning? And a clear
 rationale for not turning on such a warning by default?

 Where would you put the option?
 As for the rationale, it's simply compatibility: resurrection works
 without warnings right now :)

Command line, probably. However, you're right that's something we can
consider later - for the PEP it's enough that it still works, and we
just avoid calling the __del__ method a second time.

Cheers,
Nick.

--
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Antoine Pitrou
On Sat, 18 May 2013 22:51:35 +1000
Nick Coghlan ncogh...@gmail.com wrote:
 On Sat, May 18, 2013 at 9:46 PM, Antoine Pitrou solip...@pitrou.net wrote:
  On Sat, 18 May 2013 21:05:48 +1000
  Nick Coghlan ncogh...@gmail.com wrote:
  On Sat, May 18, 2013 at 6:59 PM, Antoine Pitrou solip...@pitrou.net 
  wrote:
   Resurrection
   The process by which a finalizer creates a new reference to an
   object in a CI.  This can happen as a quirky but supported
   side-effect of ``__del__`` methods.
 
  I really like the PEP overall, but could we at least get the option to
  have cases of object resurrection spit out a warning? And a clear
  rationale for not turning on such a warning by default?
 
  Where would you put the option?
  As for the rationale, it's simply compatibility: resurrection works
  without warnings right now :)
 
 Command line, probably. However, you're right that's something we can
 consider later - for the PEP it's enough that it still works, and we
 just avoid calling the __del__ method a second time.

Actually, the __del__ method is called again on the next destruction
attempt - as mentioned in the PEP:

« Following this scheme, an object's finalizer is always called exactly
once. The only exception is if an object is resurrected: the finalizer
will be called again later. »

I could change it to only ever call __del__ once; it just sounded
more logical to call it each time destruction is attempted.

(this is in contrast to weakrefs, though, which are cleared once and
for all)
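The two policies under discussion can be told apart with a short script. This is only a sketch: the `Phoenix` class and the module-level `survivors` list are names invented for the illustration. The expectation that the counter stays at 1 assumes an interpreter that finalizes each object at most once (which, to my knowledge, is what CPython 3.4 and later ended up doing); under the "call it again on the next destruction attempt" policy quoted above, the counter would reach 2.

```python
survivors = []  # hypothetical global used by the finalizer to resurrect objects


class Phoenix:
    finalizer_calls = 0

    def __del__(self):
        # Count the call, then resurrect the object by storing a new reference.
        Phoenix.finalizer_calls += 1
        survivors.append(self)


p = Phoenix()
del p                 # first destruction attempt: __del__ runs and resurrects
first_attempt = Phoenix.finalizer_calls
survivors.clear()     # second destruction attempt on the resurrected object
second_attempt = Phoenix.finalizer_calls

print(first_attempt, second_attempt)
```

Comparing the two counters shows whether the second destruction attempt triggered the finalizer again.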

Regards

Antoine.


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Armin Rigo
Hi Antoine,

On Sat, May 18, 2013 at 10:59 AM, Antoine Pitrou solip...@pitrou.net wrote:
 Cyclic isolate (CI)
 A reference cycle in which no object is referenced from outside the
 cycle *and* whose objects are still in a usable, non-broken state:
 they can access each other from their respective finalizers.

Does this definition include more complicated cases?  For example:

A -> B -> A   and   A -> C -> A

Neither cycle is isolated.  If there is no reference from outside,
then the set of all three objects is isolated, but isn't strictly a
cycle.  I think the term is strongly connected component.

 1. Weakrefs to CI objects are cleared, and their callbacks called. At
this point, the objects are still safe to use.

 2. **The finalizers of all CI objects are called.**

You need to be very careful about what each call to a finalizer can do
to the object graph.  It may already be what you're doing, but the
most careful solution is to collect in 1. the complete list of
objects with finalizers that are in cycles; then incref them all; then
call the finalizer of each of them; then decref them all.  Such a
solution gives new cases to think about, which are slightly unexpected
for CPython's model: for example, if you have a cycle A -> B -> A,
let's say the GC calls A.__del__ first; it might cause it to store a
reference to B somewhere else, e.g. in some global; but then the GC
calls B.__del__ anyway.  This is probably fine but should be
considered.
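The "incref them all, finalize them all, decref them all" scheme can be modelled in a few lines of Python. This is a toy sketch, not the collector's code: the `Obj` class, its explicit `finalize()` method (standing in for `__del__`), the `rescued` global and the `finalize_all()` helper are all assumptions made for the illustration. It shows the case described above, where A's finalizer stores B somewhere else and B's finalizer is called anyway.

```python
rescued = []  # hypothetical global in which a finalizer may stash an object


class Obj:
    def __init__(self, name):
        self.name = name
        self.other = None
        self.finalized = False

    def finalize(self):
        # Stand-in for __del__; A's finalizer leaks a reference to its partner.
        self.finalized = True
        if self.name == "A":
            rescued.append(self.other)


def finalize_all(ci_objects):
    # "Incref them all": the list holds a strong reference to every object,
    # so nothing can be deallocated while finalizers are running.
    pending = list(ci_objects)
    for obj in pending:
        obj.finalize()      # called regardless of what earlier finalizers did
    pending.clear()         # "decref them all"


a, b = Obj("A"), Obj("B")
a.other, b.other = b, a
finalize_all([a, b])
print(a.finalized, b.finalized, rescued[0] is b)
```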

 3. **The CI is traversed again to determine if it is still isolated.

How is this done?  I don't see a clear way to determine it by looking
only at the objects in the CI, given that arbitrary modifications of
the object graph may have occurred.  The solution I can think of
doesn't seem robust against minor changes done by the finalizer.  Take
the example A -> lst -> B -> A, where the reference from A to B is
via a list (e.g. there is an attribute A.attr = [B]).  If A.__del__
does the seemingly innocent change of replacing the list with a copy
of itself, e.g. A.attr = A.attr[:], then after the finalizers are
called, lst is gone and we're left with A -> lst2 -> B -> A.
Checking that this cycle is still isolated requires a possibly large
number of checks, as far as I can tell.  This can lead to O(n**2)
behavior if there are n objects in total and O(n) cycles.

The solution seems to be to simply wait for the next GC execution.
Assuming that a finalizer is only called once, this only delays a bit
freeing objects with finalizers in cycles (but your PEP still works to
call finalizers and eventually collect the objects).  Alternatively,
this might be done immediately: in the point 3. above we can forget
everything we found so far, and redo the tracking on all objects (this
time ignoring finalizers that were already called).  In fact, it may
be necessary anyway: anything found before might be invalid after the
finalizers are called, so forgetting it all and redoing the tracking
from scratch seems to be the only way.

 Type objects get a new ``tp_finalize`` slot to which ``__del__`` methods
 are bound.  Generators are also modified to use this slot, rather than
 ``tp_del``.  At the C level, a ``tp_finalize`` function is a normal
 function which will be called with a regular, alive object as its only
 argument.  It should not attempt to revive or collect the object.

Do you mean the opposite in the latest sentence?  ``tp_finalize`` can
do anything...


A bientôt,

Armin.


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Eli Bendersky
Great PEP, I would really like to see this happen as it defines much saner
semantics for finalization than what we currently have. One small question
below:


 This PEP proposes to turn CI disposal into the following sequence (new
 steps are in bold):

 1. Weakrefs to CI objects are cleared, and their callbacks called. At
this point, the objects are still safe to use.

 2. **The finalizers of all CI objects are called.**

 3. **The CI is traversed again to determine if it is still isolated.
If it is determined that at least one object in CI is now reachable
from outside the CI, this collection is aborted and the whole CI
is resurrected.  Otherwise, proceed.**


Not sure if my question is the same as Armin's here, but worth a try: by
saying "the CI is traversed again" do you mean the original objects from
the CI as discovered earlier, or is a new scan being done? What about a new
object entering the CI during step (2)? I.e. the original CI was A->B->A
but now one of the finalizers created some C such that B->C and C->A adding
it to the connected component?

Reading your description in (3) strictly it says: in this case the
collection is aborted. This CI will be disposed next time collection is
run. Is this correct?

Eli




 4. The CI becomes a CT as the GC systematically breaks all
known references inside it (using the ``tp_clear`` function).

 5. Nothing.  All CT objects should have been disposed of in step 4
(as a side-effect of clearing references); this collection is
finished.


Eli


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Antoine Pitrou

Hi Armin,

On Sat, 18 May 2013 15:24:08 +0200
Armin Rigo ar...@tunes.org wrote:
 Hi Antoine,
 
 On Sat, May 18, 2013 at 10:59 AM, Antoine Pitrou solip...@pitrou.net wrote:
  Cyclic isolate (CI)
  A reference cycle in which no object is referenced from outside the
  cycle *and* whose objects are still in a usable, non-broken state:
  they can access each other from their respective finalizers.
 
 Does this definition include more complicated cases?  For example:
 
 A -> B -> A   and   A -> C -> A
 
 Neither cycle is isolated.  If there is no reference from outside,
 then the set of all three objects is isolated, but isn't strictly a
 cycle.  I think the term is strongly connected component.

Yes, I should fix this definition to be more exact.

  1. Weakrefs to CI objects are cleared, and their callbacks called. At
 this point, the objects are still safe to use.
 
  2. **The finalizers of all CI objects are called.**
 
 You need to be very careful about what each call to a finalizer can do
 to the object graph.  It may already be what you're doing, but the
 most careful solution is to collect in 1. the complete list of
 objects with finalizers that are in cycles; then incref them all; then
 call the finalizer of each of them; then decref them all.  Such a
 solution gives new cases to think about, which are slightly unexpected
 for CPython's model: for example, if you have a cycle A -> B -> A,
 let's say the GC calls A.__del__ first; it might cause it to store a
 reference to B somewhere else, e.g. in some global; but then the GC
 calls B.__del__ anyway.  This is probably fine but should be
 considered.

Yes, I know this is possible. My opinion is that it is fine to call B's
finalizer anyway. Calling all finalizers regardless of interim changes
in the object graph also makes things a bit more deterministic:
otherwise, which finalizers are called would depend on the call order,
which is undefined.

  3. **The CI is traversed again to determine if it is still isolated.
 
 How is this done?  I don't see a clear way to determine it by looking
 only at the objects in the CI, given that arbitrary modifications of
 the object graph may have occurred.

The same way a generation is traversed, but restricted to the CI.

First the gc_refs field of each CI object is initialized to its
ob_refcnt (again).

Then, tp_traverse is called on each CI object, and each visited
CI object has its gc_refs decremented. This subtracts CI-internal
references from the gc_refs fields.

At the end of the traversal, if all CI objects have their gc_refs equal
to 0, then the CI has no external reference to it and can be cleared.
If at least one CI object has non-zero gc_refs, the CI cannot be
cleared.
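The check described above can be sketched as a small Python model. It is only an illustration under stated assumptions: objects are plain names, the `ob_refcnt` and `referents` dictionaries stand in for the real reference counts and for what `tp_traverse` would visit, and `ci_still_isolated()` is a name made up here. The second call models the list-copy case raised earlier (`A.attr = A.attr[:]`), where B ends up held by a new list that was never part of the CI.

```python
def ci_still_isolated(ci, ob_refcnt, referents):
    """Return True if no object in `ci` is referenced from outside `ci`."""
    # Step 1: gc_refs starts out as a copy of each object's reference count.
    gc_refs = {obj: ob_refcnt[obj] for obj in ci}
    # Step 2: "tp_traverse" each CI object and subtract CI-internal references.
    for obj in ci:
        for target in referents[obj]:
            if target in gc_refs:
                gc_refs[target] -= 1
    # Step 3: anything left over is a reference coming from outside the CI.
    return all(count == 0 for count in gc_refs.values())


# Before finalizers run: A -> lst -> B -> A, nothing else points in.
before = ci_still_isolated(
    ci={"A", "lst", "B"},
    ob_refcnt={"A": 1, "lst": 1, "B": 1},
    referents={"A": ["lst"], "lst": ["B"], "B": ["A"]},
)

# After A's finalizer copied the list: lst was freed, and B is now held
# by lst2, which is not part of the original CI.
after_copy = ci_still_isolated(
    ci={"A", "B"},
    ob_refcnt={"A": 1, "B": 1},
    referents={"A": ["lst2"], "B": ["A"]},
)

print(before, after_copy)  # True False
```

In this model the copy makes B look externally referenced, so the restricted traversal errs on the side of keeping the CI alive.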

 Alternatively,
 this might be done immediately: in the point 3. above we can forget
 everything we found so far, and redo the tracking on all objects (this
 time ignoring finalizers that were already called).

This would also be more costly, performance-wise. A CI should
generally be quite small, but a whole generation is arbitrarily big.

  Type objects get a new ``tp_finalize`` slot to which ``__del__`` methods
  are bound.  Generators are also modified to use this slot, rather than
  ``tp_del``.  At the C level, a ``tp_finalize`` function is a normal
  function which will be called with a regular, alive object as its only
  argument.  It should not attempt to revive or collect the object.
 
 Do you mean the opposite in the latest sentence?  ``tp_finalize`` can
 do anything...

Not exactly, but I worded it poorly. What I meant is that the C code in
tp_finalize shouldn't *manually* revive the object, since it is called
with an object with a strictly positive refcount.

Regards

Antoine.


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Antoine Pitrou
On Sat, 18 May 2013 06:37:54 -0700
Eli Bendersky eli...@gmail.com wrote:
 Great PEP, I would really like to see this happen as it defines much saner
 semantics for finalization than what we currently have. One small question
 below:
 
 
  This PEP proposes to turn CI disposal into the following sequence (new
  steps are in bold):
 
  1. Weakrefs to CI objects are cleared, and their callbacks called. At
 this point, the objects are still safe to use.
 
  2. **The finalizers of all CI objects are called.**
 
  3. **The CI is traversed again to determine if it is still isolated.
 If it is determined that at least one object in CI is now reachable
 from outside the CI, this collection is aborted and the whole CI
 is resurrected.  Otherwise, proceed.**
 
 
 Not sure if my question is the same as Armin's here, but worth a try: by
 saying "the CI is traversed again" do you mean the original objects from
 the CI as discovered earlier, or is a new scan being done? What about a new
 object entering the CI during step (2)? I.e. the original CI was A->B->A
 but now one of the finalizers created some C such that B->C and C->A adding
 it to the connected component?

It is the original CI which is traversed. If a new reference is
introduced into the reference chain, the traversal in step 3 will
decide to resurrect the CI. This is not necessarily a problem, since
the next GC collection will try collecting again.

 Reading your description in (3) strictly it says: in this case the
 collection is aborted. This CI will be disposed next time collection is
 run. Is this correct?

Yup.

Regards

Antoine.




Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Richard Oudkerk

On 18/05/2013 9:59am, Antoine Pitrou wrote:

This PEP proposes to turn CI disposal into the following sequence (new
steps are in bold):

1. Weakrefs to CI objects are cleared, and their callbacks called. At
this point, the objects are still safe to use.

2. **The finalizers of all CI objects are called.**


How do you know that one of the finalizers will not do something which 
causes another to fail?


Presumably the following would cause an AttributeError to be printed:

import gc

class Node:
    def __init__(self):
        self.next = None
    def __del__(self):
        print(self, self.next)
        del self.next   # break Node object

# Build the cycle a -> b -> a, then drop the only outside references.
a = Node()
b = Node()
a.next = b
b.next = a
del a, b
gc.collect()

Are there less contrived examples which will cause errors where 
currently there are none?


--
Richard



Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Eli Bendersky
On Sat, May 18, 2013 at 6:47 AM, Antoine Pitrou solip...@pitrou.net wrote:

 On Sat, 18 May 2013 06:37:54 -0700
 Eli Bendersky eli...@gmail.com wrote:
  Great PEP, I would really like to see this happen as it defines much
 saner
  semantics for finalization than what we currently have. One small
 question
  below:
 
 
   This PEP proposes to turn CI disposal into the following sequence (new
   steps are in bold):
  
   1. Weakrefs to CI objects are cleared, and their callbacks called. At
  this point, the objects are still safe to use.
  
   2. **The finalizers of all CI objects are called.**
  
   3. **The CI is traversed again to determine if it is still isolated.
  If it is determined that at least one object in CI is now reachable
  from outside the CI, this collection is aborted and the whole CI
  is resurrected.  Otherwise, proceed.**
  
 
  Not sure if my question is the same as Armin's here, but worth a try: by
  saying the CI is traversed again do you mean the original objects from
  the CI as discovered earlier, or is a new scan being done? What about a
 new
  object entering the CI during step (2)? I.e. the original CI was A->B->A
  but now one of the finalizers created some C such that B->C and C->A
 adding
  it to the connected component?

 It is the original CI which is traversed. If a new reference is
 introduced into the reference chain, the traversal in step 3 will
 decide to resurrect the CI. This is not necessarily a problem, since
 the next GC collection will try collecting again.

  Reading your description in (3) strictly it says: in this case the
  collection is aborted. This CI will be disposed next time collection is
  run. Is this correct?

 Yup.


Thanks, this actually makes a lot of sense. It's strictly better than the
current situation where objects with __del__ are never collected. In the
proposed scheme, the weird ones will be delayed and some really weird ones
may never be collected, but the vast majority of __del__ methods do no
resurrection so usually it will just work.
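A quick way to see the difference is to build a two-object cycle whose members define `__del__` and run a collection. This is a minimal sketch: the `Node` class and its `finalized` counter are invented for the illustration, and the outcome described assumes an interpreter that implements the proposal (CPython 3.4 and later, to my knowledge). Under the previous rules the two objects would have been left in `gc.garbage` with their finalizers never called.

```python
import gc


class Node:
    finalized = 0

    def __init__(self):
        self.partner = None

    def __del__(self):
        Node.finalized += 1


a, b = Node(), Node()
a.partner, b.partner = b, a   # build the cycle a -> b -> a
del a, b
gc.collect()

# Any Node left in gc.garbage would be an uncollectable object.
leftovers = [obj for obj in gc.garbage if isinstance(obj, Node)]
print(Node.finalized, leftovers)
```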

This is a great proposal - killer new feature for 3.4 ;-)

Eli


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Antoine Pitrou
On Sat, 18 May 2013 14:56:38 +0100
Richard Oudkerk shibt...@gmail.com wrote:
 On 18/05/2013 9:59am, Antoine Pitrou wrote:
  This PEP proposes to turn CI disposal into the following sequence (new
  steps are in bold):
 
  1. Weakrefs to CI objects are cleared, and their callbacks called. At
  this point, the objects are still safe to use.
 
  2. **The finalizers of all CI objects are called.**
 
 How do you know that one of the finalizers will not do something which 
 causes another to fail?
 
 Presumably the following would cause an AttributeError to be printed:
 
  import gc

  class Node:
      def __init__(self):
          self.next = None
      def __del__(self):
          print(self, self.next)
          del self.next   # break Node object

  a = Node()
  b = Node()
  a.next = b
  b.next = a
  del a, b
  gc.collect()

It works fine:

$ ./python sbt.py 
<__main__.Node object at 0x7f3acbf8f400> <__main__.Node object at 0x7f3acbf8f878>
<__main__.Node object at 0x7f3acbf8f878> <__main__.Node object at 0x7f3acbf8f400>

The reason is that, when you execute del self.next, this removes the
last reference to self.next and destroys it immediately.

In essence, you were expecting to see:
- enter a.__del__, destroy b
- leave a.__del__
- enter b.__del__ oops?

But what happens is:
- enter a.__del__, destroy b
  - enter b.__del__
  - leave b.__del__
- leave a.__del__
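The nesting can be made visible by recording events instead of printing the objects. This is a sketch under assumptions: the `events` list and the `name` attribute are additions made for the illustration, and the nested trace is what I would expect from a CPython build implementing the proposal, where dropping the last reference to the other node destroys it immediately. Which node is finalized first is undefined, so only the shape of the trace matters.

```python
import gc

events = []  # hypothetical trace of finalizer entry and exit


class Node:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __del__(self):
        events.append(("enter", self.name))
        del self.next          # may drop the last reference to the other node
        events.append(("leave", self.name))


a, b = Node("a"), Node("b")
a.next, b.next = b, a
del a, b
gc.collect()

for kind, name in events:
    print(kind, name)
```

If the second finalizer runs inside the first one, the trace reads enter, enter, leave, leave, with the inner pair belonging to the node that was destroyed by `del self.next`.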

Regards

Antoine.




Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Armin Rigo
Hi Antoine,

On Sat, May 18, 2013 at 3:45 PM, Antoine Pitrou solip...@pitrou.net wrote:
 How is this done?  I don't see a clear way to determine it by looking
 only at the objects in the CI, given that arbitrary modifications of
 the object graph may have occurred.

 The same way a generation is traversed, but restricted to the CI.

 First the gc_refs field of each CI object is initialized to its
 ob_refcnt (again).

 Then, tp_traverse is called on each CI object, and each visited
 CI object has its gc_refs decremented. This subtracts CI-internal
 references from the gc_refs fields.

 At the end of the traversal, if all CI objects have their gc_refs equal
 to 0, then the CI has no external reference to it and can be cleared.
 If at least one CI object has non-zero gc_refs, the CI cannot be
 cleared.

Ok, indeed.  Then you really should call finalizers only once: in case
one of the finalizers in a cycle did a trivial change like I
described, the algorithm above will conservatively assume the cycle
should be kept alive.  At the next GC collection we must not call the
finalizer again, because it's likely to just do a similar trivial
change.
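
As an illustration of why the check is conservative, here is a toy model
of the gc_refs subtraction described in the quoted text. It is a sketch
under assumptions, not the collector's code: objects are plain names, the
`refcounts` mapping stands in for ob_refcnt, the `referents` mapping stands
in for what tp_traverse would visit, and `ci_can_be_cleared` is a
hypothetical helper name.

```python
def ci_can_be_cleared(refcounts, referents):
    """Toy model: return True if no CI object is referenced from outside."""
    # Step 1: initialise gc_refs from the reference counts.
    gc_refs = dict(refcounts)
    # Step 2: subtract every reference that originates inside the CI.
    for obj in refcounts:
        for target in referents.get(obj, ()):
            if target in gc_refs:      # ignore references leaving the CI
                gc_refs[target] -= 1
    # Step 3: the CI is isolated only if nothing is left over.
    return all(count == 0 for count in gc_refs.values())

cycle = {"a": ["b"], "b": ["a"]}

# Two objects referencing each other, no outside references.
isolated = ci_can_be_cleared({"a": 1, "b": 1}, cycle)

# Same cycle, but a finalizer stored one extra reference to "a" elsewhere.
kept_alive = ci_can_be_cleared({"a": 2, "b": 1}, cycle)

print(isolated, kept_alive)
```

A single leftover reference to any one object is enough for the whole CI
to be kept, whatever the finalizer's intent was.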

(There are other open questions about calling finalizers multiple
times; e.g. an instance of this class has its finalizer called ad
infinitum and leaks, even though X() is never part of any cycle:

class X(object):
    def __del__(self):
        print("tick")
        lst = [self]
        lst.append(lst)

Try interactively: every gc.collect() prints "tick", even if you make
only one instance.)
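
A variant of the class above that counts finalizer calls instead of
printing makes the behaviour easy to check on a given interpreter. This is
a sketch only; the `ticks` list is an addition for illustration. On an
interpreter that may run finalizers repeatedly, the count should grow with
each collection. The CPython documentation states that the current
implementation calls __del__ only once per object, so there the count
would be expected to stay at 1.

```python
import gc

ticks = []

class X(object):
    def __del__(self):
        ticks.append("tick")
        lst = [self]       # resurrect self through a fresh list...
        lst.append(lst)    # ...that only the cycle collector can free

x = X()
del x            # refcount drops to zero, __del__ runs for the first time
gc.collect()     # the self-referencing list (and x with it) is collected
gc.collect()

print(len(ticks))
```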


A bientôt,

Armin.


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Antoine Pitrou
On Sat, 18 May 2013 16:22:55 +0200
Armin Rigo ar...@tunes.org wrote:
 Hi Antoine,
 
 On Sat, May 18, 2013 at 3:45 PM, Antoine Pitrou solip...@pitrou.net wrote:
  How is this done?  I don't see a clear way to determine it by looking
  only at the objects in the CI, given that arbitrary modifications of
  the object graph may have occurred.
 
  The same way a generation is traversed, but restricted to the CI.
 
  First the gc_refs field of each CI object is initialized to its
  ob_refcnt (again).
 
  Then, tp_traverse is called on each CI object, and each visited
  CI object has its gc_refs decremented. This subtracts CI-internal
  references from the gc_refs fields.
 
  At the end of the traversal, if all CI objects have their gc_refs equal
  to 0, then the CI has no external reference to it and can be cleared.
  If at least one CI object has non-zero gc_refs, the CI cannot be
  cleared.
 
 Ok, indeed.  Then you really should call finalizers only once: in case
 one of the finalizers in a cycle did a trivial change like I
 described, the algorithm above will conservatively assume the cycle
 should be kept alive.  At the next GC collection we must not call the
 finalizer again, because it's likely to just do a similar trivial
 change.

Well, the finalizer will only be called again if the resurrected object
loses its references again; otherwise the object won't be considered by
the GC. So, this will only happen if someone keeps trying to destroy a
resurrected object.

Calling finalizers only once is fine with me, but it would be a change
in behaviour; I don't know if it may break existing code.

(for example, say someone is using __del__ to manage a freelist)
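
To make the freelist case concrete, here is a minimal sketch of the kind
of code that could be affected. The `Pooled` class, its `_free` list and
the `acquire` method are hypothetical names invented for illustration.
The instance resurrects itself from __del__, so it relies on the finalizer
running every time the object is released.

```python
class Pooled:
    """Hypothetical class that recycles instances through __del__."""
    _free = []

    @classmethod
    def acquire(cls):
        # Reuse a parked instance if one is available.
        return cls._free.pop() if cls._free else cls()

    def __del__(self):
        # Resurrect the dying instance by parking it on the freelist.
        type(self)._free.append(self)

first = Pooled.acquire()
del first                         # first release: the instance is parked
parked_after_first = len(Pooled._free)

second = Pooled.acquire()         # the parked instance comes back
del second                        # second release of the same object
parked_after_second = len(Pooled._free)

print(parked_after_first, parked_after_second)
```

If the finalizer runs on every release, both numbers are 1. If it runs
only once per object, the second release frees the instance for good and
the freelist silently stops recycling it.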

Regards

Antoine.


Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Richard Oudkerk

On 18/05/2013 3:18pm, Antoine Pitrou wrote:

It works fine:

$ ./python sbt.py
<__main__.Node object at 0x7f3acbf8f400> <__main__.Node object at 0x7f3acbf8f878>
<__main__.Node object at 0x7f3acbf8f878> <__main__.Node object at 0x7f3acbf8f400>

The reason is that, when you execute del self.next, this removes the
last reference to self.next and destroys it immediately.


So even more contrived:

 import gc

 class Node:
     def __init__(self, x):
         self.x = x
         self.next = None
     def __del__(self):
         print(self.x, self.next.x)
         del self.x

 a = Node(1)
 b = Node(2)
 a.next = b
 b.next = a
 del a, b
 gc.collect()

--
Richard



Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Antoine Pitrou
On Sat, 18 May 2013 15:52:56 +0100
Richard Oudkerk shibt...@gmail.com wrote:
 On 18/05/2013 3:18pm, Antoine Pitrou wrote:
  It works fine:
 
  $ ./python sbt.py
  <__main__.Node object at 0x7f3acbf8f400> <__main__.Node object at 0x7f3acbf8f878>
  <__main__.Node object at 0x7f3acbf8f878> <__main__.Node object at 0x7f3acbf8f400>
 
  The reason is that, when you execute del self.next, this removes the
  last reference to self.next and destroys it immediately.
 
 So even more contrived:
 
   import gc

   class Node:
       def __init__(self, x):
           self.x = x
           self.next = None
       def __del__(self):
           print(self.x, self.next.x)
           del self.x

   a = Node(1)
   b = Node(2)
   a.next = b
   b.next = a
   del a, b
   gc.collect()

Indeed, there is an exception during destruction (which is ignored, like
any exception raised from __del__):

$ ./python sbt.py 
1 2
Exception ignored in: <bound method Node.__del__ of <__main__.Node object at 0x7f543cf0bb50>>
Traceback (most recent call last):
  File "sbt.py", line 17, in __del__
    print(self.x, self.next.x)
AttributeError: 'Node' object has no attribute 'x'


The only reason this currently succeeds is that the objects end up in
gc.garbage, of course.

Regards

Antoine.




Re: [Python-Dev] PEP 442: Safe object finalization

2013-05-18 Thread Terry Jan Reedy

On 5/18/2013 11:22 AM, Antoine Pitrou wrote:

On Sat, 18 May 2013 15:52:56 +0100
Richard Oudkerk shibt...@gmail.com wrote:



So even more contrived:

   class Node:
       def __init__(self, x):
           self.x = x
           self.next = None
       def __del__(self):
           print(self.x, self.next.x)
           del self.x


An attribute reference that can fail should be wrapped with try-except.



   a = Node(1)
   b = Node(2)
   a.next = b
   b.next = a
   del a, b
   gc.collect()


Indeed, there is an exception during destruction (which is ignored, like
any exception raised from __del__):

$ ./python sbt.py
1 2
Exception ignored in: <bound method Node.__del__ of <__main__.Node object at 0x7f543cf0bb50>>
Traceback (most recent call last):
  File "sbt.py", line 17, in __del__
    print(self.x, self.next.x)
AttributeError: 'Node' object has no attribute 'x'


Though ignored, the bug is reported, hinting that you should fix it ;-).
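
One way to apply the try-except suggestion to the example above is
sketched below. It assumes an interpreter that finalizes and collects
cycles containing __del__ methods, as the PEP proposes; the `log` list and
the `neighbour_x` variable are additions for illustration. Which of the
two nodes is finalized first is not specified.

```python
import gc

log = []

class Node:
    def __init__(self, x):
        self.x = x
        self.next = None

    def __del__(self):
        try:
            neighbour_x = self.next.x
        except AttributeError:
            # The neighbour was already finalized and dropped its x.
            neighbour_x = None
        log.append((self.x, neighbour_x))
        del self.x

a = Node(1)
b = Node(2)
a.next = b
b.next = a
del a, b
gc.collect()

print(log)
```

The first finalizer to run still sees its neighbour's x; the second one
records None instead of raising.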


