Mark Shannon, 06.05.2011 20:45:
Stefan Behnel wrote:
Michael Foord, 06.05.2011 19:06:
On 06/05/2011 17:51, Stefan Behnel wrote:
Mark Shannon, 06.05.2011 18:33:
s...@pobox.com wrote:
Antoine> Since we're sharing links, here's Matt Mackall's take:
Antoine>
http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
From that note:
1: You can't have meaningful destructors, because when destruction
happens is undefined. And going-out-of-scope destructors are extremely
useful. Python is already a rather broken in this regard, so feel free
to ignore this point.
Given the presence of cyclic data I don't see how reference counting or
garbage collection win. Ignoring the fact that in a pure reference
counted
system you won't even consider cycles for reclmation, would both RC
and GC
have to punt because they can't tell which object's destructor to call
first?
It doesn't matter which is called first.
May I quote you on that one the next time my software crashes?
Arbitrarily breaking cycles *could* cause a problem if a destructor
attempts to access an already collected object.
This is more real than the "could" suggests. Remember that CPython
includes a lot of C code, and is commonly used to interface with C
libraries. While you will simply get an exception when cycles are broken
in Python code, cycles that involve C code can suffer quite badly from
this problem.
There was a bug in the lxml.etree XML library a while ago that could let
it crash hard when its Element objects participated in a reference cycle.
It's based on libxml2, so there's an underlying C tree that potentially
involves disconnected subtrees, and a Python space representation using
Element proxies, with at least one Element for each disconnected subtree.
Basically, Elements reference their Document (not the other way round)
even if they are disconnected from the main C document tree. The Document
needs to do some final cleanup in the end, whereas the Elements require
the Document to be alive to do their own subtree cleanup, if only to know
what exactly to clean up, as the subtrees share some C state through the
document. Now, if any of the Elements ends up in a reference cycle for
some reason, the GC will throw its dices and may decide to call the
Document destructor first. Then the Element destructors are bound to
crash, trying to access dead memory of the Document.
With a tracing collector it is *impossible* to access dead memory, ever.
If it can be reached the GC will *not* collect it.
This should be a fundamental invariant of *all* GCs.
If an object is finalizable or reachable from any finalizable objects
then it is reachable and its memory should not be reclaimed until it is
truly unreachable.
Finalization and reclamation are separate phases.
Sure. However, I'm talking about Python types and C memory here. Even if
the Python objects are still alive, they may already have freed the
underlying C memory during their *finalisation*. When an Element goes out
of scope, it must free its C subtree if it is disconnected, even if the
Document stays alive. So that's what Elements do in their destructor, and
they need the Document's C memory for that, which the Document frees during
its own finalisation.
I do agree that CPython's destructor call algorithms could have been
smarter in this case. After all, the described crash case indicates that
the Document destructor was called before all of the Element destructors
had been called, although all Elements reference their Document, but the
Document does not refer to any of the Elements, so it's basically a dead
end. That would have provided a detectable hint to call the Document
destructor last, after the ones of all objects that reference it.
Apparently, this hint did not lead to an appropriate action, possibly
because it's an unimplemented special case and there are enough cases where
multiple objects with destructors are actually part of the 'real' cycle.
Stefan
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com