#715: Parents probably not reclaimed due to too much caching
-------------------------------------------------------------------+--------
Reporter: robertwb |
Owner: somebody
Type: defect |
Status: needs_work
Priority: major |
Milestone: sage-pending
Component: coercion |
Resolution:
Keywords: weak cache coercion Cernay2012 | Work
issues:
Report Upstream: N/A |
Reviewers: Jean-Pierre Flori, Simon King
Authors: Simon King, Jean-Pierre Flori | Merged
in:
Dependencies: #9138, #11900, #11599, to be merged with #11521 |
Stopgaps:
-------------------------------------------------------------------+--------
Comment (by SimonKing):
Replying to [comment:198 nbruin]:
> When reviewing #12313 I observed a possible problem for slight leaking
> ...
> By all means, if you have a good argument why this is not necessary,
revert to Positive Review (and I'd be interested in seeing the argument).
Yes, it is a potential leak. The argument would be:
* ''If'' all items indexed by a certain key triple are gone, we are left
with three size_t and with one pointer to an empty list, that will not be
collected; that's just a few bytes.
* It is (I believe) quite likely that the same key triple will be used
again. Hence, the few bytes will actually be used again.
* As long as it is not noticeable in a practical computation, I am not
sure if it is a good idea to slow deallocation down with a test "if
len(L)==0".
OK, that is not more than a heuristical argument. The patch would allow a
(I believe) very small leak, for the sake of a (probably) very small
speed-up.
> '''unweakreffable keys'''
>
> Note that currently, any key that doesn't allow weakreffing, gets a
(permanent, global) strong ref in `_refcache` in the value list, keyed by
their `id`. That's worse than a normal `dict`. A possible solution is to
have a `strongrefcache` on the `MonoDict` or `TripleDict` itself. Then at
least the references disappear when the Dict itself goes.
Hm. It is quite a long time ago that I wrote the code, so I need some time
to reconstruct what I thought.
The data of a `TripleDict` are stored in buckets. The buckets just provide
memory locations of the keys. This is in order to make access to the data
very fast: Otherwise, one would have to do special cases for keys that are
weak-refable and those that are not. By consequence, the weak references
(with callback function) to the keys need to be stored somewhere else: in
_refcache. In that way, items whose keys got garbage collected can be
removed from cache.
But why did I put ''strong'' references in _refcache as well? Let
(k1,k2,k3) be a key, and assume that k1 is not weak-refable. Assume
further that no external reference to k1 is left, but there are external
strong references to k2 and k3. If I would not store a strong reference to
k1 in _refcache, then k1 would be garbage collected. Since we do not have
a weak reference with callback for k1 and since k2 and k3 can not be
collected, the item for (k1,k2,k3) remains in the `TripleDict`. Hence,
when iterating over the items (and there is existing code that does
iterate over the items!), we would meet a reference to k1 ''after'' it was
garbage collected. That means a segfault occurs.
In other words: If k2 and k3 are not collectable and k1 can not be weak-
refed, then we must ensure that k1 stays alive. The solution is to keep a
strong reference to k1 in _refcache.
But now I wonder: Wouldn't it be better to have _refcache not as a global
dictionary, but have a separate _refcache for each `TripleDict`, so that
it gets collected if the `TripleDict` gets collected? Is that your
suggestion?
I think this would be worth trying.
> You'd have to ensure that whenever an entry gets deleted from the
`MonoDict` or the `TripleDict`, that any references in `strongrefcache` to
relevant key components get removed too. Especially for `TripleDict`, this
needs to happen in `TripleDictEraser` too, because if any weakreffable key
component gets GCd, the whole entry gets removed, so strong refs to other
key components should be released.
As I have pointed out, it is important that weak or strong references are
stored in _refcache. But perhaps the items in _refcache should be triples
of weak or strong references? If I am not mistaken, if (k1,k2,k3) is a
key, then it is uniquely determined by (id(k1),id(k2),id(k3)). We store
weak references (if possible) that provide (id(k1),id(k2),id(k3)). Hence,
the callback function of the weak reference can simply delete this entry.
__Conclusion__
* I will try if the `"if len(L)==0"` test leads to a slow-down
* I will try to replace the global _refcache by a dictionary that is
local to each `TripleDict`.
* I will store the references provided by _refcache in a different form,
so that they can more easily be deleted.
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/715#comment:199>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sage-trac?hl=en.