#12313: Fix yet another memory leak caused by caching of coercion data
--------------------------------------------------------+-------------------
Reporter: SimonKing | Owner:
Type: defect | Status: new
Priority: major | Milestone:
sage-5.8
Component: memleak | Resolution:
Keywords: coercion weak dictionary | Work issues:
Report Upstream: N/A | Reviewers: Simon
King, Jean-Pierre Flori, John Perry, Nils Bruin
Authors: Simon King, Jean-Pierre Flori | Merged in:
Dependencies: #715, #11521, #12215, #13746, #13378 | Stopgaps:
--------------------------------------------------------+-------------------
Comment (by nbruin):
I am not able to reproduce this error, however "coerce_dict.pyx", line
1079 is
{{{
# Test whether the old references are still active
r1,r2,r3 = <tuple>(self._refcache[h1,h2,h3])
}}}
the line "homset.py", line 266:
{{{
_cache[key] = KeyedRef(H, _cache.eraser, (id(X),id(Y),id(category)))
}}}
was introduced in #11521, the
[http://trac.sagemath.org/sage_trac/attachment/ticket/11521/trac_11521_callback.patch
trac_11521_callback.patch], to be precise.
I'm a little worried about this implementation, because it's relying on
implementation details of a tripledict that aren't really part of the API
(in particular, here it's relying on the `TripleDict.eraser` attribute and
its call convention). I think this should be refactored to a method
{{{
_cache.set_weakvalued(key, H)
}}}
so that the tripledict can figure out how to put a properly formatted
callback on the value.
A quick "solution" might be to merge #13387 as well, since that gets rid
of `self._refcache`. However, I'm a little puzzled how this error can
occur:
- By the time we arrive at line 1079 we have succeeded in finding the key
triple `(h1,h2,h3)`.
- The KeyError indicates that `_refcache` doesn't contain the key
anymore.
- But key removal from `_refcache` and from `bucket` are inextricably
linked! This means that a key removal was triggered somewhere between
finding the element in the bucket and looking up in `_refcache`. But that
should mean that `bucket` got modified too! That's really worrisome,
because at this point we're ready to scribble in `bucket` at a location
we've counted before.
Indeed, if we are allocating memory anywhere in the loop that walks
through the bucket we could be triggering a garbage collection, which
could indeed include callbacks that modify this very bucket. I don't think
any of the lines here could do that, though, except for
`(self._refcache[h1,h2,h3])` itself, if that needs to construct a tuple or
something like that.
So my guess is: The entry that's keyed `(17671424, 82669344, 20657216)` is
indeed liable to get a callback from a garbage collection and thus be
removed, but that hasn't happened yet. However, in preparing the call
`self._refcache[h1,h2,h3]` the system needs to make a python tuple
(h1,h2,h3) to pass to `__getitem__` and that triggers the very GC that
deletes the entry from `bucket` and `_refcache`; hence the key error.
In other words, we need to make absolutely sure that in this lookup loop,
no memory allocations whatsoever are done, because any of those could
trigger a GC that can corrupt our view of what the bucket should look
like.
Looking at the lookup loop at #13387 for this, I think no memory
allocations are made there, so if my diagnosis is correct, then merging
that ticket would in fact be a proper fix.
I still think we shouldn't be making `KeyedRef` instances using the
`eraser` of a `TripleDict` outside the `TripleDict` itself, though.
Looking at the code cython generates I think my diagnosis is a
possibility:
{{{
* # Test whether the old references are still
active
* r1,r2,r3 = <tuple>(self._refcache[h1,h2,h3])
# <<<<<<<<<<<<<<
* if (isinstance(r1,KeyedRef) and r1() is None)
or \
* (isinstance(r2,KeyedRef) and r2() is None)
or \
*/
...
__pyx_t_9 = PyTuple_New(3); if (unlikely(!__pyx_t_9))
__Pyx_GOTREF(__pyx_t_9);
PyTuple_SET_ITEM(__pyx_t_9, 0, __pyx_t_1);
__Pyx_GIVEREF(__pyx_t_1);
PyTuple_SET_ITEM(__pyx_t_9, 1, __pyx_t_5);
__Pyx_GIVEREF(__pyx_t_5);
PyTuple_SET_ITEM(__pyx_t_9, 2, __pyx_t_8);
__Pyx_GIVEREF(__pyx_t_8);
...
__pyx_t_8 = __Pyx_PyDict_GetItem(((PyObject
*)__pyx_v_self->__pyx_base._refcache), ((PyObject *)__pyx_t_9)); if
(!__pyx_t_8) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1079;
__pyx_clineno = __LINE__; goto __pyx_L1_error;}
...
}}}
As you see the lookup code does start with a `PyTuple_New` that could
trigger a GC leading to removal of the key that gets looked up in the last
line (where the `KeyError` would originate).
'''TL;DR:''' merge #13387
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/12313#comment:318>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.