#12313: Fix yet another memory leak caused by caching of coercion data
--------------------------------------------------+-------------------------
Reporter:  SimonKing                               |       Owner:
Type:  defect                                      |      Status:  needs_review
Priority:  major                                   |   Milestone:  sage-5.3
Component:  memleak                                |  Resolution:
Keywords:  coercion weak dictionary                | Work issues:
Report Upstream:  N/A                              |   Reviewers:  Simon King,
                                                   |  Jean-Pierre Flori, John Perry
Authors:  Simon King, Jean-Pierre Flori            |   Merged in:
Dependencies:  #11521, #11599, #12969, #12215      |    Stopgaps:
--------------------------------------------------+-------------------------
Comment (by SimonKing):
Replying to [comment:160 nbruin]:
> That's not what I meant. You make the buckets larger, so that their
> layout becomes:
> {{{
> [h1,r1,v1,h2,r2,v2,h3,r3,v3,...]
> }}}
> where for, say, the key-value pair `k1: v1` we have
> {{{
> h1=id(k1)
> r1=weakref.KeyedRef(k1,f,h1)
> }}}
I see.
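For reference, `weakref.KeyedRef` is in the standard library (it is what `WeakValueDictionary` uses internally): a weak reference that carries an extra `key` attribute, which the deletion callback can read off the ref object. A small illustration (the `Key` class is just a dummy):

```python
import weakref

class Key:
    """Dummy weakrefable key (instances of plain classes support weakrefs)."""

k = Key()
h = id(k)
# KeyedRef stores the extra key (here id(k)) on the reference itself, so a
# deletion callback can locate the right bucket entry without a second lookup.
r = weakref.KeyedRef(k, lambda ref: None, h)
assert r.key == h and r() is k
del k          # in CPython the referent is collected immediately
assert r() is None
```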
> This way you can do your whole lookup without doing an extra search in
> the now non-existent `_refcache`.
> {{{
> for i from 0 <= i < PyList_GET_SIZE(bucket) by 3:
>     tmp = <object>PyList_GET_ITEM(bucket, i)
>     if <size_t>tmp == h:
>         r = <object>PyList_GET_ITEM(bucket, i+1)
>         if isinstance(r, KeyedRef) and r() is None:
>             # the object got collected but we didn't receive a callback (yet?)
>             # either remove it now:
>             del r  # we don't WANT the callback anymore!
>             del bucket[i:i+3]
>             self._size -= 1
>             # or skip that step and hope we'll get our callback later. In any case:
>             raise KeyError, k
>         else:
>             # We've just checked that r's ref (the actual key) was still alive,
>             # so the fact that id(k) equals tmp means that
>             # k is r() [or k is r if not weakreffable, but that would have been true anyway]
>             return <object>PyList_GET_ITEM(bucket, i+2)
> raise KeyError, k
> }}}
Since you already have code, it should be relatively easy to test the
performance.
> This should be a bit faster because you avoid an extra `self._refcache[h]`
> lookup and all the management that comes with maintaining an independent
> hashtable that has exactly the same keys!
I wouldn't say that it is obviously faster. Let's simply test it, I'd say!
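To make the comparison concrete, here is a minimal pure-Python sketch of the flattened-bucket scheme (the real code would be Cython; the class name, the bucket count, and the `_erase` callback name are mine, not from the patch):

```python
import weakref

class FlatMonoDict:
    """Pure-Python sketch of a MonoDict with flattened buckets laid out as
    [h1, r1, v1, h2, r2, v2, ...] where h = id(key) and r is a KeyedRef."""

    def __init__(self, nbuckets=37):
        self.buckets = [[] for _ in range(nbuckets)]
        self._size = 0

    def _bucket(self, h):
        return self.buckets[h % len(self.buckets)]

    def __setitem__(self, k, v):
        h = id(k)
        bucket = self._bucket(h)
        try:
            # Weak reference carrying h, so the callback can find the bucket.
            r = weakref.KeyedRef(k, self._erase, h)
        except TypeError:
            r = k  # not weakrefable: store the key itself
        for i in range(0, len(bucket), 3):
            if bucket[i] == h:
                bucket[i + 1], bucket[i + 2] = r, v
                return
        bucket.extend([h, r, v])
        self._size += 1

    def __getitem__(self, k):
        h = id(k)
        bucket = self._bucket(h)
        for i in range(0, len(bucket), 3):
            if bucket[i] == h:
                r = bucket[i + 1]
                if isinstance(r, weakref.KeyedRef) and r() is None:
                    # Key collected but no callback received yet: remove now.
                    del bucket[i:i + 3]
                    self._size -= 1
                    raise KeyError(k)
                return bucket[i + 2]
        raise KeyError(k)

    def _erase(self, ref):
        # Deletion callback: ref.key is the id of the collected key.
        bucket = self._bucket(ref.key)
        for i in range(0, len(bucket), 3):
            if bucket[i] == ref.key:
                del bucket[i:i + 3]
                self._size -= 1
                return
```

In CPython, dropping the last strong reference to a key triggers `_erase` immediately, so entries disappear as soon as their key does; timing this layout against the `_refcache` variant would settle the performance question.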
> It is also more memory efficient because you're now storing h only once
> instead of twice.
>
> For `TripleDict` you can do it in exactly the same way, just a little
> more work.
Not that much. Actually, I wonder whether one could have a `TupleDict` that
is created with an additional parameter prescribing the length L of the key
tuples. The data in each bucket would then be stored as L references (weak,
if possible), L memory locations, and one value, so the look-up would be
uniform in L.
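A hypothetical `TupleDict` along those lines might look like this in pure Python (everything here, including the hash mixing and the entry ordering, is my own illustration, not Sage code):

```python
import weakref

class TupleDict:
    """Sketch of a dict keyed on L-tuples.  Each bucket entry stores the
    L ids, then L references (weak where possible), then the value, so the
    look-up loop is uniform in L."""

    def __init__(self, L, nbuckets=37):
        self.L = L
        self.buckets = [[] for _ in range(nbuckets)]

    def _index(self, ids):
        # Combine the L ids into a bucket index; the mixing is illustrative.
        h = 0
        for i in ids:
            h = (h * 1000003) ^ i
        return h % len(self.buckets)

    def _wrap(self, k):
        try:
            return weakref.ref(k)
        except TypeError:
            return k  # not weakrefable: keep a strong reference

    def __setitem__(self, key, value):
        assert len(key) == self.L
        ids = [id(k) for k in key]
        bucket = self.buckets[self._index(ids)]
        step = 2 * self.L + 1
        entry = ids + [self._wrap(k) for k in key] + [value]
        for i in range(0, len(bucket), step):
            if bucket[i:i + self.L] == ids:
                bucket[i:i + step] = entry
                return
        bucket.extend(entry)

    def __getitem__(self, key):
        ids = [id(k) for k in key]
        bucket = self.buckets[self._index(ids)]
        step = 2 * self.L + 1
        for i in range(0, len(bucket), step):
            if bucket[i:i + self.L] == ids:
                refs = bucket[i + self.L:i + 2 * self.L]
                if any(isinstance(r, weakref.ref) and r() is None
                       for r in refs):
                    raise KeyError(key)  # a key component was collected
                return bucket[i + 2 * self.L]
        raise KeyError(key)
```

A full version would also need the `KeyedRef`-style deletion callbacks; this sketch only shows that the bucket layout and look-up generalise naturally from `MonoDict` (L=1) and `TripleDict` (L=3).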
I think one should create a proof of concept to test the performance. If the
performance is as good as with the current `MonoDict` and `TripleDict`, then
we will have to think about how to organise the work. Perhaps #715 and
#12313 would be superseded by a new ticket introducing `TupleDict`?
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/12313#comment:161>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB