#12313: Fix yet another memory leak caused by caching of coercion data
--------------------------------------------------+-------------------------
       Reporter:  SimonKing                       |         Owner:
           Type:  defect                          |        Status:  needs_review
       Priority:  major                           |     Milestone:  sage-5.3
      Component:  memleak                         |    Resolution:
       Keywords:  coercion weak dictionary        |   Work issues:
Report Upstream:  N/A                             |     Reviewers:  Simon King,
                                                  |  Jean-Pierre Flori, John Perry
        Authors:  Simon King, Jean-Pierre Flori   |     Merged in:
   Dependencies:  #11521, #11599, #12969, #12215  |      Stopgaps:
--------------------------------------------------+-------------------------

Comment (by nbruin):

 > With the current patch, the buckets are of the form
 `[h1,v1,h2,v2,h3,v3,...]`, where `h1,h2,h3` are the memory addresses of
 the keys and `v1,v2,v3` are the values. Now assume that we would put the
 (weak) references `r1,r2,r3` to the keys into the bucket:
 `[r1,v1,r2,v2,r3,v3,...]`.
 >
 > The weak references should have a callback function removing entries
 from the dictionary. Hence, they are not just `weakref.ref`, but
 `weakref.KeyedRef`.

 That's not what I meant.  You make the buckets larger, so that their
 layout becomes:
 {{{
 [h1,r1,v1,h2,r2,v2,h3,r3,v3,...]
 }}}
 where for, say, the key-value pair `k1: v1` we have
 {{{
 h1=id(k1)
 r1=weakref.KeyedRef(k1,f,h1)
 }}}
 This way you can do your whole lookup without doing an extra search in the
 now non-existent `_refcache`.
 {{{
 for i from 0 <= i < PyList_GET_SIZE(bucket) by 3:
     tmp = <object>PyList_GET_ITEM(bucket, i)
     if <size_t>tmp == h:
         r = <object>PyList_GET_ITEM(bucket, i+1)
         if isinstance(r, KeyedRef) and r() is None:
             # The object got collected but we didn't receive a callback
             # (yet?). Either remove the entry now:
             del r  # we don't WANT the callback any more!
             del bucket[i:i+3]
             self._size -= 1
             # or skip that step and hope we'll get our callback later.
             # In any case:
             raise KeyError, k
         else:
             # We've just checked that r's referent (the actual key) is
             # still alive, so the fact that id(k) equals tmp means that
             # k is r() [or k is r if not weakreffable, but that would
             # have been true anyway].
             return <object>PyList_GET_ITEM(bucket, i+2)
 raise KeyError, k
 }}}

 Deletion and setting are easily modified in the same way.

 This should be a bit faster because you avoid an extra `self._refcache[h]`
 lookup, and all the management that comes with maintaining an
 independent hash table that has exactly the same keys!

 It is also more memory efficient because you're now storing h only once
 instead of twice.

 For `TripleDict` you can do it in exactly the same way; it is just a
 little more work.
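
 The combined layout is easy to model outside Cython. Below is a minimal
 pure-Python sketch of the idea (the class name `MonoDictSketch` and its
 internals are illustrative, not Sage's actual `MonoDict` implementation):
 buckets shaped `[h1, r1, v1, h2, r2, v2, ...]`, with a `KeyedRef` callback
 that prunes dead entries, and a lookup that purges an entry whose key was
 collected before the callback arrived.

```python
import weakref

class MonoDictSketch:
    """Toy model of the proposed bucket layout [h1, r1, v1, h2, r2, v2, ...]:
    each entry stores the key's id, a KeyedRef to the key, and the value
    side by side, so lookup needs no separate _refcache table."""

    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]
        self.size = 0

    def _bucket(self, h):
        return self.buckets[h % len(self.buckets)]

    def __setitem__(self, k, v):
        h = id(k)
        bucket = self._bucket(h)
        for i in range(0, len(bucket), 3):
            if bucket[i] == h:
                bucket[i + 2] = v  # key already present: overwrite value
                return

        # Callback fired when the key dies: find and drop its triple.
        def callback(ref, h=h):
            b = self._bucket(h)
            for i in range(0, len(b), 3):
                if b[i] == h and b[i + 1] is ref:
                    del b[i:i + 3]
                    self.size -= 1
                    return

        bucket.extend([h, weakref.KeyedRef(k, callback, h), v])
        self.size += 1

    def __getitem__(self, k):
        h = id(k)
        bucket = self._bucket(h)
        for i in range(0, len(bucket), 3):
            if bucket[i] == h:
                r = bucket[i + 1]
                if isinstance(r, weakref.KeyedRef) and r() is None:
                    # Key collected but no callback delivered yet: purge now.
                    del bucket[i:i + 3]
                    self.size -= 1
                    raise KeyError(k)
                return bucket[i + 2]
        raise KeyError(k)
```

 Note that, as in the Cython snippet above, the id comparison alone is
 enough once the referent is known to be alive: `id(k) == h` with `r()`
 live implies `k is r()`.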

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/12313#comment:160>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB
