#12313: Fix yet another memory leak caused by caching of coercion data
--------------------------------------------------------+-------------------
       Reporter:  SimonKing                             |         Owner:  rlm   
                                                              
           Type:  defect                                |        Status:  
needs_review                                                        
       Priority:  major                                 |     Milestone:  
sage-5.3                                                            
      Component:  memleak                               |    Resolution:        
                                                              
       Keywords:  coercion weak dictionary              |   Work issues:  
Rebase wrt the new version of #715. Take into account Nils' comments
Report Upstream:  N/A                                   |     Reviewers:  Simon 
King, Jean-Pierre Flori, John Perry                           
        Authors:  Simon King, Jean-Pierre Flori         |     Merged in:        
                                                              
   Dependencies:  #715, #11521, #11599, #12969, #12215  |      Stopgaps:        
                                                              
--------------------------------------------------------+-------------------
Changes (by SimonKing):

  * status:  needs_work => needs_review
  * dependencies:  #715, #11521, #11599, #12969 => #715, #11521, #11599,
                   #12969, #12215


Comment:

 Replying to [comment:125 nbruin]:
 > '''sage/structure/coerce_dict.pyx''':
 >
 > line 96:
 >
 > Perhaps you can explain the mechanism here.

 I did explain it on #715, and I changed the patch similar to #715,
 according to your comments (_refcache is now local, etc).

 > line 103:
 > {{{
 >         for i from 0 <= i < PyList_GET_SIZE(bucket) by 4:
 >             if <size_t><object>PyList_GET_ITEM(bucket,i)==k:
 >                 del bucket[i:i+2]
 > }}}
 > It reads to me that that should be `... by 2`

 Right! Thanks for spotting that.

 > line 497: (in `Monodict._getitem__`)
 >
 > How bad is the race condition here? I guess we're trusting that
 > `bucket` doesn't change and that `bucket` has been purged of any dead
 references
 > (i.e., that all `MonoDictEraser` callbacks have happened). Does keeping
 the GIL
 > guarantee this?

 That is a good question.

 I tried to stress test it:
 {{{
 #!python
 L = MonoDict(31)
 class A: pass
 while 1:
     keys = [A() for _ in range(100)]
     for i,k in enumerate(keys):
         L[k] = i
     a = keys[99]
     keys = None
     print L[a]
 }}}

 I suppose that one could expect that garbage collection would sometimes
 occur while getting L[a]. Hence, one would expect a segfault to occur at
 some point. But it works fine (it constantly prints "99").

 > Does the line
 > {{{
 >                 return <object>PyList_GET_ITEM(bucket, i+1)
 > }}}
 > increase a refcount on the object we retrieve? Does the `<object>`
 typecast do
 > that? Or does the `return` do that?

 As much as I know, the `<object>` cast does that. Note that without the
 cast, the code wouldn't even compile, because "return" can not return the
 result of `PyList_GET_ITEM` (which is just a reference to an object).

 Here is some test that (I think) demonstrates that the reference count is
 right:
 {{{
 sage: cython("""
 ....: include "../ext/python_list.pxi"
 ....: def test(list L):
 ....:     return <object>PyList_GET_ITEM(L,4)
 ....: """)
 sage: import gc
 sage: class A(object): pass
 ....:
 sage: L = [A() for _ in range(100)]
 sage: id(L[4])
 80680784
 sage: a = test(L)
 sage: del L
 sage: _ = gc.collect()
 sage: [id(x) for x in gc.get_objects() if isinstance(x,A)]
 [80680784]
 sage: id(a)
 80680784
 sage: del a
 sage: _ = gc.collect()
 sage: [id(x) for x in gc.get_objects() if isinstance(x,A)]
 []
 }}}

 > line 536:
 >
 > Am I reading correctly that if k cannot be keyreffed then k is going to
 get a
 > strong ref in _refcache and will be eternalized?

 See the comments at #715: That used to be a problem, but is now solved.

 Some comments:

  * This patch fixes a typo introduced at #715. I suppose that #715,
 #11521, #12215 and #12313 will be merged in one go anyway, so, that should
 be fine.
  * There was a doctest in sage/categories/modules_with_basis.py that
 explicitly relied on an "eternal cache". Namely, some module was defined
 in one doctest and named "X". In ''another'' doctest, the same module was
 taken, and it was still called "X", because it has not been garbage
 collected. With my patch, garbage collection works, and thus the custom
 name chosen in the first doctest is not present in the second doctest.

 Apply trac_12313-mono_dict-combined-random-sk.patch

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/12313#comment:128>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en.

Reply via email to