#715: Parents probably not reclaimed due to too much caching
-------------------------------------------------------------------+--------
       Reporter:  robertwb                                         |         Owner:  somebody
           Type:  defect                                           |        Status:  needs_review
       Priority:  major                                            |     Milestone:  sage-5.4
      Component:  coercion                                         |    Resolution:
       Keywords:  weak cache coercion Cernay2012                   |   Work issues:
Report Upstream:  N/A                                              |     Reviewers:  Jean-Pierre Flori, Simon King, Nils Bruin
        Authors:  Simon King, Jean-Pierre Flori                    |     Merged in:
   Dependencies:  #9138, #11900, #11599, to be merged with #11521  |      Stopgaps:
-------------------------------------------------------------------+--------

Comment (by nbruin):

 OK, I've taken out the `omStrDup` call in `sage/libs/singular/ring.pyx`
 and just manually copy the strings over:
 {{{
     for i from 0 <= i < n:
         _name = names[i]
         sys.stderr.write("calling omStrDup for i=%s with name=%s\n"%(i,names[i]))
         j = 0
         while <bint> _name[j]:
             j+=1
         j+=1     #increment to include the 0
         sys.stderr.write("string length (including 0) seems to be %s\n"%j)
         copiedname =  <char*>omAlloc(sizeof(char)*(j+perturb))
         sys.stderr.write("Done reserving memory buffer; got address %x\n"%(<long>copiedname))
         for offset from 0 <= offset < j:
             sys.stderr.write("copying character nr %s\n"%offset)
             copiedname[offset] = _name[offset]
         _names[i] = copiedname
         sys.stderr.write("after omStrDup\n")
 }}}
 If I run this code with `perturb=7`, I don't get a segfault. With smaller
 values I do, and the segfault happens in the `omAlloc` line. Given that
 `j==2` for most of this code, I guess that memory blocks are at least 8
 bytes (this is 64-bit OS X).

 If `omAlloc` itself segfaults, I guess one of the internal omAlloc data
 structures has been corrupted (I think the idea is that memory is managed
 in equal-sized blocks with just a free list on a system malloc-ed page).
 If I were to implement that, I'd store the pointers of the free block
 linked list in the actual blocks (hence minimum 8 byte blocks), so if
 anyone omAllocs an 8-byte block and then writes past it, they could ruin
 the linked list and likely cause a subsequent omAlloc to segfault (because
 the omAlloc would actually have to access the location pointed to in order
 to check whether there is a next node in the free list). Even more likely:
 some code decides to "zero out" a block after it's already been
 `omFree`'d. That could also be a double deallocation.
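
 To make that guess concrete, here is a minimal Python simulation of such
 an allocator. It is not omalloc's actual code: the `Page` class, the
 8-byte block size and the `NIL` sentinel are all invented for
 illustration, and a Python exception stands in for the segfault.

```python
import struct

BLOCK = 8                   # assumed block size; also the size of one stored pointer
NIL = 0xFFFFFFFFFFFFFFFF    # hypothetical sentinel marking the end of the free list


class Page:
    """A page of equal-sized blocks whose free list lives inside the free blocks."""

    def __init__(self, nblocks):
        self.mem = bytearray(BLOCK * nblocks)
        self.head = 0
        # Thread every block onto the free list: block k points at block k+1.
        for k in range(nblocks):
            nxt = (k + 1) * BLOCK if k + 1 < nblocks else NIL
            struct.pack_into("<Q", self.mem, k * BLOCK, nxt)

    def alloc(self):
        if self.head == NIL:
            raise MemoryError("page exhausted")
        addr = self.head
        # Popping the head has to read the pointer stored *inside* the block.
        # Real C code would dereference a bad address here and segfault.
        if addr % BLOCK or addr + BLOCK > len(self.mem):
            raise RuntimeError("free list corrupted: bad address %#x" % addr)
        (self.head,) = struct.unpack_from("<Q", self.mem, addr)
        return addr

    def free(self, addr):
        # The freed block becomes the new head and stores the old head.
        struct.pack_into("<Q", self.mem, addr, self.head)
        self.head = addr

    def write(self, addr, data):
        # No bounds check against the block size, like a raw C pointer write.
        self.mem[addr:addr + len(data)] = data


def overflow_demo():
    """Write 9 bytes into an 8-byte block and report what alloc does next."""
    page = Page(4)
    a = page.alloc()
    page.write(a, b"x" * 9)      # one byte spills into the next free block
    survivor = page.alloc()      # still works: the head itself was intact
    try:
        page.alloc()             # follows the clobbered pointer
    except RuntimeError as exc:
        return survivor, str(exc)
    return survivor, None
```

 Note that the allocation right after the overflow still succeeds in this
 model; only the following one trips over the damaged pointer, which is
 why the crash can show up far from the code that caused it.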

 There must be people with vast omAlloc debugging experience who have
 wonderful tricks to track down this kind of error. A tiny bit of
 instrumentation should do the trick: frequent verification of free lists,
 and checking that a block is not already in the free list when asked to
 deallocate. These are things one could easily do without changing memory
 layout.
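
 As a rough sketch of what such instrumentation might look like (in
 Python, on a toy model where the free list is a dict from block index to
 next block index; the function names are hypothetical and none of this is
 omalloc API):

```python
def walk_free_list(next_of, head, nblocks):
    """Return the free blocks in list order, or raise if the list is damaged.

    next_of maps a free block index to the next free index (None ends the list).
    """
    order = []
    visited = set()
    cur = head
    while cur is not None:
        if not (0 <= cur < nblocks):
            raise RuntimeError("free list points outside the page: %r" % cur)
        if cur in visited:
            raise RuntimeError("free list contains a cycle at block %d" % cur)
        visited.add(cur)
        order.append(cur)
        cur = next_of.get(cur)
    return order


def checked_free(next_of, head, nblocks, block):
    """Push block onto the free list, refusing a double deallocation.

    Returns the new head of the list.
    """
    if block in walk_free_list(next_of, head, nblocks):
        raise RuntimeError("double free of block %d" % block)
    next_of[block] = head
    return block
```

 Both checks only read the existing list, so they leave the memory layout
 untouched; the cost is a linear walk on every call, which is acceptable
 for a debugging build.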

 In the meantime, we can "fix" the segfault on BSD by allocating a little
 extra space for variable names. At least 9 bytes seems to do the trick. By
 now it's pretty clear that the real error is probably a refcounting error
 in Sage's libsingular rings, which didn't become apparent until these
 things actually do get deallocated.

 If we insist that libsingular behaves as specified, then part of its
 specification is likely that rings should not be deallocated, so then we
 should put in a strong reffing cache on these things (easy to do). Then
 one can make another ticket "make libsingular rings deallocatable".
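
 A strong reffing cache could be as simple as the following Python sketch.
 The `Ring` class, `get_ring` and the `keep_alive` flag are placeholders
 made up for illustration, not the actual Sage code.

```python
import gc
import weakref


class Ring:
    """Hypothetical stand-in for a libsingular ring wrapper."""

    def __init__(self, names):
        self.names = tuple(names)


weak_cache = weakref.WeakValueDictionary()  # lookup table; does not keep rings alive
strong_refs = []                            # stopgap: pins every ring it holds


def get_ring(names, keep_alive=True):
    """Return the cached ring for these variable names, creating it if needed."""
    key = tuple(names)
    ring = weak_cache.get(key)
    if ring is None:
        ring = Ring(key)
        weak_cache[key] = ring
        if keep_alive:
            # The extra strong reference means the ring is never deallocated.
            strong_refs.append(ring)
    return ring
```

 With `keep_alive=True` a ring survives even after the caller drops its
 last reference; with `keep_alive=False` it disappears from `weak_cache`
 as soon as it becomes unreferenced, which is the behaviour the follow-up
 ticket would re-enable.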

 I think exposing the rest of sage to mortal parents is too important to
 delay on a hard-to-track-down memory issue for deallocation in
 libsingular.

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/715#comment:298>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB
