#13387: Improve MonoDict and TripleDict data structures
----------------------------------+-----------------------------------------
       Reporter:  nbruin          |         Owner:  Nils Bruin
           Type:  enhancement     |        Status:  needs_work
       Priority:  major           |     Milestone:  sage-5.8  
      Component:  memleak         |    Resolution:            
       Keywords:                  |   Work issues:            
Report Upstream:  N/A             |     Reviewers:            
        Authors:  Nils Bruin      |     Merged in:            
   Dependencies:  #11521, #12313  |      Stopgaps:            
----------------------------------+-----------------------------------------

Comment (by nbruin):

 Replying to [comment:29 jdemeyer]:
 > To me, the expressions
 > {{{
 > PyInt_AsSsize_t(PyList_GET_ITEM(bucket, i))
 > }}}
 > look the most suspicious. What is `PyList_GET_ITEM(bucket, i)` and why
 > are we sure it fits in a `Py_ssize_t`?

 Ah yes! Good catch. This could very well be the source. The stored values
 here are `id`s of Python objects. They are stored via a `<size_t><void *>`
 cast (which subsequently gets converted to a Python int). When we then
 retrieve one via `PyInt_AsSsize_t`, we are reading it into a signed
 `Py_ssize_t` (that's the purpose of the `Ssize_t`, right?), so a value
 above the signed maximum will indeed not fit.
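
 To make the range problem concrete, here is a minimal sketch in plain
 Python (not the actual Cython code from the patch). It uses `ctypes` only
 to look up the platform width of `size_t`; the helper `fits_in_ssize_t`
 and the two sample addresses are hypothetical names chosen for
 illustration.

```python
import ctypes

# Width of size_t (and Py_ssize_t) on this platform, in bits.
BITS = 8 * ctypes.sizeof(ctypes.c_size_t)
SIZE_MAX = 2**BITS - 1           # largest value a size_t can hold
SSIZE_MAX = 2**(BITS - 1) - 1    # largest value a Py_ssize_t can hold


def fits_in_ssize_t(value):
    """Return True if `value` is representable as a signed Py_ssize_t."""
    return -SSIZE_MAX - 1 <= value <= SSIZE_MAX


# Hypothetical addresses: one in the lower half of the address space,
# one with the top bit set (the kind a <size_t><void *> cast could yield).
low_address = 0x1000
high_address = SSIZE_MAX + 1
```

 Under these assumptions `fits_in_ssize_t(low_address)` holds while
 `fits_in_ssize_t(high_address)` does not, which is exactly the case where
 a signed retrieval of an unsigned stored key can go wrong.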

 What's the best fix, though? Should we be storing via a `<Py_ssize_t><void
 *>` cast or should we do a `<size_t><void *>PyList_GET_ITEM(...)` instead?
 I think I found `PyInt_AsSsize_t` by looking at the code generated by that
 statement.

 It depends a bit: I would assume that Python has relatively efficient
 storage for both small positive and negative integers, so storing the keys
 as a signed quantity would make better use of the bits Python offers for
 its "small" (as in not `PyLong`) integers.

 Previously, the code prevented this whole issue by constructions such as
 {{{
             tmp = <object>PyList_GET_ITEM(bucket, i)
             if <size_t>tmp == h1:
 }}}
 which of course works, but is less efficient: The cast to `<object>` will
 introduce refcounting.

 So, my hunch is: store keys as `<Py_ssize_t>` and cast to `<size_t>`
 before doing modulo operations. I suppose that `<size_t><Py_ssize_t>(-1)`
 equals `256^sizeof(size_t) - 1` (and that the cast probably doesn't
 generate any code).
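
 The signed-store / unsigned-use idea can be checked with a small sketch,
 again in plain Python with `ctypes` standing in for the C casts (the
 helper names `as_ssize_t` and `as_size_t` and the bucket count are
 assumptions for illustration, not part of the patch). `ctypes` integer
 types do no overflow checking, so constructing them reinterprets the bit
 pattern the way a C cast between same-width integer types would.

```python
import ctypes

BITS = 8 * ctypes.sizeof(ctypes.c_size_t)


def as_ssize_t(value):
    """Reinterpret a size_t bit pattern as signed, like a <Py_ssize_t> cast."""
    return ctypes.c_ssize_t(value).value


def as_size_t(value):
    """Reinterpret a Py_ssize_t bit pattern as unsigned, like a <size_t> cast."""
    return ctypes.c_size_t(value).value


# Hypothetical address with the top bit set.
address = 2**(BITS - 1) + 0x10

# Store the key as a signed quantity: it becomes negative ...
stored = as_ssize_t(address)

# ... and cast back to unsigned before computing the bucket index.
nbuckets = 31  # hypothetical table size
bucket_index = as_size_t(stored) % nbuckets
```

 Here `as_size_t(-1)` comes out as `256**sizeof(size_t) - 1`, and the round
 trip through the signed representation recovers the original address, so
 the bucket index is the same as for the unsigned key. The cast before the
 modulo matters in C, where `%` on a negative signed operand gives a
 non-positive remainder.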

 Advice welcome.

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/13387#comment:31>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB
