#13394: Write a WeakValueDictionary with safer key removal
-------------------------------------+-------------------------------------
       Reporter:  nbruin             |        Owner:  rlm
           Type:  enhancement        |       Status:  needs_review
       Priority:  major              |    Milestone:  sage-5.13
      Component:  memleak            |   Resolution:
       Keywords:                     |    Merged in:
        Authors:  Simon King, Nils   |    Reviewers:  Simon King
  Bruin                              |  Work issues:
Report Upstream:  None of the above  |       Commit:
  - read trac for reasoning.         |  851cc9522dde332561101f1c84182a0a84b8eed4
         Branch:                     |     Stopgaps:
  u/SimonKing/ticket/13394           |
   Dependencies:                     |
-------------------------------------+-------------------------------------

Comment (by nbruin):

 Replying to [comment:56 SimonKing]:

 > Some remarks/questions about your code:
 > - I wanted to call `PyWeakref_GetObject` with borrowed references, but
 it somehow didn't work. Why is it working in your code?

 Because in the cython pxd header files, they define all those routines
 with `<object>` type parameters. That tends to work nicer, because it
 means the routine can "borrow" the reference itself. In the cython file I
 referenced there is a warning about reference counting and a remark that
 you can change that to `<PyObject *>` if you know what you're doing, which
 I pretended here.

 As you are away, a type cast from `<PyObject *>` to `<object>` generates
 code: an incref.

 > - You still do `cdef PyObject* Py_None = <PyObject*>None`. Couldn't we
 import `Py_None` from somewhere? Unfortunately I couldn't find it,
 although `Py_None` is mentioned in the documentation of the C-API.

 It's in `object.h`. line 839 or so:
 {{{
 PyAPI_DATA(PyObject) _Py_NoneStruct; /* Don't use this directly */
 #define Py_None (&_Py_NoneStruct)
 }}}
 We can do "cdef extern PyObject * Py_None" without problem.

 > - In `del_dictitem_by_exact_value`, you ask `#perhaps we should exit
 silently if no entry is found?`. I agree.

 We've considered things like this when working on TripleDict and I don't
 think we ever found a smoking gun. It seems that Python tries to organize
 things such that callbacks don't have to run. Obviously when I was testing
 this routine I DID want to see the errors. I'm fine with silent, I would
 prefer to see a scenario actually play out where it is necessary.

 >   - `D.__delitem__(k)` proceeds until `(k,r)` is removed from the
 dictionary, but it does not return yet.
 >   - Just before `r` is freed inside of `__delitem__`, a garbage
 collection happens on `v`. Hence, the callback of `r` is executed.

 The only shot you have for this is the dereffing of `k`. If the value is
 freed before the key, this never happens. The deletion code in
 `delitem_..._exact` is copied from Python's `PyDict_DelItem` and there
 also first the value is dereffed before the key is. So this scenario
 doesn't happen.

 > - Why is `del_dictitem_by_exact_value` cpdef and not cdef? Does `<void
 *>value` cost a CPU cycle? If this is so, one shouldn't call it in a while
 loop, and instead have `cdef

 No, it's a cast from one pointer type to another, so is free. It was `def`
 before (the routine is in principle safe to call from python too, except
 that its semantics are a bit wonky), but I don't mind if it's something
 else.

 > - In `pop()`, you say `#we turn out into a new reference right away
 because...`. However, wouldn't the following save a CPU cycle in case of
 an error:
 >  {{{
 > cdef PyObject *out_ref = PyWeakref_GetObject(wr)
 > if out_ref==Py_None:
 >     raise KeyError(k)
 > out = <object>out_ref
 > del self[k]
 > return out
 >  }}}

 Yes, I figured the error raising and handling would swamp a "cycle"
 (whatever that is on a modern CPU) anyway, so I went for the shortest code
 and making sure that we don't waste an allocation in the most common code
 path (although a good compile should optimize that out).

 > - The last line of `__contains__` should be deleted, as it will never be
 executed (in my code, I have a while loop, and thus I need to have `return
 False` after the while loop.

 Ah right! I noted the "unreachable code" warning, but when you `load`
 something into sage, the reported line numbers are off, so I thought it
 was complaining about unreachable comments ...

 > I suggest that I'll merge your code into my branch, do the changes
 suggested above and create a new commit.

 cool.

--
Ticket URL: <http://trac.sagemath.org/ticket/13394#comment:59>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to