#13394: Write a WeakValueDictionary with safer key removal
-------------------------------------+-------------------------------------
Reporter: nbruin | Owner: rlm
Type: enhancement | Status: needs_review
Priority: major | Milestone: sage-5.13
Component: memleak | Resolution:
Keywords: | Merged in:
Authors: Simon King | Reviewers:
Report Upstream: None of the above - read trac for reasoning. | Work issues:
Branch: u/SimonKing/ticket/13394 | Commit: fab0ed4112b9f798e2690f4c885b57cd711ea698
Dependencies: | Stopgaps:
-------------------------------------+-------------------------------------
Comment (by SimonKing):
Replying to [comment:55 nbruin]:
> Even python's `dict` isn't quite properly guarded against mutating
> iteration: they only check that the size doesn't change from one yield to
> the next, but that isn't enough, of course:
Aha. That's safer in my implementation (I think): During iteration, I
protect against changing the length of a bucket, not just against changing
the length of the whole dict. Anyway, I do believe that it is enough to
have a weak value dictionary that is as robust as a plain dict---but we
don't need to be better than `<dict>`.
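To make the size check mentioned in the quote concrete, here is a minimal sketch in plain Python. It only shows the guard that a plain `dict` does have: inserting a key during iteration changes the length, and the next step of the iteration raises. The helper name `grow_during_iteration` is made up for this illustration. Mutations that keep the length constant (delete one key, insert another) pass this particular check; what happens then depends on the interpreter version, so it is not shown here.

```python
def grow_during_iteration():
    """Insert a new key while iterating; return the error message, if any."""
    d = {1: "a", 2: "b"}
    try:
        for key in d:
            # This changes len(d) between two yields of the iterator.
            d[key + 100] = "new"
    except RuntimeError as exc:
        return str(exc)
    return None


message = grow_during_iteration()
print(message)
```

In CPython the message reports that the dictionary changed size during iteration, which is exactly the length-based guard discussed above.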
Some remarks/questions about your code:
- I wanted to call `PyWeakref_GetObject` with borrowed references, but it
somehow didn't work. Why is it working in your code?
- You still do `cdef PyObject* Py_None = <PyObject*>None`. Couldn't we
import `Py_None` from somewhere? Unfortunately I couldn't find it,
although `Py_None` is mentioned in the documentation of the C-API.
- In `del_dictitem_by_exact_value`, you ask
`#perhaps we should exit silently if no entry is found?`. I agree. Namely,
you use this function during callback (that's the only place), and I think
we really don't want an error being raised there. Could it be that the
callback of a reference cannot find the item that contains the reference?
I think so, by a weird race condition! Namely:
  - Create an item `D[k] = v`, with weak reference `r` to `v`.
  - Delete `v`, but make sure that `v` is not garbage collected yet.
  - Do `del D[k]`.
Now, I believe the following could happen:
  - `D.__delitem__(k)` proceeds until `(k,r)` is removed from the
    dictionary, but it does not return yet.
  - Just before `r` is freed inside of `__delitem__`, a garbage collection
    happens on `v`. Hence, the callback of `r` is executed.
  - The callback finds that `(k,r)` is not in the dict and raises an error.
I don't know if this can really happen. But in any case, I guess silently
returning is what we want here.
- Why is `del_dictitem_by_exact_value` cpdef and not cdef? Does
`<void *>value` cost a CPU cycle? If this is so, one shouldn't call it in a
while loop, and instead have
`cdef del_dictitem_by_exact_value(dict D, PyObject *value_addr, long hash)`.
- In `pop()`, you say
`#we turn out into a new reference right away because...`. However, wouldn't
the following save a CPU cycle in case of an error:
{{{
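# Borrowed reference: no new strong reference is created yet
cdef PyObject *out_ref = PyWeakref_GetObject(wr)
if out_ref == Py_None:
    # The referent is already dead, so fail before taking a new reference
    raise KeyError(k)
out = <object>out_ref
del self[k]
return out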
}}}
- The last line of `__contains__` should be deleted, as it will never be
executed (in my code, I have a while loop, and thus I need to have
`return False` after the while loop).
- Thanks for spotting the race condition in `_IterationContext.__exit__`.

I suggest that I merge your code into my branch, make the changes
suggested above and create a new commit.
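The "exit silently" behaviour argued for in the callback remark can be sketched in pure Python. This is only an illustration under stated assumptions, not the Cython code of the branch: `TinyWeakValueDict`, `Value` and `_remove_exact` are hypothetical names, and the bucket-level details of the real implementation are replaced by an ordinary `dict`. The point is that the callback removes an entry only if it still holds exactly the dying reference, and otherwise returns without raising.

```python
import weakref


class Value:
    """Instances of an ordinary class can be weakly referenced."""


class TinyWeakValueDict:
    """Hypothetical stand-in for the ticket's dictionary, built on a dict."""

    def __init__(self):
        self._refs = {}  # key -> weakref.ref to the value

    def __setitem__(self, key, value):
        self_ref = weakref.ref(self)  # avoid a strong cycle via the callback

        def callback(ref):
            owner = self_ref()
            if owner is not None:
                owner._remove_exact(key, ref)

        self._refs[key] = weakref.ref(value, callback)

    def _remove_exact(self, key, ref):
        """Delete the entry only if it still holds exactly `ref`.

        Returns True if an entry was removed and False otherwise.
        A missing or replaced entry is not treated as an error.
        """
        if self._refs.get(key) is ref:
            del self._refs[key]
            return True
        return False

    def __delitem__(self, key):
        del self._refs[key]

    def __len__(self):
        return len(self._refs)


D = TinyWeakValueDict()
v = Value()
D["k"] = v
r = D._refs["k"]  # keep the weak reference alive, as in the scenario above
del D["k"]        # the entry is gone, but r still exists
found = D._remove_exact("k", r)  # what the callback of r would do
print(found)  # prints False
```

The identity test (`is ref`) also covers the case where the key has meanwhile been reassigned to a different value: a late callback for the old reference then leaves the new entry alone.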
--
Ticket URL: <http://trac.sagemath.org/ticket/13394#comment:56>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB