#15956: WeakValueDictionary does not handle unhashable keys correctly
-------------------------------------+-------------------------------------
       Reporter:  saraedum           |        Owner:
           Type:  defect             |       Status:  needs_review
       Priority:  minor              |    Milestone:  sage-6.2
      Component:  misc               |   Resolution:
       Keywords:  hash               |    Merged in:
        Authors:                     |    Reviewers:
Report Upstream:  N/A                |  Work issues:
         Branch:                     |       Commit:
  u/saraedum/ticket/15956            |  3a05a1aec6459e59f0b7069af4bc1100a71a9502
   Dependencies:                     |     Stopgaps:
-------------------------------------+-------------------------------------
Changes (by nbruin):

 * commit:   => 3a05a1aec6459e59f0b7069af4bc1100a71a9502


Comment:

 The proposed fix is very elegant and minimal, but it does noticeably
 affect a code path that deserves to be very fast: testing that a key does
 ''not'' occur in the dictionary. Timings:
 {{{
 %cpaste
 cython("""
 def test(N,D):
     cdef long n=N
     for i in range(n):
         1 in D
 """)
 --
 }}}
 {{{
 sage: D=sage.misc.weak_dict.WeakValueDictionary()
 sage: D[2]=ZZ
 sage: timeit("test(10^7,D)")
 5 loops, best of 3: 122 ms per loop
 //with the extra hash call:
 5 loops, best of 3: 151 ms per loop
 sage: D[1]=ZZ
 sage: timeit("test(10^7,D)")
 5 loops, best of 3: 466 ms per loop
 //this of course times the same with the extra hash call
 }}}
 Compared with a normal dictionary:
 {{{
 sage: N={}
 sage: N[2]=ZZ
 sage: timeit("test(10^7,N)")
 5 loops, best of 3: 90.6 ms per loop
 sage: N[1]=ZZ
 sage: timeit("test(10^7,N)")
 5 loops, best of 3: 377 ms per loop
 }}}
 The change from 122ms to 151ms is an increase of 25% in time taken (and
 this is for integers, which hash extremely quickly. I've also tried it
 with a string key '1', where I got 101ms vs. 127ms, so a penalty of around
 30ms/10^7 seems to be expected).

 Of course, it's nice if `sage.misc.weak_dict.WeakValueDictionary` is a
 perfect drop-in replacement for `weakref.WeakValueDictionary`, but having
 better performance at the expense of different edge-case error reporting
 (and it's just reporting "key not here" in cases where that's actually
 true, and where some other dictionaries raise "key not hashable", so it's
 hard to think of an edge case where that seriously affects legitimate
 program logic) shouldn't just be dismissed.

 We could of course patch python and backport `PyDict_GetItemWithError` and
 get a proper fix, without performance degradation.

 For the current proposal, I'd like to see some real-world motivation to
 bring the error reporting in line with other dictionaries before I'd
 consider significantly degrading performance.

 A further problem with the current proposal: it only fixes one symptom.
 The nastier example
 [http://hg.python.org/cpython/file/5cab0ada97b2/Objects/dictobject.c#l1046]
 (stack overflow during key comparison) isn't addressed by verifying that
 the key hashes. Going all the way with the `PyDict_GetItemWithError`
 solution (or equivalent) would also solve that.
 ----
 New commits:
 
||[http://git.sagemath.org/sage.git/commit/?id=3a05a1aec6459e59f0b7069af4bc1100a71a9502
 3a05a1a]||{{{Improved handling of unhashable keys in weak dictionary}}}||

--
Ticket URL: <http://trac.sagemath.org/ticket/15956#comment:5>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.

Reply via email to