#12290: Fix the hash of matrix spaces and improve its performance
------------------------------+---------------------------------------------
   Reporter:  SimonKing       |          Owner:  jason, was       
       Type:  defect          |         Status:  needs_review     
   Priority:  critical        |      Milestone:  sage-5.0         
  Component:  linear algebra  |       Keywords:  hash matrix space
Work_issues:                  |       Upstream:  N/A              
   Reviewer:                  |         Author:  Simon King       
     Merged:                  |   Dependencies:                   
------------------------------+---------------------------------------------
Changes (by newvalueoldvalue):

  * status:  new => needs_review
  * author:  => Simon King


Comment:

 With my patch, the `__hash__` method returns the hash of the same data
 that are used for comparison of matrix spaces. Hence, we have the new doc
 test
 {{{
 sage: M = MatrixSpace(ZZ, 10)
 sage: N = MatrixSpace(ZZ, 10,sparse=True)
 sage: M == N
 True
 sage: hash(M)==hash(N)
 True
 }}}

 Now about speed. If one computes the hash value over and over again by
 {{{
 #!python
     def __hash__(self):
         return hash((self.base(),self.__nrows, self.__ncols))
 }}}
 then the timing is
 {{{
 sage: timeit("hash(M)", number=10^6)
 1000000 loops, best of 3: 1.72 µs per loop
 }}}

 If the hash value is stored in a single underscore attribute, such as
 {{{
 #!python
     def __hash__(self):
         try:
             return self._hash
         except AttributeError:
             self._hash = h = hash((self.base(),self.__nrows,
 self.__ncols))
         return h
 }}}
 then one gets
 {{{
 sage: timeit("hash(M)", number=10^6)
 1000000 loops, best of 3: 801 ns per loop
 }}}

 With a double-underscore `__hash` instead of `_hash`, one has
 {{{
 sage: timeit("hash(M)", number=10^6)
 1000000 loops, best of 3: 712 ns per loop
 }}}

 With directly accessing the `__dict__` such as
 {{{
 #!python
     def __hash__(self):
         try:
             return self.__dict__['_hash']
         except KeyError:
             self.__dict__['_hash'] = h = hash((self.base(),self.__nrows,
 self.__ncols))
         return h
 }}}
 one has
 {{{
 sage: timeit("hash(M)", number=10^6)
 1000000 loops, best of 3: 611 ns per loop
 }}}
 and with the patch, one has
 {{{
 sage: timeit("hash(M)", number=10^6)
 1000000 loops, best of 3: 547 ns per loop
 }}}

 How is that possible? Apparently a "try-except" block has some overhead.
 Hence, simply returning a lazy attribute (which is solution of my patch)
 is fastest. Note that it is not possible to use @cached_method on the
 `__hash__` method.

 Needs review (although I still need to run full doctests)!

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/12290#comment:1>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en.

Reply via email to