#12290: Fix the hash of matrix spaces and improve its performance
------------------------------+---------------------------------------------
Reporter: SimonKing | Owner: jason, was
Type: defect | Status: needs_review
Priority: critical | Milestone: sage-5.0
Component: linear algebra | Keywords: hash matrix space
Work_issues: | Upstream: N/A
Reviewer: | Author: Simon King
Merged: | Dependencies:
------------------------------+---------------------------------------------
Changes (by newvalueoldvalue):
* status: new => needs_review
* author: => Simon King
Comment:
With my patch, the `__hash__` method returns the hash of the same data
that are used for comparison of matrix spaces. Hence, we have the new doc
test
{{{
sage: M = MatrixSpace(ZZ, 10)
sage: N = MatrixSpace(ZZ, 10,sparse=True)
sage: M == N
True
sage: hash(M)==hash(N)
True
}}}
Now about speed. If one computes the hash value over and over again by
{{{
#!python
def __hash__(self):
return hash((self.base(),self.__nrows, self.__ncols))
}}}
then the timing is
{{{
sage: timeit("hash(M)", number=10^6)
1000000 loops, best of 3: 1.72 µs per loop
}}}
If the hash value is stored in a single underscore attribute, such as
{{{
#!python
def __hash__(self):
try:
return self._hash
except AttributeError:
self._hash = h = hash((self.base(),self.__nrows,
self.__ncols))
return h
}}}
then one gets
{{{
sage: timeit("hash(M)", number=10^6)
1000000 loops, best of 3: 801 ns per loop
}}}
With a double-underscore `__hash` instead of `_hash`, one has
{{{
sage: timeit("hash(M)", number=10^6)
1000000 loops, best of 3: 712 ns per loop
}}}
With directly accessing the `__dict__` such as
{{{
#!python
def __hash__(self):
try:
return self.__dict__['_hash']
except KeyError:
self.__dict__['_hash'] = h = hash((self.base(),self.__nrows,
self.__ncols))
return h
}}}
one has
{{{
sage: timeit("hash(M)", number=10^6)
1000000 loops, best of 3: 611 ns per loop
}}}
and with the patch, one has
{{{
sage: timeit("hash(M)", number=10^6)
1000000 loops, best of 3: 547 ns per loop
}}}
How is that possible? Apparently a "try-except" block has some overhead.
Hence, simply returning a lazy attribute (which is solution of my patch)
is fastest. Note that it is not possible to use @cached_method on the
`__hash__` method.
Needs review (although I still need to run full doctests)!
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/12290#comment:1>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sage-trac?hl=en.