#11794: Optional cythonised cached hash for Python classes
---------------------------+------------------------------------------------
   Reporter:  SimonKing    |          Owner:  jason         
       Type:  enhancement  |         Status:  needs_review  
   Priority:  major        |      Milestone:  sage-4.7.2    
  Component:  misc         |       Keywords:                
Work_issues:               |       Upstream:  N/A           
   Reviewer:               |         Author:  Simon King    
     Merged:               |   Dependencies:  #11115, #11791
---------------------------+------------------------------------------------
Changes (by newvalueoldvalue):

  * status:  new => needs_review
  * dependencies:  => #11115, #11791
  * author:  => Simon King


Comment:

 The attached patch provides a new metaclass that overrides the hash method
 of an existing class with a fast cached method.

 '''1. __Providing existing classes with a fast hash__'''

 One way of using it is to take an existing class and apply the metaclass
 to it:
 {{{
 sage: P = FastHashMetaclass(Parent)()
 sage: P
 <class '__main__.Parent_with_hash'>
 sage: timeit("hash(P)")
 625 loops, best of 3: 464 ns per loop

 sage: C = FastHashMetaclass(QQ['x'].__class__)
 sage: C
 <class
 
'sage.rings.polynomial.polynomial_ring.PolynomialRing_field_with_category_with_hash'>
 sage: P = C(QQ,'x')
 sage: P
 Univariate Polynomial Ring in x over Rational Field
 sage: timeit("hash(P)")
 625 loops, best of 3: 458 ns per loop
 }}}

 '''2. __Class definition with metaclass__'''

 The second way of using it is to provide it as `__metaclass__` in a class
 definition. It allows to demonstrate (by printing status information) that
 the hash really is cached:
 {{{
 sage: class Bar:
 ....:     __metaclass__ = FastHashMetaclass
 ....:     def __init__(self,R):
 ....:         self.R = R
 ....:     def __repr__(self):
 ....:         return "[%s]"%self.R
 ....:     def __hash__(self):
 ....:         print "base hash"
 ....:         return hash(repr(self))
 ....:
 sage: P = Bar(9)
 sage: P
 [9]
 sage: hash(P)
 base hash
 8209412804338245734
 sage: hash(P)
 8209412804338245734
 sage: hash(repr(P))
 8209412804338245734
 sage: timeit("hash(P)")
 625 loops, best of 3: 467 ns per loop
 }}}

 '''3. __Dynamic metaclasses__'''

 Pickling of classes requires (as usual) that the class definition can be
 imported from a module. Thus, in the next example, I give the definition
 in interactive Cython. It demonstrates that the new metaclass can be
 combined with other metaclasses:
 {{{
 sage: cython_code = [
 ... 'from sage.misc.fast_hash import FastHashMetaclass',
 ... 'class Bar(UniqueRepresentation):',
 ... '    __metaclass__ = FastHashMetaclass',
 ... '    def __init__(self,R):',
 ... '        self.R = R',
 ... '    def __repr__(self):',
 ... '        return "[%s]"%self.R',
 ... '    def __hash__(self):',
 ... '        print "base hash"',
 ... '        return hash(repr(self))']
 sage: cython('\n'.join(cython_code))
 }}}

 The metaclass associated with `UniqueRepresentation` makes the Bar
 instances unique parents:
 {{{
 sage: b = Bar(5)
 sage: b is Bar(5)
 True
 sage: loads(dumps(b)) is b    # indirect doctest
 True
 }}}

 `WithFastHashMetaclass` additionally makes the hash cached and fast:
 {{{
 sage: hash(b)
 base hash
 8209412804334245746
 sage: hash(b)
 8209412804334245746
 sage: timeit("hash(b)")
 625 loops, best of 3: 475 ns per loop
 }}}

 Technical detail: In order to mix the metaclasses, I turn the metaclasses
 into dynamic classes. In other words, the metaclasses have a metaclass,
 themselves:
 {{{
 sage: type(b)
 <class '_mnt_local_king__sage_temp_mpc622_31222_tmp_0_spyx_0.Bar'>
 sage: type(type(b))
 <class 'sage.misc.fast_hash.FastHashAndClasscallMetaclass'>
 sage: type(type(b)).__bases__
 (<class 'sage.misc.fast_hash.FastHashMetaclass'>, <class
 'sage.misc.classcall_metaclass.ClasscallMetaclass'>)
 sage: type(type(type(b)))
 <class 'sage.structure.dynamic_class.DynamicMetaclass'>
 }}}
 I think that this approach could be scalable, and might be useful in other
 places.

 '''4. __Speeding up existing instances'''

 Another way of using it is to provide single instances of a class with the
 fast hash. This is only possible if the class is a "heap type":
 {{{
 sage: F = CombinatorialFreeModule(QQ, ['a','b'])
 sage: timeit("hash(F)")
 625 loops, best of 3: 1.09 µs per loop
 sage: F.__class__ = FastHashMetaclass(F.__class__)
 sage: timeit("hash(F)")
 625 loops, best of 3: 467 ns per loop
 sage: P.<x> = QQ[]
 sage: P.__class__ = FastHashMetaclass(P.__class__)
 sage: timeit("hash(P)")
 625 loops, best of 3: 483 ns per loop
 sage: x.__class__ = FastHashMetaclass(x.__class__)
 ---------------------------------------------------------------------------
 TypeError                                 Traceback (most recent call
 last)

 /mnt/local/king/SAGE/sandbox/sage-4.7.2.alpha2/devel/sage-main/<ipython
 console> in <module>()

 TypeError: __class__ assignment: only for heap types
 }}}

 '''5. __The price to pay__'''

 There is no free lunch. Thus, one might expect that the fast hash comes
 with a price to pay. Unfortunately, this is indeed the case:
 {{{
 sage: F = CombinatorialFreeModule(QQ, ['a','b'])
 sage: D = dir(F)
 sage: timeit('for s in D: a = getattr(F,s)', number=10000)
 10000 loops, best of 3: 84.9 µs per loop
 sage: from sage.misc.fast_hash import FastHashMetaclass
 sage: F.__class__ = FastHashMetaclass(F.__class__)
 sage: timeit('for s in D: a = getattr(F,s)', number=10000)
 10000 loops, best of 3: 96.2 µs per loop
 sage: (96.2-84.9)/84.9
 0.133097762073027
 }}}

 Hence, I am not suggesting to use the new metaclass everywhere. But when a
 fast hash is needed, it may be a useful option. Note that the regression
 in the previous example is temporary:
 {{{
 sage: F.__class__
 <class
 'sage.combinat.free_module.CombinatorialFreeModule_with_category_with_hash'>
 sage: F.__class__.__base__
 <class 'sage.combinat.free_module.CombinatorialFreeModule_with_category'>
 sage: F.__class__ = F.__class__.__base__
 sage: timeit('for s in D: a = getattr(F,s)', number=10000)
 10000 loops, best of 3: 84.6 µs per loop
 }}}

 Hence, it could even be an option to temporarily get a fast hash and later
 bring all other methods back to speed.

 '''__Dependencies__'''

 #11115 makes cached functions picklable, and that is what we need here.

 #11791 provides introspection for classes with dynamic metaclasses. As we
 have seen above, we ''have'' dynamic metaclasses.

 I'll try to milden the speed regression exposed above. But I think it is
 OK to make it "needs review".

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/11794#comment:1>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en.

Reply via email to