#11115: Rewrite cached_method in Cython
---------------------------+------------------------------------------------
   Reporter:  SimonKing    |          Owner:  jason                
       Type:  enhancement  |         Status:  needs_review         
   Priority:  major        |      Milestone:  sage-4.7             
  Component:  misc         |       Keywords:  category cython cache
Work_issues:               |       Upstream:  N/A                  
   Reviewer:               |         Author:  Simon King           
     Merged:               |   Dependencies:  #9976                
---------------------------+------------------------------------------------
Changes (by SimonKing):

  * status:  needs_work => needs_review
  * dependencies:  => #9976


Old description:

> Broadening the original description of the ticket:
>
> The aim is to provide a Cython version of the cached_method decorator.
> There are three benefits:
>
> 1)
> Speed (see timings in the comments)
>
> 2)
> Using cached methods in Cython code.
>
> 3) Parent and element methods of categories
>
> Let me elaborate on the last point:
>
> In order to make full use of the parent and element methods of a
> category, it should be possible to define a ''cached'' method in the
> category, so that any object of that category inherits it ''with
> caching''.
>
> Currently, it fails as follows:
> {{{
> sage: class MyCategory(Category):
> ....:     def super_categories(self):
> ....:         return [Objects()]
> ....:     class ParentMethods:
> ....:         @cached_method
> ....:         def f(self, x):
> ....:             return x^2
> ....:
> sage: cython("""from sage.structure.parent cimport Parent
> ....: cdef class MyClass(Parent): pass
> ....: """)
> sage: O = MyClass(category=MyCategory())
> sage: O.f(4)
> 16
> sage: O.f(x) is O.f(x)
> False
> }}}
>
> So, the method is inherited, but not cached.
>

> Depends on #9976

New description:

 Broadening the original description of the ticket:

 The aim is to provide a Cython version of the cached_method decorator.
 There are three benefits:

 1)
 Speed (see timings in the comments)

 2)
 Using cached methods in Cython code.

 3) Parent and element methods of categories

 Let me elaborate on the last point:

 In order to make full use of the parent and element methods of a category,
 it should be possible to define a ''cached'' method in the category, so
 that any object of that category inherits it ''with caching''.

 Currently, it fails as follows:
 {{{
 sage: class MyCategory(Category):
 ....:     def super_categories(self):
 ....:         return [Objects()]
 ....:     class ParentMethods:
 ....:         @cached_method
 ....:         def f(self, x):
 ....:             return x^2
 ....:
 sage: cython("""from sage.structure.parent cimport Parent
 ....: cdef class MyClass(Parent): pass
 ....: """)
 sage: O = MyClass(category=MyCategory())
 sage: O.f(4)
 16
 sage: O.f(x) is O.f(x)
 False
 }}}

 So, the method is inherited, but not cached.


 Depends on #9976

 Apply trac11115-cached_cython.patch

--

Comment:

 I have replaced the previous patches by a single patch. I am sorry in
 advance for the long post below.

 It was indicated in previous posts that there was a certain danger of
 slowing down object creation. Also, one may think of startuptime. It seems
 to me that it is now solved.

 I announced that I would like to use the new `__cached_methods` attribute
 of parents to store attributes that are obtained by slow attribute lookup
 in the category framework, and to make lazy attributes work on all
 parents.

 Here are examples. I compare sage-4.7.alpha5 plus #9976 (referred to by
 "without patch") with sage-4.7.alpha5 plus #9976 plus the patch from here
 (referred to by "with patch").

 __1. Startuptime__
 With patch:
 {{{
 ...
 == Slowest (including children) ==
 1.155 sage.all (None)
 0.275 sage.schemes.all (sage.all)
 0.232 sage.misc.all (sage.all)
 0.158 elliptic_curves.all (sage.schemes.all)
 0.155 ell_rational_field (elliptic_curves.all)
 ...
 }}}
 Without patch:
 {{{
 ...
 == Slowest (including children) ==
 1.229 sage.all (None)
 0.278 sage.schemes.all (sage.all)
 0.241 sage.misc.all (sage.all)
 0.161 elliptic_curves.all (sage.schemes.all)
 0.159 ell_rational_field (elliptic_curves.all)
 ...
 }}}
 => no slow down.

 __2. Element creation__

 This is with patch. I indicate the corresponding timings without patch in
 the comments:
 {{{
 sage: a = get_memory_usage()
 sage: K = GF(2)
 sage: %time L = [K(i) for i in xrange(10^7)]
 CPU times: user 25.23 s, sys: 0.02 s, total: 25.25 s
 Wall time: 25.26 s  # without patch: 25.52 s
 sage: get_memory_usage() - a
 78.49609375   # same as without patch
 sage: a = get_memory_usage()
 sage: K = GF(101)
 sage: %time L = [K(i) for i in xrange(10^7)]
 CPU times: user 25.13 s, sys: 0.03 s, total: 25.16 s
 Wall time: 25.16 s  # without patch: 26.20 s
 sage: get_memory_usage() - a
 0.0
 sage: a = get_memory_usage()
 sage: K = GF(next_prime(10^6))
 sage: %time L = [K(i) for i in xrange(10^7)]
 CPU times: user 87.40 s, sys: 0.24 s, total: 87.63 s
 Wall time: 87.64 s   # without patch: 88.87 s
 sage: get_memory_usage() - a
 794.94140625   # without patch: 794.94921875
 }}}
 => no slow down.

 __3. Cached methods in Python code__

 Here, I consider existing code, namely `gens()` and `groebner_basis()` for
 multivariate polynomial ideals. Again, the timings are with patch, the old
 timings are in comments
 {{{
 sage: P.<x,y> = QQ[]
 sage: I = P*[x,y]
 sage: timeit('I.gens()',number=10^7)
 10000000 loops, best of 3: 142 ns per loop
 # Without patch:
 # 10000000 loops, best of 3: 703 ns per loop
 sage: timeit('I.groebner_basis()',number=10^7)
 10000000 loops, best of 3: 249 ns per loop
 # Without patch:
 # 10000000 loops, best of 3: 746 ns per loop
 }}}

 __4. Accessing parent attributes that are inherited from the category, if
 the element class of the category does not appear in the method resolution
 order__

 It may happen that a parent is not an instance of the parent class of its
 category. That can be for two reasons: Either the category is not properly
 initialised (it is partially taken care of in #9944 and #9138), or it is a
 Cython class (if I remember correctly, they don't play well with dynamic
 classes). Then, attribute lookup must go beyond the usual method
 resolution order, and that's very slow. So, I suggest to use a shortpath
 in that case:

 {{{
 sage: P.<x,y> = QQ[]
 sage: P._is_category_initialized()
 False
 sage: P._init_category_(Algebras(QQ))
 sage: P.sum
 <bound method Algebras.parent_class.sum of Multivariate Polynomial Ring in
 x, y over Rational Field>
 sage: P.sum.__module__
 'sage.categories.commutative_additive_monoids'
 sage: timeit('P.sum', number=10^6)
 1000000 loops, best of 3: 755 ns per loop
 # Without patch
 # 1000000 loops, best of 3: 3.76 µs per loop
 sage: P.__cached_methods['sum']
 <bound method Algebras.parent_class.sum of Multivariate Polynomial Ring in
 x, y over Rational Field>
 }}}

 __5. Cython: Lazy attributes and cached methods inherit from the
 category__

 The following would not work at all without the patch.

 We define a category (written in Cython) with some cached methods.
 {{{
 sage: cython_code = ["from sage.all import cached_method,
 cached_in_parent_method, Category",
 ... "class MyCategory(Category):",
 ... "    @cached_method",
 ... "    def super_categories(self):",
 ... "        return [Objects()]",
 ... "    class ElementMethods:",
 ... "        @cached_method",
 ... "        def element_cache_test(self):",
 ... "            return -self",
 ... "        @cached_in_parent_method",
 ... "        def element_via_parent_test(self):",
 ... "            return -self",
 ... "    class ParentMethods:",
 ... "        @cached_method",
 ... "        def one(self):",
 ... "            return self.element_class(self,1)",
 ... "        @cached_method",
 ... "        def invert(self, x):",
 ... "            return -x"]
 sage: cython('\n'.join(cython_code))
 }}}

 ''Case 1''

 Define elements and parents as Python classes (but in Cython code), so
 that attribute assignment is possible. That's the easy case.
 {{{
 sage: cython_code = ["from sage.structure.element cimport Element",
 ... "class MyElement(Element):",
 ... "    def __init__(self,P,x):",
 ... "        self.x=x",
 ... "        Element.__init__(self,P)",
 ... "    def __neg__(self):",
 ... "        return MyElement(self.parent(),-self.x)",
 ... "    def _repr_(self):",
 ... "        return '<%s>'%self.x",
 ... "    def __hash__(self):",
 ... "        return hash(self.x)",
 ... "    def raw_test(self):",
 ... "        return -self",
 ... "from sage.structure.parent cimport Parent",
 ... "class MyParent(Parent):",
 ... "    Element = MyElement"]
 sage: cython('\n'.join(cython_code))
 sage: C = MyCategory()
 sage: P = MyParent(category=C)
 sage: e = MyElement(P,5)
 }}}

 Cached method for the parent:
 {{{
 # Cache without arguments
 sage: P.one()
 <1>
 sage: P.one() is P.one()
 True
 sage: timeit('P.one()', number=10^6)
 1000000 loops, best of 3: 210 ns per loop
 # Cache with arguments
 sage: P.invert(e)
 <-5>
 sage: P.invert(e) is P.invert(e)
 True
 sage: timeit('P.invert(e)', number=10^6)
 1000000 loops, best of 3: 815 ns per loop
 }}}

 Cached methods for elements (one with cache in the element, the other with
 cache in the parent):
 {{{
 # Cache without arguments
 sage: e.element_cache_test()
 <-5>
 sage: e.element_cache_test() is e.element_cache_test()
 True
 sage: timeit('e.element_cache_test()', number=10^6)
 1000000 loops, best of 3: 173 ns per loop

 # Cached in parent
 sage: e.element_via_parent_test()
 <-5>
 sage: e.element_via_parent_test() is e.element_via_parent_test()
 True
 sage: timeit('e.element_via_parent_test()', number=10^6)
 1000000 loops, best of 3: 574 ns per loop

 # Comparison with element creation:
 sage: e.raw_test()
 <-5>
 sage: e.raw_test() is e.raw_test()
 False
 sage: timeit('e.raw_test()', number=10^6)
 1000000 loops, best of 3: 1.57 µs per loop
 }}}

 ''Case 2''

 We use Cython classes for which attribute assignment is impossible. This
 is no problem at all, for the parent class. For the element class, we need
 to provide an additional public attribute `__cached_methods`, or things
 partially break:
 {{{
 sage: cython_code = ["from sage.structure.element cimport Element",
 ... "cdef class MyBrokenElement(Element):",
 ... "    cdef object x",
 ... "    def __init__(self,P,x):",
 ... "        self.x=x",
 ... "        Element.__init__(self,P)",
 ... "    def __neg__(self):",
 ... "        return MyElement(self.parent(),-self.x)",
 ... "    def _repr_(self):",
 ... "        return '<%s>'%self.x",
 ... "    def __hash__(self):",
 ... "        return hash(self.x)",
 ... "    def raw_test(self):",
 ... "        return -self",
 ... "cdef class MyElement(Element):",
 ... "    cdef public dict __cached_methods",
 ... "    cdef object x",
 ... "    def __init__(self,P,x):",
 ... "        self.x=x",
 ... "        Element.__init__(self,P)",
 ... "    def __neg__(self):",
 ... "        return MyElement(self.parent(),-self.x)",
 ... "    def _repr_(self):",
 ... "        return '<%s>'%self.x",
 ... "    def __hash__(self):",
 ... "        return hash(self.x)",
 ... "    def raw_test(self):",
 ... "        return -self",
 ... "from sage.structure.parent cimport Parent",
 ... "cdef class MyParent(Parent):",
 ... "    Element = MyElement"]
 sage: cython('\n'.join(cython_code))
 sage: C = MyCategory()
 sage: P = MyParent(category=C)
 sage: ebroken = MyBrokenElement(P,5)
 sage: e = MyElement(P,5)
 }}}

 Every parent should have a '''lazy attribute''' `element_class` -- but so
 far, it used to fail on extension classes. That problem is solved by the
 patch:
 {{{
 sage: P.element_class
 <type '_mnt_local_king__sage_temp_mpc622_29604_tmp_2_spyx_0.MyElement'>
 }}}

 If attribute assignment is impossible then the cache (for parents) still
 works, but gets a bit slower. Without the patch, the cache would break,
 and even a working cache would be slower.
 {{{
 sage: P.one()
 <1>
 sage: P.one() is P.one()
 True
 sage: timeit('P.one()', number=10^6)
 1000000 loops, best of 3: 685 ns per loop
 # Cache with arguments
 sage: P.invert(e)
 <-5>
 sage: P.invert(e) is P.invert(e)
 True
 sage: timeit('P.invert(e)', number=10^6)
 1000000 loops, best of 3: 1.06 µs per loop
 }}}

 We now consider the broken cache for elements. It turns out that the
 cached method for the "broken" element class can still be called, but is
 not cached. However, the cached-in-parent method ''is'' cached, but is
 terribly slow:
 {{{
 sage: ebroken.element_cache_test()
 <-5>
 # This is broken:
 sage: ebroken.element_cache_test() is ebroken.element_cache_test()
 False
 sage: ebroken.element_via_parent_test()
 <-5>
 # This is not broken:
 sage: ebroken.element_via_parent_test() is
 ebroken.element_via_parent_test()
 True
 # But it is very very slow:
 sage: timeit('ebroken.element_via_parent_test()')
 625 loops, best of 3: 31.2 µs per loop
 }}}

 However, the simple addition of a public dict `__cached_methods` make the
 cache work nicely (even though attribute assignment is still faster):
 {{{
 sage: e.element_cache_test()
 <-5>
 sage: e.element_cache_test() is e.element_cache_test()
 True
 sage: timeit('e.element_cache_test()', number=10^6)
 1000000 loops, best of 3: 921 ns per loop
 # Cached in parent
 sage: e.element_via_parent_test()
 <-5>
 sage: e.element_via_parent_test() is e.element_via_parent_test()
 True
 sage: timeit('e.element_via_parent_test()', number=10^6)
 1000000 loops, best of 3: 1.13 µs per loop
 # Comparison with element creation
 sage: timeit('e.raw_test()', number=10^6)
 1000000 loops, best of 3: 696 ns per loop
 }}}

 __6. Clear cache on pickle__

 There was a special class `ClearCacheOnPickle` in the cachfunc module.
 However, it was not sufficiently documented (and not provided with tests),
 and in fact it was broken. I made it work, and this is one is taken from
 the doc strings:
 {{{
     In the following example, we create a Python class that inherits
     from multivariate polynomial ideals, but does not pickle cached
     values.  We provide the definition in Cython, however, since
     interactive Cython definitions provide introspection by trac
     ticket #9976, whereas Python definitions don't.
     ::

         sage: P.<a,b,c,d> = QQ[]
         sage: I = P*[a,b]
         sage: classdef = ['from sage.misc.cachefunc import
 ClearCacheOnPickle',
         ...    'from sage.all import QQ',
         ...    'P = QQ["a","b","c","d"]; I = P*[P.gen(0),P.gen(1)]',
         ...    'class MyClass(ClearCacheOnPickle,I.__class__):',
         ...    '    def __init__(self,ring,gens):',
         ...    '        I.__class__.__init__(self,ring,gens)',
         ...    '    def __getnewargs__(self):',
         ...    '        return
 (self._Ideal_generic__ring,self._Ideal_generic__gens)']
         sage: cython('\n'.join(classdef))

     We destroy the cache of two methods of ``I`` on purpose
     (demonstrating that the two different implementations of cached
     methods are correctly dealt with).  Pickling ``I`` preserves the
     cache::

         sage: I.gens.set_cache('bar')
         sage: I.groebner_basis.set_cache('foo',algorithm='singular')
         sage: J = loads(dumps(I))
         sage: J.gens()
         'bar'
         sage: J.groebner_basis(algorithm='singular')
         'foo'

     However, if we have an ideal that additionally descends from
     :class:`ClearCacheOnPickle`, the carefully corrupted cache is not
     pickled::

         sage: A = MyClass(P,[a,b])
         sage: A
         Ideal (a, b) of Multivariate Polynomial Ring in a, b, c, d over
 Rational Field
         sage: A.gens.set_cache('foo')
         sage: A.groebner_basis.set_cache('bar',algorithm='singular')
         sage: A.gens()
         'foo'
         sage: A.groebner_basis(algorithm='singular')
         'bar'
         sage: B = loads(dumps(A))
         sage: B.gens()
         [a, b]
         sage: B.groebner_basis(algorithm='singular')
         [a, b]
         sage: A.gens()
         'foo'
 }}}

 Doctests pass for me. So, I guess it is in need of review, again!

 For the patchbot:

 Depends on #9976

 Apply trac11115-cached_cython.patch

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/11115#comment:41>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en.

Reply via email to