#11115: Rewrite cached_method in Cython
---------------------------+------------------------------------------------
Reporter: SimonKing | Owner: jason
Type: enhancement | Status: needs_review
Priority: major | Milestone: sage-4.7
Component: misc | Keywords: category cython cache
Work_issues: | Upstream: N/A
Reviewer: | Author: Simon King
Merged: | Dependencies: #9976
---------------------------+------------------------------------------------
Changes (by SimonKing):
* status: needs_work => needs_review
* dependencies: => #9976
Old description:
> Broadening the original description of the ticket:
>
> The aim is to provide a Cython version of the cached_method decorator.
> There are three benefits:
>
> 1)
> Speed (see timings in the comments)
>
> 2)
> Using cached methods in Cython code.
>
> 3) Parent and element methods of categories
>
> Let me elaborate on the last point:
>
> In order to make full use of the parent and element methods of a
> category, it should be possible to define a ''cached'' method in the
> category, so that any object of that category inherits it ''with
> caching''.
>
> Currently, it fails as follows:
> {{{
> sage: class MyCategory(Category):
> ....: def super_categories(self):
> ....: return [Objects()]
> ....: class ParentMethods:
> ....: @cached_method
> ....: def f(self, x):
> ....: return x^2
> ....:
> sage: cython("""from sage.structure.parent cimport Parent
> ....: cdef class MyClass(Parent): pass
> ....: """)
> sage: O = MyClass(category=MyCategory())
> sage: O.f(4)
> 16
> sage: O.f(x) is O.f(x)
> False
> }}}
>
> So, the method is inherited, but not cached.
>
> Depends on #9976
New description:
Broadening the original description of the ticket:
The aim is to provide a Cython version of the cached_method decorator.
There are three benefits:
1)
Speed (see timings in the comments)
2)
Using cached methods in Cython code.
3) Parent and element methods of categories
Let me elaborate on the last point:
In order to make full use of the parent and element methods of a category,
it should be possible to define a ''cached'' method in the category, so
that any object of that category inherits it ''with caching''.
Currently, it fails as follows:
{{{
sage: class MyCategory(Category):
....: def super_categories(self):
....: return [Objects()]
....: class ParentMethods:
....: @cached_method
....: def f(self, x):
....: return x^2
....:
sage: cython("""from sage.structure.parent cimport Parent
....: cdef class MyClass(Parent): pass
....: """)
sage: O = MyClass(category=MyCategory())
sage: O.f(4)
16
sage: O.f(x) is O.f(x)
False
}}}
So, the method is inherited, but not cached.
Depends on #9976
Apply trac11115-cached_cython.patch
--
Comment:
I have replaced the previous patches by a single patch. I am sorry in
advance for the long post below.
It was indicated in previous posts that there was a certain danger of
slowing down object creation. Also, one may think of startuptime. It seems
to me that it is now solved.
I announced that I would like to use the new `__cached_methods` attribute
of parents to store attributes that are obtained by slow attribute lookup
in the category framework, and to make lazy attributes work on all
parents.
Here are examples. I compare sage-4.7.alpha5 plus #9976 (referred to by
"without patch") with sage-4.7.alpha5 plus #9976 plus the patch from here
(referred to by "with patch").
__1. Startuptime__
With patch:
{{{
...
== Slowest (including children) ==
1.155 sage.all (None)
0.275 sage.schemes.all (sage.all)
0.232 sage.misc.all (sage.all)
0.158 elliptic_curves.all (sage.schemes.all)
0.155 ell_rational_field (elliptic_curves.all)
...
}}}
Without patch:
{{{
...
== Slowest (including children) ==
1.229 sage.all (None)
0.278 sage.schemes.all (sage.all)
0.241 sage.misc.all (sage.all)
0.161 elliptic_curves.all (sage.schemes.all)
0.159 ell_rational_field (elliptic_curves.all)
...
}}}
=> no slow down.
__2. Element creation__
This is with patch. I indicate the corresponding timings without patch in
the comments:
{{{
sage: a = get_memory_usage()
sage: K = GF(2)
sage: %time L = [K(i) for i in xrange(10^7)]
CPU times: user 25.23 s, sys: 0.02 s, total: 25.25 s
Wall time: 25.26 s # without patch: 25.52 s
sage: get_memory_usage() - a
78.49609375 # same as without patch
sage: a = get_memory_usage()
sage: K = GF(101)
sage: %time L = [K(i) for i in xrange(10^7)]
CPU times: user 25.13 s, sys: 0.03 s, total: 25.16 s
Wall time: 25.16 s # without patch: 26.20 s
sage: get_memory_usage() - a
0.0
sage: a = get_memory_usage()
sage: K = GF(next_prime(10^6))
sage: %time L = [K(i) for i in xrange(10^7)]
CPU times: user 87.40 s, sys: 0.24 s, total: 87.63 s
Wall time: 87.64 s # without patch: 88.87 s
sage: get_memory_usage() - a
794.94140625 # without patch: 794.94921875
}}}
=> no slow down.
__3. Cached methods in Python code__
Here, I consider existing code, namely `gens()` and `groebner_basis()` for
multivariate polynomial ideals. Again, the timings are with patch, the old
timings are in comments
{{{
sage: P.<x,y> = QQ[]
sage: I = P*[x,y]
sage: timeit('I.gens()',number=10^7)
10000000 loops, best of 3: 142 ns per loop
# Without patch:
# 10000000 loops, best of 3: 703 ns per loop
sage: timeit('I.groebner_basis()',number=10^7)
10000000 loops, best of 3: 249 ns per loop
# Without patch:
# 10000000 loops, best of 3: 746 ns per loop
}}}
__4. Accessing parent attributes that are inherited from the category, if
the element class of the category does not appear in the method resolution
order__
It may happen that a parent is not an instance of the parent class of its
category. That can be for two reasons: Either the category is not properly
initialised (it is partially taken care of in #9944 and #9138), or it is a
Cython class (if I remember correctly, they don't play well with dynamic
classes). Then, attribute lookup must go beyond the usual method
resolution order, and that's very slow. So, I suggest to use a shortpath
in that case:
{{{
sage: P.<x,y> = QQ[]
sage: P._is_category_initialized()
False
sage: P._init_category_(Algebras(QQ))
sage: P.sum
<bound method Algebras.parent_class.sum of Multivariate Polynomial Ring in
x, y over Rational Field>
sage: P.sum.__module__
'sage.categories.commutative_additive_monoids'
sage: timeit('P.sum', number=10^6)
1000000 loops, best of 3: 755 ns per loop
# Without patch
# 1000000 loops, best of 3: 3.76 µs per loop
sage: P.__cached_methods['sum']
<bound method Algebras.parent_class.sum of Multivariate Polynomial Ring in
x, y over Rational Field>
}}}
__5. Cython: Lazy attributes and cached methods inherit from the
category__
The following would not work at all without the patch.
We define a category (written in Cython) with some cached methods.
{{{
sage: cython_code = ["from sage.all import cached_method,
cached_in_parent_method, Category",
... "class MyCategory(Category):",
... " @cached_method",
... " def super_categories(self):",
... " return [Objects()]",
... " class ElementMethods:",
... " @cached_method",
... " def element_cache_test(self):",
... " return -self",
... " @cached_in_parent_method",
... " def element_via_parent_test(self):",
... " return -self",
... " class ParentMethods:",
... " @cached_method",
... " def one(self):",
... " return self.element_class(self,1)",
... " @cached_method",
... " def invert(self, x):",
... " return -x"]
sage: cython('\n'.join(cython_code))
}}}
''Case 1''
Define elements and parents as Python classes (but in Cython code), so
that attribute assignment is possible. That's the easy case.
{{{
sage: cython_code = ["from sage.structure.element cimport Element",
... "class MyElement(Element):",
... " def __init__(self,P,x):",
... " self.x=x",
... " Element.__init__(self,P)",
... " def __neg__(self):",
... " return MyElement(self.parent(),-self.x)",
... " def _repr_(self):",
... " return '<%s>'%self.x",
... " def __hash__(self):",
... " return hash(self.x)",
... " def raw_test(self):",
... " return -self",
... "from sage.structure.parent cimport Parent",
... "class MyParent(Parent):",
... " Element = MyElement"]
sage: cython('\n'.join(cython_code))
sage: C = MyCategory()
sage: P = MyParent(category=C)
sage: e = MyElement(P,5)
}}}
Cached method for the parent:
{{{
# Cache without arguments
sage: P.one()
<1>
sage: P.one() is P.one()
True
sage: timeit('P.one()', number=10^6)
1000000 loops, best of 3: 210 ns per loop
# Cache with arguments
sage: P.invert(e)
<-5>
sage: P.invert(e) is P.invert(e)
True
sage: timeit('P.invert(e)', number=10^6)
1000000 loops, best of 3: 815 ns per loop
}}}
Cached methods for elements (one with cache in the element, the other with
cache in the parent):
{{{
# Cache without arguments
sage: e.element_cache_test()
<-5>
sage: e.element_cache_test() is e.element_cache_test()
True
sage: timeit('e.element_cache_test()', number=10^6)
1000000 loops, best of 3: 173 ns per loop
# Cached in parent
sage: e.element_via_parent_test()
<-5>
sage: e.element_via_parent_test() is e.element_via_parent_test()
True
sage: timeit('e.element_via_parent_test()', number=10^6)
1000000 loops, best of 3: 574 ns per loop
# Comparison with element creation:
sage: e.raw_test()
<-5>
sage: e.raw_test() is e.raw_test()
False
sage: timeit('e.raw_test()', number=10^6)
1000000 loops, best of 3: 1.57 µs per loop
}}}
''Case 2''
We use Cython classes for which attribute assignment is impossible. This
is no problem at all, for the parent class. For the element class, we need
to provide an additional public attribute `__cached_methods`, or things
partially break:
{{{
sage: cython_code = ["from sage.structure.element cimport Element",
... "cdef class MyBrokenElement(Element):",
... " cdef object x",
... " def __init__(self,P,x):",
... " self.x=x",
... " Element.__init__(self,P)",
... " def __neg__(self):",
... " return MyElement(self.parent(),-self.x)",
... " def _repr_(self):",
... " return '<%s>'%self.x",
... " def __hash__(self):",
... " return hash(self.x)",
... " def raw_test(self):",
... " return -self",
... "cdef class MyElement(Element):",
... " cdef public dict __cached_methods",
... " cdef object x",
... " def __init__(self,P,x):",
... " self.x=x",
... " Element.__init__(self,P)",
... " def __neg__(self):",
... " return MyElement(self.parent(),-self.x)",
... " def _repr_(self):",
... " return '<%s>'%self.x",
... " def __hash__(self):",
... " return hash(self.x)",
... " def raw_test(self):",
... " return -self",
... "from sage.structure.parent cimport Parent",
... "cdef class MyParent(Parent):",
... " Element = MyElement"]
sage: cython('\n'.join(cython_code))
sage: C = MyCategory()
sage: P = MyParent(category=C)
sage: ebroken = MyBrokenElement(P,5)
sage: e = MyElement(P,5)
}}}
Every parent should have a '''lazy attribute''' `element_class` -- but so
far, it used to fail on extension classes. That problem is solved by the
patch:
{{{
sage: P.element_class
<type '_mnt_local_king__sage_temp_mpc622_29604_tmp_2_spyx_0.MyElement'>
}}}
If attribute assignment is impossible then the cache (for parents) still
works, but gets a bit slower. Without the patch, the cache would break,
and even a working cache would be slower.
{{{
sage: P.one()
<1>
sage: P.one() is P.one()
True
sage: timeit('P.one()', number=10^6)
1000000 loops, best of 3: 685 ns per loop
# Cache with arguments
sage: P.invert(e)
<-5>
sage: P.invert(e) is P.invert(e)
True
sage: timeit('P.invert(e)', number=10^6)
1000000 loops, best of 3: 1.06 µs per loop
}}}
We now consider the broken cache for elements. It turns out that the
cached method for the "broken" element class can still be called, but is
not cached. However, the cached-in-parent method ''is'' cached, but is
terribly slow:
{{{
sage: ebroken.element_cache_test()
<-5>
# This is broken:
sage: ebroken.element_cache_test() is ebroken.element_cache_test()
False
sage: ebroken.element_via_parent_test()
<-5>
# This is not broken:
sage: ebroken.element_via_parent_test() is
ebroken.element_via_parent_test()
True
# But it is very very slow:
sage: timeit('ebroken.element_via_parent_test()')
625 loops, best of 3: 31.2 µs per loop
}}}
However, the simple addition of a public dict `__cached_methods` make the
cache work nicely (even though attribute assignment is still faster):
{{{
sage: e.element_cache_test()
<-5>
sage: e.element_cache_test() is e.element_cache_test()
True
sage: timeit('e.element_cache_test()', number=10^6)
1000000 loops, best of 3: 921 ns per loop
# Cached in parent
sage: e.element_via_parent_test()
<-5>
sage: e.element_via_parent_test() is e.element_via_parent_test()
True
sage: timeit('e.element_via_parent_test()', number=10^6)
1000000 loops, best of 3: 1.13 µs per loop
# Comparison with element creation
sage: timeit('e.raw_test()', number=10^6)
1000000 loops, best of 3: 696 ns per loop
}}}
__6. Clear cache on pickle__
There was a special class `ClearCacheOnPickle` in the cachfunc module.
However, it was not sufficiently documented (and not provided with tests),
and in fact it was broken. I made it work, and this is one is taken from
the doc strings:
{{{
In the following example, we create a Python class that inherits
from multivariate polynomial ideals, but does not pickle cached
values. We provide the definition in Cython, however, since
interactive Cython definitions provide introspection by trac
ticket #9976, whereas Python definitions don't.
::
sage: P.<a,b,c,d> = QQ[]
sage: I = P*[a,b]
sage: classdef = ['from sage.misc.cachefunc import
ClearCacheOnPickle',
... 'from sage.all import QQ',
... 'P = QQ["a","b","c","d"]; I = P*[P.gen(0),P.gen(1)]',
... 'class MyClass(ClearCacheOnPickle,I.__class__):',
... ' def __init__(self,ring,gens):',
... ' I.__class__.__init__(self,ring,gens)',
... ' def __getnewargs__(self):',
... ' return
(self._Ideal_generic__ring,self._Ideal_generic__gens)']
sage: cython('\n'.join(classdef))
We destroy the cache of two methods of ``I`` on purpose
(demonstrating that the two different implementations of cached
methods are correctly dealt with). Pickling ``I`` preserves the
cache::
sage: I.gens.set_cache('bar')
sage: I.groebner_basis.set_cache('foo',algorithm='singular')
sage: J = loads(dumps(I))
sage: J.gens()
'bar'
sage: J.groebner_basis(algorithm='singular')
'foo'
However, if we have an ideal that additionally descends from
:class:`ClearCacheOnPickle`, the carefully corrupted cache is not
pickled::
sage: A = MyClass(P,[a,b])
sage: A
Ideal (a, b) of Multivariate Polynomial Ring in a, b, c, d over
Rational Field
sage: A.gens.set_cache('foo')
sage: A.groebner_basis.set_cache('bar',algorithm='singular')
sage: A.gens()
'foo'
sage: A.groebner_basis(algorithm='singular')
'bar'
sage: B = loads(dumps(A))
sage: B.gens()
[a, b]
sage: B.groebner_basis(algorithm='singular')
[a, b]
sage: A.gens()
'foo'
}}}
Doctests pass for me. So, I guess it is in need of review, again!
For the patchbot:
Depends on #9976
Apply trac11115-cached_cython.patch
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/11115#comment:41>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sage-trac?hl=en.