#8611: speed up cached_function and cached_method
---------------------------------------+------------------------------------
Reporter: jason | Owner: tbd
Type: enhancement | Status: needs_review
Priority: major | Milestone: sage-4.6.1
Component: misc | Keywords: cached method
Author: Jason Grout, Simon King | Upstream: N/A
Reviewer: | Merged:
Work_issues: |
---------------------------------------+------------------------------------
Changes:
* keywords: => cached method
* status: needs_work => needs_review
* work_issues: Patch doesn't apply =>
* author: Jason Grout => Jason Grout, Simon King
Comment:
Replying to [comment:4 davidloeffler]:
> Might it not be better for @cached_method to store its cache as an
> attribute of the instance object, rather than having a single cache which
> stores the values for all instances of that class?
It was not the case that there was a single cache for all instances: the
cache was stored in the instance for `cached_method`, resp. in the parent
of the instance for `cached_in_parent_method`. However, part of the
overhead came from the fact that the cached method ''itself'' was an
attribute of the class, not of the instance.
I just attached a new patch that goes far beyond Jason's approach.
The following examples show how my patch improves the performance of
cached methods:
'''__Using instance attributes__'''
Polynomial ideals have a cached method `groebner_basis`. Without
my patch, asking for `I.groebner_basis` returns a
`CachedMethodCaller` - a ''new instance'' every time the cached
method is requested. Of course, that is a waste of time. Therefore,
I store the cached method caller as an attribute of the instance,
which happens when the attribute is first requested:
{{{
sage: R.<x, y, z> = PolynomialRing(QQ, 3)
sage: I = R*(x^3 + y^3 + z^3,x^4-y^4)
sage: I.__dict__.has_key('groebner_basis')
False
sage: I.groebner_basis
Cached version of <function groebner_basis at 0x15a0668>
sage: I.__dict__['groebner_basis']
Cached version of <function groebner_basis at 0x15a0668>
sage: I.groebner_basis is I.groebner_basis # False without my patch
True
}}}
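The underlying technique - a non-data descriptor that shadows itself with an instance attribute on first access - can be sketched in plain Python. All class names here are illustrative stand-ins, not Sage's actual code:

```python
class CachedMethodCaller:
    """Stand-in for the real caller: caches results per argument tuple."""
    def __init__(self, f, inst):
        self.f = f
        self.inst = inst
        self.cache = {}

    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.f(self.inst, *args)
        return self.cache[args]


class cached_method:
    """Non-data descriptor (only __get__), so an entry in the instance's
    __dict__ shadows it and __get__ is never called again."""
    def __init__(self, f):
        self.f = f

    def __get__(self, inst, cls):
        if inst is None:
            return self
        caller = CachedMethodCaller(self.f, inst)
        # Store the caller in the instance's dictionary: subsequent
        # attribute lookups find it there, bypassing the descriptor.
        inst.__dict__[self.f.__name__] = caller
        return caller


class Example:
    @cached_method
    def square(self, n):
        return n * n


e = Example()
assert 'square' not in e.__dict__   # nothing stored yet
assert e.square is e.square         # same caller object on every access
assert e.square(4) == 16
assert 'square' in e.__dict__       # the caller now lives in the instance
```

This mirrors the `I.groebner_basis is I.groebner_basis` behavior shown above: after the first lookup, the caller is an ordinary instance attribute.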
This already yields a considerable speedup. However, pickling posed
a problem: since the cached method caller is in the instance's
dictionary, Python would try to pickle it together with the instance -
but functions can not be pickled.
The solution is "creative" (I hope you don't find it crazy - at least
it works!):
* `CachedMethodCaller` has a `__reduce__` method that creates a
pickle - but unpickling it creates something else, namely an instance
of the new class `CachedMethodPickle`. It knows only the name of the
original cached method and the instance to which it belongs.
* After unpickling, one wants to work with the cached method. Working
means requesting attributes (such as `__call__`). `CachedMethodPickle`
has a `__getattr__` method that first removes the entry of the
instance's dictionary pointing to it - so, it effectively commits
suicide.
* Since the `CachedMethodPickle` has removed itself, the original
cached method is again available from the class of the instance.
Here is an example:
{{{
sage: I.groebner_basis()
[y^5*z^3 - 1/4*x^2*z^6 + 1/2*x*y*z^6 + 1/4*y^2*z^6, x^2*y*z^3 - x*y^2*z^3
+ 2*y^3*z^3 + z^6, x*y^3 + y^4 + x*z^3, x^3 + y^3 + z^3]
}}}
We now pickle and unpickle the ideal. The cached method
`groebner_basis` is replaced by a placeholder:
{{{
sage: J = loads(dumps(I))
sage: J.groebner_basis
Pickle of the cached method "groebner_basis"
}}}
But as soon as any other attribute is requested from the
placeholder, it replaces itself by the cached method, and
the entries of the cache are preserved:
{{{
sage: J.groebner_basis.is_in_cache()
True
sage: J.groebner_basis
Cached version of <function groebner_basis at 0x...>
sage: J.groebner_basis() == I.groebner_basis()
True
}}}
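The placeholder mechanism described above can be sketched in plain Python. All names below (`SquareCaller`, `Placeholder`, `_cache_square`) are invented for illustration; only the roles correspond to `CachedMethodCaller` and `CachedMethodPickle`:

```python
import pickle


class cached_square:
    """Non-data descriptor standing in for @cached_method."""
    def __get__(self, inst, cls):
        if inst is None:
            return self
        caller = SquareCaller(inst)
        inst.__dict__['square'] = caller    # shadow the descriptor
        return caller


class SquareCaller:
    def __init__(self, inst):
        self.inst = inst
        # The cache lives on the instance itself, so it survives pickling
        # independently of the (unpicklable) caller.
        self.cache = inst.__dict__.setdefault('_cache_square', {})

    def __call__(self, n):
        if n not in self.cache:
            self.cache[n] = n * n
        return self.cache[n]

    def __reduce__(self):
        # Pickle as a lightweight placeholder instead of the caller.
        return (Placeholder, (self.inst,))


class Placeholder:
    """Stand-in for CachedMethodPickle: knows only its instance."""
    def __init__(self, inst):
        self.inst = inst

    def __getattr__(self, attr):
        inst = self.__dict__['inst']
        del inst.__dict__['square']         # remove itself ("suicide") ...
        return getattr(inst.square, attr)   # ... so the descriptor runs again


class Obj:
    square = cached_square()


o = Obj()
o.square(4)                                 # fill the cache
p = pickle.loads(pickle.dumps(o))
assert isinstance(p.__dict__['square'], Placeholder)
assert p.square.cache == {4: 16}            # placeholder replaced itself
```

On the first attribute access after unpickling, the placeholder deletes its own dictionary entry and delegates to the freshly recreated caller, which picks the preserved cache back up - just as in the `J.groebner_basis.is_in_cache()` example.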
'''__Creating the cache key__'''
1. I found that most of the overhead comes from the computation
of the cache key, which relies on
`sage.misc.function_mangling.ArgumentFixer`.
Internally, it works on lists and has some loops, so I thought
it would benefit from Cython - and indeed it does!
Moreover, I added a shortcut in the method `fix_to_pos`: if there
are no named arguments, then the result is essentially the list of
positional arguments followed by the list of arguments with default
values.
2. Some more cache key improvements take place inside
`sage.misc.cachefunc`.
Originally, the key was obtained by a method `get_key`, which in turn
called `_get_instance_key`. To reduce the overhead of calling
two methods, I inlined the code of `_get_instance_key` and
removed that method.
3. I also use Jason's idea of a shortcut for the common case
of an empty argument list. However, even if the argument list is
empty, there might still be arguments with a default value - and
explicitly providing these default values should yield the same cache
key. Hence, the following would not work with Jason's patch (continuing
the example above):
{{{
sage: I.groebner_basis(algorithm='') is I.groebner_basis()
True
}}}
The solution is to compute the cache key for an empty argument list
''once'' and store the result for later use. That is another reason why
it is good to have one `CachedMethodCaller` permanently attached to an
instance (rather than always creating a new one): otherwise, caching
the default key would be difficult.
'''__Accessing the cache__'''
Originally, a method `get_cache` was used to obtain the cache dictionary.
In some cases, that method in turn called `_get_instance_cache` - so,
again, there was the overhead of calling two methods.
With my patch, the cache is ''always'' available as an attribute of
the `CachedMethodCaller` instance - hence, access is very fast. There is
almost no memory overhead, since it is simply an additional pointer to a
dictionary that is an attribute of the instance (resp. of its parent):
{{{
sage: I.groebner_basis.cache is I._cache__groebner_basis
True
}}}
There is an additional benefit of this approach: if the instance has
no `__dict__` or does not allow attributes to be set, it is still
possible to cache the method. Similarly, if the parent of the instance
does not allow attributes to be set, `cached_in_parent_method` still
results in a cached method (but the cache is stored in the instance,
not in its parent).
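The fallback can be sketched as follows. `get_cache_dict` is a hypothetical helper (not Sage's code): it tries to store the cache on the parent, as `cached_in_parent_method` would, and falls back to the instance itself when the parent rejects attribute assignment:

```python
def get_cache_dict(instance, name):
    """Return a cache dict for `name`, preferring the parent as holder
    but falling back to the instance if the parent rejects attributes."""
    for holder in (instance.parent(), instance):
        try:
            cache = getattr(holder, name, None)
            if cache is None:
                cache = {}
                setattr(holder, name, cache)    # may raise AttributeError
            return cache
        except AttributeError:
            continue                            # try the next holder
    return {}   # last resort: a fresh, unshared cache


class RigidParent:
    __slots__ = ()          # no __dict__: attribute assignment raises


class Elt:
    def __init__(self, parent):
        self._parent = parent

    def parent(self):
        return self._parent


e = Elt(RigidParent())
cache = get_cache_dict(e, '_cache_bar')
assert cache is getattr(e, '_cache_bar')        # fell back to the instance
assert get_cache_dict(e, '_cache_bar') is cache  # and is reused afterwards
```

This matches the `F_ZZ.barP` behavior shown in the timings below: when `ZZ` refuses attribute assignment, the cache silently moves to the element.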
'''__Documentation__'''
I extended the documentation, but did not cover `ClearCacheOnPickle`,
which my patch does not change. Apart from that, the doctest coverage
of `sage.misc.cachefunc` is complete. For `sage.misc.function_mangling`,
the coverage script complains "Please add a `TestSuite(s).run()` doctest."
I don't know where that comes from.
Of course, all doctests pass for me. I did not test whether the
documentation looks nice in HTML.
'''__Performance__'''
Last but not least, what the ticket is about: Performance!
Here is the setting:
{{{
sage: class Foo:
....: def __init__(self,P=None):
....: if P is None:
....: self._parent=self
....: else:
....: self._parent=P
....: @cached_method
....: def bar(self,m,n=3):
....: return m^n
....: @cached_in_parent_method
....: def barP(self,m,n=3):
....: return m^n
....: def parent(self):
....: return self._parent
....:
sage: F = Foo()
sage: F_ZZ = Foo(ZZ)
}}}
__Without my patch__
{{{
# cache in F:
sage: F.bar(2,3) is F.bar(2) is F.bar(2,n=3) is F.bar(n=3,m=2)
True
sage: timeit('F.bar(2)', number=10000)
10000 loops, best of 3: 18 µs per loop
sage: timeit('F.bar(2,3)', number=10000)
10000 loops, best of 3: 18.2 µs per loop
sage: timeit('F.bar(2,n=3)', number=10000)
10000 loops, best of 3: 20.2 µs per loop
sage: timeit('F.bar(n=3,m=2)', number=10000)
10000 loops, best of 3: 20.2 µs per loop
}}}
{{{
# cache in F as a parent of itself:
sage: F.barP(2,3) is F.barP(2) # BUG!
False
sage: F.barP.get_key(2,3)
((<__main__.Foo instance at 0x4570d88>, 2, 3), ())
sage: F.barP.get_key(2)
((<__main__.Foo instance at 0x4570d88>, 2), ())
sage: timeit('F.barP(2)', number=10000)
10000 loops, best of 3: 21.9 µs per loop
sage: timeit('F.barP(2,3)', number=10000)
10000 loops, best of 3: 22.4 µs per loop
sage: timeit('F.barP(2,n=3)', number=10000)
10000 loops, best of 3: 23.8 µs per loop
sage: timeit('F.barP(n=3,m=2)', number=10000)
10000 loops, best of 3: 24.5 µs per loop
}}}
{{{
# cache in ZZ, which does not allow attribute assignment
sage: F_ZZ.barP(2)
Traceback (most recent call last):
...
AttributeError: 'dictproxy' object has no attribute 'setdefault'
}}}
__With my patch__
{{{
# cache in F:
sage: F.bar(2,3) is F.bar(2) is F.bar(2,n=3) is F.bar(n=3,m=2)
True
sage: timeit('F.bar(2)', number=10000)
10000 loops, best of 3: 3.27 µs per loop
sage: timeit('F.bar(2,3)', number=10000)
10000 loops, best of 3: 3.26 µs per loop
sage: timeit('F.bar(2,n=3)', number=10000)
10000 loops, best of 3: 3.63 µs per loop
sage: timeit('F.bar(n=3,m=2)', number=10000)
10000 loops, best of 3: 3.69 µs per loop
}}}
{{{
# cache in F as parent of itself:
# The bug has disappeared
sage: F.barP(2,3) is F.barP(2) is F.barP(2,n=3) is F.barP(n=3,m=2)
True
sage: timeit('F.barP(2)', number=10000)
10000 loops, best of 3: 6.09 µs per loop
sage: timeit('F.barP(2,3)', number=10000)
10000 loops, best of 3: 5.5 µs per loop
sage: timeit('F.barP(2,n=3)', number=10000)
10000 loops, best of 3: 5.86 µs per loop
sage: timeit('F.barP(n=3,m=2)', number=10000)
10000 loops, best of 3: 6.03 µs per loop
}}}
{{{
# cache in ZZ, which does not allow attribute assignment.
# It works, since the cache is now automatically moved
# to F_ZZ, as a last resort
sage: F_ZZ.barP(2,3) is F_ZZ.barP(2) is F_ZZ.barP(2,n=3) is F_ZZ.barP(n=3,m=2)
True
sage: timeit('F_ZZ.barP(2)', number=10000)
10000 loops, best of 3: 5.6 µs per loop
sage: timeit('F_ZZ.barP(2,3)', number=10000)
10000 loops, best of 3: 5.56 µs per loop
sage: timeit('F_ZZ.barP(2,n=3)', number=10000)
10000 loops, best of 3: 5.86 µs per loop
sage: timeit('F_ZZ.barP(n=3,m=3)', number=10000)
10000 loops, best of 3: 6.1 µs per loop
}}}
Here is the effect of using the shortcut:
{{{
sage: class BAR:
....: @cached_method
....: def bar(self,m=2,n=3):
....: return m^n
....:
sage: B = BAR()
sage: B.bar() is B.bar(2,3)
True
}}}
Without my patch:
{{{
sage: timeit('B.bar(n=2,m=3)', number=10000)
10000 loops, best of 3: 20.4 µs per loop
sage: timeit('B.bar()', number=10000)
10000 loops, best of 3: 17.1 µs per loop
}}}
With my patch:
{{{
sage: timeit('B.bar(n=2,m=3)', number=10000)
10000 loops, best of 3: 3.73 µs per loop
sage: timeit('B.bar()', number=10000)
10000 loops, best of 3: 1.95 µs per loop
}}}
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/8611#comment:7>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB