On Fri, Jan 28, 2011 at 2:02 AM, Nicolas M. Thiery <[email protected]> wrote:
> Hi Simon!
>
> Sorry for my late answer; I missed this e-mail in the sage-devel flow ...
>
> On Fri, Jan 07, 2011 at 10:58:49AM -0800, Simon King wrote:
>> IMHO, the cached_... decorators are currently too slow. It seems to
>> me that too much time is spent for looking up the cache (calling the
>> method get_cache()).
>
> There is a consensus on that :-)
Yep. It's sad when sometimes it's quicker to not cache non-trivial
computations.

>> I'd like to invest some work in it, but I need some info:
>>
>> The cache can be
>> * a dictionary self.cache of the decorated function (this is for
>>   CachedFunction),
>> * a dictionary-like instance self.cache of class FileCache (this is
>>   for DiskCachedFunction),
>> * a dictionary that is an attribute of the instance to which the
>>   cached method is bound (this is for CachedMethod), or
>> * a dictionary that is an attribute of the parent of the instance to
>>   which the cached method is bound (this is for CachedInParentMethod).
>>
>> Sometimes, looking up the cache requires two calls (get_cache() and
>> _get_instance_cache()). Moreover, C extension types don't provide a
>> __dict__ attribute, which is a problem for CachedMethod and
>> CachedInParentMethod.
>>
>> The cache in the case of CachedFunction and DiskCachedFunction is, of
>> course, easy to find: It is self.cache, where "self" is a
>> CachedFunction (or DiskCachedFunction) instance.
>>
>> Let us now consider CachedMethod.
>> Let x be an instance that has a cached method (or cached in parent
>> method) foo. The __get__ method of CachedMethod turns x.foo into an
>> instance of CachedMethodCaller, that is specific to x. Now, I wonder
>> why the method is not cached in an attribute of the CachedMethodCaller
>> instance.
>>
>> Or explicitly: What is the advantage of caching x.foo in x._cache__foo
>> rather than in x.foo.cache? The disadvantage is that the string
>> "_cache__foo" must be created and then x.__dict__["_cache__foo"] must
>> be obtained, which costs time.
>
> There are several reasons to cache the results in x rather than in the
> method foo:
>
> (a) You don't need to calculate the hash of x to retrieve something
>     from the cache; that can be important for large objects (with a
>     costly hash function) with methods taking small arguments as
>     input.
>
> (b) The cache is automatically pickled with the object
>
> (c) It makes it easy to clear / invalidate the cache of a particular object
>
> (d) If the object goes out of scope and is wiped from memory, then the
>     same occurs to its cache (a desirable feature; otherwise you would
>     probably be using CachedInParent).
>
>     The same applies for CachedInParent: if the parent goes out of
>     scope, then so does the cache. Otherwise you would probably be
>     using a cached function.

+1 to all of the above, I think a cached result should be put on the
object itself.

> Note that there is also a little technical hurdle:
>
>     sage: class bla:
>     ....:     @cached_method
>     ....:     def f(self): pass
>     sage: x = bla()
>     sage: x.f.cache = 1
>     sage: x.f.cache
>     ------------------------------------------------------------
>     Traceback (most recent call last):
>       File "<ipython console>", line 1, in <module>
>     AttributeError: 'CachedMethodCaller' object has no attribute 'cache'
>
> That's because x.f is a new object each time; the following works:
>
>     sage: xf = x.f
>     sage: xf.cache = 1
>     sage: xf.cache
>     1
>
> Of course, that could be worked around by taking over the management
> of assignment to x.f.cache using, e.g., a property.
>
>
> A further note about (c): in fact, I would really like to have all the
> cache be grouped in a single attribute x._cache, using x._cache["foo"]
> or even x._cache.foo rather than x._cache_foo. Advantages:
>
>  - it's easier to clear/invalidate all the cache at once
>  - less pollution of the name space
>  - objects that need fast caching could have x._cache (or maybe even
>    x._cache.foo) as a Cython attribute

That's a really nice idea.

> There is one issue that needs to be discussed as well: if we change
> the location where the cache is stored in objects, then any currently
> pickled object will lose its cache. Do we care? I doubt anyone in the
> Sage-Combinat crowd cares at this point (we don't have large pickle
> jars of objects, especially with relevant cache).
> But someone else could!

I think caches are, by nature, potentially ephemeral, so it's OK to
get back fully-intact objects without their caches. Of course, someone
may be counting on the fact that they've computed this before
pickling--if so they should speak up now.

- Robert
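
For concreteness, the single-attribute layout suggested above (all caches of an object grouped under x._cache) might look roughly like the plain-Python sketch below. This is only an illustration under stated assumptions, not Sage's implementation: the names CachedMethodSketch, Bla and square are hypothetical, only positional hashable arguments are handled, and objects without a __dict__ (C extension types) are not covered.

```python
import functools


class CachedMethodSketch:
    """Hypothetical descriptor for illustration; NOT Sage's CachedMethod."""

    def __init__(self, func):
        self._func = func
        self._name = func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Assumption: every cache of the instance lives in one dict,
        # instance._cache, keyed by method name.
        per_instance = instance.__dict__.setdefault("_cache", {})
        cache = per_instance.setdefault(self._name, {})
        func = self._func

        @functools.wraps(func)
        def caller(*args):
            # Only the arguments are hashed, never the instance itself.
            try:
                return cache[args]
            except KeyError:
                result = cache[args] = func(instance, *args)
                return result

        return caller


class Bla:
    def __init__(self):
        self.calls = 0  # counts real computations

    @CachedMethodSketch
    def square(self, n):
        self.calls += 1
        return n * n


x = Bla()
first = x.square(3)
second = x.square(3)            # served from x._cache["square"]
calls_after_two = x.calls
distinct_callers = x.square is not x.square  # new caller on each access
x._cache.clear()                # invalidates every cached method of x at once
third = x.square(3)             # recomputed after the clear
```

Because the results live in x.__dict__, they go away together with x, and it does not matter that x.square is a fresh caller object on every attribute access.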
