Re: [sage-devel] Where should cached_method put its cache?

Robert Bradshaw Fri, 28 Jan 2011 02:15:12 -0800

On Fri, Jan 28, 2011 at 2:02 AM, Nicolas M. Thiery wrote:
>        Hi Simon!
>
> Sorry for my late answer; I missed this e-mail in the sage-devel flow ...
>
> On Fri, Jan 07, 2011 at 10:58:49AM -0800, Simon King wrote:
>> IMHO, the cached_... decorators are currently too slow.  It seems to
>> me that too much time is spent looking up the cache (calling the
>> method get_cache()).
>
> There is a consensus on that :-)


Yep. It's sad when sometimes it's quicker to not cache non-trivial
computations.

>> I'd like to invest some work in it, but I need some info:
>>
>> The cache can be
>>  * a dictionary self.cache of the decorated function (this is for
>> CachedFunction),
>>  * a dictionary-like instance self.cache of class FileCache (this is
>> for DiskCachedFunction),
>>  * a dictionary that is an attribute of the instance to which the
>> cached method is bound (this is for CachedMethod), or
>>  * a dictionary that is an attribute of the parent of the instance to
>> which the cached method is bound (this is for CachedInParentMethod).
>>
>> Sometimes, looking up the cache requires two calls (get_cache() and
>> _get_instance_cache()). Moreover, C extension types don't provide a
>> __dict__ attribute, which is a problem for CachedMethod and
>> CachedInParentMethod.
>>
>> The cache in the case of CachedFunction and DiskCachedFunction is, of
>> course, easy to find: It is self.cache, where "self" is a
>> CachedFunction (or DiskCachedFunction) instance.
>>
>> Let us now consider CachedMethod.
>> Let x be an instance that has a cached method (or cached in parent
>> method) foo. The __get__ method of CachedMethod turns x.foo into an
>> instance of CachedMethodCaller, that is specific to x. Now, I wonder
>> why the method is not cached in an attribute of the CachedMethodCaller
>> instance.
>>
>> Or explicitly: What is the advantage of caching x.foo in x._cache__foo
>> rather than in x.foo.cache? The disadvantage is that the string
>> "_cache__foo" must be created and then x.__dict__["_cache__foo"] must
>> be obtained, which costs time.
>
> There are several reasons to cache the results in x rather than in the
> method foo:
>
> (a) You don't need to calculate the hash of x to retrieve something
>    from the cache; that can be important for large objects (with a
>    costly hash function) with methods taking small arguments as
>    input.
>
> (b) The cache is automatically pickled with the object
>
> (c) It makes it easy to clear / invalidate the cache of a particular object
>
> (d) If the object goes out of scope and is wiped from memory, then the
>    same occurs to its cache (a desirable feature; otherwise you would
>    probably be using CachedInParent).
>
>    The same applies to CachedInParent: if the parent goes out of
>    scope, then so does the cache. Otherwise you would probably be
>    using a cached function.

+1 to all of the above. I think a cached result should be put on the
object itself.
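
To make points (a) and (c) concrete, here is a minimal pure-Python sketch of
a cache stored on the object. It is not Sage's cached_method code: the
decorator cached_on_instance and the class Example are hypothetical names
used only for illustration, and the sketch assumes the instance has a
__dict__ and that the arguments are hashable.

```python
import functools


def cached_on_instance(method):
    """Hypothetical decorator: cache results in the instance's __dict__."""
    attr = "_cache__" + method.__name__  # name built once, at decoration time

    @functools.wraps(method)
    def wrapper(self, *args):
        # Only `args` is hashed here; `self` is never used as a dictionary key.
        cache = self.__dict__.setdefault(attr, {})
        try:
            return cache[args]
        except KeyError:
            result = cache[args] = method(self, *args)
            return result

    return wrapper


class Example:
    __hash__ = None  # make instances unhashable, to show hash(x) is not needed

    def __init__(self):
        self.calls = 0

    @cached_on_instance
    def square(self, n):
        self.calls += 1
        return n * n


x = Example()
x.square(3)
x.square(3)              # served from x._cache__square
assert x.calls == 1

del x._cache__square     # point (c): invalidate the cache of this one object
x.square(3)
assert x.calls == 2
```

Because the dictionary lives in x.__dict__, it is also released together
with x, which is point (d).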

> Note that there is also a little technical hurdle:
>
>    sage: class bla:
>    ....:     @cached_method
>    ....:     def f(self): pass
>    sage: x = bla()
>    sage: x.f.cache = 1
>    sage: x.f.cache
>    ------------------------------------------------------------
>    Traceback (most recent call last):
>      File "<ipython console>", line 1, in <module>
>    AttributeError: 'CachedMethodCaller' object has no attribute 'cache'
>
> That's because x.f is a new object each time; the following works:
>
>    sage: xf = x.f
>    sage: xf.cache = 1
>    sage: xf.cache
>    1
>
> Of course, that could be worked around by taking over the management
> of assignment to x.f.cache using, e.g., a property.
>
>
> A further note about (c): in fact, I would really like to have all the
> cache be grouped in a single attribute x._cache, using x._cache["foo"]
> or even x._cache.foo rather than x._cache_foo. Advantages:
>
>  - it's easier to clear/invalidate all the cache at once
>  - less pollution of the name space
>  - objects that need fast caching could have x._cache (or maybe even
>   x._cache.foo) as a Cython attribute

That's a really nice idea.
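
As a rough illustration of the single-attribute layout, the sketch below
keeps every method's cache under x._cache, keyed by method name. The names
cached_in_one_dict and Grouped are hypothetical, and this is plain Python
rather than the Cython attribute suggested above.

```python
import functools


def cached_in_one_dict(method):
    """Hypothetical decorator: keep every method's cache under obj._cache."""
    name = method.__name__

    @functools.wraps(method)
    def wrapper(self, *args):
        # obj._cache maps method name -> {arguments: result}
        per_method = self.__dict__.setdefault("_cache", {}).setdefault(name, {})
        try:
            return per_method[args]
        except KeyError:
            result = per_method[args] = method(self, *args)
            return result

    return wrapper


class Grouped:
    @cached_in_one_dict
    def double(self, n):
        return 2 * n

    @cached_in_one_dict
    def negate(self, n):
        return -n


g = Grouped()
g.double(4)
g.negate(4)
assert g._cache == {"double": {(4,): 8}, "negate": {(4,): -4}}

g._cache.clear()         # clears the caches of all methods at once
assert g._cache == {}
```

Only one extra name, _cache, appears in the instance's namespace, however
many methods are cached.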

> There is one issue that needs to be discussed as well: if we change
> the location where the cache is stored in objects, then any currently
> pickled object will lose its cache. Do we care? I doubt anyone in the
> Sage-Combinat crowd cares at this point (we don't have large pickle
> jars of objects, especially with relevant cache). But someone else could!

I think caches are, by nature, potentially ephemeral, so it's OK to
get back fully-intact objects without their caches. Of course, someone
may be counting on the fact that they've computed this before
pickling--if so they should speak up now.
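
A small sketch of what "losing the cache" would look like in practice,
under stated assumptions: the class Thing, its method expensive, and the
old attribute name _cache__expensive are all hypothetical, and the
restoration step imitates by hand what unpickling does by default (create
the object without calling __init__, then fill its __dict__ from the saved
state).

```python
class Thing:
    def __init__(self, n):
        self.n = n

    def expensive(self):
        # New layout: results live under self._cache, created on first use.
        cache = self.__dict__.setdefault("_cache", {})
        if "expensive" not in cache:
            cache["expensive"] = sum(range(self.n))
        return cache["expensive"]


# State as an older pickle might have stored it, with the cached value
# under a per-method attribute (hypothetical old layout).
old_state = {"n": 10, "_cache__expensive": {(): 45}}

# Imitate default unpickling: no __init__ call, just restore the attributes.
restored = Thing.__new__(Thing)
restored.__dict__.update(old_state)

# The object is fully usable; the value is simply recomputed once.
assert restored.expensive() == 45
assert restored._cache == {"expensive": 45}
```

The stale _cache__expensive entry stays on the restored object but is never
consulted, so the only cost is one recomputation per cached value.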

- Robert
