On 10.04.2013 12:14, Ronan Lamy wrote:
The global cache, as it stands, is bad. To name only two issues: it
can grow without bound in long computations, and it has often led to
very subtle bugs (when some function was not truly side-effect free).
Any other general objections to the global cache?
There is also a performance impact. Fetching data from the cache has a
non-trivial overhead. Also, on PyPy, it effectively disables the JIT,
since lookups from a global dictionary cannot be optimised much and are
probably significantly slower than executing a method that has been
compiled down to a dozen machine instructions. The same
considerations will apply to CPython if we ever attempt to statically
compile the core (using e.g. Cython).
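To make the overhead concrete, here is a rough micro-benchmark sketch comparing a direct call with one that round-trips through a global cache dictionary. All names (plain, cached, _cache) are illustrative stand-ins, not sympy code, and the exact numbers will vary by interpreter; on PyPy the cached variant additionally prevents the JIT from inlining the call.

```python
import timeit

_cache = {}

def plain(x):
    # the "compiled down to a dozen machine instructions" case
    return x * x + 1

def cached(x):
    # the global-cache case: build a key, look it up, fall back to plain()
    key = ('plain', x)
    if key not in _cache:
        _cache[key] = plain(x)
    return _cache[key]

t_plain = timeit.timeit(lambda: plain(7), number=100_000)
t_cached = timeit.timeit(lambda: cached(7), number=100_000)
print(f"direct: {t_plain:.4f}s  via cache: {t_cached:.4f}s")
```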
Thanks, I'll keep that in mind.
It is also felt that it should be unnecessary, as demonstrated by the
success of e.g. sympycore. On the other hand, we accept that some
algorithms (e.g. gruntz) can profit a lot from caching. Even if it
would be possible to redesign them in such a way as not to
unnecessarily repeat the same computation over and over, it is felt
that using a cache achieves the same thing at a much smaller cost.
Did I miss any important points regarding caching?
The question now becomes what to do about the cache, in the context of
the new assumptions system, both in the long and in the short term. In
the short term, I see three ways of making new assumptions work with
the cache:
(1) disable the cache
(2) clear the cache on every (global) assumptions change
(3) hash the assumptions during caching
[...]
(2) sounds to me like it would end up being the worst of both worlds,
with caching overhead remaining unchanged, but the cache providing
little benefit since it's cleared all the time. And even if we managed
to avoid clearing during a gruntz() call, this would be brittle against
changes in seemingly unrelated code.
Exactly my point.
Strategy (3) mitigates this to some extent, but not completely.
The big problem is that you need to identify which assumptions actually
matter, which means keeping a record of every time you make a
decision using assumptions; otherwise it's just like (2).
Not quite: if we add a global assumption and then delete it again,
the previous cache entries become valid again.
But I was hardly advocating this strategy.
However, what I think should be observed is that most modules will
need changing for the new assumptions system anyway. So making *minor*
changes to the caching strategy is actually fine. This opens two more
options, I think:
(4) introduce a clear_cache keyword to assuming
(5) introduce local caches
By (4), I mean that we replace the above pattern by

    x = Dummy()
    with assuming(Q.positive(x), clear_cache=False):
        do some stuff

clear_cache will be True by default, and do what you think it does.
However, in the above situation, it is safe not to clear the cache,
since the dummy is fresh. I *think* this should solve most problems.
Indeed, this pattern always works when creating "fresh" symbols, and
so it works in all situations where the current assumptions are set up
(since we currently never allow changing them).
(4) looks like (2), only with a nicer syntax.
How is that? We change Dummy(positive=True) to
with assuming(Q.positive(x)), and then to with assuming(Q.positive(x),
clear_cache=False). Surely the syntax is getting worse?
The point of (4) is to be able to retain the "fresh symbol with new
assumptions" pattern without unnecessarily flushing the cache.
By (5), I mean to add an additional "id" to the @cacheit decorator,
roughly like this (code written for clarity, not performance):
def cacheit(id):
    def decorator(func):
        def wrapped(*args):
            if id in local_caches_dict:
                cache = local_caches_dict[id]
                if args in cache:
                    return cache[args]
                res = func(*args)
                cache[args] = res
                return res
            else:
                with new_local_cache(id):
                    return wrapped(*args)
        return wrapped
    return decorator
Here new_local_cache is a context manager which creates and destroys
an entry in local_caches_dict. This kind of cache decorator is
tailored towards the use of caching in gruntz. Will it help in other
cases?
How about letting gruntz() handle all the caching it needs explicitly?
There doesn't seem to be much point in making local_caches_dict global.
Having a gruntz_cache instead of local_caches_dict["gruntz"] seems cleaner.
Ah. This would of course be essentially equivalent. The reason I
proposed this form is because it stays close to the current syntax (the
@cacheit decorator), and I did not want to complicate things
unnecessarily. Also, by having a "caching framework" used by all modules
that want caching, we encourage code reuse and consistency.
It seems to me that, under the general strategy outlined in my
previous mail, (4) and (5) are essentially equally beneficial. (5)
feels slightly cleaner to me.
What do you guys think?
Removing the global cache is actually rather appealing, even though it
probably needs to be replaced with several smaller (module-level?) caches.
The caching problem would also be a lot less sensitive if Dummy('x',
positive=True) could continue to be a purely local operation that
doesn't affect any global state.
That would be quite nice. What about introducing "immutable"
assumptions? That is, in addition to the global_assumptions variable, we
introduce another variable holding assumptions which are never allowed
to change, except for being garbage collected. Part of the contract of
this variable would be that you are only allowed to add assumptions
about symbols which have never been asked about.
[This would still manipulate global state, but none which is significant
to caching.]
I'm not sure how good this solution is. I would need to investigate if
the new assumptions system can deal with tons of irrelevant assumptions,
or how to garbage-collect the immutable assumptions efficiently.
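One way the garbage-collection part might work, sketched here purely for illustration: key the immutable store on the symbol object via a WeakKeyDictionary, so the facts vanish as soon as the symbol itself becomes unreachable. Every name here (Sym, immutable_assumptions, assume_immutably) is an assumption of this sketch, not sympy API.

```python
import weakref

class Sym:
    """Minimal stand-in for a sympy Dummy symbol."""
    def __init__(self, name):
        self.name = name

# entries disappear automatically once the key (the symbol) is collected
immutable_assumptions = weakref.WeakKeyDictionary()

def assume_immutably(sym, fact):
    # contract: only allowed before sym has ever been asked about
    immutable_assumptions.setdefault(sym, set()).add(fact)
```

This would keep the store from accumulating facts about long-dead dummies, though whether the new assumptions system can tolerate a large store of irrelevant facts is a separate question.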