#16316: cached_function and cached_method for unhashable elements
-------------------------------------+-------------------------------------
Reporter: saraedum | Owner:
Type: enhancement | Status: needs_work
Priority: minor | Milestone: sage-6.3
Component: misc | Resolution:
Keywords: | Merged in:
Authors: Julian Rueth | Reviewers: Peter Bruin
Report Upstream: N/A | Work issues:
Branch: | Commit:
u/saraedum/ticket/16316 | dc37d43b55bfdff5c3c4b93be8c298955d53872e
Dependencies: #16251 | Stopgaps:
-------------------------------------+-------------------------------------
Changes (by tscrim):
* status: needs_review => needs_work
Comment:
Replying to [comment:19 saraedum]:
> I'm not sure I understand. If p-adics implemented a non-trivial
`__hash__` then this could never be made to be consistent with our current
`==`. The intermediate function creates something with a new `==` which is
then consistent with its `__hash__`. Does this answer your question?
Just making sure it is different than the hash requirements. By unique you
mean this bit of doc:
{{{
An implementation must make sure that for elements ``a`` and
``b``, if
``a != b``, then also ``a._cache_key() != b._cache_key()``. In
pratice
this means that the ``_cache_key`` should always include the
``id`` of
the parent.
}}}
whose the contrapositive is if `a._cache_key() == b._cache_key()`, then `a
== b` (assuming that `==` in `p`-adics satisfies `a == b` iff `not (a !=
b)`). Now this is the opposite requirement of hash: `a == b` implies
`hash(a) == hash(b)`. In order to retrieve elements from the cache, you
need that `a._cache_key` never changes over the lifetime of `a`.
Enforcement of this is no different than that of `__hash__`.
I'm good with the new wording (BTW, there's a typo (pratice) and this
docstring needs `::`).
However there is a code smell to me. The more I think about it, the more I
feel what should be done is `==` be strengthened (i.e. less things compare
as equals) and a `__hash__` method implemented.
----
Now for some timings, before:
{{{
sage: @cached_function
....: def foo(x):
....: return x
....:
sage: R.<x> = QQ[]
sage: C = crystals.Tableaux(['A',2], shape=[2,1])
sage: p = R(range(100))
sage: q = R(range(10000))
sage: pl = R(range(1000000))
sage: m = matrix(10, 10, range(100))
sage: %timeit foo(5)
1000000 loops, best of 3: 1.46 µs per loop
sage: %timeit foo(C)
1000000 loops, best of 3: 942 ns per loop
sage: %timeit foo(p)
1000 loops, best of 3: 387 µs per loop
sage: %timeit foo(q)
10 loops, best of 3: 29.5 ms per loop
sage: %timeit foo(pl)
1 loops, best of 3: 2.85 s per loop
sage: foo(m) # fails as it should
}}}
with the branch:
{{{
sage: %timeit foo(5)
1000000 loops, best of 3: 1.51 µs per loop
sage: %timeit foo(C)
1000000 loops, best of 3: 976 ns per loop
sage: %timeit foo(p)
1000 loops, best of 3: 390 µs per loop
sage: %timeit foo(q)
10 loops, best of 3: 30.4 ms per loop
sage: %timeit foo(pl)
1 loops, best of 3: 3.07 s per loop
sage: foo(m) # Still fails
}}}
So there's no statistically significant slowdown that I can see.
----
Some other design questions/comments:
- Is there some reason why you didn't `cpdef _cache_key(self)` or make it
into a `@lazy_attribute` since it cannot change over the lifetime of the
object? The latter would actually speed things up for things whose hash is
long to compute (ex. large degree dense polynomials).
- I'm worried about not using the hash for polynomials for caching. I feel
that `_cache_key` should result `self` for every hashable object. For
example, this no longer would cache as the same element:
{{{
sage: R.<x> = QQ[]
sage: p = R(range(100))
sage: hash(p)
1520766690567384545
sage: hash(p._cache_key())
-4253877232222459194
sage: R.<x> = ZZ[]
sage: p = R(range(100))
sage: hash(p)
1520766690567384545
sage: hash(p._cache_key())
6965062123311622150
}}}
So now you're requiring everyone to be ''very'' careful about the parent
an object belongs too.
- I'm pretty sure using the parent's `id` will break pickling with the
cache. So suppose you create parent `A`, store an element `a` in a cache
`C`. Now you pickle and unpickle `a`, `A`, and `C`. The `id` of `A` is
different than before the pickle and unpickling, so we are no longer able
to retrieve `a` from `C`:
{{{
sage: R = QQ['x']
sage: p = R(range(100))
sage: hash(p._cache_key())
-4253877232222459194
sage: dumps(p)
# This is some pickle
# Start a new sage session:
sage: s = # The pickle above
sage: hash(loads(s)._cache_key())
2293648738429524678
sage: hash(loads(s)) # This is persistent across sessions
1520766690567384545
}}}
I think both of the last two are issues that need to be addressed.
--
Ticket URL: <http://trac.sagemath.org/ticket/16316#comment:22>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.