#11895: padic_ZZ_pX_CR_element is unhashable
--------------------------------+-------------------------------------------
Reporter: mmasdeu | Owner: roed
Type: defect | Status: needs_work
Priority: critical | Milestone: sage-5.4
Component: padics | Resolution:
Keywords: p-adic, hash | Work issues: whitespace, line wrapping in
docstrings
Report Upstream: N/A | Reviewers: David Loeffler
Authors: David Roe | Merged in:
Dependencies: | Stopgaps:
--------------------------------+-------------------------------------------
Comment (by nbruin):
Three remarks:
- It looks like the hash maximally only contain about 3 decimal digits of
information. That's way too small! Hash table lookups are based on having
`O(1)` elements in the bins. To give you an idea of how small the constant
in the `O(1)` should be: Python's dictionaries make sure that the average
bin size in 0.7 (smaller than 1). You're fighting the birthday paradox
here, so don't underestimate the likelyhood of collisions. Anyway, when
using more than 1000 elements in a dictionary, the proposed hash will let
dictionaries just degrade to a linear search, but with way larger memory
overhead.
- It may seem that having hashes depend on precision (which isn't taken
into account in equality testing!) is a mild condition, but it's not. If
you have a large matrix or other aggregate element, you can easily end up
with some small precision elements in there. This will lead to very
confusing situation where a dict lookup doesn't find your key, but you
know you have a key that's equal to a stored one. Of course, if you're
relying on equality in p-adic computations you've already lost ...
- Caching functions on equality of arguments is a very bad idea with the
p-adics. If I construct a known invertible matrix to too low a precision,
I'll get determinant 0. If subsequently I construct the same matrix to
higher precision, a cached determinant function would recognize the
argument as equal to one he'd seem before (because equality is only to
lowest common precision) and would just give me the 0 back.
Properties 2 and 3 conspire to hide the problem a bit:
{{{
sage: @cached_function
....: def DISC(f):
....: return discriminant(f)
....:
sage: K=Qp(2,500)
sage: P.<x>=K[]
sage: f=(1+O(2^200))*x^2-(2^200+O(2^200))
sage: DISC(f)
0
sage: g=(1+O(2^300))*x^2-(2^200+O(2^300))
sage: DISC(g)
2^202 + O(2^302)
sage:
sage: hash(f), hash(g)
(15360174650385708, 15377766836429868)
sage: f == g
True
}}}
so the entry under f isn't found when looking up g because they end up
being stored in different bins, so the difference in hash is enough to
distinguish them. As soon as the dictionary gets resized in a way that
lets those two hashes end up in the same bin, it'll be a toss-up which
answer you get back from either `DISC(f)` or `DISC(g)`. Oh course, by then
it will be very painful to track down what happened.
To show that functions that depend on more of their arguments than just
equality should not be cached:
{{{
sage: a=3
sage: b=GF(5)(3)
sage: @cached_function
....: def test(a):
....: return a^2
....:
sage: [test(a),test(b)]
[9, 9]
sage: test.clear_cache()
sage: [test(b),test(a)]
[4, 4]
}}}
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/11895#comment:6>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sage-trac?hl=en.