#20246: Use `with strict_equality(True)` to work around hashing of p-adics
-------------------------------------+-------------------------------------
Reporter: saraedum | Owner:
Type: enhancement | Status: needs_work
Priority: major | Milestone: sage-7.2
Component: padics | Resolution:
Keywords: days71 | Merged in:
Authors: Julian RĂ¼th | Reviewers:
Report Upstream: N/A | Work issues:
Branch: | Commit:
u/saraedum/use__with_strict_equality_true___to_work_around_hashing_of_p_adics|
f4c5a9db729ccf911517039fdced282588210133
Dependencies: #16342, #16339 | Stopgaps:
-------------------------------------+-------------------------------------
Comment (by nbruin):
Replying to [comment:24 saraedum]:
> However, most parents get used as cache keys in our factories and
`UniqueRepresentation`. And parents may depend on p-adic numbers in some
way.
Parents that are used as cache-keys are usually not a problem: they are
usually `EqualityByID`, so they can basically hash by their ID. If p-adics
are part of the construction parameters (e.g., a p-adic extension field,
perhaps?) of a `unique` parent structure, a little more is needed, of
course. There either we would need `cache_key` or we'd need to transform
the construction parameters into something that can be hashed. For
extension fields, that's actually not such an unreasonable thing to do:
fields defined by a minimal polynomial that is only known to limited
precision is a real pain. It's much more convenient to lift it to an exact
polynomial.
In any case, I would be very surprised if there are any structures out
there that depend on anything else than sequences, polynomials, and
matrices. If those have cache_key strategies, we'd be OK.
> No but a lot of our code assumes that everything is usable as a key in a
dictionary for the sake of caching. I believe that this is a reasonable
assumption: Running a method on indistinguishable inputs should produce
the same output. And in some cases it should even produce the exact same
object, for example when creating parents; in those cases you can not do
without caching.
But I am pretty sure this is limited to polynomials and matrices.
> I am not sure what you mean with "high up". For correctness you have to
implement it in any class that gets instantiated with a parameter that may
depend (in some way) on something unhashable, because essentially you just
need to write this as a method that produces the constructor parameters
that serve to produce an indistinguishable element.
I meant: you could start sticking implementations into the more generic
classes for polynomials and matrices. I don't think we need to define
"correctness" as "p-adic numbers are hashable".
> Though I originally implemented `_cache_key` what I dislike about it is
that you just have to add silly boilerplate to many classes. The new
approach is much better in that sense. The changes are where the actual
problem arises, namely in the classes that can not define a usual
`__hash__`. For everybody else it just works and it is actually correct.
The current `_cache_key` is calling for trouble in the long run: People
add a constructor parameter but don't add it to `_cache_key`; the doctests
pass but caching eventually produces incorrect results.
Most parents do not need a `_cache_key`, because they are actually
hashable thanks to `EqualityByID`. I'd expect that only some element
classes need a `_cache_key`. So the scenario you sketch would only play
out if the identity-determining components of an element change.
> Of course, if the end user decides to run all its code `with
strict_equality(True)` then many things break. But the docstring tries to
be clear about this.
The fundamental point is that you make equality context dependent. It
means you now need to know global state in order to reason about code that
otherwise looks completely local. I think that's an incredibly big
complexification and a major step back in the understandability and
maintainability of sage code. Whatever you write in the docstring doesn't
change that fact. You can't fix bad design by good documentation.
> I think the "strategic places" is a problem here. What this says it that
using p-adics (or fraction field elements) just works in places where
there happened to be a doctest that implied p-adics (and usually extension
fields to actually trigger `__hash__`.) Why not make it work everywhere?
I think you are too pessimistic about the multitude of ways in which padic
numbers can end up being hashed in legitimate ways. I think you can get
this covered pretty quickly (polys and matrices). Plus: a lot of places
where p-adics aren't initially foreseen will actually break with p-adics,
because they are imprecise objects. That usually breaks algorithms that
weren't built with that in mind.
re. why not everywhere: because the proposed solution (making equality
testing depend on global state) incurs too high a price on the
maintainability of sage.
> Yes, global state is bad. The implementation is very clear on that and I
can elaborate on this further in the docstrings.
Being clear about it doesn't fix a bad design decision (an, I think,
justified technical qualification of the proposal which in no way reflects
on the proposers)
I'm sympathetic about the problem. The issue is really that equality tests
for p-adics are probably bad in all cases. We're finding two
interpretations that sort-of work in some circumstances:
- difference is indistinguishable from 0: if you start out with
sufficient starting precision, you can sort-of run "precise" algorithms
and get a more-or-less correct answer
- equal only if numbers are functionally the same: the safer notion for
hashing.
Option 1 at least allows us to do naive linear algebra on small systems,
with a ridiculous amount of excess precision. (in a way that works better
than with floats, say)
Option 2 is required to sort-of properly handle routines that assume
hashability.
We can't have both. Option 1 is what people expect when they start working
with p-adics initially. For option 2 we have a work-around in the form of
`cache_key` (which I thought was a stroke of brilliance!). So I think the
best way forward is to try and make `cache_key` work.
I agree it would be nicer to have another solution, but we're dealing with
two different concepts of equality here, and we can't cram them into the
same interface.
--
Ticket URL: <http://trac.sagemath.org/ticket/20246#comment:25>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.