> PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if x and y are
> the same object, then equality comparison returns True and inequality False.
> No attempt is made to execute __eq__ or __ne__ methods in those cases.
>
> This has visible consequences all over the place, but they don't appear to be
> documented. For example,
>
> ...
> despite that math.nan == math.nan is False.
>
> It's usually clear which methods will be called, and when, but not really
> here. Any _context_ that calls PyObject_RichCompareBool() under the covers,
> for an equality or inequality test, may or may not invoke __eq__ or __ne__,
> depending on whether the comparands are the same object. Also any context
> that inlines these special cases to avoid the overhead of calling
> PyObject_RichCompareBool() at all.
>
> If it's intended that Python-the-language requires this, that needs to be
> documented.
This has been slowly, but perhaps incompletely, documented over the years and
has become baked into some of the collections ABCs as well. For example,
Sequence.__contains__() is defined as:
    def __contains__(self, value):
        for v in self:
            if v is value or v == value:    # note the identity test
                return True
        return False
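As a rough illustration of what that identity test does in practice, here is a
small sketch. The class name Tup is made up for this example; it only relies on
the Sequence mixin supplying __contains__() as quoted above.

```python
import math
from collections.abc import Sequence

class Tup(Sequence):
    """Hypothetical minimal Sequence; __contains__ comes from the ABC mixin."""

    def __init__(self, *items):
        self._items = items

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)

nan = math.nan
seq = Tup(1.0, nan)

# The very same NaN object is found through the "v is value" branch.
same_object_found = nan in seq

# A different NaN object fails both the identity test and the == test.
fresh_nan_found = float('nan') in seq

# The built-in list takes the same identity shortcut.
builtin_found = nan in [nan]
```

Here same_object_found and builtin_found are True, while fresh_nan_found is
False, even though nan == nan is False in all three cases.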
Various collections need to assume reflexivity, not just for speed, but so that
we can reason about them and so that they can maintain internal consistency.
For example, MutableSet defines pop() as:
    def pop(self):
        """Return the popped value.  Raise KeyError if empty."""
        it = iter(self)
        try:
            value = next(it)
        except StopIteration:
            raise KeyError from None
        self.discard(value)
        return value
That pop() logic implicitly assumes an invariant between membership and
iteration:
    assert all(x in collection for x in collection)
We really don't want to pop() a value *x* and then find that *x* is still in
the container. This would happen if iter() found the *x*, but discard()
couldn't find the object because the object can't or won't recognize itself:
    s = {float('NaN')}
    s.pop()
    assert not s    # Do we want the language to guarantee that
                    # s is now empty?  I think we must.
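To make the failure mode concrete, here is a sketch of a container that does
NOT assume identity-implies-equality. EqOnlySet is a hypothetical class written
only for this illustration; it inherits pop() from MutableSet and compares
elements with == alone.

```python
from collections.abc import MutableSet

class EqOnlySet(MutableSet):
    """Hypothetical list-backed set that compares with == only."""

    def __init__(self, iterable=()):
        self._items = []
        for item in iterable:
            self.add(item)

    def __contains__(self, value):
        return any(v == value for v in self._items)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, value):
        if value not in self:
            self._items.append(value)

    def discard(self, value):
        # No identity shortcut: an element is dropped only if == says so.
        self._items = [v for v in self._items if not (v == value)]

nan = float('nan')

strict = EqOnlySet([nan])
popped = strict.pop()        # inherited pop(): iterate, then discard()
still_there = len(strict)    # discard() never matched the NaN

builtin = {nan}
builtin.pop()
builtin_empty = not builtin  # the built-in set really is empty
```

With this sketch, pop() hands back the NaN but still_there is 1, so the element
was "popped" without being removed. Calling the inherited clear() on such a
container would never terminate, which is why it is not called here.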
The code for clear() depends on pop() working:
    def clear(self):
        """This is slow (creates N new iterators!) but effective."""
        try:
            while True:
                self.pop()
        except KeyError:
            pass
It would be unfortunate if clear() could not guarantee a post-condition that the
container is empty:
    s = {float('NaN')}
    s.clear()
    assert not s    # Can this be allowed to fail?
The case of count() is less clear-cut, but even there identity-implies-equality
improves our ability to reason about code: Given some list, *s*, possibly
already populated, would you want the following code to always work:
    c = s.count(x)
    s.append(x)
    assert s.count(x) == c + 1    # To me, this is fundamental to what
                                  # the word "count" means.
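A quick sketch of that property with a NaN, using only the built-in list. The
last line shows what a purely ==-based count would report instead.

```python
nan = float('nan')
s = [1.0, nan, 2.0]

c = s.count(nan)      # the same object is matched by identity
s.append(nan)
after = s.count(nan)  # one more than before

# Counting with == alone never matches a NaN, not even the same object.
eq_only = sum(1 for v in s if v == nan)
```

In CPython, c is 1 and after is 2, so the "count goes up by one" property
holds, whereas eq_only stays at 0.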
I can't find it now, but I remember a possibly related discussion where we
collectively rejected a proposal for an __is__() method. IIRC, the reasoning
was that our ability to think about code correctly depended on this being true:
    a = b
    assert a is b
Back to the discussion at hand, I had thought our position was roughly:
* __eq__ can return anything it wants.
* Containers are allowed, but not required, to assume that
  identity-implies-equality.
* Python's core containers make that assumption so that we can keep
  the containers internally consistent and so that we can reason about
  the results of operations.
Also, I believe that even very early dict code (at least as far back as
Python 1.5.2) had logic for "v is value or v == value".
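For what it's worth, the same behavior is easy to observe in a current dict.
This sketch only shows present-day CPython, not the 1.5.2 code:

```python
nan = float('nan')
d = {nan: 'found'}

# Lookup with the same object succeeds through the identity check.
same = d[nan]

# A different NaN object is neither identical nor equal to the stored key.
fresh_missing = float('nan') not in d
```

The key can be retrieved with the object that was stored, but not with another
NaN.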
As far as NaNs go, the only question is how far to propagate their notion of
irreflexivity. Should "x == x" return False for them? We've decided yes. When
it comes to containers, who makes the rules: the containers or their elements?
Mostly, we let the elements rule, but containers are allowed to make useful
assumptions about the elements when necessary. This isn't much different from
the rules for the "==" operator, where __eq__() can return whatever it wants,
but functions are still allowed to write "if x == y: ..." and assume that a
meaningful boolean value has been returned (even if it wasn't). Likewise, the
rule for "<" is that it can return whatever it wants, but sorted() and min()
are allowed to assume a meaningful total ordering (which might or might not be
true). In other words, containers and functions are allowed, when necessary or
useful, to override the decisions made by their data. This seems like a
reasonable state of affairs.
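A small sketch of the "==" half of that argument. Fuzzy is a hypothetical class
invented for this example; its __eq__() returns a string rather than a boolean.

```python
class Fuzzy:
    """Hypothetical class whose __eq__ returns a non-boolean value."""

    def __eq__(self, other):
        return "maybe"

a, b = Fuzzy(), Fuzzy()

raw = (a == b)                            # whatever __eq__ chose to return
branch_taken = True if a == b else False  # "if" interprets it as a boolean
in_list = a in [b]                        # the list does the same
```

The operator hands back "maybe" untouched, but both the if-expression and the
list membership test treat it as a plain truth value, so branch_taken and
in_list are both True.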
The current docs make an effort to describe what we have now:
https://docs.python.org/3/reference/expressions.html#value-comparisons
Sorry for the lack of concision. I'm posting on borrowed time,
Raymond