[Tim] >> PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if x >> and y are the same object, then equality comparison returns True >> and inequality False. No attempt is made to execute __eq__ or >> __ne__ methods in those cases. >> ... >> If it's intended that Python-the-language requires this, that needs to >> be documented.
[Raymond] > This has been slowly, but perhaps incompletely documented over the > years and has become baked in the some of the collections ABCs as well. > For example, Sequence.__contains__() is defined as: > > def __contains__(self, value): > for v in self: > if v is value or v == value: # note the identity test > return True > return False But it's unclear to me whether that's intended to constrain all implementations, or is just mimicking CPython's list.__contains__. That's always a problem with operational definitions. For example, does it also constrain all implementations to check in iteration order? The order can be visible, e.g, in the number of times v.__eq__ is called. > Various collections need to assume reflexivity, not just for speed, but so > that we > can reason about them and so that they can maintain internal consistency. For > example, MutableSet defines pop() as: > > def pop(self): > """Return the popped value. Raise KeyError if empty.""" > it = iter(self) > try: > value = next(it) > except StopIteration: > raise KeyError from None > self.discard(value) > return value As above, except CPyhon's own set implementation implementation doesn't faithfully conform to that: >>> x = set(range(0, 10, 2)) >>> next(iter(x)) 0 >>> x.pop() # returns first in iteration order 0 >>> x.add(1) >>> next(iter(x)) 1 >>> x.pop() # ditto 1 >>> x.add(1) # but try it again! >>> next(iter(x)) 1 >>> x.pop() # oops! didn't pop the first in iteration order 2 Not that I care ;-) Just emphasizing that it's tricky to say no more (or less) than what's intended. > That pop() logic implicitly assumes an invariant between membership and > iteration: > > assert(x in collection for x in collection) Missing an "all". > We really don't want to pop() a value *x* and then find that *x* is still > in the container. This would happen if iter() found the *x*, but discard() > couldn't find the object because the object can't or won't recognize itself: Speaking of which, why is "discard()" called instead of "remove()"? It's sending a mixed message: discard() is appropriate when you're _not_ sure the object being removed is present. > s = {float('NaN')} > s.pop() > assert not s # Do we want the language to guarantee that > # s is now empty? I think we must. I can't imagine an actual container implementation that wouldn't. but no actual container implements pop() in the odd way MutableSet.pop() is written. CPython's set.pop does nothing of the sort - doesn't even have a pointer equality test (except against C's NULL and `dummy`, used merely to find "the first (starting at the search finger)" slot actually in use). In a world where we decided that the identity shortcut is _not_ guaranteed by the language, the real consequence would be that the MutableSet.pop() implementation would need to be changed (or made NotImplemented, or documented as being specific to CPython). > The code for clear() depends on pop() working: > > def clear(self): > """This is slow (creates N new iterators!) but effective.""" > try: > while True: > self.pop() > except KeyError: > pass > > It would unfortunate if clear() could not guarantee a post-condition that the > container is empty: That's again a consequence of how MutableSet.pop was written. No actual container has any problem implementing clear() without needing any kind of object comparison. > s = {float('NaN')} > s.clear() > assert not s # Can this be allowed to fail? No, but as above it's a very far stretch to say that clear() emptying a container _relies_ on the object identity shortcut. That's a just a consequence of an odd specific clear() implementation, relying in turn on an odd specific pop() implementation that assumes the shortcut is in place. > The case of count() is less clear-cut, but even there > identity-implies-equality > improves our ability to reason about code: Absolutely! That "x is x implies equality" is very useful. But that's not the question ;-) > Given some list, *s*, possibly already populated, would you want the > following code to always work: > > c = s.count(x) > s.append(x) > assert s.count(x) == c + 1 # To me, this is fundamental > to what the word > "count" means. I would, yes. But it's also possible to define s.count(x) as sum(x == y for y in s) and live with the consequences of __eq__. > ... > Back to the discussion at hand, I had thought our position was roughly: > > * __eq__ can return anything it wants. > > * Containers are allowed but not required to assume that > identity-implies-equality. > > * Python's core containers make that assumption so that we can keep > the containers internally consistent and so that we can reason about > the results of operations. All reasonable! Python just needs something now like a benevolent dictator ;-) > Also, I believe that even very early dict code (at least as far back > as Py 1.5.2) had logic for "v is value or v == value". Memory fades, but it seems to me that very early Pythons may even have exploited the shortcut for `==` too. > ... > The current docs make an effort to describe what we have now: > https://docs.python.org/3/reference/expressions.html#value-comparisons Yes, that's been pointed out, and it's at worst "a good start". The people on the original PR that kicked this off weren't aware of that it existed. Terry Reedy said he's thinking about how to (at least) make it more discoverable, although at that time Guido appeared to be leaning "implementation defined" instead. [in another msg] > forget to mention that list.index() also uses PyObject_RichCompareBool() A quick scan found about 100 calls to PyObject_RichCompareBool passing Py_EQ. So it screams for a way to spell out what's required that doesn't degenerate into an exhaustive list of specific functions/methods/contexts. _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/44XXRXK2MVDY7GKWTURZK7XFCHIR6JRX/ Code of Conduct: http://python.org/psf/codeofconduct/