+1 on everything Raymond says here (and in his second message). I don't see a need for more classes or ABCs.
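For anyone reading this reply without the earlier messages to hand, here is a minimal sketch of the behaviour under discussion, as observed in CPython. It is not code from the thread, and the variable names are illustrative only. It shows that the built-in containers check identity before equality, so a NaN is found in a container holding that same object even though it does not compare equal to itself.

```python
import math

nan = math.nan

# NaN is irreflexive under ==.
assert (nan == nan) is False

# Built-in containers try "is" before "==", so the same object is found.
items = [nan]
assert nan in items
assert items.count(nan) == 1
assert items.index(nan) == 0

# A *different* NaN object is not found: neither identity nor equality holds.
other = float("nan")
assert other is not nan
assert other not in items

# Set invariants from the quoted message: pop() and clear() leave it empty.
s = {nan}
s.pop()
assert not s

s = {float("nan")}
s.clear()
assert not s

# count() grows by exactly one after append(), even for NaN.
c = items.count(nan)
items.append(nan)
assert items.count(nan) == c + 1
```

Every assertion here passes only because the containers assume identity-implies-equality; a container that consulted __eq__ alone would fail the membership and count() checks.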
On Mon, Feb 3, 2020 at 00:36 Raymond Hettinger <raymond.hettin...@gmail.com>
wrote:

> > PyObject_RichCompareBool(x, y, op) has a (valuable!) shortcut: if x and
> > y are the same object, then equality comparison returns True and
> > inequality False. No attempt is made to execute __eq__ or __ne__
> > methods in those cases.
> >
> > This has visible consequences all over the place, but they don't appear
> > to be documented. For example,
> >
> > ...
> >
> > despite that math.nan == math.nan is False.
> >
> > It's usually clear which methods will be called, and when, but not
> > really here. Any _context_ that calls PyObject_RichCompareBool() under
> > the covers, for an equality or inequality test, may or may not invoke
> > __eq__ or __ne__, depending on whether the comparands are the same
> > object. Also any context that inlines these special cases to avoid the
> > overhead of calling PyObject_RichCompareBool() at all.
> >
> > If it's intended that Python-the-language requires this, that needs to
> > be documented.
>
> This has been slowly, but perhaps incompletely, documented over the years
> and has become baked into some of the collections ABCs as well. For
> example, Sequence.__contains__() is defined as:
>
>     def __contains__(self, value):
>         for v in self:
>             if v is value or v == value:    # note the identity test
>                 return True
>         return False
>
> Various collections need to assume reflexivity, not just for speed, but so
> that we can reason about them and so that they can maintain internal
> consistency. For example, MutableSet defines pop() as:
>
>     def pop(self):
>         """Return the popped value.  Raise KeyError if empty."""
>         it = iter(self)
>         try:
>             value = next(it)
>         except StopIteration:
>             raise KeyError from None
>         self.discard(value)
>         return value
>
> That pop() logic implicitly assumes an invariant between membership and
> iteration:
>
>     assert all(x in collection for x in collection)
>
> We really don't want to pop() a value *x* and then find that *x* is still
> in the container.
> This would happen if iter() found the *x*, but discard() couldn't find
> the object because the object can't or won't recognize itself:
>
>     s = {float('NaN')}
>     s.pop()
>     assert not s    # Do we want the language to guarantee
>                     # that s is now empty?  I think we must.
>
> The code for clear() depends on pop() working:
>
>     def clear(self):
>         """This is slow (creates N new iterators!) but effective."""
>         try:
>             while True:
>                 self.pop()
>         except KeyError:
>             pass
>
> It would be unfortunate if clear() could not guarantee a post-condition
> that the container is empty:
>
>     s = {float('NaN')}
>     s.clear()
>     assert not s    # Can this be allowed to fail?
>
> The case of count() is less clear-cut, but even there
> identity-implies-equality improves our ability to reason about code: Given
> some list, *s*, possibly already populated, would you want the following
> code to always work:
>
>     c = s.count(x)
>     s.append(x)
>     assert s.count(x) == c + 1    # To me, this is fundamental to
>                                   # what the word "count" means.
>
> I can't find it now, but remember a possibly related discussion where we
> collectively rejected a proposal for an __is__() method. IIRC, the
> reasoning was that our ability to think about code correctly depended on
> this being true:
>
>     a = b
>     assert a is b
>
> Back to the discussion at hand, I had thought our position was roughly:
>
> * __eq__ can return anything it wants.
>
> * Containers are allowed but not required to assume that
>   identity-implies-equality.
>
> * Python's core containers make that assumption so that we can keep
>   the containers internally consistent and so that we can reason about
>   the results of operations.
>
> Also, I believe that even very early dict code (at least as far back as Py
> 1.5.2) had logic for "v is value or v == value".
>
> As far as NaNs go, the only question is how far to propagate their notion
> of irreflexivity. Should "x == x" return False for them? We've decided
> yes.
> When it comes to containers, who makes the rules, the containers or
> their elements? Mostly, we let the elements rule, but containers are
> allowed to make useful assumptions about the elements when necessary. This
> isn't much different from the rules for the "==" operator, where __eq__()
> can return whatever it wants, but functions are still allowed to write "if
> x == y: ..." and assume that a meaningful boolean value has been returned
> (even if it wasn't). Likewise, the rule for "<" is that it can return
> whatever it wants, but sorted() and min() are allowed to assume a
> meaningful total ordering (which might or might not be true). In other
> words, containers and functions are allowed, when necessary or useful, to
> override the decisions made by their data. This seems like a reasonable
> state of affairs.
>
> The current docs make an effort to describe what we have now:
> https://docs.python.org/3/reference/expressions.html#value-comparisons
>
> Sorry for the lack of concision. I'm posting on borrowed time,
>
>
> Raymond
>
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/UIZPD7OJRVID4EMO5WI7FUX6BR7XLR5D/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
--
--Guido (mobile)
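The "containers may override the decisions made by their data" point in the quoted message can be sketched as follows. This is only an illustration, not code from the thread: `Aloof` and `contains` are hypothetical names invented here, and the built-in container behaviour shown is what CPython does in practice.

```python
class Aloof:
    """Hypothetical element whose __eq__ matches nothing, itself included."""

    def __eq__(self, other):
        return False

    # Defining __eq__ sets __hash__ to None; restore identity hashing so
    # instances can be stored in sets and dicts.
    __hash__ = object.__hash__


def contains(seq, value):
    """Membership loop in the style of the quoted Sequence.__contains__."""
    for v in seq:
        if v is value or v == value:  # identity first, then equality
            return True
    return False


a = Aloof()
assert not (a == a)          # the element says it is not equal to itself
assert contains([a], a)      # the identity test finds it regardless
assert a in [a]              # built-in list behaves the same way
assert a in (a,)             # so does tuple
assert a in {a}              # and set

d = {a: "value"}
assert d[a] == "value"       # dict lookup also succeeds via identity

assert not contains([Aloof()], a)   # a different instance is not found
```

The element still controls what "a == a" returns; the containers simply decline to ask when they are handed the very same object.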
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/YVZLN735ANNZPKTOK3FQR2MTGIZLXNJ7/
Code of Conduct: http://python.org/psf/codeofconduct/