Steven D'Aprano <ste...@remove.this.cybersource.com.au> wrote: > Such assumptions only hold under particular domains though. You can't > assume equality is an equivalence relation once you start thinking > about arbitrary domains.
>From a formal mathematical point of view, equality /is/ an equivalence relation. If you have a relation on some domain, and it's not an equivalence relation, then it ain't the equality relation, and that's flat. > But there cannot be any such function which is a domain-independent > equivalence relation, not if we're talking about arbitrarily wacky > domains. That looks like a claim which requires a proof to me. But it could also do with a definition of `domain', so I'll settle for one of those first. If we're dealing with sets (i.e., `domain's form a subclass of `sets') then the claim is clearly false, and equality (determined by comparison of elements) is indeed a domain-independent equivalence relation. > Even something as straight-forward as "is" can't be an equivalence > relation under a domain where identity isn't well-defined. You've completely lost me here. The Python `is' operator is (the characteristic function of) an equivalence relation on Python values: that's its definition. You could describe an extension of the `is' relation to a larger set of items, such that it fails to be an equivalence relation on that set, but you'd be (rightly) criticized for failing to preserve one of its two defining properties. (The other is that `is' makes distinctions between values which are at least as fine as any other method, and this property should also be extended .) Let me have another go. All Python objects are instances of `object' or of some more specific class. The `==' operator on `object' is (the characteristic function of) an equivalence relation. In, fact, it's the same as `is' -- but `==' can be overridden by subclasses, and subclasses are permitted -- according to the interface definition -- to coarsen the relation. In fact, they're permitted to make it not be an equivalence class at all. I claim that this is a problem. I /agree/ that domain-specific predicates are useful, and can be sufficiently useful that they deserve the `==' name -- as well as floats and numpy, I've provided SAGE and sympy as examples myself. But I also believe that there are good reasons to want an `equivalence' operator (I'll write it as `=~', though I don't propose this as Python syntax -- see below) with the following properties: * `=~' is the characteristic function[1] of an equivalence relation, i.e., for all values x, y, z: x =~ y in (True, False); (x =~ x) == True; if x =~ y then y =~ x; and if x =~ y and y =~ z then x =~ z * Moreover, `=~' is a coarsening of `is', i.e. for all values x, y: if x is y then x =~ y. A valuable property might be that x =~ y if x and y are indistinguishable without using `is'. That would mean immediately that 'xyz' =~ 'xy' + 'z' (regardless of interning, because strings are immutable). But for tuples this would imply elementwise comparison, which may be expensive -- and, in the case of tuples manufactured by C extensions, nontrivial because manufactured tuples need not be acyclic. On the other hand, `==' is already recursive on tuples. We can envisage a collection of different relations, according to which distinguishing methods we're willing to disallow. For example, for numerical types, there are actually a number of interesting relations, according to whether you think the answers to the following questions are true or false. * Is 1 =~ 1/1? (Here, 1 is an integer, and 1/1 is a rational number; both are the multiplicative identities of their respective rings. I'd suggest that it doesn't seem very useful to say `no' here, but there might be reasons why one would want type(x) is type(y) if x =~ y.) * Is 1 =~ 1.0? (This is trickier. Numerically the values are equal; but the former is exact and the latter inexact, and this is a good reason to want a separation.) Essentially, these are asking whether `type' is a legitimate distinguisher, and I think that the answer, unhelpful as it may be, is `sometimes'. A third useful distinguishing technique is mutation. Given two singleton lists whose respective elements compare equivalent, I can mutate one of them to decide whether the other is in fact the same. Is this something which `=~' should distinguish? Again, the answer is probably `sometimes'. To summarize: we're left with at least three different characteristics which an equivalence predicate might have: * efficient (e.g., bounded recursion depth, works on circular values); * neglects irrelevant (to whom?) differences of type; and * neglects differences due to mutability. A predicate used to compare set elements or hash-table keys should probably /respect/ mutability. (Associating hashing with this predicate, rather than `==', would coherently allow mutable objects such as lists to be used as dictionary keys, though they'd be compared by address. I don't actually know how useful this would be, but suspect that it wouldn't.) Oh, before I go, let me make this very clear: I am /not/ proposing a language change. I think the right way to addres these problems is using existing mechanisms such as generic functions with multimethods. Syntax can come later if it seems sufficiently important. [1] I'll settle for it being a partial function, i.e., attempting to evaluate x =~ y might raise exceptions, e.g., if x is in some invalid state, or perhaps if one or both of x or y is circular, though it would be good to minimize such cases. -- [mdw] -- http://mail.python.org/mailman/listinfo/python-list