On 4/20/07, Guido van Rossum <[EMAIL PROTECTED]> wrote: > On 4/20/07, Jeffrey Yasskin <[EMAIL PROTECTED]> wrote: > > On Sets: > > Do you have an example of a data structure that is a Set but not a > > ComposableSet? > > Take for example a set that is a contiguous range of integers, > represented by a (lo, hi) tuple. In many cases the return type of > union and symmetric difference cannot be represented as an object of > the same type. I don't want to force the implementer of such a set to > provide implementations for operations they don't need.
Ah, good example. I think that PEPs that propose ABCs should be encouraged to describe a few possible implementations, and should be particularly encouraged to give examples of the implementations that made them split two classes, or omit a method from a particular class. > > Set should inherit from a PartiallyOrdered ABC that defines the > > comparison operators. > > I guess then there would also be a TotallyOrdered ABC deriving from > PartiallyOrdered which adds no operations but does add a constraint. > I've added both, and made Set derive from PartiallyOrdered. Cool. Here's some language for their invariants (checking myself with the Fortress spec, and stealing its properties nearly verbatim): PartiallyOrdered: This ABC defines the 5 inequality operations <, <=, >=, >, and cmp(). (Note that == and != are defined by object.) Classes deriving from this ABC should ensure that '<' and '>' form strict partial orders, and '<=' and '>=' form weak partial orders, in the sense of http://en.wikipedia.org/wiki/Partially_ordered_set. These should be compatible with each other and with ==: Where PO is a subtype of PartiallyOrdered, and a and b are instances of PO, cmp(a, b) < 0 ↔ cmp(b, a) > 0 cmp(a, b) == 0 ↔ cmp(b, a) == 0 cmp(a, b) == None ↔ cmp(b, a) == None a < b ↔ cmp(a, b) < 0 a <= b ↔ cmp(a, b) <= 0 a == b ↔ cmp(a, b) == 0 a >=b ↔ cmp(a, b) >= 0 a > b ↔ cmp(a, b) > 0 Subclasses of PartiallyOrdered must define one of cmp() or <=. The other operators are defaulted as follows: def <=(a,b): return cmp(a,b) <= 0 def cmp(a,b): if a<=b: if b<=a: return 0 else: return -1 else: if b<=a: return 1 else: return None def ==(a,b): return a<=b and b<=a def <(a,b): return a<=b and not b<=a def >=(a,b): return b<=a def >(a,b): return b<a Open Issues: 1) Is it reasonable to let cmp() return None for unordered pairs, or should we instead define a 4-value enumeration for it to return? 2) The @abstractmethod decorator isn't enough to specify that subclasses must override at least one of a group of methods, and there are classes with even more complex requirements like "subclasses must override a or (b and (c or d))". 3) http://docs.python.org/ref/customization.html#l2h-187 still implies that < defines a total order over all python objects. Has that changed for py3k? 4) How should we handle the reflected forms of these operators? TotallyOrdered: This ABC derives from PartiallyOrdered. It adds no new operations but implies a promise that <, <=, >, and >= are total orders in the sense of http://en.wikipedia.org/wiki/Totally_ordered_set, and that cmp(a,b) never returns None. Given this additional restriction, subclasses only need to define one of <, <=, or cmp(), and the rest of the methods can be defaulted. Open Issue: How does the TotallyOrdered property inherit? If Real (an abstract class) inherits from TotallyOrdered, do different subclasses need to be totally comparable with each other? If long (a concrete class) inherits from TotallyOrdered, and I subclass long, do I have to ensure that my instances are totally comparable with other longs? I think 'yes' to the second, to obey the Liskov substitution principle. I don't know about the first: for Real it seems fine, but it might inhibit refactoring the ABC hierarchy. > > The final specification of Set should spell out the invariants, but > > they're clear enough that the PEP probably doesn't need to. It's odd, > > though, that you mention optimizations to __eq__ and __le__ but not > > their semantics. > > The semantics are kind of obvious. :-) > > Text spelling out the invariants in more detail that I can just paste > into the PEP is most welcome! Bah! ;) Set: Sets should implement an efficient (no worse than O(log N)) __contains__ operation. Sets are defined by their extensions. Subclasses may override the following default definitions of __eq__ and __le__ with more efficient versions, but they must respect the same semantics: def __eq__(a, b): return len(a) == len(b) and all(x in a for x in b) def __le__(a, b): return len(a) <= len(b) and all(x in a for x in b) ComposableSet: ... mathematical meaning of set composition: Where a and b are instances of ComposableSet and __op__(a,b) is defined: x in (a | b) ↔ x in a or x in b x in (a & b) ↔ x in a and x in b x in (a ^ b) ↔ x in a != x in b x in (a - b) ↔ x in a and not x in b > > ComposableSet's open issues: > > > Should we just pick a concrete return type (e.g. set)? > > > > For set compositions, you should say whether the arguments need to be > > uniform. (Is (IntSet(1,2,3) | set('a','b','c')) allowed?) If they do, > > then you don't need a concrete return type; if they don't, then you > > do, and the default implementations should use it. > > You may be misunderstanding what I meant. I certainly don't want to > constrain the potential return type of the composition operations > (except that they should themselves be composable sets). The question > about picking a specific return type was meant as a way out to > providing a concrete default implementation. I'm still on the fence of > this, as it could be a big waste of time (I imagine that composing > Patricia trees can be done more efficiently than iterating over the > elements). Borrowing some syntax from Java generics, which of: ComposableSet __or__(ComposableSet a, ComposableSet b); <CS extends ComposableSet> ComposableSet __or__(CS a, CS b); <CS extends ComposableSet> CS __or__(CS a, CS b); do you mean to define? I think they have different implications about what defaults you need to provide. The second is probably out because it would mean (a|b|c) would likely break. This gets even more complex if we consider the types of the contents of the sets... Here's a possible overloaded signature for __or__. (Do you already have a better notation for this?) <C super A, C super B, C extends Hashable> ComposableSet<C> __or__(ComposableSet<A> a, ComposableSet<B> b); <C super A, C super B, (for x in A,B,C: CS<x> extends ComposableSet<x>)> CS<C> __or__(CS<A> a, CS<B> b); The first would probably be optimized when its arguments are of the same type, but could fall back on the abstract methods from ComposableSet, which would return a built-in (frozen?)set. > > Numbers: > > Would you mind helping to write a PEP for numbers? I think I'm running > out of round tuits just for the collections and the ABC > infrastructure. Bah! again. Yes, I'll try to throw something together with some help from Neal. Who else has been thinking about the Numbers PEP? > > Cardinal is also useful as the argument to (Sequence * Cardinal). > > I think there's little value for this in a dynamically typed language; > the implementation can raise ValueError or treat negative numbers as > zero (which is what Python currently does). True, and any hypothetical type checkers that want to use such information can define their own subclasses of Integer. -- Namasté, Jeffrey Yasskin _______________________________________________ Python-3000 mailing list Python-3000@python.org http://mail.python.org/mailman/listinfo/python-3000 Unsubscribe: http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com