Re: [swift-evolution] [Draft] Rename Sequence.elementsEqual

Kevin Nattinger via swift-evolution Sun, 15 Oct 2017 00:30:06 -0700

> On Oct 14, 2017, at 7:54 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote:
> 
> On Sat, Oct 14, 2017 at 6:17 PM, Kevin Nattinger <sw...@nattinger.net 
> <mailto:sw...@nattinger.net>> wrote:
>>> […]
>>> * A Swift `Sequence` is, to put it simplistically, a thing that can be 
>>> iterated over in a `for...in` loop. If it would make you happy, for the 
>>> rest of the discussion, let's suppose we called the protocol `ForLoopable` 
>>> instead of `Sequence`.
>> 
>> ForLoopable is so ugly. Since we’re just iterating over the elements, how 
>> about, oh, say, `Iterable`? Hey, that looks familiar.
>> 
>> I'm not trying to bikeshed the name of `Sequence`. I'm picking an 
>> intentionally unwieldy name for the purposes of discussing the semantics of 
>> this particular protocol. The point is that the underlying issue has nothing 
>> to do with the name; it can be `Iterable` or it can be `SpongeBob` for all I 
>> care.
> 
> I’m not trying to bikeshed the name either, The underlying issue is that 
> (what is currently) Sequence actually encompasses two separate 
> functionalities, and those functionalities need to be separated with their 
> separate semantic requirements documented. “Sequence: Iterable,” 
> “OrderedSequence: Sequence,” “SpongeBob: ForLoopable,” the names are 100% 
> irrelevant at this point; what’s important is that one is not necessarily 
> ordered and the other guarantees an order.
>  
> 
> What are the "two separate functionalities”?


Iteration, with convenience methods that don’t imply or rely on an order that 
may not be there; and convenience methods applicable to sequences that do have 
an intrinsic order.

> All the extension methods on Sequence are ways of spelling things that you 
> can write in a few lines of code using a `for...in` loop; they're in the 
> stdlib to allow a more functional style which some prefer. If you accept that 
> a type should support iteration with a `for...in` loop, then what is your 
> basis for claiming that these extension methods are "separate 
> functionalities”?

Just because you *can* express something in code doesn’t mean you should, or 
that it’s correct. It is objectively false to say a Set has a first or last 
object, because the objects therein have no order. You can take a random object 
from the set and call it “first”, but that doesn’t make that a correct 
definition of Set.first. A Set has no order, a specific iteration has an 
“order” only in the sense that all and only the objects in the set have to come 
out one at a time, but that doesn’t mean the Set itself has an order, 
specifically a first or last object.

You’re a fan of the principal of least surprise. Tell me, which would be less 
surprising: Set.dropFirst() actually drops a random element, or Set doesn’t 
have a dropFirst()? And if you think dropFirst() removing an element at random 
is not surprising, please explain why.

>>> […]
>> 
>>> * If a type `T` conforms to `ForLoopable` and an instance `t` of that type 
>>> has at least one element, then *something* has to be the first element in a 
>>> `for element in t { ... }` loop. Put another way, every instance of a type 
>>> that conforms to `ForLoopable` must have at least one publicly observable 
>>> order (although, intriguingly, I'm not sure it has to be a repeatable one). 
>>> It is possible, therefore, to have a semantic answer to the question of 
>>> which element is `first` or (if finite) `last`; one can also 
>>> `drop(while:)`, etc., and perform lexicographical comparisons.
>> 
>> As a side effect of Swift being a procedural language each iteration happens 
>> to occur in some order, yes, but that order is meaningless and reflects 
>> nothing about the Set itself.  In fact, I’d say that `first`, `last`, etc. 
>> are not even defined on the original Set per se, only on the specific order 
>> that a particular iteration resulted in. And that order is not necessarily 
>> predictable, nor necessarily stable, as you yourself said.
>> 
>> Consider an Iterable that gives a different order every time it’s iterated. 
>> Should calling `.first` or `last` give a different object every time?  
>> That’s absurd.
>> Should an object lexicographically compare not equal to itself? Even more 
>> absurd. 
>> 
>> What's your basis for saying that such behavior is absurd? It is explicitly 
>> permitted for instances of types conforming to `SpongeBob` to be single-pass 
>> and/or infinite. For a single-pass `SpongeBob`, `first` will certainly 
>> return a different value every time it is invoked.
> 
> Is `first` mutating? No. Should it be? No! `first` and `last` are a peek at 
> the state of the object.
> 
> You're right, `first` should not be mutating; that's actually an important 
> design consideration, as Ole pointed out, and it's not actually available on 
> `Sequence` for that reason. However, `first { _ in true }` is available and 
> is potentially mutating, as are all methods on Sequence by design.
> 
> Is `elementsEqual` (or *shudder* lexicographicallyEqual) reflexive? IMO it 
> clearly should be. Especially with the “lexicographically” part—from 
> everything I can find, a lexicographical ordering is by definition reflexive. 
> Do you have a citation for the idea that lexicographical equality can 
> legitimately be non-reflexive?
> 
> Clearly (tautologically?), such a function should be reflexive for any 
> argument ordered with respect to itself. However, if there is no 
> lexicographical comparison possible, then a thing cannot compare 
> lexicographically equal to anything, even itself.

And that’s PRECISELY why lexicographicallyEqual does not make sense to apply to 
unordered sets. There is no lexicographical comparison possible, so why do you 
keep insisting they should have a method that falsely claims to 
lexicographically compare them?

> 
>> A random number generator fulfills all the semantic requirements of 
>> conforming to `SpongeBob`, and in fact I do just that in NumericAnnex 
>> <https://github.com/xwu/NumericAnnex/blob/master/Sources/PRNG.swift#L53>. 
>> `first` gives a different value every time, and a randomly generated 
>> `SpongeBob` would unsurprisingly compare lexicographically not equal to 
>> itself.
> 
> > IMO that’s a bug in the implementation; lexicographical equality is 
> > reflexive, period.
> 
> > Presumably the `elementsEqual` method contains something along these lines 
> > (take from SequenceAlgorithms.swift.gyb):
> 
>     var iter1 = self.makeIterator()
>     var iter2 = other.makeIterator()
> 
> > By creating two iterators, you’re mutating while iterating. Turns out 
> > there’s even a warning against this in Sequence.swift:
> 
> /// Using Multiple Iterators
> /// ========================
> ///
> /// Whenever you use multiple iterators (or `for`-`in` loops) over a single
> /// sequence, be sure you know that the specific sequence supports repeated
> /// iteration, either because you know its concrete type or because the
> /// sequence is also constrained to the `Collection` protocol.
> ///
> /// Obtain each separate iterator from separate calls to the sequence's
> /// `makeIterator()` method rather than by copying. Copying an iterator is
> /// safe, but advancing one copy of an iterator by calling its `next()` method
> /// may invalidate other copies of that iterator. `for`-`in` loops are safe in
> /// this regard.
> 
> > The default implementation of elementsEqual is therefore unsafe because it 
> > has the potential for using an invalidated iterator.
> 
> You are misunderstanding the warning in the second paragraph here. The 
> implementation (not a default implementation, unless I'm mistaken, as it 
> cannot be overridden)

> makes each iterator using separate calls to `makeIterator()`, just as the 
> documentation tells you to do. Calling next() on one iterator does not 
> invalidate the other iterator, because the second is not a copy of the first.

Indeed, I misread that comment. 
That said, is there a well-defined behavior when iterating a one-shot sequence 
with two iterators?
(It *is* a default implementation, btw)

> 
> You are, however, right that calling `rng.elementsEqual(rng)` is not advised. 
> It isn't unsafe; it's just not useful. That said, calling 
> `array.elementsEqual(array)` is equally safe and equally useless, and the 
> uselessness of such a reflexive comparison is neither here nor there.

Funny how you complain about my code being useless and yet you insist below, 
"If it's not providing you with utility, then don't use it.”  
Regardless, you’re wrong to dismiss this case. `foo.elementsEqual(foo)` on its 
own makes little sense, sure, but you could easily find yourself in a method 
comparing two iterators you obtained from elsewhere, and occasionally they 
happen to be the identical object. Allowing an iterator to return `false` for 
.elementsEqual on itself is unexpected and dangerous.

> 
>> 
>> On the other hand, if I have a collection of objects that I want iterated in 
>> a particular order, I can use a container that iterates in a specific, 
>> known, well-defined way, and use that to construct the sequence of objects.  
>> That’s clearly an Iterable collection, but the guarantee is stronger than 
>> that. Since it iterates objects in a specific sequence, the logical way to 
>> express that would be `Sequence: Iterable`. Again, we’ve seen that before.  
>> 
>> Now, since a Sequence is guaranteed to iterate the same every time, suddenly 
>> our `first`, `last`, `drop*`, etc. methods have a meaning inherent to the 
>> collection itself, rather than a specific iteration.
>> 
>> What you call a "Sequence" here would have to be multi-pass, finite, and 
>> ordered. 
> 
> > Ordered, yes, but it’s only admittedly poor wording that suggests 
> > multi-pass, and I don’t think anything there suggests finite.
> 
> If a Sequence is "guaranteed to iterate the same every time," then surely it 
> must be multi-pass; what's the alternative?

Not sure if you just missed the very next sentence or are actively ignoring it 
just to be argumentative. I already acknowledged that that phrase didn’t convey 
the meaning I intended, and a Sequence is not and should not be required to be 
multi-pass.

>  
> It would be better to say that the iteration order is well-defined. That will 
> almost always mean documented, and usually predictable though obviously e.g. 
> RNGs and iterating in random order will not be predictable by design.
> 
>> That's actually more semantically constrained than what Swift calls a 
>> `Collection` (which requires conforming types to be multi-pass and(?) 
>> finite). By contrast, Swift's `SpongeBob` protocol explicitly permits 
>> conforming single-pass, infinite, and/or unordered types. 
> 
> I think you’re talking about Sequence here, I’ve lost track of your nonsense 
> by now. Yes, the current Swift protocol named Sequence allows unordered 
> types. You seem to keep asserting that but not actually addressing my 
> argument, which is that allowing Sequences to be unordered with the current 
> API is undesired and actively harmful, and should therefore be changed.
> 
> What is harmful about it?

Well, for one, the issue you raised this thread about—two sets that are `==` 
could return either true or false for `elementsEqual`, depending on how they 
arrived at their current state. That’s not acceptable, and the problem isn’t 
with the name of the method.
Then there are all the methods that imply a specific order of iteration. If the 
“sequence” is unordered, who knows what you’ll get? It is incredibly easy for 
an engineer to write a method that implicitly relies on a passed sequence being 
intrinsically ordered and another engineer to pass it an unordered “sequence.”  
The first engineer could neglect to document the dependency, or even not notice 
it; or the second engineer could just fail to read the documentation thoroughly 
enough.  There is currently no way for the compiler to enforce passing only an 
object that is (or at least claims to be) intrinsically ordered.

> 
> 
>> As long as it is possible to iterate over a `SpongeBob`, it is meaningful to 
>> ask what element is first obtained upon iteration or to drop the first 
>> element obtained upon iteration.
>> And as long as iteration is not required to be repeatable (and it isn't), it 
>> is perfectly acceptable for these algorithms to return a different result 
>> every time.
> 
> It is “meaningful” in the sense that it can technically be programmed. The 
> actual results are meaningless beyond returning or dropping a random* element.
> 
> *: Don’t nitpick the word “random”, you know exactly what I mean. It’s just 
> shorter and no less clear than “technically more-or-less deterministic but 
> not documented, not generally predictable, and probably but not necessarily 
> consistent from one call to the next."
> 
> 
> I fail to see the issue here. If it's not providing you with utility, then 
> don't use it.

I have no problem with functions I don’t use provided they are well-defined and 
reasonably accurately named. Functions requiring an order on unordered 
collections don’t pass that bar. 

> Since Collections do guarantee multi-pass iteration, Brent's example of 
> `set.dropFirst().reduce(set.first!, ...)` provides just one instance where an 
> unordered Collection can profitably make use of `first`. Permitting generic 
> algorithms that can operate on either arrays or sets, for example, is the 
> desired effect of having such a protocol; a generic algorithm that takes a 
> Collection can ask for the first element, and in the case of an unordered 
> Collection, an arbitrary element will do just fine.

The generic algorithms should be on a protocol that specifies everything they 
require. If one can work on anything you can iterate over, put it on Iterable. 
If another requires the objects to be ordered, put it on Sequence. Need to 
express that an algorithm requires a multi-pass sequence? Make a 
MultiPassSequence protocol and put the algorithm on an extension containing 
that requirement. Use protocols to express requirements, as they were designed 
for. Don’t just tack a bunch of methods onto a protocol that isn’t sufficient 
to describe their requirements and say, “oh, by the way, only use this method 
if your implementation meets these conditions…"

>  
>> 
>> `first` is the first object in the Sequence. It doesn’t matter how the 
>> sequence came to be in that order; it doesn’t matter whether or not the 
>> sequence has already been iterated or how many times. `first` is the first 
>> object that is, was, and always will be presented by the Sequence’s 
>> Iterator. (Until the collection is mutated, obviously).
>> 
>> To summarize,
>> A Set has no intrinsic order. You can iterate over it, and a specific 
>> iteration of a set has an order, but that order is not tied to the Set 
>> itself beyond including all and only the items therein. Therefore, the Set 
>> itself has no intrinsic `first`, `last`, lexicographical comparison, etc.; 
>> only its iterations do, and they are not themselves Sets.
>> A Sequence does have an intrinsic order. The order of iteration reflects the 
>> order inherent to the Sequence. Therefore, a Sequence has a `first`, `last`, 
>> lexicographical comparison, etc.
>> 
>> Just in case it’s not obvious, `Set` here is pretty much interchangeable 
>> with any other unordered iterable.
>> 
>> What you want to call a "Sequence" is what Swift calls a `Collection`, with 
>> additional restrictions. What you want to be called "Iterable" is what Swift 
>> calls `Sequence` (or now, `SpongeBob`). Clearly, shuffling names will not 
>> make these protocols support any more functionality, so that can be put 
>> aside.
> 
> No, no, no! What I want to call “Iterable” is specified below. It is about 
> HALF of what’s currently in Sequence—the half that has to do with iterating, 
> whence the name.
> What I want to call Sequence is precisely what Swift now calls Sequence—the 
> methods that are in Iterable by virtue of adopting Iterable, PLUS some 
> methods that only make sense on iterable groups of objects where the 
> iteration order is well-defined.
> 
>> 
>> Now, with that out of the way, why do you think that only `Collection` types 
>> should have `first` and `last`? These helper properties and methods are 
>> simply convenient ways to spell things that can be done with `for...in`--the 
>> whole point of supplying them is to allow people to work with these types in 
>> a more functional style.
> 
> Apparently “collection" was a bad choice of word. What I clearly meant was 
> not the current Swift Collection protocol, but rather an unordered assemblage 
> of objects. UnorderedCollection, perhaps, or if that’s still going to cause 
> issues, try UnorderedAssemblage.  What `first` and `last` really mean in an 
> UnorderedAssemblage is give me some object from the assembled objects, I 
> don’t care which one. For which it’s much more clear to have an `anyObject()` 
> as on NSSet; as another user has pointed out, 
> `assemblage.makeIterator().next()` works just as well. (I just checked, and 
> that’s actually how `first` is implemented. But it’s on Collection, which is 
> guaranteed to be multipass,)
> 
> Again, the point of having a protocol-based design is to allow useful 
> _generic_ algorithms to be written; that `first` and `last` would be 
> equivalent to an arbitrary element in the case that a collection is unordered 
> is not at all an argument against these types conforming to `Collection`; if 
> anything, it's an argument for it.

If a protocol demands the first object, you should give it the first object. If 
you don’t have a first object, maybe you shouldn’t conform to the protocol. If 
the protocol really just needs any old object, call it `anyObject`. 


> Just as `Sequence.underestimatedCount` is equivalent to `Collection.count` 
> for types that conform to `Collection`, or the instance 
> `BinaryInteger.bitWidth` is equivalent to a static `bitWidth` for types that 
> conform to `FixedWidthInteger`.

I don’t see how those are relevant, they all mean what they claim to, unlike 
Set.first/dropFirst/etc.

>>>> public protocol Iterable {
>>>>   associatedtype Iterator: IteratorProtocol
>>>>   func map<T>(...) -> [T] // Iterable where .Iterator.Element == T
>>>>   func filter(...) -> [Iterator.Element] // Iterable where 
>>>> .Iterator.Element == Self.Iterator.Element
>>>>   func forEach(...)
>>>>   func makeIterator() -> Iterator
>>>>   var underestimatedCount: Int { get }
>>>> }
>>>> 
>>>> public protocol Sequence: Iterable { // Maybe OrderedSequence just to make 
>>>> the well-defined-order requirement explicit
>>>>   associatedtype SubSequence
>>>>   func dropFirst(...)   -> SubSequence   // Sequence where 
>>>> .Iterator.Element == Self.Iterator.Element
>>>>   func dropLast(...)    -> SubSequence   //    " "
>>>>   func drop(while...)   -> SubSequence   //    " "
>>>>   func prefix(...)      -> SubSequence   //    " "
>>>>   func prefix(while...) -> SubSequence   //    " "
>>>>   func suffix(...)      -> SubSequence   //    " "
>>>>   func split(...where...)  -> [SubSequence] // Iterable where 
>>>> .Iterator.Element == (Sequence where .Iterator.Element == 
>>>> Self.Iterator.Element)
>>>> }
>> 
>> 
>> And just to be explicit, 
>> struct Set: Iterable {…}
>> struct Dictionary: Iterable {…}
>> struct Array: Sequence {…}
>> etc.
>> 
>> Hopefully at some point:
>> struct OrderedSet: Sequence {…}

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Re: [swift-evolution] [Draft] Rename Sequence.elementsEqual

Reply via email to