Re: [swift-evolution] [Draft] Rename Sequence.elementsEqual

Thorsten Seitz via swift-evolution Mon, 16 Oct 2017 22:56:43 -0700


> Am 17.10.2017 um 00:13 schrieb Xiaodi Wu <xiaodi...@gmail.com>:
> 
> 
> On Mon, Oct 16, 2017 at 14:21 Thorsten Seitz <tseit...@icloud.com> wrote:
>>> Am 16.10.2017 um 16:20 schrieb Xiaodi Wu via swift-evolution 
>>> <swift-evolution@swift.org>:
>>> 
>>> 
>>> On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jh...@gbis.com> wrote:
>>>> 
>>>>> On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote:
>>>>> 
>>>>>> On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jh...@gbis.com> wrote:
>>>>>> 
>>>>>>>> On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote:
>>>>>>>>>>>  That ordering can be arbitrary, but it shouldn’t leak internal 
>>>>>>>>>>> representation such that the method used to create identical things 
>>>>>>>>>>> affects the outcome of generic methods because of differences in 
>>>>>>>>>>> internal representation.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>>  It would be better to say that the iteration order is 
>>>>>>>>>>>>> well-defined. That will almost always mean documented, and 
>>>>>>>>>>>>> usually predictable though obviously e.g. RNGs and iterating in 
>>>>>>>>>>>>> random order will not be predictable by design.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> That's actually more semantically constrained than what Swift 
>>>>>>>>>>>>>>> calls a `Collection` (which requires conforming types to be 
>>>>>>>>>>>>>>> multi-pass and(?) finite). By contrast, Swift's `SpongeBob` 
>>>>>>>>>>>>>>> protocol explicitly permits conforming single-pass, infinite, 
>>>>>>>>>>>>>>> and/or unordered types. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I think you’re talking about Sequence here, I’ve lost track of 
>>>>>>>>>>>>>> your nonsense by now. Yes, the current Swift protocol named 
>>>>>>>>>>>>>> Sequence allows unordered types. You seem to keep asserting that 
>>>>>>>>>>>>>> but not actually addressing my argument, which is that allowing 
>>>>>>>>>>>>>> Sequences to be unordered with the current API is undesired and 
>>>>>>>>>>>>>> actively harmful, and should therefore be changed.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> What is harmful about it?
>>>>>>>>>>>> 
>>>>>>>>>>>> After thinking about it, I think the harmful bit is that unordered 
>>>>>>>>>>>> sequences are leaking internal representation (In your example, 
>>>>>>>>>>>> this is causing people to be surprised when two sets with 
>>>>>>>>>>>> identical elements are generating different sequences/orderings 
>>>>>>>>>>>> based on how they were created).  You are correct when you say 
>>>>>>>>>>>> that this problem is even true for for-in.
>>>>>>>>>>> 
>>>>>>>>>>> I would not say it is a problem. Rather, by definition, iteration 
>>>>>>>>>>> involves retrieving one element after another; if you're allowed to 
>>>>>>>>>>> do that with Set, then the elements of a Set are observably ordered 
>>>>>>>>>>> in some way. Since it's not an OrderedSet--i.e., order doesn't 
>>>>>>>>>>> matter--then the only sensible conclusion is that the order of 
>>>>>>>>>>> elements obtained in a for...in loop must be arbitrary. If you 
>>>>>>>>>>> think this is harmful, then you must believe that one should be 
>>>>>>>>>>> prohibited from iterating over an instance of Set. Otherwise, Set 
>>>>>>>>>>> is inescapably a Sequence by the Swift definition of Sequence. All 
>>>>>>>>>>> extension methods on Sequence like drop(while:) are really just 
>>>>>>>>>>> conveniences for common things that you can do with iterated 
>>>>>>>>>>> access; to my mind, they're essentially just alternative ways of 
>>>>>>>>>>> spelling various for...in loops.
>>>>>>>>>> 
>>>>>>>>>> I think an argument could be made that you shouldn’t be able to 
>>>>>>>>>> iterate over a set without first defining an ordering on it (even if 
>>>>>>>>>> that ordering is somewhat arbitrary).  Maybe we have something like 
>>>>>>>>>> a “Sequenc(e)able” protocol which defines things which can be turned 
>>>>>>>>>> into a sequence when combined with some sort of ordering.  One 
>>>>>>>>>> possible ordering could be the internal representation (At least in 
>>>>>>>>>> that case we are calling it out specifically).  If I had to say 
>>>>>>>>>> “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would 
>>>>>>>>>> definitely be less surprised when it returns false even though setA 
>>>>>>>>>> == setB.
>>>>>>>>> 
>>>>>>>>> Well, that's a totally different direction, then; you're arguing that 
>>>>>>>>> `Set` and `Dictionary` should not conform to `Sequence` altogether. 
>>>>>>>>> That's fine (it's also a direction that some of us explored off-list 
>>>>>>>>> a while ago), but at this point in Swift's evolution, realistically, 
>>>>>>>>> it's not within the realm of possible changes.
>>>>>>>> 
>>>>>>>> I am actually suggesting something slightly different.  Basically, Set 
>>>>>>>> and Dictionary’s conformance to Collection would have a different 
>>>>>>>> implementation.  They would conform to another protocol declaring that 
>>>>>>>> they are unordered. That protocol would fill in part of the 
>>>>>>>> conformance to sequence/collection using a default ordering, which is 
>>>>>>>> mostly arbitrary, but guaranteed to produce the same ordering for the 
>>>>>>>> same list of elements (even across collection types).  This would be 
>>>>>>>> safer, but a tiny bit slower than what we have now (We could also 
>>>>>>>> potentially develop a way for collections like set to amortize the 
>>>>>>>> cost). For those who need to recover speed, the new protocol would 
>>>>>>>> also define a property which quickly returns a sequence/iterator using 
>>>>>>>> the internal ordering (I arbitrarily called it .arbitraryOrder).
>>>>>>>> 
>>>>>>>> I believe it would not be source breaking.
>>>>>>> 
>>>>>>> That is indeed something slightly different.
>>>>>>> 
>>>>>>> In an ideal world--and my initial understanding of what you were 
>>>>>>> suggesting--Set and Dictionary would each have a member like 
>>>>>>> `collection`, which would expose the underlying data as a 
>>>>>>> `SetCollection` or `DictionaryCollection` that in turn would conform to 
>>>>>>> `Collection`; meanwhile, Set and Dictionary themselves would not offer 
>>>>>>> methods such as `prefix`, or indexing by subscript, which are not 
>>>>>>> compatible with being unordered. For those who want a particular 
>>>>>>> ordering, there'd be something like `collection(ordered 
>>>>>>> areInIncreasingOrder: (T, T) -> Bool) -> {Set|Dictionary}Collection`.
>>>>>>> 
>>>>>>> What you suggest here instead would be minimally source-breaking. 
>>>>>>> However, I'm unsure of where these guarantees provide benefit to 
>>>>>>> justify the performance cost. Certainly not for `first` or 
>>>>>>> `dropFirst(_:)`, which still yields an arbitrary result which doesn't 
>>>>>>> make sense for something _unordered_. We *could* have an underscored 
>>>>>>> customization point named something like `_customOrderingPass` that is 
>>>>>>> only invoked from `elementsEqual` or other such methods to 
>>>>>>> pre-rearrange the internal ordering of unordered collections in some 
>>>>>>> deterministic way before comparison. Is that what you have in mind?
>>>>>> 
>>>>>> 
>>>>>> Something like that.  Whatever we do, there will be a tradeoff between 
>>>>>> speed, correctness, and ergonomics.
>>>>>> 
>>>>>> My suggestion trades speed for correctness, and provides a way to 
>>>>>> recover speed through additional typing (which is slightly less 
>>>>>> ergonomic).
>>>>> 
>>>>> You haven't convinced me that this is at all improved in "correctness." 
>>>>> It trades one arbitrary iteration order for another on a type that tries 
>>>>> to model an unordered collection.
>>>>>  
>>>>>> We could do something like you suggest. I don’t think the method would 
>>>>>> need to be underscored… the ordering pass could just be a method on the 
>>>>>> protocol which defines it as unordered.  Then we could provide a special 
>>>>>> conformance for things where order really matters based on adherence to 
>>>>>> that protocol.  That might be an acceptable tradeoff.  It would give us 
>>>>>> speed at the cost of having the correct implementation being less 
>>>>>> ergonomic and more error prone (you have to remember to check that it is 
>>>>>> unordered and call the ordering method when it mattered).
>>>>>> 
>>>>>> I’d still be a bit worried that people would make incorrect generic 
>>>>>> algorithms based on expecting an order from unordered things, but at 
>>>>>> least it would be possible for them check and handle it correctly.  I 
>>>>>> think I could get behind that tradeoff/compromise, given where we are in 
>>>>>> the swift process and Swift's obsession with speed (though I still 
>>>>>> slightly prefer the safer default).  At least the standard library would 
>>>>>> handle all the things correctly, and that is what will affect the 
>>>>>> majority of programmers.
>>>>> 
>>>>> What is an example of such an "incorrect" generic algorithm that would be 
>>>>> made correct by such a scheme?
>>>> 
>>>> To start with, the one you gave as an example at the beginning of this 
>>>> discussion: Two sets with identical elements which have different internal 
>>>> storage and thus give different orderings as sequences.  You yourself have 
>>>> argued that the confusion around this is enough of a problem that we need 
>>>> to make a source-breaking change (renaming it) to warn people that the 
>>>> results of the ‘elementsEqual’ algorithm are undefined for sets and 
>>>> dictionaries.
>>> 
>>> No, I am arguing that the confusion about ‘elementsEqual’ is foremost a 
>>> problem with its name; the result of this operation is not at all undefined 
>>> for two sets but actually clearly defined: it returns true if two sets have 
>>> the same elements in the same iteration order, which is a publicly 
>>> observable behavior of sets (likewise dictionaries).
>> 
>> But it is a behavior which has absolutely no meaning at all because the 
>> order does not depend on the elements of the set but on the history of how 
>> the set has been reached its current state.
>> So why should I ever use this method on a set? 
>> What is the use case?
> 
> One example: you can use it to check an instance of Set<Float> to determine 
> if it has a NaN value. (The “obvious” way of doing it is not guaranteed to 
> work since NaN != NaN.)


How would I do that? I'd rather expect to use a property isNaN on Float to do 
that.


> 
>>>> I don’t see why a non-source-breaking change is suddenly off-limits.
>>>> 
>>>> But more than that, any generic algorithm which is assuming that the 
>>>> sequence is coming from an ordered source (i.e. many things using 
>>>> first/last).  Some uses of first are ok because the programmer actually 
>>>> means ‘any’, but anywhere where they actually mean first/last may be 
>>>> problematic.
>>> 
>>> Such as...?
>>> 
>>>> Currently, there is no way to test for ordered-ness, so there is no way 
>>>> for even a careful programmer to mitigate this problem.  By adding a 
>>>> protocol which states that something is unordered, we can either branch on 
>>>> it, or create a separate version of an algorithm for things which conform.
>>> 
>>> It is clearly the case that Swift’s protocol hierarchy fits sets and 
>>> collections imperfectly; however, it is in the nature of modeling that 
>>> imperfections are present. The question is not whether it is possible to 
>>> incur performance, API surface area, and other trade-offs to make the model 
>>> more faithful, but rather whether this usefully solves any problem. What is 
>>> the problem being mitigated? As I write above, Swift’s Set and Dictionary 
>>> types meet the semantic requirements for Collection and moonlight as 
>>> ordered collections. What is a generic algorithm on an ordered collection 
>>> that is  “not OK” for Set and Dictionary? (“elementsEqual”, as I’ve said, 
>>> is not such an example.)
>> 
>> On the contrary, `elementsEqual` is exactly such an example, because it 
>> makes no sense to use it on a Set.
>> 
>> let s1 = Set([1,2,3,4,5,6])
>> let s2 = Set([6,5,4,3,2,1])
>> 
>> Both sets have different iteration orders. Comparing those sets with some 
>> other collection using `elementsEqual` will give no meaningful result 
>> because the order - and therefore the result of `elementsEqual` - is in 
>> effect random.
> 
> No, it is not such an example; it’s misleadingly named but works 
> correctly—that is, its behavior matches exactly the documented behavior, 
> which relies on only the semantic guarantees of Sequence, which Set correctly 
> fulfills.

Fulfills to the letter. Again, what can you do with it if the result is random??

-Thorsten

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Re: [swift-evolution] [Draft] Rename Sequence.elementsEqual

Reply via email to