Re: [swift-evolution] [Draft] Rename Sequence.elementsEqual

Thorsten Seitz via swift-evolution Tue, 17 Oct 2017 08:59:11 -0700


> Am 17.10.2017 um 14:44 schrieb Xiaodi Wu <xiaodi...@gmail.com>:
> 
> 
>> On Tue, Oct 17, 2017 at 00:56 Thorsten Seitz <tseit...@icloud.com> wrote:
>> 
>> 
>>> Am 17.10.2017 um 00:13 schrieb Xiaodi Wu <xiaodi...@gmail.com>:
>>> 
>>> 
>>> On Mon, Oct 16, 2017 at 14:21 Thorsten Seitz <tseit...@icloud.com> wrote:
>>>>> Am 16.10.2017 um 16:20 schrieb Xiaodi Wu via swift-evolution 
>>>>> <swift-evolution@swift.org>:
>>>>> 
>>>>> 
>>>>> On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jh...@gbis.com> wrote:
>>>>>> 
>>>>>>> On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jh...@gbis.com> wrote:
>>>>>>>> 
>>>>>>>>>> On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote:
>>>>>>>>>>>>>  That ordering can be arbitrary, but it shouldn’t leak internal 
>>>>>>>>>>>>> representation such that the method used to create identical 
>>>>>>>>>>>>> things affects the outcome of generic methods because of 
>>>>>>>>>>>>> differences in internal representation.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>  It would be better to say that the iteration order is 
>>>>>>>>>>>>>>> well-defined. That will almost always mean documented, and 
>>>>>>>>>>>>>>> usually predictable though obviously e.g. RNGs and iterating in 
>>>>>>>>>>>>>>> random order will not be predictable by design.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> That's actually more semantically constrained than what Swift 
>>>>>>>>>>>>>>>>> calls a `Collection` (which requires conforming types to be 
>>>>>>>>>>>>>>>>> multi-pass and(?) finite). By contrast, Swift's `SpongeBob` 
>>>>>>>>>>>>>>>>> protocol explicitly permits conforming single-pass, infinite, 
>>>>>>>>>>>>>>>>> and/or unordered types. 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I think you’re talking about Sequence here, I’ve lost track of 
>>>>>>>>>>>>>>>> your nonsense by now. Yes, the current Swift protocol named 
>>>>>>>>>>>>>>>> Sequence allows unordered types. You seem to keep asserting 
>>>>>>>>>>>>>>>> that but not actually addressing my argument, which is that 
>>>>>>>>>>>>>>>> allowing Sequences to be unordered with the current API is 
>>>>>>>>>>>>>>>> undesired and actively harmful, and should therefore be 
>>>>>>>>>>>>>>>> changed.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> What is harmful about it?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> After thinking about it, I think the harmful bit is that 
>>>>>>>>>>>>>> unordered sequences are leaking internal representation (In your 
>>>>>>>>>>>>>> example, this is causing people to be surprised when two sets 
>>>>>>>>>>>>>> with identical elements are generating different 
>>>>>>>>>>>>>> sequences/orderings based on how they were created).  You are 
>>>>>>>>>>>>>> correct when you say that this problem is even true for for-in.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I would not say it is a problem. Rather, by definition, iteration 
>>>>>>>>>>>>> involves retrieving one element after another; if you're allowed 
>>>>>>>>>>>>> to do that with Set, then the elements of a Set are observably 
>>>>>>>>>>>>> ordered in some way. Since it's not an OrderedSet--i.e., order 
>>>>>>>>>>>>> doesn't matter--then the only sensible conclusion is that the 
>>>>>>>>>>>>> order of elements obtained in a for...in loop must be arbitrary. 
>>>>>>>>>>>>> If you think this is harmful, then you must believe that one 
>>>>>>>>>>>>> should be prohibited from iterating over an instance of Set. 
>>>>>>>>>>>>> Otherwise, Set is inescapably a Sequence by the Swift definition 
>>>>>>>>>>>>> of Sequence. All extension methods on Sequence like drop(while:) 
>>>>>>>>>>>>> are really just conveniences for common things that you can do 
>>>>>>>>>>>>> with iterated access; to my mind, they're essentially just 
>>>>>>>>>>>>> alternative ways of spelling various for...in loops.
>>>>>>>>>>>> 
>>>>>>>>>>>> I think an argument could be made that you shouldn’t be able to 
>>>>>>>>>>>> iterate over a set without first defining an ordering on it (even 
>>>>>>>>>>>> if that ordering is somewhat arbitrary).  Maybe we have something 
>>>>>>>>>>>> like a “Sequenc(e)able” protocol which defines things which can be 
>>>>>>>>>>>> turned into a sequence when combined with some sort of ordering.  
>>>>>>>>>>>> One possible ordering could be the internal representation (At 
>>>>>>>>>>>> least in that case we are calling it out specifically).  If I had 
>>>>>>>>>>>> to say “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I 
>>>>>>>>>>>> would definitely be less surprised when it returns false even 
>>>>>>>>>>>> though setA == setB.
>>>>>>>>>>> 
>>>>>>>>>>> Well, that's a totally different direction, then; you're arguing 
>>>>>>>>>>> that `Set` and `Dictionary` should not conform to `Sequence` 
>>>>>>>>>>> altogether. That's fine (it's also a direction that some of us 
>>>>>>>>>>> explored off-list a while ago), but at this point in Swift's 
>>>>>>>>>>> evolution, realistically, it's not within the realm of possible 
>>>>>>>>>>> changes.
>>>>>>>>>> 
>>>>>>>>>> I am actually suggesting something slightly different.  Basically, 
>>>>>>>>>> Set and Dictionary’s conformance to Collection would have a 
>>>>>>>>>> different implementation.  They would conform to another protocol 
>>>>>>>>>> declaring that they are unordered. That protocol would fill in part 
>>>>>>>>>> of the conformance to sequence/collection using a default ordering, 
>>>>>>>>>> which is mostly arbitrary, but guaranteed to produce the same 
>>>>>>>>>> ordering for the same list of elements (even across collection 
>>>>>>>>>> types).  This would be safer, but a tiny bit slower than what we 
>>>>>>>>>> have now (We could also potentially develop a way for collections 
>>>>>>>>>> like set to amortize the cost). For those who need to recover speed, 
>>>>>>>>>> the new protocol would also define a property which quickly returns 
>>>>>>>>>> a sequence/iterator using the internal ordering (I arbitrarily 
>>>>>>>>>> called it .arbitraryOrder).
>>>>>>>>>> 
>>>>>>>>>> I believe it would not be source breaking.
>>>>>>>>> 
>>>>>>>>> That is indeed something slightly different.
>>>>>>>>> 
>>>>>>>>> In an ideal world--and my initial understanding of what you were 
>>>>>>>>> suggesting--Set and Dictionary would each have a member like 
>>>>>>>>> `collection`, which would expose the underlying data as a 
>>>>>>>>> `SetCollection` or `DictionaryCollection` that in turn would conform 
>>>>>>>>> to `Collection`; meanwhile, Set and Dictionary themselves would not 
>>>>>>>>> offer methods such as `prefix`, or indexing by subscript, which are 
>>>>>>>>> not compatible with being unordered. For those who want a particular 
>>>>>>>>> ordering, there'd be something like `collection(ordered 
>>>>>>>>> areInIncreasingOrder: (T, T) -> Bool) -> {Set|Dictionary}Collection`.
>>>>>>>>> 
>>>>>>>>> What you suggest here instead would be minimally source-breaking. 
>>>>>>>>> However, I'm unsure of where these guarantees provide benefit to 
>>>>>>>>> justify the performance cost. Certainly not for `first` or 
>>>>>>>>> `dropFirst(_:)`, which still yields an arbitrary result which doesn't 
>>>>>>>>> make sense for something _unordered_. We *could* have an underscored 
>>>>>>>>> customization point named something like `_customOrderingPass` that 
>>>>>>>>> is only invoked from `elementsEqual` or other such methods to 
>>>>>>>>> pre-rearrange the internal ordering of unordered collections in some 
>>>>>>>>> deterministic way before comparison. Is that what you have in mind?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Something like that.  Whatever we do, there will be a tradeoff between 
>>>>>>>> speed, correctness, and ergonomics.
>>>>>>>> 
>>>>>>>> My suggestion trades speed for correctness, and provides a way to 
>>>>>>>> recover speed through additional typing (which is slightly less 
>>>>>>>> ergonomic).
>>>>>>> 
>>>>>>> You haven't convinced me that this is at all improved in "correctness." 
>>>>>>> It trades one arbitrary iteration order for another on a type that 
>>>>>>> tries to model an unordered collection.
>>>>>>>  
>>>>>>>> We could do something like you suggest. I don’t think the method would 
>>>>>>>> need to be underscored… the ordering pass could just be a method on 
>>>>>>>> the protocol which defines it as unordered.  Then we could provide a 
>>>>>>>> special conformance for things where order really matters based on 
>>>>>>>> adherence to that protocol.  That might be an acceptable tradeoff.  It 
>>>>>>>> would give us speed at the cost of having the correct implementation 
>>>>>>>> being less ergonomic and more error prone (you have to remember to 
>>>>>>>> check that it is unordered and call the ordering method when it 
>>>>>>>> mattered).
>>>>>>>> 
>>>>>>>> I’d still be a bit worried that people would make incorrect generic 
>>>>>>>> algorithms based on expecting an order from unordered things, but at 
>>>>>>>> least it would be possible for them check and handle it correctly.  I 
>>>>>>>> think I could get behind that tradeoff/compromise, given where we are 
>>>>>>>> in the swift process and Swift's obsession with speed (though I still 
>>>>>>>> slightly prefer the safer default).  At least the standard library 
>>>>>>>> would handle all the things correctly, and that is what will affect 
>>>>>>>> the majority of programmers.
>>>>>>> 
>>>>>>> What is an example of such an "incorrect" generic algorithm that would 
>>>>>>> be made correct by such a scheme?
>>>>>> 
>>>>>> To start with, the one you gave as an example at the beginning of this 
>>>>>> discussion: Two sets with identical elements which have different 
>>>>>> internal storage and thus give different orderings as sequences.  You 
>>>>>> yourself have argued that the confusion around this is enough of a 
>>>>>> problem that we need to make a source-breaking change (renaming it) to 
>>>>>> warn people that the results of the ‘elementsEqual’ algorithm are 
>>>>>> undefined for sets and dictionaries.
>>>>> 
>>>>> No, I am arguing that the confusion about ‘elementsEqual’ is foremost a 
>>>>> problem with its name; the result of this operation is not at all 
>>>>> undefined for two sets but actually clearly defined: it returns true if 
>>>>> two sets have the same elements in the same iteration order, which is a 
>>>>> publicly observable behavior of sets (likewise dictionaries).
>>>> 
>>>> But it is a behavior which has absolutely no meaning at all because the 
>>>> order does not depend on the elements of the set but on the history of how 
>>>> the set has been reached its current state.
>>>> So why should I ever use this method on a set? 
>>>> What is the use case?
>>> 
>>> One example: you can use it to check an instance of Set<Float> to determine 
>>> if it has a NaN value. (The “obvious” way of doing it is not guaranteed to 
>>> work since NaN != NaN.)
>> 
>> How would I do that? I'd rather expect to use a property isNaN on Float to 
>> do that.
> 
> set.elementsEqual(set)


If this is the only use case for `elementsEqual` I suggest we remove that 
method and just use Float.isNaN:

set.contains { $0.isNaN }

which I argue is far more readable (intention revealing).

-Thorsten 


> 
>>>>>> I don’t see why a non-source-breaking change is suddenly off-limits.
>>>>>> 
>>>>>> But more than that, any generic algorithm which is assuming that the 
>>>>>> sequence is coming from an ordered source (i.e. many things using 
>>>>>> first/last).  Some uses of first are ok because the programmer actually 
>>>>>> means ‘any’, but anywhere where they actually mean first/last may be 
>>>>>> problematic.
>>>>> 
>>>>> Such as...?
>>>>> 
>>>>>> Currently, there is no way to test for ordered-ness, so there is no way 
>>>>>> for even a careful programmer to mitigate this problem.  By adding a 
>>>>>> protocol which states that something is unordered, we can either branch 
>>>>>> on it, or create a separate version of an algorithm for things which 
>>>>>> conform.
>>>>> 
>>>>> It is clearly the case that Swift’s protocol hierarchy fits sets and 
>>>>> collections imperfectly; however, it is in the nature of modeling that 
>>>>> imperfections are present. The question is not whether it is possible to 
>>>>> incur performance, API surface area, and other trade-offs to make the 
>>>>> model more faithful, but rather whether this usefully solves any problem. 
>>>>> What is the problem being mitigated? As I write above, Swift’s Set and 
>>>>> Dictionary types meet the semantic requirements for Collection and 
>>>>> moonlight as ordered collections. What is a generic algorithm on an 
>>>>> ordered collection that is  “not OK” for Set and Dictionary? 
>>>>> (“elementsEqual”, as I’ve said, is not such an example.)
>>>> 
>>>> On the contrary, `elementsEqual` is exactly such an example, because it 
>>>> makes no sense to use it on a Set.
>>>> 
>>>> let s1 = Set([1,2,3,4,5,6])
>>>> let s2 = Set([6,5,4,3,2,1])
>>>> 
>>>> Both sets have different iteration orders. Comparing those sets with some 
>>>> other collection using `elementsEqual` will give no meaningful result 
>>>> because the order - and therefore the result of `elementsEqual` - is in 
>>>> effect random.
>>> 
>>> No, it is not such an example; it’s misleadingly named but works 
>>> correctly—that is, its behavior matches exactly the documented behavior, 
>>> which relies on only the semantic guarantees of Sequence, which Set 
>>> correctly fulfills.
>> 
>> Fulfills to the letter. Again, what can you do with it if the result is 
>> random??
> 
> The result is not random.
>

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Re: [swift-evolution] [Draft] Rename Sequence.elementsEqual

Reply via email to