On Tue, Oct 17, 2017 at 12:54 Jonathan Hull <jh...@gbis.com> wrote: > On Oct 17, 2017, at 9:34 AM, Xiaodi Wu <xiaodi...@gmail.com> wrote: > > > On Tue, Oct 17, 2017 at 09:42 Jonathan Hull <jh...@gbis.com> wrote: > >> On Oct 17, 2017, at 5:44 AM, Xiaodi Wu <xiaodi...@gmail.com> wrote: >> >> >> On Tue, Oct 17, 2017 at 00:56 Thorsten Seitz <tseit...@icloud.com> wrote: >> >>> >>> >>> Am 17.10.2017 um 00:13 schrieb Xiaodi Wu <xiaodi...@gmail.com>: >>> >>> >>> On Mon, Oct 16, 2017 at 14:21 Thorsten Seitz <tseit...@icloud.com> >>> wrote: >>> >>>> Am 16.10.2017 um 16:20 schrieb Xiaodi Wu via swift-evolution < >>>> swift-evolution@swift.org>: >>>> >>>> >>>> On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jh...@gbis.com> wrote: >>>> >>>>> >>>>> On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote: >>>>> >>>>> On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jh...@gbis.com> wrote: >>>>> >>>>>> >>>>>> On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote: >>>>>> >>>>>>> That ordering can be arbitrary, but it shouldn’t leak internal >>>>>>>> representation such that the method used to create identical things >>>>>>>> affects >>>>>>>> the outcome of generic methods because of differences in internal >>>>>>>> representation. >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> It would be better to say that the iteration order is >>>>>>>>> well-defined. That will almost always mean documented, and usually >>>>>>>>> predictable though obviously e.g. RNGs and iterating in random order >>>>>>>>> will >>>>>>>>> not be predictable by design. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> That's actually more semantically constrained than what Swift >>>>>>>>>> calls a `Collection` (which requires conforming types to be >>>>>>>>>> multi-pass >>>>>>>>>> and(?) finite). By contrast, Swift's `SpongeBob` protocol explicitly >>>>>>>>>> permits conforming single-pass, infinite, and/or unordered types. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I think you’re talking about Sequence here, I’ve lost track of >>>>>>>>>> your nonsense by now. Yes, the current Swift protocol named Sequence >>>>>>>>>> allows >>>>>>>>>> unordered types. You seem to keep asserting that but not actually >>>>>>>>>> addressing my argument, which is *that allowing Sequences to be >>>>>>>>>> unordered with the current API is undesired and actively harmful, and >>>>>>>>>> should* *therefore** be changed*. >>>>>>>>>> >>>>>>>>> >>>>>>>>> What is harmful about it? >>>>>>>>> >>>>>>>>> >>>>>>>>> After thinking about it, I think the harmful bit is that unordered >>>>>>>>> sequences are leaking internal representation (In your example, this >>>>>>>>> is >>>>>>>>> causing people to be surprised when two sets with identical elements >>>>>>>>> are >>>>>>>>> generating different sequences/orderings based on how they were >>>>>>>>> created). >>>>>>>>> You are correct when you say that this problem is even true for >>>>>>>>> for-in. >>>>>>>>> >>>>>>>> >>>>>>>> I would not say it is a problem. Rather, by definition, iteration >>>>>>>> involves retrieving one element after another; if you're allowed to do >>>>>>>> that >>>>>>>> with Set, then the elements of a Set are observably ordered in some >>>>>>>> way. >>>>>>>> Since it's not an OrderedSet--i.e., order doesn't matter--then the only >>>>>>>> sensible conclusion is that the order of elements obtained in a >>>>>>>> for...in >>>>>>>> loop must be arbitrary. If you think this is harmful, then you must >>>>>>>> believe >>>>>>>> that one should be prohibited from iterating over an instance of Set. >>>>>>>> Otherwise, Set is inescapably a Sequence by the Swift definition of >>>>>>>> Sequence. All extension methods on Sequence like drop(while:) are >>>>>>>> really >>>>>>>> just conveniences for common things that you can do with iterated >>>>>>>> access; >>>>>>>> to my mind, they're essentially just alternative ways of spelling >>>>>>>> various >>>>>>>> for...in loops. >>>>>>>> >>>>>>>> >>>>>>>> I think an argument could be made that you shouldn’t be able to >>>>>>>> iterate over a set without first defining an ordering on it (even if >>>>>>>> that >>>>>>>> ordering is somewhat arbitrary). Maybe we have something like a >>>>>>>> “Sequenc(e)able” protocol which defines things which can be turned >>>>>>>> into a >>>>>>>> sequence when combined with some sort of ordering. One possible >>>>>>>> ordering >>>>>>>> could be the internal representation (At least in that case we are >>>>>>>> calling >>>>>>>> it out specifically). If I had to say >>>>>>>> “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would >>>>>>>> definitely >>>>>>>> be less surprised when it returns false even though setA == setB. >>>>>>>> >>>>>>> >>>>>>> Well, that's a totally different direction, then; you're arguing >>>>>>> that `Set` and `Dictionary` should not conform to `Sequence` altogether. >>>>>>> That's fine (it's also a direction that some of us explored off-list a >>>>>>> while ago), but at this point in Swift's evolution, realistically, it's >>>>>>> not >>>>>>> within the realm of possible changes. >>>>>>> >>>>>>> >>>>>>> I am actually suggesting something slightly different. Basically, >>>>>>> Set and Dictionary’s conformance to Collection would have a different >>>>>>> implementation. They would conform to another protocol declaring that >>>>>>> they >>>>>>> are unordered. That protocol would fill in part of the conformance to >>>>>>> sequence/collection using a default ordering, which is mostly arbitrary, >>>>>>> but guaranteed to produce the same ordering for the same list of >>>>>>> elements >>>>>>> (even across collection types). This would be safer, but a tiny bit >>>>>>> slower >>>>>>> than what we have now (We could also potentially develop a way for >>>>>>> collections like set to amortize the cost). For those who need to >>>>>>> recover >>>>>>> speed, the new protocol would also define a property which quickly >>>>>>> returns >>>>>>> a sequence/iterator using the internal ordering (I arbitrarily called it >>>>>>> .arbitraryOrder). >>>>>>> >>>>>>> I believe it would not be source breaking. >>>>>>> >>>>>> >>>>>> That is indeed something slightly different. >>>>>> >>>>>> In an ideal world--and my initial understanding of what you were >>>>>> suggesting--Set and Dictionary would each have a member like >>>>>> `collection`, >>>>>> which would expose the underlying data as a `SetCollection` or >>>>>> `DictionaryCollection` that in turn would conform to `Collection`; >>>>>> meanwhile, Set and Dictionary themselves would not offer methods such as >>>>>> `prefix`, or indexing by subscript, which are not compatible with being >>>>>> unordered. For those who want a particular ordering, there'd be something >>>>>> like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> >>>>>> {Set|Dictionary}Collection`. >>>>>> >>>>>> What you suggest here instead would be minimally source-breaking. >>>>>> However, I'm unsure of where these guarantees provide benefit to justify >>>>>> the performance cost. Certainly not for `first` or `dropFirst(_:)`, which >>>>>> still yields an arbitrary result which doesn't make sense for something >>>>>> _unordered_. We *could* have an underscored customization point named >>>>>> something like `_customOrderingPass` that is only invoked from >>>>>> `elementsEqual` or other such methods to pre-rearrange the internal >>>>>> ordering of unordered collections in some deterministic way before >>>>>> comparison. Is that what you have in mind? >>>>>> >>>>>> >>>>>> >>>>>> Something like that. Whatever we do, there will be a tradeoff >>>>>> between speed, correctness, and ergonomics. >>>>>> >>>>>> My suggestion trades speed for correctness, and provides a way to >>>>>> recover speed through additional typing (which is slightly less >>>>>> ergonomic). >>>>>> >>>>> >>>>> You haven't convinced me that this is at all improved in >>>>> "correctness." It trades one arbitrary iteration order for another on a >>>>> type that tries to model an unordered collection. >>>>> >>>>> >>>>>> We could do something like you suggest. I don’t think the method >>>>>> would need to be underscored… the ordering pass could just be a method on >>>>>> the protocol which defines it as unordered. Then we could provide a >>>>>> special conformance for things where order really matters based on >>>>>> adherence to that protocol. That might be an acceptable tradeoff. It >>>>>> would give us speed at the cost of having the correct implementation >>>>>> being >>>>>> less ergonomic and more error prone (you have to remember to check that >>>>>> it >>>>>> is unordered and call the ordering method when it mattered). >>>>>> >>>>>> I’d still be a bit worried that people would make incorrect generic >>>>>> algorithms based on expecting an order from unordered things, but at >>>>>> least >>>>>> it would be possible for them check and handle it correctly. I think I >>>>>> could get behind that tradeoff/compromise, given where we are in the >>>>>> swift >>>>>> process and Swift's obsession with speed (though I still slightly prefer >>>>>> the safer default). At least the standard library would handle all the >>>>>> things correctly, and that is what will affect the majority of >>>>>> programmers. >>>>>> >>>>> >>>>> What is an example of such an "incorrect" generic algorithm that would >>>>> be made correct by such a scheme? >>>>> >>>>> >>>>> To start with, the one you gave as an example at the beginning of this >>>>> discussion: Two sets with identical elements which have different internal >>>>> storage and thus give different orderings as sequences. You yourself have >>>>> argued that the confusion around this is enough of a problem that we need >>>>> to make a source-breaking change (renaming it) to warn people that the >>>>> results of the ‘elementsEqual’ algorithm are undefined for sets and >>>>> dictionaries. >>>>> >>>> >>>> No, I am arguing that the confusion about ‘elementsEqual’ is foremost a >>>> problem with its name; the result of this operation is not at all undefined >>>> for two sets but actually clearly defined: it returns true if two sets have >>>> the same elements in the same iteration order, which is a publicly >>>> observable behavior of sets (likewise dictionaries). >>>> >>>> >>>> But it is a behavior which has absolutely no meaning at all because the >>>> order does not depend on the elements of the set but on the history of how >>>> the set has been reached its current state. >>>> So why should I ever use this method on a set? >>>> What is the use case? >>>> >>> >>> One example: you can use it to check an instance of Set<Float> to >>> determine if it has a NaN value. (The “obvious” way of doing it is not >>> guaranteed to work since NaN != NaN.) >>> >>> >>> How would I do that? I'd rather expect to use a property isNaN on Float >>> to do that. >>> >> >> set.elementsEqual(set) >> >> >> I see why that would work (thanks to Set being a collection vs a >> sequence), but it still feels like a hack. I definitely wouldn’t want to >> be maintaining code with that in it. Especially when compared to something >> like: >> >> set.contains(where: {$0.isNaN}) >> > > The purpose of protocols is to enable useful generic code. You cannot use > isNaN for code that works on generic Collection, or for that matter, even > code that works on Set where Element : Numeric. > > Much generic code (for example, generic sorting) easily blows up when > encountering NaN. If you want an algorithm that is robust enough to handle > (or at least reject) that scenario, then you will need to check for it. It > is not a “hack” but a very common and accepted way of determining the > presence of NaN. > > > Just because a hack is commonly accepted or common doesn’t mean it isn’t a > hack. >
It’s not a “hack.” NaN is required by IEEE fiat not to be equivalent to itself. Asking whether a value is reflexively equivalent to itself is guaranteed to be sufficient to detect the presence of NaN in all IEEE-compliant contexts, which is to say all past, present, and future versions of Swift. Using elementsEqual in this way (in a generic context which may or may not > involve floats) is definitely a hack, and is likely to bite you at some > point. > How is it definitely a hack when it is blessed by IEEE, and how can it bite me? I don’t see why a non-source-breaking change is suddenly off-limits. >>>>> >>>>> But more than that, any generic algorithm which is assuming that the >>>>> sequence is coming from an ordered source (i.e. many things using >>>>> first/last). Some uses of first are ok because the programmer actually >>>>> means ‘any’, but anywhere where they actually mean first/last may be >>>>> problematic. >>>>> >>>> >>>> Such as...? >>>> >>>> Currently, there is no way to test for ordered-ness, so there is no way >>>>> for even a careful programmer to mitigate this problem. By adding a >>>>> protocol which states that something is unordered, we can either branch on >>>>> it, or create a separate version of an algorithm for things which conform. >>>>> >>>> >>>> It is clearly the case that Swift’s protocol hierarchy fits sets and >>>> collections imperfectly; however, it is in the nature of modeling that >>>> imperfections are present. The question is not whether it is possible to >>>> incur performance, API surface area, and other trade-offs to make the model >>>> more faithful, but rather whether this usefully solves any problem. What is >>>> the problem being mitigated? As I write above, Swift’s Set and Dictionary >>>> types meet the semantic requirements for Collection and moonlight as >>>> ordered collections. What is a generic algorithm on an ordered collection >>>> that is “not OK” for Set and Dictionary? (“elementsEqual”, as I’ve said, >>>> is not such an example.) >>>> >>>> >>>> On the contrary, `elementsEqual` is exactly such an example, because it >>>> makes no sense to use it on a Set. >>>> >>>> let s1 = Set([1,2,3,4,5,6]) >>>> let s2 = Set([6,5,4,3,2,1]) >>>> >>>> Both sets have different iteration orders. Comparing those sets with >>>> some other collection using `elementsEqual` will give no meaningful result >>>> because the order - and therefore the result of `elementsEqual` - is in >>>> effect random. >>>> >>> >>> No, it is not such an example; it’s misleadingly named but works >>> correctly—that is, its behavior matches exactly the documented behavior, >>> which relies on only the semantic guarantees of Sequence, which Set >>> correctly fulfills. >>> >>> >>> Fulfills to the letter. Again, what can you do with it if the result is >>> random?? >>> >> >> The result is not random. >> >> >> It is undefined though. As you said earlier, by the guarantees we have >> been given, it may shift over different builds/runs of a program. Thus in >> one run, it might return true and then false in another (without changing >> our code). As Greg pointed out, it is also possible with the guarantees we >> are given, for set/dict to have different orderings with copies of >> themselves. (It will happen for sure when deep copying a dictionary with >> reference-type keys). >> > > As I wrote to Greg, if that is a possibility for Set, then the > implementation is problematic for conformance to Collection and needs to be > re-examined. > > > So why not fix both things at once with a single change? > > > I’m not sure why you claim the order of a dictionary will definitiely > change on deep copying. But in any case, we are not talking about deep > copying here, or at least, I’m not; we’re talking about the notional > copying involved in the semantics of passing a set as an argument. > > > Because the ordering of some dictionaries depends on the memory address of > the key. There is no way to copy the key in those cases without changing > the ordering. You can watch their ordering shift in the debugger from run > to run. > I’m fine with that. Instances of reference types have identity. A deep copy of a dictionary *has different keys* from the original. This may not be so from the perspective of a custom == operator, but it most certainly is so from the perspective of reference type semantics. So to me it is fine (correct, even?) that a deep copy of a dictionary has a different iteration order—it has different keys. Again, this is something entirely different from notional copies made in the course of passing values as arguments. As I keep saying, relying on the behavior of a leaking internal >> implementation is a bad plan. >> > > Again, it is not a leaking internal implementation. It is a public API > guarantee. > > > The public API guarantee is that there will be some ordering that is > stable for a particular instance on a particular run (as long as it is not > mutated/copied). The actual ordering is undefined. > > Any generic code which depends on a particular ordering will have output > which is undefined. > Generic code is incorrect that “depends on a particular ordering” because it assumes semantics that are not guaranteed. Just because a function returns a seemingly predictable value doesn’t mean > it is entirely deterministic. There are lots of vulnerabilities in C/C++ > due to reliance on undefined behavior. > > > We should add an additional guarantee to set/dict that the order returned >> will be the same for the same contents regardless of history (but can be >> otherwise arbitrary). That will fix the behavior for algorithms like >> elementsEqual (i.e. They will return the same result across builds/runs). >> It will also implicitly provide as a result, the constraint you were >> arguing is needed across copies of a collection type. I agree that that is >> an important guarantee. Why not fix both issues with a single >> non-source-breaking change? >> > > As I wrote, the behavior of elementsEqual is not broken and does not > require fixing. > > > Your argument for that is tautological though. You could argue for > keeping any bug in the codebase using the same reasoning. > > Why was elementsEqual created? Isn’t it meant to check equality of two > sequences in a generic context where == isn’t available? > No no no no no no no no. That’s precisely why the name is misleading. elementsEqual is *not* simply a mixed-type version of ==. Remember that == as implemented on a concrete sequence type *has no obligation* to use elementwise comparison using the element’s implementation of ==. This is not merely theoretical: [Float].== does *not* do an elementwise comparison using Float.==. By contrast, you are guaranteed a true elementwise comparison with elementsEqual regardless of how equivalence is defined for the sequence. Isn’t it problematic that two equivalent sets may not be equivalent in a > generic context? > Again, this shows why the name is unfortunate. elementsEqual is not supposed to be a generic version of ==, and its name is misleading. Repeat: elementsEqual is *not* an alternative for == in either the generic or concrete context. Neither one implies the other. What is the benefit of the guarantee you propose that justifies the > performance cost? > > > The benefit is that generic algorithms which rely on ordering of unordered > things would be much more robust. It would solve your copy issue as well. > > You are acting like there is a huge performance cost. There isn’t. For > Set, it would still have the traditional performance characteristics for > Sets. Any slowdown (by un-cutting corners) would be amortized over > insertions. (note: I am talking option #3 from my summary here, not #5, > which was the one with an actual performance cost) > > > Why is the source-breaking change of renaming things better? >> > > Because, in my analysis, the problem is that the method is incorrectly > named. The problem affects all types that conform to Sequence and not just > Set and Dictionary; elementsEqual is a distinct function from ==, and it > must either continue to be distinct or cease to exist, but its name does > nothing to clarify any distinction. > > > It also won’t fix the original problem you mention as motivation. People > will still be surprised at equivalent sets having different iteration > orders regardless of what the function is named. > That is not the motivating problem. The motivating problem is that people think that elementsEqual is semantically equivalent to ==. It is a distinct method that *does not test* equivalence of two sequences. Thanks, > Jon > > >
_______________________________________________ swift-evolution mailing list swift-evolution@swift.org https://lists.swift.org/mailman/listinfo/swift-evolution