> On Dec 30, 2015, at 3:39 PM, Kevin Ballard via swift-evolution > <[email protected]> wrote: > > Refactoring like that can always be done later; introducing infinite > sequences now shouldn't make it any harder to refactor the protocols. If > anything, it will provide more practical experience with how infinite > sequences interact with the current protocol hierarchy that would help guide > the design of any refactoring (or, perhaps more importantly, help determine > if such a refactoring is worthwhile). > > A big concern I have with refactoring like that is you'd rarely ever be able > to actually bound an algorithm on FiniteSequenceType, because there will > always be ways of constructing sequences that the compiler can't tell if it's > infinite. Trivial example is `ary.cycle.takeWhile(pred)`. This is infinite if > `pred(elt)` is true for every element of `ary` and finite otherwise. But > there's tons of other ways of creating such indeterminate sequences. Bounding > algorithms (such as `map(..) -> [T]` or `reduce()`) on finite sequences only > would be rather limiting, as users who know their sequence must be finite but > can't prove it to the compiler would be unable to use methods that are in > fact perfectly safe.
If you will accept that I would prefer the standard library types to interpret the standard library protocols as having already made such a binding — as, indeed, the type signatures necessarily imply — the rest of my objection should be comprehensible. I don’t find the repeated points about what can happen due to user error to be interesting. > > What you could do with such a refactoring is optimize certain algorithms if > they use a more precise sequence type, but I'm not actually sure what sort of > optimizations you can make with your proposed protocol hierarchy. In fact, > the only real optimizations that come to mind are optimizing when you know > you're working with a CycleSequence, because for various algorithms you > really only have to iterate the underlying elements once (e.g. assuming a > pure predicate, CycleSequence can implement contains() safely, although we > don't actually have pure in the language right now so that's not necessarily > a safe assumption; also, contains() isn't a protocol method, just an > extension method, so we can't actually do that anyway). I already gave the example of a product sequence. There’s a simple implementation that is correct when both input sequences are finite-and-stable, but not necessarily if otherwise; the proposed refactoring would statically provides enough information to allow one to choose the minimally-*pessimal* implementation that is still correct for such inputs. How much correctness matters is, of course, essentially subjective. > > On that note, core language team, anyone know offhand why SequenceType has a > bunch of extension methods that aren't part of the protocol? For example, > contains(_ predicate:). The only real reason I can think of is to shrink the > protocol witness table, but surely that's not particularly meaningful. I > warrant that contains(_ predicate:) doesn't really have any reason to be > overridden by anything except sequences that knowingly repeat elements (e.g. > CycleSequence and Repeat), and even that's only if you assume the predicate > is pure, but there's some other methods that make sense to override on some > sequences (like minElement(_ isOrderedBefore:) for any sequence that has a > defined ordering). I am not sure how, precisely, you would propose to override the closure-taking variant of `minElement` to take advantage of an intrinsic ordering, but I’d be curious to see it. I’d hope that the non-closure-accepting variants would be made overridable once the type system supports it. > > -Kevin Ballard > > On Wed, Dec 30, 2015, at 11:01 AM, plx via swift-evolution wrote: >> >>> On Dec 29, 2015, at 6:38 PM, Dave Abrahams <[email protected] >>> <mailto:[email protected]>> wrote: >>> >>> >>>> On Dec 29, 2015, at 3:39 PM, plx via swift-evolution >>>> <[email protected] <mailto:[email protected]>> wrote: >>>> >>>>> >>>>> On Dec 29, 2015, at 1:17 PM, Kevin Ballard via swift-evolution >>>>> <[email protected] <mailto:[email protected]>> wrote: >>>>> >>>>> On Tue, Dec 29, 2015, at 08:14 AM, plx via swift-evolution wrote: >>>>>> Personally I’d say this should be a -1 for standard-library inclusion. >>>>>> >>>>>> Swift’s not really built to handle infinite sequences right now; until >>>>>> they are handed better by the standard library convenience methods for >>>>>> creating them shouldn’t be in the standard library. >>>>> >>>>> As far as I can tell, the only way in which it's "not really built" to >>>>> handle this is that there are multiple constructs that attempt to eagerly >>>>> consume an entire sequence, and using these with an infinite sequence >>>>> would end up looping forever. But I don't consider that to be a >>>>> particularly serious problem. You could "fix" it by refactoring >>>>> SequenceType into two protocols, SequenceType (for possibly-infinite >>>>> sequences) and FiniteSequenceType (for known-finite sequences) and then >>>>> going over the entire standard library and updating various spots to use >>>>> FiniteSequenceType, except this would be very limiting (sequences that >>>>> are not known if they're infinite to the compiler could still be valid >>>>> for various eager algorithms if the programmer knows it will be finite in >>>>> practice). >>>> >>>> Indeed, on my wishlist I would like to see the standard protocols >>>> refactored to something more like this: >>>> >>>> SequenceType // can be iterated >>>> FiniteSequenceType : SequenceType, // of finite length >>>> StableSequenceType : SequenceType, // can be re-iterated identically >>>> CollectionType : StableSequenceType, FiniteSequenceType, (etc.) // >>>> otherwise as it is now >>> >>> This is interesting. A few concerns: >>> >>> First, we have tried not to create any distinct protocols with identical >>> syntactic requirements, because we think it makes the world clearer; we >>> think people are more likely to assign incorrect protocols when all the >>> operations they want are available, but don’t have the right semantics. >>> That isn’t to say we shouldn’t start doing it, but it would be a break from >>> the past. >>> >>> Higher protocol granularity has a high comprehensibility cost. >>> Distinguishing protocols based on semantic requirements alone may make the >>> library harder to understand. I’ve heard some people’s heads have exploded >>> from simply encountering CollectionType. >>> >>> Next, it’s a principle of generic programming that every protocol should be >>> justified by both algorithms that exploit its requirements (e.g. extension >>> methods) and some real-world models that can’t reasonably also conform to a >>> more-refined protocol. For example, we have a ForwardIndexType because a >>> singly-linked list has multipass forward traversal and can’t do >>> bidirectional traversal. In order to evaluate any proposal for new >>> protocols, we’d need to see all of these things. >> >> Speaking frankly, I’d say that there are few benefits to a refactoring along >> the lines sketched above until/unless you want to have solid support for >> things like a product-sequence or product-collection; you would gain >> *significant* advantages from such a refactored hierarchy in such scenarios, >> but I can’t think of any meaningful improvement the refactoring would offer >> in the essentially-linear case. (Note that this is distinct from examples of >> concrete models that are naturally e.g. finite-but-not-stable and >> vice-versa; they exist, but the rest is already long enough as it is). >> >> A full example is beyond my time-budget here, but I can give the flavor of >> where it makes a difference somewhat quickly. >> >> Consider a hypothetical `ProductSequence2<A:SequenceType,B:SequenceType>` >> that enumerates the cartesian product of two sequences (in an unspecified >> order); the naive implementation has problems with non-stable sequences and >> with infinite-sequences, but I’ll only discuss the infinite case b/c it’s >> more relevant here. >> >> When either—or both—of the sequences are infinite, the order-of-iteration >> has actual observable consequences; as a concrete example, if you want this >> to work out: >> >> ProductSequence2([1,2].cycle,[1,2].cycle).contains((2,2)) == true >> >> …then we have this: >> >> - if `a` is finite and `b` is not, you want to iterate like (a0,b0), >> (a1,b0), (a2, b0) … , (a0, b1), (a1, b1), …, etc >> - if `b` is finite and `a` is not, you want to iterate like (a0,b0), >> (a0,b1), (a0, b2) … , (a1, b0), (a1, b1), …, etc >> - if neither `a` nor `b` is finite, you want to iterate like (a0, b0), >> (a1,b0), (a0,b1), (a2, b0), (a1, b1), (a0, b2), … etc >> >> …as with those orderings you *will* eventually reach each pair in the >> product (which seems strictly superior to an iteration order which will >> leave many members of the product forever un-visited). >> >> Importantly, the third ordering is inefficient in the general case, and thus >> isn’t a suitable default; you’d only want it in places where you’re >> *intentionally* using infinite series. >> >> This is a concrete example of the sort of thing that leaves me preferring >> that the standard library *not* contain infinite sequences at this time: >> such sequences *highly* benefit from special handling, but there’s no good >> way at this time to generically provide such handling in a general context >> within the bounds of the current language and standard library . >> >> If there’s a realistic chance of such a refactoring being accepted if >> sufficiently-well-proposed I can prepare something, but I’m admittedly >> skeptical given the likely magnitude of the associated changes. In >> particular, the API on types like `ForwardIndexType` is rather unfortunate >> for things like a hypothetical product-collection scenario; since such >> products are among the stronger motivating uses, this means it’s a protocol >> refactoring + a lot of API changes to the existing standard library, and >> probably making things clunkier in the common / linear cases. >> >>> >>>> …but can understand the wish to not overly-complicate the basic protocol >>>> hierarchy (and also to avoid baking-in nice-to-have, but >>>> impossible-to-really-enforce semantic requirements; I’d trust the standard >>>> library to use them properly, but not typical 3rd party code, somewhat >>>> defeating the purpose). >>> >>> Well, I guess I should have read ahead and most of my lecture above was >>> needless! I’m posting it anyway because I think it spells out some >>> important principles we’ll need to refer to later. >>> >>>> Everything else is a difference of outlook; we agree on the facts and >>>> differ in interpretation. >>>> >>>> I consider concrete types that adopt a protocol only to simply call >>>> `fatalError` for most of the protocol methods to be pathological — often >>>> useful, but still pathological — and thus far the standard library doesn’t >>>> contain any such pathological types (to my knowledge). >>>> >>>> `cycle` is useful but not useful enough to be the standard library’s first >>>> “pathological” type, so it’s still a -1 as proposed. >>>> >>>> This is nothing specific to `cycle` and my opinion here could change were >>>> the language or the standard library to evolve in various ways. >>>>> >>>>>> You’d also want to call `fatalError` for at least `reduce`, `reverse`, >>>>>> `sort`, `split`(?), `flatMap`, `dropLast`, `suffix`, and `forEach`. >>>>> >>>>> You can only do it for the ones defined in the protocol, not ones defined >>>>> in extensions. This means map, filter, forEach, and suffix. >>>>> >>>>> split is actually perfectly implementable for a CycleSequence, although >>>>> it does need a custom implementation. split is bounded by at most Int.max >>>>> splits, which means it is guaranteed to terminate even for an infinite >>>>> sequence (although the final subsequence does need to be infinite[1]). >>>>> Even if there are no separators in the cycle, it can just return the >>>>> cycle itself. >>>>> >>>>> [1] Interestingly, the current implementation actually dumps the >>>>> remainder into an Array and returns that (wrapped in AnySequence), which >>>>> is curious because it would be perfectly legal for it to just wrap the >>>>> generator up in AnySequence and return that instead. I'm tempted to >>>>> submit a PR to change that now, as it just seems like unnecessary work to >>>>> use an array. >>>>> >>>>>> `startsWith` and `elementsEqual` and `lexicographicComparison` are all >>>>>> broken if you call them like e.g. `self.startsWith(self)`. >>>>> >>>>> That's true, but what do you really expect when you're calling them with >>>>> two infinite sequences? It's not so much that they're broken as it is >>>>> that you're creating an infinite loop without any way to break out. And >>>>> FWIW, lexicographicalCompare is actually something you might want to >>>>> explicitly support on infinite sequences if you know your sequences >>>>> aren't equal and want to find out which one is less than the other. >>>>> >>>>> There are plenty of ways to shoot yourself in the foot. I don't think >>>>> infinite sequences are really the big problem you're making them out to >>>>> be. >>>>> >>>>>> You can conceivably implement a non-crashing `contains`, `minElement` >>>>>> and `maxElement` on `CycleSequence` by calling through to the wrapped >>>>>> collection, but that’ll seemingly evaporate as soon as your >>>>>> `CycleSequence` winds up hidden inside an `AnySequence`. >>>>> >>>>> You can't override those anyway in a generic context, because they're not >>>>> members of the protocol, they're just extensions. You could implement >>>>> them such that your implementation is called when working on the concrete >>>>> CycleSequence type, but I'm not sure if that's a great idea to do that >>>>> when the actual behavior differs from calling it generically on >>>>> SequenceType (well, triggering a fatalError() instead of an infinite loop >>>>> is fine because they're both Bottom, but returning a valid result in one >>>>> context and looping infinitely in the other seems bad). >>>>> >>>>> Of course, these methods could actually be moved into the protocol >>>>> itself, which would let us override them. I'm not entirely sure why they >>>>> aren't in the protocol to begin with, I guess because there's not much >>>>> need for overriding these outside of infinite sequences (well, finite >>>>> sorted sequences could provide an optimized min/maxElement, and an >>>>> optimized version of contains(_: Self.Generator.Element), but maybe >>>>> there's tradeoffs to doing this, e.g. maybe there's some reason why >>>>> having a large protocol witness table is a bad idea). >>>> >>>> I don’t think `contains` or `minElement/maxElement` *can* be part of the >>>> protocol (in the sense of overridable) at this time (they require >>>> `Element` satisfy certain type constraints), but they certainly should be >>>> if the type system someday would support that. >>>> >>>>> >>>>>> Which illustrates why this is a -1 for me; there's nothing wrong with >>>>>> the functionality in isolation and there’s nothing wrong with infinite >>>>>> sequences, but the standard library should play well with itself, and >>>>>> this wouldn’t play well with the rest of the standard library. >>>>> >>>>> Ultimately, there's not much difference between an infinite sequence and >>>>> a sequence of Int.max elements. The latter is finite, but it's so massive >>>>> (especially on 64-bit) that any kind of eager processing is going to hit >>>>> the same problems as an infinite sequence. Every problem you describe >>>>> will be a problem with the simple sequence `(0..<Int.max)` as well. >>>>> >>>>> -Kevin Ballard >>>>> >>>>>> That opinion could change as the language changes or the standard >>>>>> library evolves. >>>>>> >>>>>>> On Dec 28, 2015, at 1:20 AM, Kevin Ballard via swift-evolution >>>>>>> <[email protected] <mailto:[email protected]>> wrote: >>>>>>> >>>>>>> ## Introduction >>>>>>> >>>>>>> Add a new property `cycle` to CollectionType that returns an infinite >>>>>>> SequenceType that yields the elements of the collection in a loop. >>>>>>> >>>>>>> ## Motivation >>>>>>> >>>>>>> It's sometimes useful to be able to have an infinite sequence. For >>>>>>> example, `CollectionOfOne(x).cycle` could be used to have an infinite >>>>>>> sequence of a single element (similar to Repeat but without a count). A >>>>>>> common use for infinite sequences is zipping with a finite sequence. As >>>>>>> far as I'm aware, the stdlib does not currently provide any way to >>>>>>> create such an infinite sequence. >>>>>>> >>>>>>> ## Proposed solution >>>>>>> >>>>>>> Extend CollectionType with a new property `cycle` that yields a type >>>>>>> that conforms to SequenceType. This sequence yields each element of the >>>>>>> collection in an infinite loop. >>>>>>> >>>>>>> ## Detailed design >>>>>>> >>>>>>> 2 new types would be added: >>>>>>> >>>>>>> struct CycleSequence<Base : CollectionType> : LazySequenceType { ... } >>>>>>> struct CycleGenerator<Base : CollectionType> : GeneratorType { ... } >>>>>>> >>>>>>> CollectionType would be extended with a property: >>>>>>> >>>>>>> extension CollectionType { >>>>>>> public var cycle: CycleSequence<Self> { get } >>>>>>> } >>>>>>> >>>>>>> This is an extension of CollectionType instead of SequenceType because >>>>>>> it requires a multi-pass sequence (and SequenceType does not provide >>>>>>> that guarantee). The returned type conforms to SequenceType instead of >>>>>>> CollectionType because there is no possible `endIndex` that satisfies >>>>>>> the requirement of being reachable from `startIndex` by zero or more >>>>>>> applications of `successor()`. >>>>>>> >>>>>>> Because the default eager versions of map and filter will execute >>>>>>> forever on an infinite sequence, CycleSequence conforms to >>>>>>> LazySequenceType instead of SequenceType in order to provide lazy >>>>>>> versions of those functions. Additionally, it will provide >>>>>>> implementations of the eager versions that simply trigger a >>>>>>> fatalError(), as the alternative is an infinite loop that consumes more >>>>>>> and more memory. >>>>>>> >>>>>>> ## Impact on existing code >>>>>>> >>>>>>> None >>>>>>> >>>>>>> -Kevin Ballard >>>>>>> _______________________________________________ >>>>>>> swift-evolution mailing list >>>>>>> [email protected] <mailto:[email protected]> >>>>>>> https://lists.swift.org/mailman/listinfo/swift-evolution >>>>>>> <https://lists.swift.org/mailman/listinfo/swift-evolution> >>>>>> >>>>>> _______________________________________________ >>>>>> swift-evolution mailing list >>>>>> [email protected] <mailto:[email protected]> >>>>>> https://lists.swift.org/mailman/listinfo/swift-evolution >>>>>> <https://lists.swift.org/mailman/listinfo/swift-evolution> >>>>> _______________________________________________ >>>>> swift-evolution mailing list >>>>> [email protected] <mailto:[email protected]> >>>>> https://lists.swift.org/mailman/listinfo/swift-evolution >>>>> <https://lists.swift.org/mailman/listinfo/swift-evolution> >>>> >>>> >>>> _______________________________________________ >>>> swift-evolution mailing list >>>> [email protected] <mailto:[email protected]> >>>> https://lists.swift.org/mailman/listinfo/swift-evolution >>>> <https://lists.swift.org/mailman/listinfo/swift-evolution> >>> >>> >> >> >> >> _______________________________________________ >> swift-evolution mailing list >> [email protected] <mailto:[email protected]> >> https://lists.swift.org/mailman/listinfo/swift-evolution >> <https://lists.swift.org/mailman/listinfo/swift-evolution> > > > _______________________________________________ > swift-evolution mailing list > [email protected] > https://lists.swift.org/mailman/listinfo/swift-evolution
_______________________________________________ swift-evolution mailing list [email protected] https://lists.swift.org/mailman/listinfo/swift-evolution
