> On Jul 1, 2016, at 6:32 PM, Dave Abrahams <[email protected]> wrote:
>
>
> on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>
>>> On Jul 1, 2016, at 11:51 AM, Dave Abrahams <[email protected]> wrote:
>>>
>>>
>>> on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>>>
>>>>> On Jun 30, 2016, at 12:26 PM, Dave Abrahams <[email protected]> wrote:
>>>>>
>>>>>
>>>>> on Wed Jun 29 2016, Haravikk <swift-evolution-AT-haravikk.me> wrote:
>>>>>
>>>>>>> On 29 Jun 2016, at 00:10, Matthew Johnson via swift-evolution <[email protected]> wrote:
>>>>>>>
>>>>>>> Swift is a language that embraces value semantics. Many common
>>>>>>> iterators *can* be implemented with value semantics. Just because we
>>>>>>> can’t implement *all* iterators with value semantics doesn’t mean we
>>>>>>> should require them to have reference semantics. It just means you
>>>>>>> can’t *assume* value semantics when working with iterators in generic
>>>>>>> code unless / until we have a way to specify a value semantics
>>>>>>> constraint. That’s not necessarily a bad thing, especially when it
>>>>>>> leaves the door open to interesting future possibilities.
>>>>>>>
>>>>>>> -Matthew
>>>>>>
>>>>>> I'm kind of undecided about this personally. I think one of the
>>>>>> problems with Swift is that the only indication that you have a
>>>>>> reference type is that you can declare it as a constant yet still
>>>>>> call mutating methods upon it; this isn't a very positive way of
>>>>>> identifying it, however. This may be more of a GUI/IDE issue though, in
>>>>>> that something being a class isn't always that obvious at a glance.
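[A minimal sketch of the point above, with illustrative names that are not from the thread: a `let` binding of a class type still permits mutation of the object, whereas the equivalent struct would require `var` and `mutating`.]

```swift
// A class-based iterator: reference semantics.
final class Counter {
    var count = 0
    func next() -> Int {
        count += 1
        return count
    }
}

// `let` only makes the *reference* constant; the object is still mutable.
let counter = Counter()
_ = counter.next()
_ = counter.next()
// counter.count is now 2, and nothing at the use site signals the mutation.

// A struct of the same shape surfaces the mutation in the type system:
struct ValueCounter {
    var count = 0
    mutating func next() -> Int {
        count += 1
        return count
    }
}
var valueCounter = ValueCounter() // `let` here would make next() a compile error
_ = valueCounter.next()
```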
>>>>>>
>>>>>> I wonder, could we somehow force iterators stored in variables to be
>>>>>> passed via inout? This would make it pretty clear that you're using
>>>>>> the same iterator and not a copy in all cases, encouraging you to
>>>>>> obtain another if you really do need to perform multiple passes.
>>>>>
>>>>> I'm going to push single-pass iteration on the stack briefly and talk
>>>>> about the topic that's been under discussion here: infinite multipass
>>>>> sequences.
>>>>>
>>>>> ## Fitting “Infinite Multipass” Into the Model
>>>>>
>>>>> It remains to be decided whether it's worth doing, but if it's to
>>>>> happen
>>>>
>>>> I definitely think it’s worth doing.
>>>
>>> Opinions are nice, but rationales are better. How will we understand
>>> *why* it's worth doing?
>>
>> I agree.
>>
>> The rationale has been discussed quite a bit already in this thread.
>> The current protocols do not provide the semantics many people are
>> assuming in their code, leading to a lot of code that is incorrect
>> despite the fact that it usually works in practice.
>>
>> This is especially frequent in the case of the finiteness assumption.
>> This assumption is so common it seems very wise to me to encode it as
>> a semantic requirement in a protocol.
>>
>> IMO these are problems worth addressing, especially now that we have a
>> good handle on what a solution would look like.
>>
>>>> I really appreciate the attention that the library team has given to
>>>> this.
>>>>
>>>>> , the standard library team thinks the right design is roughly
>>>>> this:
>>>>>
>>>>>     /// A multipass sequence that may be infinite
>>>>>     protocol Collection {
>>>>>
>>>>>       // Only eager algorithms that can terminate available here
>>>>>       func index(where predicate: (Element)->Bool) -> Index
>>>>>
>>>>>       // all lazy algorithms available here
>>>>>       var lazy: ...
>>>>>
>>>>>       var startIndex: Index
>>>>>       var endIndex: Index // possibly not reachable from startIndex
>>>>>
>>>>>       associatedtype SubSequence : Collection
>>>>>       // do we need an associated FiniteSubsequence, e.g. for prefixes?
>>>>>     }
>>>>>
>>>>>     protocol FiniteCollection : Collection {
>>>>>
>>>>>       // All eager algorithms available here
>>>>>       func map(...) ->
>>>>>       var count: ...
>>>>>     }
>>>>>
>>>>>     protocol BidirectionalCollection : Collection { ... }
>>>>>
>>>>>     protocol RandomAccessCollection : BidirectionalCollection { … }
>>>>
>>>> Does this design entirely break with the relationship between
>>>> collections and iterators (dropping `makeIterator` as a protocol
>>>> requirement)? If so, would for..in (over collections) be built on top
>>>> of indices and use `formIndex(after:)`? Would it require a finite
>>>> collection (unless we add `until` to the language and then allow
>>>> `for..in..until` to work with infinite collections)?
>>>
>>> All of these points are up for discussion.
>>
>> Cool. I think the collection for..in has some nice advantages, but
>> since it sounds like we’ll probably keep Iterator around it might be
>> best to take the approach of making them both work.
>>
>> You already know that I would prefer to see the current for..in built
>> on finite sequences and allow for..in..until to be used with infinite
>> sequences if we add that in the future. :)
>>
>>> John McCall pointed out to
>>> me that an index-based for..in would make it possible to implement
>>>
>>>     for inout x in y { mutate(&x) }
>>
>> That would be very nice!
>>
>> I think it might also increase performance. I don’t know exactly how
>> for..in is implemented today, but the implementation of
>> IndexingIterator compares position to endIndex. If for..in is also
>> checking the optional for nil, that’s an extra comparison. We
>> shouldn't need to construct the optional in the first place with an
>> index-based for..in.
>> Maybe optimizations like this
>> already exist? But even if they do, it seems like they wouldn’t be
>> possible in some cases where the type of the sequence isn’t statically
>> known.
>>
>>>> Would we still retain `IndexingIterator` even if we break the
>>>> relationship in the protocol requirements?
>>>
>>> Yes: it should be possible to implement Collection algorithms in terms
>>> of Iterator algorithms, and IndexingIterator provides the means. That
>>> said, I think the makeIterator requirement does little harm, especially
>>> when it can be defaulted for Collections.
>>
>> I like this answer.
>>
>>>> Would it still be possible to do things like zip a multi-pass sequence
>>>> with a single-pass sequence (assuming we keep single-pass sequences or
>>>> add them back eventually)? This seems like a use case worth
>>>> supporting in some way.
>>>
>>> Yes. If you can create an Iterator from a Collection, and you can zip
>>> Iterators, you can do this.
>>
>> Yes, of course. I’m glad we would keep this relationship intact.
>>
>>>> One subtle change I think this implies is that things like
>>>> `LazyFilterSequence` can implement `makeIterator` with constant
>>>> complexity, deferring the O(N) complexity to the first call to `next`.
>>>
>>> I don't believe that's a difference, though I could be wrong.
>>
>> You’re right, I was wrong. `LazyFilterSequence` just constructs an
>> iterator and returns it. `LazyFilterCollection` has to loop until it
>> finds the first item matching the predicate in its `startIndex`
>> implementation. The part I was missing is that `IndexingIterator`
>> gets the `startIndex` in its initializer.
>>
>>>> `startIndex` for `LazyFilterCollection` currently has O(N) complexity.
>>>> The complexity of a complete iteration doesn’t change and probably
>>>> isn’t a big deal, but it’s worth noting.
>>>
>>> Filtered collection views always require a bit of hand-waving around
>>> performance guarantees; I don't think that changes.
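[A toy sketch of the asymmetry discussed above, not the stdlib implementation: a filtered *collection* must scan in `startIndex` before any index exists, while a filtered *iterator* is O(1) to create and defers the scan to the first `next()`.]

```swift
// A toy lazy filter over an array, to make the cost model concrete.
struct ToyLazyFilter<Element> {
    let base: [Element]
    let isIncluded: (Element) -> Bool

    // O(n) in the worst case: must scan past non-matching elements
    // to find the first valid position.
    var startIndex: Int {
        var i = base.startIndex
        while i != base.endIndex && !isIncluded(base[i]) {
            i = base.index(after: i)
        }
        return i
    }

    // O(1) to create; the scanning happens lazily inside next().
    func makeIterator() -> AnyIterator<Element> {
        var i = base.startIndex
        return AnyIterator {
            while i != base.endIndex {
                let element = base[i]
                i = base.index(after: i)
                if isIncluded(element) { return element }
            }
            return nil
        }
    }
}
```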
>>>
>>>> I’ve been looking at some code that wraps a sequence and considering
>>>> how it would be impacted. With iterators it looks like this:
>>>>
>>>>     guard let element = base.next()
>>>>       else { return nil }
>>>>
>>>> With collections and indices it would look something like this:
>>>>
>>>>     base.formIndex(after: &index)
>>>>     guard index != baseEndIndex
>>>>       else { return endIndex }
>>>>
>>>>     let element = base[index]
>>>>
>>>> That’s not too bad but it is more verbose.
>>>
>>> Sequence today is a single-pass thing. If you are wrapping Sequence
>>> today presumably you'd wrap an Iterator tomorrow, and you wouldn't have
>>> to deal with indices.
>>>
>>>> If we’re going to push people towards collections and indices we
>>>> should try to make common patterns like “update the iteration state
>>>> and return the next element if there is one” simpler.
>>>
>>> That's IndexingIterator.
>>
>> Cool, I wrote this thinking that was going away.
>>
>>>> This could be accomplished with an extension method along these lines:
>>>>
>>>>     guard let element = base.formIndex(after: &index, .returningOptionalElement)
>>>>       else { return endIndex }
>>>>
>>>> With an implementation something like:
>>>>
>>>>     enum FormIndexResult {
>>>>       case returningOptionalElement
>>>>     }
>>>>     extension Collection {
>>>>       func formIndex(after i: inout Self.Index, _ result: FormIndexResult) -> Self.Element?
>>>>     }
>>>>
>>>> This would provide similar functionality to `IndexingIterator` without
>>>> coupling the storage of `elements` and `position` (which is important
>>>> if you’re wrapping a collection and need to wrap the collection and
>>>> its indices independently).
>>>
>>> I'm afraid I don't understand. Could you be more explicit about what
>>> you have in mind?
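[The thread only sketches the signature of this extension; the following fills in a plausible default implementation as an assumption, to show the idea is workable. Note that, matching the sketch above, it advances the index *before* reading, so iteration starts after `startIndex`.]

```swift
// A sketch of the labeled-overload idea: iterator.next()-style
// advancement, but with the caller owning the index.
enum FormIndexResult {
    case returningOptionalElement
}

extension Collection {
    /// Advances `i`, then returns the element at the new position,
    /// or nil once `i` reaches `endIndex`.
    /// Precondition: `i != endIndex`.
    func formIndex(after i: inout Index, _ result: FormIndexResult) -> Element? {
        formIndex(after: &i)
        guard i != endIndex else { return nil }
        return self[i]
    }
}

// Usage:
let numbers = [10, 20, 30]
var index = numbers.startIndex
while let element = numbers.formIndex(after: &index, .returningOptionalElement) {
    print(element) // 20, then 30
}
```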
>>
>> The idea was to provide functionality similar to `IndexingIterator` in
>> the sense that the following code would provide equivalent functionality
>> to `iterator.next()` but expressed in terms of a collection and an index:
>>
>>     let optionalElement = myCollection.formIndex(after: &myIndex, .returningOptionalElement)
>>
>> vs
>>
>>     let optionalElement = myIterator.next()
>>
>> The single-case enum is just there to provide a label that
>> differentiates the overload.
>>
>> If we’re keeping IndexingIterator this probably isn’t necessary. I
>> still have a use case for it but it is rather obscure.
>
> I can imagine wanting a design like the above for cases where
> implementing the endIndex requires adding an extra bit of state, e.g. in
>
>     struct LazyPrefix<Base: Collection> : Collection {
>       init(_ base: Base, where: (Base.Element)->Bool)
>       ...
>     }
>
> you don't want to traverse the base collection eagerly just to come up
> with an endIndex, so you store an optional Base.Index in
> LazyPrefix.Index, which is nil for the endIndex. In these cases, index
> comparison is less efficient than it might otherwise be.
>
> But my answer for these cases is simple: just use a specialized
> Iterator that will be more efficient than IndexingIterator. Is there a
> reason that doesn't work for your case?
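[A fleshed-out sketch of the `LazyPrefix` idea above, under the stated assumption that `Index` wraps an `Optional<Base.Index>` with `nil` playing the role of `endIndex`. The implementation details are illustrative, not from the thread; note how the optional makes `==` and `<` do extra work.]

```swift
// endIndex needs an extra bit of state, so Index wraps Base.Index?
struct LazyPrefix<Base: Collection>: Collection {
    struct Index: Comparable {
        var base: Base.Index? // nil means endIndex

        static func == (l: Index, r: Index) -> Bool { l.base == r.base }
        static func < (l: Index, r: Index) -> Bool {
            switch (l.base, r.base) {
            case let (a?, b?): return a < b
            case (_?, nil):    return true  // any real index precedes endIndex
            default:           return false
            }
        }
    }

    let base: Base
    let predicate: (Base.Element) -> Bool

    init(_ base: Base, where predicate: @escaping (Base.Element) -> Bool) {
        self.base = base
        self.predicate = predicate
    }

    var startIndex: Index {
        if base.isEmpty || !predicate(base[base.startIndex]) { return endIndex }
        return Index(base: base.startIndex)
    }

    var endIndex: Index { Index(base: nil) }

    func index(after i: Index) -> Index {
        guard let b = i.base else { fatalError("cannot advance past endIndex") }
        let next = base.index(after: b)
        if next == base.endIndex || !predicate(base[next]) { return endIndex }
        return Index(base: next)
    }

    subscript(i: Index) -> Base.Element { base[i.base!] }
}
```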
I would use a custom iterator in my case, but I am talking about code that
would live inside the implementation of `index(after:)` in the `Collection`
conformance of a wrapper `Collection` that bears some resemblance to
`LazyFlattenCollection`. In this case you are receiving your `Index`, which
wraps `Base.Index`.

Like I said, this is a pretty obscure case and I would never suggest
including it on the grounds of a use case that is only relevant to
`index(after:)` implementations in wrapper collections. :) I brought it up
because I thought you may have been suggesting a more drastic change that
removed the collection iterators.

>
>>> On second thought, I believe it is important to have a way to support
>>> existing “partially formed” multipass sequences that don't expose
>>> copying or equality for their iteration states.
>>
>> Can you provide examples of these? I’m having difficulty thinking of
>> one.
>
> NSSet is an example.
>
>>> Iterator is the right way to do that. So I think we need to keep
>>> Iterator around.
>>
>> I don’t have any objection to keeping it either. :) Hopefully we’d
>> still be able to improve the design in the future if / when new
>> enabling language features come along.
>>
>>>> In the meantime, people would be able to implement their own protocol
>>>> for single-pass sequences. What they would lose is for..in as well as
>>>> the standard library algorithms. I’m not sure how many people this
>>>> would impact or how big the impact would be for them. We have seen a
>>>> couple of examples in this discussion, but probably not enough to
>>>> assess the overall impact.
>>>>
>>>> One thing you don’t mention here is a distinction between finite and
>>>> infinite single-pass sequences (iterators). I don’t know if the
>>>> finite / infinite distinction is as important here, but wanted to
>>>> point it out.
>>>> Obviously if we remove support for single-pass sequences
>>>> now we could defer that discussion until we’re ready to bring back
>>>> support for them.
>>>
>>> There are a few possible answers I can think of:
>>>
>>> 1. Do the “obvious” thing and create a separate protocol for finite
>>>    single-pass sequences
>>>
>>> 2. Decide that the combination of infinite and single-pass is rare
>>>    enough (/dev/urandom, temperature sensor) that it's better to just
>>>    ask people handling them to be careful and not, e.g., try to “count”
>>>    them.
>>>
>>> 3. Decide that everything on a single-pass sequence is lazy. Since you
>>>    can only take a single pass anyway, people won't observe their
>>>    closures being called more often than necessary, which was the main
>>>    motivator for making map, filter, et al. eager on collections without
>>>    an explicit .lazy.
>>>
>>> Implications of #3:
>>>
>>> * Any “partially-formed” multipass sequences (modeling only Iterator)
>>>   would be free to expose an accurate underestimatedCount, thereby
>>>   optimizing the process of copying into an array. The lazy filter
>>>   Iterator adaptor would have an underestimatedCount of 0.
>>>
>>> * All algorithms that require multiple passes, such as sorted(), would
>>>   be unavailable on Iterator. You'd have to construct an Array (or
>>>   other MutableCollection) and sort that in-place. Of course,
>>>   constructing an Array from an Iterator could still go on forever if
>>>   the Iterator turned out to be infinite, which means, at some level, #3
>>>   is just a refinement of #2 that makes it less error-prone.
>>
>> Do you lean towards any of these?
>
> Yes, #3.
>
> We can always make the few operations that have to be eager—such as
> Array construction from an Iterator—explicit with a label or something:
>
>     Array(claimingFiniteness: someIterator)

This makes sense. Finite single-pass iterators can always be added in the
future if compelling use cases emerge. We’re not taking anything away.
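[The `claimingFiniteness:` label is from the thread; the implementation below is an assumption, just to show what such an explicitly-eager initializer might look like. The label documents at the call site that the caller is asserting the iterator terminates.]

```swift
extension Array {
    /// Eagerly drains `iterator` into a new array.
    /// The caller asserts the iterator is finite; an infinite
    /// iterator makes this loop forever.
    init<I: IteratorProtocol>(claimingFiniteness iterator: I) where I.Element == Element {
        self.init()
        var iterator = iterator
        while let element = iterator.next() {
            append(element)
        }
    }
}

// Usage:
let stream = (1...5).makeIterator()
let array = Array(claimingFiniteness: stream)
// array == [1, 2, 3, 4, 5]
```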
All of the code I have looked at that makes a finiteness assumption would be
converted to require `Collection` in the new model.

>
> --
> Dave
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
