Sent from my iPad
> On Jul 1, 2016, at 9:55 PM, Dave Abrahams <[email protected]> wrote:
>
> on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>
>>> On Jul 1, 2016, at 6:32 PM, Dave Abrahams <[email protected]> wrote:
>>>
>>> on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>>>
>>>>> On Jul 1, 2016, at 11:51 AM, Dave Abrahams <[email protected]> wrote:
>>>>>
>>>>> on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>>>>>
>>>>>>> On Jun 30, 2016, at 12:26 PM, Dave Abrahams <[email protected]> wrote:
>>>>>>>
>>>>>>> on Wed Jun 29 2016, Haravikk <swift-evolution-AT-haravikk.me> wrote:
>>>>>>>
>>>>>>>>> On 29 Jun 2016, at 00:10, Matthew Johnson via swift-evolution <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Swift is a language that embraces value semantics. Many common
>>>>>>>>> iterators *can* be implemented with value semantics. Just because we
>>>>>>>>> can’t implement *all* iterators with value semantics doesn’t mean we
>>>>>>>>> should require them to have reference semantics. It just means you
>>>>>>>>> can’t *assume* value semantics when working with iterators in generic
>>>>>>>>> code unless / until we have a way to specify a value-semantics
>>>>>>>>> constraint. That’s not necessarily a bad thing, especially when it
>>>>>>>>> leaves the door open to interesting future possibilities.
>>>>>>>>>
>>>>>>>>> -Matthew
>>>>>>>>
>>>>>>>> I'm kind of undecided about this personally.
>>>>>>>> I think one of the problems with Swift is that the only indication
>>>>>>>> that you have a reference type is that you can declare it as a
>>>>>>>> constant yet still call mutating methods on it, which isn't a very
>>>>>>>> positive way of identifying it. This may be more of a GUI/IDE issue,
>>>>>>>> though, in that something being a class isn't always obvious at a
>>>>>>>> glance.
>>>>>>>>
>>>>>>>> I wonder, could we somehow force iterators stored in variables to be
>>>>>>>> passed via inout? This would make it pretty clear that you're using
>>>>>>>> the same iterator and not a copy in all cases, encouraging you to
>>>>>>>> obtain another if you really do need to perform multiple passes.
>>>>>>>
>>>>>>> I'm going to push single-pass iteration on the stack briefly and talk
>>>>>>> about the topic that's been under discussion here: infinite multipass
>>>>>>> sequences.
>>>>>>>
>>>>>>> ## Fitting “Infinite Multipass” Into the Model
>>>>>>>
>>>>>>> It remains to be decided whether it's worth doing, but if it's to
>>>>>>> happen
>>>>>>
>>>>>> I definitely think it’s worth doing.
>>>>>
>>>>> Opinions are nice, but rationales are better. How will we understand
>>>>> *why* it's worth doing?
>>>>
>>>> I agree.
>>>>
>>>> The rationale has been discussed quite a bit already in this thread.
>>>> The current protocols do not provide the semantics many people are
>>>> assuming in their code, leading to a lot of code that is incorrect
>>>> despite the fact that it usually works in practice.
>>>>
>>>> This is especially frequent in the case of the finiteness assumption.
>>>> That assumption is so common that it seems very wise to me to encode
>>>> it as a semantic requirement in a protocol.
>>>>
>>>> IMO these are problems worth addressing, especially now that we have a
>>>> good handle on what a solution would look like.
>>>>
>>>>>> I really appreciate the attention that the library team has given to
>>>>>> this.
>>>>>>> , the standard library team thinks the right design is roughly
>>>>>>> this:
>>>>>>>
>>>>>>>   /// A multipass sequence that may be infinite
>>>>>>>   protocol Collection {
>>>>>>>
>>>>>>>     // Only eager algorithms that can terminate are available here
>>>>>>>     func index(where predicate: (Element)->Bool) -> Index
>>>>>>>
>>>>>>>     // all lazy algorithms are available here
>>>>>>>     var lazy: ...
>>>>>>>
>>>>>>>     var startIndex: Index
>>>>>>>     var endIndex: Index // possibly not reachable from startIndex
>>>>>>>
>>>>>>>     associatedtype SubSequence : Collection
>>>>>>>     // do we need an associated FiniteSubsequence, e.g. for prefixes?
>>>>>>>   }
>>>>>>>
>>>>>>>   protocol FiniteCollection : Collection {
>>>>>>>
>>>>>>>     // All eager algorithms are available here
>>>>>>>     func map(...) -> ...
>>>>>>>     var count: ...
>>>>>>>   }
>>>>>>>
>>>>>>>   protocol BidirectionalCollection : Collection { ... }
>>>>>>>
>>>>>>>   protocol RandomAccessCollection : BidirectionalCollection { ... }
>>>>>>
>>>>>> Does this design entirely break the relationship between
>>>>>> collections and iterators (dropping `makeIterator` as a protocol
>>>>>> requirement)? If so, would for..in (over collections) be built on top
>>>>>> of indices and use `formIndex(after:)`? Would it require a finite
>>>>>> collection (unless we add `until` to the language and then allow
>>>>>> `for..in..until` to work with infinite collections)?
>>>>>
>>>>> All of these points are up for discussion.
>>>>
>>>> Cool. I think the collection-based for..in has some nice advantages, but
>>>> since it sounds like we’ll probably keep Iterator around, it might be
>>>> best to take the approach of making them both work.
>>>>
>>>> You already know that I would prefer to see the current for..in built
>>>> on finite sequences, and to allow for..in..until to be used with infinite
>>>> sequences if we add that in the future.
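[Editor's note: for concreteness, the proposed split above can be written as a
self-contained, compiling sketch in current Swift. All names here are
illustrative stand-ins (`MultipassCollection`, `FiniteMultipassCollection`,
`Naturals`, `Digits`); the real design would refit the standard library's own
`Collection`.]

```swift
// A sketch of the proposed hierarchy: multipass (possibly infinite)
// at the base, with finiteness as a refinement that unlocks eager,
// whole-collection algorithms.

protocol MultipassCollection {
    associatedtype Element
    associatedtype Index: Equatable
    var startIndex: Index { get }
    var endIndex: Index { get }   // possibly unreachable from startIndex
    subscript(position: Index) -> Element { get }
    func index(after i: Index) -> Index
}

protocol FiniteMultipassCollection: MultipassCollection {}

extension FiniteMultipassCollection {
    // Eager algorithms that must visit every element are only safe
    // when finiteness is a semantic guarantee.
    var count: Int {
        var n = 0
        var i = startIndex
        while i != endIndex {
            n += 1
            i = index(after: i)
        }
        return n
    }
}

// An infinite model: the naturals from `first` upward.
struct Naturals: MultipassCollection {
    var first: Int
    var startIndex: Int { first }
    var endIndex: Int { Int.max }   // never reached in practice
    subscript(position: Int) -> Int { position }
    func index(after i: Int) -> Int { i + 1 }
}

// A finite model gets `count` (and friends) for free.
struct Digits: FiniteMultipassCollection {
    var startIndex: Int { 0 }
    var endIndex: Int { 10 }
    subscript(position: Int) -> Int { position }
    func index(after i: Int) -> Int { i + 1 }
}
// Digits().count == 10; Naturals has no `count` at all, by construction.
```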
>>>> :)
>>>>
>>>>> John McCall pointed out to
>>>>> me that an index-based for..in would make it possible to implement
>>>>>
>>>>>   for inout x in y { mutate(&x) }
>>>>
>>>> That would be very nice!
>>>>
>>>> I think it might also increase performance. I don’t know exactly how
>>>> for..in is implemented today, but the implementation of
>>>> IndexingIterator compares position to endIndex. If for..in is also
>>>> checking the optional for nil, that’s an extra comparison. We
>>>> shouldn't need to construct the optional in the first place with an
>>>> index-based for..in. Maybe optimizations like this already exist?
>>>> But even if they do, it seems like they wouldn’t be possible in some
>>>> cases where the type of the sequence isn’t statically known.
>>>>
>>>>>> Would we still retain `IndexingIterator` even if we break the
>>>>>> relationship in the protocol requirements?
>>>>>
>>>>> Yes: it should be possible to implement Collection algorithms in terms
>>>>> of Iterator algorithms, and IndexingIterator provides the means. That
>>>>> said, I think the makeIterator requirement does little harm, especially
>>>>> when it can be defaulted for Collections.
>>>>
>>>> I like this answer.
>>>>
>>>>>> Would it still be possible to do things like zip a multi-pass sequence
>>>>>> with a single-pass sequence (assuming we keep single-pass sequences or
>>>>>> add them back eventually)? This seems like a use case worth
>>>>>> supporting in some way.
>>>>>
>>>>> Yes. If you can create an Iterator from a Collection, and you can zip
>>>>> Iterators, you can do this.
>>>>
>>>> Yes, of course. I’m glad we would keep this relationship intact.
>>>>
>>>>>> One subtle change I think this implies is that things like
>>>>>> `LazyFilterSequence` can implement `makeIterator` with constant
>>>>>> complexity, deferring the O(N) complexity to the first call to `next`.
>>>>>
>>>>> I don't believe that's a difference, though I could be wrong.
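[Editor's note: the index-based formulation is already expressible by hand in
current Swift. A sketch of what the hypothetical `for inout x in y` would
desugar to (`numbers` is just a stand-in collection):]

```swift
var numbers = [1, 2, 3]

// Hand-written equivalent of the hypothetical `for inout x in numbers`:
// no Optional<Element> is ever constructed; we walk indices and mutate
// in place through the subscript.
var i = numbers.startIndex
while i != numbers.endIndex {
    numbers[i] *= 10              // i.e. mutate(&numbers[i])
    numbers.formIndex(after: &i)
}
// numbers == [10, 20, 30]
```

This is exactly the shape that avoids the extra nil comparison discussed
above: the loop tests `i != endIndex` once per step and nothing else.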
>>>>
>>>> You’re right, I was wrong. `LazyFilterSequence` just constructs an
>>>> iterator and returns it. `LazyFilterCollection` has to loop until it
>>>> finds the first item matching the predicate in its `startIndex`
>>>> implementation. The part I was missing is that `IndexingIterator`
>>>> gets the `startIndex` in its initializer.
>>>>
>>>>>> `startIndex` for `LazyFilterCollection` currently has O(N) complexity.
>>>>>> The complexity of a complete iteration doesn’t change and probably
>>>>>> isn’t a big deal, but it’s worth noting.
>>>>>
>>>>> Filtered collection views always require a bit of hand-waving around
>>>>> performance guarantees; I don't think that changes.
>>>>>
>>>>>> I’ve been looking at some code that wraps a sequence and considering
>>>>>> how it would be impacted. With iterators it looks like this:
>>>>>>
>>>>>>   guard let element = base.next()
>>>>>>     else { return nil }
>>>>>>
>>>>>> With collections and indices it would look something like this:
>>>>>>
>>>>>>   base.formIndex(after: &index)
>>>>>>   guard index != base.endIndex
>>>>>>     else { return endIndex }
>>>>>>
>>>>>>   let element = base[index]
>>>>>>
>>>>>> That’s not too bad, but it is more verbose.
>>>>>
>>>>> Sequence today is a single-pass thing. If you are wrapping Sequence
>>>>> today, presumably you'd wrap an Iterator tomorrow, and you wouldn't have
>>>>> to deal with indices.
>>>>>
>>>>>> If we’re going to push people towards collections and indices, we
>>>>>> should try to make common patterns like “update the iteration state
>>>>>> and return the next element if there is one" simpler.
>>>>>
>>>>> That's IndexingIterator.
>>>>
>>>> Cool, I wrote this thinking that was going away.
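[Editor's note: the O(N) `startIndex` cost described above is observable in
current Swift by counting predicate calls; the `tested` counter is purely
illustrative.]

```swift
// Merely asking a lazy filtered collection for its startIndex scans
// past every non-matching prefix element, calling the predicate as it
// goes. A complete iteration still costs O(N) overall; the point is
// only *where* the work happens.
var tested = 0
let filtered = Array(1...1000).lazy.filter { n in
    tested += 1
    return n > 900
}
_ = filtered.startIndex
// `tested` is now on the order of 900: the scan ran up to the first
// element that satisfies the predicate.
```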
>>>>
>>>>>> This could be accomplished with an extension method along these lines:
>>>>>>
>>>>>>   guard let element = base.formIndex(after: &index, .returningOptionalElement)
>>>>>>     else { return endIndex }
>>>>>>
>>>>>> With an implementation something like:
>>>>>>
>>>>>>   enum FormIndexResult {
>>>>>>     case returningOptionalElement
>>>>>>   }
>>>>>>   extension Collection {
>>>>>>     func formIndex(after i: inout Self.Index, _ result: FormIndexResult) -> Self.Element?
>>>>>>   }
>>>>>>
>>>>>> This would provide similar functionality to `IndexingIterator` without
>>>>>> coupling the storage of `elements` and `position` (which is important
>>>>>> if you’re wrapping a collection and need to wrap the collection and
>>>>>> its indices independently).
>>>>>
>>>>> I'm afraid I don't understand. Could you be more explicit about what
>>>>> you have in mind?
>>>>
>>>> The idea was to provide functionality similar to `IndexingIterator` in
>>>> the sense that the following code would provide equivalent functionality to
>>>> `iterator.next()`, but expressed in terms of a collection and an index:
>>>>
>>>>   let optionalElement = myCollection.formIndex(after: &myIndex, .returningOptionalElement)
>>>>
>>>> vs
>>>>
>>>>   let optionalElement = myIterator.next()
>>>>
>>>> The single-case enum is just there to provide a label that
>>>> differentiates the overload.
>>>>
>>>> If we’re keeping IndexingIterator this probably isn’t necessary. I
>>>> still have a use case for it, but it is rather obscure.
>>>
>>> I can imagine wanting a design like the above for cases where
>>> implementing the endIndex requires adding an extra bit of state, e.g. in
>>>
>>>   struct LazyPrefix<Base: Collection> : Collection {
>>>     init(_ base: Base, where: (Base.Element) -> Bool)
>>>     ...
>>>   }
>>>
>>> you don't want to traverse the base collection eagerly just to come up
>>> with an endIndex, so you store an optional Base.Index in
>>> LazyPrefix.Index, which is nil for the endIndex.
>>> In these cases, index
>>> comparison is less efficient than it might otherwise be.
>>>
>>> But my answer for these cases is simple: just use a specialized
>>> Iterator that will be more efficient than IndexingIterator. Is there a
>>> reason that doesn't work for your case?
>>
>> I would index a custom iterator in my case,
>
> I don't know what it means to index an iterator.

Wow, I have no idea how I ended up writing that. I must have gotten
distracted somehow. I meant that a custom iterator wouldn't be relevant in
my case.

>
>> but I am talking about code that would live inside the implementation
>> of `index(after:)` in the `Collection` conformance of a wrapper
>> `Collection` that bears some resemblance to `LazyFlattenCollection`.
>> In this case you are receiving your `Index`, which wraps `Base.Index`.
>
> I don't understand, but maybe it doesn't matter.

Sorry, maybe I would need to provide more context. But it's not important
because, like I said, it is a relatively obscure case.

> FWIW, I'm pretty
> confident we could write a specialized Collection protocol that, given
> some equatable State and a next() method, supplied all the basic
> Collection requirements, for the cases where the Iterator is the most
> efficient means of traversal, e.g.
>
>   protocol IteratingCollection : Collection {
>     associatedtype IterationState : Equatable
>     func next(state: inout IterationState) -> Element?
>     func startState() -> IterationState
>   }
>
>   extension IteratingCollection {
>     // all the collection requirements implemented in terms
>     // of startState() and next(...)
>   }

This would be cool. It could be the basis of a method taking a start
state and a closure (I think you mentioned a method like this earlier).

>
>> Like I said, this is a pretty obscure case and I would never suggest
>> including it on the grounds of a use case that is only relevant to
>> `index(after:)` implementations in wrapper collections.
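[Editor's note: the state-plus-next() shape above can be sketched in current
Swift. Conforming to the real `Collection` would also require `Comparable`
indices, so this sketch stays beneath that bar; `IteratingSequence`,
`Halvings`, and `elements()` are all hypothetical names.]

```swift
// Multipass traversal derived entirely from an equatable start state
// plus next(). Because the state is a value that can be recreated,
// every call to elements() is an independent, repeatable pass.
protocol IteratingSequence {
    associatedtype State: Equatable
    associatedtype Element
    func startState() -> State
    func next(_ state: inout State) -> Element?
}

extension IteratingSequence {
    // One of the algorithms that could be derived this way; it is
    // eager, so only safe when the sequence is finite.
    func elements() -> [Element] {
        var state = startState()
        var result: [Element] = []
        while let e = next(&state) { result.append(e) }
        return result
    }
}

// A toy model: repeatedly halve a starting value down to 1.
struct Halvings: IteratingSequence {
    let start: Int
    func startState() -> Int { start }
    func next(_ state: inout Int) -> Int? {
        guard state > 0 else { return nil }
        defer { state /= 2 }
        return state
    }
}
// Halvings(start: 20).elements() == [20, 10, 5, 2, 1], on every pass.
```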
>> :) I brought it up because I thought you may have been suggesting a
>> more drastic change that removed the collection iterators.
>
>>>
>>>>> On second thought, I believe it is important to have a way to support
>>>>> existing “partially formed” multipass sequences that don't expose
>>>>> copying or equality for their iteration states.
>>>>
>>>> Can you provide examples of these? I’m having difficulty thinking of
>>>> one.
>>>
>>> NSSet is an example.
>>>
>>>>> Iterator is the right way to do that. So I think we need to keep
>>>>> Iterator around.
>>>>
>>>> I don’t have any objection to keeping it either. :) Hopefully we’d
>>>> still be able to improve the design in the future if / when new
>>>> enabling language features come along.
>>>>
>>>>>> In the meantime, people would be able to implement their own protocol
>>>>>> for single-pass sequences. What they would lose is for..in as well as
>>>>>> the standard library algorithms. I’m not sure how many people this
>>>>>> would impact or how big the impact would be for them. We have seen a
>>>>>> couple of examples in this discussion, but probably not enough to
>>>>>> assess the overall impact.
>>>>>>
>>>>>> One thing you don’t mention here is a distinction between finite and
>>>>>> infinite single-pass sequences (iterators). I don’t know if the
>>>>>> finite / infinite distinction is as important here, but I wanted to
>>>>>> point it out. Obviously if we remove support for single-pass sequences
>>>>>> now, we could defer that discussion until we’re ready to bring back
>>>>>> support for them.
>>>>>
>>>>> There are a few possible answers I can think of:
>>>>>
>>>>> 1. Do the “obvious” thing and create a separate protocol for finite
>>>>>    single-pass sequences.
>>>>>
>>>>> 2. Decide that the combination of infinite and single-pass is rare
>>>>>    enough (/dev/urandom, temperature sensor) that it's better to just
>>>>>    ask people handling them to be careful and not, e.g., try to “count”
>>>>>    them.
>>>>>
>>>>> 3.
>>>>>    Decide that everything on a single-pass sequence is lazy. Since you
>>>>>    can only take a single pass anyway, people won't observe their
>>>>>    closures being called more often than necessary, which was the main
>>>>>    motivator for making map, filter, et al. eager on collections without
>>>>>    an explicit .lazy.
>>>>>
>>>>> Implications of #3:
>>>>>
>>>>> * Any “partially-formed” multipass sequences (modeling only Iterator)
>>>>>   would be free to expose an accurate underestimatedCount, thereby
>>>>>   optimizing the process of copying into an array. The lazy filter
>>>>>   Iterator adaptor would have an underestimatedCount of 0.
>>>>>
>>>>> * All algorithms that require multiple passes, such as sorted(), would
>>>>>   be unavailable on Iterator. You'd have to construct an Array (or
>>>>>   other MutableCollection) and sort that in place. Of course,
>>>>>   constructing an Array from an Iterator could still go on forever if
>>>>>   the Iterator turned out to be infinite, which means that, at some
>>>>>   level, #3 is just a refinement of #2 that makes it less error-prone.
>>>>
>>>> Do you lean towards any of these?
>>>
>>> Yes, #3.
>>>
>>> We can always make the few operations that have to be eager—such as
>>> Array construction from an Iterator—explicit with a label or something:
>>>
>>>   Array(claimingFiniteness: someIterator)
>>
>> This makes sense. Finite single-pass iterators can always be added in
>> the future if compelling use cases emerge. We’re not taking anything
>> away.
>>
>> All of the code I have looked at that makes a finiteness assumption
>> would be converted to require `Collection` in the new model.
>
> I think you mean FiniteCollection, yes?

Yes, sorry.

>
> --
> Dave

_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
