on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>> On Jul 1, 2016, at 6:32 PM, Dave Abrahams <[email protected]> wrote:
>>
>> on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>>
>>>> On Jul 1, 2016, at 11:51 AM, Dave Abrahams <[email protected]> wrote:
>>>>
>>>> on Fri Jul 01 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>>>>
>>>>>> On Jun 30, 2016, at 12:26 PM, Dave Abrahams <[email protected]> wrote:
>>>>>>
>>>>>> on Wed Jun 29 2016, Haravikk <swift-evolution-AT-haravikk.me> wrote:
>>>>>>
>>>>>>>> On 29 Jun 2016, at 00:10, Matthew Johnson via swift-evolution
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Swift is a language that embraces value semantics. Many common
>>>>>>>> iterators *can* be implemented with value semantics. Just because we
>>>>>>>> can’t implement *all* iterators with value semantics doesn’t mean we
>>>>>>>> should require them to have reference semantics. It just means you
>>>>>>>> can’t *assume* value semantics when working with iterators in generic
>>>>>>>> code unless / until we have a way to specify a value semantics
>>>>>>>> constraint. That’s not necessarily a bad thing, especially when it
>>>>>>>> leaves the door open to interesting future possibilities.
>>>>>>>>
>>>>>>>> -Matthew
>>>>>>>
>>>>>>> I'm kind of undecided about this personally. I think one of the
>>>>>>> problems with Swift is that the only indication that you have a
>>>>>>> reference type is that you can declare it as a constant yet still
>>>>>>> call mutating methods on it, which isn't a very positive way of
>>>>>>> identifying one. This may be more of a GUI/IDE issue, though, in
>>>>>>> that something being a class isn't always obvious at a glance.
>>>>>>>
>>>>>>> I wonder, could we somehow force iterators stored in variables to be
>>>>>>> passed via inout? This would make it pretty clear that you're using
>>>>>>> the same iterator and not a copy in all cases, encouraging you to
>>>>>>> obtain another if you really do need to perform multiple passes.
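A concrete illustration of the copy behavior at issue (CountdownIterator is a toy type invented here for illustration; nothing in the thread defines it):

  struct CountdownIterator: IteratorProtocol {
    var count: Int
    mutating func next() -> Int? {
      guard count > 0 else { return nil }
      defer { count -= 1 }
      return count
    }
  }

  var a = CountdownIterator(count: 3)
  _ = a.next()      // a yields 3
  var b = a         // value copy: b gets its own state
  _ = b.next()      // b yields 2, independently of a
  print(a.next()!)  // prints 2; advancing b did not affect a

Nothing in the protocol promises this, though: a class-based iterator would share state between a and b, which is exactly why generic code can't assume value semantics here.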
>>>>>>
>>>>>> I'm going to push single-pass iteration on the stack briefly and talk
>>>>>> about the topic that's been under discussion here: infinite multipass
>>>>>> sequences.
>>>>>>
>>>>>> ## Fitting “Infinite Multipass” Into the Model
>>>>>>
>>>>>> It remains to be decided whether it's worth doing, but if it's to
>>>>>> happen
>>>>>
>>>>> I definitely think it’s worth doing.
>>>>
>>>> Opinions are nice, but rationales are better. How will we understand
>>>> *why* it's worth doing?
>>>
>>> I agree.
>>>
>>> The rationale has been discussed quite a bit already in this thread.
>>> The current protocols do not provide the semantics many people are
>>> assuming in their code, leading to a lot of code that is incorrect
>>> despite the fact that it usually works in practice.
>>>
>>> This is especially frequent in the case of the finite assumption.
>>> This assumption is so common that it seems very wise to me to encode it
>>> as a semantic requirement in a protocol.
>>>
>>> IMO these are problems worth addressing, especially now that we have a
>>> good handle on what a solution would look like.
>>>
>>>>> I really appreciate the attention that the library team has given to
>>>>> this.
>>>>>
>>>>>> , the standard library team thinks the right design is roughly
>>>>>> this:
>>>>>>
>>>>>>   /// A multipass sequence that may be infinite
>>>>>>   protocol Collection {
>>>>>>
>>>>>>     // Only eager algorithms that can terminate available here
>>>>>>     func index(where predicate: (Element)->Bool) -> Index
>>>>>>
>>>>>>     // all lazy algorithms available here
>>>>>>     var lazy: ...
>>>>>>
>>>>>>     var startIndex: Index
>>>>>>     var endIndex: Index  // possibly not reachable from startIndex
>>>>>>
>>>>>>     associatedtype SubSequence : Collection
>>>>>>     // do we need an associated FiniteSubsequence, e.g. for prefixes?
>>>>>>   }
>>>>>>
>>>>>>   protocol FiniteCollection : Collection {
>>>>>>
>>>>>>     // All eager algorithms available here
>>>>>>     func map(...) -> ...
>>>>>>     var count: ...
>>>>>>   }
>>>>>>
>>>>>>   protocol BidirectionalCollection : Collection { ... }
>>>>>>
>>>>>>   protocol RandomAccessCollection : BidirectionalCollection { … }
>>>>>
>>>>> Does this design entirely break the relationship between
>>>>> collections and iterators (dropping `makeIterator` as a protocol
>>>>> requirement)? If so, would for..in (over collections) be built on top
>>>>> of indices and use `formIndex(after:)`? Would it require a finite
>>>>> collection (unless we add `until` to the language and then allow
>>>>> `for..in..until` to work with infinite collections)?
>>>>
>>>> All of these points are up for discussion.
>>>
>>> Cool. I think the collection-based for..in has some nice advantages,
>>> but since it sounds like we’ll probably keep Iterator around it might
>>> be best to take the approach of making them both work.
>>>
>>> You already know that I would prefer to see the current for..in built
>>> on finite sequences and allow for..in..until to be used with infinite
>>> sequences if we add that in the future. :)
>>>
>>>> John McCall pointed out to
>>>> me that an index-based for..in would make it possible to implement
>>>>
>>>>   for inout x in y { mutate(&x) }
>>>
>>> That would be very nice!
>>>
>>> I think it might also increase performance. I don’t know exactly how
>>> for..in is implemented today, but the implementation of
>>> IndexingIterator compares position to endIndex. If for..in is also
>>> checking the optional for nil, that’s an extra comparison. With an
>>> index-based for..in we shouldn't need to construct the optional in
>>> the first place. Maybe optimizations like this already exist? But
>>> even if they do, it seems like they wouldn’t be possible in some
>>> cases where the type of the sequence isn’t statically known.
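Roughly, an index-based mutating loop could desugar to something like the following hand-written equivalent (a sketch only; the mutateAll helper and its names are illustrative assumptions, not part of any proposal):

  func mutateAll<C: MutableCollection>(
    _ y: inout C, _ mutate: (inout C.Element) -> Void
  ) {
    var i = y.startIndex
    while i != y.endIndex {
      mutate(&y[i])            // mutate in place through the index
      y.formIndex(after: &i)
    }
  }

The loop tests `i != y.endIndex` directly; no Optional is ever constructed or unwrapped, which is exactly the extra work an IndexingIterator-based for..in has to do.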
>>>>
>>>>> Would we still retain `IndexingIterator` even if we break the
>>>>> relationship in the protocol requirements?
>>>>
>>>> Yes: it should be possible to implement Collection algorithms in terms
>>>> of Iterator algorithms, and IndexingIterator provides the means. That
>>>> said, I think the makeIterator requirement does little harm, especially
>>>> when it can be defaulted for Collections.
>>>
>>> I like this answer.
>>>
>>>>
>>>>> Would it still be possible to do things like zip a multi-pass sequence
>>>>> with a single-pass sequence (assuming we keep single-pass sequences or
>>>>> add them back eventually)? This seems like a use case worth
>>>>> supporting in some way.
>>>>
>>>> Yes. If you can create an Iterator from a Collection, and you can zip
>>>> Iterators, you can do this.
>>>
>>> Yes, of course. I’m glad we would keep this relationship intact.
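For example (ZipIterator is a hypothetical adaptor invented here; the standard library's zip(_:_:) takes Sequences rather than bare Iterators):

  struct ZipIterator<A: IteratorProtocol, B: IteratorProtocol>: IteratorProtocol {
    var a: A, b: B
    mutating func next() -> (A.Element, B.Element)? {
      guard let x = a.next(), let y = b.next() else { return nil }
      return (x, y)
    }
  }

  let numbers = [1, 2, 3]                 // multipass Collection
  let lines = AnyIterator { readLine() }  // single-pass source
  var zipped = ZipIterator(a: numbers.makeIterator(), b: lines)
  while let (n, line) = zipped.next() { print(n, line) }

The multipass side just lends out an Iterator via makeIterator(), and the pairing proceeds one pass at a time.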
>>>>
>>>>> One subtle change I think this implies is that things like
>>>>> `LazyFilterSequence` can implement `makeIterator` with constant
>>>>> complexity, deferring the O(N) complexity to the first call to `next`.
>>>>
>>>> I don't believe that's a difference, though I could be wrong.
>>>
>>> You’re right, I was wrong. `LazyFilterSequence` just constructs an
>>> iterator and returns it. `LazyFilterCollection` has to loop until it
>>> finds the first item matching the predicate in its `startIndex`
>>> implementation. The part I was missing is that `IndexingIterator`
>>> gets the `startIndex` in its initializer.
>>>
>>>>
>>>>> `startIndex` for `LazyFilterCollection` currently has O(N) complexity.
>>>>> The complexity of a complete iteration doesn’t change and probably
>>>>> isn’t a big deal, but it’s worth noting.
>>>>
>>>> Filtered collection views always require a bit of hand-waving around
>>>> performance guarantees; I don't think that changes.
>>>>
>>>>> I’ve been looking at some code that wraps a sequence and considering
>>>>> how it would be impacted. With iterators it looks like this:
>>>>>
>>>>>   guard let element = base.next()
>>>>>     else { return nil }
>>>>>
>>>>> With collections and indices it would look something like this:
>>>>>
>>>>>   base.formIndex(after: &index)
>>>>>   guard index != baseEndIndex
>>>>>     else { return endIndex }
>>>>>
>>>>>   let element = base[index]
>>>>>
>>>>> That’s not too bad, but it is more verbose.
>>>>
>>>> Sequence today is a single-pass thing. If you are wrapping Sequence
>>>> today, presumably you'd wrap an Iterator tomorrow, and you wouldn't
>>>> have to deal with indices.
>>>>
>>>>> If we’re going to push people towards collections and indices we
>>>>> should try to make common patterns like “update the iteration state
>>>>> and return the next element if there is one” simpler.
>>>>
>>>> That's IndexingIterator.
>>>
>>> Cool, I wrote this thinking that was going away.
>>>
>>>>
>>>>> This could be accomplished with an extension method along these lines:
>>>>>
>>>>>   guard let element = base.formIndex(after: &index,
>>>>>                                      .returningOptionalElement)
>>>>>     else { return endIndex }
>>>>>
>>>>> With an implementation something like:
>>>>>
>>>>>   enum FormIndexResult {
>>>>>     case returningOptionalElement
>>>>>   }
>>>>>   extension Collection {
>>>>>     func formIndex(after i: inout Self.Index,
>>>>>                    _ result: FormIndexResult) -> Self.Element?
>>>>>   }
>>>>>
>>>>> This would provide similar functionality to `IndexingIterator` without
>>>>> coupling the storage of `elements` and `position` (which is important
>>>>> if you’re wrapping a collection and need to wrap the collection and
>>>>> its indices independently).
>>>>
>>>> I'm afraid I don't understand. Could you be more explicit about what
>>>> you have in mind?
>>>
>>> The idea was to provide functionality similar to `IndexingIterator` in
>>> the sense that the following code would provide equivalent functionality
>>> to `iterator.next()` but expressed in terms of a collection and an index:
>>>
>>>   let optionalElement = myCollection.formIndex(after: &myIndex,
>>>                                                .returningOptionalElement)
>>>
>>> vs
>>>
>>>   let optionalElement = myIterator.next()
>>>
>>> The single-case enum is just there to provide a label that
>>> differentiates the overload.
>>>
>>> If we’re keeping IndexingIterator this probably isn’t necessary. I
>>> still have a use case for it but it is rather obscure.
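For what it's worth, Matthew's sketch above could be fleshed out along these lines (the body is an assumed implementation; only the signature appears in the thread):

  enum FormIndexResult {
    case returningOptionalElement
  }

  extension Collection {
    /// Advances `i`, then returns the element at the new position, or
    /// nil once `i` reaches `endIndex`; the collection-and-index
    /// analogue of `iterator.next()`.
    /// - Precondition: `i != endIndex` on entry.
    func formIndex(
      after i: inout Index, _ result: FormIndexResult
    ) -> Element? {
      formIndex(after: &i)  // the ordinary one-argument form
      return i == endIndex ? nil : self[i]
    }
  }

With this, the advance/test/subscript dance in the wrapper code above collapses into a single guard.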
>>
>> I can imagine wanting a design like the above for cases where
>> implementing the endIndex requires adding an extra bit of state,
>> e.g. in
>>
>>   struct LazyPrefix<Base: Collection> : Collection {
>>     init(_ base: Base, where: (Base.Element) -> Bool)
>>     ...
>>   }
>>
>> you don't want to traverse the base collection eagerly just to come up
>> with an endIndex, so you store an optional Base.Index in
>> LazyPrefix.Index, which is nil for the endIndex. In these cases, index
>> comparison is less efficient than it might otherwise be.
>>
>> But my answer for these cases is simple: simply use a specialized
>> Iterator that will be more efficient than IndexingIterator. Is there a
>> reason that doesn't work for your case?
>
> I would index a custom iterator in my case,

I don't know what it means to index an iterator.

> but I am talking about code that would live inside the implementation
> of `index(after:)` in the `Collection` conformance of a wrapper
> `Collection` that bears some resemblance to `LazyFlattenCollection`.
> In this case you are receiving your `Index` which wraps `Base.Index`.

I don't understand, but maybe it doesn't matter.

FWIW, I'm pretty confident we could write a specialized Collection
protocol that, given some equatable State and a next() method, supplied
all the basic Collection requirements, for the cases where the Iterator
is the most efficient means of traversal, e.g.

  protocol IteratingCollection : Collection {
    associatedtype IterationState : Equatable
    func next(state: inout IterationState) -> Element?
    func startState() -> IterationState
  }

  extension IteratingCollection {
    // all the collection requirements implemented in terms
    // of startState() and next(...)
  }
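One way that elided extension could be filled in; everything here beyond the three requirements is an assumption invented for illustration, in particular the IteratingIndex type and its offset-based ordering:

  /// Hypothetical index: the iteration state *before* the element at
  /// this position, plus an offset that orders indices. An offset of
  /// Int.max marks endIndex. Comparing by offset alone is sound here
  /// because traversal from startState() is deterministic.
  struct IteratingIndex<State: Equatable>: Comparable {
    var state: State?  // nil only for endIndex
    var offset: Int
    static func == (a: IteratingIndex, b: IteratingIndex) -> Bool {
      a.offset == b.offset
    }
    static func < (a: IteratingIndex, b: IteratingIndex) -> Bool {
      a.offset < b.offset
    }
  }

  protocol IteratingCollection: Collection
    where Index == IteratingIndex<IterationState> {
    associatedtype IterationState: Equatable
    func next(state: inout IterationState) -> Element?
    func startState() -> IterationState
  }

  extension IteratingCollection {
    var endIndex: Index { IteratingIndex(state: nil, offset: .max) }

    var startIndex: Index {
      var probe = startState()
      return next(state: &probe) == nil
        ? endIndex
        : IteratingIndex(state: startState(), offset: 0)
    }

    subscript(i: Index) -> Element {
      var s = i.state!       // traps if you subscript endIndex
      return next(state: &s)!
    }

    func index(after i: Index) -> Index {
      var s = i.state!
      _ = next(state: &s)    // step past the element at i
      var probe = s
      return next(state: &probe) == nil
        ? endIndex
        : IteratingIndex(state: s, offset: i.offset + 1)
    }
  }

The remaining Collection requirements (makeIterator, indices, slicing) then come from the standard library's defaults. A toy conformance:

  struct Countdown: IteratingCollection {
    var from: Int
    func startState() -> Int { from }
    func next(state: inout Int) -> Int? {
      guard state > 0 else { return nil }
      defer { state -= 1 }
      return state
    }
  }

  // Array(Countdown(from: 3)) == [3, 2, 1]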
>
> Like I said, this is a pretty obscure case and I would never suggest
> including it on the grounds of a use case that is only relevant to
> `index(after:)` implementations in wrapper collections. :) I brought
> it up because I thought you may have been suggesting a more drastic
> change that removed the collection iterators.

>>
>>>> On second thought, I believe it is important to have a way to support
>>>> existing “partially formed” multipass sequences that don't expose
>>>> copying or equality for their iteration states.
>>>
>>> Can you provide examples of these? I’m having difficulty thinking of
>>> one.
>>
>> NSSet is an example.
>>
>>>> Iterator is the right way to do that. So I think we need to keep
>>>> Iterator around.
>>>
>>> I don’t have any objection to keeping it either. :) Hopefully we’d
>>> still be able to improve the design in the future if / when new
>>> enabling language features come along.
>>>
>>>>
>>>>> In the meantime, people would be able to implement their own protocol
>>>>> for single-pass sequences. What they would lose is for..in as well as
>>>>> the standard library algorithms. I’m not sure how many people this
>>>>> would impact or how big the impact would be for them. We have seen a
>>>>> couple of examples in this discussion, but probably not enough to
>>>>> assess the overall impact.
>>>>>
>>>>> One thing you don’t mention here is a distinction between finite and
>>>>> infinite single-pass sequences (iterators). I don’t know if the
>>>>> finite / infinite distinction is as important here, but wanted to
>>>>> point it out. Obviously if we remove support for single-pass sequences
>>>>> now we could defer that discussion until we’re ready to bring back
>>>>> support for them.
>>>>
>>>> There are a few possible answers I can think of:
>>>>
>>>> 1. Do the “obvious” thing and create a separate protocol for finite
>>>>    single-pass sequences
>>>>
>>>> 2. Decide that the combination of infinite and single-pass is rare
>>>>    enough (/dev/urandom, temperature sensor) that it's better to just
>>>>    ask people handling them to be careful and not, e.g., try to “count”
>>>>    them.
>>>>
>>>> 3. Decide that everything on a single-pass sequence is lazy. Since you
>>>>    can only take a single pass anyway, people won't observe their
>>>>    closures being called more often than necessary, which was the main
>>>>    motivator for making map, filter, et al. eager on collections
>>>>    without an explicit .lazy.
>>>>
>>>> Implications of #3:
>>>>
>>>> * Any “partially-formed” multipass sequences (modeling only Iterator)
>>>>   would be free to expose an accurate underestimatedCount, thereby
>>>>   optimizing the process of copying into an array. The lazy filter
>>>>   Iterator adaptor would have an underestimatedCount of 0.
>>>>
>>>> * All algorithms that require multiple passes, such as sorted(), would
>>>>   be unavailable on Iterator. You'd have to construct an Array (or
>>>>   other MutableCollection) and sort that in-place. Of course,
>>>>   constructing an Array from an Iterator could still go on forever if
>>>>   the Iterator turned out to be infinite, which means that, at some
>>>>   level, #3 is just a refinement of #2 that makes it less error-prone.
>>>
>>> Do you lean towards any of these?
>>
>> Yes, #3.
>>
>> We can always make the few operations that have to be eager—such as
>> Array construction from an Iterator—explicit with a label or something:
>>
>>   Array(claimingFiniteness: someIterator)
>
> This makes sense. Finite single-pass iterators can always be added in
> the future if compelling use cases emerge. We’re not taking anything
> away.
>
> All of the code I have looked at that makes a finite assumption would
> be converted to require `Collection` in the new model.

I think you mean FiniteCollection, yes?

-- 
Dave