Hello, Would you mind going into detail about when you need iterators that mutate containers? I always think of iterators as having no side effects.
I ask that because if somehow you can live with only external iterators, having one type of iterators instead of two seems to me like a great simplification of the language. Thanks, Noam On 6 June 2013 07:09, Daniel Micay <[email protected]> wrote: > A quick terminology refresher, for those who aren't familiar with it: > > * Internal iterator: takes a closure, runs the closure until it asks to > break > * External iterator: state machine, advanced by the caller in a loop > > To a caller, external iterators provide the most functionality, because > they > can be used as an internal iterator. You lose the state of an internal > iterator > by breaking out of iterator, so generic algorithms like zip, union, > intersect > and merge can't be implemented for a pair of iterators. > > # Issues with internal iterators in Rust > > A few months ago, we only had internal iterators and there were no generic > algorithms to use with any iterator - only with BaseIter's `each` method. > > Rust's internal iterators implement the protocol encoded in Rust's for > statement, but it's not possible to give them all a common trait or > implement > generic methods or functions taking any internal iterator. > > As a workaround, we can write algorithms assuming internal iterators only > take > one argument (the closure): > > fn count<T>(f: &fn(fn(T) -> bool) -> bool) -> uint > > The caller has to use a partial function to call these adaptors, for which > we > lack sugar: > > count(|f| uint::range(0, 10, f)) > > For simple functions, this is fairly reasonable once you're used to it. It > quickly gets out of control though, even for a simple function like filter: > > filter<T>(pred: &fn(&T) -> bool, input: &fn(&fn(T) -> bool) -> bool, > output: &fn(T) -> bool) -> bool {} > > Sadly, `filter_ref` is also needed to work around closures behaving badly > with > lifetimes. An example of the problem with `fold_ref`: > > fn product<T: One + Mul<T, T>>(iter: &fn(f: &fn(&T) -> bool) -> bool) > -> T { > fold_ref(One::one::<T>(), iter, |a, x| *a = a.mul(x)) > } > > Since `product` expects an iterator yielding `&T` (a borrowed pointer in > any > region), it won't work with `fold` because that requires the borrowed > pointer > to have the same type (and thus lifetime) for every iteration like `&'a > int`. > > This issue with borrowed pointers was blocking me from replacing the > existing > algorithms reimplemented for both `str` and `vec` with the generic ones. > > Chaining together iteration algorithms is a common use case, but even > chaining > two together is confusing at first: > > to_vec(|g| filter(|&x| *x < 3, |f| xs.each(f), g) > > Another more alarming issue is that with internal iterators, the `break` > and > `return` statements don't always work in a `for` loop. If the iterator > isn't > implemented correctly, the loop will keep going. > > This also has borrow checking implications, because the compiler can't > assume > those statements actually cause flow control to leave the loop immediately. > > # External iterators > > Based on the above, you might think generic iteration algorithms in Rust > are a > bleak prospect. However, we already have a nice external iterator library, > and > they don't suffer from the above issues. > > All kinds of external iterators implement the following trait, whether > they are > a fibonacci number generator, a reverse iterator over a vector or iterator > over > a range in a sorted set: > > pub trait Iterator<A> { > /// Advance the iterator and return the next value. Return `None` > when the end is reached. > fn next(&mut self) -> Option<A>; > } > > Generic adaptors are implemented on `Iterator`, and many of them are > `Iterator` > implementations themselves: > > use std::iterator::*; > > fn main() { > let mut it = Counter::new(0.0, 1.0) > .take_while(|x| *x < 10000000.0) > .transform(|x| x / 2.0) > .transform(|x| x + 2.0); > println(it.fold(0.0, |a, b| a + b).to_str()) > } > > If you're curious, the optimized LLVM IR: http://ix.io/5Xl > > Unlike internal iterators, external iterators only run one iteration at a > time, > so a `for` loop designed for them would always be able to succeed with > `break` > and `return`. It would also be able to avoid the ugly `advance` wrapper > currently required to use external iterators with `for`. > > // The current situation, wrapping an external iterator as an internal > one > // > // Since the advance method is not known the be correct, borrow > checking > // still assumes `return` and `break` are imperfect. > for xs.zip(ys).advance |x| { ... } > > // A hypothetical `for` loop using the `Iterator` trait > for iterator |x| { ... } > > // It could also fall back to an `Iterable` trait and obtain an > iterator > for container |x| { ... } > > External iterators also avoid the problems with references and closures, > because they simply return `T` rather than passing it to a closure. > > # Why not just switch to external iterators? > > Algorithms that can be represented easily without the call stack are as > easy to > write as either an internal or external iterator. > > However, without support for compiling a function to a state machine (C# > does > this for yield), traversal of a recursive data structure has to be manually > translated to a procedural algorithm with an explicit stack. For complex > data > structures, this process can be very difficult. > > I'm hopeful Rust will gain support for this down the road after 1.0. If it > does, there will be no reason to write immutable internal iterators. > > Another issue is mutability, as you can write iterators that are able to > mutate > containers. With internal iterators, this is easy to do with safe code. > With > external ones, it will `unsafe` and won't be easy to get right. > > # Solution? > > I don't have any proposal to completely solve this issue. :) > > I think extending the built-in `for` loop to work with external iterators > should be considered, because right now the verbosity discourages using > them > and makes borrow checking more painful than it has to be. > > It could treat functions as internal iterators, and look for an `Iterator` > implementation (using a `lang` item) for external ones. > > Python's `for` loop starts by looking for an iterator (a `__next__` > method) and > falls back to an iterable (an `__iter__` method) so behaviour like this > isn't > an alien concept. > > _______________________________________________ > Rust-dev mailing list > [email protected] > https://mail.mozilla.org/listinfo/rust-dev > >
_______________________________________________ Rust-dev mailing list [email protected] https://mail.mozilla.org/listinfo/rust-dev
