Hello,

Would you mind going into detail about when you need iterators that mutate
containers? I always think of iterators as having no side effects.

I ask that because if somehow you can live with only external iterators,
having one type of iterators instead of two seems to me like a great
simplification of the language.

Thanks,
Noam


On 6 June 2013 07:09, Daniel Micay <[email protected]> wrote:

> A quick terminology refresher, for those who aren't familiar with it:
>
> * Internal iterator: takes a closure, runs the closure until it asks to
> break
> * External iterator: state machine, advanced by the caller in a loop
>
> To a caller, external iterators provide the most functionality, because
> they
> can be used as an internal iterator. You lose the state of an internal
> iterator
> by breaking out of iterator, so generic algorithms like zip, union,
> intersect
> and merge can't be implemented for a pair of iterators.
>
> # Issues with internal iterators in Rust
>
> A few months ago, we only had internal iterators and there were no generic
> algorithms to use with any iterator - only with BaseIter's `each` method.
>
> Rust's internal iterators implement the protocol encoded in Rust's for
> statement, but it's not possible to give them all a common trait or
> implement
> generic methods or functions taking any internal iterator.
>
> As a workaround, we can write algorithms assuming internal iterators only
> take
> one argument (the closure):
>
>     fn count<T>(f: &fn(fn(T) -> bool) -> bool) -> uint
>
> The caller has to use a partial function to call these adaptors, for which
> we
> lack sugar:
>
>     count(|f| uint::range(0, 10, f))
>
> For simple functions, this is fairly reasonable once you're used to it. It
> quickly gets out of control though, even for a simple function like filter:
>
>     filter<T>(pred: &fn(&T) -> bool, input: &fn(&fn(T) -> bool) -> bool,
> output: &fn(T) -> bool) -> bool {}
>
> Sadly, `filter_ref` is also needed to work around closures behaving badly
> with
> lifetimes. An example of the problem with `fold_ref`:
>
>     fn product<T: One + Mul<T, T>>(iter: &fn(f: &fn(&T) -> bool) -> bool)
> -> T {
>         fold_ref(One::one::<T>(), iter, |a, x| *a = a.mul(x))
>     }
>
> Since `product` expects an iterator yielding `&T` (a borrowed pointer in
> any
> region), it won't work with `fold` because that requires the borrowed
> pointer
> to have the same type (and thus lifetime) for every iteration like `&'a
> int`.
>
> This issue with borrowed pointers was blocking me from replacing the
> existing
> algorithms reimplemented for both `str` and `vec` with the generic ones.
>
> Chaining together iteration algorithms is a common use case, but even
> chaining
> two together is confusing at first:
>
>     to_vec(|g| filter(|&x| *x < 3, |f| xs.each(f), g)
>
> Another more alarming issue is that with internal iterators, the `break`
> and
> `return` statements don't always work in a `for` loop. If the iterator
> isn't
> implemented correctly, the loop will keep going.
>
> This also has borrow checking implications, because the compiler can't
> assume
> those statements actually cause flow control to leave the loop immediately.
>
> # External iterators
>
> Based on the above, you might think generic iteration algorithms in Rust
> are a
> bleak prospect. However, we already have a nice external iterator library,
> and
> they don't suffer from the above issues.
>
> All kinds of external iterators implement the following trait, whether
> they are
> a fibonacci number generator, a reverse iterator over a vector or iterator
> over
> a range in a sorted set:
>
>     pub trait Iterator<A> {
>         /// Advance the iterator and return the next value. Return `None`
> when the end is reached.
>         fn next(&mut self) -> Option<A>;
>     }
>
> Generic adaptors are implemented on `Iterator`, and many of them are
> `Iterator`
> implementations themselves:
>
>     use std::iterator::*;
>
>     fn main() {
>         let mut it = Counter::new(0.0, 1.0)
>             .take_while(|x| *x < 10000000.0)
>             .transform(|x| x / 2.0)
>             .transform(|x| x + 2.0);
>         println(it.fold(0.0, |a, b| a + b).to_str())
>     }
>
> If you're curious, the optimized LLVM IR: http://ix.io/5Xl
>
> Unlike internal iterators, external iterators only run one iteration at a
> time,
> so a `for` loop designed for them would always be able to succeed with
> `break`
> and `return`. It would also be able to avoid the ugly `advance` wrapper
> currently required to use external iterators with `for`.
>
>     // The current situation, wrapping an external iterator as an internal
> one
>     //
>     // Since the advance method is not known the be correct, borrow
> checking
>     // still assumes `return` and `break` are imperfect.
>     for xs.zip(ys).advance |x| { ... }
>
>     // A hypothetical `for` loop using the `Iterator` trait
>     for iterator |x| { ... }
>
>     // It could also fall back to an `Iterable` trait and obtain an
> iterator
>     for container |x| { ... }
>
> External iterators also avoid the problems with references and closures,
> because they simply return `T` rather than passing it to a closure.
>
> # Why not just switch to external iterators?
>
> Algorithms that can be represented easily without the call stack are as
> easy to
> write as either an internal or external iterator.
>
> However, without support for compiling a function to a state machine (C#
> does
> this for yield), traversal of a recursive data structure has to be manually
> translated to a procedural algorithm with an explicit stack. For complex
> data
> structures, this process can be very difficult.
>
> I'm hopeful Rust will gain support for this down the road after 1.0. If it
> does, there will be no reason to write immutable internal iterators.
>
> Another issue is mutability, as you can write iterators that are able to
> mutate
> containers. With internal iterators, this is easy to do with safe code.
> With
> external ones, it will `unsafe` and won't be easy to get right.
>
> # Solution?
>
> I don't have any proposal to completely solve this issue. :)
>
> I think extending the built-in `for` loop to work with external iterators
> should be considered, because right now the verbosity discourages using
> them
> and makes borrow checking more painful than it has to be.
>
> It could treat functions as internal iterators, and look for an `Iterator`
> implementation (using a `lang` item) for external ones.
>
> Python's `for` loop starts by looking for an iterator (a `__next__`
> method) and
> falls back to an iterable (an `__iter__` method) so behaviour like this
> isn't
> an alien concept.
>
> _______________________________________________
> Rust-dev mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/rust-dev
>
>
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

Reply via email to