Re: [rust-dev] Language support for external iterators (new for loop)

Gareth Smith Wed, 26 Jun 2013 15:50:25 -0700

One downside of external iterators is that they produce functionsignatures that expose implementation details and are harder to understand.

For example, an internal iterator function that loops over odd numberswould have a signature like this:


fn each_odd(ints: &[int], fn(int) -> bool) -> bool;

The same function written to produce an external iterator would have asignature more like this:

fn iter_odds<'a>(ints: &'a [int]) -> iterator::FilterMapIterator<'a,&int, int, vec::VecIterator<'a, int>>;

This signature is more complicated. It is also likely to change if theimplementation of iter_odds changes (e.g. to no longer use filter_map).


Gareth

On 26/06/13 04:44, Daniel Micay wrote:

This is a followup from the previous discussion on internal vs. external
iterators.[1]

Since then, most iterators in the standard library have been converted to
external ones. Almost all uses of the `for` loop are now using the `.advance`
wrapper, and I don't think there will be any use cases left for the old
internal iteration protocol.

# External iterators

To reiterate the benefits of the external iteration protocol:

* It's generic and works well with traits, so functions can be written to work
   on any arbitrary `Iterator<A>`. Most adaptors can work for any type `A`,
   whether it's a value, reference or mutable reference.

* Iteration state is an object, so iterators can be interleaved. This is
   required for a generic zip, merge, union, intersect, etc. and is often useful
   in an ad-hoc fashion to consume only some of an iterator without
   losing it.

* In the future, Rust can have generators using a `yield` statement like C#,
   compiling down to a fast state machine without requiring context switches,
   virtual functions or even closures. This would eliminate the difficulty of
   coding recursive traversals by-hand with external iterators.

# Alternatives

The iteration protocol can't be based on anything requiring virtual method
calls, heap allocations or context switches without the performance becoming
significantly worse.

There have been some suggestions about following the lead of Clojure and using
reducers[2], but the implementation suffers from the same limitations of not
having an external state.

Rust doesn't yet have a way to write data-parallel code, but when it does gain
that, containers can just support partitioning themselves into ranges via
`Iterator`. It will work for in-place mutation in parallel too.

# A new loop

I think it's a foregone conclusion that we'll be replacing `for`, so I suggest
that we just reuse the current syntax and change the semantics:

     for iterator |pattern| { body }

This can just be compiled as the following would be:

     let mut it = iterator;
     loop {
         match it.next() {
             Some(pattern) => { body }
             None => break
         }
     }

A lang item can be added for the Iterator trait requirement.

This would avoid the `.advance` tacked onto almost every `for` loop at the
moment, and rid us of the problems associated with the current `for`:

* The `break` and `return` statements can fail to work, so borrow/liveness
   checking can't trust them. A loop body can't take an `&mut` reference and
   return it because it could result in grabbing the reference again. This also
   seems to be why we forbid `return` inside closures and do statements, since 
it
   would be confusing to have to act differently than `for`.

* The function's local variables are upvars in the closure, so using them is
   very restricted. It's very obvious that it's not just another block because
   of this.

* Compiling closures is slow, as they have to broken down by SROA and involve
   inlining a function after proving the function pointer is constant. If we
   were marking the function pointers as `llvm.invariant` it might stop being a
   performance hit, but it would remain a compile time issue.

# Iterables

The `for` loop above can also be extended to work for *any* `Iterable` in the
future, not just iterators themselves.

     for iterable |x| { ... } // use the default iterator
     for iterable.rev_iter |x| { ... } // use another iterator

At the moment, the `Iterable` trait cannot be defined/used because the compiler
ignores bounds on trait type parameters, but it would be something like the
following:

     #[lang = "iterable"]
     trait Iterable<A, T: Iterator<A>> {
         fn iter(&self) -> T;
     }

     trait ReverseIterable<A, T: Iterator<A>> {
         fn rev_iter(&self) -> T;
     }

     trait MutableIterable<A, T: Iterator<A>> {
         fn mut_iter(&mut self) -> T;
     }

     trait MutableReverseIterable<A, T: Iterator<A>> {
         fn mut_rev_iter(&mut self) -> T;
     }

# Links

[1]: https://mail.mozilla.org/pipermail/rust-dev/2013-June/004364.html
[2]: https://github.com/mozilla/rust/wiki/Meeting-weekly-2013-06-25#iterators


_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

Re: [rust-dev] Language support for external iterators (new for loop)

Reply via email to