As an example of something that #3 makes possible, here's a proof-of-concept
implementation of the previously-mentioned push back behavior (renamed here as
.push_front(), because .push_back() in this context sounds like it pushes onto
the end):
pub struct PushFront<A, T> {
priv val: Option<A>,
priv iter: T
}
impl<A, T: Iterator<A>> PushFront<A, T> {
#[inline]
fn new(iter: T) -> PushFront<A, T> {
PushFront{ val: None, iter: iter }
}
/// Pushes the value onto the front of the iterator, replacing any previous
/// pushed value if it hasn't already been yielded.
///
/// Only one value may be buffered at any time.
#[inline]
fn push_front(&mut self, val: A) {
self.val = Some(val);
}
}
impl<A, T: Iterator<A>> Iterator<A> for PushFront<A, T> {
#[inline]
fn next(&mut self) -> Option<A> {
if self.val.is_some() {
self.val.take()
} else {
self.iter.next()
}
}
#[inline]
fn size_hint(&self) -> (uint, Option<uint>) {
let (low, hi) = self.iter.size_hint();
if self.val.is_some() {
(low+1, hi.map_consume(|x| if x < uint::max_value { x+1 } else { x
}))
} else {
(low, hi)
}
}
}
impl<A, T: DoubleEndedIterator<A>> DoubleEndedIterator<A> for PushFront<A, T> {
#[inline]
fn next_back(&mut self) -> Option<A> {
match self.iter.next() {
Some(x) => Some(x),
None if self.val.is_some() => self.val.take(),
_ => None
}
}
}
impl<A: Clone, T: RandomAccessIterator<A>> RandomAccessIterator<A> for
PushFront<A, T> {
#[inline]
fn indexable(&self) -> uint {
let x = self.iter.indexable();
if self.val.is_some() {
if x < uint::max_value { x+1 } else { x }
} else { x }
}
#[inline]
fn idx(&self, index: uint) -> Option<A> {
if index == 0 && self.val.is_some() {
self.val.clone()
} else {
self.iter.idx(index-1)
}
}
}
-Kevin
On Aug 4, 2013, at 8:33 PM, Kevin Ballard <[email protected]> wrote:
> You could certainly design a non-blocking IO iterator that way. My example of
> a non-blocking IO iterator wasn't meant to illustrate something I'd actually
> recommend you do (since the iterator protocol isn't really a great fit for
> this), but rather show the flexibility of the approach.
>
> That said, in the non-blocking IO approach, it does still "exhaust the
> elements available", it just so happens that more elements become available
> over time.
>
> A better example might be an iterator that allows you to push more elements
> onto the end. Or one that allows you to move the iterator backwards in time,
> so to speak, so it re-emits the same elements it emitted previously (e.g.
> give it a .prev() method). Or merely one that allows you to push back a
> single element (that gets yielded on the next call to .next()), which is
> actually fairly common in character iterators in other languages (gives you
> the equivalent of a .peek() operation, by consuming a value and then
> immediately pushing it back if you don't want it). None of these examples are
> possible with approaches #1 or #2, only with approach #3. And all of them are
> still compatible with the basic for loop.
>
> -Kevin
>
> On Aug 4, 2013, at 5:53 PM, Jason Fager <[email protected]> wrote:
>
>> Not confused, I understand your point about for loops not caring about what
>> happens with additional 'next' calls.
>>
>> But as a user of an iterator, I have the expectation that a for loop
>> exhausts the elements available from an iterator unless I return early or
>> break. Designing an iterator that intentionally sidesteps that expectation
>> seems like a bad idea. Principle of least astonishment, etc.
>>
>> And yes, of course, iterators already return Option. But they return Option
>> to satisfy the *iterator protocol*, not the use cases you described. I'm
>> talking about adding another layer of Option
>>
>> So say I want to implement non-blocking io that plays nice w/ the iterator
>> protocol. Using an implementation taking advantage of option#3 look
>> something like:
>>
>> loop {
>> for i in iter {
>> foo(i);
>> }
>> // other stuff
>> if(noReallyImDone) {
>> break;
>> }
>> }
>>
>>
>> While hoisting the Iterator's type into another layer of Option looks like:
>>
>> for i in iter {
>> match i {
>> Some(i) => foo(i);
>> None => {
>> //other stuff
>> }
>> }
>> }
>>
>> The outer Option allows the iterator protocol to work as expected, i.e.
>> iterate over all available elements in the iterator, and the inner
>> implements the non-blocking protocol you're looking for.
>>
>>
>>
>>
>>
>>
>>
>> On Sun, Aug 4, 2013 at 8:12 PM, Kevin Ballard <[email protected]> wrote:
>> I suspect you're confused about something.
>>
>> The for loop doesn't care in the slightest what an iterator does after it's
>> returned None. All 3 approaches work equally well as far as the for loop is
>> concerned.
>>
>> And I'm not sure what you mean by "design Iterators that would take
>> advantage of the undefined behavior". If an iterator defines how it behaves
>> after returning None, then it's defined behavior. If you're using iterators
>> and you know your entire iterator pipeline, then you can use whatever
>> behavior the iterators involved define. You only need to restrict yourself
>> to what the iterator protocol defines if you don't know what iterator you're
>> consuming.
>>
>> I also don't understand your suggestion about using Option. Iterators
>> already return an Option.
>>
>> -Kevin
>>
>> On Aug 4, 2013, at 4:45 PM, Jason Fager <[email protected]> wrote:
>>
>>> Of course. I think I'm reacting more to the possible use cases you
>>> described for option 3 than the actual meaning of it. It seems like a
>>> really bad idea to design iterators that would take advantage of the
>>> undefined behavior, not least b/c it's unexpected and not supported by the
>>> most pervasive client of the iterator protocol (the for loop, in the sense
>>> of actually iterating through all elements available through the iterator),
>>> but that doesn't mean option 3 is in itself the wrong thing to do.
>>>
>>> But addressing the use cases you mentioned, if you need that kind of
>>> functionality, shouldn't you be hoisting the iterator's return type into
>>> its own Option? i.e., an Iterator<T> should be become an
>>> Iterator<Option<T>>?
>>>
>>>
>>> On Sun, Aug 4, 2013 at 6:23 PM, Kevin Ballard <[email protected]> wrote:
>>> The new for loop works with all 3 of these. Your output shows that it
>>> queried .next() twice, and got a single Some(1) result back. Once it gets
>>> None, it never calls .next() again, whereas the 3 behaviors stated
>>> previously are exclusively concerned with what happens if you call .next()
>>> again after it has already returned None.
>>>
>>> -Kevin
>>>
>>> P.S. I changed the email address that I'm subscribed to this list with, so
>>> apologies for any potential confusion.
>>>
>>> On Aug 4, 2013, at 6:18 AM, Jason Fager <[email protected]> wrote:
>>>
>>>> The new for loop already assumes #2, right?
>>>>
>>>> let x = [1,2,3];
>>>> let mut it = x.iter().peek_(|x| printfln!(*x)).scan(true, |st, &x| { if
>>>> *st { *st = false; Some(x) } else { None } });
>>>>
>>>> for i in it {
>>>> printfln!("from for loop: %?", i);
>>>> }
>>>>
>>>>
>>>> Which produces:
>>>>
>>>> &1
>>>> from for loop: 1
>>>> &2
>>>>
>>>>
>>>>
>>>> On Sun, Aug 4, 2013 at 1:49 AM, Daniel Micay <[email protected]> wrote:
>>>> On Sat, Aug 3, 2013 at 9:18 PM, Kevin Ballard <[email protected]> wrote:
>>>> > The iterator protocol, as I'm sure you're aware, is the protocol that
>>>> > defines the behavior of the Iterator trait. Unfortunately, at the moment
>>>> > the
>>>> > trait does not document what happens if you call `.next()` on an iterator
>>>> > after a previous call has returned `None`. According to Daniel Micay, the
>>>> > intention was that the iterator would return `None` forever. However,
>>>> > this
>>>> > is not guaranteed by at least one iterator adaptor (Scan), nor is it
>>>> > documented. Furthermore, no thought has been given to what happens if an
>>>> > iterator pipeline has side-effects. A trivial example of the side-effect
>>>> > problem is this:
>>>> >
>>>> > let x = [1,2,3];
>>>> > let mut it = x.iter().peek_(|x| printfln!(*x)).scan(true, |st, &x| {
>>>> > if
>>>> > *st { *st = false; Some(x) } else { None } });
>>>> > (it.next(), it.next(), it.next())
>>>> >
>>>> > This results in `(Some(1), None, None)` but it prints out
>>>> >
>>>> > &1
>>>> > &2
>>>> > &3
>>>> >
>>>> > After giving it some thought, I came up with 3 possible definitions for
>>>> > behavior in this case:
>>>> >
>>>> > 1. Once `.next()` has returned `None`, it will return None forever.
>>>> > Furthermore, calls to `.next()` after `None` has been returned will not
>>>> > trigger side-effects in the iterator pipeline. This means that once
>>>> > `.next()` has returned `None`, it becomes idempotent.
>>>> >
>>>> > This is most likely going to be what people will assume the iterator
>>>> > protocol defines, in the absence of any explicit statement. What's more,
>>>> > they probably won't even consider the side-effects case.
>>>> >
>>>> > Implementing this will require care be given to every single iterator
>>>> > and
>>>> > iterator adaptor. Most iterators will probably behave like this (unless
>>>> > they
>>>> > use a user-supplied closure), but a number of different iterator adaptors
>>>> > will need to track this explicitly with a bool flag. It's likely that
>>>> > user-supplied iterator adaptors will forget to enforce this and will
>>>> > therefore behave subtlely wrong in the face of side-effects.
>>>> >
>>>> > 2. Once `.next()` has returned `None`, it will return `None` forever. No
>>>> > statement is made regarding side-effects.
>>>> >
>>>> > This is what most people will think they're assuming, if asked. The
>>>> > danger here is that they will almost certainly actaully assume #1, and
>>>> > thus
>>>> > may write subtlely incorrect code if they're given an iterator pipeline
>>>> > with
>>>> > side-effects.
>>>> >
>>>> > This is easier to implement than #1. Most iterators will do this
>>>> > already.
>>>> > Iterator adaptors will generally only have to take care when they use a
>>>> > user-supplied closure (e.g. `scan()`).
>>>> >
>>>> > 3. The behavior of `.next()` after `None` has been returned is left
>>>> > undefined. Individual iterators may choose to define behavior here
>>>> > however
>>>> > they see fit.
>>>> >
>>>> > This is what we actually have implemented in the standard libraries
>>>> > today. It's also by far the easiest to implement, as iterators and
>>>> > adaptors
>>>> > may simply choose to not define any particular behavior.
>>>> >
>>>> > This is made more attractive by the fact that some iterators may
>>>> > choose
>>>> > to actually define behavior that's different than "return `None`
>>>> > forever".
>>>> > For example, a user may write an iterator that wraps non-blocking I/O,
>>>> > returning `None` when there's no data available and returning `Some(x)`
>>>> > again once more data comes in. Or if you don't like that example, they
>>>> > could
>>>> > write an iterator that may be updated to contain more data after being
>>>> > exhausted.
>>>> >
>>>> > The downside is that users may assume #1 when #3 holds, which is why
>>>> > this
>>>> > needs to be documented properly.
>>>> >
>>>> > ---
>>>> >
>>>> > I believe that #3 is the right behavior to define. This gives the most
>>>> > flexibility to individual iterators, and we can provide an iterator
>>>> > adaptor
>>>> > that gives any iterator the behavior defined by #1 (see Fuse in PR
>>>> > #8276).
>>>> >
>>>> > I am not strongly opposed to defining #1 instead, but I am mildly worried
>>>> > about the likelihood that users will implement iterators that don't have
>>>> > this guarantee, as this is not something that can be statically checked
>>>> > by
>>>> > the compiler. What's more, if an iterator breaks this guarantee, the
>>>> > problem
>>>> > will show up in the code that calls it, rather than in the iterator
>>>> > itself,
>>>> > which may make debugging harder.
>>>> >
>>>> > I am strongly opposed to #2. If we guarantee that an iterator that
>>>> > returns
>>>> > `None` once will return `None` forever, users will assume that this means
>>>> > that `.next()` becomes idempotent (with regards to side-effects) after
>>>> > `None` is returned, but this will not be true. Furthermore, users will
>>>> > probably not even realize they've made a bad assumption, as most users
>>>> > will
>>>> > not be thinking about side-effects when consuming iterators.
>>>> >
>>>> > I've already gone ahead and implemented #3 in pull request #8276.
>>>> >
>>>> > -Kevin
>>>>
>>>> I'm leaning towards #2 or #3, mostly because adaptors *not*
>>>> dispatching to the underlying next() implementation are too complex.
>>>>
>>>> I took a look at the behaviour of Python's iterators in these corner
>>>> cases as good baseline for comparison:
>>>>
>>>> ~~~
>>>> >>> def peek(it):
>>>> ... for x in it:
>>>> ... print(x)
>>>> ... yield x
>>>> ...
>>>> >>> xs = [1, 2, 3]
>>>> >>> ys = [1, 2, 3, 4, 5]
>>>> ~~~
>>>>
>>>> You can tell their `zip` function short-circuits, and simply
>>>> dispatches to the underlying implementations. Rust's `zip` is similar
>>>> but doesn't currently short-circuit (it might as well).
>>>>
>>>> ~~~
>>>> >>> it = zip(peek(ys), xs)
>>>> >>> next(it)
>>>> 1
>>>> (1, 1)
>>>> >>> next(it)
>>>> 2
>>>> (2, 2)
>>>> >>> next(it)
>>>> 3
>>>> (3, 3)
>>>> >>> next(it)
>>>> 4
>>>> Traceback (most recent call last):
>>>> File "<stdin>", line 1, in <module>
>>>> StopIteration
>>>> >>> next(it)
>>>> 5
>>>> Traceback (most recent call last):
>>>> File "<stdin>", line 1, in <module>
>>>> StopIteration
>>>> >>> next(it)
>>>> Traceback (most recent call last):
>>>> File "<stdin>", line 1, in <module>
>>>> StopIteration
>>>> >>> it = zip(xs, peek(ys))
>>>> >>> next(it)
>>>> 1
>>>> (1, 1)
>>>> >>> next(it)
>>>> 2
>>>> (2, 2)
>>>> >>> next(it)
>>>> 3
>>>> (3, 3)
>>>> >>> next(it)
>>>> Traceback (most recent call last):
>>>> File "<stdin>", line 1, in <module>
>>>> StopIteration
>>>> ~~~
>>>>
>>>> It also makes no attempt to store whether it has stopped internally,
>>>> and will start yielding again if each iterator yields an element when
>>>> zip asks for them one by one (keeping in mind that it short-circuits).
>>>>
>>>> Most other language keep `hasNext` and `next` separate (D and Scala,
>>>> among others) leading to more corner cases, and they do not seem to
>>>> clearly define the semantics for side effects down the pipeline.
>>>>
>>>> http://dlang.org/phobos/std_range.html
>>>> http://www.scala-lang.org/api/current/scala/collection/Iterator.html
>>>> _______________________________________________
>>>> Rust-dev mailing list
>>>> [email protected]
>>>> https://mail.mozilla.org/listinfo/rust-dev
>>>>
>>>> _______________________________________________
>>>> Rust-dev mailing list
>>>> [email protected]
>>>> https://mail.mozilla.org/listinfo/rust-dev
>>>
>>>
>>
>>
>
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev