On Mon, Nov 24, 2014 at 7:18 AM, Mark Shannon <m...@hotpy.org> wrote:
> Hi,
>
> I have serious concerns about this PEP, and would ask you to reconsider it.

Hoping I'm not out of line in responding here, as PEP author. Some of
your concerns (e.g. "5 days is too short") are clearly for Guido, not
me, but perhaps I can respond to the rest.

> [ Very short summary:
>     Generators are not the problem. It is the naive use of next() in an
> iterator that is the problem. (Note that all the examples involve calls to
> next()).
>     Change next() rather than fiddling with generators.
> ]
>
> StopIteration is not a normal exception, indicating a problem, rather it
> exists to signal exhaustion of an iterator.
> However, next() raises StopIteration for an exhausted iterator, which really
> is an error.
> Any iterator code (generator or __next__ method) that calls next() treats
> the StopIteration as a normal exception and propagates it.
> The controlling loop then interprets StopIteration as a signal to stop and
> thus stops.
> *The problem is the implicit shift from signal to error and back to signal.*

The situation is this: Both __next__ and next() need the capability to
return literally any object at all. (I raised a hypothetical
possibility of some sort of sentinel object, but for such a sentinel
to be useful, it will need to have a name, which means that *by
definition* that object would have to come up when iterating over the
.values() of some namespace.) They both also need to be able to
indicate a lack of return value. This means that either they return a
(success, value) tuple, or they have some other means of signalling
exhaustion.
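
To make the sentinel problem concrete, here is a minimal sketch. The
names SENTINEL, namespace and drain are made up for this illustration;
nothing like them is proposed anywhere. As soon as the marker has a
name, it can turn up as an ordinary value, and a loop that trusts it
stops early:

```python
SENTINEL = object()  # hypothetical "no more values" marker

# Any namespace that holds the sentinel will yield it during iteration.
namespace = {"SENTINEL": SENTINEL, "answer": 42}

def drain(iterator):
    """Collect values until next() hands back SENTINEL."""
    result = []
    while True:
        value = next(iterator, SENTINEL)
        if value is SENTINEL:
            # Ambiguous: exhausted, or did the iterator really yield SENTINEL?
            return result
        result.append(value)

# The first value *is* the sentinel, so the loop gives up before seeing 42.
values_seen = drain(iter(namespace.values()))
all_values = list(namespace.values())
```

An exception sidesteps this because it travels outside the return-value
channel entirely.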

I'm not sure what you mean by your "However" above. In both __next__
and next(), this is a signal; it becomes an error as soon as you call
next() and don't cope adequately with the signal, just as KeyError is
an error.

> 2. The proposed solution does not address this issue at all, but rather
> legislates against generators raising StopIteration.

Because that's the place where a StopIteration will cause a silent
behavioral change, instead of cheerily bubbling up to top-level and
printing a traceback.
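
A small sketch of the case I mean, with a made-up helper called
first_items. Under the proposed semantics (the default from Python 3.7
on) the stray StopIteration becomes a RuntimeError; under the old
semantics the same loop would silently produce just [1]:

```python
def first_items(iterables):
    """Yield the first item of each iterable (buggy on empty input)."""
    for iterable in iterables:
        # next() raises StopIteration if the iterable is empty, and that
        # exception escapes from the generator body.
        yield next(iter(iterable))

caught = None
collected = []
try:
    for item in first_items([[1, 2], [], [3]]):
        collected.append(item)
except RuntimeError as exc:
    # New semantics: the leak is reported instead of ending the loop quietly.
    caught = exc
```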

> 3. Generators and the iterator protocol were introduced in Python 2.2, 13
> years ago.
> For all of that time the iterator protocol has been defined by the
> __iter__(), next()/__next__() methods and the use of StopIteration to
> terminate iteration.
>
> Generators are a way to write iterators without the clunkiness of explicit
> __iter__() and next()/__next__() methods, but have always obeyed the same
> protocol as all other iterators. This has allowed code to be rewritten from one
> form to the other whenever desired.
>
> Do not forget that despite the addition of the send() and throw() methods
> and their secondary role as coroutines, generators have primarily always
> been a clean and elegant way of writing iterators.

This question has been raised several times; there is a distinct
difference between __iter__() and __next__(), and it is only the
latter which is aware of StopIteration. Compare these three classes:

class X:
    def __init__(self):
        self.state = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == 3:
            raise StopIteration
        self.state += 1
        return self.state

class Y:
    def __iter__(self):
        return iter([1, 2, 3])

class Z:
    def __iter__(self):
        yield 1
        yield 2
        yield 3

Note how just one of these classes uses StopIteration, and yet all
three are iterable, yielding the same results. Neither Y nor Z is
breaking iterator protocol - but neither of them is writing an
iterator, either.
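
For anyone who wants to check this, here is a self-contained sketch
that repeats the three classes and inspects them with the ABCs from
collections.abc. The variable names results and is_iterator are just
for this illustration:

```python
from collections.abc import Iterable, Iterator

class X:
    def __init__(self):
        self.state = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == 3:
            raise StopIteration
        self.state += 1
        return self.state

class Y:
    def __iter__(self):
        return iter([1, 2, 3])

class Z:
    def __iter__(self):
        yield 1
        yield 2
        yield 3

# All three are iterable and produce the same values...
results = [list(cls()) for cls in (X, Y, Z)]

# ...but only X instances are themselves iterators (they define __next__).
is_iterable = [isinstance(cls(), Iterable) for cls in (X, Y, Z)]
is_iterator = [isinstance(cls(), Iterator) for cls in (X, Y, Z)]
```

Y and Z hand the job of raising StopIteration to the list iterator and
the generator object respectively.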

> 4. Porting from Python 2 to Python 3 seems to be hard enough already.

Most of the code broken by this change can be fixed by a mechanical
replacement of "raise StopIteration" with "return"; the rest need to
be checked to see if they're buggy or unclear. There is an edge case
with "return some_value" vs "raise StopIteration(some_value)" (the
former's not compatible with 2.7), but apart from that, the
recommended form of code for 3.7 will work in all versions of Python
since 2.2.
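
As a sketch of that mechanical replacement (take_until_negative is a
made-up example function, not taken from any real code base):

```python
def take_until_negative(numbers):
    """Yield numbers until the first negative one."""
    for n in numbers:
        if n < 0:
            # Old style was "raise StopIteration" here; a plain return
            # ends the generator the same way and is valid in every
            # Python version that has generators.
            return
        yield n

kept = list(take_until_negative([1, 2, -1, 3]))
```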

> 5. I think I've already covered this in the other points, but to reiterate
> (excuse the pun):
> Calling next() on an exhausted iterator is, I would suggest, a logical
> error.

How do you know that it's exhausted, other than by calling next() on it?
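
To put that in code: as far as I can see, any "is it exhausted?" check
ends up calling next() and handling the signal itself. A minimal
sketch, with probe being a hypothetical helper:

```python
def probe(iterator):
    """Return (True, value) if a value is available, else (False, None)."""
    try:
        return True, next(iterator)
    except StopIteration:
        # The only way we learn about exhaustion is by asking for a value.
        return False, None

it = iter([10])
first = probe(it)
second = probe(it)
```

Note that this is the (success, value) tuple form mentioned earlier,
built on top of the exception.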

> It is also worth noting that calling next() is the only place a StopIteration
> exception is likely to occur outside of the iterator protocol.

This I agree with.

> An example
> ----------
>
> Consider a function to return the value from a set with a single member.
> def value_from_singleton(s):
>     if len(s) < 2:  #Intentional error here (should be len(s) == 1)
>         return next(iter(s))
>     raise ValueError("Not a singleton")
>
> Now suppose we pass an empty set to value_from_singleton(s), then we get a
> StopIteration exception, which is a bit weird, but not too bad.

Only a little weird - and no different from the way you'd get a
TypeError if you pass it an integer.

> However it is when we use it in a generator (or in the __next__ method of an
> iterator) that we get a serious problem.
> Currently the iterator appears to be exhausted early, which is wrong.
> However, with the proposed change we get RuntimeError("generator raised
> StopIteration") raised, which is also wrong, just in a different way.

What you have here is two distinct issues. The first is "what happens
if an unexpected StopIteration occurs during __next__ processing?",
and the second is "what happens if an unexpected StopIteration occurs
during a generator's execution?". The first one is extremely hard to
deal with, and extremely unlikely. The second is much easier to deal
with, and can therefore be solved.
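
Here is a sketch of the first issue, using a made-up class called
FirstItems. Inside __next__ there is no way for the caller to tell the
intended StopIteration from the accidental one, so the result is just
cut short:

```python
class FirstItems:
    """Iterator over the first item of each iterable (buggy on empty input)."""

    def __init__(self, iterables):
        self._source = iter(iterables)

    def __iter__(self):
        return self

    def __next__(self):
        # Intended: StopIteration here means the source has run out.
        iterable = next(self._source)
        # Unintended: StopIteration here means one iterable was empty,
        # but the caller sees exactly the same signal.
        return next(iter(iterable))

truncated = list(FirstItems([[1, 2], [], [3]]))
```

The 3 is never reached, and no exception is reported.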

> Solutions
> ---------
> My preferred "solution" is to do nothing except improving the documentation
> of next(). Explain that it can raise StopIteration which, if allowed to
> propagate, can cause premature exhaustion of an iterator.

Docs fixing doesn't solve everything.

> If something must be done then I would suggest changing the behaviour of
> next() for an exhausted iterator.
> Rather than raise StopIteration it should raise ValueError (or IndexError?).

So, if I've understood you correctly, what you're saying is that
__next__ should raise StopIteration, and then next() should absorb
that and raise ValueError instead? I'm not sure how this would help
anything, but I can see that it would poke the issue with a sharp
pointy stick. Can you elaborate on how this would work in practice?
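
If I'm reading the suggestion right, it would behave roughly like the
sketch below. This is my guess at the semantics, not anything spelled
out in the proposal, and strict_next and first_items are names invented
for the illustration:

```python
def strict_next(iterator):
    """Like next(), but treat exhaustion as an error (hypothetical)."""
    try:
        return next(iterator)
    except StopIteration:
        # Absorb the protocol signal and report an ordinary error instead.
        raise ValueError("iterator is exhausted") from None

def first_items(iterables):
    for iterable in iterables:
        yield strict_next(iter(iterable))

error = None
collected = []
try:
    for item in first_items([[1, 2], [], [3]]):
        collected.append(item)
except ValueError as exc:
    error = exc
```

Every loop that currently relies on catching StopIteration from next()
would have to change what it catches, though.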

> Also, it might be worth considering making StopIteration inherit from
> BaseException, rather than Exception.

Separate concern altogether, as the bases of StopIteration have
nothing to do with a protocol meaning collision. I would probably
support this change, on the basis that Exception should be for, well,
exceptions, and BaseException can be used for everything that uses the
exception-handling mechanism for other purposes. But it wouldn't help
or affect this proposal.

ChrisA