On 23/11/14 22:54, Chris Angelico wrote:
On Mon, Nov 24, 2014 at 7:18 AM, Mark Shannon <m...@hotpy.org> wrote:
Hi,

I have serious concerns about this PEP, and would ask you to reconsider it.

Hoping I'm not out of line in responding here, as PEP author. Some of
your concerns (eg "5 days is too short") are clearly for Guido, not
me, but perhaps I can respond to the rest of it.

[ Very short summary:
     Generators are not the problem. It is the naive use of next() in an
iterator that is the problem. (Note that all the examples involve calls to
next()).
     Change next() rather than fiddling with generators.
]

StopIteration is not a normal exception indicating a problem; rather, it
exists to signal exhaustion of an iterator.
However, next() raises StopIteration for an exhausted iterator, which really
is an error.
Any iterator code (generator or __next__ method) that calls next() treats
the StopIteration as a normal exception and propagates it.
The controlling loop then interprets StopIteration as a signal to stop and
thus stops.
*The problem is the implicit shift from signal to error and back to signal.*

The situation is this: Both __next__ and next() need the capability to
return literally any object at all. (I raised a hypothetical
possibility of some sort of sentinel object, but for such a sentinel
to be useful, it will need to have a name, which means that *by
definition* that object would have to come up when iterating over the
.values() of some namespace.) They both also need to be able to
indicate a lack of return value. This means that either they return a
(success, value) tuple, or they have some other means of signalling
exhaustion.
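
To make the "(success, value) tuple" alternative concrete, here is a minimal sketch. The name try_next is hypothetical and is not something the PEP or this thread proposes; it only shows what reporting exhaustion in-band could look like while still allowing any object, including None, as a value.

```python
def try_next(iterator):
    """Hypothetical helper: report exhaustion as a (success, value) pair."""
    try:
        return True, next(iterator)
    except StopIteration:
        # No value is available; the second element is a placeholder only.
        return False, None


it = iter([None, 0])
# Three calls against a two-item iterator: the third reports exhaustion.
results = [try_next(it) for _ in range(3)]
```

The first result is (True, None), which is distinguishable from the final (False, None) only because of the success flag.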

You are grouping next() and it.__next__() together, but they are different.
I think we agree that the __next__() method is part of the iterator protocol and should raise StopIteration. There is no fundamental reason why next(), the builtin function, should raise StopIteration, just because __next__(), the method, does. Many xxx() functions that wrap __xxx__() methods add additional functionality.

Consider max() or min(). Both of these functions take an iterable and, if that iterable is empty, raise a ValueError.
If next() did likewise then the original example that motivates this PEP
would not be a problem.
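
The difference in behaviour on empty input is easy to check. In this small sketch, describe_failure is a hypothetical helper introduced only to record which exception type each builtin raises.

```python
def describe_failure(func, *args):
    """Return the name of the exception type raised by func(*args), or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None


empty = []
max_failure = describe_failure(max, empty)          # "ValueError"
min_failure = describe_failure(min, empty)          # "ValueError"
next_failure = describe_failure(next, iter(empty))  # "StopIteration"
```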


I'm not sure what you mean by your "However" above. In both __next__
and next(), this is a signal; it becomes an error as soon as you call
next() and don't cope adequately with the signal, just as KeyError is
an error.

2. The proposed solution does not address this issue at all, but rather
legislates against generators raising StopIteration.

Because that's the place where a StopIteration will cause a silent
behavioral change, instead of cheerily bubbling up to top-level and
printing a traceback.
I must disagree. It is the FOR_ITER bytecode (implementing a loop or comprehension) that "silently" converts a StopIteration exception into a branch.
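
As a rough illustration of that conversion, a for loop behaves approximately like the hand-written loop below. manual_for is a hypothetical name, and this is a sketch of the semantics, not of how the interpreter implements them.

```python
def manual_for(iterable, body):
    """Approximately what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            # The exception is consumed here and turned into a plain
            # branch out of the loop; nothing is reported to the caller.
            break
        # The body runs outside the try block, so a StopIteration raised
        # by the body itself is not swallowed by the loop.
        body(item)


seen = []
manual_for([1, 2, 3], seen.append)
```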

I think the generator's __next__() method handling of exceptions is correct; it propagates them, like most other code.


3. Generators and the iterator protocol were introduced in Python 2.2, 13
years ago.
For all of that time the iterator protocol has been defined by the
__iter__(), next()/__next__() methods and the use of StopIteration to
terminate iteration.

Generators are a way to write iterators without the clunkiness of explicit
__iter__() and next()/__next__() methods, but have always obeyed the same
protocol as all other iterators. This has allowed code to be rewritten from one
form to the other whenever desired.

Do not forget that despite the addition of the send() and throw() methods
and their secondary role as coroutines, generators have always been
primarily a clean and elegant way of writing iterators.

This question has been raised several times; there is a distinct
difference between __iter__() and __next__(), and it is only the
latter which is aware of StopIteration.

I just mentioned __iter__ as it is part of the protocol; I agree that __next__ is the relevant method.

Compare these three classes:

class X:
     def __init__(self): self.state=0
     def __iter__(self): return self
     def __next__(self):
         if self.state == 3: raise StopIteration
         self.state += 1
         return self.state

class Y:
     def __iter__(self):
         return iter([1,2,3])

class Z:
     def __iter__(self):
         yield 1
         yield 2
         yield 3

Note how just one of these classes uses StopIteration, and yet all
three are iterable, yielding the same results. Neither Y nor Z is
breaking iterator protocol - but neither of them is writing an
iterator, either.

All three raise StopIteration, even if it is implicit.
This is trivial to demonstrate:

def will_it_raise_stop_iteration(it):
    try:
        while True:
            it.__next__()
    except StopIteration:
        print("Raises StopIteration")
    except:
        print("Raises something else")
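
One way to run that check against the three classes is sketched below. It uses a variant of the function that returns a bool instead of printing (ends_with_stop_iteration is a hypothetical name), and it calls iter() first, since Y and Z instances are iterables without a __next__ method of their own.

```python
class X:
    def __init__(self):
        self.state = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == 3:
            raise StopIteration
        self.state += 1
        return self.state


class Y:
    def __iter__(self):
        return iter([1, 2, 3])


class Z:
    def __iter__(self):
        yield 1
        yield 2
        yield 3


def ends_with_stop_iteration(iterable):
    """Drain the iterator and report whether it ended with StopIteration."""
    it = iter(iterable)
    try:
        while True:
            it.__next__()
    except StopIteration:
        return True
    except Exception:
        return False


outcomes = {cls.__name__: ends_with_stop_iteration(cls()) for cls in (X, Y, Z)}
```

For Y and Z the StopIteration comes from the list iterator and the generator object respectively, not from code written in the class body.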


4. Porting from Python 2 to Python 3 seems to be hard enough already.

Most of the code broken by this change can be fixed by a mechanical
replacement of "raise StopIteration" with "return"; the rest need to
be checked to see if they're buggy or unclear. There is an edge case
with "return some_value" vs "raise StopIteration(some_value)" (the
former's not compatible with 2.7), but apart from that, the
recommended form of code for 3.7 will work in all versions of Python
since 2.2.
I think that when it comes to porting 2 to 3, the perception is more important than the technical difficulty. Sadly :(
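
For reference, the mechanical replacement and the "return some_value" edge case mentioned above look roughly like this. The generator names are hypothetical; the second generator needs Python 3.3 or later because it returns a value.

```python
def until_negative(numbers):
    """Yield numbers until the first negative one."""
    for n in numbers:
        if n < 0:
            # Older code might have written `raise StopIteration` here;
            # a bare return ends the generator the same way.
            return
        yield n


def running(numbers):
    """Yield each number, then finish with the total as the return value."""
    total = 0
    for n in numbers:
        total += n
        yield n
    return total  # surfaces to the caller as StopIteration.value


prefix = list(until_negative([1, 2, -1, 3]))

gen = running([1, 2, 3])
items = []
while True:
    try:
        items.append(next(gen))
    except StopIteration as stop:
        total = stop.value
        break
```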


5. I think I've already covered this in the other points, but to reiterate
(excuse the pun):
Calling next() on an exhausted iterator is, I would suggest, a logical
error.

How do you know that it's exhausted, other than by calling next() on it?
Either we add a new method, or you have to handle the exception explicitly. But that is what you are trying to force anyway.

I probably should have said "Calling next(), without guarding against the possibility that the iterator is exhausted, is a logical error."
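
Guarding the call can be done with what the language already offers: either the two-argument form of next() or an explicit try/except. A minimal sketch, with hypothetical helper names:

```python
def first_or_default(iterable, default=None):
    """Return the first item, or `default` if the iterable is empty."""
    # With a default supplied, next() returns it instead of raising.
    return next(iter(iterable), default)


def first_or_error(iterable):
    """Return the first item, or raise ValueError if the iterable is empty."""
    it = iter(iterable)
    try:
        return next(it)
    except StopIteration:
        # Convert the signal into an ordinary error at the call site.
        raise ValueError("iterable is empty") from None
```

Either form keeps a stray StopIteration from escaping into whatever iterator happens to be calling this code.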


It is also worth noting that calling next() is the only place a StopIteration
exception is likely to occur outside of the iterator protocol.

This I agree with.

An example
----------

Consider a function to return the value from a set with a single member.

def value_from_singleton(s):
    if len(s) < 2:  # Intentional error here (should be len(s) == 1)
        return next(iter(s))
    raise ValueError("Not a singleton")

Now suppose we pass an empty set to value_from_singleton(s), then we get a
StopIteration exception, which is a bit weird, but not too bad.

Only a little weird - and no different from the way you'd get a
TypeError if you pass it an integer.
Except that TypeError is what it says, an error. StopIteration is a special not-really-an-error thing.


However it is when we use it in a generator (or in the __next__ method of an
iterator) that we get a serious problem.
Currently the iterator appears to be exhausted early, which is wrong.
However, with the proposed change we get RuntimeError("generator raised
StopIteration") raised, which is also wrong, just in a different way.
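
The premature-exhaustion case can be reproduced with a small sketch. It uses a class with a __next__ method, not a generator, so the result does not depend on whether the PEP's change to generators is in effect. SingletonValues is a hypothetical class written for this illustration.

```python
def value_from_singleton(s):
    if len(s) < 2:  # Intentional error here (should be len(s) == 1)
        return next(iter(s))
    raise ValueError("Not a singleton")


class SingletonValues:
    """Iterator over the single members of a sequence of sets."""

    def __init__(self, sets):
        self._sets = iter(sets)

    def __iter__(self):
        return self

    def __next__(self):
        # next(self._sets) raising StopIteration is the intended end signal.
        # A StopIteration leaking out of value_from_singleton() is a bug,
        # but the caller cannot tell the two apart.
        return value_from_singleton(next(self._sets))


values = list(SingletonValues([{1}, {2}, set(), {4}]))
```

Here values ends up as [1, 2]: the empty set makes the iterator look exhausted, and {4} is never reached.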

What you have here is two distinct issues. The first is "what happens
if an unexpected StopIteration occurs during __next__ processing?",
and the second is "ditto ditto a generator's execution?". The first
one is extremely hard to deal with, and extremely unlikely. The second
is much easier to deal with, and can therefore be solved.
I don't think there are two distinct issues. It is only the combination of the two that causes a real problem.

There are two places where StopIteration could be converted into a "real" exception: in the next() function or in the generator.__next__() method.
Doing so in next() is, IMO, simpler and easier to understand and explain.


Solutions
---------
My preferred "solution" is to do nothing except improve the documentation
of next(). Explain that it can raise StopIteration which, if allowed to
propagate, can cause premature exhaustion of an iterator.

Docs fixing doesn't solve everything.
True, but docs fixing is always backwards compatible :)

If something must be done then I would suggest changing the behaviour of
next() for an exhausted iterator.
Rather than raise StopIteration it should raise ValueError (or IndexError?).

So, if I've understood you correctly, what you're saying is that
__next__ should raise StopIteration, and then next() should absorb
that and raise ValueError instead? I'm not sure how this would help
anything, but I can see that it would poke the issue with a sharp
pointy stick. Can you elaborate on how this would work in practice?
How would it help? It would prevent the propagation of StopIteration from
causing premature exhaustion of an iterator. That is what the PEP is about, isn't it?
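
A sketch of how that might behave in practice, written as a wrapper so that the builtin is left alone. strict_next and Firsts are hypothetical names, and the choice of ValueError over IndexError is only one of the options mentioned above.

```python
_MISSING = object()  # private marker meaning "no default supplied"


def strict_next(iterator, default=_MISSING):
    """Like next(), but an exhausted iterator raises ValueError."""
    try:
        return next(iterator)
    except StopIteration:
        if default is not _MISSING:
            return default
        raise ValueError("iterator is exhausted") from None


class Firsts:
    """Iterator yielding the first element of each inner iterable."""

    def __init__(self, iterables):
        self._outer = iter(iterables)

    def __iter__(self):
        return self

    def __next__(self):
        inner = next(self._outer)        # intended end-of-iteration signal
        return strict_next(iter(inner))  # empty inner iterable is an error


ok = list(Firsts([[1], [2, 3]]))
try:
    list(Firsts([[1], [], [3]]))
    failure = None
except ValueError as exc:
    failure = str(exc)
```

With plain next() in place of strict_next(), the second call would quietly produce [1]; with the wrapper, the empty inner list surfaces as a ValueError.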

Also, it might be worth considering making StopIteration inherit from
BaseException, rather than Exception.

Separate concern altogether, as the bases of StopIteration have
nothing to do with a protocol meaning collision. I would probably
support this change, on the basis that Exception should be for, well,
exceptions, and BaseException can be used for everything that uses the
exception-handling mechanism for other purposes. But it wouldn't help
or affect this proposal.
Agreed.
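
For context, the current hierarchy can be inspected directly. The sketch below shows that a broad "except Exception" handler catches StopIteration, whereas GeneratorExit, which derives from BaseException, is not an Exception subclass. swallow_everything is a hypothetical helper.

```python
stop_is_exception = issubclass(StopIteration, Exception)
generator_exit_is_exception = issubclass(GeneratorExit, Exception)


def swallow_everything(func):
    """Call func(), hiding any ordinary exception it raises."""
    try:
        return func()
    except Exception:
        # StopIteration lands here today because it derives from Exception.
        return "swallowed"


swallowed = swallow_everything(lambda: next(iter([])))
```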

Cheers,
Mark.