On 02.03.2017 06:46, Nick Coghlan wrote:
On 1 March 2017 at 19:37, Wolfgang Maier <wolfgang.ma...@biologie.uni-freiburg.de> wrote:

    Now here's the proposal: allow an except (or except break) clause to
    follow for/while loops that will be executed if the loop was
    terminated by a break statement.

    Now while it's possible that Nick had a good reason not to do so,


I never really thought about it, as I only use the "else:" clause for
search loops where there aren't any side effects in the "break" case
(other than the search result being bound to the loop variable), so
while I find "except break:" useful as an explanatory tool, I don't have
any practical need for it.

I think you've made as strong a case for the idea as could reasonably be
made :)

However, Steven raises a good point that this would complicate the
handling of loops in the code generator a fair bit, as it would add up
to two additional jump targets in cases wherever the new clause was used.

Currently, compiling loops only needs to track the start of the loop
(for continue), and the first instruction after the loop (for break).
With this change, they'd also need to track:

- the start of the "except break" clause (for break when the clause is used)
- the start of the "else" clause (for the non-break case when both
trailing clauses are present)


I think you could get away with only one additional jump target, as I showed in my previous reply to Steven. The heavier burden would be on the parser, which would have to distinguish between the existing loop form and the two new variants (loop with except clause, loop with except and else clause), but, anyway, that's probably not really the point.
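
As a rough way to look at the jump targets CPython currently emits for a loop with an else clause, the dis module can list them. This is only an inspection sketch: search is a made-up function, and the offsets and their number differ between CPython versions, so no particular output is implied.

```python
import dis


def search(iterable):
    # Classic search loop: break on the first truthy item, else use a default.
    for item in iterable:
        if item:
            break
    else:
        item = None
    return item


# Offsets of instructions that some other instruction jumps to.
jump_targets = [ins.offset for ins in dis.get_instructions(search)
                if ins.is_jump_target]
print(jump_targets)
```

Running dis.dis(search) on the same function prints the full listing, which makes it easier to see which target belongs to the loop start, the else clause and the code after the loop.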
What weighs heavier, I think, is your design argument.

The design level argument against adding the clause is that it breaks
the "one obvious way" principle, as the preferred form for search loops
looks like this:

    for item in iterable:
        if condition(item):
            break
    else:
        # Else clause either raises an exception or sets a default value
        item = get_default_value()

    # If we get here, we know "item" is a valid reference
    operation(item)

And you can easily switch the `break` out for a suitable `return` if you
move this into a helper function:

    def find_item_of_interest(iterable):
        for item in iterable:
            if condition(item):
                return item
        # The early return means we can skip using "else"
        return get_default_value()

Given that basic structure as a foundation, you only switch to the
"nested side effect" form if you have to:

    for item in iterable:
        if condition(item):
            operation(item)
            break
    else:
        # Else clause neither raises an exception nor sets a default value
        condition_was_never_true(iterable)

This form is generally less amenable to being extracted into a reusable
helper function, since it couples the search loop directly to the
operation performed on the bound item, whereas decoupling them gives you
a lot more flexibility in the eventual code structure.

The proposal in this thread then has the significant downside of only
covering the "nested side effect" case:

    for item in iterable:
        if condition(item):
            break
    except break:
        operation(item)
    else:
        condition_was_never_true(iterable)

While being even *less* amenable to being pushed down into a helper
function (since converting the "break" to a "return" would bypass the
"except break" clause).

I'm actually not quite buying this last argument. If you wanted to refactor this to "return" instead of "break", you could simply put the return into the except break block. In many real-world situations with multiple breaks from a loop this could actually make things easier instead of worse. Personally, the "nested side effect" form makes me uncomfortable every time I use it, because the side effects of breaking and of not breaking out of the loop don't end up at the same indentation level and aren't necessarily next to each other. However, I'm gathering from the discussion so far that not too many people think like me about this point, so maybe I should simply adjust my mind-set.
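
For comparison, a helper function in today's Python can already get a similar effect for loops with several breaks: if the else clause returns early, the code after the loop is only reached via a break. A minimal sketch, with first_bad_record, its record format and the log list all invented for illustration:

```python
def first_bad_record(records, log):
    """Return the first invalid record after logging it, or None if all pass."""
    for record in records:
        if "id" not in record:
            break
        if record.get("size", 0) < 0:
            break
    else:
        # The loop ran to completion: nothing to report.
        return None
    # Only reachable via one of the breaks above. This is where the body of
    # the proposed "except break:" clause would go.
    log.append(record)
    return record


log = []
print(first_bad_record([{"id": 1, "size": 2}, {"size": 3}], log))
```

The shared handling for both breaks sits in one place, but only because the function can return from the else clause; outside a helper function this trick is not available.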


All that said, this is a very nice abstract view on things! I really learned quite a bit from this, thank you :)

As always though, reality can be expected to be quite a bit more complicated than theory so I decided to check the stdlib for real uses of break. This is quite a tedious task since break is used in many different ways and I couldn't come up with a good automated way of classifying them. So what I did is just go through stdlib code (in reverse alphabetical order) containing the break keyword and put it into categories manually. I only got up to socket.py before losing my enthusiasm, but here's what I found:

- overall I looked at 114 code blocks that contain one or more breaks

- 84 of these are trivial use cases that simply break out of a while True block or terminate a while/for loop prematurely (no use for any follow-up clause there)

- 8 more cause a side effect before a single break, and it would be pointless to move this into an except break clause

- 3 more cause different, non-redundant side-effects before different breaks from the same loop and, obviously, an except break clause would not help them either

=> So the vast majority of breaks need *neither* an except break *nor* an else clause, but that's just as expected.
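
A coarse first pass over such a survey could perhaps be automated with the standard ast module. The sketch below is not how the numbers above were obtained (those were classified by hand), and it cannot tell the categories apart; it only counts loops whose own body contains a break and notes whether an else clause follows. The names own_breaks, survey and SAMPLE are made up for illustration.

```python
import ast

LOOPS = (ast.For, ast.AsyncFor, ast.While)
SCOPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef, ast.Lambda)


def own_breaks(nodes):
    """Count the breaks in `nodes` that terminate the loop owning them."""
    count = 0
    for node in nodes:
        if isinstance(node, ast.Break):
            count += 1
        elif isinstance(node, LOOPS):
            # A break in a nested loop's body belongs to that loop, but one
            # in its else clause still terminates the enclosing loop.
            count += own_breaks(node.orelse)
        elif not isinstance(node, SCOPES):
            count += own_breaks(ast.iter_child_nodes(node))
    return count


def survey(source):
    """Tally loops, loops with a break, and loops with a break plus else."""
    tally = {"loops": 0, "with_break": 0, "with_break_and_else": 0}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, LOOPS):
            tally["loops"] += 1
            if own_breaks(node.body):
                tally["with_break"] += 1
                if node.orelse:
                    tally["with_break_and_else"] += 1
    return tally


SAMPLE = '''
for x in data:
    if x:
        break
else:
    x = None

while True:
    for y in data:
        if y:
            break
    else:
        continue
    break

for z in data:
    print(z)
'''

print(survey(SAMPLE))
```

Deciding whether a break is trivial or is preceded by a side effect would still need a human reader.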


Of the remaining 19 non-trivial cases:

- 9 are variations of your classical search idiom above, i.e., there's an else clause there and nothing more is needed

- 6 are variations of your "nested side-effects" form presented above with debatable (see above) benefit from except break

- 2 do not use an else clause currently, but have multiple breaks that do partly redundant things that could be combined in a single except break clause

- 1 is an example of breaking out of two loops; from sre_parse._parse_sub:

[...]
    # check if all items share a common prefix
    while True:
        prefix = None
        for item in items:
            if not item:
                break
            if prefix is None:
                prefix = item[0]
            elif item[0] != prefix:
                break
        else:
            # all subitems start with a common "prefix".
            # move it out of the branch
            for item in items:
                del item[0]
            subpatternappend(prefix)
            continue # check next one
        break
[...]

This could have been written as:

[...]
    # check if all items share a common prefix
    while True:
        prefix = None
        for item in items:
            if not item:
                break
            if prefix is None:
                prefix = item[0]
            elif item[0] != prefix:
                break
        except break:
            break

        # all subitems start with a common "prefix".
        # move it out of the branch
        for item in items:
            del item[0]
        subpatternappend(prefix)
[...]
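
To try out the control flow in today's Python, here is a self-contained adaptation of the existing for/else/continue form that works on plain lists. The function name strip_common_prefix, its return value and the guard for an empty list are additions for illustration, not part of sre_parse.

```python
def strip_common_prefix(items):
    """Strip leading elements shared by all lists in `items`, in place.

    Returns the list of stripped prefix elements.
    """
    stripped = []
    if not items:
        # Added guard: with no items the for loop below would never break.
        return stripped
    while True:
        prefix = None
        for item in items:
            if not item:
                break
            if prefix is None:
                prefix = item[0]
            elif item[0] != prefix:
                break
        else:
            # No break: every item starts with the same element.
            for item in items:
                del item[0]
            stripped.append(prefix)
            continue  # check the next position
        break  # the for loop was left via break: stop stripping
    return stripped


branches = [[1, 2, 3], [1, 2, 4], [1, 2]]
print(strip_common_prefix(branches), branches)
```

With the proposed clause, the trailing continue/break pair would collapse into a single "except break: break", as in the rewritten version above.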


- finally, 1 is a complicated break dance to achieve something that clearly would have been easier with except break; from typing.py:

[...]
    def __subclasscheck__(self, cls):
        if cls is Any:
            return True
        if isinstance(cls, GenericMeta):
            # For a class C(Generic[T]) where T is co-variant,
            # C[X] is a subclass of C[Y] iff X is a subclass of Y.
            origin = self.__origin__
            if origin is not None and origin is cls.__origin__:
                assert len(self.__args__) == len(origin.__parameters__)
                assert len(cls.__args__) == len(origin.__parameters__)
                for p_self, p_cls, p_origin in zip(self.__args__,
                                                   cls.__args__,
                                                   origin.__parameters__):
                    if isinstance(p_origin, TypeVar):
                        if p_origin.__covariant__:
                            # Covariant -- p_cls must be a subclass of p_self.
                            if not issubclass(p_cls, p_self):
                                break
                        elif p_origin.__contravariant__:
                            # Contravariant. I think it's the opposite. :-)
                            if not issubclass(p_self, p_cls):
                                break
                        else:
                            # Invariant -- p_cls and p_self must equal.
                            if p_self != p_cls:
                                break
                    else:
                        # If the origin's parameter is not a typevar,
                        # insist on invariance.
                        if p_self != p_cls:
                            break
                else:
                    return True
        # If we break out of the loop, the superclass gets a chance.
        if super().__subclasscheck__(cls):
            return True
        if self.__extra__ is None or isinstance(cls, GenericMeta):
            return False
        return issubclass(cls, self.__extra__)
[...]

which could be rewritten as:

[...]
    def __subclasscheck__(self, cls):
        if cls is Any:
            return True
        if isinstance(cls, GenericMeta):
            # For a class C(Generic[T]) where T is co-variant,
            # C[X] is a subclass of C[Y] iff X is a subclass of Y.
            origin = self.__origin__
            if origin is not None and origin is cls.__origin__:
                assert len(self.__args__) == len(origin.__parameters__)
                assert len(cls.__args__) == len(origin.__parameters__)
                for p_self, p_cls, p_origin in zip(self.__args__,
                                                   cls.__args__,
                                                   origin.__parameters__):
                    if isinstance(p_origin, TypeVar):
                        if p_origin.__covariant__:
                            # Covariant -- p_cls must be a subclass of p_self.
                            if not issubclass(p_cls, p_self):
                                break
                        elif p_origin.__contravariant__:
                            # Contravariant. I think it's the opposite. :-)
                            if not issubclass(p_self, p_cls):
                                break
                        else:
                            # Invariant -- p_cls and p_self must equal.
                            if p_self != p_cls:
                                break
                    else:
                        # If the origin's parameter is not a typevar,
                        # insist on invariance.
                        if p_self != p_cls:
                            break
                except break:
                    # If we break out of the loop, the superclass gets a chance.
                    if super().__subclasscheck__(cls):
                        return True
                    if self.__extra__ is None or isinstance(cls, GenericMeta):
                        return False
                    return issubclass(cls, self.__extra__)

                return True
[...]


My summary: I do see use cases for the except break clause, but, admittedly, they are relatively rare and may not be worth the hassle of introducing new syntax.

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
