[Python-Dev] Re: Clarification of unpacking semantics.

Serhiy Storchaka Thu, 06 Feb 2020 14:05:28 -0800

06.02.20 08:28, Brandt Bucher wrote:

Commits 13bc139 and 8a4cd70 introduced subtle changes in the evaluation logic of unpacking 
operations. Previously, all elements were evaluated prior to being collected in a container. Now, 
these operations are interleaved. For example, the code `[*a, *b]` used to evaluate in the order 
`a` -> `b` -> `a.__iter__()` -> `b.__iter__()`. Now, it evaluates as `a` -> 
`a.__iter__()` -> `b` -> `b.__iter__()`.
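
A minimal sketch for observing this order on whichever interpreter is at hand. The helper names (`Tracked`, `load`, `events`) are hypothetical and exist only for this illustration; `load("a")` stands in for evaluating the name `a`. The recorded order depends on the Python version, so no particular output is claimed here.

```python
events = []  # records what happens, in order


class Tracked:
    """Iterable that logs when its __iter__ is called (illustrative helper)."""

    def __init__(self, name, items):
        self.name = name
        self.items = items

    def __iter__(self):
        events.append(f"{self.name}.__iter__()")
        return iter(self.items)


objects = {"a": Tracked("a", [1, 2]), "b": Tracked("b", [3, 4])}


def load(name):
    """Stand-in for evaluating the operand expression; logs the evaluation."""
    events.append(name)
    return objects[name]


result = [*load("a"), *load("b")]
print(result)
print(" -> ".join(events))
```

If the operands are evaluated first, the log reads `a -> b -> a.__iter__() -> b.__iter__()`; if evaluation and iteration are interleaved, it reads `a -> a.__iter__() -> b -> b.__iter__()`.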


I believe this breaking semantic change is a bug, and I've opened a PR to fix it 
(https://github.com/python/cpython/pull/18264). My reasoning is that "unary *" 
isn't an operator; it doesn't appear on the operator precedence table in the docs, and 
you can't evaluate `*x`. Like the brackets and the comma, it's part of the syntax of the 
outer display expression, not the inner one. It specifies how the list should be built, 
so it should be evaluated last, as part of the list construction. And it has always been 
this way since PEP 448 (as far as I can tell).

The docs themselves seem to support this line of reasoning 
(https://docs.python.org/3/reference/expressions.html#evaluation-order):

In the following lines, expressions will be evaluated in the arithmetic order 
of their suffixes:
...
expr1(expr2, expr3, *expr4, **expr5)


Note that the stars are not part of expressions 1-5, but are a part of the 
top-level call expression that operates on them all.
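
The documented order for a call can be checked with a small sketch like the one below. The `note` and `target` helpers are hypothetical names introduced only for this example; each `note(...)` call stands in for one of the numbered expressions and logs when it is evaluated.

```python
order = []  # labels of the sub-expressions, in evaluation order


def note(label, value):
    """Log that the sub-expression `label` was evaluated, then return its value."""
    order.append(label)
    return value


def target(*args, **kwargs):
    """Callee that just reports what it received."""
    return args, kwargs


args, kwargs = note("expr1", target)(
    note("expr2", 1),
    note("expr3", 2),
    *note("expr4", [3, 4]),
    **note("expr5", {"k": 5}),
)

print(order)
print(args, kwargs)
```

Note that this only shows when the five operand expressions are evaluated; it does not show when the starred operands are iterated.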

Mark Shannon disagrees with me (I'll let him reply rather than attempt to 
summarize his argument for him), but we figured it might be better to get more 
input here on exactly whether you all think the behavior should change or not. 
You can see the discussion on the PR itself for some additional points and 
context.


I have two problems with this change.

1. It changes error messages.

>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Value after * must be an iterable, not int

In 3.8, both calls produced the same error message:

>>> print(*1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int
>>> print(*1, *2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print() argument after * must be an iterable, not int

I am not sure whether the function name is useful information, but some effort was spent to preserve it. In any case, error messages should be consistent.
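
A small sketch for comparing the two messages on a given interpreter. The helper name `unpack_error` is hypothetical; the exact wording of the messages depends on the Python version, so the script only prints them and reports whether they match.

```python
def unpack_error(thunk):
    """Call thunk() and return the TypeError message it raises, if any."""
    try:
        thunk()
    except TypeError as exc:
        return str(exc)
    return None


# One starred non-iterable argument versus two of them.
single = unpack_error(lambda: print(*1))
double = unpack_error(lambda: print(*1, *2))

print("single:", single)
print("double:", double)
print("same message:", single == double)
```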



2. It introduces a performance regression.

In 3.8 the bytecode for `(*a, *b, *c)` was:

  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 LOAD_NAME                2 (c)
              6 BUILD_TUPLE_UNPACK       3

In master it is:

  1           0 BUILD_LIST               0
              2 LOAD_NAME                0 (a)
              4 LIST_EXTEND              1
              6 LOAD_NAME                1 (b)
              8 LIST_EXTEND              1
             10 LOAD_NAME                2 (c)
             12 LIST_EXTEND              1
             14 LIST_TO_TUPLE

The bytecode is larger and therefore slower. It also prevents a possible optimization of BUILD_TUPLE_UNPACK and similar opcodes for the common case of tuples and lists, which would minimize the number of memory allocations.
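
The listings above can be reproduced with the standard `dis` module. This is only a sketch: the opcodes and the instruction count vary between Python versions, so the script prints whatever the running interpreter generates rather than assuming one of the listings.

```python
import dis

# Compile the display in "eval" mode so the names are loaded with LOAD_NAME,
# as in the listings above. Compiling does not require a, b, c to exist.
code = compile("(*a, *b, *c)", "<example>", "eval")

opnames = []
for instr in dis.get_instructions(code):
    opnames.append(instr.opname)
    print(instr.offset, instr.opname, instr.argrepr)

print("instructions:", len(opnames))
```

Instruction count alone does not settle the performance question; a timing comparison on both builds would be needed for that.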

