On 7/3/2018 5:37 PM, Serhiy Storchaka wrote:
I like programming languages in which everything is an expression (including function declarations, branching, and loops) and you can use an assignment at any point, but Python is built in a different way, and I like Python too. PEP 572 looks like it violates several Python design principles. Python looks like a simple language, and this is its strong side. I believe most Python users are not professional programmers -- they are sysadmins, scientists, hobbyists, and kids -- but Python is suitable for them because of its clear syntax and its encouragement of good programming style. In particular, mutating and non-mutating operations are separated. The assignment expression breaks this separation. There should be very good reasons for doing so. But it looks to me that all the examples for PEP 572 can be written better without the walrus operator.
I appreciate you showing alternatives I can use now. Even once implemented, one cannot use assignment expressions until one no longer cares about 3.7 compatibility. Even then, there will still be a choice.

results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]

     results = [(x, y, x/y) for x in input_data for y in [f(x)] if y > 0]

Would (f(x),) be faster?

import timeit as ti

print(ti.timeit('for y in {x}: pass', 'x=1'))
print(ti.timeit('for y in [x]: pass', 'x=1'))
print(ti.timeit('for y in (x,): pass', 'x=1'))

# prints
0.13765254499999996  # seconds per 1_000_000 = microseconds each.
0.10321274000000003
0.09492473300000004

Yes, but not by enough to pay for typing the ',' and sometimes forgetting it.

stuff = [[y := f(x), x/y] for x in range(5)]

     stuff = [[y, x/y] for x in range(5) for y in [f(x)]]

Creating a leaky name binding appears to be about 5 x faster than iterating over a temporary singleton.

print(ti.timeit('y=x', 'x=1'))
print(ti.timeit('y=x; del y', 'x=1'))
#
0.017357778999999907
0.021115051000000107

If one adds 'del y' to make the two equivalent, the number of characters typed is about the same. To me, the choice amounts to subjective preference. Even with y:=x available, I would write the expansion as

res = []
for x in range(5):
    y = f(x)
    res.append((y, x/y))

rather than use the assignment expression in the tuple. It creates a 'hitch' in thought.
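One more difference worth noting, beyond speed (my reading of PEP 572, not tested against a released interpreter): the walrus in a comprehension binds its target in the containing scope, while the "for y in [f(x)]" idiom keeps y local to the comprehension. A minimal sketch, reusing f from the examples above:

# for-in-[] keeps y local to the comprehension; it does not leak.
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
# print(y)  # would raise NameError here

# Per PEP 572, the walrus binds y in the enclosing scope instead,
# so y remains visible (bound to its last value) afterwards.
stuff = [[y := f(x), x/y] for x in range(5)]
print(y)  # the f(4) value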

Does this idiom look unusual to you? It is legal Python syntax, and it is no more unusual than the new walrus operator. This idiom is not commonly used because there is very little need for the above examples in real code. And I'm sure that the walrus operator in comprehensions will be very rare unless PEP 572 encourages writing complicated comprehensions. Most users prefer to write an explicit loop.

Remember that PEP 572 started from a discussion on Python-ideas which proposed a syntax for writing the following code as a comprehension:

     smooth_signal = []
     average = initial_value
     for xt in signal:
         average = (1-decay)*average + decay*xt
         smooth_signal.append(average)

Using the "for in []" idiom this can be written (if you prefer comprehensions) as:

     smooth_signal = [average
                      for average in [initial_value]
                      for x in signal
                      for average in [(1-decay)*average + decay*x]]

Now try to write this using PEP 572. The walrus operator turns out to be less suitable for solving the original problem, because it does not help with initializing the starting value.
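For the record, the closest PEP 572 spelling I can construct (a sketch, assuming the semantics in the current draft) still needs a separate statement for the initialization:

average = initial_value  # the walrus cannot do this part
smooth_signal = [average := (1-decay)*average + decay*x
                 for x in signal]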


Examples from PEP 572:

# Loop-and-a-half
while (command := input("> ")) != "quit":
    print("You entered:", command)

The straightforward way:

     while True:
         command = input("> ")
         if command == "quit": break
         print("You entered:", command)

The clever way:

     for command in iter(lambda: input("> "), "quit"):
         print("You entered:", command)

The 2-argument form of iter is under-remembered and under-used. The length difference between the two loop headers is 8 characters.
    while (command := input("> ")) != "quit":
    for command in iter(lambda: input("> "), "quit"):
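For those who don't remember it: iter(func, sentinel) returns an iterator that calls func() repeatedly until the result equals the sentinel, which is discarded rather than yielded. A rough pure-Python equivalent (a sketch, not CPython's actual implementation):

def iter2(func, sentinel):
    # Call func() until it returns a value equal to sentinel;
    # the sentinel itself is consumed, not yielded.
    while True:
        value = func()
        if value == sentinel:
            return
        yield value

for command in iter2(lambda: input("> "), "quit"):
    print("You entered:", command)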

I like the iter version, but the for-loop machinery and the extra function call make a minimal loop (10000 iterations here) about half a millisecond slower.

import timeit as ti

def s():  # baseline: setup cost only, no loop
    it = iter(10000*'0' + '1')

def w():
    it = iter(10000*'0' + '1')
    while True:
        command = next(it)
        if command == '1': break

def f():
    it = iter(10000*'0' + '1')
    for command in iter(lambda: next(it), '1'): pass

print(ti.timeit('s()', 'from __main__ import s', number=1000))
print(ti.timeit('w()', 'from __main__ import w', number=1000))
print(ti.timeit('f()', 'from __main__ import f', number=1000))
#
0.0009702129999999975
0.9365254250000001
1.5913117949999998

Of course, with added processing of 'command' the time difference disappears. Printing (in IDLE) is an extreme case.

def wp():
    it = iter(100*'0' + '1')
    while True:
        command = next(it)
        if command == '1': break
        print('w', command)

def fp():
    it = iter(100*'0' + '1')
    for command in iter(lambda: next(it), '1'):
        print('f', command)

print(ti.timeit('wp()', 'from __main__ import wp', number=1))
print(ti.timeit('fp()', 'from __main__ import fp', number=1))
#
0.48
0.47

# Capturing regular expression match objects
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
# of this effect
if match := re.search(pat, text):
    print("Found:", match.group(0))
# The same syntax chains nicely into 'elif' statements, unlike the
# equivalent using assignment statements.
elif match := re.search(otherpat, text):
    print("Alternate found:", match.group(0))
elif match := re.search(third, text):
    print("Fallback found:", match.group(0))

It may be more efficient to use a single regular expression which consists of multiple or-ed patterns marked as different groups.

My attempt resulted in a slowdown. Duplicating the dominance of pat over otherpat over third requires, I believe, negative lookahead assertions.
---

import re
import timeit as ti

pat1 = re.compile('1')
pat2 = re.compile('2')
pat3 = re.compile('3')
pat123 = re.compile('1|2(?!.*1)|3(?!.*(1|2))')
# I think most people would prefer to use the 3 simple patterns.

def ifel(text):
    match = re.search(pat1, text)
    if match: return match.group()
    match = re.search(pat2, text)
    if match: return match.group()
    match = re.search(pat3, text)
    if match: return match.group()

def mach(text):
    match = re.search(pat123, text)
    return match.group()

print([ifel('321'), ifel('32x'), ifel('3xx')] == ['1', '2', '3'])
print([mach('321'), mach('32x'), mach('3xx')] == ['1', '2', '3'])
# True, True

text = '0'*10000 + '321'
print(ti.timeit('ifel(text)', "from __main__ import ifel, text", number=100000))
print(ti.timeit('mach(text)', "from __main__ import mach, text", number=100000))
# 0.77, 7.22


When I put parens around the 1, 2, and 3 in pat123, the second timeit call ran until I restarted the Shell. Maybe you can do better.
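If the goal of those extra parens was to tell which alternative matched, named groups plus match.lastgroup may be a safer route (untested against the hang above; whether it avoids the slowdown is a guess). This is the same lastgroup trick the gettext tokenizer below relies on:

pat123n = re.compile('(?P<one>1)|(?P<two>2(?!.*1))|(?P<three>3(?!.*(?:1|2)))')

m = pat123n.search('0'*10000 + '321')
print(m.lastgroup, m.group())  # one 1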


For example see the cute regex-based tokenizer in gettext.py:

_token_pattern = re.compile(r"""
        (?P<WHITESPACES>[ \t]+)                    | # spaces and horizontal tabs
        (?P<NUMBER>[0-9]+\b)                       | # decimal integer
        (?P<NAME>n\b)                              | # only n is allowed
        (?P<PARENTHESIS>[()])                      |
        (?P<OPERATOR>[-*/%+?:]|[><!]=?|==|&&|\|\|) | # !, *, /, %, +, -, <, >,
                                                     # <=, >=, ==, !=, &&, ||,
                                                     # ? :
                                                     # unary and bitwise ops
                                                     # not allowed
        (?P<INVALID>\w+|.)                           # invalid token
    """, re.VERBOSE|re.DOTALL)

def _tokenize(plural):
    for mo in re.finditer(_token_pattern, plural):
        kind = mo.lastgroup
        if kind == 'WHITESPACES':
            continue
        value = mo.group(kind)
        if kind == 'INVALID':
            raise ValueError('invalid token in plural form: %s' % value)
        yield value
    yield ''
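To make the mechanics concrete, here is what the tokenizer yields for a typical plural-forms expression (my own example input, not from gettext.py):

print(list(_tokenize('n % 10 == 1 && n % 100 != 11')))
# ['n', '%', '10', '==', '1', '&&', 'n', '%', '100', '!=', '11', '']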

I have not found any code similar to the PEP 572 example in pydoc.py. Its code is different:

pattern = re.compile(r'\b((http|ftp)://\S+[\w/]|'
                        r'RFC[- ]?(\d+)|'
                        r'PEP[- ]?(\d+)|'
                        r'(self\.)?(\w+))')
...
start, end = match.span()
results.append(escape(text[here:start]))

all, scheme, rfc, pep, selfdot, name = match.groups()
if scheme:
    url = escape(all).replace('"', '&quot;')
    results.append('<a href="%s">%s</a>' % (url, url))
elif rfc:
    url = 'http://www.rfc-editor.org/rfc/rfc%d.txt' % int(rfc)
    results.append('<a href="%s">%s</a>' % (url, escape(all)))
elif pep:
...

It doesn't look like a sequence of re.search() calls. It is clearer and more efficient, and using the assignment expression would not make it better.

# Reading socket data until an empty string is returned
while data := sock.recv():
    print("Received data:", data)

     for data in iter(sock.recv, b''):
         print("Received data:", data)

if pid := os.fork():
    # Parent code
else:
    # Child code

     pid = os.fork()
     if pid:
         # Parent code
     else:
         # Child code


It looks to me like there is no real use case for PEP 572. It just makes Python worse.



--
Terry Jan Reedy

