Serhiy Storchaka <storchaka+cpyt...@gmail.com> added the comment:

Results of microbenchmarks:

$ ./python -m perf timeit -s 'a = list(range(1000))' -- 'for i in a: pass'
Mean +- std dev: 6.31 us +- 0.09 us

$ ./python -m perf timeit -s 'a = list(range(1000))' -- '
for i in a:
    try: pass
    finally: pass
'
Unpatched:  Mean +- std dev: 16.3 us +- 0.2 us
PR 2827:    Mean +- std dev: 16.2 us +- 0.2 us
PR 4682:    Mean +- std dev: 16.2 us +- 0.2 us
PR 5006:    Mean +- std dev: 14.5 us +- 0.4 us

$ ./python -m perf timeit -s 'a = list(range(1000))' -- '
for i in a:
    try: continue
    finally: pass
'
Unpatched:  Mean +- std dev: 24.0 us +- 0.5 us
PR 2827:    Mean +- std dev: 11.9 us +- 0.1 us
PR 4682:    Mean +- std dev: 12.0 us +- 0.1 us
PR 5006:    Mean +- std dev: 19.0 us +- 0.3 us

$ ./python -m perf timeit -s 'a = list(range(1000))' -- '
for i in a:
    while True:
        try: break
        finally: pass
'
Unpatched:  Mean +- std dev: 25.9 us +- 0.5 us
PR 2827:    Mean +- std dev: 11.9 us +- 0.1 us
PR 4682:    Mean +- std dev: 12.0 us +- 0.1 us
PR 5006:    Mean +- std dev: 18.9 us +- 0.1 us
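
For readers without the perf module, the four loops above can be approximated with the standard library's timeit. This is only a rough sketch, not the harness used for the numbers above: the names SETUP, STATEMENTS and best_time_us are made up for illustration, and it reports the best run rather than a mean with standard deviation, so absolute values will differ.

```python
import timeit

# Same setup and loop bodies as the perf commands above.
SETUP = "a = list(range(1000))"
STATEMENTS = {
    "plain loop": "for i in a: pass",
    "try/finally": "for i in a:\n    try: pass\n    finally: pass",
    "continue in try": "for i in a:\n    try: continue\n    finally: pass",
    "break in try": (
        "for i in a:\n"
        "    while True:\n"
        "        try: break\n"
        "        finally: pass"
    ),
}


def best_time_us(stmt, setup=SETUP, number=200, repeat=5):
    """Return the best time for one execution of stmt, in microseconds."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    # Each entry is the total time for `number` executions of stmt.
    return min(timings) / number * 1e6


if __name__ == "__main__":
    for label, stmt in STATEMENTS.items():
        print(f"{label:16s} {best_time_us(stmt):8.2f} us")
```

Running the script under each patched build in turn would give one column of a comparison like the one above.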


PR 2827 and PR 4682 have the same performance. The overhead of the finally 
block is smaller in PR 5006, perhaps because BEGIN_FINALLY pushes 1 NULL 
instead of 6 NULLs. CALL_FINALLY adds 4.5 ns per iteration in the latter two 
examples. This overhead could be decreased by using a special cache for Python 
integers that represent return addresses, or by using a separate stack for 
return addresses. But this looks like overkill to me now. 4.5 ns is a pretty 
small overhead; a simple `i = i` has the same timing.
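
One way to arrive at the 4.5 ns figure, assuming it is the difference between the PR 5006 timings with and without `continue`/`break`, divided by the 1000 loop iterations (the helper name below is hypothetical; the timings are copied from the results above):

```python
# Mean timings in microseconds for PR 5006, each covering 1000 iterations.
ITERATIONS = 1000
BASELINE_US = 14.5  # "try: pass / finally: pass"
PR_5006_US = {"continue": 19.0, "break": 18.9}


def per_iteration_overhead_ns(total_us, baseline_us, iterations=ITERATIONS):
    """Return the extra cost per loop iteration, in nanoseconds."""
    return (total_us - baseline_us) * 1000.0 / iterations


for name, total_us in PR_5006_US.items():
    overhead = per_iteration_overhead_ns(total_us, BASELINE_US)
    print(f"{name}: {overhead:.1f} ns")
```

The `break` case also includes an extra `while True:` loop, so for that example the difference is only an approximation of the CALL_FINALLY cost.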

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue17611>
_______________________________________