[issue45367] Specialize BINARY_MULTIPLY

2021-10-17 Thread Dennis Sweeney
Change by Dennis Sweeney: resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue45367] Specialize BINARY_MULTIPLY

2021-10-14 Thread Mark Shannon
Mark Shannon added the comment: New changeset 3b3d30e8f78271a488965c9cd11136e1aa890757 by Dennis Sweeney in branch 'main': bpo-45367: Specialize BINARY_MULTIPLY (GH-28727) https://github.com/python/cpython/commit/3b3d30e8f78271a488965c9cd11136e1aa890757

[issue45367] Specialize BINARY_MULTIPLY

2021-10-06 Thread Mark Shannon
Mark Shannon added the comment: If some misses are caused by mixed int/float operands, it might be worth investigating whether these occur in loops. Most JIT compilers perform some sort of loop peeling to counter this form of type instability. E.g.

    x = 0
    for ...
        x += some_float()

`x`
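The type instability described above can be sketched in plain Python. This is only an illustration, not code from the patch: `some_float`, `accumulate`, and `accumulate_peeled` are hypothetical names, and the peeling is done by hand here, whereas a JIT compiler would apply it automatically.

```python
def some_float():
    # Hypothetical stand-in for any call that returns a float.
    return 0.5


def accumulate(n):
    """Unpeeled loop: record the type of x before each addition."""
    x = 0  # starts out as an int
    operand_types = []
    for _ in range(n):
        operand_types.append(type(x).__name__)
        x += some_float()  # int + float once, float + float afterwards
    return x, operand_types


def accumulate_peeled(n):
    """Same computation with the first iteration peeled off by hand."""
    x = 0
    if n > 0:
        x += some_float()  # peeled iteration: the only int + float add
    for _ in range(n - 1):
        x += some_float()  # every add in the loop body is float + float
    return x


total, operand_types = accumulate(3)
print(total, operand_types)
print(accumulate_peeled(3))
```

In the unpeeled version the left operand is an `int` only on the first pass, so a float-only fast path would miss once and then hit. After peeling, the loop body sees the same operand types on every iteration.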

[issue45367] Specialize BINARY_MULTIPLY

2021-10-05 Thread Dennis Sweeney
Dennis Sweeney added the comment: Hm, the above was not PGO. I tried again with PGO and it is not so good: Mean +- std dev: [nbody_main_pgo] 177 ms +- 4 ms -> [nbody_specialized_pgo] 190 ms +- 2 ms: 1.07x slower Mean +- std dev: [pidigits_main_pgo] 208 ms +- 1 ms ->
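As a quick check on the nbody figure quoted above, the "1.07x slower" label is consistent with a simple ratio of the two means. The helper below is a hypothetical sketch of that arithmetic, assuming the label is derived from the means alone; it is not the benchmark tool's own code and it ignores the standard deviations.

```python
def describe_change(before_ms, after_ms):
    """Label a timing change as 'N.NNx slower' or 'N.NNx faster'.

    Assumption: the label is the ratio of the two mean timings.
    """
    if after_ms >= before_ms:
        return f"{after_ms / before_ms:.2f}x slower"
    return f"{before_ms / after_ms:.2f}x faster"


# nbody with PGO, means taken from the message above: 177 ms -> 190 ms
print(describe_change(177, 190))  # 1.07x slower
```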

[issue45367] Specialize BINARY_MULTIPLY

2021-10-05 Thread Ken Jin
Ken Jin added the comment: > (Windows doesn't want to install greenlet for pyperformance) I had the *exact* same issues, I eventually found a workaround for it after many hours spent guessing. Initially, setuptools complained that I needed MSVC++ 14.0 or later (even after I had the latest

[issue45367] Specialize BINARY_MULTIPLY

2021-10-04 Thread Dennis Sweeney
Change by Dennis Sweeney: keywords: +patch; pull_requests: +27074; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/28727

[issue45367] Specialize BINARY_MULTIPLY

2021-10-04 Thread Dennis Sweeney
New submission from Dennis Sweeney : I'm having trouble setting up a rigorous benchmark (Windows doesn't want to install greenlet for pyperformance), but just running a couple of individual files, I got this: Mean +- std dev: [nbody_main] 208 ms +- 2 ms -> [nbody_specialized] 180 ms +- 2