[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-17 Thread Christian Heimes


Christian Heimes  added the comment:

Thank you for your thorough testing. It's useful to know that the option does 
not speed up PGO builds of Python.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-17 Thread Ma Lin


Ma Lin  added the comment:

The last benchmark was wrong: the `/Ob3` option was not enabled.

With `pgo_ob3.diff` applied, the PGO build becomes slower, so I am closing this issue.

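+------------------------+------------+-----------------------------+
| Benchmark              | py39_pgo_a | py39_pgo_b                  |
+========================+============+=============================+
| 2to3                   | 461 ms     | 465 ms: 1.01x slower (+1%)  |
+------------------------+------------+-----------------------------+
| chameleon              | 13.4 ms    | 13.7 ms: 1.03x slower (+3%) |
+------------------------+------------+-----------------------------+
| chaos                  | 138 ms     | 141 ms: 1.02x slower (+2%)  |
+------------------------+------------+-----------------------------+
| crypto_pyaes           | 141 ms     | 143 ms: 1.01x slower (+1%)  |
+------------------------+------------+-----------------------------+
| deltablue              | 9.01 ms    | 9.20 ms: 1.02x slower (+2%) |
+------------------------+------------+-----------------------------+
| django_template        | 64.7 ms    | 65.4 ms: 1.01x slower (+1%) |
+------------------------+------------+-----------------------------+
| dulwich_log            | 78.2 ms    | 78.8 ms: 1.01x slower (+1%) |
+------------------------+------------+-----------------------------+
| fannkuch               | 640 ms     | 668 ms: 1.04x slower (+4%)  |
+------------------------+------------+-----------------------------+
| float                  | 165 ms     | 163 ms: 1.01x faster (-1%)  |
+------------------------+------------+-----------------------------+
| genshi_text            | 40.7 ms    | 41.5 ms: 1.02x slower (+2%) |
+------------------------+------------+-----------------------------+
| genshi_xml             | 87.2 ms    | 88.4 ms: 1.01x slower (+1%) |
+------------------------+------------+-----------------------------+
| go                     | 309 ms     | 314 ms: 1.01x slower (+1%)  |
+------------------------+------------+-----------------------------+
| hexiom                 | 12.3 ms    | 12.7 ms: 1.03x slower (+3%) |
+------------------------+------------+-----------------------------+
| json_dumps             | 16.7 ms    | 16.8 ms: 1.01x slower (+1%) |
+------------------------+------------+-----------------------------+
| json_loads             | 32.1 us    | 32.5 us: 1.01x slower (+1%) |
+------------------------+------------+-----------------------------+
| logging_format         | 14.6 us    | 15.0 us: 1.03x slower (+3%) |
+------------------------+------------+-----------------------------+
| logging_silent         | 247 ns     | 257 ns: 1.04x slower (+4%)  |
+------------------------+------------+-----------------------------+
| logging_simple         | 13.2 us    | 13.6 us: 1.03x slower (+3%) |
+------------------------+------------+-----------------------------+
| mako                   | 22.1 ms    | 22.8 ms: 1.03x slower (+3%) |
+------------------------+------------+-----------------------------+
| meteor_contest         | 135 ms     | 137 ms: 1.01x slower (+1%)  |
+------------------------+------------+-----------------------------+
| nbody                  | 184 ms     | 191 ms: 1.04x slower (+4%)  |
+------------------------+------------+-----------------------------+
| nqueens                | 132 ms     | 137 ms: 1.04x slower (+4%)  |
+------------------------+------------+-----------------------------+
| pathlib                | 156 ms     | 162 ms: 1.04x slower (+4%)  |
+------------------------+------------+-----------------------------+
| pickle                 | 16.3 us    | 15.4 us: 1.05x faster (-5%) |
+------------------------+------------+-----------------------------+
| pickle_dict            | 39.7 us    | 40.0 us: 1.01x slower (+1%) |
+------------------------+------------+-----------------------------+
| pickle_list            | 5.93 us    | 6.15 us: 1.04x slower (+4%) |
+------------------------+------------+-----------------------------+
| pickle_pure_python     | 581 us     | 587 us: 1.01x slower (+1%)  |
+------------------------+------------+-----------------------------+
| pidigits               | 243 ms     | 242 ms: 1.00x faster (-0%)  |
+------------------------+------------+-----------------------------+
| pyflate                | 885 ms     | 908 ms: 1.03x slower (+3%)  |
+------------------------+------------+-----------------------------+
| python_startup         | 27.8 ms    | 28.0 ms: 1.01x slower (+1%) |
+------------------------+------------+-----------------------------+
| python_startup_no_site | 22.0 ms    | 22.1 ms: 1.00x slower (+0%) |
+------------------------+------------+-----------------------------+
| raytrace               | 630 ms     | 632 ms: 1.00x slower (+0%)  |
+------------------------+------------+-----------------------------+
| regex_compile          | 215 m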
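
One way to condense a table like the one above into a single figure is the geometric mean of the per-benchmark ratios (changed time divided by baseline time). The sketch below is only illustrative: it uses five rows copied from the table, relies on the rounded values shown there, and the helper name `geometric_mean_ratio` is made up for this example. A result above 1.0 means the changed build is slower overall on the chosen rows.

```python
import math

# (baseline, changed) timings in ms for a few rows copied from the table;
# the values are the rounded figures as displayed, not the raw measurements.
pairs = {
    "2to3": (461, 465),
    "chameleon": (13.4, 13.7),
    "chaos": (138, 141),
    "fannkuch": (640, 668),
    "float": (165, 163),
}


def geometric_mean_ratio(timings):
    """Geometric mean of changed/baseline; values above 1.0 mean slower."""
    logs = [math.log(changed / baseline) for baseline, changed in timings]
    return math.exp(sum(logs) / len(logs))


overall = geometric_mean_ratio(pairs.values())
print(f"{overall:.3f}x relative to baseline")
```

The geometric mean is used instead of the arithmetic mean so that a 2x slowdown in one benchmark and a 2x speedup in another cancel out exactly.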

[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-16 Thread Ma Lin


Ma Lin  added the comment:

In a PGO build, the improvement is small.

(3.9 branch, with PGO, build.bat -p X64 --pgo)

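+------------------------+--------------+-----------------------------+
| Benchmark              | baseline-pgo | ob3-pgo                     |
+========================+==============+=============================+
| 2to3                   | 464 ms       | 462 ms: 1.01x faster (-1%)  |
+------------------------+--------------+-----------------------------+
| chameleon              | 14.0 ms      | 13.5 ms: 1.03x faster (-3%) |
+------------------------+--------------+-----------------------------+
| crypto_pyaes           | 142 ms       | 143 ms: 1.00x slower (+0%)  |
+------------------------+--------------+-----------------------------+
| django_template        | 65.0 ms      | 65.4 ms: 1.01x slower (+1%) |
+------------------------+--------------+-----------------------------+
| fannkuch               | 665 ms       | 650 ms: 1.02x faster (-2%)  |
+------------------------+--------------+-----------------------------+
| float                  | 166 ms       | 164 ms: 1.01x faster (-1%)  |
+------------------------+--------------+-----------------------------+
| genshi_text            | 41.4 ms      | 41.0 ms: 1.01x faster (-1%) |
+------------------------+--------------+-----------------------------+
| genshi_xml             | 88.1 ms      | 87.0 ms: 1.01x faster (-1%) |
+------------------------+--------------+-----------------------------+
| go                     | 315 ms       | 311 ms: 1.01x faster (-1%)  |
+------------------------+--------------+-----------------------------+
| hexiom                 | 12.7 ms      | 12.6 ms: 1.01x faster (-1%) |
+------------------------+--------------+-----------------------------+
| json_dumps             | 16.7 ms      | 16.6 ms: 1.01x faster (-1%) |
+------------------------+--------------+-----------------------------+
| json_loads             | 33.5 us      | 32.1 us: 1.04x faster (-4%) |
+------------------------+--------------+-----------------------------+
| logging_simple         | 13.6 us      | 13.3 us: 1.02x faster (-2%) |
+------------------------+--------------+-----------------------------+
| mako                   | 22.7 ms      | 22.8 ms: 1.01x slower (+1%) |
+------------------------+--------------+-----------------------------+
| meteor_contest         | 136 ms       | 138 ms: 1.01x slower (+1%)  |
+------------------------+--------------+-----------------------------+
| nbody                  | 189 ms       | 186 ms: 1.02x faster (-2%)  |
+------------------------+--------------+-----------------------------+
| nqueens                | 135 ms       | 135 ms: 1.01x faster (-1%)  |
+------------------------+--------------+-----------------------------+
| pathlib                | 157 ms       | 154 ms: 1.02x faster (-2%)  |
+------------------------+--------------+-----------------------------+
| pickle                 | 16.8 us      | 16.4 us: 1.02x faster (-2%) |
+------------------------+--------------+-----------------------------+
| pickle_dict            | 41.3 us      | 40.4 us: 1.02x faster (-2%) |
+------------------------+--------------+-----------------------------+
| pickle_list            | 6.34 us      | 6.42 us: 1.01x slower (+1%) |
+------------------------+--------------+-----------------------------+
| pickle_pure_python     | 588 us       | 584 us: 1.01x faster (-1%)  |
+------------------------+--------------+-----------------------------+
| pidigits               | 242 ms       | 242 ms: 1.00x faster (-0%)  |
+------------------------+--------------+-----------------------------+
| pyflate                | 905 ms       | 898 ms: 1.01x faster (-1%)  |
+------------------------+--------------+-----------------------------+
| python_startup         | 28.0 ms      | 27.9 ms: 1.00x faster (-0%) |
+------------------------+--------------+-----------------------------+
| regex_compile          | 220 ms       | 218 ms: 1.01x faster (-1%)  |
+------------------------+--------------+-----------------------------+
| regex_v8               | 33.1 ms      | 32.9 ms: 1.01x faster (-1%) |
+------------------------+--------------+-----------------------------+
| richards               | 88.9 ms      | 88.3 ms: 1.01x faster (-1%) |
+------------------------+--------------+-----------------------------+
| scimark_fft            | 494 ms       | 486 ms: 1.02x faster (-2%)  |
+------------------------+--------------+-----------------------------+
| scimark_lu             | 210 ms       | 207 ms: 1.02x faster (-2%)  |
+------------------------+--------------+-----------------------------+
| scimark_monte_carlo    | 141 ms       | 137 ms: 1.03x faster (-3%)  |
+------------------------+--------------+-----------------------------+
| scimark_sor            | 263 ms       | 255 ms: 1.03x faster (-3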
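
The right-hand column of these tables pairs a speed ratio with a signed percentage change. The sketch below shows one way such a label can be derived from two timings; it is an illustration, not the benchmarking tool's actual code, and the function name `describe_change` is made up here. Because the table shows rounded timings, recomputing a label from the displayed values can occasionally differ from the printed one in the last digit.

```python
def describe_change(baseline: float, changed: float) -> str:
    """Format a timing change as 'N.NNx faster/slower (+/-P%)'.

    The ratio is always >= 1 (larger time divided by smaller time); the
    percentage is the signed change of `changed` relative to `baseline`.
    """
    percent = (changed - baseline) / baseline * 100.0
    if changed >= baseline:
        return f"{changed / baseline:.2f}x slower ({percent:+.0f}%)"
    return f"{baseline / changed:.2f}x faster ({percent:+.0f}%)"


# json_loads row: 33.5 us -> 32.1 us
print(describe_change(33.5, 32.1))  # 1.04x faster (-4%)
# pickle_list row: 6.34 us -> 6.42 us
print(describe_change(6.34, 6.42))  # 1.01x slower (+1%)
```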

[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-16 Thread Ma Lin


Ma Lin  added the comment:

> Could you please try again with PGO?

Please wait.

BTW, this option was recommended in another project.
In that project, even with `/Ob3` enabled, the MSVC build is still slower than the GCC 9 build.
If you are interested, see: https://github.com/facebook/zstd/issues/2314




[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-16 Thread Christian Heimes


Christian Heimes  added the comment:

Could you please try again with PGO? All our official builds use PGO.

--
nosy: +christian.heimes




[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-16 Thread Ma Lin


New submission from Ma Lin :

MSVC2019 has a new option, `/Ob3`, which specifies more aggressive inlining than 
`/Ob2`:
https://docs.microsoft.com/en-us/cpp/build/reference/ob-inline-function-expansion?view=msvc-160

If this option is used with MSVC2017, the compiler emits a warning:
cl : Command line warning D9002 : ignoring unknown option '/Ob3'
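
A build script could avoid that warning by choosing the inlining flag from the compiler version. The sketch below is hypothetical and is not what `Ob3.diff` does: the function name `inline_flag` is made up, and the threshold assumes that Visual Studio 2019 toolsets report `_MSC_VER` values of 1920 and above while Visual Studio 2017 toolsets report 1910 through 1916.

```python
# Assumption: _MSC_VER >= 1920 identifies a Visual Studio 2019 (or newer)
# compiler, which accepts /Ob3; older compilers fall back to /Ob2.
MSVC2019_MSC_VER = 1920


def inline_flag(msc_ver: int) -> str:
    """Return the most aggressive inlining option the compiler accepts."""
    return "/Ob3" if msc_ver >= MSVC2019_MSC_VER else "/Ob2"


print(inline_flag(1916))  # /Ob2
print(inline_flag(1928))  # /Ob3
```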

Applying just `Ob3.diff` gives this improvement:
(Python 3.9 branch, No PGO, build.bat -p X64)

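+------------------------+----------+------------------------------+
| Benchmark              | baseline | ob3                          |
+========================+==========+==============================+
| 2to3                   | 563 ms   | 552 ms: 1.02x faster (-2%)   |
+------------------------+----------+------------------------------+
| chameleon              | 16.5 ms  | 16.1 ms: 1.03x faster (-3%)  |
+------------------------+----------+------------------------------+
| chaos                  | 200 ms   | 197 ms: 1.02x faster (-2%)   |
+------------------------+----------+------------------------------+
| crypto_pyaes           | 186 ms   | 184 ms: 1.01x faster (-1%)   |
+------------------------+----------+------------------------------+
| deltablue              | 13.0 ms  | 12.6 ms: 1.03x faster (-3%)  |
+------------------------+----------+------------------------------+
| dulwich_log            | 94.5 ms  | 93.9 ms: 1.01x faster (-1%)  |
+------------------------+----------+------------------------------+
| fannkuch               | 806 ms   | 761 ms: 1.06x faster (-6%)   |
+------------------------+----------+------------------------------+
| float                  | 211 ms   | 199 ms: 1.06x faster (-6%)   |
+------------------------+----------+------------------------------+
| genshi_text            | 48.3 ms  | 47.7 ms: 1.01x faster (-1%)  |
+------------------------+----------+------------------------------+
| go                     | 446 ms   | 437 ms: 1.02x faster (-2%)   |
+------------------------+----------+------------------------------+
| hexiom                 | 16.6 ms  | 15.9 ms: 1.04x faster (-4%)  |
+------------------------+----------+------------------------------+
| json_dumps             | 19.9 ms  | 19.3 ms: 1.03x faster (-3%)  |
+------------------------+----------+------------------------------+
| json_loads             | 45.5 us  | 43.9 us: 1.04x faster (-3%)  |
+------------------------+----------+------------------------------+
| logging_format         | 21.4 us  | 20.7 us: 1.03x faster (-3%)  |
+------------------------+----------+------------------------------+
| logging_silent         | 343 ns   | 319 ns: 1.07x faster (-7%)   |
+------------------------+----------+------------------------------+
| mako                   | 29.0 ms  | 27.6 ms: 1.05x faster (-5%)  |
+------------------------+----------+------------------------------+
| meteor_contest         | 168 ms   | 162 ms: 1.04x faster (-3%)   |
+------------------------+----------+------------------------------+
| nbody                  | 256 ms   | 244 ms: 1.05x faster (-5%)   |
+------------------------+----------+------------------------------+
| nqueens                | 168 ms   | 162 ms: 1.04x faster (-4%)   |
+------------------------+----------+------------------------------+
| pathlib                | 175 ms   | 168 ms: 1.04x faster (-4%)   |
+------------------------+----------+------------------------------+
| pickle                 | 17.9 us  | 17.3 us: 1.04x faster (-4%)  |
+------------------------+----------+------------------------------+
| pickle_dict            | 41.0 us  | 33.2 us: 1.24x faster (-19%) |
+------------------------+----------+------------------------------+
| pickle_list            | 6.73 us  | 5.89 us: 1.14x faster (-12%) |
+------------------------+----------+------------------------------+
| pickle_pure_python     | 829 us   | 793 us: 1.05x faster (-4%)   |
+------------------------+----------+------------------------------+
| pidigits               | 243 ms   | 243 ms: 1.00x faster (-0%)   |
+------------------------+----------+------------------------------+
| pyflate                | 1.21 sec | 1.18 sec: 1.03x faster (-2%) |
+------------------------+----------+------------------------------+
| raytrace               | 947 ms   | 915 ms: 1.03x faster (-3%)   |
+------------------------+----------+------------------------------+
| regex_compile          | 291 ms   | 284 ms: 1.03x faster (-2%)   |
+------------------------+----------+------------------------------+
| regex_dna              | 217 ms   | 222 ms: 1.02x slower (+2%)   |
+------------------------+----------+------------------------------+
| regex_effbot           | 3.97 ms  | 4.13 ms: 1.04x slower (+4%)  |
+------------------------+----------+------------------------------+
| regex_v8               | 35.2 ms  | 34.6 ms: 1.02x faster (-2%)  |
+------------------------+----------+------------------------------+
| richards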