New submission from Dong-hee Na <[email protected]>:
Compiling CPython with the PGO option is good for CPython performance, but the compile time is painful because the PGO profiling task runs in a single thread. When I tested the profiling run with -m test --pgo -j8, the build was much faster and the optimized result was not affected, so I would like to provide an option for the number of workers used in a PGO build. With this feature, we could also include more PGO tests more aggressively.

@vstinner, do you have any suggestions for this option?

- a: ./configure --enable-optimizations --pgo-workers=8
- b: ./configure --enable-optimizations --with-concurrent-pgo
- c: ./configure --enable-optimizations (by detecting the system CPU count)

The following metrics are a reference for decision making :)

## Build Time

AS-IS: real 4m42.799s
TO-BE (this case, -j8): real 2m10.405s

## No performance regression

I didn't check how reliable the environment is, but there appears to be no regression.

+------------------------+---------+-----------------------+
| Benchmark              | base    | workers               |
+========================+=========+=======================+
| 2to3                   | 409 ms  | 412 ms: 1.01x slower  |
+------------------------+---------+-----------------------+
| chaos                  | 115 ms  | 114 ms: 1.01x faster  |
+------------------------+---------+-----------------------+
| deltablue              | 6.66 ms | 6.59 ms: 1.01x faster |
+------------------------+---------+-----------------------+
| fannkuch               | 605 ms  | 611 ms: 1.01x slower  |
+------------------------+---------+-----------------------+
| float                  | 138 ms  | 129 ms: 1.07x faster  |
+------------------------+---------+-----------------------+
| go                     | 220 ms  | 215 ms: 1.02x faster  |
+------------------------+---------+-----------------------+
| hexiom                 | 10.3 ms | 10.1 ms: 1.02x faster |
+------------------------+---------+-----------------------+
| json_dumps             | 19.6 ms | 19.2 ms: 1.02x faster |
+------------------------+---------+-----------------------+
| json_loads             | 40.6 us | 39.7 us: 1.02x faster |
+------------------------+---------+-----------------------+
| logging_silent         | 180 ns  | 173 ns: 1.04x faster  |
+------------------------+---------+-----------------------+
| logging_simple         | 8.89 us | 8.81 us: 1.01x faster |
+------------------------+---------+-----------------------+
| nqueens                | 134 ms  | 136 ms: 1.01x slower  |
+------------------------+---------+-----------------------+
| pathlib                | 24.6 ms | 24.2 ms: 1.01x faster |
+------------------------+---------+-----------------------+
| pickle                 | 16.1 us | 15.9 us: 1.01x faster |
+------------------------+---------+-----------------------+
| pickle_dict            | 41.4 us | 38.1 us: 1.09x faster |
+------------------------+---------+-----------------------+
| pickle_list            | 6.27 us | 5.09 us: 1.23x faster |
+------------------------+---------+-----------------------+
| pickle_pure_python     | 499 us  | 492 us: 1.01x faster  |
+------------------------+---------+-----------------------+
| pidigits               | 285 ms  | 290 ms: 1.02x slower  |
+------------------------+---------+-----------------------+
| python_startup         | 12.1 ms | 12.2 ms: 1.01x slower |
+------------------------+---------+-----------------------+
| python_startup_no_site | 8.91 ms | 8.89 ms: 1.00x faster |
+------------------------+---------+-----------------------+
| raytrace               | 510 ms  | 500 ms: 1.02x faster  |
+------------------------+---------+-----------------------+
| regex_compile          | 211 ms  | 210 ms: 1.00x faster  |
+------------------------+---------+-----------------------+
| regex_effbot           | 4.99 ms | 4.88 ms: 1.02x faster |
+------------------------+---------+-----------------------+
| regex_v8               | 37.3 ms | 36.3 ms: 1.03x faster |
+------------------------+---------+-----------------------+
| richards               | 73.6 ms | 72.2 ms: 1.02x faster |
+------------------------+---------+-----------------------+
| scimark_fft            | 542 ms  | 552 ms: 1.02x slower  |
+------------------------+---------+-----------------------+
| scimark_lu             | 189 ms  | 184 ms: 1.03x faster  |
+------------------------+---------+-----------------------+
| scimark_monte_carlo    | 106 ms  | 106 ms: 1.01x slower  |
+------------------------+---------+-----------------------+
| scimark_sor            | 199 ms  | 196 ms: 1.01x faster  |
+------------------------+---------+-----------------------+
| spectral_norm          | 177 ms  | 176 ms: 1.01x faster  |
+------------------------+---------+-----------------------+
| unpack_sequence        | 64.9 ns | 63.7 ns: 1.02x faster |
+------------------------+---------+-----------------------+
| unpickle               | 21.5 us | 21.6 us: 1.00x slower |
+------------------------+---------+-----------------------+
| unpickle_list          | 7.69 us | 7.55 us: 1.02x faster |
+------------------------+---------+-----------------------+
| unpickle_pure_python   | 402 us  | 394 us: 1.02x faster  |
+------------------------+---------+-----------------------+
| xml_etree_parse        | 218 ms  | 217 ms: 1.01x faster  |
+------------------------+---------+-----------------------+
| xml_etree_iterparse    | 156 ms  | 156 ms: 1.01x faster  |
+------------------------+---------+-----------------------+
| xml_etree_generate     | 132 ms  | 131 ms: 1.01x faster  |
+------------------------+---------+-----------------------+
| xml_etree_process      | 92.8 ms | 91.5 ms: 1.02x faster |
+------------------------+---------+-----------------------+
| Geometric mean         | (ref)   | 1.02x faster          |
+------------------------+---------+-----------------------+

Benchmark hidden because not significant (8): logging_format, meteor_contest, nbody, pyflate, regex_dna, scimark_sparse_mat_mult, sqlite_synth, telco

----------
assignee: corona10
components: Build
messages: 411888
nosy: corona10, gvanrossum, vstinner
priority: normal
severity: normal
status: open
title: Provide number of workers option for fast PGO build time
type: enhancement
versions: Python 3.11

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue46551>
_______________________________________
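
As a rough illustration of the numbers and of option (c) in the report, the sketch below converts the two `real` timings to seconds, computes the build-time ratio, and assembles the regrtest arguments used in the test (-m test --pgo -jN), falling back to the detected CPU count when no worker count is given. The helper names `parse_real_time` and `pgo_profile_args` are hypothetical and exist only for this example; they are not part of CPython's build system, and the proposed configure flags above were still under discussion.

```python
import os
import re
from typing import List, Optional


def parse_real_time(text: str) -> float:
    """Convert a shell `time` value such as '4m42.799s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def pgo_profile_args(workers: Optional[int] = None) -> List[str]:
    """Hypothetical helper: regrtest arguments for the PGO profiling run.

    With workers=None, fall back to the detected CPU count, which is
    the behaviour sketched as option (c) in the report.
    """
    if workers is None:
        workers = os.cpu_count() or 1  # cpu_count() may return None
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return ["-m", "test", "--pgo", f"-j{workers}"]


# Timings quoted in the report.
baseline = parse_real_time("4m42.799s")  # AS-IS, single-threaded profiling
parallel = parse_real_time("2m10.405s")  # TO-BE, profiling with -j8
speedup = baseline / parallel

print(f"{baseline:.3f}s -> {parallel:.3f}s ({speedup:.2f}x shorter build)")
print(pgo_profile_args(8))
```

By this arithmetic the -j8 run cuts the total build time by a little more than half. That ratio covers the whole build, not only the profiling step, so it says nothing about how the profiling step itself scales with the worker count.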
