On 2019-07-09, Inada Naoki wrote:
> PyObject_Malloc inlines pymalloc_alloc, and PyObject_Free inlines
> pymalloc_free.
> But compiler doesn't know which is the hot part in pymalloc_alloc and
> pymalloc_free.

Hello Inada,

I don't see this on my PC. I'm using GCC 8.3.0. I have configured
the build with --enable-optimizations. To speed up the profile
generation, I have changed PROFILE_TASK to only run these tests:

    test_shelve test_set test_pprint test_pickletools
    test_ordered_dict test_tabnanny test_difflib test_pickle
    test_json test_collections

I haven't spent much time trying to figure out what set of tests is
best but the above set runs pretty quickly and seems to work okay.

I have run pyperformance to compare CPython 'master' with your PR
14674. There doesn't seem to be a meaningful difference: every
significant result is within about 5% in either direction (table
below). If I look at the disassembly, it seems that the hot paths of
pymalloc_alloc and pymalloc_free are being inlined as you would
hope, without needing the LIKELY/UNLIKELY annotations.

OTOH, your addition of LIKELY() and UNLIKELY() in the PR is a pretty
small change and probably doesn't hurt anything. So, I think it
would be fine to merge it.
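
For anyone following along who hasn't used this kind of annotation, here
is a minimal sketch of what LIKELY()/UNLIKELY() hints usually look like
when built on GCC's __builtin_expect. The macro definitions and the
toy_alloc() function (with its TOY_* constants) are my own illustrative
assumptions; they are not the code from PR 14674 or from obmalloc.c.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed definitions: typical branch-hint macros on top of the GCC/Clang
   builtin. The macros in the actual PR may be spelled differently. */
#if defined(__GNUC__) || defined(__clang__)
#  define LIKELY(x)   __builtin_expect(!!(x), 1)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define LIKELY(x)   (x)
#  define UNLIKELY(x) (x)
#endif

/* Hypothetical constants for the toy allocator below. */
#define TOY_SMALL_LIMIT 512
#define TOY_ALIGNMENT   16

static _Alignas(TOY_ALIGNMENT) unsigned char toy_pool[4096];
static size_t toy_pool_used = 0;

/* Hypothetical bump allocator: the common case (a small request that fits
   in the pool) is the straight-line path, and the rare cases are marked
   UNLIKELY so the compiler can move them out of the way.
   This sketch has no matching free function. */
static void *toy_alloc(size_t nbytes)
{
    /* Rare: empty or large requests go to the system allocator. */
    if (UNLIKELY(nbytes == 0 || nbytes > TOY_SMALL_LIMIT)) {
        return malloc(nbytes ? nbytes : 1);
    }

    /* Round the request up to a multiple of TOY_ALIGNMENT. */
    size_t size = (nbytes + TOY_ALIGNMENT - 1) & ~(size_t)(TOY_ALIGNMENT - 1);

    /* Rare: the pool is exhausted, so fall back to malloc. */
    if (UNLIKELY(size > sizeof(toy_pool) - toy_pool_used)) {
        return malloc(nbytes);
    }

    /* Hot path: hand out the next slice of the pool. */
    void *p = toy_pool + toy_pool_used;
    toy_pool_used += size;
    return p;
}
```

When the build is profile-guided (as with --enable-optimizations), the
compiler already has measured branch frequencies to work from, which may
be why the hints make no visible difference in my numbers.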

Regards,

  Neil

+-------------------------+---------+-----------------------------+
| Benchmark               | master  | PR-14674                    |
+=========================+=========+=============================+
| 2to3                    | 305 ms  | 304 ms: 1.00x faster (-0%)  |
+-------------------------+---------+-----------------------------+
| chaos                   | 109 ms  | 110 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| crypto_pyaes            | 118 ms  | 117 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| django_template         | 112 ms  | 114 ms: 1.02x slower (+2%)  |
+-------------------------+---------+-----------------------------+
| fannkuch                | 446 ms  | 440 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| float                   | 119 ms  | 120 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| go                      | 247 ms  | 250 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| json_loads              | 25.1 us | 24.4 us: 1.03x faster (-3%) |
+-------------------------+---------+-----------------------------+
| logging_simple          | 8.86 us | 8.66 us: 1.02x faster (-2%) |
+-------------------------+---------+-----------------------------+
| meteor_contest          | 97.5 ms | 97.7 ms: 1.00x slower (+0%) |
+-------------------------+---------+-----------------------------+
| nbody                   | 140 ms  | 142 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| pathlib                 | 19.2 ms | 18.9 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| pickle                  | 8.95 us | 9.08 us: 1.02x slower (+2%) |
+-------------------------+---------+-----------------------------+
| pickle_dict             | 18.1 us | 18.0 us: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| pickle_list             | 2.75 us | 2.68 us: 1.03x faster (-3%) |
+-------------------------+---------+-----------------------------+
| pidigits                | 182 ms  | 184 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| python_startup          | 7.83 ms | 7.81 ms: 1.00x faster (-0%) |
+-------------------------+---------+-----------------------------+
| python_startup_no_site  | 5.36 ms | 5.36 ms: 1.00x faster (-0%) |
+-------------------------+---------+-----------------------------+
| raytrace                | 495 ms  | 499 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| regex_dna               | 173 ms  | 170 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| regex_effbot            | 2.79 ms | 2.67 ms: 1.05x faster (-4%) |
+-------------------------+---------+-----------------------------+
| regex_v8                | 21.1 ms | 21.2 ms: 1.00x slower (+0%) |
+-------------------------+---------+-----------------------------+
| richards                | 68.2 ms | 68.7 ms: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| scimark_monte_carlo     | 103 ms  | 102 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| scimark_sparse_mat_mult | 4.37 ms | 4.35 ms: 1.00x faster (-0%) |
+-------------------------+---------+-----------------------------+
| spectral_norm           | 132 ms  | 133 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| sqlalchemy_imperative   | 30.3 ms | 30.7 ms: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| sympy_sum               | 88.2 ms | 89.2 ms: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| telco                   | 6.63 ms | 6.58 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| tornado_http            | 178 ms  | 179 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| unpickle                | 12.0 us | 12.4 us: 1.03x slower (+3%) |
+-------------------------+---------+-----------------------------+
| unpickle_list           | 3.93 us | 3.75 us: 1.05x faster (-4%) |
+-------------------------+---------+-----------------------------+

Not significant (25): deltablue; dulwich_log; hexiom; html5lib; json_dumps;
logging_format; logging_silent; mako; nqueens; pickle_pure_python;
regex_compile; scimark_fft; scimark_lu; scimark_sor; sqlalchemy_declarative;
sqlite_synth; sympy_expand; sympy_integrate; sympy_str; unpack_sequence;
unpickle_pure_python; xml_etree_parse; xml_etree_iterparse; xml_etree_generate;
xml_etree_process