STINNER Victor added the comment:

Hi,

I made progress on my FASTCALL branch. I removed the tp_fastnew, tp_fastinit and
tp_fastcall fields from PyTypeObject and replaced them with new type flags (ex:
Py_TPFLAGS_FASTNEW) to avoid code duplication and reduce the memory footprint.
Before, each function was simply duplicated.

This change is backward incompatible: it is no longer possible to call tp_new,
tp_init and tp_call directly. I don't know yet if such a change would be
acceptable in Python 3.6, nor if it is worth it.
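
To illustrate the idea of a flag replacing a duplicated slot, here is a minimal
sketch in Python. It is not the code from the branch. FakeType, call_new and
the two example functions are hypothetical names, the bit value of
TPFLAGS_FASTNEW is invented, and the "fast" argument layout shown is only an
assumption about what a flat-argument convention could look like.

```python
# Hypothetical model of "one slot plus a flag" instead of two duplicated slots.
# TPFLAGS_FASTNEW mirrors the name Py_TPFLAGS_FASTNEW; the bit value is invented.
TPFLAGS_FASTNEW = 1 << 0


class FakeType:
    """Stand-in for PyTypeObject: a single tp_new slot and a flags word."""

    def __init__(self, tp_new, flags=0):
        self.tp_new = tp_new
        self.tp_flags = flags


def call_new(tp, args, kwargs):
    """Call tp_new using the convention announced by the type's flags."""
    if tp.tp_flags & TPFLAGS_FASTNEW:
        # Assumed fast convention: one flat list holding the positional
        # values followed by the keyword values, plus the keyword names.
        kwnames = tuple(kwargs)
        stack = list(args) + [kwargs[name] for name in kwnames]
        return tp.tp_new(stack, len(args), kwnames)
    # Classic convention: an args tuple and a kwargs dict.
    return tp.tp_new(tuple(args), dict(kwargs))


def classic_new(args, kwargs):
    """Example slot function using the classic (tuple, dict) convention."""
    return ("classic", args, kwargs)


def fast_new(stack, nargs, kwnames):
    """Example slot function using the assumed flat convention."""
    return ("fast", stack[:nargs], dict(zip(kwnames, stack[nargs:])))
```

In this model, code that calls tp.tp_new(args, kwargs) directly, without
checking the flag, would pass the wrong arguments to a type that set
TPFLAGS_FASTNEW. That is the kind of backward incompatibility described above.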

I spent a lot of time on the CPython benchmark suite to check for
performance regressions. In fact, I spent most of my time trying to understand
why most benchmarks looked completely unstable. I have now tuned my system
correctly and patched perf.py to get reliable benchmarks.

On the latest run of the benchmark suite, most benchmarks are faster! I have to
investigate why 3 benchmarks are still slower. In the previous run, normal_startup
was not significant, etree_parse was faster (instead of slower), but raytrace was
already slower (but only 1.13x slower). It may be the "noise" of the PGO
compilation. I already noticed that once: see issue #27056 "pickle:
constant propagation in _Unpickler_Read()".

Result of the benchmark suite:

slower (3):

* raytrace: 1.06x slower
* etree_parse: 1.03x slower
* normal_startup: 1.02x slower

faster (18):

* unpickle_list: 1.11x faster
* chameleon_v2: 1.09x faster
* etree_generate: 1.08x faster
* etree_process: 1.08x faster
* mako_v2: 1.06x faster
* call_method_unknown: 1.06x faster
* django_v3: 1.05x faster
* regex_compile: 1.05x faster
* etree_iterparse: 1.05x faster
* fastunpickle: 1.05x faster
* meteor_contest: 1.05x faster
* pickle_dict: 1.05x faster
* float: 1.04x faster
* pathlib: 1.04x faster
* silent_logging: 1.04x faster
* call_method: 1.03x faster
* json_dump_v2: 1.03x faster
* call_simple: 1.03x faster

not significant (21):

* 2to3
* call_method_slots
* chaos
* fannkuch
* fastpickle
* formatted_logging
* go
* json_load
* nbody
* nqueens
* pickle_list
* pidigits
* regex_effbot
* regex_v8
* richards
* simple_logging
* spectral_norm
* startup_nosite
* telco
* tornado_http
* unpack_sequence
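
For readers unfamiliar with the three categories above, the following sketch
shows one way to turn two sets of timings into a label such as "1.06x slower"
or "not significant". It is an illustration, not the code of perf.py: the
compare() function, the use of Welch's t statistic and the threshold of 2.0
are all assumptions made for this example.

```python
import math
import statistics


def compare(base, changed, t_threshold=2.0):
    """Label `changed` timings relative to `base` timings (both in seconds).

    Hypothetical helper: the difference is reported only when Welch's
    t statistic exceeds t_threshold, otherwise it is treated as noise.
    """
    mean_b = statistics.mean(base)
    mean_c = statistics.mean(changed)
    # Standard error of the difference between the two means.
    stderr = math.sqrt(
        statistics.variance(base) / len(base)
        + statistics.variance(changed) / len(changed)
    )
    if stderr == 0:
        t_score = 0.0 if mean_b == mean_c else math.inf
    else:
        t_score = abs(mean_b - mean_c) / stderr
    if t_score < t_threshold:
        return "not significant"
    if mean_c < mean_b:
        return "%.2fx faster" % (mean_b / mean_c)
    return "%.2fx slower" % (mean_c / mean_b)
```

With stable timings, a 6% difference is reported as such. With noisy timings,
a similar difference between the means can fall below the threshold, which is
why an unstable system makes the results hard to interpret.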

I know that my patch is simply giant and cannot be merged as is.

Since the performance is still promising, I plan to split my giant
patch into smaller patches that are easier to review. I will try to check that
individual patches don't make Python slower. This work will take time.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26814>
_______________________________________