STINNER Victor added the comment:

Hi,
I made progress on my FASTCALL branch. I removed the tp_fastnew, tp_fastinit and tp_fastcall fields from PyTypeObject and replaced them with new type flags (ex: Py_TPFLAGS_FASTNEW) to avoid code duplication and reduce the memory footprint. Before, each function was simply duplicated.

This change is backward incompatible: it is no longer possible to call tp_new, tp_init and tp_call directly. I don't know yet if such a change would be acceptable in Python 3.6, nor if it is worth it.

I spent a lot of time on the CPython benchmark suite to check for performance regressions. In fact, I spent most of my time trying to understand why most benchmarks looked completely unstable. I have now tuned my system correctly and patched perf.py to get reliable benchmarks.

On the latest run of the benchmark suite, most benchmarks are faster! I have to investigate why 3 benchmarks are still slower. In a previous run, normal_startup was not significant and etree_parse was faster (instead of slower), but raytrace was already slower (but only 1.13x slower). It may be the "noise" of the PGO compilation. I already noticed that once: see the issue #27056 "pickle: constant propagation in _Unpickler_Read()".
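To picture why replacing the tp_fast* fields with type flags prevents direct calls to tp_new, here is a minimal toy model in Python. It is only a sketch: ToyType, call_tp_new, classic_new and fast_new are hypothetical names, the bit value given to Py_TPFLAGS_FASTNEW is made up, and the "fast" signature (arguments plus a count) is an assumption, not the branch's actual C signature.

```python
# Toy model only: names mirror the message, values and signatures are assumed.
Py_TPFLAGS_FASTNEW = 1 << 0  # hypothetical bit value, not CPython's


class ToyType:
    """Stand-in for PyTypeObject: one tp_new slot plus a flags word."""

    def __init__(self, tp_new, tp_flags=0):
        self.tp_new = tp_new
        self.tp_flags = tp_flags


def classic_new(tp, args, kwargs):
    # Classic convention: positional arguments arrive packed in a tuple.
    return ("classic", args, kwargs)


def fast_new(tp, stack, nargs, kwargs):
    # Assumed fast convention: a sequence of arguments plus their count.
    return ("fast", list(stack[:nargs]), kwargs)


def call_tp_new(tp, args, kwargs=None):
    """Call the single tp_new slot using the convention its flag announces."""
    if tp.tp_flags & Py_TPFLAGS_FASTNEW:
        return tp.tp_new(tp, args, len(args), kwargs)
    return tp.tp_new(tp, tuple(args), kwargs)


classic_type = ToyType(classic_new)
fast_type = ToyType(fast_new, Py_TPFLAGS_FASTNEW)

print(call_tp_new(classic_type, [1, 2]))
print(call_tp_new(fast_type, [1, 2]))
```

In this model the slot holds one function whose calling convention depends on the flag, so a caller that invokes fast_type.tp_new with the classic three arguments gets a TypeError. Only a helper that checks the flag first can call the slot safely.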
Result of the benchmark suite:

slower (3):
* raytrace: 1.06x slower
* etree_parse: 1.03x slower
* normal_startup: 1.02x slower

faster (18):
* unpickle_list: 1.11x faster
* chameleon_v2: 1.09x faster
* etree_generate: 1.08x faster
* etree_process: 1.08x faster
* mako_v2: 1.06x faster
* call_method_unknown: 1.06x faster
* django_v3: 1.05x faster
* regex_compile: 1.05x faster
* etree_iterparse: 1.05x faster
* fastunpickle: 1.05x faster
* meteor_contest: 1.05x faster
* pickle_dict: 1.05x faster
* float: 1.04x faster
* pathlib: 1.04x faster
* silent_logging: 1.04x faster
* call_method: 1.03x faster
* json_dump_v2: 1.03x faster
* call_simple: 1.03x faster

not significant (21):
* 2to3
* call_method_slots
* chaos
* fannkuch
* fastpickle
* formatted_logging
* go
* json_load
* nbody
* nqueens
* pickle_list
* pidigits
* regex_effbot
* regex_v8
* richards
* simple_logging
* spectral_norm
* startup_nosite
* telco
* tornado_http
* unpack_sequence

I know that my patch is simply giant and cannot be merged like that. Since the performance is still promising, I plan to split my giant patch into smaller patches, easier to review. I will try to check that individual patches don't make Python slower. This work will take time.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26814>
_______________________________________