STINNER Victor added the comment:

type-slot-calls.diff: Can you please create a pull request?

> `a + b` still is 25-30% slower than `a.__add__(b)`

Hum, can you please post microbenchmark results to show the effect of the patch?

> After analyzing the article and comparing it with the current code I have
> found that virtually all proposed optimization steps already applied in 3.7
> by Victor! The difference is only in details.

The article has two main points:

* the calling convention of the Python C API requires creating a tuple, and that's expensive
* "a + b" has complex semantics which require checking for __radd__, checking for issubclass(), etc.

Yeah, it seems like the FASTCALL changes I made in typeobject.c removed the overhead of the temporary tuple. Yury's and Naoki's work on CALL_METHOD also improved performance on method calls here.

I don't think that we can change the semantics, only try to optimize the implementation.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30509>
_______________________________________
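
For reference, the extra work that "a + b" does compared with a direct `a.__add__(b)` call can be illustrated with a minimal pure-Python sketch. The class names `Base` and `Derived` and the helper `compare_timings` are hypothetical and are not taken from the patch or from the article. The timing part only shows one way such a microbenchmark could be written; the numbers depend on the machine and the Python version, so none are given here.

```python
import timeit


class Base:
    """Hypothetical class used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if not isinstance(other, Base):
            return NotImplemented
        return ("Base.__add__", self.value + other.value)

    def __radd__(self, other):
        # Reached when the left operand cannot handle the addition,
        # for example `1 + Base(2)`.
        if not isinstance(other, int):
            return NotImplemented
        return ("Base.__radd__", other + self.value)


class Derived(Base):
    def __radd__(self, other):
        # A subclass that overrides the reflected method is tried first
        # when it is the right operand and the left operand is a Base.
        if isinstance(other, Base):
            return ("Derived.__radd__", other.value + self.value)
        if isinstance(other, int):
            return ("Derived.__radd__", other + self.value)
        return NotImplemented


direct = Base(1).__add__(Base(2))      # plain method call, no operator dispatch
same_type = Base(1) + Base(2)          # operator, both operands of one type
reflected = 1 + Base(2)                # int.__add__ returns NotImplemented first
subclass_first = Base(1) + Derived(2)  # Derived.__radd__ runs before Base.__add__


def compare_timings(number=100_000):
    """Time the operator form against the explicit method call."""
    namespace = {"a": Base(1), "b": Base(2)}
    operator_time = timeit.timeit("a + b", globals=namespace, number=number)
    method_time = timeit.timeit("a.__add__(b)", globals=namespace, number=number)
    return operator_time, method_time


if __name__ == "__main__":
    print(direct, same_type, reflected, subclass_first)
    operator_time, method_time = compare_timings()
    print(f"a + b:        {operator_time:.4f} s")
    print(f"a.__add__(b): {method_time:.4f} s")
```

The `subclass_first` case is the issubclass() check mentioned above: the operator has to inspect both operand types before it can decide which method to call, which the explicit `a.__add__(b)` call never does.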