STINNER Victor added the comment:

type-slot-calls.diff: Can you please create a pull request?

> `a + b` still is 25-30% slower than `a.__add__(b)`

Hum, can you please post microbenchmark results to show the effect of the 
patch?
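
A minimal sketch of such a microbenchmark, using only the standard `timeit` module. The `Num` class and the `bench()` helper are hypothetical names chosen for illustration; they are not part of the patch, and the timings will vary by build and machine.

```python
import timeit


class Num:
    """Hypothetical class whose __add__ is implemented in Python."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Num(self.value + other.value)


def bench(stmt, number=200_000, repeat=5):
    """Return the best observed time per execution of stmt, in seconds."""
    a, b = Num(1), Num(2)
    timings = timeit.repeat(
        stmt, globals={"a": a, "b": b}, number=number, repeat=repeat
    )
    # The minimum is the run least disturbed by other system activity.
    return min(timings) / number


if __name__ == "__main__":
    for stmt in ("a + b", "a.__add__(b)"):
        print(f"{stmt:>14}: {bench(stmt) * 1e9:.1f} ns per call")
```

Running the script with and without the patch applied gives the two pairs of numbers to compare.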

> After analyzing the article and comparing it with the current code I have 
> found that virtually all proposed optimization steps already applied in 3.7 
> by Victor! The difference is only in details.

The article has two main points:

* the calling convention of the Python C API requires creating a tuple, and 
that's expensive
* "a + b" has complex semantics, which require checking for __radd__, calling 
issubclass(), etc.
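
To make the second point concrete, here is a simplified pure-Python sketch of the dispatch that `a + b` goes through. The function name `binary_add` is hypothetical, and the sketch deliberately ignores details of the C implementation such as the sequence concatenation slot; it only shows the __radd__ and issubclass() checks.

```python
def binary_add(a, b):
    """Simplified sketch of the dispatch behind `a + b`."""
    type_a, type_b = type(a), type(b)
    forward = getattr(type_a, "__add__", None)
    # The reflected method is only considered when the types differ.
    reflected = None
    if type_b is not type_a:
        reflected = getattr(type_b, "__radd__", None)

    # If b's type is a subclass of a's type and provides its own __radd__,
    # the reflected method gets the first try.
    if (
        reflected is not None
        and issubclass(type_b, type_a)
        and reflected is not getattr(type_a, "__radd__", None)
    ):
        result = reflected(b, a)
        if result is not NotImplemented:
            return result
        reflected = None  # do not call it a second time

    if forward is not None:
        result = forward(a, b)
        if result is not NotImplemented:
            return result

    if reflected is not None:
        result = reflected(b, a)
        if result is not NotImplemented:
            return result

    raise TypeError(
        f"unsupported operand type(s) for +: "
        f"{type_a.__name__!r} and {type_b.__name__!r}"
    )
```

A direct call to `a.__add__(b)` skips all of these checks, which is one reason the two spellings are not equivalent in cost or in behaviour.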

Yeah, it seems like the FASTCALL changes I made in typeobject.c removed the 
overhead of the temporary tuple. Yury's and Naoki's work on CALL_METHOD also 
improved the performance of method calls here.

I don't think that we can change the semantics, only try to optimize the 
implementation.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30509>
_______________________________________