New submission from Serhiy Storchaka:

Peter Cawley's excellent article "Why are slots so slow?" [1] analyses why
`a + b` is slower than `a.__add__(b)` for a custom `__add__` and suggests ways
to optimize type slot calls. `a + b` and `a.__add__(b)` execute the same user
code, and `a + b` should have less bytecode interpretation overhead, yet it was
2 times slower than `a.__add__(b)`. By the end of the article, `a + b` had been
made 16% faster than `a.__add__(b)`.

In 3.7 the difference between the two is smaller, but `a + b` is still 25-30%
slower than `a.__add__(b)`. After analyzing the article and comparing it with
the current code, I found that virtually all of the proposed optimization steps
have already been applied in 3.7 by Victor! The differences are only in the
details.

The proposed patch tweaks the code and makes `a + b` only 12% slower than
`a.__add__(b)`. There is a similar effect for other type slot calls.
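
For anyone who wants to reproduce this kind of comparison, here is a minimal
sketch using `timeit`. It is not the benchmark used for the numbers above: the
class `A`, the helper `measure`, and the iteration counts are assumptions chosen
for illustration, and the ratio will vary with the Python version and build.

```python
import timeit


class A:
    """Hypothetical class with a trivial user-defined __add__."""

    def __add__(self, other):
        return self


def measure(stmt, number=200_000, repeat=5):
    """Return the best observed per-call time (in seconds) for stmt."""
    a, b = A(), A()
    timings = timeit.repeat(
        stmt, globals={"a": a, "b": b}, number=number, repeat=repeat
    )
    # The minimum is the run least disturbed by other system activity.
    return min(timings) / number


if __name__ == "__main__":
    t_operator = measure("a + b")        # goes through the type slot
    t_method = measure("a.__add__(b)")   # explicit method lookup and call
    print(f"a + b:        {t_operator * 1e9:.1f} ns per call")
    print(f"a.__add__(b): {t_method * 1e9:.1f} ns per call")
    print(f"ratio (operator / method): {t_operator / t_method:.2f}")
```

A ratio above 1.0 means the operator form is slower than the explicit method
call on the interpreter being tested.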

[1] https://www.corsix.org/content/why-are-slots-so-slow

----------
components: Interpreter Core
files: type-slot-calls.diff
keywords: patch
messages: 294739
nosy: haypo, pitrou, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Optimize calling type slots
type: performance
versions: Python 3.7
Added file: http://bugs.python.org/file46912/type-slot-calls.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30509>
_______________________________________