STINNER Victor <vstin...@python.org> added the comment:

Raymond:
> PyTuple_Check() got slower across the board.  This is problematic because the 
> principal use case for PyTuple_Check() is as a guard for various fast paths.

Raymond gave more context in this email:
https://mail.python.org/archives/list/python-...@python.org/message/Q3YHYIKNUQH34FDEJRSLUP2MTYELFWY3/

As far as I know, only macOS is impacted by this performance regression (119 
nsec => 152 nsec on a microbenchmark). On Windows and Linux, Python is built 
with LTO and PGO optimizations.

For example, the Python package provided by Fedora 32 is built with GCC 10 
using LTO and PGO. Using GCC 10 and LTO, the PyTuple_Check() code is inlined 
even if PyType_GetFlags() is function call in PyType_HasFeature().

I didn't know that on macOS, LTO was not used. I created bpo-41181 to request 
to enable LTO on the macOS installer builder, but Ned Deily rejected the issue 
(read the issue for the rationale).

I dislike comparing performances when LTO and PGO optimizations are not used. 
LTO enables tons of optimizations and PGO makes benchmark results more 
reproducible.

One idea is to stop running benchmarks on macOS, and at least only run 
benchmarks on binaries built with LTO and PGO.

--

Raymond:
> The direct cause of the degradation is that the inlining of PyType_Check() 
> didn't go so well — commit 509dd90f4684e40af3105dd3e754fa4b9c1530c1.

I'm not sure the performance changed with this commit. IMHO it's more the 
commit 45ec5b99aefa54552947049086e87ec01bc2fc9a: bpo-40170 and GH-19378.

Background behind this change: see PEP 620 for the overall plan, see bpo-39573 
and bpo-40170 to make PyObject and PyTypeObject structures opaque. The idea is 
to slowly move third party extension modules towards the stable ABI (PEP 384). 
Internally, CPython continues to access directly structure members.


> The original unreviewed commit (...)

GH-19378 was reviewed and approved by another core dev.

The impact of performance of these changes was taken in account, and that's why 
it was decided to add a new internal _PyType_HasFeature() function.

It wasn't noticed that PyTuple_Check() calls the public PyType_HasFeature() 
instead of the internal _PyType_HasFeature(). As I wrote, using LTO, the code 
is inlined anyway, so it's the same thing.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39542>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to