[Numpy-discussion] Re: Better compatibility of the Python scientific/data stack with fast Python interpreters

PIERRE AUGIER via NumPy-Discussion Thu, 15 May 2025 06:52:19 -0700

----- Original Message -----
> From: "Stefan Krah" <sk...@bytereef.org>
> To: "numpy-discussion" <numpy-discussion@python.org>
> Sent: Wednesday, May 7, 2025 18:17:38
> Subject: [Numpy-discussion] Re: Better compatibility of the Python 
> scientific/data stack with fast Python interpreters

> On Wed, May 07, 2025 at 03:52:33PM +0200, PIERRE AUGIER via NumPy-Discussion
> wrote:
>> > "If the goal is to run psycopg3 fast in PyPy, there are at least three 
>> > paths:
>> > 
>> >    use a cffi backend: this is likely to be the fastest one
>> >    (in case psycopg uses Cython and doesn't call directly any CPython C 
>> > API): use
>> >    the Cython HPy backend, once it's ready
>> >    (in case psycopg uses the CPython C API directly): rewrite it to use HPy
>> >    instead."
> [cut]
> 
>> Could you explain a bit more? Or provide a link? CFFI with PyPy? or CFFI with
>> CPython?
> 
> Yes: _if_ the above 2022 quote from the psycopg3 issue is still valid,
> and my benchmarks show that the CPython C-API is faster than the PyPy
> CFFI, then (according to the quote), HPy is slower than PyPy CFFI and
> therefore much slower than CPython C-API.
> 
> All of that under the assumption that there are many API calls.
> 
>> Is there a proper reproducible benchmark?
> 
> Here are the benchmarks for _decimal:
> 
> bench.py
> [...]

Interesting thinking and benchmarks.

I don't think PyPy CFFI should be faster than HPy, at least in theory, once 
the implementations of the HPy ABI in PyPy and GraalPy are more mature in 
terms of performance.

The promise of HPy is that you get no performance regression on CPython 
AND you get something really efficient on alternative implementations.

To check that, I worked a bit on benchmarks (Piconumpy benchmarks and HPy 
microbenchmarks). Unfortunately, we are not there yet: the implementation of 
the HPy ABI in PyPy still has a few performance issues, in particular related 
to object creation and deletion.

I can share these results, which are fairly easy to read and interesting:

========== PyPy HPy univ / CPy native (time ratio, smaller is better) ==========
TestModule::test_noargs                       0.50
TestModule::test_onearg_None                  0.60
TestModule::test_onearg_int                   0.65
TestModule::test_varargs                      0.93
TestModule::test_call_with_tuple              2.21
TestModule::test_call_with_tuple_and_dict     1.50
TestModule::test_allocate_int                 0.67
TestModule::test_allocate_tuple               6.33
TestType::test_allocate_obj                   5.89
TestType::test_method_lookup                  0.05
TestType::test_noargs                         0.85
TestType::test_onearg_None                    0.95
TestType::test_onearg_int                     0.97
TestType::test_varargs                        1.18
TestType::test_len                            0.48
TestType::test_getitem                        0.73
TestHeapType::test_allocate_obj_and_survive   5.11
TestHeapType::test_allocate_obj_and_die       4.07
TestHeapType::test_method_lookup              0.05
TestHeapType::test_noargs                     0.84
TestHeapType::test_onearg_None                0.93
TestHeapType::test_onearg_int                 0.96
TestHeapType::test_varargs                    1.20
TestHeapType::test_len                        0.50
TestHeapType::test_getitem                    0.78

Here we compare the time obtained with PyPy and an HPy universal extension 
(written with the HPy API) to the time obtained with CPython (3.12) and a 
standard extension (written with the CPython C API). Each value is the PyPy 
time divided by the CPython time.

Interestingly, most cases are fine, with values smaller than 1, meaning that 
PyPy is faster than what is currently obtained with CPython. This is remarkable 
because it is about interacting with a C extension, with nothing in pure Python!

Unfortunately, there are also a few important cases (mostly related to 
"allocate_obj") which are much slower than with CPython (of the order of 5 
times slower!). I don't know if PyPy can improve or fix this known problem.
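
To make the two observations above easy to check, here is a small sketch that 
only re-reads the table: it counts the cases with a ratio below 1 and lists 
the slowest ones. The 2.0 cut-off for "slow" and the function name are my own 
arbitrary choices for illustration, not part of the benchmark suite.

```python
# Ratios copied verbatim from the table above
# (PyPy HPy universal time / CPython native time, smaller is better).
RATIOS = {
    "TestModule::test_noargs": 0.50,
    "TestModule::test_onearg_None": 0.60,
    "TestModule::test_onearg_int": 0.65,
    "TestModule::test_varargs": 0.93,
    "TestModule::test_call_with_tuple": 2.21,
    "TestModule::test_call_with_tuple_and_dict": 1.50,
    "TestModule::test_allocate_int": 0.67,
    "TestModule::test_allocate_tuple": 6.33,
    "TestType::test_allocate_obj": 5.89,
    "TestType::test_method_lookup": 0.05,
    "TestType::test_noargs": 0.85,
    "TestType::test_onearg_None": 0.95,
    "TestType::test_onearg_int": 0.97,
    "TestType::test_varargs": 1.18,
    "TestType::test_len": 0.48,
    "TestType::test_getitem": 0.73,
    "TestHeapType::test_allocate_obj_and_survive": 5.11,
    "TestHeapType::test_allocate_obj_and_die": 4.07,
    "TestHeapType::test_method_lookup": 0.05,
    "TestHeapType::test_noargs": 0.84,
    "TestHeapType::test_onearg_None": 0.93,
    "TestHeapType::test_onearg_int": 0.96,
    "TestHeapType::test_varargs": 1.20,
    "TestHeapType::test_len": 0.50,
    "TestHeapType::test_getitem": 0.78,
}


def summarize(ratios, slow_threshold=2.0):
    """Return (cases faster on PyPy, cases at or above slow_threshold).

    slow_threshold is an arbitrary illustrative cut-off.
    The slow cases are sorted from slowest to least slow.
    """
    faster = [name for name, ratio in ratios.items() if ratio < 1.0]
    slow = sorted(
        (name for name, ratio in ratios.items() if ratio >= slow_threshold),
        key=ratios.get,
        reverse=True,
    )
    return faster, slow


faster, slow = summarize(RATIOS)
print(f"{len(faster)} of {len(RATIOS)} cases are faster with PyPy + HPy")
for name in slow:
    print(f"{name:45s} {RATIOS[name]:.2f}")
```

With this cut-off, all but one of the listed slow cases are allocation tests.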

Let's stress that these are microbenchmarks focusing on the C API. The numbers 
would not be the same for real-life code, and they would of course strongly 
depend on the time spent in Python and in native code.
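
As a rough illustration of that dependence, here is a toy model (my own 
simplification, not something measured): if a fraction f of the CPython run 
time is spent crossing the C API boundary and the rest in pure Python, the 
overall time ratio is a weighted sum of the two ratios. The function name and 
the pure-Python ratio of 0.2 used below are purely hypothetical; only 5.89 is 
taken from the table (TestType::test_allocate_obj).

```python
def overall_ratio(f_api, r_api, r_python):
    """Toy model of the overall PyPy / CPython time ratio.

    f_api    : fraction of the CPython run time spent in C API calls (0..1)
    r_api    : PyPy / CPython time ratio for that part (e.g. from the table)
    r_python : PyPy / CPython time ratio for the pure Python part
               (hypothetical, it depends entirely on the code)
    """
    if not 0.0 <= f_api <= 1.0:
        raise ValueError("f_api must be between 0 and 1")
    return f_api * r_api + (1.0 - f_api) * r_python


# Hypothetical workload: 10% of the CPython time is object allocation through
# the C API (ratio 5.89 from the table), the rest is pure Python code that we
# assume, for illustration only, runs 5 times faster with PyPy (ratio 0.2).
print(f"{overall_ratio(0.10, 5.89, 0.2):.3f}")

# Same assumptions, but 50% of the CPython time spent in allocations.
print(f"{overall_ratio(0.50, 5.89, 0.2):.3f}")
```

In this model the slow allocation path dominates as soon as the code spends a 
large share of its time creating extension objects, which is why the 
microbenchmark numbers alone do not predict real-life results.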

The result of an HPy transition of the ecosystem would not (yet*) be to 
strongly accelerate code using small objects of types defined in C extensions. 
Rather, it would become more practical to write more things in pure Python and 
to mix pure Python and native extensions, for example objects of Python-defined 
classes with NumPy array attributes. PyPy has this very nice property that 
Python abstractions can be used "for free" (without any cost in terms of 
performance).

* "yet" because there are some plans to improve that.

Pierre