Hi, 

I also have found tremendous benefits from LowLevelCallable in my applications. Here is a small benchmark I gathered : https://gabrielfougeron.github.io/choreo-docs/_build/auto_examples/benchmarks/ODE_ivp_LowLevel_bench.html#sphx-glr-build-auto-examples-benchmarks-ode-ivp-lowlevel-bench-py

I too advocate for their more widespread use, both within SciPy and outside of it. See the thread I initiated a few months back: https://mail.python.org/archives/list/scipy-dev@python.org/thread/Q2EMWF5ECXPBKSHCXVQLFNPRD4RBMZMW/#H3ODMVULTFU7FDTQ2DE74OKALSAUSVQ4

On 15 Feb 2024 at 19:32, d.lenar...@gmail.com wrote:

Hello SciPy Team,

Recently, I came across a blog article titled "Julia: 17X Faster than Python's SciPy", which compared Julia and Python on scientific computing tasks. The author highlighted significant performance gains for Julia over Python's SciPy library, using scipy.optimize.curve_fit as the example.
The article can be found here: https://www.matecdev.com/posts/julia-17x-faster-vs-python-scipy.html

Coincidentally, just a few days ago I discovered LowLevelCallable functions as a way to speed up quadrature operations. I speculated that Python-function call overhead might explain the performance difference described in the article. However, upon further investigation, I realized that LowLevelCallable functions are not currently supported by the optimizers underlying scipy.optimize.curve_fit.
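For reference, this is the kind of call the blog post benchmarks: curve_fit repeatedly evaluates a pure-Python model over the whole data array on each least-squares iteration, so every iteration crosses the Python/C boundary at least once. A minimal sketch (the exponential model, data, and parameter values here are illustrative, not taken from the blog post):

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative exponential decay model; curve_fit calls this Python
# function once per least-squares iteration, vectorized over all x.
def model(x, a, b):
    return a * np.exp(-b * x)

# Synthetic data with known parameters a=2.5, b=1.3 plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 200)
y = model(x, 2.5, 1.3) + 0.01 * rng.standard_normal(x.size)

popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0])
print(popt)
```

Because the model is evaluated on the full array at once, the per-call overhead is amortized over 200 points here; the overhead question becomes more pressing for small data sets or expensive fits.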

In my search for more information on this topic, I came across the GitHub issue "Passing LowLevelCallable to Optimizers" (#11783), where Mr. Gommers of the SciPy team expressed reservations about accepting LowLevelCallable functions as inputs to minimization problems. He noted that Python functions add little overhead there, so the potential benefits are likely to be very small.

I conducted some experiments with scipy.integrate.quad and observed speed-ups of 10-20x when using LowLevelCallable functions instead of Python functions for quadrature. These findings suggest that LowLevelCallable functions can offer substantial performance improvements in at least some scenarios.

This leads me to wonder: are the benefits of LowLevelCallable specific to the scipy.integrate subpackage, or could it be beneficial for the scipy.optimize subpackage as well?
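One way to make that question concrete is to count how often an optimizer crosses back into Python. A small sketch (the quadratic objective is purely illustrative) that counts callback invocations against the optimizer's own nfev:

```python
import numpy as np
from scipy.optimize import minimize

# Count how many times the optimizer calls back into Python.
calls = {'n': 0}

def objective(x):
    calls['n'] += 1
    return (x[0] - 2.0) ** 2 + 10.0 * (x[1] + 1.0) ** 2

res = minimize(objective, [0.0, 0.0], method='Nelder-Mead')
print(res.x, calls['n'], res.nfev)
```

Every one of those res.nfev evaluations pays the Python-call overhead, so whether a LowLevelCallable would help presumably depends on how cheap the objective is relative to that overhead, just as with quad.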

I wanted to share these findings and seek your expert opinion on the matter.

Best regards,
Oyibo

Below is the code of the performance check. I used Numba to convert the Python function into a LowLevelCallable.

# ****************************************************************************
import numpy as np
import numba as nb
import numba.types as nbt
import scipy.integrate as si
from scipy import LowLevelCallable
import timeit

# ****************************************************************************
# Case 1: Integrate with Python function
def py_integrand(t):
    return np.exp(-t) / t**2

result = si.quad(py_integrand, 1, np.inf)
print('Python:')
print('Integral Result:', result[0])
# Integral Result: 0.14849550677592208

# ****************************************************************************
# Case 2: Integrate with LowLevelCallable (use Numba to convert)
llc_integrand = nb.cfunc("float64(float64)")(py_integrand).ctypes
result = si.quad(llc_integrand, 1, np.inf)
print('\nLowLevelCallable:')
print('Integral Result:', result[0])
# Integral Result: 0.14849550677592208

# ****************************************************************************
# Case 3: Integrate with LowLevelCallable for multiple args
#         (A little bit more code but more versatile and faster)
def LowLevelCallable_integrand(integrand_func):
    jitted_function = nb.njit(integrand_func)

    # @cfunc('float64(int32, float64*)')
    @nb.cfunc(nbt.float64(nbt.intc, nbt.CPointer(nbt.float64)))
    def wrapped(n, xx):
        values = nb.carray(xx, n)
        return jitted_function(values)
    return LowLevelCallable(wrapped.ctypes)

@LowLevelCallable_integrand
def llc_integrand_gen(args):
    t = args[0]
    return np.exp(-t) / t**2

result = si.quad(llc_integrand_gen, 1, np.inf)
print('\nLowLevelCallable (generalized, can be used for additional args):')
print('Integral Result:', result[0])
# Integral Result: 0.14849550677592208

# ****************************************************************************
# Measuring execution time
num_repeats = 10_000
num_runs = 10
time_py = np.mean(timeit.repeat(lambda: si.quad(py_integrand, 1, np.inf),
                                repeat=num_repeats, number=num_runs))*1000
time_llc = np.mean(timeit.repeat(lambda: si.quad(llc_integrand, 1, np.inf),
                                 repeat=num_repeats, number=num_runs))*1000
time_llc_gen = np.mean(timeit.repeat(lambda: si.quad(llc_integrand_gen, 1, np.inf),
                                     repeat=num_repeats, number=num_runs))*1000
print(f'\nExecution Time (mean of {num_runs} runs, {num_repeats} repeats):')
print(f'Python Callable:      {time_py:.3f} ms')
print(f'LowLevelCallable:     {time_llc:.3f} ms')
print(f'LowLevelCallable_gen: {time_llc_gen:.3f} ms')
print(f'time_py/time_llc:      {time_py/time_llc:.1f}x')
print(f'time_py/time_llc_gen:  {time_py/time_llc_gen:.1f}x')
# Execution Time (mean of 10 runs, 10000 repeats):
# Python Callable:      1.218 ms
# LowLevelCallable:     0.114 ms
# LowLevelCallable_gen: 0.065 ms
# time_py/time_llc:     10.6x
# time_py/time_llc_gen: 18.7x
_______________________________________________
SciPy-Dev mailing list -- scipy-dev@python.org
To unsubscribe send an email to scipy-dev-le...@python.org
https://mail.python.org/mailman3/lists/scipy-dev.python.org/
Member address: gabriel.fouge...@hotmail.fr

