Hello SciPy Team,
Recently I came across a blog article titled "Julia: 17X Faster than Python's
SciPy", which presents performance benchmarks comparing Julia and Python for
scientific computing tasks. The author reports significant performance gains
for Julia over Python's SciPy library, using scipy.optimize.curve_fit as the
main example.
The article can be found here:
https://www.matecdev.com/posts/julia-17x-faster-vs-python-scipy.html
Coincidentally, just a few days ago I discovered LowLevelCallable functions as
a way to speed up quadrature. I speculated that the overhead of Python
callbacks might explain the performance difference described in that article.
Upon further investigation, however, I realized that LowLevelCallable
functions are not currently accepted by the optimizers behind
scipy.optimize.curve_fit.
While searching for more information on this topic, I came across the GitHub
issue "Passing LowLevelCallable to Optimizers" (#11783), in which Mr. Gommers
from SciPy expressed reservations about accepting LowLevelCallable functions
as inputs to minimization problems in SciPy. He noted that there is not much
overhead in calling Python functions, so the potential benefits are likely to
be very small.
I conducted some experiments with scipy.integrate.quad and observed speed-ups
of roughly 10-20x when using LowLevelCallable functions instead of Python
callbacks for quadrature. These findings suggest that LowLevelCallable
functions can offer substantial performance improvements in certain scenarios.
This leads me to wonder: are the benefits of LowLevelCallable specific to the
scipy.integrate subpackage, or could it be beneficial for the scipy.optimize
subpackage as well?
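One very rough way to probe this, without touching SciPy internals, would be
to compare curve_fit driven by a plain Python model against the same model
compiled with Numba's @njit (the model, data and starting values below are
just an illustration, not a careful benchmark). curve_fit cannot take a
LowLevelCallable, so this does not remove the Python call overhead itself,
but the size of the gap would at least hint at how much time goes into the
model evaluation:

import numpy as np
import numba as nb
from scipy.optimize import curve_fit
import timeit

def model_py(x, a, b):
    return a * np.exp(-b * x)

model_jit = nb.njit(model_py)  # same model body, compiled with Numba

rng = np.random.default_rng(0)
xdata = np.linspace(0.1, 4.0, 50)
ydata = model_py(xdata, 2.5, 1.3) + 0.05 * rng.standard_normal(xdata.size)
model_jit(xdata, 1.0, 1.0)  # trigger JIT compilation before timing

t_py = min(timeit.repeat(lambda: curve_fit(model_py, xdata, ydata, p0=(1.0, 1.0)),
                         repeat=5, number=100))
t_jit = min(timeit.repeat(lambda: curve_fit(model_jit, xdata, ydata, p0=(1.0, 1.0)),
                          repeat=5, number=100))
print(f'plain Python model:  {t_py:.3f} s for 100 fits')
print(f'njit-compiled model: {t_jit:.3f} s for 100 fits')

If the jitted model is noticeably faster end-to-end, a substantial share of
curve_fit's runtime goes into the model evaluation, which is exactly the part
a low-level callback interface would target.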
I wanted to share these findings and seek your expert opinion on the matter.
Best regards,
Oyibo
Below is the code for the performance check. I used Numba to convert the
Python function into a LowLevelCallable.
# ****************************************************************************
import numpy as np
import numba as nb
import numba.types as nbt
import scipy.integrate as si
from scipy import LowLevelCallable
import timeit
# ****************************************************************************
# Case 1: Integrate with Python function
def py_integrand(t):
    return np.exp(-t) / t**2
result = si.quad(py_integrand, 1, np.inf)
print('Python:')
print('Integral Result:', result[0])
# Integral Result: 0.14849550677592208
# ****************************************************************************
# Case 2: Integrate with LowLevelCallable (use Numba to convert)
llc_integrand = nb.cfunc("float64(float64)")(py_integrand).ctypes
result = si.quad(llc_integrand, 1, np.inf)
print('\nLowLevelCallable:')
print('Integral Result:', result[0])
# Integral Result: 0.14849550677592208
# ****************************************************************************
# Case 3: Integrate with LowLevelCallable for multiple args
# (A little bit more code but more versatile and faster)
def LowLevelCallable_integrand(integrand_func):
    # Decorator: njit-compile the integrand and wrap it as a LowLevelCallable
    jitted_function = nb.njit(integrand_func)

    # C signature expected by quad: double func(int n, double *xx)
    @nb.cfunc(nbt.float64(nbt.intc, nbt.CPointer(nbt.float64)))
    def wrapped(n, xx):
        values = nb.carray(xx, n)  # view of x followed by any extra args
        return jitted_function(values)
    return LowLevelCallable(wrapped.ctypes)
@LowLevelCallable_integrand
def llc_integrand_gen(args):
    t = args[0]
    return np.exp(-t) / t**2
result = si.quad(llc_integrand_gen, 1, np.inf)
print('\nLowLevelCallable (generalized, can be used for additional args):')
print('Integral Result:', result[0])
# Integral Result: 0.14849550677592208
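# ****************************************************************************
# Illustrative sketch of how the same wrapper could handle an extra parameter
# (the parameter name and value here are arbitrary, not part of the benchmark):
# quad passes the entries of args after the integration variable, so xx holds
# [t, a] and the decorated function reads them positionally.
@LowLevelCallable_integrand
def llc_integrand_param(args):
    t, a = args[0], args[1]
    return np.exp(-a * t) / t**2
result = si.quad(llc_integrand_param, 1, np.inf, args=(1.0,))
print('\nLowLevelCallable with args=(1.0,):')
print('Integral Result:', result[0])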
# ****************************************************************************
# Measuring execution time
num_repeats = 10_000
num_runs = 10
time_py = np.mean(timeit.repeat(lambda: si.quad(py_integrand, 1, np.inf),
                                repeat=num_repeats, number=num_runs)) * 1000
time_llc = np.mean(timeit.repeat(lambda: si.quad(llc_integrand, 1, np.inf),
                                 repeat=num_repeats, number=num_runs)) * 1000
time_llc_gen = np.mean(timeit.repeat(lambda: si.quad(llc_integrand_gen, 1, np.inf),
                                     repeat=num_repeats, number=num_runs)) * 1000
print(f'\nExecution Time (mean of {num_runs} runs, {num_repeats} repeats):')
print(f'Python Callable: {time_py:.3f} ms')
print(f'LowLevelCallable: {time_llc:.3f} ms')
print(f'LowLevelCallable_gen: {time_llc_gen:.3f} ms')
print(f'time_py/time_llc: {time_py/time_llc:.1f}x')
print(f'time_py/time_llc_gen: {time_py/time_llc_gen:.1f}x')
# Execution Time (mean of 10 runs, 10000 repeats):
# Python Callable: 1.218 ms
# LowLevelCallable: 0.114 ms
# LowLevelCallable_gen: 0.065 ms
# time_py/time_llc: 10.6x
# time_py/time_llc_gen: 18.7x
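# ****************************************************************************
# For context (an extra check, not part of the timings above): the per-call
# overhead multiplies with the number of integrand evaluations QUADPACK makes
# per quad() call, which full_output exposes as 'neval'.
info = si.quad(py_integrand, 1, np.inf, full_output=True)[2]
print(f"\nIntegrand evaluations per quad call: {info['neval']}")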