potiuk commented on PR #59876:
URL: https://github.com/apache/airflow/pull/59876#issuecomment-3698914864
> By the way, with `@cache` I assume execution is slower, as the hash needs to be built and then a lookup made, which is more expensive than a function pointer lookup and jump. Therefore I would not like a `@cache` solution.
I don't think so. Calling a function in Python is generally a very slow
operation. It's not a classic pointer jump: function calls in Python are
handled by the interpreter, not compiled to jumps the way C calls are. The
interpreter has to look up the method to call, create a new frame, push it
onto the "Python stack", and clean the frame up after it returns. This all
happens inside the interpreter; it doesn't even use the processor stack. The
"Python stack" of method frames actually lives in heap memory, not on the
processor stack, so any stack manipulation (calling and returning from a
function) is fairly slow.
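To illustrate the point (my addition, not from the original comment), the stdlib `dis` module shows that even a bare function call compiles to interpreter-level CALL opcodes, each dispatched by the interpreter loop with a new frame pushed for the callee, rather than a direct jump:

```python
import dis


def empty():
    return None


def caller():
    empty()


# Disassemble the caller: the call shows up as a CALL-family opcode
# (CALL_FUNCTION on 3.10, CALL on 3.11+), handled entirely by the
# interpreter, which builds and tears down a frame for `empty`.
dis.dis(caller)
ops = [ins.opname for ins in dis.get_instructions(caller)]
print(ops)
```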
I did some basic micro-benchmarks:
```python
import time
from functools import lru_cache


class MethodBenchmark:
    def __init__(self):
        self.call_count = 0

    def empty_method(self):
        """Empty method without caching"""
        self.call_count += 1
        return None

    @lru_cache(maxsize=128)
    def cached_empty_method(self):
        """Empty method with caching"""
        return None


def benchmark():
    obj = MethodBenchmark()
    iterations = 1_000_000

    # Benchmark non-cached method
    start = time.perf_counter()
    for _ in range(iterations):
        obj.empty_method()
    non_cached_time = time.perf_counter() - start

    # Benchmark cached method
    start = time.perf_counter()
    for _ in range(iterations):
        obj.cached_empty_method()
    cached_time = time.perf_counter() - start

    # Results
    print(f"Non-cached method: {non_cached_time:.6f} seconds")
    print(f"Cached method: {cached_time:.6f} seconds")
    print(f"Speedup: {non_cached_time / cached_time:.2f}x")
    print(f"Non-cached call count: {obj.call_count}")


if __name__ == "__main__":
    benchmark()
```
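One caveat worth noting (my addition, not part of the benchmark above): `lru_cache` on an instance method includes `self` in the cache key, so entries are per-instance and the cache keeps the instance alive. `cache_info()` confirms the repeated calls actually hit the cache:

```python
from functools import lru_cache


class Example:
    @lru_cache(maxsize=128)
    def cached_empty_method(self):
        """self is part of the cache key, so entries are per-instance"""
        return None


obj = Example()
for _ in range(5):
    obj.cached_empty_method()

# One miss (the first call) then four hits for the same instance.
info = Example.cached_empty_method.cache_info()
print(info)  # CacheInfo(hits=4, misses=1, maxsize=128, currsize=1)
```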
Result with Python 3.10:
```
/Users/jarekpotiuk/code/airflow/.venv/bin/python /Users/jarekpotiuk/Library/Application Support/JetBrains/IntelliJIdea2025.3/scratches/scratch_5.py
Non-cached method: 0.026673 seconds
Cached method: 0.026702 seconds
Speedup: 1.00x
Non-cached call count: 1000000
Process finished with exit code 0
```
This means that when you use `@cache`, it **never** runs slower than a plain
method call, and additionally you save all the code executed inside the method.
I modified the non-cached method to do a single `if`:
```python
if self.call_count == 0:
    self.call_count += 1
else:
    self.call_count += 1
```
And here are the results:
```
Non-cached method: 0.032580 seconds
Cached method: 0.026062 seconds
Speedup: 1.25x
Non-cached call count: 1000001
Process finished with exit code 0
```
(1000001 is because the cached call increased it by 1)
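As a side note (my sketch, not part of the original runs), the stdlib `timeit` module is a slightly more robust way to run this kind of micro-benchmark, since it temporarily disables garbage collection while timing:

```python
import timeit
from functools import lru_cache


class Bench:
    def plain(self):
        return None

    @lru_cache(maxsize=128)
    def cached(self):
        return None


obj = Bench()

# timeit turns off garbage collection during the measured loop,
# which removes one common source of noise in tight call loops.
plain_t = timeit.timeit(obj.plain, number=1_000_000)
cached_t = timeit.timeit(obj.cached, number=1_000_000)
print(f"plain:  {plain_t:.4f}s")
print(f"cached: {cached_t:.4f}s")
```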
There are **plenty** of optimisations in Python 3.11 - 3.13 that might skew
this simple example (the JIT and the specializing adaptive interpreter), so I
ran it with Python 3.10.
Similar discussion:
https://stackoverflow.com/questions/14648374/python-function-calls-are-really-slow