potiuk commented on PR #59876:
URL: https://github.com/apache/airflow/pull/59876#issuecomment-3698914864

   > By the way, with `@cache` I assume execution is slower, as the hash needs to be built and then a lookup must be made, which is more expensive than a function pointer lookup and jump. Therefore I would not like a `@cache` solution.
   
   I don't think so. Calling a function in Python is generally a very slow operation. It's not a classic pointer jump. Function calls in Python are done by the interpreter; they do not use jumps the way C programs do. The interpreter has to look up the method to call, create a new frame, push it onto the "Python stack", and clean up the frame after it returns. This is all done "in the interpreter" - it does not even use the processor stack; the "Python stack" for method frames is actually stored in heap memory, not on the processor stack - so any stack manipulation (calling and returning from a function) is kinda slow.
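   You can see that machinery in the bytecode with the stdlib `dis` module. A minimal sketch (the exact opcode names differ between versions - e.g. `CALL_FUNCTION` in 3.10 vs `CALL` in 3.11+ - and the functions here are made up for illustration):
   
   ```python
   import dis
   
   def add(a, b):
       return a + b
   
   def caller():
       return add(1, 2)
   
   # Even this trivial call compiles to several opcodes: load the callable,
   # push the arguments, then a CALL* opcode that makes the interpreter
   # allocate a new frame, execute it, and tear it down on return.
   dis.dis(caller)
   ```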
   
   I did some basic micro-benchmarks:
   
   ```python
   import time
   from functools import lru_cache
   
   class MethodBenchmark:
       def __init__(self):
           self.call_count = 0
   
       def empty_method(self):
           """Empty method without caching"""
           self.call_count += 1
           return None
   
       @lru_cache(maxsize=128)
       def cached_empty_method(self):
           """Empty method with caching"""
           return None
   
   
   def benchmark():
       obj = MethodBenchmark()
       iterations = 1_000_000
   
       # Benchmark non-cached method
       start = time.perf_counter()
       for _ in range(iterations):
           obj.empty_method()
       non_cached_time = time.perf_counter() - start
   
       # Benchmark cached method
       start = time.perf_counter()
       for _ in range(iterations):
           obj.cached_empty_method()
       cached_time = time.perf_counter() - start
   
       # Results
       print(f"Non-cached method: {non_cached_time:.6f} seconds")
       print(f"Cached method: {cached_time:.6f} seconds")
       print(f"Speedup: {non_cached_time / cached_time:.2f}x")
       print(f"Non-cached call count: {obj.call_count}")
   
   
   if __name__ == "__main__":
       benchmark()
   ```
   
   Results with Python 3.10:
   
   ```
   Non-cached method: 0.026673 seconds
   Cached method: 0.026702 seconds
   Speedup: 1.00x
   Non-cached call count: 1000000
   
   Process finished with exit code 0
   ```
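   For what it's worth, the same comparison can be re-run with the stdlib `timeit` module, which avoids the hand-rolled loop overhead (a sketch; the class and absolute numbers here are illustrative, and timings will vary by machine and Python version):
   
   ```python
   import timeit
   from functools import lru_cache
   
   class M:
       def plain(self):
           return None
   
       @lru_cache(maxsize=128)
       def cached(self):
           return None
   
   m = M()
   # timeit calls the bound methods directly, so we measure mostly
   # the call overhead itself, not loop bookkeeping.
   plain_t = timeit.timeit(m.plain, number=1_000_000)
   cached_t = timeit.timeit(m.cached, number=1_000_000)
   print(f"plain:  {plain_t:.6f}s")
   print(f"cached: {cached_t:.6f}s")
   ```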
   
   This means that when you put `@cache` on a method -> it **never** runs slower than the plain method call, and additionally you save on all the code executed inside it.
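   A quick way to see where that saving comes from: with `lru_cache` the method body only executes on a cache miss (a small sketch; the `Expensive` class and its return value are made up for illustration):
   
   ```python
   from functools import lru_cache
   
   class Expensive:
       def __init__(self):
           self.body_runs = 0
   
       @lru_cache(maxsize=128)
       def compute(self):
           # Only executed on a cache miss; later calls return the cached value.
           self.body_runs += 1
           return 42
   
   obj = Expensive()
   for _ in range(1_000):
       obj.compute()
   print(obj.body_runs)                   # 1 - the body ran only once
   print(Expensive.compute.cache_info())  # hits=999, misses=1
   ```
   
   One caveat worth remembering: `lru_cache` on a method keys the cache on `self`, so it keeps a reference to the instance alive for as long as the entry stays in the cache.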
   
   I modified the code of the non-cached method to do a single `if`:
   
   ```python
           if self.call_count == 0:
               self.call_count += 1
           else:
               self.call_count += 1
   ```
   
   And these are the results:
   
   ```
   Non-cached method: 0.032580 seconds
   Cached method: 0.026062 seconds
   Speedup: 1.25x
   Non-cached call count: 1000001
   
   Process finished with exit code 0
   ```
   
   (1000001 is because the cached call increased it by 1)
   
   There are **plenty** of optimisations in Python 3.11 - 3.13 that might skew this simple example (the specializing adaptive interpreter and JIT changes) - so I ran it with Python 3.10.
   
   Similar discussion:  
https://stackoverflow.com/questions/14648374/python-function-calls-are-really-slow
   
   

