GitHub user alkismavridis deleted a comment on the discussion: Thread-Powered 
Local Executor

@tirkarthi thanks for your answer.

The basic motivation is to fix our memory problems. The workers of 
LocalExecutor consume **way too much** memory, which in our case keeps growing 
indefinitely, even after some memory fixes (we run 3.1.1). See 
[this](https://github.com/apache/airflow/issues/56641#issuecomment-3479139319) 
thread for an example.
We even tried reducing the parallelism (i.e. the number of workers) to 16, and 
the memory consumption is still too high.

The "unlimited parallelism" feature of Local Executor could ease our memory 
issues, but [it has been 
removed](https://github.com/apache/airflow/issues/57495#event-20640437774). So 
the 16 workers simply hold on to the memory they allocated, even when they do 
nothing. And, as said, that memory keeps increasing over time, so the 
situation is not really optimal at the moment.
Since all of our operators are extremely lightweight (SSH Operators), I 
thought there should be a way to avoid the resource overhead of the 
LocalExecutor workers. It may very well be that my specific proposal is not 
the right one. Still, I hope there is some way to run 30-40 SSH connections 
without having to allocate multiple GB for them.

I am now trying to understand the functionality and implications of the 
deferring approach you mentioned. It is definitely worth trying, thanks! One 
important requirement, which we need to keep working as it does with the 
SSHOperator, is that the logs of long-running tasks are still collected and 
visible in real time in Airflow. Is this possible with the deferring approach?

GitHub link: 
https://github.com/apache/airflow/discussions/57699#discussioncomment-14855185
