xBis7 commented on PR #54103:
URL: https://github.com/apache/airflow/pull/54103#issuecomment-3301379665

   > it seems to solve the issue that someone on the mailing list had with the 
starvation, however, even here it does not fully do it.
   >
   >
   > It just looks like you are trying to solve a single issue that someone in 
the mailing list was experiencing, which happens to increase the scheduler 
throughput, and it seems like a sub problem of the greater problem of general 
starvation in airflow.
   
   Hi @Nataneljpwd, we haven't experienced any noticeable starvation issues and 
we hadn't been concerned about them until you brought it up on the mailing list. 
From what I see, the scheduler is designed to select tasks in FIFO order while 
prioritizing tasks with a higher priority weight. All tasks run sooner or 
later. I haven't been trying to change the order in which tasks are selected.
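
To make that ordering concrete, here is a toy sketch of it (my own assumption that higher `priority_weight` wins and FIFO breaks ties; the field names are illustrative, not Airflow's schema):

```python
# Toy model of the selection order described above: higher priority_weight
# first, FIFO (earlier queued_at) among equal weights. Not the scheduler's
# actual code.
tasks = [
    {"id": "t1", "priority_weight": 1, "queued_at": 1},
    {"id": "t2", "priority_weight": 5, "queued_at": 3},
    {"id": "t3", "priority_weight": 1, "queued_at": 2},
]
order = sorted(tasks, key=lambda t: (-t["priority_weight"], t["queued_at"]))
print([t["id"] for t in order])  # t2 jumps the queue; t1 and t3 stay FIFO
```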
   
   > If the goal is to increase throughput, why not delegate ALL the work to 
the sql query?
   > 
   > Why shouldn't we add the other checks?
   > Why did you decide specifically to add the max_active_tasks check only? it 
seems weird to me that you chose one limit, did you try other limits first? or 
did you ever try to do it on multiple limits together?
   
   Let me give you a bit of context. We noticed that the scheduler and the 
workers didn't use as much CPU as expected, even when configured to do so. 
After some research, we found a few inefficiencies in the scheduler's queuing 
process and noticed that many iterations did very little work even though 
there was load on the system.
   
   Here is an example. Assume the scheduler can fetch 30 tasks from the query, 
examine them, and queue them; there are 3-5 dags with 100 tasks each; and we 
can run at most 10 tasks per dag_run.
   
   The current query will fetch 30 tasks from the 1st dag, queue 10, and 
discard the rest. It will take a few more iterations to realize that we have 
reached the max tasks for that dag_run and should examine the next in line. 
But if the query takes the limit into account along with the current number of 
active tasks, it will only fetch tasks that can actually be queued and then 
move on to the next dag_run based on FIFO order and priorities. It will keep 
doing that until it reaches 30 tasks, all of which are valid for queuing, in 1 
iteration instead of multiple ones.
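
To make the arithmetic concrete, here is a toy simulation of that pass. The numbers come from the example above (5 dags x 100 tasks, 30 fetched per iteration, max 10 active per dag_run), and I'm assuming no task finishes during the burst; this is not the scheduler's code:

```python
from collections import Counter

# Numbers from the example: 5 dags x 100 tasks, 30 tasks fetched per
# scheduler iteration, max_active_tasks = 10 per dag_run.
DAGS, TASKS_PER_DAG, BATCH, MAX_ACTIVE = 5, 100, 30, 10

# FIFO backlog: all tasks of dag 0 first, then dag 1, and so on.
backlog = [(dag, t) for dag in range(DAGS) for t in range(TASKS_PER_DAG)]
CAPACITY = DAGS * MAX_ACTIVE  # most tasks that can be queued in this burst

def naive_pass():
    """Query ignores max_active_tasks; over-limit rows are fetched anyway
    and discarded in the Python loop."""
    active, iterations = Counter(), 0
    for start in range(0, len(backlog), BATCH):
        iterations += 1
        for dag, _ in backlog[start:start + BATCH]:
            if active[dag] < MAX_ACTIVE:
                active[dag] += 1  # queueable
            # else: fetched, examined, discarded -- wasted work
        if sum(active.values()) == CAPACITY:
            break
    return iterations, sum(active.values())

def limit_aware_pass():
    """Query already skips tasks whose dag_run is at max_active_tasks,
    so every fetched row is queueable."""
    active, iterations = Counter(), 0
    remaining = list(backlog)
    while sum(active.values()) < CAPACITY:
        batch, picked = [], Counter()
        for dag, t in remaining:
            if active[dag] + picked[dag] < MAX_ACTIVE:
                batch.append((dag, t))
                picked[dag] += 1
                if len(batch) == BATCH:
                    break
        if not batch:
            break
        iterations += 1
        active += picked
        remaining = [x for x in remaining if x not in batch]
    return iterations, sum(active.values())

print(naive_pass())        # 14 iterations to queue the 50 queueable tasks
print(limit_aware_pass())  # the same 50 tasks queued in 2 iterations
```

With these numbers the naive pass burns 14 iterations to queue 50 tasks, while the limit-aware pass queues the same 50 in 2.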
   
   > And it seems to increase scheduler throughput, up until the point you have 
mapped tasks, or a lot of tasks, or tasks which belong to a starved pool, where 
you will still choose multiple tasks that will be dropped in the loop, and 
cause more iterations, however, this time the query takes twice the amount of 
time.
   
   If you take a look at the unit tests and the benchmarks, you will see that 
that's not the case. Yes, the query takes twice the time, but there are far 
fewer iterations because the scheduler queues everything the query returns. 
The more tasks there are, the greater the performance improvement.
   
   Additionally, we have tested this in production with a huge load of tasks. 
One of our dags with mapped tasks might end up with 10,000 tasks.
   
   > Task A goes to pool A, 128 mapped tasks, short running
   > Task B goes to pool B 50 mapped tasks
   
   I think here you mean `dag A` and `dag B`.
   
   > Here, the tasks will sort in an alternating pattern, lets suppose Task A 
is first for whatever reason, does not really matter.
   > We will get 10 tasks, 5 from A, 5 from B, run only one from B, and 
suppose, Task A falls because of mapped task count limit, and we only schedule 
1 task.
   > Next run, another task (or two) from Pool B is freed, the same thing 
happens again, we schedule only 1 or 2 tasks, and drop the other selected tasks.
   > 
   > This issue can occur for any other concurrency limit such as 
max_active_tis_per_dag, or pool slot count.
   
   I'm not trying to address that in this PR; those checks will keep working 
exactly as they do in the scheduler today. The point is that we already have 
these limits and we check them after fetching the tasks from the db. If we 
also check them while fetching the tasks from the db, we won't spend time 
examining tasks that can't be queued, and the number of required iterations 
will decrease.
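
As a sketch of what "checking the limit in the query" means, here is a toy SQLite version. The schema and query are simplified stand-ins of my own, not Airflow's actual tables or the PR's SQL:

```python
import sqlite3

# Hypothetical, simplified schema -- not Airflow's real task_instance table.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE task_instance (
        dag_run_id INTEGER, task_id TEXT, state TEXT, priority INTEGER
    )
""")
rows = (
    [(1, f"a{i}", "running", 1) for i in range(3)]     # dag_run 1: 3 active
    + [(1, f"b{i}", "scheduled", 1) for i in range(5)]  # dag_run 1: 5 waiting
    + [(2, f"c{i}", "scheduled", 1) for i in range(6)]  # dag_run 2: 6 waiting
)
con.executemany("INSERT INTO task_instance VALUES (?,?,?,?)", rows)

MAX_ACTIVE_TASKS = 4  # per-dag_run limit for this sketch

# A window function ranks waiting tasks per dag_run; the rank plus the count
# of already-running tasks must stay within the limit, so over-limit rows
# never leave the database and every returned row is queueable.
query = """
WITH running AS (
    SELECT dag_run_id, COUNT(*) AS n
    FROM task_instance WHERE state = 'running'
    GROUP BY dag_run_id
)
SELECT dag_run_id, task_id FROM (
    SELECT ti.dag_run_id, ti.task_id,
           ROW_NUMBER() OVER (
               PARTITION BY ti.dag_run_id
               ORDER BY ti.priority DESC
           ) AS rn,
           COALESCE(r.n, 0) AS active
    FROM task_instance ti
    LEFT JOIN running r ON r.dag_run_id = ti.dag_run_id
    WHERE ti.state = 'scheduled'
)
WHERE rn + active <= ?
LIMIT 30
"""
picked = con.execute(query, (MAX_ACTIVE_TASKS,)).fetchall()
print(picked)  # 1 task from dag_run 1 (3 slots taken), 4 from dag_run 2
```

Here dag_run 1 already has 3 of its 4 slots busy, so the query hands back only 1 of its 5 waiting tasks, plus 4 for the idle dag_run 2 -- nothing fetched has to be discarded afterwards.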
   
   > I just don't think I fully understood the goal of the PR, and so, if you 
coud clarify, it would be appreciated.
   
   I hope all the above makes sense.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
