xBis7 commented on PR #54103: URL: https://github.com/apache/airflow/pull/54103#issuecomment-3301379665
> it seems to solve the issue that someone on the mailing list had with the starvation, however, even here it does not fully do it.
>
> It just looks like you are trying to solve a single issue that someone in the mailing list was experiencing, which happens to increase the scheduler throughput, and it seems like a sub problem of the greater problem of general starvation in airflow.

Hi @Nataneljpwd, we haven't experienced any noticeable starvation issues, and we weren't concerned about them until you brought them up on the mailing list. From what I can see, the scheduler is designed to select tasks in FIFO order while prioritizing tasks with a higher priority weight. Every task runs sooner or later. I haven't been trying to change the order in which tasks are selected.

> If the goal is to increase throughput, why not delegate ALL the work to the sql query?
>
> Why shouldn't we add the other checks?
> Why did you decide specifically to add the max_active_tasks check only? it seems weird to me that you chose one limit, did you try other limits first? or did you ever try to do it on multiple limits together?

Let me give you a bit of context. We noticed that the scheduler and the workers weren't using as much CPU as expected, even when configured to do so. After some research, we found a few inefficiencies in the scheduler's queuing process and noticed that many iterations do very little work even when there is load on the system.

Here is an example. Assume the scheduler can fetch 30 tasks with the query, examine them, and queue them. There are 3-5 dags with 100 tasks each, and we can run at most 10 tasks per dag_run. The current query will fetch 30 tasks from the first dag, queue 10, and discard the rest. It then takes a few more iterations to realize that the dag_run has reached its max active tasks and that it should examine the next one in line.
But if the query takes the limit and the current number of active tasks into consideration, it will fetch only the tasks that can actually be queued, then move on to the next dag_run based on FIFO order and priorities. It keeps doing that until it reaches 30 tasks, all of which are valid for queuing, in a single iteration instead of several.

> And it seems to increase scheduler throughput, up until the point you have mapped tasks, or a lot of tasks, or tasks which belong to a starved pool, where you will still choose multiple tasks that will be dropped in the loop, and cause more iterations, however, this time the query takes twice the amount of time.

If you take a look at the unit tests and the benchmarks, you will see that this isn't the case. Yes, the query takes twice as long, but there are far fewer iterations because everything the query returns actually gets queued. The more tasks there are, the greater the performance improvement. We have also tested this in production under a heavy load of tasks; one of our dags with mapped tasks can end up with 10,000 tasks.

> Task A goes to pool A, 128 mapped tasks, short running
> Task B goes to pool B 50 mapped tasks

I think you mean `dag A` and `dag B` here.

> Here, the tasks will sort in an alternating pattern, lets suppose Task A is first for whatever reason, does not really matter.
> We will get 10 tasks, 5 from A, 5 from B, run only one from B, and suppose, Task A falls because of mapped task count limit, and we only schedule 1 task.
> Next run, another task (or two) from Pool B is freed, the same thing happens again, we schedule only 1 or 2 tasks, and drop the other selected tasks.
>
> This issue can occur for any other concurrency limit such as max_active_tis_per_dag, or pool slot count.

I'm not trying to address this in the current PR; that behavior will keep working exactly as it does in the scheduler today.
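The 30-tasks example above can be sketched as a toy simulation. This is illustrative pseudologic under simplifying assumptions (each iteration's queued tasks are assumed to finish before the next one, and `simulate`, `BATCH_SIZE`, and the 30/10 numbers are just the figures from the example), not the scheduler's actual code:

```python
from collections import defaultdict

BATCH_SIZE = 30          # tasks the query returns per scheduler iteration
MAX_ACTIVE_PER_RUN = 10  # max_active_tasks per dag_run, as in the example


def simulate(num_dags: int, tasks_per_dag: int, limit_aware: bool) -> int:
    """Count scheduler iterations needed to queue every task."""
    pending = {d: tasks_per_dag for d in range(num_dags)}  # FIFO by dag index
    iterations = 0
    while any(pending.values()):
        iterations += 1
        # Simplification: pretend previously queued tasks finished, so each
        # dag_run can again accept up to MAX_ACTIVE_PER_RUN tasks.
        queued = defaultdict(int)
        batch = []
        if limit_aware:
            # Proposed behavior: the query already respects the per-dag_run
            # limit, so it only returns tasks that can actually be queued,
            # walking dags in FIFO order until the batch is full.
            for d in range(num_dags):
                take = min(pending[d], MAX_ACTIVE_PER_RUN, BATCH_SIZE - len(batch))
                batch += [d] * take
                if len(batch) == BATCH_SIZE:
                    break
        else:
            # Current behavior: fill the batch from the first dags that have
            # pending tasks, ignoring the per-dag_run limit entirely.
            for d in range(num_dags):
                take = min(pending[d], BATCH_SIZE - len(batch))
                batch += [d] * take
                if len(batch) == BATCH_SIZE:
                    break
        # Examination loop: queue while the dag_run has capacity, drop the rest.
        for d in batch:
            if queued[d] < MAX_ACTIVE_PER_RUN:
                queued[d] += 1
                pending[d] -= 1
    return iterations


print(simulate(5, 100, limit_aware=False))  # → 42 iterations, most batches half-wasted
print(simulate(5, 100, limit_aware=True))   # → 20 iterations, every batch fully queued
```

With 5 dags of 100 tasks each, the limit-unaware loop repeatedly fetches 30 tasks from the same dag_run only to queue 10 of them, while the limit-aware variant queues everything it fetches.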
The point is that we already have these limits, and we check them after fetching the tasks from the db. If we also check them while fetching the tasks from the db, we won't spend time examining tasks that can't be queued, and the number of required iterations will decrease.

> I just don't think I fully understood the goal of the PR, and so, if you could clarify, it would be appreciated.

I hope all of the above makes sense.
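To make "checking the limit while fetching" concrete, here is a minimal sketch of the idea in plain SQL. The schema and column names (`task_instance`, `dag_run`, `max_active_tasks`) are simplified stand-ins rather than Airflow's actual tables, and a complete version would also cap how many rows each dag_run contributes; the point is only that a correlated subquery can exclude dag_runs that are already at their limit before any rows reach the examination loop:

```python
import sqlite3

# Hypothetical minimal schema; Airflow's real tables are more involved.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dag_run (
    id INTEGER PRIMARY KEY,
    max_active_tasks INTEGER NOT NULL
);
CREATE TABLE task_instance (
    id INTEGER PRIMARY KEY,
    dag_run_id INTEGER NOT NULL REFERENCES dag_run(id),
    state TEXT NOT NULL          -- 'scheduled', 'running', ...
);
""")

# dag_run 1 is full: 2 running tasks with a limit of 2.
# dag_run 2 has capacity: 1 running task with a limit of 3.
conn.execute("INSERT INTO dag_run VALUES (1, 2), (2, 3)")
conn.executemany(
    "INSERT INTO task_instance (dag_run_id, state) VALUES (?, ?)",
    [(1, "running"), (1, "running"), (1, "scheduled"),
     (2, "running"), (2, "scheduled"), (2, "scheduled")],
)

# Limit-aware selection: skip dag_runs that already reached max_active_tasks,
# so every returned row is actually queueable.
rows = conn.execute("""
SELECT ti.id, ti.dag_run_id
FROM task_instance ti
JOIN dag_run dr ON dr.id = ti.dag_run_id
WHERE ti.state = 'scheduled'
  AND (SELECT COUNT(*) FROM task_instance r
       WHERE r.dag_run_id = ti.dag_run_id AND r.state = 'running')
      < dr.max_active_tasks
ORDER BY ti.id
LIMIT 30
""").fetchall()
print(rows)  # → [(5, 2), (6, 2)]: only dag_run 2's scheduled tasks come back
```

The scheduled task in the full dag_run 1 never leaves the database, so the examination loop has nothing to discard.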
