GitHub user matrach added a comment to the discussion: Scheduler performance 
with large number of mapped task instances

Hi,

the Task SDK seems to be quite unrelated to the issue presented here: it's not
about the overhead of running small tasks, but about the scheduler taking 8
seconds to resolve rather simple dependencies among 2000 tasks. If the
scheduler didn't hang here, the performance would be okay-ish, as in the "100x"
case. I would argue that, while handling 100k+ task instances requires a
long-term effort, the current performance bottleneck is a localized issue. Is
there any Airflow 3 effort focused on the scheduler?

In the short term, it looks like the `TriggerRuleDep` implementation is the
culprit, as it wasn't optimized for handling hundreds of objects. The codepath
for deciding whether to schedule a DAG's task instances already has all the
required information, and unless I missed something, it should be possible to
fix the dependency resolution to handle several thousand task instances at
once (that is, until we're bound by the DB). This is why I reported the problem
as a bug.
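
To illustrate the kind of batching meant here, below is a toy sketch. It is not
Airflow code and does not reflect how `TriggerRuleDep` is actually written; the
function names, the `upstreams_of` and `state_of` mappings, and the
"all upstream succeeded" rule are all assumptions made for the example. It only
shows that when many task instances share the same upstream set, the upstream
states can be summarized once instead of once per task instance.

```python
from collections import Counter

# Toy model only: every name below is hypothetical, not an Airflow API.
# upstreams_of: task instance id -> frozenset of upstream task instance ids
# state_of:     task instance id -> state string such as "success" or "failed"


def ready_naive(upstreams_of, state_of):
    """Re-count the upstream states for every candidate task instance."""
    ready = []
    for ti, ups in upstreams_of.items():
        counts = Counter(state_of[u] for u in ups)  # O(len(ups)) per candidate
        if counts["success"] == len(ups):
            ready.append(ti)
    return ready


def ready_batched(upstreams_of, state_of):
    """Evaluate each distinct upstream set once and reuse the result."""
    verdict = {}  # frozenset of upstream ids -> bool
    ready = []
    for ti, ups in upstreams_of.items():
        if ups not in verdict:
            verdict[ups] = all(state_of[u] == "success" for u in ups)
        if verdict[ups]:
            ready.append(ti)
    return ready
```

With N candidates that all depend on the same M upstream task instances, the
first version does roughly N * M state lookups, while the second does about M
lookups plus N dictionary hits.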

As for the effort to schedule millions of small tasks with dependencies, it
seems that deeper changes would be required (judging by what I saw on the
2.10/main branches). I didn't find anything related to scheduling and task
dependencies in the links you've provided. Did I miss something?

GitHub link: 
https://github.com/apache/airflow/discussions/46044#discussioncomment-11976140

