[
https://issues.apache.org/jira/browse/AIRFLOW-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200557#comment-17200557
]
ASF GitHub Bot commented on AIRFLOW-3607:
-----------------------------------------
yuqian90 commented on pull request #11010:
URL: https://github.com/apache/airflow/pull/11010#issuecomment-697067550
> I'm not sure if we want to cherry-pick this fix to 1.10 as 2.0 is closer.
> On the other hand... the job of cherry-picking was done and we just need to
> merge.
Thanks @turbaszek. I cherry-picked this because the scheduler in 1.10.* has
trouble with large DAGs (not that large, just hundreds of tasks in one DAG).
It queries the db too many times and struggles to finish (see
[flamegraph_before](https://raw.githubusercontent.com/yuqian90/airflow/gif_for_demo/airflow/www/static/flamegraph_before.svg)
and
[flamegraph_after](https://raw.githubusercontent.com/yuqian90/airflow/gif_for_demo/airflow/www/static/flamegraph_after.svg)
in the PR description), which caused us to hit #10790 too. So this
cherry-pick is more of a fix than an improvement in some sense.
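
To illustrate the "queries the db too many times" point, here is a minimal sketch, not the actual change in the PR. It assumes a made-up `task_instance` table in an in-memory SQLite database, and the function names (`make_db`, `upstream_counts_per_task`, `upstream_counts_per_run`) are hypothetical; none of this is Airflow's real schema or API. For simplicity it also treats every other task in the run as "upstream". It only contrasts one query per task with one query per DAG run followed by counting in memory.

```python
import sqlite3
from collections import Counter


def make_db(num_tasks):
    """Build an in-memory table standing in for task state (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task_instance (task_id TEXT, run_id TEXT, state TEXT)")
    rows = [
        (f"task_{i}", "run_1", "success" if i % 2 == 0 else "running")
        for i in range(num_tasks)
    ]
    conn.executemany("INSERT INTO task_instance VALUES (?, ?, ?)", rows)
    return conn


def upstream_counts_per_task(conn, run_id, task_ids):
    """Issue one aggregate query per task; query count grows with DAG size."""
    results, queries = {}, 0
    for task_id in task_ids:
        cur = conn.execute(
            "SELECT state, COUNT(*) FROM task_instance "
            "WHERE run_id = ? AND task_id != ? GROUP BY state",
            (run_id, task_id),
        )
        results[task_id] = dict(cur.fetchall())
        queries += 1
    return results, queries


def upstream_counts_per_run(conn, run_id, task_ids):
    """Fetch all states for the run once, then derive per-task counts in memory."""
    cur = conn.execute(
        "SELECT task_id, state FROM task_instance WHERE run_id = ?", (run_id,)
    )
    states = dict(cur.fetchall())
    queries = 1
    totals = Counter(states.values())
    results = {}
    for task_id in task_ids:
        counts = Counter(totals)
        counts[states[task_id]] -= 1  # exclude the task itself
        results[task_id] = {s: c for s, c in counts.items() if c > 0}
    return results, queries


if __name__ == "__main__":
    conn = make_db(300)
    ids = [f"task_{i}" for i in range(300)]
    per_task, n_per_task = upstream_counts_per_task(conn, "run_1", ids)
    per_run, n_per_run = upstream_counts_per_run(conn, "run_1", ids)
    print("same result:", per_task == per_run)
    print("queries issued:", n_per_task, "vs", n_per_run)
```

Both functions return the same counts; the difference is only in how many round trips to the db are needed per DAG run.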
> Decreasing scheduler delay between tasks
> ----------------------------------------
>
> Key: AIRFLOW-3607
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3607
> Project: Apache Airflow
> Issue Type: Improvement
> Components: scheduler
> Affects Versions: 1.10.0, 1.10.1, 1.10.2
> Environment: ubuntu 14.04
> Reporter: Amichai Horvitz
> Assignee: Amichai Horvitz
> Priority: Major
> Fix For: 2.0.0
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> I came across the TODO in airflow/ti_deps/deps/trigger_rule_dep (line 52)
> that says that, instead of running the query for every task, the tasks should
> report to the dagrun. I have a dag with many tasks, and the delay between tasks
> can rise to 10 seconds or more. I have already changed the configuration, added
> processes and memory, checked the code, and done research, profiling and other
> experiments. I hope that this change will drastically reduce the delay.
> I would be happy to discuss this solution, the research and other solutions
> for this issue.
> Thanks