ephraimbuddy opened a new pull request, #40696:
URL: https://github.com/apache/airflow/pull/40696
TI.are_dependencies_met runs over and over, even when nothing has changed
that would allow the check to pass. This causes the scheduler loop to get
slower and slower as more blocked TIs pile up.
This scenario is easy to reproduce with this DAG (courtesy of @rob-1126).
Before running it, enable debug logging:
```
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator


class FailsFirstTimeOperator(BashOperator):
    def execute(self, context):
        # Fail on the first try so the task sits in its retry delay.
        if context["ti"].try_number == 1:
            raise Exception("I fail the first time on purpose to test retry delay")
        print(context["ti"].try_number)
        return super().execute(context)


one_day_of_seconds = 60 * 60 * 24

with DAG(dag_id="waity", schedule_interval=None, start_date=datetime(2021, 1, 1)):
    starting_task = FailsFirstTimeOperator(
        task_id="starting_task",
        retry_delay=one_day_of_seconds,
        retries=1,
        bash_command="echo whee",
    )
    # 1000 downstream tasks, all blocked while starting_task waits to retry.
    for i in range(0, 1 * 1000):
        task = BashOperator(task_id=f"task_{i}", bash_command="sleep 1")
        starting_task >> task
```
Simply trigger several runs of the above DAG (6 dagruns are enough to observe
the delay).
Note that the scheduler loop now takes ~4-6 seconds, and the time grows with
each new waity dagrun.
This commit adds a new column (blocked_by_upstream) to the TaskInstance
table. The column is updated any time a task instance is blocked by an
upstream task instance. This way, we avoid repeating the dependency check for
those task instances.
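To illustrate the idea, here is a minimal, self-contained sketch. It is not the
code in this PR and does not use Airflow: FakeTI, FakeScheduler, loop_once and
on_state_change are hypothetical names, and the rule for clearing the flag
(reset when an upstream changes state) is an assumption made for the example.
Only the flag name blocked_by_upstream comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class FakeTI:
    """Hypothetical stand-in for a task instance (not Airflow's TaskInstance)."""
    task_id: str
    state: str = "scheduled"
    upstream: List["FakeTI"] = field(default_factory=list)
    blocked_by_upstream: bool = False  # mirrors the new column's name


class FakeScheduler:
    """Toy scheduler that counts how often the dependency check runs."""

    def __init__(self):
        self.dep_checks = 0

    def are_dependencies_met(self, ti):
        self.dep_checks += 1
        return all(up.state == "success" for up in ti.upstream)

    def loop_once(self, tis):
        runnable = []
        for ti in tis:
            # Skip TIs already known to be blocked: no repeated check.
            if ti.state != "scheduled" or ti.blocked_by_upstream:
                continue
            if self.are_dependencies_met(ti):
                runnable.append(ti)
            else:
                ti.blocked_by_upstream = True
        return runnable

    def on_state_change(self, ti, new_state, tis):
        # Assumption: an upstream state change clears the flag downstream.
        ti.state = new_state
        for other in tis:
            if any(up is ti for up in other.upstream):
                other.blocked_by_upstream = False


start = FakeTI("starting_task", state="up_for_retry")
tasks = [FakeTI(f"task_{i}", upstream=[start]) for i in range(1000)]
all_tis = [start] + tasks

sched = FakeScheduler()
sched.loop_once(all_tis)  # first pass checks every downstream task once
sched.loop_once(all_tis)  # second pass skips the flagged tasks entirely
checks_while_blocked = sched.dep_checks

sched.on_state_change(start, "success", all_tis)
runnable = sched.loop_once(all_tis)  # flags cleared, so tasks are rechecked
```

In this toy model, the number of dependency checks stays flat across loops
while the upstream task is waiting to retry, instead of growing by one check
per blocked task on every loop.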
closes: https://github.com/apache/airflow/pull/40293