[ https://issues.apache.org/jira/browse/AIRFLOW-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Abhilash Kishore updated AIRFLOW-7090:
--------------------------------------
Description:
The first task of my DAG has `depends_on_past=True` and
`wait_for_downstream=True`. The DAG ran automatically when I turned it `On` and
completed successfully. I then triggered the DAG manually (after the first run
had completed successfully), but this time the first task did not start
running. `Task Instance Details` for this task shows `depends_on_past is true
for this task's DAG, but the previous task instance has not run yet.`
According to [docs|#trigger-rules] about `depends_on_past (boolean)`:
> when set to True, keeps a task from getting triggered if the previous
> schedule for the task hasn’t succeeded.
The first DAG run succeeded, and so did the first instance of the first task.
Why, then, does the second instance of the first task complain that the
`previous task instance has not run yet`?
Relevant parts of my code:
{code:python}
...
args = {
    'owner': 'USC Graduate School',
    'start_date': days_ago(1),
}

dag = DAG(
    dag_id='enrollment_import_poc',
    default_args=args,
    schedule_interval='0 0 * * *',
    dagrun_timeout=timedelta(minutes=60),
    max_active_runs=1,
    template_searchpath=os.environ.get('AIRFLOW_HOME'),
    tags=['uscgradschool'],
)

schools = MsSqlOperator(
    task_id='schools',
    depends_on_past=True,
    wait_for_downstream=True,
    sql=os.path.join("queries", "01_schools.sql"),
    mssql_conn_id="mssql_local",
    autocommit=True,
    dag=dag,
)
...
{code}
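One possible reading of the message above, which this report does not confirm, is that the "previous task instance" is looked up at the previous point on the cron schedule rather than taken from the most recent run. The sketch below is a simplified model in plain Python, not Airflow code: the helper names `previous_daily_schedule` and `depends_on_past_met` are hypothetical, and the dates are assumed purely for illustration.

```python
from datetime import datetime, timedelta


def previous_daily_schedule(execution_date):
    """Return the latest midnight strictly before execution_date.

    Models the cron expression '0 0 * * *' only (assumption).
    """
    midnight = execution_date.replace(hour=0, minute=0, second=0, microsecond=0)
    if midnight == execution_date:
        return midnight - timedelta(days=1)
    return midnight


def depends_on_past_met(execution_date, succeeded_dates):
    """Hypothetical check: the run at the previous schedule point must have succeeded."""
    return previous_daily_schedule(execution_date) in succeeded_dates


# Assumed timeline: one automatic run on the schedule, then a manual trigger.
scheduled_run = datetime(2020, 3, 18, 0, 0)   # execution date of the first run
manual_run = datetime(2020, 3, 19, 10, 30)    # execution date of the manual run
succeeded = {scheduled_run}

print(previous_daily_schedule(manual_run))         # 2020-03-19 00:00:00
print(depends_on_past_met(manual_run, succeeded))  # False
```

Under these assumptions the manual run's previous schedule point is a midnight for which no run exists yet, so the check fails even though the only earlier run succeeded. A run whose execution date falls exactly on the next schedule point would pass the same check.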
> With depends_on_past=True, second instance of task not scheduled even when
> first instance ran successfully
> ----------------------------------------------------------------------------------------------------------
>
> Key: AIRFLOW-7090
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7090
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 1.10.9
> Reporter: Abhilash Kishore
> Priority: Major
> Attachments: 1.png, 2.png, 3.png, 4.png