hugowangler opened a new issue, #27299:
URL: https://github.com/apache/airflow/issues/27299
### Apache Airflow version
2.4.2
### What happened
An `IndexError: list index out of range` is raised when trying to trigger a DAG run
of another DAG using the `TriggerDagRunOperator` with `reset_dag_run=True`.
```
[2022-10-26, 17:13:38 UTC] {taskinstance.py:1165} INFO - Dependencies all
met for <TaskInstance: trigger_example.trigger
manual__2022-10-26T17:13:33+00:00 [queued]>
[2022-10-26, 17:13:38 UTC] {taskinstance.py:1165} INFO - Dependencies all
met for <TaskInstance: trigger_example.trigger
manual__2022-10-26T17:13:33+00:00 [queued]>
[2022-10-26, 17:13:38 UTC] {taskinstance.py:1362} INFO -
--------------------------------------------------------------------------------
[2022-10-26, 17:13:38 UTC] {taskinstance.py:1363} INFO - Starting attempt 1
of 1
[2022-10-26, 17:13:38 UTC] {taskinstance.py:1364} INFO -
--------------------------------------------------------------------------------
[2022-10-26, 17:13:38 UTC] {taskinstance.py:1383} INFO - Executing
<Task(TriggerDagRunOperator): trigger> on 2022-10-26 17:13:33+00:00
[2022-10-26, 17:13:38 UTC] {standard_task_runner.py:55} INFO - Started
process 2181 to run task
[2022-10-26, 17:13:38 UTC] {standard_task_runner.py:82} INFO - Running:
['airflow', 'tasks', 'run', 'trigger_example', 'trigger',
'manual__2022-10-26T17:13:33+00:00', '--job-id', '920', '--raw', '--subdir',
'DAGS_FOLDER/dags/trigger-example-dag.py', '--cfg-path', '/tmp/tmpmg9ay0du']
[2022-10-26, 17:13:38 UTC] {standard_task_runner.py:83} INFO - Job 920:
Subtask trigger
[2022-10-26, 17:13:38 UTC] {task_command.py:376} INFO - Running
<TaskInstance: trigger_example.trigger manual__2022-10-26T17:13:33+00:00
[running]> on host airflow-worker-0.airflow-worker.airflow.svc.cluster.local
[2022-10-26, 17:13:38 UTC] {taskinstance.py:1590} INFO - Exporting the
following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=trigger_example
AIRFLOW_CTX_TASK_ID=trigger
AIRFLOW_CTX_EXECUTION_DATE=2022-10-26T17:13:33+00:00
AIRFLOW_CTX_TRY_NUMBER=1
AIRFLOW_CTX_DAG_RUN_ID=manual__2022-10-26T17:13:33+00:00
[2022-10-26, 17:13:38 UTC] {trigger_dagrun.py:146} INFO - Clearing example
on 2022-10-24T00:00:00+00:00
[2022-10-26, 17:13:38 UTC] {taskinstance.py:1851} ERROR - Task failed with
exception
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/trigger_dagrun.py",
line 136, in execute
dag_run = trigger_dag(
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/api/common/trigger_dag.py",
line 124, in trigger_dag
triggers = _trigger_dag(
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/api/common/trigger_dag.py",
line 78, in _trigger_dag
raise DagRunAlreadyExists(
airflow.exceptions.DagRunAlreadyExists: A Dag Run already exists for dag id
example at 2022-10-24T00:00:00+00:00 with run id
manual__2022-10-24T00:00:00+00:00
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/trigger_dagrun.py",
line 157, in execute
dag_run = DagRun.find(dag_id=dag.dag_id, run_id=run_id)[0]
IndexError: list index out of range
[2022-10-26, 17:13:38 UTC] {taskinstance.py:1401} INFO - Marking task as
FAILED. dag_id=trigger_example, task_id=trigger,
execution_date=20221026T171333, start_date=20221026T171338,
end_date=20221026T171338
[2022-10-26, 17:13:38 UTC] {standard_task_runner.py:100} ERROR - Failed to
execute job 920 for task trigger (list index out of range; 2181)
[2022-10-26, 17:13:38 UTC] {local_task_job.py:164} INFO - Task exited with
return code 1
[2022-10-26, 17:13:38 UTC] {local_task_job.py:273} INFO - 0 downstream tasks
scheduled from follow-on schedule check
```
### What you think should happen instead
The DAG run should be cleared, since a run at the specified `execution_date`
exists. If something else is actually wrong, it should be logged more clearly
so the user understands what is wrong with their DAG.
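Reading the traceback, the `IndexError` looks like a run-id mismatch: the operator builds a `manual__...` run id from the requested `execution_date`, but the run that already exists for that logical date was created by the scheduler via catchup, so its run id has the `scheduled__` prefix, and `DagRun.find(dag_id=..., run_id=run_id)` comes back empty. A minimal, Airflow-free sketch of that suspected lookup (the in-memory `dag_runs` table and the `find` helper are stand-ins, not Airflow's API):

```python
from datetime import datetime, timezone

# Stand-in for the dag_run table: the existing run for "example" at
# 2022-10-24 was created by catchup, so its run_id uses the "scheduled" prefix.
logical_date = datetime(2022, 10, 24, tzinfo=timezone.utc)
dag_runs = [{"dag_id": "example", "run_id": f"scheduled__{logical_date.isoformat()}"}]

def find(dag_id: str, run_id: str) -> list:
    """Simplified stand-in for DagRun.find(dag_id=..., run_id=...)."""
    return [r for r in dag_runs if r["dag_id"] == dag_id and r["run_id"] == run_id]

# The operator generates a *manual* run_id from the requested execution_date...
requested_run_id = f"manual__{logical_date.isoformat()}"

# ...so the post-exception lookup finds nothing, and indexing [0] raises IndexError.
matches = find("example", requested_run_id)
print(matches)  # []
```

If this reading is right, looking the run up by `execution_date` rather than by the freshly generated `run_id` would avoid the crash.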
### How to reproduce
To reproduce, I used the following two DAGs:
# example-dag.py
```python
import pendulum

from airflow.decorators import task, dag
from airflow.operators.bash import BashOperator


@dag(
    dag_id="example",
    schedule="@daily",
    start_date=pendulum.datetime(2022, 10, 24, tz="UTC"),
    catchup=True,
)
def example():
    hello = BashOperator(task_id="hello", bash_command="echo hello")

    @task(task_id="airflow")
    def airflow():
        print("airflow")

    hello >> airflow()


dag = example()
```
# trigger-example-dag.py
```python
import pendulum

from airflow.decorators import dag, task
from airflow.operators.trigger_dagrun import TriggerDagRunOperator


@dag(
    dag_id="trigger_example",
    schedule="@daily",
    start_date=pendulum.datetime(2022, 10, 25, tz="UTC"),
    catchup=False,
)
def trigger_example_dag():
    @task(task_id="dummy")
    def dummy():
        print("dummy")

    retry = TriggerDagRunOperator(
        task_id="trigger",
        trigger_dag_id="example",
        execution_date="20221024",
        reset_dag_run=True,
    )

    dummy() >> retry


dag = trigger_example_dag()
```
# Steps
From the Airflow UI
1. Enable the `example` DAG and let it catchup
2. Enable the `trigger_example` DAG
After this is done, you should see that the `trigger` task in
`trigger_example` fails with the list index out of range exception (see the
stack trace above).
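A possible workaround (untested, based on the run-id mismatch described above): if the existing run was created by the scheduler, explicitly passing a matching `trigger_run_id` to the `TriggerDagRunOperator` instead of an `execution_date` might make the duplicate check and the post-exception lookup refer to the same run. Airflow run ids follow a `<run_type>__<logical date ISO>` pattern, which can be computed like this (the `run_id_for` helper is illustrative, not part of Airflow's API):

```python
from datetime import datetime, timezone

def run_id_for(run_type: str, logical_date: datetime) -> str:
    # Mirrors Airflow's "<run_type>__<logical_date>" run-id convention.
    return f"{run_type}__{logical_date.isoformat()}"

# The run created by catchup for 2022-10-24:
run_id = run_id_for("scheduled", datetime(2022, 10, 24, tzinfo=timezone.utc))
print(run_id)  # scheduled__2022-10-24T00:00:00+00:00
```

A proper fix would still be for the operator to find the existing run by `execution_date` when `reset_dag_run=True`, rather than by a regenerated run id.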
### Operating System
debian 11 bullseye
### Versions of Apache Airflow Providers
_No response_
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
Using the Apache Airflow Helm Chart `1.6.0`, but with the Airflow version
upgraded to `2.4.2`.
Also using a self-deployed Postgres with PgBouncer enabled. The Postgres
deployment has been working as expected.
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)