set92 opened a new issue, #29872: URL: https://github.com/apache/airflow/issues/29872
### Apache Airflow version

Other Airflow 2 version (please specify below)

### What happened

Airflow version: 2.4.2

This looks related to #27614, but I am not sure how to reproduce it. It happened a couple of times last week when we ran a task that only logs the current execution to a Postgres database through a PythonOperator, and today it has happened three more times.

The task runs once and finishes successfully, but then, for a reason I don't understand, it runs again (it has no retries) and fails because of the PK (`Key (execution_id, taskgroup_id)=(2573, task_name) already exists.`). We think it only happens in this task, but since most of the other tasks are inserts into BigQuery, which doesn't have PKs, they could be appending the data twice without us knowing.

To give you a visual representation of the taskgroup itself:

    Few PythonOperators & BranchOperator
          /                            \
    Whitelisting: BranchOperator -- task_insert_pg_log: PythonOperator
          \                            /
    -----------------------------------------------------------

Every time we got the error, it was after some BranchOperator (maybe this operator triggers twice when it shouldn't?). The trigger_rule of task_insert_pg_log is `none_failed_min_one_success`. But even if that were the case, the whitelisting task is basically a ShortCircuitOperator that stops the taskgroup from running, and the upper path was executed, so the lower path shouldn't have run. It could also have happened in other tasks, but since they didn't return an error, we think the problem is only here.

I checked the audit log, but all I can see is that the DAG tried to run this task 4 times, 2 in each scheduler. I don't know why that happens. I checked other tasks, and they mostly have 2 runs, 1 per scheduler, although some have 3.
```
| Id | Dttm | Dag Id | Task Id | Event | Logical Date | Owner | Extra |
|----|------|--------|---------|-------|--------------|-------|-------|
| 2327881 | 2023-03-01, 22:52:16 | generate_database_dag | taskgroup_name.task_insert_pg_log | cli_task_run | | root | {"host_name": "generatedatabasedagtaskgroup-549cbe4e79214e0f91827164fc4657c6", "full_command": "['/opt/username/.venv/bin/airflow', 'tasks', 'run', 'generate_database_dag', 'taskgroup_name.task_insert_pg_log', 'scheduled__2023-02-27T22:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/master_dag_factory_generate_database_dag.py']"} |
| 2327880 | 2023-03-01, 22:52:16 | generate_database_dag | taskgroup_name.task_insert_pg_log | running | 2023-02-27, 22:00:00 | admin | |
| 2327662 | 2023-03-01, 22:50:13 | generate_database_dag | taskgroup_name.task_insert_pg_log | cli_task_run | | root | {"host_name": "generatedatabasedagtaskgroup-549cbe4e79214e0f91827164fc4657c6", "full_command": "['/opt/username/.venv/bin/airflow', 'tasks', 'run', 'generate_database_dag', 'taskgroup_name.task_insert_pg_log', 'scheduled__2023-02-27T22:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/master_dag_factory_generate_database_dag.py']"} |
| 2327641 | 2023-03-01, 22:50:02 | generate_database_dag | taskgroup_name.task_insert_pg_log | success | 2023-02-27, 22:00:00 | admin | |
| 2327633 | 2023-03-01, 22:50:00 | generate_database_dag | taskgroup_name.task_insert_pg_log | cli_task_run | | root | {"host_name": "generatedatabasedagtaskgroup-0c7227b07ac04dadbaae3df6f58b6edb", "full_command": "['/opt/username/.venv/bin/airflow', 'tasks', 'run', 'generate_database_dag', 'taskgroup_name.task_insert_pg_log', 'scheduled__2023-02-27T22:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/master_dag_factory_generate_database_dag.py']"} |
| 2327631 | 2023-03-01, 22:50:00 | generate_database_dag | taskgroup_name.task_insert_pg_log | running | 2023-02-27, 22:00:00 | admin | |
| 2327417 | 2023-03-01, 22:47:44 | generate_database_dag | taskgroup_name.task_insert_pg_log | cli_task_run | | root | {"host_name": "generatedatabasedagtaskgroup-0c7227b07ac04dadbaae3df6f58b6edb", "full_command": "['/opt/username/.venv/bin/airflow', 'tasks', 'run', 'generate_database_dag', 'taskgroup_name.task_insert_pg_log', 'scheduled__2023-02-27T22:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/master_dag_factory_generate_database_dag.py']"} |
```

So I am not sure where else to look for more logs or information to find out why those tasks were triggered.

### What you think should happen instead

I thought that once a task reaches SUCCESS it doesn't try to run again. What is worse is that I don't know where to look or when this is going to happen again. I also don't know whether what you mentioned in #27614 about the listener API will fix things without any further changes, in which case it would be best to upgrade to 2.5.0 (we were waiting for 2.6.0 to upgrade to the new interface), or whether we will need to start adding some extra control at the start of each Operator.

### How to reproduce

I don't know. I would love to get to the root of the problem, to be sure what it is and how I can avoid it, but I don't know which logs to check or where to look for more information.
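To illustrate the kind of "extra control" mentioned above, here is a minimal sketch of making the log insert idempotent, so a duplicate run becomes a no-op instead of a PK error. It does not explain or fix the double trigger itself. The table name `execution_log` and the helper `log_execution` are hypothetical; only the `(execution_id, taskgroup_id)` key comes from the error message. The sketch uses an in-memory SQLite database as a stand-in so it runs anywhere; the same `ON CONFLICT ... DO NOTHING` clause exists in Postgres, where psycopg2 would use `%s` placeholders instead of `?`.

```python
import sqlite3

# Hypothetical table; only the composite key is taken from the error message.
DDL = (
    "CREATE TABLE IF NOT EXISTS execution_log ("
    " execution_id INTEGER NOT NULL,"
    " taskgroup_id TEXT NOT NULL,"
    " PRIMARY KEY (execution_id, taskgroup_id))"
)

# A second insert with the same key is silently skipped instead of raising.
INSERT = (
    "INSERT INTO execution_log (execution_id, taskgroup_id) "
    "VALUES (?, ?) "
    "ON CONFLICT (execution_id, taskgroup_id) DO NOTHING"
)


def log_execution(conn, execution_id, taskgroup_id):
    """Insert the log row once.

    Returns True if a row was inserted, False if the key already existed
    (which would point to a duplicate run of the task).
    """
    cur = conn.execute(INSERT, (execution_id, taskgroup_id))
    conn.commit()
    return cur.rowcount == 1


# Stand-in database for the sketch; the real task writes to Postgres.
conn = sqlite3.connect(":memory:")
conn.execute(DDL)

first = log_execution(conn, 2573, "task_name")   # first run of the task
second = log_execution(conn, 2573, "task_name")  # unexpected second run
```

A `False` return value could be logged as a warning, which would at least record how often the duplicate run happens without failing the task.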
### Operating System

ubuntu 20.04

### Versions of Apache Airflow Providers

- apache-airflow-providers-amazon==6.0.0
- apache-airflow-providers-cncf-kubernetes==4.4.0
- apache-airflow-providers-common-sql==1.2.0
- apache-airflow-providers-ftp==3.1.0
- apache-airflow-providers-google==6.8.0
- apache-airflow-providers-http==4.0.0
- apache-airflow-providers-imap==3.0.0
- apache-airflow-providers-postgres==5.2.2
- apache-airflow-providers-sendgrid==3.0.0
- apache-airflow-providers-slack==6.0.0
- apache-airflow-providers-sqlite==3.2.1

### Deployment

Other 3rd-party Helm chart

### Deployment details

_No response_

### Anything else

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)