amyshields opened a new issue, #41884: URL: https://github.com/apache/airflow/issues/41884
### Apache Airflow version

2.9.3

### If "Other Airflow 2 version" selected, which one?

_No response_

### What happened?

We have seen this issue several times:

1. A task failed.
   Up to 5 minutes go by (the longest wait we have seen).
2. The task itself is marked as `FAILED`.
3. All downstream tasks are marked as `upstream_failed`.

It is important to note that we also see this behaviour when a task succeeds: the new state is not reflected in the Airflow UI or its metadata DB. We validated this by calling Airflow's API to retrieve the task instance, and the state was not reflected there as we would expect.

This exact case happened today (30th Aug) with a 2 minute delay:

1. task_A failed today at 7:22 BST - <img width="1673" alt="Screenshot 2024-08-30 at 09 29 15" src="https://github.com/user-attachments/assets/8af892b7-13c1-40ad-b1cf-972a7c4c8841">
2. One of its downstreams is in a `None` state at 7:23:00am BST <img width="1051" alt="Screenshot 2024-08-30 at 09 24 33" src="https://github.com/user-attachments/assets/84ab7cd9-b0e0-4e10-a6b5-57a60959b89d">
3. Then the downstream is set to an `upstream_failed` state at 7:25am BST <img width="1658" alt="Screenshot 2024-08-30 at 09 31 18" src="https://github.com/user-attachments/assets/7dde2d5b-1374-4753-a3bd-145c9bafcda0">

### What you think should happen instead?

1. A task failed.
   `<Little to no wait>`
2. The task itself is marked as `FAILED`.
3. All downstream tasks are marked as `upstream_failed`.

We do not expect any delay in the task being marked with its appropriate state, nor in the marking of any downstreams.

### How to reproduce

This is hard to reproduce because the metadata DB (task instance table) only ever stores the latest state of a task. To minimize production downtime we immediately retry failed tasks; the retry then succeeds, so the first state is never stored. One option could be to compare the insertion timestamps with the task completion timestamps and look at the delay between them.
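One way the delay could be measured from outside the metadata DB is to poll the task instance through the REST API as soon as the task is known to have finished, and record how long it takes for a terminal state to appear. The following is only a minimal sketch, not a tested reproduction: `task_instance_url`, `measure_state_delay` and the `fetch_state` callable are hypothetical helper names, and the endpoint path assumes the Airflow 2 stable REST API.

```python
import time
from typing import Callable, Optional, Tuple
from urllib.parse import quote

# States treated as "final" for this measurement (an assumption of this sketch).
TERMINAL_STATES = {"success", "failed", "upstream_failed", "skipped"}


def task_instance_url(base_url: str, dag_id: str, dag_run_id: str, task_id: str) -> str:
    """Build the task-instance URL, percent-encoding each path segment."""
    parts = [quote(p, safe="") for p in (dag_id, dag_run_id, task_id)]
    return (
        f"{base_url.rstrip('/')}/api/v1/dags/{parts[0]}"
        f"/dagRuns/{parts[1]}/taskInstances/{parts[2]}"
    )


def measure_state_delay(
    fetch_state: Callable[[], Optional[str]],
    finished_at: float,
    poll_interval: float = 1.0,
    timeout: float = 600.0,
    clock: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[str], Optional[float]]:
    """Poll fetch_state() until it reports a terminal state.

    Returns (state, seconds elapsed since finished_at), or
    (last_seen_state, None) if no terminal state shows up before the timeout.
    """
    deadline = clock() + timeout
    while True:
        state = fetch_state()
        now = clock()
        if state in TERMINAL_STATES:
            return state, now - finished_at
        if now >= deadline:
            return state, None
        sleep(poll_interval)
```

Here `fetch_state` would wrap an authenticated GET on the URL from `task_instance_url` and return the `state` field of the JSON response; how to authenticate depends on the deployment. `finished_at` has to come from somewhere other than Airflow, for example a timestamp logged by the task itself.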
### Operating System

linux/arm64

### Versions of Apache Airflow Providers

_No response_

### Deployment

Other Docker-based deployment

### Deployment details

We use this docker image: apache/airflow:2.9.3-python3.9

### Anything else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
