cedric-fauth opened a new issue, #63183:
URL: https://github.com/apache/airflow/issues/63183

   ### Apache Airflow version
   
   Other Airflow 3 version (please specify below)
   
   ### If "Other Airflow 3 version" selected, which one?
   
   3.1.6
   
   ### What happened?
   
   Many of my Airflow DAGs are failing because some tasks are killed and marked 
as failed even though they completed successfully. This seems to happen at 
random, but multiple DAGs often fail at around the same time. 
   
   ### Behavior:
   Just after a task logs its last message and returns, an internal error 
involving the Airflow SDK occurs:
   
   ```
   {"logger": "airflow.task.operators.dags.custom_operators.DjangoOperator", 
"filename": "python.py", "lineno": 216, "event": "Done. Returned value was: 
None", "level": "info"}
   {"logger": "task", "filename": "task_runner.py", "lineno": 1562, 
"error_detail": [{"exc_type": "AirflowRuntimeError", "exc_value": 
"API_SERVER_ERROR: {'status_code': 409, 'message': 'Server returned error', 
'detail': {'detail': {'reason': 'invalid_state', 'message': 'TI was not in the 
running state so it cannot be updated', 'previous_state': 'success'}}}", 
"exc_notes": [], "syntax_error": null, "is_cause": false, "frames": 
[{"filename": 
"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py",
 "lineno": 1555, "name": "main"}, {"filename": 
"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py",
 "lineno": 1083, "name": "run"}, {"filename": 
"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/comms.py", 
"lineno": 206, "name": "send"}, {"filename": 
"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/comms.py", 
"lineno": 270, "name": "_get_response"}, {"filename": 
"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/comms.py",
 "lineno": 257, "name": "_from_frame"}], 
"is_group": false, "exceptions": []}], "event": "Top level error", "level": 
"error"}
   {"exit_code": 1, "event": "Process exited abnormally", "level": "warning", 
"logger": "task"}
   ```
   
   
[{\"exc_notes\":[],\"exc_type\":\"AirflowRuntimeError\",\"exc_value\":\"API_SERVER_ERROR:
 {'status_code': 409, 'message': 'Server returned error', 'detail': {'detail': 
{'reason': 'invalid_state', 'message': 'TI was not in the running state so it 
cannot be updated', 'previous_state': 
'success'}}}\",\"exceptions\":[],\"frames\":[{\"filename\":\"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py\",\"lineno\":1555,\"name\":\"main\"},{\"filename\":\"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py\",\"lineno\":1083,\"name\":\"run\"},{\"filename\":\"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/comms.py\",\"lineno\":206,\"name\":\"send\"},{\"filename\":\"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/comms.py\",\"lineno\":270,\"name\":\"_get_response\"},{\"filename\":\"/bin/app/lib/python3.12/site-packages/airflow/sdk/execution_time/comms.py\",\"lineno\":257,\"name\":\"_from_frame\"}],\"is_cause\":false,\"is_group\":false,\"syntax_error\":\"\\u003cnil\\u003e\"}]
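
For triage, the nested fields of this error can be pulled out of the JSON task log lines. The following is only a minimal sketch and not part of Airflow: `extract_conflicts` is a hypothetical helper, and it assumes each log line is a JSON object whose `exc_value` consists of the `API_SERVER_ERROR: ` prefix followed by a Python dict literal, as in the first log excerpt above.

```python
import ast
import json

PREFIX = "API_SERVER_ERROR: "


def extract_conflicts(log_line: str) -> list[tuple]:
    """Return (status_code, reason, previous_state) for each API_SERVER_ERROR
    entry found in one JSON-formatted task log line (hypothetical helper)."""
    record = json.loads(log_line)
    conflicts = []
    for err in record.get("error_detail", []):
        value = err.get("exc_value", "")
        if not value.startswith(PREFIX):
            continue
        # Assumption: the text after the prefix is a Python dict repr
        # (single-quoted keys), so ast.literal_eval is used instead of json.loads.
        payload = ast.literal_eval(value[len(PREFIX):])
        detail = payload.get("detail", {}).get("detail", {})
        conflicts.append(
            (payload.get("status_code"), detail.get("reason"), detail.get("previous_state"))
        )
    return conflicts


# Shortened version of the "Top level error" record from the log excerpt.
sample_line = json.dumps(
    {
        "logger": "task",
        "event": "Top level error",
        "level": "error",
        "error_detail": [
            {
                "exc_type": "AirflowRuntimeError",
                "exc_value": (
                    "API_SERVER_ERROR: {'status_code': 409, "
                    "'message': 'Server returned error', "
                    "'detail': {'detail': {'reason': 'invalid_state', "
                    "'message': 'TI was not in the running state so it cannot be updated', "
                    "'previous_state': 'success'}}}"
                ),
            }
        ],
    }
)

print(extract_conflicts(sample_line))
```

Running this over the logs of all failed task instances would show whether every failure carries `previous_state` of `success`, which would support the double-update reading described below.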
   
   ### What you think should happen instead?
   
   After the task ends, it should update its state to success, and that update 
should not result in an error. It looks as if Airflow has already set the 
state and then tries to set it again even though the task had already 
completed.
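
To make the expectation concrete, here is a toy model of the behavior I would expect. It is a sketch under my own assumptions and is not Airflow's implementation: `TaskInstance`, `finish`, and `InvalidStateError` are hypothetical names. The idea is that a repeated report of the same terminal state is treated as a no-op, while a conflicting report is still rejected.

```python
from dataclasses import dataclass

TERMINAL_STATES = {"success", "failed"}


class InvalidStateError(Exception):
    """Raised when a state update conflicts with the recorded state."""


@dataclass
class TaskInstance:
    # Hypothetical stand-in for a task instance record; starts out running.
    state: str = "running"


def finish(ti: TaskInstance, new_state: str) -> bool:
    """Apply a terminal state update.

    Returns True if the state changed, False if the update repeated the
    already recorded terminal state (treated as a no-op).
    """
    if new_state not in TERMINAL_STATES:
        raise ValueError(f"{new_state!r} is not a terminal state")
    if ti.state == "running":
        ti.state = new_state
        return True
    if ti.state == new_state:
        # Duplicate report of the same outcome: accept it silently.
        return False
    # A different outcome than the recorded one is a genuine conflict.
    raise InvalidStateError(
        f"TI is in state {ti.state!r}, cannot move to {new_state!r}"
    )


ti = TaskInstance()
first = finish(ti, "success")   # first report changes the state
second = finish(ti, "success")  # repeated report is a no-op
print(first, second, ti.state)
```

Whether the real API server should accept duplicate updates like this is a design question for the maintainers; the sketch only shows the outcome I expected to see.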
   
   ### How to reproduce
   
   I don't know how to reproduce this, but there are similar issues that 
cover different cases of `TI was not in the running state so it cannot be 
updated`.
   
   ### Operating System
   
   WSL
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow[celery, redis, amazon, postgres, docker, fab]==3.1.6
   apache-airflow-providers-fab==3.1.2
   
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   Deployed on ECS using a Fargate cluster. There is one ECS task per component 
(worker, scheduler, etc.), and each ECS task runs a single Airflow container 
plus additional logging sidecar containers.
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
