emmasone opened a new issue #19254:
URL: https://github.com/apache/airflow/issues/19254
### Description
Airflow sometimes misreports the status of a Glue task. Occasionally MWAA does
not pick up that a re-run of a failed task in Glue has completed successfully.
Instead it marks the task as failed and tries re-running it.
This was reported to AWS and below is a snippet of their response:
"I would like to inform you that, I was able to replicate the same issue
using the local runner[1] so the issue is not due to the MWAA environment. It
seems like an issue with the GlueOperator[2] used in the airflow and hence, our
internal service team suggest you to create a GitHub issue for open source
airflow GlueOperator."
### Use case/motivation
We would like Airflow to capture the status of a Glue job accurately. When a
Glue job fails on its first run and succeeds on a later retry, Airflow should
record the rerun as succeeded, matching Glue. At present it sometimes records
the rerun as failed and keeps retrying until the retries are exhausted.
NB: We have seen this issue only on very rare occasions. Airflow mostly
captures the status of a Glue job accurately (even in the case of retries), but
not 100% of the time. We are asking that Airflow capture the status of a Glue
job accurately 100% of the time.
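
To illustrate the behaviour we expect, here is a minimal sketch of polling one
specific Glue job run until it reaches a terminal state. It is not the
GlueOperator's actual code and does not claim to identify the cause of the bug.
`wait_for_glue_run` and its parameters are hypothetical names; `glue_client` is
assumed to behave like `boto3.client("glue")`, whose `get_job_run` call reports
the run state in `JobRun.JobRunState`.

```python
import time

# Terminal states that Glue's GetJobRun API can report in JobRun.JobRunState.
SUCCESS_STATES = {"SUCCEEDED"}
FAILURE_STATES = {"FAILED", "STOPPED", "TIMEOUT", "ERROR"}


def wait_for_glue_run(glue_client, job_name, run_id, poll_seconds=10, sleep=time.sleep):
    """Poll a single Glue job run until it succeeds or fails.

    Hypothetical helper, not part of Airflow. `glue_client` is assumed to
    expose get_job_run(JobName=..., RunId=...) like boto3's Glue client.
    """
    while True:
        response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
        state = response["JobRun"]["JobRunState"]
        if state in SUCCESS_STATES:
            return state
        if state in FAILURE_STATES:
            raise RuntimeError(
                f"Glue job {job_name} run {run_id} ended in state {state}"
            )
        # Any other state (for example STARTING or RUNNING) means keep waiting.
        sleep(poll_seconds)
```

The point of the sketch is that the status is always looked up for the run ID
of the current attempt, so a successful rerun would be reported as
`SUCCEEDED` regardless of how an earlier run ended.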
### Related issues
_No response_
### Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)