kristopherkane opened a new issue, #33667: URL: https://github.com/apache/airflow/issues/33667
### Description

Google Cloud Dataproc cluster creation should eagerly delete clusters that come up in the ERROR state.

It is possible for a Google Cloud Dataproc cluster to be created in the ERROR state. The current operator (`DataprocCreateClusterOperator`) requires three task attempts in total (the original plus two retries) to create a usable cluster, assuming the underlying GCE infrastructure problem resolves itself between task attempts. This can be reduced to two attempts by eagerly deleting a cluster in the ERROR state before failing the current task attempt. Clusters in the ERROR state cannot be used to submit Dataproc jobs via the Dataproc API.

### Use case/motivation

Reducing the number of task attempts can reduce GCP costs, since the delay between retry attempts can be minutes long. There is no reason to keep a running, costly cluster in the ERROR state if that state can be detected in the initial create task.

### Related issues

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
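For illustration, the eager-delete behaviour proposed in the Description could look roughly like the sketch below. This is not the operator's code and does not use the real Dataproc client: `FakeDataprocClient`, `ClusterState`, `ClusterInErrorState`, `create_cluster_with_eager_cleanup` and `run_with_retries` are all hypothetical names, and the in-memory client only simulates a cluster that comes up in ERROR on the first attempt.

```python
from enum import Enum


class ClusterState(Enum):
    """Hypothetical subset of cluster states, for this sketch only."""
    RUNNING = "RUNNING"
    ERROR = "ERROR"


class ClusterInErrorState(Exception):
    """Raised when a newly created cluster comes up in the ERROR state."""


class FakeDataprocClient:
    """Hypothetical in-memory stand-in for a Dataproc client (not the real API)."""

    def __init__(self, states_on_create):
        # State that each successive create_cluster() call will produce.
        self._states_on_create = list(states_on_create)
        self.clusters = {}

    def create_cluster(self, name):
        state = self._states_on_create.pop(0)
        self.clusters[name] = state
        return state

    def delete_cluster(self, name):
        del self.clusters[name]


def create_cluster_with_eager_cleanup(client, name):
    """Create a cluster; if it lands in ERROR, delete it before failing."""
    state = client.create_cluster(name)
    if state is ClusterState.ERROR:
        # Eagerly remove the unusable cluster so the next attempt starts
        # clean and no ERROR-state cluster is left running between retries.
        client.delete_cluster(name)
        raise ClusterInErrorState(f"cluster {name!r} was created in ERROR state")
    return state


def run_with_retries(client, name, max_attempts=3):
    """Simulate task attempts; return how many attempts were needed."""
    for attempt in range(1, max_attempts + 1):
        try:
            create_cluster_with_eager_cleanup(client, name)
            return attempt
        except ClusterInErrorState:
            continue
    raise RuntimeError(f"cluster {name!r} not created after {max_attempts} attempts")
```

Under these assumptions, a cluster that fails once and then succeeds is ready after two attempts, and no ERROR-state cluster exists while the task waits for its retry.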
