Aaron Dossett created AIRFLOW-3149:
--------------------------------------
Summary: GCP dataproc cluster creation should have the option to
delete an ERROR cluster
Key: AIRFLOW-3149
URL: https://issues.apache.org/jira/browse/AIRFLOW-3149
Project: Apache Airflow
Issue Type: Improvement
Components: gcp
Affects Versions: 1.10.0
Reporter: Aaron Dossett
Assignee: Aaron Dossett
We sometimes encounter issues where a Dataproc cluster creation ends up in
ERROR state. That is, the cluster “exists” but is in the ERROR state[1] (as
opposed to the cluster creation API call itself failing). This makes retries
impossible: because the cluster name already exists, subsequent retried
creations are guaranteed to fail.
A `delete_cluster_on_error` parameter should be added to
`DataprocClusterCreateOperator` to control whether an attempt is made to
delete a cluster that ends up in the ERROR state.
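A minimal sketch of the intended behavior, to make the proposal concrete. It
does not use the real Airflow or Dataproc APIs: `FakeDataprocClient`,
`create_cluster` and `ClusterInErrorState` are hypothetical names invented for
this illustration, and defaulting `delete_cluster_on_error` to `True` is an
assumption, not something decided in this issue.

```python
class ClusterInErrorState(Exception):
    """Raised when a cluster was created but ended up in the ERROR state."""


class FakeDataprocClient(object):
    """In-memory stand-in for the Dataproc API (hypothetical, illustration only)."""

    def __init__(self, fail_first_n=0):
        self.clusters = {}                # cluster name -> state string
        self.fail_first_n = fail_first_n  # number of creations that end in ERROR

    def get_state(self, name):
        # Returns None when no cluster with this name exists.
        return self.clusters.get(name)

    def create(self, name):
        # Mirrors the problem described above: a name that already exists
        # (even in ERROR) cannot be created again.
        if name in self.clusters:
            raise ValueError("cluster {} already exists".format(name))
        if self.fail_first_n > 0:
            self.fail_first_n -= 1
            self.clusters[name] = "ERROR"
        else:
            self.clusters[name] = "RUNNING"
        return self.clusters[name]

    def delete(self, name):
        self.clusters.pop(name, None)


def create_cluster(client, name, delete_cluster_on_error=True):
    """Create a cluster; optionally delete it if it lands in ERROR.

    The task still fails in either case, but with the flag set the cluster
    name is freed so that a retry can succeed.
    """
    state = client.create(name)
    if state == "ERROR":
        if delete_cluster_on_error:
            client.delete(name)  # free the name for the next retry
        raise ClusterInErrorState(name)
    return state
```

With the flag enabled, a first attempt that ends in ERROR leaves no cluster
behind, so a second call with the same name can succeed. With the flag
disabled, the ERROR cluster remains and the retry fails with "already exists".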
[1] - I’ve seen that happen in two ways: 1) a purely transient error from GCP
(`Internal server error` or the like); 2) when the request is rejected because
it would exceed the project quota.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)