Aaron Dossett created AIRFLOW-3149:
--------------------------------------

             Summary: GCP dataproc cluster creation should have the option to 
delete an ERROR cluster
                 Key: AIRFLOW-3149
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3149
             Project: Apache Airflow
          Issue Type: Improvement
          Components: gcp
    Affects Versions: 1.10.0
            Reporter: Aaron Dossett
            Assignee: Aaron Dossett


We sometimes encounter cases where a Dataproc cluster creation ends up in the 
ERROR state. That is, the cluster “exists” but in the ERROR state[1] (not 
just that the cluster creation API call failed). This makes retries impossible: 
because the cluster name already exists, subsequent retried creations are 
guaranteed to fail. 

A `delete_cluster_on_error` parameter should be added to 
`DataprocClusterCreateOperator` to control whether an attempt is made to 
delete a cluster that ends up in the ERROR state.
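
A minimal sketch of the behaviour I have in mind, not the operator's actual 
code. `FakeDataprocClient`, `ClusterCreateError` and `create_cluster` are 
hypothetical names used only for illustration; the in-memory client stands in 
for the real Dataproc API, and only the `delete_cluster_on_error` flag comes 
from this proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class ClusterCreateError(Exception):
    """Raised when a cluster cannot be created or ends up in ERROR."""


@dataclass
class FakeDataprocClient:
    """Hypothetical in-memory stand-in for the Dataproc API (assumption)."""
    clusters: Dict[str, str] = field(default_factory=dict)
    fail_next_create: bool = False  # simulate a transient GCP error or quota rejection

    def get_state(self, name: str) -> Optional[str]:
        return self.clusters.get(name)

    def create(self, name: str) -> str:
        # A name that is already taken always fails, even if that cluster is in ERROR.
        if name in self.clusters:
            raise ClusterCreateError(f"cluster {name!r} already exists")
        if self.fail_next_create:
            self.fail_next_create = False
            self.clusters[name] = "ERROR"
        else:
            self.clusters[name] = "RUNNING"
        return self.clusters[name]

    def delete(self, name: str) -> None:
        self.clusters.pop(name, None)


def create_cluster(client: FakeDataprocClient, name: str,
                   delete_cluster_on_error: bool = True) -> str:
    """Create a cluster; optionally delete it if it lands in ERROR."""
    state = client.create(name)
    if state == "ERROR":
        if delete_cluster_on_error:
            # Free the cluster name so that a retried creation can succeed.
            client.delete(name)
        raise ClusterCreateError(f"cluster {name!r} entered ERROR state")
    return state
```

With the flag enabled, the failed attempt still raises (so the task is marked 
failed and retried), but the retry no longer collides with the leftover ERROR 
cluster. With the flag disabled, the current behaviour is preserved.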

 

[1] - I’ve seen that happen in two ways: 1) a purely transient error from GCP, 
such as `Internal server error`, or 2) when the request is rejected because it 
would exceed the project quota.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
