syedahsn commented on code in PR #27276:
URL: https://github.com/apache/airflow/pull/27276#discussion_r1011053597


##########
airflow/providers/amazon/aws/operators/redshift_cluster.py:
##########
@@ -392,20 +393,31 @@ def __init__(
         *,
         cluster_identifier: str,
         aws_conn_id: str = "aws_default",
+        attempts: int = 1,
+        attempt_interval: int = 30,
         **kwargs,
     ):
         super().__init__(**kwargs)
         self.cluster_identifier = cluster_identifier
         self.aws_conn_id = aws_conn_id
+        self.attempts = attempts
+        self.attempt_interval = attempt_interval
 
     def execute(self, context: Context):
         redshift_hook = RedshiftHook(aws_conn_id=self.aws_conn_id)
-        cluster_state = 
redshift_hook.cluster_status(cluster_identifier=self.cluster_identifier)
-        if cluster_state == "paused":
-            self.log.info("Starting Redshift cluster %s", 
self.cluster_identifier)
-            
redshift_hook.get_conn().resume_cluster(ClusterIdentifier=self.cluster_identifier)
-        else:
-            raise Exception(f"Unable to resume cluster - cluster state is 
{cluster_state}")
+
+        while self.attempts >= 1:
+            try:
+                
redshift_hook.get_conn().resume_cluster(ClusterIdentifier=self.cluster_identifier)
+                return
+            except 
redshift_hook.get_conn().exceptions.InvalidClusterStateFault as error:
+                self.attempts = self.attempts - 1
+
+                if self.attempts > 0:
+                    self.log.error("Unable to resume cluster. %d attempts 
remaining.", self.attempts)
+                    time.sleep(self.attempt_interval)
+                else:
+                    raise error

Review Comment:
   The reason the retries are necessary is because it's not clear why the 
cluster initially fails to resume (or pause). Despite the state of the cluster 
being in the correct state to perform the operation, the first few attempts 
fail (depending on the `attempt_interval`). 
   You have a good point about the behaviour of `retries` and `retry_delay`. 
Should I just use those, and remove `attempts` and `attempt_interval`?
   > If we must have it lets not expose the parameters to the users. I'm not 
sure if we want each indvidual operator to set it's own user facing logic for 
retry.
   
   I'm not sure what you mean by this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to