alessandrolacorte commented on code in PR #26932:
URL: https://github.com/apache/airflow/pull/26932#discussion_r990751333


##########
airflow/providers/common/sql/operators/sql.py:
##########
@@ -250,7 +250,7 @@ def execute(self, context: Context):
             records = hook.get_first(self.sql)
 
             if not records:
-                raise AirflowException(f"The following query returned zero 
rows: {self.sql}")
+                raise AirflowFailException(f"The following query returned zero 
rows: {self.sql}")

Review Comment:
   Hello @eladkal 
   Is your comment specific to this Exception or to all of them?
   
   Let me explain why I think `retries=0` is not a good idea from my point of 
view. Imagine you have an infrastructure issue (By this, I mean your DB is not 
available due to a temporary issue, or there are connectivity issues, etc), in 
this particular case, you want to retry your tasks as it is an issue "external" 
to your query. 
   
   Now, the change I am proposing does not affect this case, instead, it 
targets when the business logic is not met.  For example, we use this Operator 
to run simple Sanity checks (are the duplicates in the table?). The point is if 
you run a query and it returns False, there is no need to retry because I 
believe within the next retry, it will still return False. 
   
   This is particularly useful if you run on BigQuery since retrying for 
queries that will always return False is a waste of money. 
   
   WDYT? 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to