alessandrolacorte commented on code in PR #26932:
URL: https://github.com/apache/airflow/pull/26932#discussion_r990751333
##########
airflow/providers/common/sql/operators/sql.py:
##########
@@ -250,7 +250,7 @@ def execute(self, context: Context):
records = hook.get_first(self.sql)
if not records:
- raise AirflowException(f"The following query returned zero
rows: {self.sql}")
+ raise AirflowFailException(f"The following query returned zero
rows: {self.sql}")
Review Comment:
Hello @eladkal
Is your comment specific to this Exception or to all of them?
Let me explain why I think `retries=0` is not a good idea from my point of
view. Imagine you have an infrastructure issue (By this, I mean your DB is not
available due to a temporary issue, or there are connectivity issues, etc), in
this particular case, you want to retry your tasks as it is an issue "external"
to your query.
Now, the change I am proposing does not affect this case, instead, it
targets when the business logic is not met. For example, we use this Operator
to run simple Sanity checks (are the duplicates in the table?). The point is if
you run a query and it returns False, there is no need to retry because I
believe within the next retry, it will still return False.
This is particularly useful if you run on BigQuery since retrying for
queries that will always return False is a waste of money.
WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]