dejii commented on code in PR #44279:
URL: https://github.com/apache/airflow/pull/44279#discussion_r2700707375


##########
providers/src/airflow/providers/google/cloud/operators/bigquery.py:
##########
@@ -2592,8 +2592,15 @@ def _submit_job(
             nowait=True,
         )
 
-    @staticmethod
-    def _handle_job_error(job: BigQueryJob | UnknownJob) -> None:
+    def _handle_job_error(self, job: BigQueryJob | UnknownJob) -> None:
+        self.log.info("Job %s is completed. Checking the job status", self.job_id)
+        # I've noticed that sometimes BigQuery jobs transiently report the wrong status, causing the Airflow job to be incorrectly marked as successful.
+        # To avoid this, we refresh the job properties before checking the final state and handling any errors.
+        while job.state != "DONE":

Review Comment:
   @shahar1 @pankajastro – I suspect the `job.state != "DONE"` check didn't
   need to be introduced here. While fixing a silent bug introduced by this PR
   (see the fix: https://github.com/apache/airflow/pull/60679), I took a look at
   the client library and found that `job.result()` can only return successfully
   when `state=DONE`; otherwise it raises an exception, which causes the task to
   fail.
   
   References:
   1. https://github.com/googleapis/python-bigquery/blob/73228432a3c821db05d898ea4a4788adf15b033d/google/cloud/bigquery/job/base.py#L990-L1016
   2. https://github.com/googleapis/python-bigquery/blob/73228432a3c821db05d898ea4a4788adf15b033d/google/cloud/bigquery/job/query.py#L1770-L1787
   
   Also, per the [docs](https://docs.cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatus):
   > Output only. Running state of the job. Valid states include 'PENDING', 'RUNNING', and 'DONE'.
   
   So once `job.result()` has returned, it's very unlikely that
   `"Job failed with state: PENDING|RUNNING"` can ever be raised as a valid exception.
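
   To illustrate the argument, here is a minimal sketch. `FakeJob` and
   `handle_job_error` are hypothetical stand-ins, not the real client class or
   the operator method; the sketch only models the assumption stated above,
   that `result()` either returns with the state set to `DONE` or raises.

```python
class FakeJob:
    """Hypothetical stand-in for a BigQuery job; not the real client class."""

    def __init__(self, error_result=None):
        self.state = "PENDING"
        self.error_result = error_result

    def result(self):
        # Assumption being modelled: result() blocks until the job finishes,
        # then either raises or returns with state == "DONE".
        self.state = "DONE"
        if self.error_result:
            raise RuntimeError(self.error_result["message"])
        return self


def handle_job_error(job):
    # No polling loop on job.state: result() has already returned,
    # so under the assumption above this branch should be unreachable.
    if job.state != "DONE":
        raise RuntimeError(f"Job failed with state: {job.state}")
    if job.error_result:
        raise RuntimeError(f"BigQuery job failed: {job.error_result}")


# Successful job: result() returns, so the state is already DONE.
ok = FakeJob()
ok.result()
handle_job_error(ok)

# Failing job: result() raises before handle_job_error would ever run.
failed = FakeJob(error_result={"message": "quota exceeded"})
try:
    failed.result()
    raised = False
except RuntimeError:
    raised = True
```

   In both paths the state is `DONE` by the time any error handling could run,
   which is why the extra wait looks redundant if the assumption holds.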
   
   Let me know your thoughts. I can remove it in my
   [PR](https://github.com/apache/airflow/pull/60679), since it's somewhat related.



