potiuk commented on a change in pull request #13832:
URL: https://github.com/apache/airflow/pull/13832#discussion_r571159300



##########
File path: airflow/providers/amazon/aws/operators/batch.py
##########
@@ -177,29 +177,26 @@ def submit_job(self, context: Dict):  # pylint: disable=unused-argument
             self.job_id = response["jobId"]
 
             self.log.info("AWS Batch job (%s) started: %s", self.job_id, response)
-
         except Exception as e:
             self.log.error("AWS Batch job (%s) failed submission", self.job_id)
             raise AirflowException(e)
 
     def monitor_job(self, context: Dict):  # pylint: disable=unused-argument
         """
         Monitor an AWS Batch job
+        monitor_job can raise an exception or an AirflowTaskTimeout can be raised if execution_timeout
+        is given while creating the task. These exceptions should be handled in taskinstance.py
+        instead of here like it was previously done

Review comment:
       > this is true of every operator -- execution_timeout is something that's part of base operator right?
   
   Correct. It applies to every operator, but THE ONLY built-in operator
   using it is this AWS one.
   
   > maybe i miss something. but i don't see anywhere in this operator (or its inherited methods) that task timeout is raised explicitly.
   
   We followed the same path here :). And IMHO that proves some comment is
   needed here.
   
   I had the same question initially because I did not see it either. The AWS
   batch operators are built in a very strange way, and while "batch.py" does
   not look like it can throw the timeout, it actually does, here:
   
   
https://github.com/apache/airflow/blob/fc67521f31a0c9a74dadda8d5f0ac32c07be218d/airflow/providers/amazon/aws/hooks/datasync.py#L316
   
   That is several stack frames below (and you can only find it if you try
   hard).
   
   And somebody else, not knowing this, added the original try/except
   Exception (re-raising it as AirflowException()), hiding the fact that a
   timeout exception is raised deep in the datasync hook.
   
   So I asked @ayushchauhan0811 to explain it here - otherwise we risk that
   someone will again unknowingly swallow the Timeout exception.
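
To make the risk concrete, here is a minimal sketch of how a broad try/except hides the timeout. It does not import Airflow: `AirflowException` and `AirflowTaskTimeout` below are local stand-ins, the timeout is assumed to subclass `AirflowException`, and `wait_for_job`, `monitor_job_swallowing`, `monitor_job_propagating` and `raised_type` are hypothetical names used only for illustration.

```python
class AirflowException(Exception):
    """Local stand-in for airflow.exceptions.AirflowException."""


class AirflowTaskTimeout(AirflowException):
    """Local stand-in for the task timeout (assumed to subclass AirflowException)."""


def wait_for_job():
    # Hypothetical stand-in for the hook call several stack frames down,
    # which is where the timeout actually surfaces.
    raise AirflowTaskTimeout("Timeout")


def monitor_job_swallowing():
    # Old pattern: the broad except re-wraps everything, so the caller
    # only ever sees a generic AirflowException.
    try:
        wait_for_job()
    except Exception as e:
        raise AirflowException(e)


def monitor_job_propagating():
    # New pattern: no try/except here, so the timeout reaches the caller
    # (taskinstance.py in the real code) with its original type.
    wait_for_job()


def raised_type(func):
    """Return the class name of whatever exception func raises, or None."""
    try:
        func()
    except Exception as e:
        return type(e).__name__
    return None


if __name__ == "__main__":
    print(raised_type(monitor_job_swallowing))
    print(raised_type(monitor_job_propagating))
```

With the wrapper in place the caller can no longer tell a timeout from any other failure, which is the situation the docstring is meant to warn against.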
   



