hussein-awala commented on PR #32990:
URL: https://github.com/apache/airflow/pull/32990#issuecomment-1689030674

   > So, what was the case before this change? IIUC:
   
   > execution_timeout applied to the overall execution of the task (e.g. 
duration of a single run of execute)
   
   yes, but when it's reached, we don't execute `on_kill` method
   
   > task deferral timeout is applied to the duration of a single deferral, and 
ignores overal execution_timeout
   
   Non, we take the min between deferral timeout and execution_timeout if they 
are both present
   
   > if task comes out of deferral then the execution_timeout is applied again 
to the task from the moment it resumes exeuction. so that for example if a task 
runs, defers, and resumes, and the execution timeout is 1 hr, then the task 
could run for 1 hr + deferral timeout + 1hr after resuming.
   
   Non, when we deferred the task, we calculated timeout as:
   ```python
           # Calculate timeout too if it was passed
           if defer.timeout is not None:
               self.trigger_timeout = timezone.utcnow() + defer.timeout
           else:
               self.trigger_timeout = None
   
           # If an execution_timeout is set, set the timeout to the minimum of
           # it and the trigger timeout
           execution_timeout = self.task.execution_timeout
           if execution_timeout:
               if self.trigger_timeout:
                   self.trigger_timeout = min(self.start_date + 
execution_timeout, self.trigger_timeout)
               else:
                   self.trigger_timeout = self.start_date + execution_timeout
   ```
   And when we resumed the task, we calculated the timeout duration as:
   ```python
                   timeout_seconds = (
                       task_to_execute.execution_timeout - (timezone.utcnow() - 
self.start_date)
                   ).total_seconds()
   ```
   The main issue in the operators (not sensors) was with how we handle this 
timeout, where it was considered as a normal failure in the trigger.
   
   > it makes me a bit uncomfortable to remove the feature to be able to set a 
specific deferral timeout on a given deferral. not that it's necessarily all 
that useful, but it's there.
   
   It's still possible, with the possibility to control the callback when this 
timeout is reached
   
   > execution_timeout will be applied to the overall duration of the task 
instance, no matter how many defer-resume cycles there are.
   
   yes but this was the case before
   
   > if the execution_timeout is 1 hr, then now if phase 1 takes 10 minutes, 
the deferral timeout will be set to 50 minutes. then if task is deferred for 30 
minutes before resuming, then when task resumes, effective execution timeout 
will be 30 minutes. and so on.
   
   20 minutes not 30 (1h - 10m - 30m)
   
   > timeout will be ignored if passed to self.defer
   
   if it's the earliest timeout moment, it will be used, and the 
`on_defer_timeout` callback will be called, if not, we'll use execution 
timeout, or sensor timeout (the user can define other custom types of timeout)
   
   The most important thing is the sensor timeout; In sync mode, when this 
timeout is reached, we fail the task regardless the number of remaining 
attempts, and this is not the case in the current async mode. This PR fixes 
this bug.
   
   So in summary, this PR:
   - process the timeout based on its reason
   - if it's because of execution timeout, it fails the attempt and call on_kill
   - if it's because of deferring timeout, it call the new callback method, 
which by default raise `AirflowDeferTimeout`, but the user can override it to 
return values or do anything
   - if it's a sensor, and the timeout is because of sensor timeout, it fails 
the TI immediately regards the number of remaining attempts 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to