eejbyfeldt commented on issue #30146:
URL: https://github.com/apache/airflow/issues/30146#issuecomment-1472735826

   That change does not look correct to me. The current condition that 
determines if the on_failure callback is called is
   
https://github.com/apache/airflow/blob/699171679a79917e9c2ee798f71a01d3051f3cc4/airflow/models/taskinstance.py#L1895
   So the correct condition for deciding the callback would need to replicate 
that logic? 
   
   Also not reraising the exception seems very problematic to me. The reason it 
currently called twice is that after reraiseing we end up here
   
https://github.com/apache/airflow/blob/699171679a79917e9c2ee798f71a01d3051f3cc4/airflow/models/taskinstance.py#L1465-L1479
   which handle updating the state and running the failure fallback. Just 
swallowing the exception like in the patch you posted will result in the task 
being incorrectly marked as success.
   
   Seems to me the reason for us not calling the `on_failure_callback` on cases 
where the task was externally marked as success/failed is this branch:
   
https://github.com/apache/airflow/blob/699171679a79917e9c2ee798f71a01d3051f3cc4/airflow/models/taskinstance.py#L1468-L1475
   so it seems like the correct fix would be to call the callbacks from there. 
Because that is the bug that PR was aiming to solve right? Or did I 
misunderstand the goal?
   
   It seems incorrect to me that assuming SIGTERM is the same thing being 
marked failed externally. As SIGTERM can be received for other reason, like the 
pod-preemption when running on k8s, and for such cases we want to go through 
the normal failure handling code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to