[GitHub] [airflow] eejbyfeldt commented on issue #30146: on_failure_callback call multiple times and when task has retries

via GitHub Thu, 16 Mar 2023 14:03:26 -0700


eejbyfeldt commented on issue #30146:
URL: https://github.com/apache/airflow/issues/30146#issuecomment-1472735826

That change does not look correct to me. The current condition that
determines if the on_failure callback is called is

https://github.com/apache/airflow/blob/699171679a79917e9c2ee798f71a01d3051f3cc4/airflow/models/taskinstance.py#L1895
So the correct condition for deciding the callback would need to replicate
that logic?

Also not reraising the exception seems very problematic to me. The reason it
currently called twice is that after reraiseing we end up here

https://github.com/apache/airflow/blob/699171679a79917e9c2ee798f71a01d3051f3cc4/airflow/models/taskinstance.py#L1465-L1479
which handle updating the state and running the failure fallback. Just
swallowing the exception like in the patch you posted will result in the task
being incorrectly marked as success.

Seems to me the reason for us not calling the `on_failure_callback` on cases
where the task was externally marked as success/failed is this branch:

https://github.com/apache/airflow/blob/699171679a79917e9c2ee798f71a01d3051f3cc4/airflow/models/taskinstance.py#L1468-L1475
so it seems like the correct fix would be to call the callbacks from there.
Because that is the bug that PR was aiming to solve right? Or did I
misunderstand the goal?

It seems incorrect to me that assuming SIGTERM is the same thing being
marked failed externally. As SIGTERM can be received for other reason, like the
pod-preemption when running on k8s, and for such cases we want to go through
the normal failure handling code.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [airflow] eejbyfeldt commented on issue #30146: on_failure_callback call multiple times and when task has retries

Reply via email to