ashb commented on code in PR #55767:
URL: https://github.com/apache/airflow/pull/55767#discussion_r2375338491
##########
task-sdk/src/airflow/sdk/execution_time/supervisor.py:
##########
@@ -1151,8 +1155,16 @@ def final_state(self):
return self._terminal_state or TaskInstanceState.SUCCESS
if self._exit_code != 0 and self._terminal_state == SERVER_TERMINATED:
return SERVER_TERMINATED
+
+ if self._is_signal_retryable() and self._should_retry:
+ return TaskInstanceState.UP_FOR_RETRY
+
return TaskInstanceState.FAILED
+ def _is_signal_retryable(self) -> bool:
+ """Check if the exit code signal can be retried."""
+ return self._exit_code in (-signal.SIGKILL, -signal.SIGTERM, -signal.SIGSEGV)
Review Comment:
I really don't think we should be filtering on the signal. When you are dealing
with native code, almost anything is fair game. If the process dies with any
signal, I'd say we should retry it.
Retries exist to deal with transient errors, and dying with an unexpected
signal is (likely) one of those cases.
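
To make the suggestion concrete, here is a minimal sketch of what "retry on any signal" could look like. It assumes, as the diff above does, that death by signal is recorded as a negative exit code. The helper names `died_from_signal` and `sketch_final_state` and the plain string states are hypothetical stand-ins for illustration, not the supervisor's actual API.

```python
import signal
from typing import Optional


def died_from_signal(exit_code: Optional[int]) -> bool:
    """Return True if the exit code encodes death by *any* signal.

    Assumption: a negative exit code means "killed by signal -exit_code",
    matching the `-signal.SIGKILL` style comparison in the diff.
    """
    return exit_code is not None and exit_code < 0


def sketch_final_state(exit_code: Optional[int], should_retry: bool) -> str:
    """Hypothetical, simplified stand-in for the final_state logic."""
    if exit_code == 0:
        return "success"
    # No allow-list of signals: any signal death is treated as transient.
    if died_from_signal(exit_code) and should_retry:
        return "up_for_retry"
    return "failed"


if __name__ == "__main__":
    # SIGABRT is not in the original allow-list but would now be retried.
    print(sketch_final_state(-signal.SIGABRT, should_retry=True))
    print(sketch_final_state(-signal.SIGSEGV, should_retry=False))
    print(sketch_final_state(1, should_retry=True))
```

Compared with the tuple check in `_is_signal_retryable`, this drops the per-signal filter entirely, so a crash in native code with a signal such as SIGABRT or SIGBUS is retried the same way as SIGKILL, SIGTERM, or SIGSEGV.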