potiuk commented on code in PR #66572:
URL: https://github.com/apache/airflow/pull/66572#discussion_r3264656640
##########
task-sdk/src/airflow/sdk/execution_time/supervisor.py:
##########
@@ -793,6 +793,33 @@ def handle_requests(self, log: FilteringBoundLogger) -> Generator[None, _Request
             ),
             request_id=request.id,
         )
+    except Exception as e:
+        # Catch-all so a transient network error (httpx.ConnectError /
+        # httpx.TimeoutException) or any other non-ServerResponseError
+        # exception doesn't crash this generator and permanently break
+        # the IPC channel — the task subprocess would then get EOFError
+        # on every subsequent communication and the worker would be
+        # stuck. Surface the error to the task so it can decide how to
+        # react, log it loudly on the supervisor side, and keep the
+        # request loop alive.
+        log.exception(
+            "Unhandled exception while handling task request",
+            request_id=request.id,
+            exception_type=type(e).__name__,
+        )
Review Comment:
Good point — done in a496d9f. Switched to `exc_info=e` so the full exception
type and traceback are captured by `log.exception`, and dropped the redundant
`exception_type` kwarg.
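
For readers following along, here is a minimal sketch of the pattern, not the code from the PR. It assumes the standard-library `logging` module rather than the structlog-style logger used in `supervisor.py`, and `handle`, `failing`, and `ListHandler` are hypothetical names made up for illustration. It shows that passing the caught exception as `exc_info=e` records the exception type and traceback, so a separate `exception_type` field adds nothing, and that the caller survives to handle the next request.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical helper: keeps log records in memory for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


log = logging.getLogger("supervisor_sketch")
log.setLevel(logging.ERROR)
log.propagate = False  # keep the example's output out of the root logger
capture = ListHandler()
log.addHandler(capture)


def handle(request_id, fn):
    """Hypothetical stand-in for one iteration of a request loop."""
    try:
        return fn()
    except Exception as e:
        # exc_info=e attaches the exception type, value, and traceback
        # to the log record; no extra "exception_type" field is needed.
        log.error(
            "Unhandled exception while handling task request",
            exc_info=e,
            extra={"request_id": request_id},
        )
        return None


def failing():
    # Stand-in for a transient network error.
    raise ConnectionError("connection refused")


result = handle(1, failing)      # logged, loop survives
after = handle(2, lambda: "ok")  # next request still handled
```

With structlog the call would look different, but the idea is the same: the exception object itself carries the type and traceback.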
---
Drafted-by: Claude Code (Opus 4.7); reviewed by @potiuk before posting