potiuk opened a new pull request, #67513: URL: https://github.com/apache/airflow/pull/67513
Three failure modes in `StackdriverTaskHandler` exposed internal details or broke shutdown:

1. **F-011**: `read()` did not wrap `_read_logs()`. When Cloud Logging was unavailable, gRPC errors propagated as HTTP 500 from the log viewer instead of degrading gracefully.
2. **F-010**: gRPC errors from `list_log_entries` carry project IDs, resource names, and service-account info in their `__str__`, and were forwarded into the user-visible error response. CWE-209 information disclosure.
3. **F-013**: `close()` called `self._transport.flush()` without exception handling. A failed flush during shutdown raised through stdlib's logging machinery, which does not handle exceptions from `Handler.close()` gracefully.

Reported in the [`apache/tooling-agents` L3 providers/google sweep `b1aec75`](https://github.com/apache/tooling-agents/issues/34).

## Change

- Wrap `_read_logs()` in `read()` with a `try/except` that surfaces a short generic message (`Cloud Logging is currently unavailable.`) and writes the full traceback to the handler's own `_logger`. The outer guard catches the gRPC exceptions before they reach the user, so F-010's leakage path is closed without adding a second swallow inside `_read_single_logs_page` (which would have hidden iteration-loop failures from the outer guard).
- Wrap `_transport.flush()` in `close()` with `try/except`; print to `stderr` since the logging machinery may itself be shutting down.

## Test plan

- [x] `test_read_falls_back_when_cloud_logging_unavailable`: `list_log_entries` raises `ServiceUnavailable`; `read()` returns `[{end_of_log: True}]` with the generic user message, no internal details.
- [x] `test_read_does_not_leak_internals_in_user_facing_message`: `PermissionDenied` carrying a service-account email and IAM permission name is replaced with the generic message; neither identifier appears in user-visible output.
- [x] `test_close_swallows_transport_flush_errors`: broken `_transport.flush` does not raise out of `close()`; failure is written to stderr.
- [x] `prek run ruff` clean.
- [x] Full `test_stackdriver_task_handler.py` suite: 15 passed via `breeze run pytest`.

---

##### Was generative AI tooling used to co-author this PR?

- [X] Yes, Claude Code (Opus 4.7)

Generated-by: Claude Code (Opus 4.7) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
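
For readers skimming the PR description, the two guards listed under "Change" can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in the PR: `SketchHandler`, `fetch_logs`, the logger name, and the simplified `(messages, metadata)` return shape are all hypothetical stand-ins, and the real `StackdriverTaskHandler.read()` signature differs.

```python
import logging
import sys

GENERIC_MESSAGE = "Cloud Logging is currently unavailable."


class SketchHandler(logging.Handler):
    """Hypothetical stand-in for the task handler; not the provider's real class."""

    def __init__(self, fetch_logs, transport):
        super().__init__()
        # Assumption: fetch_logs is a callable standing in for _read_logs().
        self._fetch_logs = fetch_logs
        # Assumption: transport is any object exposing flush().
        self._transport = transport
        self._logger = logging.getLogger("sketch.stackdriver")

    def read(self):
        try:
            return self._fetch_logs()
        except Exception:
            # The full traceback goes to the operator-side logger only.
            self._logger.exception("Failed to read logs from Cloud Logging")
            # The user-visible result carries only the generic message.
            return [GENERIC_MESSAGE], [{"end_of_log": True}]

    def close(self):
        try:
            self._transport.flush()
        except Exception as exc:
            # Logging may itself be shutting down, so write straight to stderr.
            print(f"Failed to flush log transport on close: {exc!r}", file=sys.stderr)
        finally:
            super().close()
```

The single outer guard in `read()` mirrors the design choice described above: inner helpers are left free to raise, so there is one place where exceptions are turned into the generic message.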
