potiuk opened a new pull request, #67513:
URL: https://github.com/apache/airflow/pull/67513

   Three failure modes in `StackdriverTaskHandler` exposed internal details or 
broke shutdown:
   
   1. **F-011** — `read()` did not wrap `_read_logs()`. When Cloud Logging was 
unavailable, gRPC errors propagated as HTTP 500 from the log viewer instead of 
degrading gracefully.
   2. **F-010** — gRPC errors from `list_log_entries` carry project IDs, 
resource names, and service-account info in their `__str__`, and were forwarded 
into the user-visible error response. CWE-209 information disclosure.
   3. **F-013** — `close()` called `self._transport.flush()` without exception 
handling. A failed flush during shutdown raised through stdlib's 
`logging.shutdown()`, which suppresses only `OSError` and `ValueError` from a 
handler and re-raises anything else while `logging.raiseExceptions` is true 
(the default).
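
To illustrate the F-013 failure mode, here is a minimal stdlib-only sketch. `BrokenCloseHandler` and `shutdown_propagates` are hypothetical names invented for this example, not part of the provider; the sketch assumes `logging.raiseExceptions` is left at its default of `True`.

```python
import logging
import weakref


class BrokenCloseHandler(logging.Handler):
    """Hypothetical handler whose close() fails, standing in for a failed flush."""

    def __init__(self):
        super().__init__()
        self.broken = True

    def emit(self, record):
        pass

    def close(self):
        if self.broken:
            # Not an OSError/ValueError, so logging.shutdown() will not suppress it.
            raise RuntimeError("flush failed during close")
        super().close()


def shutdown_propagates(handler):
    """Return True if logging.shutdown() lets the handler's exception escape."""
    try:
        # shutdown() accepts a list of weak references to handlers.
        logging.shutdown([weakref.ref(handler)])
    except RuntimeError:
        return True
    return False


handler = BrokenCloseHandler()
propagated = shutdown_propagates(handler)
# Disarm the handler so the interpreter's own exit-time shutdown stays quiet.
handler.broken = False
```

With the default settings, `propagated` should end up `True`, which is the behaviour the guarded `close()` in this PR is meant to avoid.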
   
   Reported in the [`apache/tooling-agents` L3 providers/google sweep 
`b1aec75`](https://github.com/apache/tooling-agents/issues/34).
   
   ## Change
   
   - Wrap `_read_logs()` in `read()` with a `try/except` that surfaces a short 
generic message (`Cloud Logging is currently unavailable.`) and writes the full 
traceback to the handler's own `_logger`. The outer guard catches the gRPC 
exceptions before they reach the user, so F-010's leakage path is closed 
without adding a second swallow inside `_read_single_logs_page` (which would 
have hidden iteration-loop failures from the outer guard).
   - Wrap `_transport.flush()` in `close()` with `try/except`; print to 
`stderr` since the logging machinery may itself be shutting down.
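
The two guards described above can be sketched roughly as follows. This is a simplified illustration, not the code in the PR: `SketchHandler`, its constructor arguments, the simplified `(message, metadata)` return shape, and the call to `super().close()` are all assumptions made for the example.

```python
import logging
import sys

GENERIC_MESSAGE = "Cloud Logging is currently unavailable."


class SketchHandler(logging.Handler):
    """Hypothetical stand-in for the real handler, for illustration only."""

    def __init__(self, read_logs, transport):
        super().__init__()
        self._read_logs = read_logs  # callable that may raise (assumed shape)
        self._transport = transport  # object exposing flush() (assumed shape)
        self._logger = logging.getLogger("sketch.stackdriver")

    def emit(self, record):
        pass

    def read(self, task_instance):
        try:
            return self._read_logs(task_instance)
        except Exception:
            # The full traceback goes to the handler's own logger only.
            self._logger.exception("Failed to read logs from Cloud Logging")
            # The caller sees a short generic message with no internal details.
            return GENERIC_MESSAGE, [{"end_of_log": True}]

    def close(self):
        try:
            self._transport.flush()
        except Exception as exc:
            # Logging may itself be shutting down, so report on stderr.
            print(f"Failed to flush log transport on close: {exc!r}", file=sys.stderr)
        finally:
            super().close()
```

Keeping a single guard at the `read()` level means any failure below it, including one raised while iterating pages, reaches the same `except` block instead of being swallowed earlier.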
   
   ## Test plan
   
   - [x] `test_read_falls_back_when_cloud_logging_unavailable` — 
`list_log_entries` raises `ServiceUnavailable`; `read()` returns 
`[{"end_of_log": True}]` with the generic user message, no internal details.
   - [x] `test_read_does_not_leak_internals_in_user_facing_message` — 
`PermissionDenied` carrying a service-account email and IAM permission name is 
replaced with the generic message; neither identifier appears in user-visible 
output.
   - [x] `test_close_swallows_transport_flush_errors` — broken 
`_transport.flush` does not raise out of `close()`; failure is written to 
stderr.
   - [x] `prek run ruff` clean.
   - [x] Full `test_stackdriver_task_handler.py` suite: 15 passed via `breeze 
run pytest`.
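
For readers without the test file at hand, the shape of the leak and flush tests might look something like the sketch below. It is self-contained and does not import the provider: `guarded_read`, `guarded_close`, and `FakePermissionDenied` are hypothetical stand-ins (the real tests use the gRPC exception classes named above), and the mocked `list_log_entries()` call takes no arguments here purely for brevity.

```python
import contextlib
import io
import sys
from unittest import mock

GENERIC_MESSAGE = "Cloud Logging is currently unavailable."


class FakePermissionDenied(Exception):
    """Hypothetical stand-in for a gRPC error whose str() carries internals."""


def guarded_read(client):
    # Simplified version of the guard under test (assumed shape).
    try:
        entries = list(client.list_log_entries())
    except Exception:
        return GENERIC_MESSAGE, [{"end_of_log": True}]
    return "\n".join(entries), [{"end_of_log": True}]


def guarded_close(transport):
    try:
        transport.flush()
    except Exception as exc:
        print(f"Failed to flush log transport on close: {exc!r}", file=sys.stderr)


def test_read_does_not_leak_internals():
    client = mock.Mock()
    client.list_log_entries.side_effect = FakePermissionDenied(
        "svc@example-project.iam.gserviceaccount.com lacks logging.logEntries.list"
    )
    message, metadata = guarded_read(client)
    assert message == GENERIC_MESSAGE
    assert "gserviceaccount" not in message
    assert "logging.logEntries.list" not in message
    assert metadata == [{"end_of_log": True}]


def test_close_swallows_flush_errors():
    transport = mock.Mock()
    transport.flush.side_effect = RuntimeError("flush failed")
    buffer = io.StringIO()
    with contextlib.redirect_stderr(buffer):
        guarded_close(transport)  # must not raise
    assert "flush failed" in buffer.getvalue()
```

Both functions are plain `pytest`-style tests and can also be called directly.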
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes — Claude Code (Opus 4.7)
   
   Generated-by: Claude Code (Opus 4.7) following [the 
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)

