akshetpandey opened a new pull request, #67219:
URL: https://github.com/apache/airflow/pull/67219

   related: #66293
   
   ## Problem
   
   `CloudRunJobFinishedTrigger.run()` polls the long-running operation via
   `CloudRunAsyncHook.get_operation` in its loop. When that gRPC call fails
   with a transient 503 `ServiceUnavailable` — typical of a regional Cloud
   Run API blip while the underlying job is still progressing fine — the
   exception propagates out of the trigger:
   
   1. The triggerer logs the failure and tears down the trigger.
   2. The deferred task fails with `TaskDeferralError`.
   3. The worker's task-level retry re-runs the operator from scratch,
      which submits a **brand new Cloud Run execution** rather than
      waiting on the in-flight one. So a 1-second transient API blip turns
      into a duplicate (and billed) job run.
   
   This is the same class of bug that was fixed for Dataflow in #66293,
   and the fix has the same shape.
   
   ## How to fix
   
   Catch `ServiceUnavailable` inside the Cloud Run get-operation polling
   loop, log a warning, sleep `polling_period_seconds`, and continue
   polling. Other exceptions still propagate, so Airflow's task-level
   retry remains the safety net for genuinely terminal failures.
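
A minimal sketch of the loop shape described above, not the actual diff. It assumes a local stand-in `ServiceUnavailable` class (the real one lives in `google.api_core.exceptions`), a hypothetical `poll_operation` helper, and a plain dict in place of the long-running operation object, so that it runs without Airflow or Google client libraries installed.

```python
import asyncio


class ServiceUnavailable(Exception):
    """Stand-in for google.api_core.exceptions.ServiceUnavailable (gRPC 503)."""


async def poll_operation(get_operation, operation_name, polling_period_seconds, log=print):
    """Poll until the operation reports done, retrying only on ServiceUnavailable.

    `get_operation` is a hypothetical async callable standing in for
    CloudRunAsyncHook.get_operation; the dict-shaped result is an assumption.
    """
    while True:
        try:
            operation = await get_operation(operation_name)
        except ServiceUnavailable as exc:
            # Transient 503: warn, wait one polling period, then poll again.
            log(f"Transient error polling {operation_name}: {exc!r}; retrying")
            await asyncio.sleep(polling_period_seconds)
            continue
        # Any other exception is not caught here and propagates to the caller.
        if operation["done"]:
            return operation
        await asyncio.sleep(polling_period_seconds)


async def _demo():
    # Fake hook: the first call fails with a 503, the second call succeeds.
    responses = [ServiceUnavailable("503"), {"done": True, "error": None}]

    async def fake_get_operation(name):
        item = responses.pop(0)
        if isinstance(item, Exception):
            raise item
        return item

    return await poll_operation(fake_get_operation, "operations/example", 0)


result = asyncio.run(_demo())
```

Catching only the one exception type keeps the retry narrow: anything else still ends the trigger, which matches the intent stated above that task-level retry stays the safety net for terminal failures.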
   
   ## Tests
   
   ```bash
   uv run --project providers/google pytest providers/google/tests/unit/google/cloud/triggers/test_cloud_run.py -xvs
   ```
   
   Adds two tests:
   
   - `test_trigger_continues_polling_after_retryable_service_unavailable`:
     the first `get_operation` call raises `ServiceUnavailable` and the
     second returns a successfully completed operation. The trigger yields
     the SUCCESS `TriggerEvent`, and `asyncio.sleep` is awaited exactly
     once with `polling_period_seconds`.
   - `test_trigger_propagates_unexpected_polling_exception`: a non-503
     exception still propagates out of `run()` (locks in that only
     `ServiceUnavailable` is retried).
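
The two test scenarios can be sketched with `unittest.mock` roughly as follows. This is an illustration under assumptions, not the tests added in the PR: `poll_until_done` is a simplified stand-in for the trigger's loop, `ServiceUnavailable` is a local placeholder for the `google.api_core.exceptions` class, and the `check_*` function names are hypothetical.

```python
import asyncio
from unittest import mock


class ServiceUnavailable(Exception):
    """Stand-in for google.api_core.exceptions.ServiceUnavailable."""


async def poll_until_done(get_operation, polling_period_seconds):
    """Simplified stand-in for the trigger's polling loop."""
    while True:
        try:
            operation = await get_operation()
        except ServiceUnavailable:
            # Retry only the transient 503 after one polling period.
            await asyncio.sleep(polling_period_seconds)
            continue
        if operation.done:
            return operation
        await asyncio.sleep(polling_period_seconds)


def check_continues_after_service_unavailable():
    done_operation = mock.Mock(done=True)
    # First await raises the 503, second await returns the finished operation.
    get_operation = mock.AsyncMock(
        side_effect=[ServiceUnavailable("503"), done_operation]
    )
    with mock.patch("asyncio.sleep", new_callable=mock.AsyncMock) as sleep_mock:
        result = asyncio.run(poll_until_done(get_operation, 10))
    assert result is done_operation
    # Exactly one sleep, using the polling period.
    sleep_mock.assert_awaited_once_with(10)
    return get_operation.await_count


def check_unexpected_exception_propagates():
    # A non-503 error must escape the loop instead of being retried.
    get_operation = mock.AsyncMock(side_effect=RuntimeError("boom"))
    try:
        asyncio.run(poll_until_done(get_operation, 10))
    except RuntimeError:
        return True
    return False
```

Patching `asyncio.sleep` with an `AsyncMock` keeps the check fast and lets it assert on the exact sleep argument rather than on wall-clock timing.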
   
   All existing tests in `test_cloud_run.py` still pass.
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes (please specify the tool below)
     Generated-by: Claude Opus 4.7 following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
