SameerMesiah97 opened a new pull request, #64511:
URL: https://github.com/apache/airflow/pull/64511
**Description**
This change is a follow-up to PR #62196, which introduced parallel download
and upload support in `GCSTimeSpanFileTransformOperator`. It adds best-effort
cancellation of pending futures when a `GoogleCloudError` occurs during
parallel execution. When `*_continue_on_fail=False`, the operator still raises
on the first failure, but now also calls `cancel()` on submitted futures to
prevent queued tasks from starting, reducing unnecessary work in systemic
failure scenarios.
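
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: `TransferError` stands in for `GoogleCloudError`, and `run_all`, `worker`, and `continue_on_fail` are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class TransferError(Exception):
    """Hypothetical stand-in for google.api_core.exceptions.GoogleCloudError."""


def run_all(items, worker, continue_on_fail=False, max_workers=2):
    """Run `worker` over `items` in parallel (hypothetical helper).

    With continue_on_fail=False, the first failure is re-raised after
    cancel() has been called on every submitted future.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(worker, item) for item in items]
        for future in as_completed(futures):
            try:
                results.append(future.result())
            except TransferError:
                if continue_on_fail:
                    # Skip the failed item and keep processing the rest.
                    continue
                # Best effort: cancel() only prevents futures that have
                # not started yet; running or finished ones are unaffected.
                for pending in futures:
                    pending.cancel()
                raise
    return results
```

`Future.cancel()` returns `False` for a task that is already running or finished, so only queued work is skipped, which is why the cancellation is best-effort.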
**Rationale**
Previously, the operator raised on failure but allowed all queued tasks to
proceed. In high-volume workloads this could result in many unnecessary GCS
API calls after a failure was already known. Attempting to cancel pending
futures reduces wasted work and avoids additional load on GCS, while keeping
failure semantics unchanged.
**Tests**
* Updated existing parallel execution tests to assert that `cancel()` is
invoked when `*_continue_on_fail=False` and not invoked when
`*_continue_on_fail=True`.
* Extended existing failure-path tests for both download and upload flows
using controlled executor and `as_completed` mocking.
* System tests are not required as external service interaction remains
unchanged from the previous implementation.
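
A rough sketch of how such assertions can be written with mocked futures is shown below. It is not taken from the PR's test suite: `collect` is a toy function under test, `TransferError` stands in for `GoogleCloudError`, and the `completed` argument plays the role of a mocked `as_completed` result. All of these names are assumptions made for illustration.

```python
from unittest import mock


class TransferError(Exception):
    """Hypothetical stand-in for GoogleCloudError."""


def collect(futures, completed, continue_on_fail):
    """Toy code under test; `completed` mimics as_completed(futures)."""
    for future in completed:
        try:
            future.result()
        except TransferError:
            if continue_on_fail:
                continue
            for pending in futures:
                pending.cancel()
            raise


def make_futures():
    # One future whose result() raises, one that is still queued.
    failed, queued = mock.Mock(), mock.Mock()
    failed.result.side_effect = TransferError("boom")
    return failed, queued


def check_cancel_called_when_failing_fast():
    failed, queued = make_futures()
    try:
        collect([failed, queued], iter([failed, queued]), continue_on_fail=False)
    except TransferError:
        pass
    else:
        raise AssertionError("expected TransferError")
    queued.cancel.assert_called_once()
    # The loop stopped at the first failure, so the queued future was never read.
    queued.result.assert_not_called()


def check_cancel_not_called_when_continuing():
    failed, queued = make_futures()
    collect([failed, queued], iter([failed, queued]), continue_on_fail=True)
    failed.cancel.assert_not_called()
    queued.cancel.assert_not_called()
```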
**Backwards Compatibility**
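Task failure semantics remain unchanged: the operator still raises on the
first failure when `*_continue_on_fail=False`. The only difference is that
queued futures are now cancelled on a best-effort basis.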