SameerMesiah97 opened a new pull request, #64511:
URL: https://github.com/apache/airflow/pull/64511

    **Description**
   
   This change is a follow-up to PR #62196, which introduced parallel download 
and upload support in `GCSTimeSpanFileTransformOperator`. It adds best-effort 
cancellation of pending futures when a `GoogleCloudError` occurs during 
parallel execution. When `*_continue_on_fail=False`, the operator still raises 
on the first failure, but now also calls `cancel()` on submitted futures to 
prevent queued tasks from starting, reducing unnecessary work in systemic 
failure scenarios.
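
For readers unfamiliar with the pattern, here is a minimal sketch of best-effort cancellation with `concurrent.futures`. It is not the operator's actual code: `run_parallel`, `TransferError` (a stand-in for `GoogleCloudError`), and the `continue_on_fail` parameter are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class TransferError(Exception):
    """Hypothetical stand-in for GoogleCloudError (keeps the sketch dependency-free)."""


def run_parallel(items, worker, max_workers=2, continue_on_fail=False):
    """Run worker over items; on the first failure, try to cancel queued futures."""
    results, failures = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its input so failures can be reported.
        futures = {executor.submit(worker, item): item for item in items}
        for future in as_completed(futures):
            try:
                results.append(future.result())
            except TransferError:
                if continue_on_fail:
                    failures.append(futures[future])
                    continue
                # Best effort: cancel() only succeeds for futures that have
                # not started running; running or finished ones return False.
                for pending in futures:
                    pending.cancel()
                raise
    return results, failures
```

Tasks that are already running are not interrupted, and the `with` block still waits for them on exit. Only queued work is skipped, which is why the cancellation is described as best-effort.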
   
    **Rationale**
   
   Previously, the operator raised on failure but allowed all queued tasks to 
proceed, which in high-volume workloads could result in many unnecessary GCS 
API calls after a failure was already known. Attempting to cancel pending 
futures reduces wasted work and avoids additional load on GCS without changing 
how or when the task fails.
   
    **Tests**
   
   * Updated existing parallel execution tests to assert that `cancel()` is 
invoked when `*_continue_on_fail=False` and not invoked when 
`*_continue_on_fail=True`.
   * Extended existing failure-path tests for both download and upload flows 
using controlled executor and `as_completed` mocking.
   * System tests are not required as external service interaction remains 
unchanged from the previous implementation.
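
The shape of these assertions can be sketched as follows. This is an illustrative example only, not the tests in this PR: `collect` is a hypothetical stand-in for the operator's result loop, `RuntimeError` stands in for `GoogleCloudError`, and `as_completed` is injected as a parameter (here simply `iter`) instead of being patched.

```python
from unittest import mock


def collect(futures, as_completed, continue_on_fail):
    """Hypothetical stand-in for the operator's result-handling loop."""
    errors = []
    for future in as_completed(futures):
        try:
            future.result()
        except RuntimeError as exc:
            if not continue_on_fail:
                # Ask every submitted future to cancel, then re-raise.
                for pending in futures:
                    pending.cancel()
                raise
            errors.append(exc)
    return errors


def make_futures():
    """Build one mock future that fails and one that is still queued."""
    failed = mock.Mock(name="failed")
    failed.result.side_effect = RuntimeError("simulated failure")
    queued = mock.Mock(name="queued")
    return failed, queued


def test_cancel_called_when_not_continuing():
    failed, queued = make_futures()
    try:
        collect([failed, queued], as_completed=iter, continue_on_fail=False)
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected the failure to propagate")
    queued.cancel.assert_called_once()


def test_cancel_not_called_when_continuing():
    failed, queued = make_futures()
    errors = collect([failed, queued], as_completed=iter, continue_on_fail=True)
    assert len(errors) == 1
    queued.cancel.assert_not_called()
```

Passing `iter` as the completion iterator fixes the order in which the mock futures are yielded, so the failing future is always seen first.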
   
    **Backwards Compatibility**
   
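   No change to task failure semantics. The only behavioural difference is 
that queued futures may be cancelled after the first failure when 
`*_continue_on_fail=False`.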

