cruseakshay opened a new pull request, #60626:
URL: https://github.com/apache/airflow/pull/60626

   ## Problem
   
   When using `KubernetesPodOperator` with cluster autoscaling, task pods can 
be preempted by higher-priority DaemonSet pods during node bootstrap. Airflow 
then gets a 404 error when it tries to read the pod status, and the task fails 
immediately instead of allowing Kubernetes to reschedule the pod.
   
   Fixes #59626
   
   ## Solution
   
   Introduce **state-aware retry logic** that tracks whether a pod ever reached 
the `Running` state:
   
   - **Pod never reached Running** → Raise `PodPreemptedException` (retriable)
   - **Pod was Running** → Raise `PodNotFoundException` (terminal failure)
   
   This prevents duplicate execution of non-idempotent tasks while allowing 
safe retries for pods preempted before they started.
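
As a rough illustration of the decision described above, here is a minimal, self-contained sketch. It is not the code from this PR: only the names `PodPhaseTracker`, `PodNotFoundException`, and `PodPreemptedException` come from the description, while the `ever_running` field, the `observe()` method, and the `classify_pod_404()` helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


class PodNotFoundException(Exception):
    """Pod disappeared after it had been Running (terminal failure)."""


class PodPreemptedException(Exception):
    """Pod disappeared before it ever reached Running (retriable)."""


@dataclass
class PodPhaseTracker:
    # Hypothetical field: the real dataclass in the PR may look different.
    ever_running: bool = False

    def observe(self, phase: str) -> None:
        """Record a pod phase seen while polling the pod status."""
        if phase == "Running":
            self.ever_running = True


def classify_pod_404(tracker: Optional[PodPhaseTracker]) -> Optional[Exception]:
    """Pick the exception for a 404 on pod read (hypothetical helper).

    Returns None when no tracker is supplied, meaning the caller keeps its
    existing behavior and re-raises the original API error.
    """
    if tracker is None:
        return None
    if tracker.ever_running:
        # The task may already have produced side effects: do not retry.
        return PodNotFoundException("pod not found after it was Running")
    # The pod never started, so running it again cannot duplicate work.
    return PodPreemptedException("pod not found before it reached Running")
```

In this sketch the tracker is sticky: once `Running` has been observed, later phases do not reset it, so a pod that vanishes after starting is never classified as retriable.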
   
   ## Changes
   
   | File | Change |
   |------|--------|
   | `exceptions.py` | Add `PodNotFoundException` and `PodPreemptedException` |
   | `pod_manager.py` | Add `PodPhaseTracker` dataclass and 404 handling logic |
   | `hooks/kubernetes.py` | Add phase tracker support to async `get_pod()` |
   | `triggers/pod.py` | Integrate phase tracking in `KubernetesPodTrigger` |
   | `kubernetes_helper_functions.py` | Add `PodPreemptedException` to retry logic |
   | `test_pod_manager.py` | Add comprehensive tests for new functionality |
   
   ## Testing
   
   - Unit tests for `PodPhaseTracker` state transitions
   - Unit tests for 404 handling with different pod states
   - Backward compatibility tests (no tracker = existing behavior)
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes (please specify the tool below)
   
   Generated-by: Claude (Cursor) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
