Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]

via GitHub Fri, 12 Dec 2025 10:22:57 -0800


Copilot commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2615172777



##########
providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py:
##########
@@ -2199,6 +2199,83 @@ def 
test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod
         process_pod_deletion_mock.assert_not_called()
         assert result.metadata.name == pod_2.metadata.name
 
+    @pytest.mark.parametrize(
+        "on_finish_action", [OnFinishAction.KEEP_POD, 
OnFinishAction.DELETE_SUCCEEDED_POD]
+    )
+    @patch(KUB_OP_PATH.format("patch_already_checked"))
+    @patch(KUB_OP_PATH.format("process_pod_deletion"))
+    def test_process_duplicate_label_pods_with_start_time_none(

Review Comment:
   The test name `test_process_duplicate_label_pods_with_start_time_none` could 
be more descriptive. Consider renaming to 
`test_process_duplicate_label_pods_falls_back_to_creation_timestamp_when_start_time_is_none`
 to make it clearer what behavior is being tested.
   ```suggestion
       def 
test_process_duplicate_label_pods_falls_back_to_creation_timestamp_when_start_time_is_none(
   ```



##########
providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py:
##########
@@ -2199,6 +2199,83 @@ def 
test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod
         process_pod_deletion_mock.assert_not_called()
         assert result.metadata.name == pod_2.metadata.name
 
+    @pytest.mark.parametrize(
+        "on_finish_action", [OnFinishAction.KEEP_POD, 
OnFinishAction.DELETE_SUCCEEDED_POD]
+    )
+    @patch(KUB_OP_PATH.format("patch_already_checked"))
+    @patch(KUB_OP_PATH.format("process_pod_deletion"))
+    def test_process_duplicate_label_pods_with_start_time_none(
+        self,
+        process_pod_deletion_mock,
+        patch_already_checked_mock,
+        on_finish_action,
+    ):
+        now = datetime.datetime.now()
+        k = KubernetesPodOperator(
+            namespace="default",
+            image="ubuntu:22.04",
+            cmds=["bash", "-cx"],
+            arguments=["echo 12"],
+            name="test",
+            task_id="task",
+            do_xcom_push=False,
+            reattach_on_restart=False,
+            on_finish_action=on_finish_action,
+        )
+        context = create_context(k)
+        pod_1 = 
k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), 
context=context)
+        pod_2 = 
k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), 
context=context)
+
+        pod_1.status = {"start_time": None}
+        pod_2.status = {"start_time": None}

Review Comment:
   Setting `pod.status` to a plain dictionary may not correctly mock the V1Pod 
object's status attribute. The kubernetes client's V1Pod expects a V1PodStatus 
object. Consider using `V1PodStatus(start_time=None)` instead to ensure the 
test accurately represents the actual API behavior when serialized with 
`to_dict()`.
   ```suggestion
           pod_1.status = V1PodStatus(start_time=None)
           pod_2.status = V1PodStatus(start_time=None)
   ```



##########
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##########
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list: 
list[k8s.V1Pod]) -> int:
         pod_start_times: list[datetime.datetime] = [
             pod.to_dict().get("status").get("start_time") for pod in pod_list

Review Comment:
   This code will raise an AttributeError if `pod.to_dict().get("status")` 
returns None, since you cannot call `.get("start_time")` on None. The code 
should use chained get() calls with a default dictionary to handle the case 
where status is None, for example: `pod.to_dict().get("status", 
{}).get("start_time")`.
   ```suggestion
               pod.to_dict().get("status", {}).get("start_time") for pod in 
pod_list
   ```



##########
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##########
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list: 
list[k8s.V1Pod]) -> int:
         pod_start_times: list[datetime.datetime] = [
             pod.to_dict().get("status").get("start_time") for pod in pod_list
         ]
+        if not all(pod_start_times):
+            pod_start_times: list[datetime.datetime] = [  # type: 
ignore[no-redef]
+                pod_start_time
+                if (
+                    pod_start_time := pod.to_dict()
+                    .get("metadata", {})
+                    .get("creation_timestamp", 
datetime.datetime.now(tz=datetime.timezone.utc))
+                )
+                else datetime.datetime.now(tz=datetime.timezone.utc)

Review Comment:
   The fallback logic has a flaw. The walrus operator on line 1397 assigns the 
result of `.get("creation_timestamp", 
datetime.datetime.now(tz=datetime.timezone.utc))`, which will always return a 
non-None value due to the default argument in get(). This means the condition 
on line 1395 (`if pod_start_time`) will always be True, making the `else 
datetime.datetime.now(tz=datetime.timezone.utc)` branch on line 1401 
unreachable.
   
   The code should check if `creation_timestamp` exists before using it, or 
remove the redundant else clause since the default is already handled by the 
get() call on line 1399.
   ```suggestion
               pod_start_times: list[datetime.datetime] = [
                   pod.to_dict()
                   .get("metadata", {})
                   .get("creation_timestamp", 
datetime.datetime.now(tz=datetime.timezone.utc))
   ```



##########
providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py:
##########
@@ -2199,6 +2199,83 @@ def 
test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod
         process_pod_deletion_mock.assert_not_called()
         assert result.metadata.name == pod_2.metadata.name
 
+    @pytest.mark.parametrize(
+        "on_finish_action", [OnFinishAction.KEEP_POD, 
OnFinishAction.DELETE_SUCCEEDED_POD]
+    )
+    @patch(KUB_OP_PATH.format("patch_already_checked"))
+    @patch(KUB_OP_PATH.format("process_pod_deletion"))
+    def test_process_duplicate_label_pods_with_start_time_none(
+        self,
+        process_pod_deletion_mock,
+        patch_already_checked_mock,
+        on_finish_action,
+    ):
+        now = datetime.datetime.now()
+        k = KubernetesPodOperator(
+            namespace="default",
+            image="ubuntu:22.04",
+            cmds=["bash", "-cx"],
+            arguments=["echo 12"],
+            name="test",
+            task_id="task",
+            do_xcom_push=False,
+            reattach_on_restart=False,
+            on_finish_action=on_finish_action,
+        )
+        context = create_context(k)
+        pod_1 = 
k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), 
context=context)
+        pod_2 = 
k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), 
context=context)
+
+        pod_1.status = {"start_time": None}
+        pod_2.status = {"start_time": None}
+        pod_1.metadata.creation_timestamp = now
+        pod_2.metadata.creation_timestamp = now + 
datetime.timedelta(seconds=60)
+        pod_2.metadata.labels.update({"try_number": "2"})
+
+        result = k.process_duplicate_label_pods([pod_1, pod_2])
+
+        patch_already_checked_mock.assert_called_once_with(pod_1, 
reraise=False)
+        process_pod_deletion_mock.assert_not_called()
+        assert result.metadata.name == pod_2.metadata.name
+
+    @pytest.mark.parametrize(
+        "on_finish_action", [OnFinishAction.KEEP_POD, 
OnFinishAction.DELETE_SUCCEEDED_POD]
+    )
+    @patch(KUB_OP_PATH.format("patch_already_checked"))
+    @patch(KUB_OP_PATH.format("process_pod_deletion"))
+    def 
test_process_duplicate_label_pods_with_no_start_time_or_creation_timestamp(
+        self,
+        process_pod_deletion_mock,
+        patch_already_checked_mock,
+        on_finish_action,
+    ):
+        k = KubernetesPodOperator(
+            namespace="default",
+            image="ubuntu:22.04",
+            cmds=["bash", "-cx"],
+            arguments=["echo 12"],
+            name="test",
+            task_id="task",
+            do_xcom_push=False,
+            reattach_on_restart=False,
+            on_finish_action=on_finish_action,
+        )
+        context = create_context(k)
+        pod_1 = 
k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), 
context=context)
+        pod_2 = 
k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), 
context=context)
+
+        pod_1.status = {"start_time": None}
+        pod_2.status = {"start_time": None}

Review Comment:
   Setting `pod.status` to a plain dictionary may not correctly mock the V1Pod 
object's status attribute. The kubernetes client's V1Pod expects a V1PodStatus 
object. Consider using `V1PodStatus(start_time=None)` instead to ensure the 
test accurately represents the actual API behavior when serialized with 
`to_dict()`.
   ```suggestion
           pod_1.status = V1PodStatus(start_time=None)
           pod_2.status = V1PodStatus(start_time=None)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]

Reply via email to