Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 merged PR #59097: URL: https://github.com/apache/airflow/pull/59097 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 commented on PR #59097: URL: https://github.com/apache/airflow/pull/59097#issuecomment-3739820921 LGTM, improved the phrasing of the log message and rebased. Will merge after all checks are green. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2683335208
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1422,12 +1422,21 @@ def process_duplicate_label_pods(self, pod_list:
list[k8s.V1Pod]) -> k8s.V1Pod:
self.process_pod_deletion(old_pod)
return new_pod
-@staticmethod
-def _get_most_recent_pod_index(pod_list: list[k8s.V1Pod]) -> int:
+def _get_most_recent_pod_index(self, pod_list: list[k8s.V1Pod]) -> int:
"""Loop through a list of V1Pod objects and get the index of the most
recent one."""
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
]
+if not all(pod_start_times):
+self.log.info(
+"Some of the start_time values are None, switch to
creation_timestamp to choose most recent pod"
+)
Review Comment:
```suggestion
self.log.info(
"Unable to determine most recent pod using start_time (some
pods have not started yet). Falling back to creation_timestamp from pod
metadata."
)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 commented on PR #59097: URL: https://github.com/apache/airflow/pull/59097#issuecomment-3733215973 > > Not my area of expertise either (yet) - I've assigned Copilot for an initial review, so please address its commentary as well. However, my 2 cents, if I understand these changes correctly - the `if` block makes the comparison based on a `creation_timestamp`, which is clearly not the `start_time`. Therefore, assigning the `creation_timestamp` to `pod_start_times` in the `if` statement is rather misleading. In that case, why not making the entire comparison based on `creation_timestamp`, replacing the `start_time` completely? Alternatively, if comparison based on `start_time` is useful for most cases, maybe you could introduce a flag for the user so they'll configure the comparison method? > > I'll be happy for an additional review from k8s owners. > > Hello @shahar1 thank you for your comment! My thought was about extending current logic not changing it completely. In my opinion if start_time happens to be None it should not lead to the error that a user should resolve manually. The recent pod could be defined based on the creation_timestamp. I don´t see it as a different way that a user should define in the operator because at the end the code only specifies which of the identical pods is most recent. WDYT? > Luckily I've started recently working on a CKA certification, so I feel more comfortable reviewing this PR. Please avoid tagging other commiters/PMC from now - let's try settling this one between us. If anyone else wants to review or comment, they are more than welcome, of course. For future PRs, please be aware of matter - as Jarek said, arbitrarily tagging maintainers might "taint" your current and future PRs from review (k8s pun intended). For better visibility in future PRs, I recommend sending a message in the `#contributors` Slack channel. Now, let's get to the what needs to be done: 1. I closed irrelevant/nitpicking comments by Copilot - I left only the one which I find relevant (unreachable `else` statement). 2. I understood the fallback logic, but it should be **clearly** reflected to the user (probably by `log.info`), and also documented shortly in the function's docstring. Please take care of the two comments above, and from my prespective it will be good to merge. Thank you and good luck! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 commented on code in PR #59097: URL: https://github.com/apache/airflow/pull/59097#discussion_r2678794196 ## providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py: ## @@ -2199,6 +2199,83 @@ def test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod process_pod_deletion_mock.assert_not_called() assert result.metadata.name == pod_2.metadata.name [email protected]( +"on_finish_action", [OnFinishAction.KEEP_POD, OnFinishAction.DELETE_SUCCEEDED_POD] +) +@patch(KUB_OP_PATH.format("patch_already_checked")) +@patch(KUB_OP_PATH.format("process_pod_deletion")) +def test_process_duplicate_label_pods_with_start_time_none( +self, +process_pod_deletion_mock, +patch_already_checked_mock, +on_finish_action, +): +now = datetime.datetime.now() +k = KubernetesPodOperator( +namespace="default", +image="ubuntu:22.04", +cmds=["bash", "-cx"], +arguments=["echo 12"], +name="test", +task_id="task", +do_xcom_push=False, +reattach_on_restart=False, +on_finish_action=on_finish_action, +) +context = create_context(k) +pod_1 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) +pod_2 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) + +pod_1.status = {"start_time": None} +pod_2.status = {"start_time": None} Review Comment: I agree with TP, but I wouldn't be too picky about it tbh -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 commented on code in PR #59097: URL: https://github.com/apache/airflow/pull/59097#discussion_r2678793522 ## providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py: ## @@ -2199,6 +2199,83 @@ def test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod process_pod_deletion_mock.assert_not_called() assert result.metadata.name == pod_2.metadata.name [email protected]( +"on_finish_action", [OnFinishAction.KEEP_POD, OnFinishAction.DELETE_SUCCEEDED_POD] +) +@patch(KUB_OP_PATH.format("patch_already_checked")) +@patch(KUB_OP_PATH.format("process_pod_deletion")) +def test_process_duplicate_label_pods_with_start_time_none( Review Comment: nitpicky -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2678792511
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list:
list[k8s.V1Pod]) -> int:
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
]
+if not all(pod_start_times):
+pod_start_times: list[datetime.datetime] = [ # type:
ignore[no-redef]
+pod_start_time
+if (
+pod_start_time := pod.to_dict()
+.get("metadata", {})
+.get("creation_timestamp",
datetime.datetime.now(tz=datetime.timezone.utc))
+)
+else datetime.datetime.now(tz=datetime.timezone.utc)
Review Comment:
If `creation_timestamp` is not in `metadata`, it will return
`datetime.datetime.now()` due to the `.get(..., datetime.datetime.now(...)).` -
not due to the `if`.
The `else` statement seems to be unreachable (because the latter `get`
always returns not-`None`).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2678792511
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list:
list[k8s.V1Pod]) -> int:
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
]
+if not all(pod_start_times):
+pod_start_times: list[datetime.datetime] = [ # type:
ignore[no-redef]
+pod_start_time
+if (
+pod_start_time := pod.to_dict()
+.get("metadata", {})
+.get("creation_timestamp",
datetime.datetime.now(tz=datetime.timezone.utc))
+)
+else datetime.datetime.now(tz=datetime.timezone.utc)
Review Comment:
If `creation_timestamp` is not in `metadata`, it will return
`datetime.datetime.now()` due to the `.get(..., datetime.datetime.now(...)`).`
- not due to the `if`.
The `else` statement seems to be unreachable (because the latter `get`
always returns not `None`).
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list:
list[k8s.V1Pod]) -> int:
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
]
+if not all(pod_start_times):
+pod_start_times: list[datetime.datetime] = [ # type:
ignore[no-redef]
+pod_start_time
+if (
+pod_start_time := pod.to_dict()
+.get("metadata", {})
+.get("creation_timestamp",
datetime.datetime.now(tz=datetime.timezone.utc))
+)
+else datetime.datetime.now(tz=datetime.timezone.utc)
Review Comment:
If `creation_timestamp` is not in `metadata`, it will return
`datetime.datetime.now()` due to the `.get(..., datetime.datetime.now(...)).` -
not due to the `if`.
The `else` statement seems to be unreachable (because the latter `get`
always returns not `None`).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2678792511
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list:
list[k8s.V1Pod]) -> int:
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
]
+if not all(pod_start_times):
+pod_start_times: list[datetime.datetime] = [ # type:
ignore[no-redef]
+pod_start_time
+if (
+pod_start_time := pod.to_dict()
+.get("metadata", {})
+.get("creation_timestamp",
datetime.datetime.now(tz=datetime.timezone.utc))
+)
+else datetime.datetime.now(tz=datetime.timezone.utc)
Review Comment:
If `creation_timestamp` is not in `metadata`, it will return `datetime.now`
due to the `.get(..., datetime.datetime.now(...)`).` - not due to the `if`.
The `else` statement seems to be unreachable (because the latter `get`
always returns not `None`).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
shahar1 commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2678782503
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list:
list[k8s.V1Pod]) -> int:
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
Review Comment:
Generally I agree with TP, but in any case it seems out of the scope of this
PR :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
uranusjr commented on code in PR #59097: URL: https://github.com/apache/airflow/pull/59097#discussion_r2674307959 ## providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py: ## @@ -2199,6 +2199,83 @@ def test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod process_pod_deletion_mock.assert_not_called() assert result.metadata.name == pod_2.metadata.name [email protected]( +"on_finish_action", [OnFinishAction.KEEP_POD, OnFinishAction.DELETE_SUCCEEDED_POD] +) +@patch(KUB_OP_PATH.format("patch_already_checked")) +@patch(KUB_OP_PATH.format("process_pod_deletion")) +def test_process_duplicate_label_pods_with_start_time_none( +self, +process_pod_deletion_mock, +patch_already_checked_mock, +on_finish_action, +): +now = datetime.datetime.now() +k = KubernetesPodOperator( +namespace="default", +image="ubuntu:22.04", +cmds=["bash", "-cx"], +arguments=["echo 12"], +name="test", +task_id="task", +do_xcom_push=False, +reattach_on_restart=False, +on_finish_action=on_finish_action, +) +context = create_context(k) +pod_1 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) +pod_2 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) + +pod_1.status = {"start_time": None} +pod_2.status = {"start_time": None} Review Comment: I’d argue other tests should use this pattern too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
uranusjr commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2674306609
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list:
list[k8s.V1Pod]) -> int:
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
]
+if not all(pod_start_times):
+pod_start_times: list[datetime.datetime] = [ # type:
ignore[no-redef]
+pod_start_time
+if (
+pod_start_time := pod.to_dict()
+.get("metadata", {})
+.get("creation_timestamp",
datetime.datetime.now(tz=datetime.timezone.utc))
+)
+else datetime.datetime.now(tz=datetime.timezone.utc)
Review Comment:
Do you mean the value to `"create_timestamp"` could contain None? That does
not feel right to me.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
uranusjr commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2674304917
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list:
list[k8s.V1Pod]) -> int:
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
Review Comment:
Do you mean `status` should always be available? In that case you should not
use `get()` but `["status"]` instead.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Crowiant commented on PR #59097: URL: https://github.com/apache/airflow/pull/59097#issuecomment-3725927163 > Not my area of expertise either (yet) - I've assigned Copilot for an initial review, so please address its commentary as well. However, my 2 cents, if I understand these changes correctly - the `if` block makes the comparison based on a `creation_timestamp`, which is clearly not the `start_time`. Therefore, assigning the `creation_timestamp` to `pod_start_times` in the `if` statement is rather misleading. In that case, why not making the entire comparison based on `creation_timestamp`, replacing the `start_time` completely? Alternatively, if comparison based on `start_time` is useful for most cases, maybe you could introduce a flag for the user so they'll configure the comparison method? > > I'll be happy for an additional review from k8s owners. Hello @shahar1 thank you for your comment! My thought was about extending current logic not changing it completely. In my opinion if start_time happens to be None it should not lead to the error that a user should resolve manually. The recent pod could be defined based on the creation_timestamp. I don´t see it as a different way that a user should define in the operator because at the end the code only specifies which of the identical pods is most recent. WDYT? Also, if it is possible @jscheffl could you please help with the review of the PR? Or advise someone who has expertise to review and approve this PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Crowiant commented on code in PR #59097: URL: https://github.com/apache/airflow/pull/59097#discussion_r2673884597 ## providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py: ## @@ -2199,6 +2199,83 @@ def test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod process_pod_deletion_mock.assert_not_called() assert result.metadata.name == pod_2.metadata.name [email protected]( +"on_finish_action", [OnFinishAction.KEEP_POD, OnFinishAction.DELETE_SUCCEEDED_POD] +) +@patch(KUB_OP_PATH.format("patch_already_checked")) +@patch(KUB_OP_PATH.format("process_pod_deletion")) +def test_process_duplicate_label_pods_with_start_time_none( +self, +process_pod_deletion_mock, +patch_already_checked_mock, +on_finish_action, +): +now = datetime.datetime.now() +k = KubernetesPodOperator( +namespace="default", +image="ubuntu:22.04", +cmds=["bash", "-cx"], +arguments=["echo 12"], +name="test", +task_id="task", +do_xcom_push=False, +reattach_on_restart=False, +on_finish_action=on_finish_action, +) +context = create_context(k) +pod_1 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) +pod_2 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) + +pod_1.status = {"start_time": None} +pod_2.status = {"start_time": None} +pod_1.metadata.creation_timestamp = now +pod_2.metadata.creation_timestamp = now + datetime.timedelta(seconds=60) +pod_2.metadata.labels.update({"try_number": "2"}) + +result = k.process_duplicate_label_pods([pod_1, pod_2]) + +patch_already_checked_mock.assert_called_once_with(pod_1, reraise=False) +process_pod_deletion_mock.assert_not_called() +assert result.metadata.name == pod_2.metadata.name + [email protected]( +"on_finish_action", [OnFinishAction.KEEP_POD, OnFinishAction.DELETE_SUCCEEDED_POD] +) +@patch(KUB_OP_PATH.format("patch_already_checked")) +@patch(KUB_OP_PATH.format("process_pod_deletion")) +def test_process_duplicate_label_pods_with_no_start_time_or_creation_timestamp( +self, +process_pod_deletion_mock, +patch_already_checked_mock, +on_finish_action, +): +k = KubernetesPodOperator( +namespace="default", +image="ubuntu:22.04", +cmds=["bash", "-cx"], +arguments=["echo 12"], +name="test", +task_id="task", +do_xcom_push=False, +reattach_on_restart=False, +on_finish_action=on_finish_action, +) +context = create_context(k) +pod_1 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) +pod_2 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) + +pod_1.status = {"start_time": None} +pod_2.status = {"start_time": None} Review Comment: Previous tests use the same way to define the status field. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Crowiant commented on code in PR #59097: URL: https://github.com/apache/airflow/pull/59097#discussion_r2673883326 ## providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py: ## @@ -2199,6 +2199,83 @@ def test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod process_pod_deletion_mock.assert_not_called() assert result.metadata.name == pod_2.metadata.name [email protected]( +"on_finish_action", [OnFinishAction.KEEP_POD, OnFinishAction.DELETE_SUCCEEDED_POD] +) +@patch(KUB_OP_PATH.format("patch_already_checked")) +@patch(KUB_OP_PATH.format("process_pod_deletion")) +def test_process_duplicate_label_pods_with_start_time_none( +self, +process_pod_deletion_mock, +patch_already_checked_mock, +on_finish_action, +): +now = datetime.datetime.now() +k = KubernetesPodOperator( +namespace="default", +image="ubuntu:22.04", +cmds=["bash", "-cx"], +arguments=["echo 12"], +name="test", +task_id="task", +do_xcom_push=False, +reattach_on_restart=False, +on_finish_action=on_finish_action, +) +context = create_context(k) +pod_1 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) +pod_2 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) + +pod_1.status = {"start_time": None} +pod_2.status = {"start_time": None} Review Comment: Previous tests use the same way to define the status. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Crowiant commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2673878213
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list:
list[k8s.V1Pod]) -> int:
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
Review Comment:
I don´t think that this case is relevant, because the status field should be
handled by Kubernetes API.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Crowiant commented on code in PR #59097:
URL: https://github.com/apache/airflow/pull/59097#discussion_r2673872633
##
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py:
##
@@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list:
list[k8s.V1Pod]) -> int:
pod_start_times: list[datetime.datetime] = [
pod.to_dict().get("status").get("start_time") for pod in pod_list
]
+if not all(pod_start_times):
+pod_start_times: list[datetime.datetime] = [ # type:
ignore[no-redef]
+pod_start_time
+if (
+pod_start_time := pod.to_dict()
+.get("metadata", {})
+.get("creation_timestamp",
datetime.datetime.now(tz=datetime.timezone.utc))
+)
+else datetime.datetime.now(tz=datetime.timezone.utc)
Review Comment:
No, because .get("creation_timestamp",
datetime.datetime.now(tz=datetime.timezone.utc) will return datetime.now only
in case when creation_timestamp field is not presented in the metadata. For
this cases else statement in the line 1401 was added.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Crowiant commented on PR #59097: URL: https://github.com/apache/airflow/pull/59097#issuecomment-3660924200 Hello @hussein-awala @jedcunningham could you please help with the review of the PR? Thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Crowiant commented on PR #59097: URL: https://github.com/apache/airflow/pull/59097#issuecomment-3660916235 Hello @potiuk ! Yes, my mistake, I checked manually and missed some of the PR´s reviewed by the code owners. I apologize for the false statement. Thank you for pointing it out. I will ping them in a separate comment. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
potiuk commented on PR #59097: URL: https://github.com/apache/airflow/pull/59097#issuecomment-3660077261 > While I understand your point, please note that the auto-assigned reviewers haven't reviewed any PRs in the last 6 months. I tagged you because the standard process didn't seem to be working. I am not sure where you get that statistics from, but maybe try to ping those auto-assigned reviewers. You have not tried it yet and seems that they are best suited. Maybe they don't realise you wait for their review? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Crowiant commented on PR #59097: URL: https://github.com/apache/airflow/pull/59097#issuecomment-3656867285 Hello @potiuk , @shahar1 ! Thank you for the comments. I tagged you because I saw that Jarek previously participated in Kubernetes provider PRs approvals, and Shahar approved my previous PR (#49899). While I understand your point, please note that the auto-assigned reviewers haven't reviewed any PRs in the last 6 months. I tagged you because the standard process didn't seem to be working. Since I mostly contribute to the Google provider, I assumed tagging active maintainers was the right move to unblock the PR. I wasn't trying to be disruptive, just trying to get the code reviewed. So I apologize if I didn't follow the specific protocol for Kubernetes provider. I'll keep this in mind for next time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Copilot commented on code in PR #59097: URL: https://github.com/apache/airflow/pull/59097#discussion_r2615172777 ## providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py: ## @@ -2199,6 +2199,83 @@ def test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod process_pod_deletion_mock.assert_not_called() assert result.metadata.name == pod_2.metadata.name [email protected]( +"on_finish_action", [OnFinishAction.KEEP_POD, OnFinishAction.DELETE_SUCCEEDED_POD] +) +@patch(KUB_OP_PATH.format("patch_already_checked")) +@patch(KUB_OP_PATH.format("process_pod_deletion")) +def test_process_duplicate_label_pods_with_start_time_none( Review Comment: The test name `test_process_duplicate_label_pods_with_start_time_none` could be more descriptive. Consider renaming to `test_process_duplicate_label_pods_falls_back_to_creation_timestamp_when_start_time_is_none` to make it clearer what behavior is being tested. ```suggestion def test_process_duplicate_label_pods_falls_back_to_creation_timestamp_when_start_time_is_none( ``` ## providers/cncf/kubernetes/tests/unit/cncf/kubernetes/operators/test_pod.py: ## @@ -2199,6 +2199,83 @@ def test_process_duplicate_label_pods__label_patched_if_action_is_not_delete_pod process_pod_deletion_mock.assert_not_called() assert result.metadata.name == pod_2.metadata.name [email protected]( +"on_finish_action", [OnFinishAction.KEEP_POD, OnFinishAction.DELETE_SUCCEEDED_POD] +) +@patch(KUB_OP_PATH.format("patch_already_checked")) +@patch(KUB_OP_PATH.format("process_pod_deletion")) +def test_process_duplicate_label_pods_with_start_time_none( +self, +process_pod_deletion_mock, +patch_already_checked_mock, +on_finish_action, +): +now = datetime.datetime.now() +k = KubernetesPodOperator( +namespace="default", +image="ubuntu:22.04", +cmds=["bash", "-cx"], +arguments=["echo 12"], +name="test", +task_id="task", +do_xcom_push=False, +reattach_on_restart=False, +on_finish_action=on_finish_action, +) +context = create_context(k) +pod_1 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) +pod_2 = k.get_or_create_pod(pod_request_obj=k.build_pod_request_obj(context), context=context) + +pod_1.status = {"start_time": None} +pod_2.status = {"start_time": None} Review Comment: Setting `pod.status` to a plain dictionary may not correctly mock the V1Pod object's status attribute. The kubernetes client's V1Pod expects a V1PodStatus object. Consider using `V1PodStatus(start_time=None)` instead to ensure the test accurately represents the actual API behavior when serialized with `to_dict()`. ```suggestion pod_1.status = V1PodStatus(start_time=None) pod_2.status = V1PodStatus(start_time=None) ``` ## providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py: ## @@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list: list[k8s.V1Pod]) -> int: pod_start_times: list[datetime.datetime] = [ pod.to_dict().get("status").get("start_time") for pod in pod_list Review Comment: This code will raise an AttributeError if `pod.to_dict().get("status")` returns None, since you cannot call `.get("start_time")` on None. The code should use chained get() calls with a default dictionary to handle the case where status is None, for example: `pod.to_dict().get("status", {}).get("start_time")`. ```suggestion pod.to_dict().get("status", {}).get("start_time") for pod in pod_list ``` ## providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py: ## @@ -1390,6 +1390,17 @@ def _get_most_recent_pod_index(pod_list: list[k8s.V1Pod]) -> int: pod_start_times: list[datetime.datetime] = [ pod.to_dict().get("status").get("start_time") for pod in pod_list ] +if not all(pod_start_times): +pod_start_times: list[datetime.datetime] = [ # type: ignore[no-redef] +pod_start_time +if ( +pod_start_time := pod.to_dict() +.get("metadata", {}) +.get("creation_timestamp", datetime.datetime.now(tz=datetime.timezone.utc)) +) +else datetime.datetime.now(tz=datetime.timezone.utc) Review Comment: The fallback logic has a flaw. The walrus operator on line 1397 assigns the result of `.get("creation_timestamp", datetime.datetime.now(tz=datetime.timezone.utc))`, which will always return a non-None value due to the default argument in get(). This means the condition
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
potiuk commented on PR #59097: URL: https://github.com/apache/airflow/pull/59097#issuecomment-3638357641 Not sure why you are calling me - that's not my area of expertise. Generally what we prefer is when you are pinging to attract maintainer to review - ping "in general" "Hey can I get a review" - generally speaking when I answer to your ping now that "this is not my area of expertise" - this will significantly decrease your chances of getting help from some other maintainer, because they will see I was involved (and they will not even see that I responded "not my area of interest". But .. you made your choice. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Prevent transient error in case when Pod start_time parameter is None [airflow]
Crowiant commented on PR #59097: URL: https://github.com/apache/airflow/pull/59097#issuecomment-3636528976 Hello @potiuk @shahar1 could you please review the PR? Thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
