berglh commented on issue #44994:
URL: https://github.com/apache/airflow/issues/44994#issuecomment-2684110502
Hi there,
We also hit this problem after upgrading to Airflow v2.10.5 on the v1.15.0
official Helm chart. We then downgraded the CNCF Kubernetes provider to
version 8.3.3 in our Airflow container build, so that it is the same version
as in Airflow v2.9.3 (the default Airflow version for the v1.15.0 Helm
chart), and it appears to work again.
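To confirm which provider version actually ended up in the container after a downgrade like this, something along these lines may help. It is only a sketch: the helper name `installed_provider_version` is made up for illustration, and it relies on the provider being installed under its PyPI distribution name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# PyPI distribution name of the CNCF Kubernetes provider.
PROVIDER_DIST = "apache-airflow-providers-cncf-kubernetes"


def installed_provider_version(dist_name: str = PROVIDER_DIST) -> Optional[str]:
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; not part of Airflow.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_provider_version()
    print(f"{PROVIDER_DIST}: {found or 'not installed'}")
```

Running this inside the Airflow image should print the provider version that the scheduler and workers will import.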
We hit the same matching-label error as @osintalex.
However, when we set `reattach_on_restart=False` in the KubernetesJobOperator,
we also hit the following error:
```
HTTP response body:
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"axs-raw-people-count-5kjj2b7k\" is invalid: spec.containers[0].image: Required value","reason":"Invalid","details":{"name":"axs-raw-people-count-5kjj2b7k","kind":"Pod","causes":[{"reason":"FieldValueRequired","message":"Required value","field":"spec.containers[0].image"}]},"code":422
```
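For anyone trying to read that response, here is a rough sketch that pulls the failing field out of the Status body. The function name `summarize_causes` is made up for illustration, and the closing brace on the JSON is added on the assumption that it was cut off when the log was pasted.

```python
import json
from typing import List, Tuple

# Response body from the log above. The final "}" is an assumption: the
# pasted log ends at "code":422 without it.
STATUS_BODY = r'''{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"axs-raw-people-count-5kjj2b7k\" is invalid: spec.containers[0].image: Required value","reason":"Invalid","details":{"name":"axs-raw-people-count-5kjj2b7k","kind":"Pod","causes":[{"reason":"FieldValueRequired","message":"Required value","field":"spec.containers[0].image"}]},"code":422}'''


def summarize_causes(body: str) -> List[Tuple[str, str]]:
    """Return (field, reason) pairs from a Kubernetes Status response body."""
    status = json.loads(body)
    causes = status.get("details", {}).get("causes", [])
    return [(c.get("field", ""), c.get("reason", "")) for c in causes]


if __name__ == "__main__":
    for field, reason in summarize_causes(STATUS_BODY):
        print(f"{field}: {reason}")
```

The single cause points at `spec.containers[0].image` on a Pod object, which suggests the request that failed was a Pod create call and not the Job itself.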
Comparing the `execute()` method of the KubernetesJobOperator class between
the two versions, we can see that there were some significant changes:
### Version 8.3.3
https://github.com/apache/airflow/blob/providers-cncf-kubernetes/8.3.3/airflow/providers/cncf/kubernetes/operators/job.py#L147-L178
### Version 10.1.0
https://github.com/apache/airflow/blob/providers-cncf-kubernetes/10.1.0/providers/src/airflow/providers/cncf/kubernetes/operators/job.py#L150-L206
The file path changed between the two provider tags, and along the way the
following code was added:
https://github.com/apache/airflow/blob/providers-cncf-kubernetes/10.1.0/providers/src/airflow/providers/cncf/kubernetes/operators/job.py#L170-L174
There is some additional logic in the changes, but I think the
KubernetesJobOperator should not be setting the `self.pod` attribute at all.
My guess is our secondary error is caused by the fact that the
KubernetesJobOperator extends the KubernetesPodOperator class, and now it
tries to inspect the KubernetesJobOperator's Pod and manifest structure as
though it were a KubernetesPodOperator.
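To make that guess concrete, here is a toy model of the failure mode. None of this is Airflow code: `ToyPodOperator`, `ToyJobOperator`, and the `also_create_pod` flag are hypothetical stand-ins, and the sketch assumes the image is only defined in the job template and not on the operator itself.

```python
class ToyPodOperator:
    """Stand-in for a pod-style operator: it expects a container image."""

    def __init__(self, name, image=None):
        self.name = name
        self.image = image

    def build_pod_spec(self):
        return {"name": self.name, "containers": [{"image": self.image}]}

    def create_pod(self):
        spec = self.build_pod_spec()
        # Mimic the API server's validation of spec.containers[0].image.
        if not spec["containers"][0]["image"]:
            raise ValueError("spec.containers[0].image: Required value")
        return spec


class ToyJobOperator(ToyPodOperator):
    """Stand-in for a job-style operator whose image lives in a job template."""

    def __init__(self, name, job_template):
        super().__init__(name)  # no top-level image is set
        self.job_template = job_template

    def execute(self, also_create_pod):
        job = {"name": self.name, "template": self.job_template}
        if also_create_pod:
            # Inherited pod path: it reads self.image, not the job template.
            self.create_pod()
        return job


if __name__ == "__main__":
    op = ToyJobOperator("demo", {"containers": [{"image": "busybox"}]})
    op.execute(also_create_pod=False)  # job-only path succeeds
    try:
        op.execute(also_create_pod=True)
    except ValueError as err:
        print(f"pod path failed: {err}")
```

In this toy version the job-only path works, while the inherited pod path fails with the same kind of "image required" message, because the pod spec is built from attributes the job operator never populated.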
In my naive view, we shouldn't care about tracking the Pods in a Job, as
that's what the Job object is meant to do in Kubernetes. I'm not sure if that's
an intended change, but this clearly looks like a bug introduced by commit
170b9ce. It seems like this code block may have been inadvertently copied from
KubernetesPodOperator when adding additional logic. My suggestion is that we
just remove this block altogether to rectify the issue, if it's superfluous to
the KubernetesJobOperator.