berglh commented on issue #44994:
URL: https://github.com/apache/airflow/issues/44994#issuecomment-2684110502

   Hi there,
   
   We also hit this problem after upgrading to Airflow v2.10.5 on the v1.15.0 
official Helm chart. We then downgraded the CNCF Kubernetes provider to 
version 8.3.3 in our Airflow container build, so that it matches the version 
in Airflow v2.9.3 (the default Airflow version for the v1.15.0 Helm chart), 
and it appears to work again.
   
   We hit the same matching-label error as @osintalex. However, when we set 
`reattach_on_restart=False` on the KubernetesJobOperator, we also hit the 
following error:
   
   ```
   HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"axs-raw-people-count-5kjj2b7k\" is invalid: spec.containers[0].image: Required value","reason":"Invalid","details":{"name":"axs-raw-people-count-5kjj2b7k","kind":"Pod","causes":[{"reason":"FieldValueRequired","message":"Required value","field":"spec.containers[0].image"}]},"code":422
   ```
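
   To make the error easier to read, here is a minimal sketch that pulls the 
failing field out of the response body using only the standard library. The 
helper name `summarize_status` is hypothetical, and the closing brace on the 
body is an assumption, since the pasted body appears to be cut off after 
`"code":422`.

```python
import json

# Response body from the 422 error above. The final closing brace is an
# assumption: the pasted body appears to be cut off after "code":422.
body = (
    '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    '"message":"Pod \\"axs-raw-people-count-5kjj2b7k\\" is invalid: '
    'spec.containers[0].image: Required value","reason":"Invalid",'
    '"details":{"name":"axs-raw-people-count-5kjj2b7k","kind":"Pod",'
    '"causes":[{"reason":"FieldValueRequired","message":"Required value",'
    '"field":"spec.containers[0].image"}]},"code":422}'
)


def summarize_status(raw: str) -> list[tuple[str, str]]:
    """Return (field, reason) pairs from a Kubernetes Status body.

    Hypothetical helper for illustration; not part of Airflow or the
    Kubernetes client.
    """
    status = json.loads(raw)
    causes = status.get("details", {}).get("causes", [])
    return [(c.get("field", ""), c.get("reason", "")) for c in causes]


# The kind of object the API server rejected.
status_kind = json.loads(body)["details"]["kind"]
failures = summarize_status(body)
print(status_kind, failures)
```

   The point of interest is that the rejected object is of kind `Pod`, not 
`Job`, even though the task is a KubernetesJobOperator.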
   
   Comparing the two `execute()` methods on the KubernetesJobOperator class, 
we can see that there were some significant changes:
   
   ### Version 8.3.3
   
   
https://github.com/apache/airflow/blob/providers-cncf-kubernetes/8.3.3/airflow/providers/cncf/kubernetes/operators/job.py#L147-L178
   
   ### Version 10.1.0
   
   
https://github.com/apache/airflow/blob/providers-cncf-kubernetes/10.1.0/providers/src/airflow/providers/cncf/kubernetes/operators/job.py#L150-L206
   
   We can see that the file paths changed between the provider tags, and that 
the following code was added in the process:
   
   
https://github.com/apache/airflow/blob/providers-cncf-kubernetes/10.1.0/providers/src/airflow/providers/cncf/kubernetes/operators/job.py#L170-L174
   
   There is some additional logic in the changes, but I think the 
KubernetesJobOperator should not be setting the `self.pod` attribute at all. 
My guess is that our secondary error is caused by the fact that the 
KubernetesJobOperator extends the KubernetesPodOperator class, and it now 
tries to inspect the KubernetesJobOperator's Pod and manifest structure as 
though it were a KubernetesPodOperator.
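
   A rough, self-contained sketch of how this kind of inheritance problem 
could produce the 422 above. None of these classes are Airflow code: 
`PodOperatorSketch`, `JobOperatorSketch` and `missing_required_fields` are 
hypothetical stand-ins, and the sketch assumes the container image is supplied 
through a job template rather than through the operator's own `image` argument.

```python
class PodOperatorSketch:
    """Hypothetical stand-in for a pod-oriented operator."""

    def __init__(self, name, image=None):
        self.name = name
        self.image = image

    def build_pod_request_obj(self):
        # Builds a Pod manifest from the operator's own attributes only.
        return {
            "kind": "Pod",
            "metadata": {"name": self.name},
            "spec": {"containers": [{"name": "base", "image": self.image}]},
        }


class JobOperatorSketch(PodOperatorSketch):
    """Hypothetical stand-in for a job operator that extends the pod one."""

    def __init__(self, name, pod_template):
        # Assumption: the image lives in the template, so `image` stays None.
        super().__init__(name=name)
        self.pod_template = pod_template

    def build_job_request_obj(self):
        # The Job carries the complete pod template, including the image.
        return {
            "kind": "Job",
            "metadata": {"name": self.name},
            "spec": {"template": self.pod_template},
        }


def missing_required_fields(pod):
    """Mimic the API server check that rejects containers without an image."""
    return [
        f"spec.containers[{i}].image"
        for i, container in enumerate(pod["spec"]["containers"])
        if not container.get("image")
    ]


template = {"spec": {"containers": [{"name": "base", "image": "busybox"}]}}
op = JobOperatorSketch(name="axs-raw-people-count", pod_template=template)

job = op.build_job_request_obj()   # valid: image comes from the template
pod = op.build_pod_request_obj()   # inherited pod path: image is None
missing = missing_required_fields(pod)
print(missing)
```

   In this sketch the Job manifest is complete, but the inherited pod-building 
path knows nothing about the template, so any code that treats the job 
operator as a pod operator ends up with a Pod that has no image.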
   
   In my naive view, we shouldn't care about tracking the Pods in a Job, as 
that's what the Job object is meant to do in Kubernetes. I'm not sure whether 
this is an intended change, but it clearly looks like a bug introduced by 
commit 170b9ce. It seems like this code block may have been inadvertently 
copied from KubernetesJobOperator when adding additional logic. My suggestion 
is that we just remove this block altogether to rectify the issue, if it's 
superfluous to the KubernetesJobOperator.

