olivermeyer opened a new issue, #23885:
URL: https://github.com/apache/airflow/issues/23885

   ### Apache Airflow version
   
   2.2.4
   
   ### What happened
   
   I am running Airflow 2.2.4 on Kubernetes, using the KubernetesExecutor. If I re-create the scheduler pod, it attempts to adopt running job pods but fails to do so:
   ```
   [2022-05-10 11:35:23,707] {kubernetes_executor.py:714} INFO - Failed to adopt pod <pod id>. Reason: (422)
   Reason: Unprocessable Entity
   HTTP response headers: HTTPHeaderDict({'Audit-Id': 'ac152b63-74ef-48c2-b4eb-fe5fbc808a56', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '251e299d-3b5d-4c7a-a3d9-46f17a316c93', 'X-Kubernetes-Pf-Prioritylevel-Uid': '66005fda-3f10-4344-8170-8c819dbbf59f', 'Date': 'Tue, 10 May 2022 11:35:23 GMT', 'Transfer-Encoding': 'chunked'})
   HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"<pod id>\" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations) <TRUNCATED>
   ```
   The pods are then killed by the scheduler.
   
   I dug into the code as best I could and found that this _might_ be caused by the `KubernetesExecutor` trying to update the pod's `metadata.labels` [here](https://github.com/apache/airflow/blob/ee9049c0566b2539a247687de05f9cffa008f871/airflow/executors/kubernetes_executor.py#L697-L699), but I could be wrong as I'm not very familiar with this part of Airflow.
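
To narrow down which field the adoption patch actually changes, one option is to diff the pod manifest before and after the patch against the mutable fields listed in the error message. The sketch below is a minimal, hypothetical helper: `MUTABLE_SPEC_PATTERNS`, `diff_paths` and `forbidden_changes` are illustrative names and are not part of Airflow or the Kubernetes client. It assumes both manifests are available as plain dicts (for example, loaded from `kubectl get pod -o json` output), and it does not check the "only additions" rule for tolerations.

```python
import re

# Spec fields the API server's error message lists as mutable on an existing pod.
# Assumption: any change under spec.tolerations is accepted here; the
# "only additions to existing tolerations" rule is not enforced by this sketch.
MUTABLE_SPEC_PATTERNS = [
    r"spec\.containers\[\d+\]\.image",
    r"spec\.initContainers\[\d+\]\.image",
    r"spec\.activeDeadlineSeconds",
    r"spec\.tolerations(\[\d+\].*)?",
]


def diff_paths(old, new, prefix=""):
    """Yield dotted paths of values that differ between two nested structures."""
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(set(old) | set(new)):
            path = f"{prefix}.{key}" if prefix else key
            yield from diff_paths(old.get(key), new.get(key), path)
    elif isinstance(old, list) and isinstance(new, list):
        for i in range(max(len(old), len(new))):
            old_item = old[i] if i < len(old) else None
            new_item = new[i] if i < len(new) else None
            yield from diff_paths(old_item, new_item, f"{prefix}[{i}]")
    elif old != new:
        yield prefix


def forbidden_changes(old_pod, new_pod):
    """Return changed paths under `spec` that are not in the mutable list."""
    return [
        path
        for path in diff_paths(old_pod, new_pod)
        if (path == "spec" or path.startswith("spec."))
        and not any(re.fullmatch(pat, path) for pat in MUTABLE_SPEC_PATTERNS)
    ]
```

With this helper, a patch that only touches `metadata.labels` yields an empty list, while a patch that also alters, say, a container's `env` is reported as `spec.containers[0].env`. That would help tell whether the label update alone can explain the 422 response.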
   
   ### What you think should happen instead
   
   The scheduler should be able to adopt running pods instead of killing them.
   
   ### How to reproduce
   
   * Run Airflow with the KubernetesExecutor
   * Start a long-running task
   * Re-create the scheduler pod
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   This happens when the scheduler pod is re-created while a job pod is running.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


