JeremieDoctrine opened a new issue #20690:
URL: https://github.com/apache/airflow/issues/20690
### Apache Airflow version
2.2.2
### What happened

We are running Airflow 2.2.2 with the KubernetesExecutor and `multiNamespaceMode: true`. When the scheduler restarts, it tries to adopt the existing worker pods and fails. Below is the error:
```
[2022-01-05 16:09:48,531] {kubernetes_executor.py:730} INFO - Attempting to adopt pod task_name
[2022-01-05 16:09:48,552] {kubernetes_executor.py:741} INFO - Failed to adopt pod task_name. Reason: (422)
Reason: Unprocessable Entity
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'abd29dca-ee1b-4ab4-990b-cc7a9bb2fa1d', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '10aaeb70-dcb7-4f10-a829-3cb9c0af2c8d', 'X-Kubernetes-Pf-Prioritylevel-Uid': '5ff16932-da39-41e3-b9da-590e6c736d0f', 'Date': 'Wed, 05 Jan 2022 16:09:48 GMT', 'Transfer-Encoding': 'chunked'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"task_name\" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)\n core.PodSpec
```
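
For context, pod adoption in the Kubernetes executor boils down to re-labelling each orphaned worker pod with the new scheduler's job id and patching it back through the API. The sketch below is a simplified approximation of what `kubernetes_executor.py` does in 2.2.x, not the exact source; the label selector and job id are illustrative. The 422 means the API server interpreted part of the patch body as a forbidden `spec` mutation.

```python
# Simplified sketch of the adoption step (approximates kubernetes_executor.py
# in 2.2.x; names and the label selector are illustrative, not the source).
from kubernetes import client, config
from kubernetes.client.rest import ApiException

config.load_incluster_config()  # the scheduler runs inside the cluster
kube = client.CoreV1Api()

new_scheduler_job_id = "123"  # hypothetical id of the restarted scheduler

# With multiNamespaceMode the scheduler looks for orphaned worker pods
# across all namespaces, then tries to take ownership of each one.
pods = kube.list_pod_for_all_namespaces(label_selector="airflow-worker").items
for pod in pods:
    pod.metadata.labels["airflow-worker"] = new_scheduler_job_id
    try:
        # The patch body is the whole pod object read back from the API; if
        # the API server sees any disallowed change under `spec`, it rejects
        # the patch with the 422 shown in the log above.
        kube.patch_namespaced_pod(
            name=pod.metadata.name,
            namespace=pod.metadata.namespace,
            body=pod,
        )
    except ApiException as e:
        print(f"Failed to adopt pod {pod.metadata.name}. Reason: {e}")
```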
### What you expected to happen

The scheduler should adopt the orphaned worker pods after a restart without errors.

### How to reproduce
_No response_
### Operating System
Debian GNU/Linux 10 (buster)
### Versions of Apache Airflow Providers
_No response_
### Deployment
Other 3rd-party Helm chart
### Deployment details
```
executor: KubernetesExecutor
config:
  core:
    sql_alchemy_conn: 'postgresql+psycopg2://${db_user}:${db_password}@${db_host}/${database_name}'
    max_active_tasks_per_dag: 50
    parallelism: 50
    donot_pickle: False
    enable_xcom_pickling: True
    default_timezone: "Europe/Paris"
    load_default_connections: False
    max_queued_run_per_dag: 16
    dags_folder: /opt/airflow/dags/repo/dags
    plugins_folder: /opt/airflow/dags/repo/plugins
  scheduler:
    catchup_by_default: False
  kubernetes:
    enable_tcp_keepalive: True
    delete_worker_pods_on_failure: True
  operators:
    default_queue: masters
  logging:
    remote_logging: "true"
    remote_base_log_folder: "s3://${log_folder}"
    remote_log_conn_id: s3_log_conn
    encrypt_s3_logs: False
  webserver:
    authenticate: True
    auth_backend: airflow.contrib.auth.backends.google_auth
    base_url: "https://${host}"
    enable_proxy_fix: True
    rbac: false
  api:
    auth_backend: airflow.api.auth.backend.basic_auth
images:
  airflow:
    tag: ${tag}
    repository: ***REDACTED***.dkr.ecr.eu-central-1.amazonaws.com/${repository}
  gitSync:
    repository: k8s.gcr.io/git-sync/git-sync
    tag: v3.3.5
redis:
  enabled: False
postgresql:
  enabled: False
data:
  metadataConnection:
    user: ${db_user}
    pass: ${db_password}
    protocol: postgresql
    host: ${db_host}
    port: 5432
    db: ${database_name}
    sslmode: disable
  resultBackendConnection:
    user: ${db_user}
    pass: ${db_password}
    protocol: postgresql
    host: ${db_host}
    port: 5432
    db: ${database_name}
    sslmode: disable
fernetKey: '***REDACTED***'
pgbouncer:
  enabled: true
  maxClientConn: 1000
  resources:
    limits:
      cpu: 0.5
      memory: 128Mi
    requests:
      cpu: 0.4
      memory: 128Mi
  podDisruptionBudget:
    enabled: true
    config:
      maxUnavailable: 0
extraEnv: |
  - name: PYTHONPATH
    value: /opt/airflow/dags/repo
  - name: SLACK_API_TOKEN
    valueFrom:
      secretKeyRef:
        name: airflow-connections-secrets
        key: slack_api_token
  - name: SLACK_WEBHOOK_URI
    valueFrom:
      secretKeyRef:
        name: airflow-connections-secrets
        key: slack_webhook_uri
  - name: ROLLBAR_WEBHOOK_TOKEN
    valueFrom:
      secretKeyRef:
        name: airflow-connections-secrets
        key: rollbar_webhook_token
  - name: AIRFLOW_CONN_S3_LOG_CONN
    valueFrom:
      secretKeyRef:
        name: airflow-connections-secrets
        key: s3_log_conn
  - name: AIRFLOW_CONN_S3_CONN
    valueFrom:
      secretKeyRef:
        name: airflow-connections-secrets
        key: s3_conn
  - name: AIRFLOW__GOOGLE__CLIENT_ID
    valueFrom:
      secretKeyRef:
        name: airflow-connections-secrets
        key: google_auth_client_id
  - name: AIRFLOW__GOOGLE__CLIENT_SECRET
    valueFrom:
      secretKeyRef:
        name: airflow-connections-secrets
        key: google_auth_client_secret
  - name: AIRFLOW__CORE__FERNET_KEY
    valueFrom:
      secretKeyRef:
        name: airflow-connections-secrets
        key: fernet_key
webserver:
  webserverConfig: |
    from flask_appbuilder.security.manager import AUTH_OAUTH
    import os
    AUTH_TYPE = AUTH_OAUTH
    # Uncomment to setup Full admin role name
    # AUTH_ROLE_ADMIN = 'Admin'
    # Uncomment to setup Public role name, no authentication needed
    # AUTH_ROLE_PUBLIC = 'Public'
    # Will allow user self registration
    AUTH_USER_REGISTRATION = True
    # The default user self registration role
    AUTH_USER_REGISTRATION_ROLE = "Admin"
    print("#" * 5)
    print("USING GOOGLE AUTH")
    OAUTH_PROVIDERS = [
        {
            "name": "google",
            "whitelist": ["***REDACTED***"],
            "icon": "fa-google",
            "token_key": "access_token",
            "remote_app": {
                "client_id": os.environ["AIRFLOW__GOOGLE__CLIENT_ID"],
                "client_secret": os.environ["AIRFLOW__GOOGLE__CLIENT_SECRET"],
                "api_base_url": "https://www.googleapis.com/oauth2/v2/",
                "client_kwargs": {"scope": "email profile"},
                "request_token_url": None,
                "access_token_url": "https://oauth2.googleapis.com/token",
                "authorize_url": "https://accounts.google.com/o/oauth2/auth",
            },
        }
    ]
  resources:
    limits:
      cpu: 0.5
      memory: "2G"
    requests:
      cpu: 0.5
      memory: "2G"
ingress:
  enabled: true
  web:
    annotations: {
      "alb.ingress.kubernetes.io/actions.redirect": "***REDACTED***",
      "alb.ingress.kubernetes.io/certificate-arn": "***REDACTED***",
      "alb.ingress.kubernetes.io/listen-ports": "[{\"HTTP\": 80}, {\"HTTPS\":443}]",
      "alb.ingress.kubernetes.io/scheme": "internal",
      "alb.ingress.kubernetes.io/target-type": "ip",
      "kubernetes.io/ingress.class": "alb"
    }
    path: "/*"
    pathType: "ImplementationSpecific"
    host: "${host}"
    # hosts: ["${host}"]
    # ingressClassName: "alb"
    tls:
      enabled: false
      secretName: ""
    precedingPaths: []
    succeedingPaths: []
workers:
  resources:
    limits:
      cpu: 0.5
      memory: "1024M"
    requests:
      cpu: 0.25
      memory: "128Mi"
scheduler:
  replicas: 1
  podDisruptionBudget:
    enabled: true
    config:
      maxUnavailable: 0
multiNamespaceMode: true
logs:
  persistence:
    enabled: true
    storageClassName: efs-sc
airflowPodAnnotations: {'cluster-autoscaler.kubernetes.io/safe-to-evict': 'false', 'kubernetes.io/psp': 'eks.privileged'}
dags:
  persistence:
    # Enable persistent volume for storing dags
    enabled: false
    # Volume size for dags
    size: 1Gi
    # access mode of the persistent volume
    accessMode: ReadWriteMany
  gitSync:
    enabled: true
    repo: ***REDACTED***
    branch: master
    subPath: ""
    rev: HEAD
    depth: 1
    sshKeySecret: ***REDACTED***
    wait: 60
    uid: 65533
    containerName: git-sync
    knownHosts: ***REDACTED***
```
### Anything else
This looks a bit similar to https://github.com/apache/airflow/issues/20203, except that in our case the error occurs within the same deployment.
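
For reference, the API server restriction itself can be triggered outside Airflow: patching any running pod's `spec` beyond the allowed fields returns the same error. A minimal sketch follows; the pod name and namespace are placeholders, not taken from our deployment.

```python
# Minimal sketch reproducing the API-server rejection independently of
# Airflow; "task_name" and "default" are placeholders for a running pod.
from kubernetes import client, config
from kubernetes.client.rest import ApiException

config.load_kube_config()
kube = client.CoreV1Api()

try:
    # nodeSelector is not in the allowed list (container images,
    # activeDeadlineSeconds, added tolerations), so the API server
    # refuses the update on a running pod.
    kube.patch_namespaced_pod(
        name="task_name",
        namespace="default",
        body={"spec": {"nodeSelector": {"disktype": "ssd"}}},
    )
except ApiException as e:
    print(e.status, e.reason)  # 422 Unprocessable Entity
```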
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)