danielsteman opened a new issue, #24295:
URL: https://github.com/apache/airflow/issues/24295
### Apache Airflow version
2.3.2 (latest released)
### What happened
We deployed the latest version of Airflow on our K8s cluster (AKS) and we
noticed that the webserver is restarted every minute or so. The pod logs don't
show the immediate problem, apart from a particular exception that's being
raised several times by sqlalchemy which is the following:
```
ERROR - Creation of Permission View Error: (psycopg2.errors.UniqueViolation)
duplicate key value violates unique constraint
"ab_permission_view_permission_id_view_menu_id_key"
DETAIL: Key (permission_id, view_menu_id)=(5, 15) already exists.
```
We also noticed that the Google provider couldn't be imported but that
shouln't be an issue since we don't use that. I'm trying to find out what is
causing these restarts and how I can get the webserver to be stable.
These are the pod logs of the webserver:
```
____________ _____________
____ |__( )_________ __/__ /________ __
____ /| |_ /__ ___/_ /_ __ /_ __ \_ | /| / /
___ ___ | / _ / _ __/ _ / / /_/ /_ |/ |/ /
_/_/ |_/_/ /_/ /_/ /_/ \____/____/|__/
Running the Gunicorn Server with:
Workers: 4 sync
Host: 0.0.0.0:8080
Timeout: 120
Logfiles: - -
Access Logformat:
=================================================================
[2022-06-07 15:13:19,965] {webserver_command.py:231} DEBUG - [0 / 4] Some
workers are starting up, waiting...
[2022-06-07 15:13:20,390] {settings.py:508} DEBUG - User session lifetime is
set to 43200 minutes.
[2022-06-07 15:13:20,420] {settings.py:362} DEBUG -
settings.prepare_engine_args(): Using pool settings. pool_size=5,
max_overflow=10, pool_recycle=1800, pid=20
[2022-06-07 15:13:20,435] {logging_config.py:53} DEBUG - Unable to load
custom logging, using default config instead
[2022-06-07 15:13:20,881] {manager.py:585} INFO - Removed Permission menu
access on Permissions to role Admin
[2022-06-07 15:13:20,928] {manager.py:543} INFO - Removed Permission View:
menu_access on Permissions
[2022-06-07 15:13:21,081] {settings.py:508} DEBUG - User session lifetime is
set to 43200 minutes.
[2022-06-07 15:13:21,110] {settings.py:362} DEBUG -
settings.prepare_engine_args(): Using pool settings. pool_size=5,
max_overflow=10, pool_recycle=1800, pid=21
[2022-06-07 15:13:21,115] {manager.py:508} INFO - Created Permission View:
menu access on Permissions
[2022-06-07 15:13:21,123] {logging_config.py:53} DEBUG - Unable to load
custom logging, using default config instead
[2022-06-07 15:13:21,137] {manager.py:568} INFO - Added Permission menu
access on Permissions to role Admin
[2022-06-07 15:13:21,159] {settings.py:508} DEBUG - User session lifetime is
set to 43200 minutes.
[2022-06-07 15:13:21,167] {settings.py:508} DEBUG - User session lifetime is
set to 43200 minutes.
[2022-06-07 15:13:21,199] {settings.py:362} DEBUG -
settings.prepare_engine_args(): Using pool settings. pool_size=5,
max_overflow=10, pool_recycle=1800, pid=23
[2022-06-07 15:13:21,202] {settings.py:362} DEBUG -
settings.prepare_engine_args(): Using pool settings. pool_size=5,
max_overflow=10, pool_recycle=1800, pid=22
[2022-06-07 15:13:21,206] {logging_config.py:53} DEBUG - Unable to load
custom logging, using default config instead
[2022-06-07 15:13:21,211] {logging_config.py:53} DEBUG - Unable to load
custom logging, using default config instead
[2022-06-07 15:13:21,606] {manager.py:585} INFO - Removed Permission menu
access on Permissions to role Admin
[2022-06-07 15:13:21,609] {manager.py:585} INFO - Removed Permission menu
access on Permissions to role Admin
/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py:1461
SAWarning: DELETE statement on table 'ab_permission_view' expected to delete 1
row(s); 0 were matched. Please set confirm_deleted_rows=False within the
mapper configuration to prevent this warning.
[2022-06-07 15:13:21,645] {manager.py:543} INFO - Removed Permission View:
menu_access on Permissions
[2022-06-07 15:13:21,647] {manager.py:543} INFO - Removed Permission View:
menu_access on Permissions
[2022-06-07 15:13:21,818] {manager.py:508} INFO - Created Permission View:
menu access on Permissions
[2022-06-07 15:13:21,819] {manager.py:511} ERROR - Creation of Permission
View Error: (psycopg2.errors.UniqueViolation) duplicate key value violates
unique constraint "ab_permission_view_permission_id_view_menu_id_key"
DETAIL: Key (permission_id, view_menu_id)=(5, 15) already exists.
[SQL: INSERT INTO ab_permission_view (id, permission_id, view_menu_id)
VALUES (nextval('ab_permission_view_id_seq'), %(permission_id)s,
%(view_menu_id)s) RETURNING ab_permission_view.id]
[parameters: {'permission_id': 5, 'view_menu_id': 15}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-06-07 15:13:21,830] {manager.py:568} INFO - Added Permission menu
access on Permissions to role Admin
[2022-06-07 15:13:21,836] {manager.py:570} ERROR - Add Permission to Role
Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique
constraint "ab_permission_view_role_permission_view_id_role_id_key"
DETAIL: Key (permission_view_id, role_id)=(9646, 1) already exists.
[SQL: INSERT INTO ab_permission_view_role (id, permission_view_id, role_id)
VALUES (nextval('ab_permission_view_role_id_seq'), %(permission_view_id)s,
%(role_id)s) RETURNING ab_permission_view_role.id]
[parameters: {'permission_view_id': 9646, 'role_id': 1}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-06-07 15:13:21,942] {manager.py:570} ERROR - Add Permission to Role
Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique
constraint "ab_permission_view_role_permission_view_id_role_id_key"
DETAIL: Key (permission_view_id, role_id)=(9646, 1) already exists.
[SQL: INSERT INTO ab_permission_view_role (id, permission_view_id, role_id)
VALUES (nextval('ab_permission_view_role_id_seq'), %(permission_view_id)s,
%(role_id)s) RETURNING ab_permission_view_role.id]
[parameters: {'permission_view_id': 9646, 'role_id': 1}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
```
### What you think should happen instead
The webserver shouldn't restart that often.
### How to reproduce
Install the official Apache Airflow helm chart (version 1.6.0). The values
used are attached.
### Operating System
Ubuntu 20.04.3 LTS
### Versions of Apache Airflow Providers
2.3.2
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "7"
meta.helm.sh/release-name: airflow2
meta.helm.sh/release-namespace: airflow
creationTimestamp: "2022-06-01T09:15:33Z"
generation: 9
labels:
app.kubernetes.io/managed-by: Helm
chart: airflow-1.6.0
component: webserver
heritage: Helm
release: airflow2
tier: airflow
name: airflow2-webserver
namespace: airflow
resourceVersion: "8081109"
uid: 6d68ebd7-1538-4297-95e7-53dfbeb92fcb
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
component: webserver
release: airflow2
tier: airflow
strategy:
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
type: RollingUpdate
template:
metadata:
annotations:
checksum/airflow-config:
2be5463333925fcbdf9f5f43e4be92639e42a0d83c4056a3297339be78dc1e40
checksum/extra-configmaps:
78ea18063ba5598229865ebb823ea30e8f8f40c7a04e44055ddff7ee4dd4d719
checksum/extra-secrets:
bb91ef06ddc31c0c5a29973832163d8b0b597812a793ef911d33b622bc9d1655
checksum/metadata-secret:
0cb618586cd6fd8f6632ca0acb91fb04833ebbb8a8dd70749fdc445cb20c2ba1
checksum/pgbouncer-config-secret:
942354d328c2108390909975571ee165865e43b20615e7517675b134d618575a
checksum/webserver-config:
4a2281a4e3ed0cc5e89f07aba3c1bb314ea51c17cb5d2b41e9b045054a6b5c72
checksum/webserver-secret-key:
543ddc641c5724aae4527e71048538dd12d3b3e51c57482f8528c15204f3633b
creationTimestamp: null
labels:
component: webserver
release: airflow2
tier: airflow
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- podAffinityTerm:
labelSelector:
matchLabels:
component: webserver
topologyKey: kubernetes.io/hostname
weight: 100
containers:
- args:
- bash
- -c
- exec airflow webserver
env:
- name: AIRFLOW__LOGGING__LOGGING_LEVEL
value: DEBUG
- name:
AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__LOGGING__LOGGING_LEVEL
value: DEBUG
- name: AIRFLOW__WEBSERVER__BASE_URL
value: airflow.analyticslab-dev.azure.asr.nl
- name:
AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__WEBSERVER__BASE_URL
value: airflow.analyticslab-dev.azure.asr.nl
- name: DYNACONF_SQL_CONNECTION_STRING
value:
Server=tcp:iendanalyticssqlserver.database.windows.net,1433;Database=OnlineData;Uid=louis;Pwd=7jNZ30zoAZjnTyAl7DJ0;Encrypt=yes;TrustServerCertificate=no;Connection
Timeout=30;
- name:
AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__DYNACONF_SQL_CONNECTION_STRING
value:
Server=tcp:iendanalyticssqlserver.database.windows.net,1433;Database=OnlineData;Uid=louis;Pwd=7jNZ30zoAZjnTyAl7DJ0;Encrypt=yes;TrustServerCertificate=no;Connection
Timeout=30;
- name: DYNACONF_DATALAKE_ACCOUNT_KEY
value:
8Q9ciq8MtFu+HkfbO0zLGbtSZxZg6PaKU0W90C5r6TyVG6mV9E2aXZ3zv9Ua961pIBJSuIKSjGiJfpaEj6SSHw==
- name:
AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__DYNACONF_DATALAKE_ACCOUNT_KEY
value:
8Q9ciq8MtFu+HkfbO0zLGbtSZxZg6PaKU0W90C5r6TyVG6mV9E2aXZ3zv9Ua961pIBJSuIKSjGiJfpaEj6SSHw==
- name: ENV_FOR_DYNACONF
value: production
- name: AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__ENV_FOR_DYNACONF
value: production
- name: AIRFLOW__CORE__FERNET_KEY
valueFrom:
secretKeyRef:
key: fernet-key
name: airflow2-fernet-key
- name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
valueFrom:
secretKeyRef:
key: connection
name: airflow2-airflow-metadata
- name: AIRFLOW__DATABASE__SQL_ALCHEMY_CONN
valueFrom:
secretKeyRef:
key: connection
name: airflow2-airflow-metadata
- name: AIRFLOW_CONN_AIRFLOW_DB
valueFrom:
secretKeyRef:
key: connection
name: airflow2-airflow-metadata
- name: AIRFLOW__WEBSERVER__SECRET_KEY
valueFrom:
secretKeyRef:
key: webserver-secret-key
name: airflow2-webserver-secret-key
image:
weanalyticslabopublicimages.azurecr.io/asr-airflow:2.3.2-python3.8
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 20
httpGet:
path: /health
port: 80
scheme: HTTP
initialDelaySeconds: 15
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 30
name: webserver
ports:
- containerPort: 80
name: airflow-ui
protocol: TCP
readinessProbe:
failureThreshold: 20
httpGet:
path: /health
port: 80
scheme: HTTP
initialDelaySeconds: 15
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 30
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /opt/airflow/pod_templates/pod_template_file.yaml
name: config
readOnly: true
subPath: pod_template_file.yaml
- mountPath: /opt/airflow/airflow.cfg
name: config
readOnly: true
subPath: airflow.cfg
- mountPath: /opt/airflow/config/airflow_local_settings.py
name: config
readOnly: true
subPath: airflow_local_settings.py
- mountPath: /opt/airflow/logs
name: logs
dnsPolicy: ClusterFirst
imagePullSecrets:
- name: acrsecretpublic
initContainers:
- args:
- airflow
- db
- check-migrations
- --migration-wait-timeout=60
env:
- name: AIRFLOW__LOGGING__LOGGING_LEVEL
value: DEBUG
- name:
AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__LOGGING__LOGGING_LEVEL
value: DEBUG
- name: AIRFLOW__WEBSERVER__BASE_URL
value: airflow.analyticslab-dev.azure.asr.nl
- name:
AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__WEBSERVER__BASE_URL
value: airflow.analyticslab-dev.azure.asr.nl
- name: DYNACONF_SQL_CONNECTION_STRING
value:
Server=tcp:iendanalyticssqlserver.database.windows.net,1433;Database=OnlineData;Uid=louis;Pwd=7jNZ30zoAZjnTyAl7DJ0;Encrypt=yes;TrustServerCertificate=no;Connection
Timeout=30;
- name:
AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__DYNACONF_SQL_CONNECTION_STRING
value:
Server=tcp:iendanalyticssqlserver.database.windows.net,1433;Database=OnlineData;Uid=louis;Pwd=7jNZ30zoAZjnTyAl7DJ0;Encrypt=yes;TrustServerCertificate=no;Connection
Timeout=30;
- name: DYNACONF_DATALAKE_ACCOUNT_KEY
value:
8Q9ciq8MtFu+HkfbO0zLGbtSZxZg6PaKU0W90C5r6TyVG6mV9E2aXZ3zv9Ua961pIBJSuIKSjGiJfpaEj6SSHw==
- name:
AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__DYNACONF_DATALAKE_ACCOUNT_KEY
value:
8Q9ciq8MtFu+HkfbO0zLGbtSZxZg6PaKU0W90C5r6TyVG6mV9E2aXZ3zv9Ua961pIBJSuIKSjGiJfpaEj6SSHw==
- name: ENV_FOR_DYNACONF
value: production
- name: AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__ENV_FOR_DYNACONF
value: production
- name: AIRFLOW__CORE__FERNET_KEY
valueFrom:
secretKeyRef:
key: fernet-key
name: airflow2-fernet-key
- name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
valueFrom:
secretKeyRef:
key: connection
name: airflow2-airflow-metadata
- name: AIRFLOW__DATABASE__SQL_ALCHEMY_CONN
valueFrom:
secretKeyRef:
key: connection
name: airflow2-airflow-metadata
- name: AIRFLOW_CONN_AIRFLOW_DB
valueFrom:
secretKeyRef:
key: connection
name: airflow2-airflow-metadata
- name: AIRFLOW__WEBSERVER__SECRET_KEY
valueFrom:
secretKeyRef:
key: webserver-secret-key
name: airflow2-webserver-secret-key
image:
weanalyticslabopublicimages.azurecr.io/asr-airflow:2.3.2-python3.8
imagePullPolicy: IfNotPresent
name: wait-for-airflow-migrations
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /opt/airflow/airflow.cfg
name: config
readOnly: true
subPath: airflow.cfg
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 0
runAsUser: 50000
serviceAccount: airflow2-webserver
serviceAccountName: airflow2-webserver
terminationGracePeriodSeconds: 30
volumes:
- configMap:
defaultMode: 420
name: airflow2-airflow-config
name: config
- name: logs
persistentVolumeClaim:
claimName: logs
status:
conditions:
- lastTransitionTime: "2022-06-01T09:15:33Z"
lastUpdateTime: "2022-06-01T09:15:33Z"
message: Deployment does not have minimum availability.
reason: MinimumReplicasUnavailable
status: "False"
type: Available
- lastTransitionTime: "2022-06-07T14:31:39Z"
lastUpdateTime: "2022-06-07T14:31:39Z"
message: ReplicaSet "airflow2-webserver-55584fc9c4" has timed out
progressing.
reason: ProgressDeadlineExceeded
status: "False"
type: Progressing
observedGeneration: 9
replicas: 2
unavailableReplicas: 2
updatedReplicas: 1
❯ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.3 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.3 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]