wolfier opened a new issue #19320: URL: https://github.com/apache/airflow/issues/19320
### Apache Airflow version

2.2.0 (latest released)

### Operating System

Debian GNU/Linux 10 (buster)

### Versions of Apache Airflow Providers

_No response_

### Deployment

Astronomer

### Deployment details

### What happened

A task gets stuck in the KubernetesExecutor task queue when its pod override contains only resource requests and no resource limits.

```
from datetime import datetime

from airflow.models import DAG
from airflow.operators.bash import BashOperator
from kubernetes.client import models as k8s

with DAG(
    dag_id='resource',
    catchup=False,
    schedule_interval='@once',
    start_date=datetime(2020, 1, 1),
) as dag:
    op = BashOperator(
        task_id='task',
        bash_command="sleep 30",
        dag=dag,
        executor_config={
            "pod_override": k8s.V1Pod(
                spec=k8s.V1PodSpec(
                    containers=[
                        k8s.V1Container(
                            name="base",
                            resources=k8s.V1ResourceRequirements(
                                # Only requests are set; limits are left out on purpose.
                                requests={
                                    "cpu": 0.5,
                                    "memory": "500Mi",
                                },
                                # limits={
                                #     "cpu": 0.5,
                                #     "memory": "2000Mi"
                                # }
                            )
                        )
                    ]
                )
            )
        }
    )
```

### What you expected to happen

I expect the pod to be scheduled, or at least the task to fail, rather than staying stuck in KubernetesExecutor's task queue and BaseExecutor's running set. Because the pod is never spawned, it can never run and, more importantly, never finish. Finishing is the point at which the pod state changes to something other than RUNNING and the task instance key is removed from the running set.

https://github.com/apache/airflow/blob/2.2.0/airflow/executors/kubernetes_executor.py#L572-L588

### How to reproduce

1. Create a DAG whose task sets only resource requests and no resource limits.
2. Observe that the Airflow task instance stays in the queued state, because the task never runs and so never moves to the running state.

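To make the stuck state described above concrete, here is a minimal toy model of the loop. It is a sketch under assumptions, not the executor's actual code: `PodRejected`, `create_pod`, and `sync` are hypothetical stand-ins, and the key is a plain string instead of a `TaskInstanceKey`. It only assumes what the logs show, namely that the key is already in the running set when the pod creation request is sent and that a rejected request is put back on the queue.

```python
from collections import deque


class PodRejected(Exception):
    """Hypothetical stand-in for kubernetes.client.rest.ApiException."""

    def __init__(self, status):
        super().__init__(f"pod rejected with HTTP {status}")
        self.status = status


def create_pod(key):
    # Stand-in for the pod creation call: an invalid spec is rejected every time.
    raise PodRejected(422)


def sync(task_queue, running):
    """One simplified sync pass: try to launch each queued task once."""
    for _ in range(len(task_queue)):
        key = task_queue.popleft()
        try:
            create_pod(key)
        except PodRejected:
            # Re-queue on any API error. Nothing ever removes the key from
            # `running`, because no pod exists to report a final state.
            task_queue.append(key)


running = {"resource.task"}            # key is marked running when it is queued
task_queue = deque(["resource.task"])

for _ in range(3):                     # three scheduler loops
    sync(task_queue, running)

print(running, list(task_queue))
```

However many loops run, the key stays in both the running set and the queue, which matches the task instance remaining queued in the UI.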
### Anything else <details> <summary>Full scheduler logs</summary> ``` [2021-10-29 00:15:02,152] {scheduler_job.py:442} INFO - Sending TaskInstanceKey(dag_id='resource', task_id='task', run_id='scheduled__2020-01-01T00:00:00+00:00', try_number=1) to executor with priority 1 and queue celery [2021-10-29 00:15:02,152] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'resource', 'task', 'scheduled__2020-01-01T00:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/2.0/resource.py'] [2021-10-29 00:15:02,154] {base_executor.py:150} DEBUG - 0 running task instances [2021-10-29 00:15:02,154] {base_executor.py:151} DEBUG - 1 in queue [2021-10-29 00:15:02,154] {base_executor.py:152} DEBUG - 32 open slots [2021-10-29 00:15:02,155] {kubernetes_executor.py:534} INFO - Add task TaskInstanceKey(dag_id='resource', task_id='task', run_id='scheduled__2020-01-01T00:00:00+00:00', try_number=1) with command ['airflow', 'tasks', 'run', 'resource', 'task', 'scheduled__2020-01-01T00:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/2.0/resource.py'] with executor_config {'pod_override': {'api_version': None, 'kind': None, 'metadata': None, 'spec': {'active_deadline_seconds': None, 'affinity': None, 'automount_service_account_token': None, 'containers': [{'args': None, 'command': None, 'env': None, 'env_from': None, 'image': None, 'image_pull_policy': None, 'lifecycle': None, 'liveness_probe': None, 'name': 'base', 'ports': None, 'readiness_probe': None, 'resources': {'limits': None, 'requests': {'cpu': '0.5', 'memory': '500Mi'}}, 'security_context': None, 'stdin': None, 'stdin_once': None, 'termination_message_path': None, 'termination_message_policy': None, 'tty': None, 'volume_devices': None, 'volume_mounts': None, 'working_dir': None}], 'dns_config': None, 'dns_policy': None, 'enable_service_links': None, 'host_aliases': None, 'host_ipc': None, 'host_network': None, 'host_pid': None, 'hostname': None, 'image_pull_secrets': None, 'init_containers': None, 
'node_name': None, 'node_selector': None, 'preemption_policy': None, 'priority': None, 'priority_class_name': None, 'readiness_gates': None, 'restart_policy': None, 'runtime_class_name': None, 'scheduler_name': None, 'security_context': None, 'service_account': None, 'service_account_name': None, 'share_process_namespace': None, 'subdomain': None, 'termination_grace_period_seconds': None, 'tolerations': None, 'volumes': None}, 'status': None}} [2021-10-29 00:15:02,157] {base_executor.py:161} DEBUG - Calling the <class 'airflow.executors.kubernetes_executor.KubernetesExecutor'> sync method [2021-10-29 00:15:02,157] {kubernetes_executor.py:557} DEBUG - self.running: {TaskInstanceKey(dag_id='resource', task_id='task', run_id='scheduled__2020-01-01T00:00:00+00:00', try_number=1)} [2021-10-29 00:15:02,158] {kubernetes_executor.py:361} DEBUG - Syncing KubernetesExecutor [2021-10-29 00:15:02,158] {kubernetes_executor.py:286} DEBUG - KubeJobWatcher alive, continuing [2021-10-29 00:15:02,160] {kubernetes_executor.py:300} INFO - Kubernetes job is (TaskInstanceKey(dag_id='resource', task_id='task', run_id='scheduled__2020-01-01T00:00:00+00:00', try_number=1), ['airflow', 'tasks', 'run', 'resource', 'task', 'scheduled__2020-01-01T00:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/2.0/resource.py'], {'api_version': None, 'kind': None, 'metadata': None, 'spec': {'active_deadline_seconds': None, 'affinity': None, 'automount_service_account_token': None, 'containers': [{'args': None, 'command': None, 'env': None, 'env_from': None, 'image': None, 'image_pull_policy': None, 'lifecycle': None, 'liveness_probe': None, 'name': 'base', 'ports': None, 'readiness_probe ': None, 'resources': {'limits': None, 'requests': {'cpu': '0.5', 'memory': '500Mi'}}, 'security_context': None, 'stdin': None, 'stdin_once': None, 'termination_message_path': None, 'termination_message_policy': None, 'tty': None, 'volume_devices': None, 'volume_mounts': None, 'working_dir': None}], 'dns_config': None, 
'dns_policy': None, 'enable_service_links': None, 'host_aliases': None, 'host_ipc': None, 'host_network': None, 'host_pid': None, 'hostname': None, 'image_pull_secrets': None, 'init_containers': None, 'node_name': None, 'node_selector': None, 'preemption_policy': None, 'priority': None, 'priority_class_name': None, 'readiness_gates': None, 'restart_policy': None, 'runtime_class_name': None, 'scheduler_name': None, 'security_context': None, 'service_account': None, 'service_account_name': None, 'share_process_namespace': None, 'subdomain': None, 'termination_grace_period_seconds': None, 'tolerations': None, 'volumes': None}, 'status': None}, None) [2021-10-29 00:15:02,173] {kubernetes_executor.py:330} DEBUG - Kubernetes running for command ['airflow', 'tasks', 'run', 'resource', 'task', 'scheduled__2020-01-01T00:00:00+00:00', '--local', '--subdir', 'DAGS_FOLDER/2.0/resource.py'] [2021-10-29 00:15:02,173] {kubernetes_executor.py:331} DEBUG - Kubernetes launching image registry.gcp0001.us-east4.astronomer.io/quasarian-spectroscope-0644/airflow:deploy-2 [2021-10-29 00:15:02,174] {kubernetes_executor.py:260} DEBUG - Pod Creation Request: { "apiVersion": "v1", "kind": "Pod", "metadata": { "annotations": { "checksum/airflow-secrets": "70012f356e10dceeb7c58fb0ce05014197b5fa1c1a5d8955ce1a1a4cc7347fa8bc67336041bded0f5bb700f6b5a17c794d7dc1ec00b72e6e98998f1f45efd286", "dag_id": "resource", "task_id": "task", "try_number": "1", "run_id": "scheduled__2020-01-01T00:00:00+00:00" }, "labels": { "tier": "airflow", "component": "worker", "release": "quasarian-spectroscope-0644", "platform": "astronomer", "workspace": "cki7jmbr53180161pjtfor7aoj1", "airflow-worker": "8", "dag_id": "resource", "task_id": "task", "try_number": "1", "airflow_version": "2.2.0-astro.2", "kubernetes_executor": "True", "run_id": "scheduled__2020-01-01T0000000000-cc3b7db2b", "kubernetes-pod-operator": "False" }, "name": "resourcetask.4c8ffd0238954d27b7891491a8d6da4f", "namespace": 
"astronomer-quasarian-spectroscope-0644" }, "spec": { "affinity": { "nodeAffinity": { "requiredDuringSchedulingIgnoredDuringExecution": { "nodeSelectorTerms": [ { "matchExpressions": [ { "key": "astronomer.io/multi-tenant", "operator": "In", "values": [ "true" ] } ] } ] } } }, "containers": [ { "args": [ "airflow", "tasks", "run", "resource", "task", "scheduled__2020-01-01T00:00:00+00:00", "--local", "--subdir", "DAGS_FOLDER/2.0/resource.py" ], "command": [ "tini", "--", "/entrypoint" ], "env": [ { "name": "AIRFLOW__CORE__EXECUTOR", "value": "LocalExecutor" }, { "name": "AIRFLOW__CORE__FERNET_KEY", "valueFrom": { "secretKeyRef": { "key": "fernet-key", "name": "quasarian-spectroscope-0644-fernet-key" } } }, { "name": "AIRFLOW__CORE__SQL_ALCHEMY_CONN", "valueFrom": { "secretKeyRef": { "key": "connection", "name": "quasarian-spectroscope-0644-airflow-metadata" } } }, { "name": "AIRFLOW_CONN_AIRFLOW_DB", "valueFrom": { "secretKeyRef": { "key": "connection", "name": "quasarian-spectroscope-0644-airflow-metadata" } } }, { "name": "AIRFLOW__WEBSERVER__SECRET_KEY", "valueFrom": { "secretKeyRef": { "key": "webserver-secret-key", "name": "quasarian-spectroscope-0644-webserver-secret-key" } } }, { "name": "AIRFLOW__ELASTICSEARCH__HOST", "valueFrom": { "secretKeyRef": { "key": "connection", "name": "quasarian-spectroscope-0644-elasticsearch" } } }, { "name": "AIRFLOW__ELASTICSEARCH__ELASTICSEARCH_HOST", "valueFrom": { "secretKeyRef": { "key": "connection", "name": "quasarian-spectroscope-0644-elasticsearch" } } }, { "name": "AIRFLOW_IS_K8S_EXECUTOR_POD", "value": "True" } ], "envFrom": [ { "secretRef": { "name": "quasarian-spectroscope-0644-env" } } ], "image": "registry.gcp0001.us-east4.astronomer.io/quasarian-spectroscope-0644/airflow:deploy-2", "imagePullPolicy": "IfNotPresent", "name": "base", "resources": { "requests": { "cpu": "0.5", "memory": "500Mi" } }, "volumeMounts": [ { "mountPath": "/usr/local/airflow/logs", "name": "logs" }, { "mountPath": 
"/usr/local/airflow/airflow.cfg", "name": "config", "readOnly": true, "subPath": "airflow.cfg" }, { "mountPath": "/usr/local/airflow/config/airflow_local_settings.py", "name": "config", "readOnly": true, "subPath": "airflow_local_settings.py" } ] } ], "imagePullSecrets": [ { "name": "quasarian-spectroscope-0644-registry" } ], "restartPolicy": "Never", "securityContext": { "fsGroup": 50000, "runAsUser": 50000 }, "serviceAccountName": "quasarian-spectroscope-0644-airflow-worker", "volumes": [ { "emptyDir": {}, "name": "logs" }, { "configMap": { "name": "quasarian-spectroscope-0644-airflow-config" }, "name": "config" } ] } } [2021-10-29 00:15:02,215] {rest.py:228} DEBUG - response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"resourcetask.4c8ffd0238954d27b7891491a8d6da4f\" is invalid: [spec.containers[0].resources.requests: Invalid value: \"500m\": must be less than or equal to cpu limit, spec.containers[0].resources.requests: Invalid value: \"500Mi\": must be less than or equal to memory limit]","reason":"Invalid","details":{"name":"resourcetask.4c8ffd0238954d27b7891491a8d6da4f","kind":"Pod","causes":[{"reason":"FieldValueInvalid","message":"Invalid value: \"500m\": must be less than or equal to cpu limit","field":"spec.containers[0].resources.requests"},{"reason":"FieldValueInvalid","message":"Invalid value: \"500Mi\": must be less than or equal to memory limit","field":"spec.containers[0].resources.requests"}]},"code":422} [2021-10-29 00:15:02,215] {kubernetes_executor.py:267} ERROR - Exception when attempting to create Namespaced Pod: { "apiVersion": "v1", "kind": "Pod", "metadata": { "annotations": { "checksum/airflow-secrets": "70012f356e10dceeb7c58fb0ce05014197b5fa1c1a5d8955ce1a1a4cc7347fa8bc67336041bded0f5bb700f6b5a17c794d7dc1ec00b72e6e98998f1f45efd286", "dag_id": "resource", "task_id": "task", "try_number": "1", "run_id": "scheduled__2020-01-01T00:00:00+00:00" }, "labels": { "tier": "airflow", "component": 
"worker", "release": "quasarian-spectroscope-0644", "platform": "astronomer", "workspace": "cki7jmbr53180161pjtfor7aoj1", "airflow-worker": "8", "dag_id": "resource", "task_id": "task", "try_number": "1", "airflow_version": "2.2.0-astro.2", "kubernetes_executor": "True", "run_id": "scheduled__2020-01-01T0000000000-cc3b7db2b", "kubernetes-pod-operator": "False" }, "name": "resourcetask.4c8ffd0238954d27b7891491a8d6da4f", "namespace": "astronomer-quasarian-spectroscope-0644" }, "spec": { "affinity": { "nodeAffinity": { "requiredDuringSchedulingIgnoredDuringExecution": { "nodeSelectorTerms": [ { "matchExpressions": [ { "key": "astronomer.io/multi-tenant", "operator": "In", "values": [ "true" ] } ] } ] } } }, "containers": [ { "args": [ "airflow", "tasks", "run", "resource", "task", "scheduled__2020-01-01T00:00:00+00:00", "--local", "--subdir", "DAGS_FOLDER/2.0/resource.py" ], "command": [ "tini", "--", "/entrypoint" ], "env": [ { "name": "AIRFLOW__CORE__EXECUTOR", "value": "LocalExecutor" }, { "name": "AIRFLOW__CORE__FERNET_KEY", "valueFrom": { "secretKeyRef": { "key": "fernet-key", "name": "quasarian-spectroscope-0644-fernet-key" } } }, { "name": "AIRFLOW__CORE__SQL_ALCHEMY_CONN", "valueFrom": { "secretKeyRef": { "key": "connection", "name": "quasarian-spectroscope-0644-airflow-metadata" } } }, { "name": "AIRFLOW_CONN_AIRFLOW_DB", "valueFrom": { "secretKeyRef": { "key": "connection", "name": "quasarian-spectroscope-0644-airflow-metadata" } } }, { "name": "AIRFLOW__WEBSERVER__SECRET_KEY", "valueFrom": { "secretKeyRef": { "key": "webserver-secret-key", "name": "quasarian-spectroscope-0644-webserver-secret-key" } } }, { "name": "AIRFLOW__ELASTICSEARCH__HOST", "valueFrom": { "secretKeyRef": { "key": "connection", "name": "quasarian-spectroscope-0644-elasticsearch" } } }, { "name": "AIRFLOW__ELASTICSEARCH__ELASTICSEARCH_HOST", "valueFrom": { "secretKeyRef": { "key": "connection", "name": "quasarian-spectroscope-0644-elasticsearch" } } }, { "name": 
"AIRFLOW_IS_K8S_EXECUTOR_POD", "value": "True" } ], "envFrom": [ { "secretRef": { "name": "quasarian-spectroscope-0644-env" } } ], "image": "registry.gcp0001.us-east4.astronomer.io/quasarian-spectroscope-0644/airflow:deploy-2", "imagePullPolicy": "IfNotPresent", "name": "base", "resources": { "requests": { "cpu": "0.5", "memory": "500Mi" } }, "volumeMounts": [ { "mountPath": "/usr/local/airflow/logs", "name": "logs" }, { "mountPath": "/usr/local/airflow/airflow.cfg", "name": "config", "readOnly": true, "subPath": "airflow.cfg" }, { "mountPath": "/usr/local/airflow/config/airflow_local_settings.py", "name": "config", "readOnly": true, "subPath": "airflow_local_settings.py" } ] } ], "imagePullSecrets": [ { "name": "quasarian-spectroscope-0644-registry" } ], "restartPolicy": "Never", "securityContext": { "fsGroup": 50000, "runAsUser": 50000 }, "serviceAccountName": "quasarian-spectroscope-0644-airflow-worker", "volumes": [ { "emptyDir": {}, "name": "logs" }, { "configMap": { "name": "quasarian-spectroscope-0644-airflow-config" }, "name": "config" } ] } } Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/airflow/executors/kubernetes_executor.py", line 262, in run_pod_async resp = self.kube_client.create_namespaced_pod( File "/usr/local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", line 6174, in create_namespaced_pod (data) = self.create_namespaced_pod_with_http_info(namespace, body, **kwargs) # noqa: E501 File "/usr/local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", line 6251, in create_namespaced_pod_with_http_info return self.api_client.call_api( File "/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 340, in call_api return self.__call_api(resource_path, method, File "/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 172, in __call_api response_data = self.request( File 
"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 382, in request return self.rest_client.POST(url, File "/usr/local/lib/python3.9/site-packages/kubernetes/client/rest.py", line 272, in POST return self.request("POST", url, File "/usr/local/lib/python3.9/site-packages/kubernetes/client/rest.py", line 231, in request raise ApiException(http_resp=r) kubernetes.client.rest.ApiException: (422) Reason: Unprocessable Entity HTTP response headers: HTTPHeaderDict({'Audit-Id': 'be88cd36-49fb-4398-92c4-b50bb80084c8', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Fri, 29 Oct 2021 00:15:02 GMT', 'Content-Length': '799'}) HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"resourcetask.4c8ffd0238954d27b7891491a8d6da4f\" is invalid: [spec.containers[0].resources.requests: Invalid value: \"500m\": must be less than or equal to cpu limit, spec.containers[0].resources.requests: Invalid value: \"500Mi\": must be less than or equal to memory limit]","reason":"Invalid","details":{"name":"resourcetask.4c8ffd0238954d27b7891491a8d6da4f","kind":"Pod","causes":[{"reason":"FieldValueInvalid","message":"Invalid value: \"500m\": must be less than or equal to cpu limit","field":"spec.containers[0].resources.requests"},{"reason":"FieldValueInvalid","message":"Invalid value: \"500Mi\": must be less than or equal to memory limit","field":"spec.containers[0].resources.requests"}]},"code":422} [2021-10-29 00:15:02,308] {kubernetes_executor.py:609} WARNING - ApiException when attempting to run task, re-queueing. 
Message: Pod "resourcetask.4c8ffd0238954d27b7891491a8d6da4f" is invalid: [spec.containers[0].resources.requests: Invalid value: "500m": must be less than or equal to cpu limit, spec.containers[0].resources.requests: Invalid value: "500Mi": must be less than or equal to memory limit]
[2021-10-29 00:15:02,308] {kubernetes_executor.py:621} DEBUG - Next timed event is in 53.668401
```

</details>

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
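The scheduler log ends with the 422 response being re-queued. One possible way to handle this, sketched below, is to treat status codes that indicate an invalid pod spec as permanent and fail the task instead of re-queueing it. This is only an illustration of the idea and not the executor's implementation: `PodRejected`, `handle_rejection`, and `PERMANENT_STATUSES` are hypothetical names, and the choice of 400 and 422 as permanent is an assumption.

```python
from collections import deque

# Assumption: these statuses mean the pod spec itself is invalid,
# so sending the same request again cannot succeed.
PERMANENT_STATUSES = {400, 422}


class PodRejected(Exception):
    """Hypothetical stand-in for kubernetes.client.rest.ApiException."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status
        self.message = message


def handle_rejection(key, error, task_queue, running, failed):
    """Fail the task on a permanent error, otherwise re-queue it."""
    if error.status in PERMANENT_STATUSES:
        running.discard(key)          # free the executor slot
        failed[key] = error.message   # record why the task failed
    else:
        task_queue.append(key)        # possibly transient, try again later


task_queue, running, failed = deque(), {"resource.task"}, {}
error = PodRejected(422, "must be less than or equal to cpu limit")
handle_rejection("resource.task", error, task_queue, running, failed)
print(running, list(task_queue), failed)
```

With this handling, the 422 from the log would move the task to a failed state and surface the API server's message, while other errors would still be retried.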
