wolfier opened a new issue #19320:
URL: https://github.com/apache/airflow/issues/19320


   ### Apache Airflow version
   
   2.2.0 (latest released)
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   ![Screen Shot 2021-10-29 at 10 58 48 
AM](https://user-images.githubusercontent.com/5952735/139481226-04b4edad-c0cc-4bc9-9dde-e80fffa4e0f5.png)
   
   
   ### What happened
   
   Task gets stuck in the KubernetesExecutor task queue when the pod override 
contains only resource requests and no resource limits.
   
   ```
   from datetime import datetime
   
   from airflow.models import DAG
   from airflow.operators.bash import BashOperator
   
   from kubernetes.client import models as k8s
   
   with DAG(
           dag_id='resource',
           catchup=False,
           schedule_interval='@once',
           start_date=datetime(2020, 1, 1),
   ) as dag:
       op = BashOperator(
           task_id='task',
           bash_command="sleep 30",
           dag=dag,
           executor_config={
               "pod_override": k8s.V1Pod(
                   spec=k8s.V1PodSpec(
                       containers=[
                           k8s.V1Container(
                               name="base",
                               resources=k8s.V1ResourceRequirements(
                                   requests={
                                       "cpu": 0.5,
                                       "memory": "500Mi",
                                   },
                                   # limits={
                                   #     "cpu": 0.5,
                                   #     "memory": "2000Mi"
                                   # }
                               )
                           )
                       ]
                   )
               )
           }
       )
   ``` 
   
   ### What you expected to happen
   
   I expect the pod to be scheduled, or at least the task to fail, rather than 
staying stuck in KubernetesExecutor's task queue / BaseExecutor's running set. 
Since the pod is never created, it cannot run and, more importantly, cannot 
finish. Finishing is when the pod state changes to anything other than RUNNING 
and the task instance key is removed from the running set.
   
   
https://github.com/apache/airflow/blob/2.2.0/airflow/executors/kubernetes_executor.py#L572-L588
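
The behaviour I would expect can be sketched roughly as follows. This is a minimal, self-contained illustration and not Airflow's actual code: `SketchExecutor`, `FakeApiException`, and `NON_RETRYABLE_STATUSES` are hypothetical names, and the choice of which status codes count as permanent is an assumption. The idea is that a 422 (invalid pod spec) will never succeed on retry, so the key should leave the running set and the task should be marked failed instead of being re-queued.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

# Status codes treated as permanent failures in this sketch.
# This set is an assumption for illustration, not Airflow's list.
NON_RETRYABLE_STATUSES = {400, 422}


class FakeApiException(Exception):
    """Stand-in for kubernetes.client.rest.ApiException; only .status is used."""

    def __init__(self, status: int, reason: str = "") -> None:
        super().__init__(f"({status}) {reason}")
        self.status = status
        self.reason = reason


@dataclass
class SketchExecutor:
    """Hypothetical executor that tracks running, queued, and failed keys."""

    running: Set[str] = field(default_factory=set)
    task_queue: List[str] = field(default_factory=list)
    failed: List[Tuple[str, str]] = field(default_factory=list)

    def launch(self, key: str, create_pod: Callable[[str], None]) -> None:
        self.running.add(key)
        try:
            create_pod(key)
        except FakeApiException as exc:
            if exc.status in NON_RETRYABLE_STATUSES:
                # The pod spec itself is invalid: retrying cannot help,
                # so release the slot and record the failure.
                self.running.discard(key)
                self.failed.append((key, str(exc)))
            else:
                # Possibly transient (for example a quota error): try again later.
                self.task_queue.append(key)


def invalid_pod(key: str) -> None:
    raise FakeApiException(422, "Unprocessable Entity")


executor = SketchExecutor()
executor.launch("resource.task", invalid_pod)
print(executor.running, executor.failed)
```

With this shape the task fails visibly on the first attempt instead of occupying a slot in the running set indefinitely.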
 
   
   ### How to reproduce
   
   1. Create a DAG whose task sets only resource requests and no resource limits
   2. See the Airflow task instance stuck in the queued state because the task 
never runs, so its state never moves to running. 
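
The 422 response in the logs complains that the requests exceed the cpu and memory limits even though the DAG sets no limits. One plausible explanation, which is an assumption and not confirmed in this report, is that the namespace applies default limits lower than the requested values. A small pre-flight check like the one below could catch that before submitting the pod. It is a sketch using only the standard library; `parse_quantity` and `requests_exceeding_limits` are hypothetical helpers written for this example, they cover only a few common Kubernetes quantity suffixes, and the default limits shown are made-up values.

```python
from decimal import Decimal

# A subset of Kubernetes quantity suffixes; enough for cpu/memory in this example.
_SUFFIXES = {
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
}


def parse_quantity(value) -> Decimal:
    """Convert values such as 0.5, '500m', or '500Mi' to a plain Decimal."""
    text = str(value)
    # Check two-character suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def requests_exceeding_limits(requests: dict, limits: dict) -> list:
    """Return the resource names whose request is greater than its limit."""
    return sorted(
        name
        for name, requested in requests.items()
        if name in limits
        and parse_quantity(requested) > parse_quantity(limits[name])
    )


requests = {"cpu": 0.5, "memory": "500Mi"}
# Hypothetical namespace defaults, chosen only to reproduce the error shape.
assumed_default_limits = {"cpu": "250m", "memory": "256Mi"}
print(requests_exceeding_limits(requests, assumed_default_limits))
```

Passing the limits that are commented out in the DAG above (`cpu: 0.5`, `memory: 2000Mi`) returns an empty list, which matches the observation that the problem only appears when limits are omitted.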
   
   ### Anything else
   
   <details>
     <summary>Full scheduler logs</summary>
     
     ```
   [2021-10-29 00:15:02,152] {scheduler_job.py:442} INFO - Sending 
TaskInstanceKey(dag_id='resource', task_id='task', 
run_id='scheduled__2020-01-01T00:00:00+00:00', try_number=1) to executor with 
priority 1 and queue celery
   [2021-10-29 00:15:02,152] {base_executor.py:82} INFO - Adding to queue: 
['airflow', 'tasks', 'run', 'resource', 'task', 
'scheduled__2020-01-01T00:00:00+00:00', '--local', '--subdir', 
'DAGS_FOLDER/2.0/resource.py']
   [2021-10-29 00:15:02,154] {base_executor.py:150} DEBUG - 0 running task 
instances
   [2021-10-29 00:15:02,154] {base_executor.py:151} DEBUG - 1 in queue
   [2021-10-29 00:15:02,154] {base_executor.py:152} DEBUG - 32 open slots
   [2021-10-29 00:15:02,155] {kubernetes_executor.py:534} INFO - Add task 
TaskInstanceKey(dag_id='resource', task_id='task', 
run_id='scheduled__2020-01-01T00:00:00+00:00', try_number=1) with command 
['airflow', 'tasks', 'run', 'resource', 'task', 
'scheduled__2020-01-01T00:00:00+00:00', '--local', '--subdir', 
'DAGS_FOLDER/2.0/resource.py'] with executor_config {'pod_override': 
{'api_version': None,
    'kind': None,
    'metadata': None,
    'spec': {'active_deadline_seconds': None,
             'affinity': None,
             'automount_service_account_token': None,
             'containers': [{'args': None,
                             'command': None,
                             'env': None,
                             'env_from': None,
                             'image': None,
                             'image_pull_policy': None,
                             'lifecycle': None,
                             'liveness_probe': None,
                             'name': 'base',
                             'ports': None,
                             'readiness_probe': None,
                             'resources': {'limits': None,
                                           'requests': {'cpu': '0.5',
                                                        'memory': '500Mi'}},
                             'security_context': None,
                             'stdin': None,
                             'stdin_once': None,
                             'termination_message_path': None,
                             'termination_message_policy': None,
                             'tty': None,
                             'volume_devices': None,
                             'volume_mounts': None,
                             'working_dir': None}],
             'dns_config': None,
             'dns_policy': None,
             'enable_service_links': None,
             'host_aliases': None,
             'host_ipc': None,
             'host_network': None,
             'host_pid': None,
             'hostname': None,
             'image_pull_secrets': None,
             'init_containers': None,
             'node_name': None,
             'node_selector': None,
             'preemption_policy': None,
             'priority': None,
             'priority_class_name': None,
             'readiness_gates': None,
             'restart_policy': None,
             'runtime_class_name': None,
             'scheduler_name': None,
             'security_context': None,
             'service_account': None,
             'service_account_name': None,
             'share_process_namespace': None,
             'subdomain': None,
             'termination_grace_period_seconds': None,
             'tolerations': None,
             'volumes': None},
    'status': None}}
   [2021-10-29 00:15:02,157] {base_executor.py:161} DEBUG - Calling the <class 
'airflow.executors.kubernetes_executor.KubernetesExecutor'> sync method
   [2021-10-29 00:15:02,157] {kubernetes_executor.py:557} DEBUG - self.running: 
{TaskInstanceKey(dag_id='resource', task_id='task', 
run_id='scheduled__2020-01-01T00:00:00+00:00', try_number=1)}
   [2021-10-29 00:15:02,158] {kubernetes_executor.py:361} DEBUG - Syncing 
KubernetesExecutor
   [2021-10-29 00:15:02,158] {kubernetes_executor.py:286} DEBUG - 
KubeJobWatcher alive, continuing
   [2021-10-29 00:15:02,160] {kubernetes_executor.py:300} INFO - Kubernetes job 
is (TaskInstanceKey(dag_id='resource', task_id='task', 
run_id='scheduled__2020-01-01T00:00:00+00:00', try_number=1), ['airflow', 
'tasks', 'run', 'resource', 'task', 'scheduled__2020-01-01T00:00:00+00:00', 
'--local', '--subdir', 'DAGS_FOLDER/2.0/resource.py'], {'api_version': None,  
'kind': None,  'metadata': None,  'spec': {'active_deadline_seconds': None,     
      'affinity': None,           'automount_service_account_token': None,      
     'containers': [{'args': None,                           'command': None,   
                        'env': None,                           'env_from': 
None,                           'image': None,                           
'image_pull_policy': None,                           'lifecycle': None,         
                  'liveness_probe': None,                           'name': 
'base',                           'ports': None,                           
'readiness_probe': None,                           'resources': {'limits': None,               
                          'requests': {'cpu': '0.5',                            
                          'memory': '500Mi'}},                           
'security_context': None,                           'stdin': None,              
             'stdin_once': None,                           
'termination_message_path': None,                           
'termination_message_policy': None,                           'tty': None,      
                     'volume_devices': None,                           
'volume_mounts': None,                           'working_dir': None}],         
  'dns_config': None,           'dns_policy': None,           
'enable_service_links': None,           'host_aliases': None,           
'host_ipc': None,           'host_network': None,           'host_pid': None,   
        'hostname': None,           'image_pull_secrets': None,           
'init_containers': None,           'node_name':
  None,           'node_selector': None,           'preemption_policy': None,   
        'priority': None,           'priority_class_name': None,           
'readiness_gates': None,           'restart_policy': None,           
'runtime_class_name': None,           'scheduler_name': None,           
'security_context': None,           'service_account': None,           
'service_account_name': None,           'share_process_namespace': None,        
   'subdomain': None,           'termination_grace_period_seconds': None,       
    'tolerations': None,           'volumes': None},  'status': None}, None)
   [2021-10-29 00:15:02,173] {kubernetes_executor.py:330} DEBUG - Kubernetes 
running for command ['airflow', 'tasks', 'run', 'resource', 'task', 
'scheduled__2020-01-01T00:00:00+00:00', '--local', '--subdir', 
'DAGS_FOLDER/2.0/resource.py']
   [2021-10-29 00:15:02,173] {kubernetes_executor.py:331} DEBUG - Kubernetes 
launching image 
registry.gcp0001.us-east4.astronomer.io/quasarian-spectroscope-0644/airflow:deploy-2
   [2021-10-29 00:15:02,174] {kubernetes_executor.py:260} DEBUG - Pod Creation 
Request: 
   {
     "apiVersion": "v1",
     "kind": "Pod",
     "metadata": {
       "annotations": {
         "checksum/airflow-secrets": 
"70012f356e10dceeb7c58fb0ce05014197b5fa1c1a5d8955ce1a1a4cc7347fa8bc67336041bded0f5bb700f6b5a17c794d7dc1ec00b72e6e98998f1f45efd286",
         "dag_id": "resource",
         "task_id": "task",
         "try_number": "1",
         "run_id": "scheduled__2020-01-01T00:00:00+00:00"
       },
       "labels": {
         "tier": "airflow",
         "component": "worker",
         "release": "quasarian-spectroscope-0644",
         "platform": "astronomer",
         "workspace": "cki7jmbr53180161pjtfor7aoj1",
         "airflow-worker": "8",
         "dag_id": "resource",
         "task_id": "task",
         "try_number": "1",
         "airflow_version": "2.2.0-astro.2",
         "kubernetes_executor": "True",
         "run_id": "scheduled__2020-01-01T0000000000-cc3b7db2b",
         "kubernetes-pod-operator": "False"
       },
       "name": "resourcetask.4c8ffd0238954d27b7891491a8d6da4f",
       "namespace": "astronomer-quasarian-spectroscope-0644"
     },
     "spec": {
       "affinity": {
         "nodeAffinity": {
           "requiredDuringSchedulingIgnoredDuringExecution": {
             "nodeSelectorTerms": [
               {
                 "matchExpressions": [
                   {
                     "key": "astronomer.io/multi-tenant",
                     "operator": "In",
                     "values": [
                       "true"
                     ]
                   }
                 ]
               }
             ]
           }
         }
       },
       "containers": [
         {
           "args": [
             "airflow",
             "tasks",
             "run",
             "resource",
             "task",
             "scheduled__2020-01-01T00:00:00+00:00",
             "--local",
             "--subdir",
             "DAGS_FOLDER/2.0/resource.py"
           ],
           "command": [
             "tini",
             "--",
             "/entrypoint"
           ],
           "env": [
             {
               "name": "AIRFLOW__CORE__EXECUTOR",
               "value": "LocalExecutor"
             },
             {
               "name": "AIRFLOW__CORE__FERNET_KEY",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "fernet-key",
                   "name": "quasarian-spectroscope-0644-fernet-key"
                 }
               }
             },
             {
               "name": "AIRFLOW__CORE__SQL_ALCHEMY_CONN",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "connection",
                   "name": "quasarian-spectroscope-0644-airflow-metadata"
                 }
               }
             },
             {
               "name": "AIRFLOW_CONN_AIRFLOW_DB",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "connection",
                   "name": "quasarian-spectroscope-0644-airflow-metadata"
                 }
               }
             },
             {
               "name": "AIRFLOW__WEBSERVER__SECRET_KEY",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "webserver-secret-key",
                   "name": "quasarian-spectroscope-0644-webserver-secret-key"
                 }
               }
             },
             {
               "name": "AIRFLOW__ELASTICSEARCH__HOST",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "connection",
                   "name": "quasarian-spectroscope-0644-elasticsearch"
                 }
               }
             },
             {
               "name": "AIRFLOW__ELASTICSEARCH__ELASTICSEARCH_HOST",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "connection",
                   "name": "quasarian-spectroscope-0644-elasticsearch"
                 }
               }
             },
             {
               "name": "AIRFLOW_IS_K8S_EXECUTOR_POD",
               "value": "True"
             }
           ],
           "envFrom": [
             {
               "secretRef": {
                 "name": "quasarian-spectroscope-0644-env"
               }
             }
           ],
           "image": 
"registry.gcp0001.us-east4.astronomer.io/quasarian-spectroscope-0644/airflow:deploy-2",
           "imagePullPolicy": "IfNotPresent",
           "name": "base",
           "resources": {
             "requests": {
               "cpu": "0.5",
               "memory": "500Mi"
             }
           },
           "volumeMounts": [
             {
               "mountPath": "/usr/local/airflow/logs",
               "name": "logs"
             },
             {
               "mountPath": "/usr/local/airflow/airflow.cfg",
               "name": "config",
               "readOnly": true,
               "subPath": "airflow.cfg"
             },
             {
               "mountPath": 
"/usr/local/airflow/config/airflow_local_settings.py",
               "name": "config",
               "readOnly": true,
               "subPath": "airflow_local_settings.py"
             }
           ]
         }
       ],
       "imagePullSecrets": [
         {
           "name": "quasarian-spectroscope-0644-registry"
         }
       ],
       "restartPolicy": "Never",
       "securityContext": {
         "fsGroup": 50000,
         "runAsUser": 50000
       },
       "serviceAccountName": "quasarian-spectroscope-0644-airflow-worker",
       "volumes": [
         {
           "emptyDir": {},
           "name": "logs"
         },
         {
           "configMap": {
             "name": "quasarian-spectroscope-0644-airflow-config"
           },
           "name": "config"
         }
       ]
     }
   }
   [2021-10-29 00:15:02,215] {rest.py:228} DEBUG - response body: 
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod
 \"resourcetask.4c8ffd0238954d27b7891491a8d6da4f\" is invalid: 
[spec.containers[0].resources.requests: Invalid value: \"500m\": must be less 
than or equal to cpu limit, spec.containers[0].resources.requests: Invalid 
value: \"500Mi\": must be less than or equal to memory 
limit]","reason":"Invalid","details":{"name":"resourcetask.4c8ffd0238954d27b7891491a8d6da4f","kind":"Pod","causes":[{"reason":"FieldValueInvalid","message":"Invalid
 value: \"500m\": must be less than or equal to cpu 
limit","field":"spec.containers[0].resources.requests"},{"reason":"FieldValueInvalid","message":"Invalid
 value: \"500Mi\": must be less than or equal to memory 
limit","field":"spec.containers[0].resources.requests"}]},"code":422}
   
   [2021-10-29 00:15:02,215] {kubernetes_executor.py:267} ERROR - Exception 
when attempting to create Namespaced Pod: {
     "apiVersion": "v1",
     "kind": "Pod",
     "metadata": {
       "annotations": {
         "checksum/airflow-secrets": 
"70012f356e10dceeb7c58fb0ce05014197b5fa1c1a5d8955ce1a1a4cc7347fa8bc67336041bded0f5bb700f6b5a17c794d7dc1ec00b72e6e98998f1f45efd286",
         "dag_id": "resource",
         "task_id": "task",
         "try_number": "1",
         "run_id": "scheduled__2020-01-01T00:00:00+00:00"
       },
       "labels": {
         "tier": "airflow",
         "component": "worker",
         "release": "quasarian-spectroscope-0644",
         "platform": "astronomer",
         "workspace": "cki7jmbr53180161pjtfor7aoj1",
         "airflow-worker": "8",
         "dag_id": "resource",
         "task_id": "task",
         "try_number": "1",
         "airflow_version": "2.2.0-astro.2",
         "kubernetes_executor": "True",
         "run_id": "scheduled__2020-01-01T0000000000-cc3b7db2b",
         "kubernetes-pod-operator": "False"
       },
       "name": "resourcetask.4c8ffd0238954d27b7891491a8d6da4f",
       "namespace": "astronomer-quasarian-spectroscope-0644"
     },
     "spec": {
       "affinity": {
         "nodeAffinity": {
           "requiredDuringSchedulingIgnoredDuringExecution": {
             "nodeSelectorTerms": [
               {
                 "matchExpressions": [
                   {
                     "key": "astronomer.io/multi-tenant",
                     "operator": "In",
                     "values": [
                       "true"
                     ]
                   }
                 ]
               }
             ]
           }
         }
       },
       "containers": [
         {
           "args": [
             "airflow",
             "tasks",
             "run",
             "resource",
             "task",
             "scheduled__2020-01-01T00:00:00+00:00",
             "--local",
             "--subdir",
             "DAGS_FOLDER/2.0/resource.py"
           ],
           "command": [
             "tini",
             "--",
             "/entrypoint"
           ],
           "env": [
             {
               "name": "AIRFLOW__CORE__EXECUTOR",
               "value": "LocalExecutor"
             },
             {
               "name": "AIRFLOW__CORE__FERNET_KEY",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "fernet-key",
                   "name": "quasarian-spectroscope-0644-fernet-key"
                 }
               }
             },
             {
               "name": "AIRFLOW__CORE__SQL_ALCHEMY_CONN",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "connection",
                   "name": "quasarian-spectroscope-0644-airflow-metadata"
                 }
               }
             },
             {
               "name": "AIRFLOW_CONN_AIRFLOW_DB",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "connection",
                   "name": "quasarian-spectroscope-0644-airflow-metadata"
                 }
               }
             },
             {
               "name": "AIRFLOW__WEBSERVER__SECRET_KEY",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "webserver-secret-key",
                   "name": "quasarian-spectroscope-0644-webserver-secret-key"
                 }
               }
             },
             {
               "name": "AIRFLOW__ELASTICSEARCH__HOST",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "connection",
                   "name": "quasarian-spectroscope-0644-elasticsearch"
                 }
               }
             },
             {
               "name": "AIRFLOW__ELASTICSEARCH__ELASTICSEARCH_HOST",
               "valueFrom": {
                 "secretKeyRef": {
                   "key": "connection",
                   "name": "quasarian-spectroscope-0644-elasticsearch"
                 }
               }
             },
             {
               "name": "AIRFLOW_IS_K8S_EXECUTOR_POD",
               "value": "True"
             }
           ],
           "envFrom": [
             {
               "secretRef": {
                 "name": "quasarian-spectroscope-0644-env"
               }
             }
           ],
           "image": 
"registry.gcp0001.us-east4.astronomer.io/quasarian-spectroscope-0644/airflow:deploy-2",
           "imagePullPolicy": "IfNotPresent",
           "name": "base",
           "resources": {
             "requests": {
               "cpu": "0.5",
               "memory": "500Mi"
             }
           },
           "volumeMounts": [
             {
               "mountPath": "/usr/local/airflow/logs",
               "name": "logs"
             },
             {
               "mountPath": "/usr/local/airflow/airflow.cfg",
               "name": "config",
               "readOnly": true,
               "subPath": "airflow.cfg"
             },
             {
               "mountPath": 
"/usr/local/airflow/config/airflow_local_settings.py",
               "name": "config",
               "readOnly": true,
               "subPath": "airflow_local_settings.py"
             }
           ]
         }
       ],
       "imagePullSecrets": [
         {
           "name": "quasarian-spectroscope-0644-registry"
         }
       ],
       "restartPolicy": "Never",
       "securityContext": {
         "fsGroup": 50000,
         "runAsUser": 50000
       },
       "serviceAccountName": "quasarian-spectroscope-0644-airflow-worker",
       "volumes": [
         {
           "emptyDir": {},
           "name": "logs"
         },
         {
           "configMap": {
             "name": "quasarian-spectroscope-0644-airflow-config"
           },
           "name": "config"
         }
       ]
     }
   }
   Traceback (most recent call last):
     File 
"/usr/local/lib/python3.9/site-packages/airflow/executors/kubernetes_executor.py",
 line 262, in run_pod_async
       resp = self.kube_client.create_namespaced_pod(
     File 
"/usr/local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", 
line 6174, in create_namespaced_pod
       (data) = self.create_namespaced_pod_with_http_info(namespace, body, 
**kwargs)  # noqa: E501
     File 
"/usr/local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", 
line 6251, in create_namespaced_pod_with_http_info
       return self.api_client.call_api(
     File 
"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 
340, in call_api
       return self.__call_api(resource_path, method,
     File 
"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 
172, in __call_api
       response_data = self.request(
     File 
"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 
382, in request
       return self.rest_client.POST(url,
     File "/usr/local/lib/python3.9/site-packages/kubernetes/client/rest.py", 
line 272, in POST
       return self.request("POST", url,
     File "/usr/local/lib/python3.9/site-packages/kubernetes/client/rest.py", 
line 231, in request
       raise ApiException(http_resp=r)
   kubernetes.client.rest.ApiException: (422)
   Reason: Unprocessable Entity
   HTTP response headers: HTTPHeaderDict({'Audit-Id': 
'be88cd36-49fb-4398-92c4-b50bb80084c8', 'Cache-Control': 'no-cache, private', 
'Content-Type': 'application/json', 'Date': 'Fri, 29 Oct 2021 00:15:02 GMT', 
'Content-Length': '799'})
   HTTP response body: 
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod
 \"resourcetask.4c8ffd0238954d27b7891491a8d6da4f\" is invalid: 
[spec.containers[0].resources.requests: Invalid value: \"500m\": must be less 
than or equal to cpu limit, spec.containers[0].resources.requests: Invalid 
value: \"500Mi\": must be less than or equal to memory 
limit]","reason":"Invalid","details":{"name":"resourcetask.4c8ffd0238954d27b7891491a8d6da4f","kind":"Pod","causes":[{"reason":"FieldValueInvalid","message":"Invalid
 value: \"500m\": must be less than or equal to cpu 
limit","field":"spec.containers[0].resources.requests"},{"reason":"FieldValueInvalid","message":"Invalid
 value: \"500Mi\": must be less than or equal to memory 
limit","field":"spec.containers[0].resources.requests"}]},"code":422}
   
   
   [2021-10-29 00:15:02,308] {kubernetes_executor.py:609} WARNING - 
ApiException when attempting to run task, re-queueing. Message: Pod 
"resourcetask.4c8ffd0238954d27b7891491a8d6da4f" is invalid: 
[spec.containers[0].resources.requests: Invalid value: "500m": must be less 
than or equal to cpu limit, spec.containers[0].resources.requests: Invalid 
value: "500Mi": must be less than or equal to memory limit]
   [2021-10-29 00:15:02,308] {kubernetes_executor.py:621} DEBUG - Next timed 
event is in 53.668401
   
     ```
   </details>
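
For readers skimming the logs, the relevant part is the `Status` body returned with the 422. The snippet below is a small illustration of pulling the status code and per-field causes out of such a body with the standard `json` module. `summarize_status` is a hypothetical helper, and the sample body is a shortened version of the response shown in the logs above.

```python
import json
from typing import List, Tuple


def summarize_status(body: str) -> Tuple[int, List[Tuple[str, str]]]:
    """Return (code, [(field, message), ...]) from a Kubernetes Status body."""
    status = json.loads(body)
    causes = status.get("details", {}).get("causes", [])
    return status.get("code"), [(c.get("field"), c.get("message")) for c in causes]


# Shortened from the response body in the scheduler logs.
sample_body = (
    r'{"kind":"Status","apiVersion":"v1","status":"Failure","reason":"Invalid",'
    r'"details":{"kind":"Pod","causes":['
    r'{"reason":"FieldValueInvalid",'
    r'"message":"Invalid value: \"500m\": must be less than or equal to cpu limit",'
    r'"field":"spec.containers[0].resources.requests"},'
    r'{"reason":"FieldValueInvalid",'
    r'"message":"Invalid value: \"500Mi\": must be less than or equal to memory limit",'
    r'"field":"spec.containers[0].resources.requests"}]},"code":422}'
)

code, causes = summarize_status(sample_body)
print(code)
for field_name, message in causes:
    print(field_name, "->", message)
```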
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   



