chenyair opened a new issue, #35792: URL: https://github.com/apache/airflow/issues/35792
### Apache Airflow version

Other Airflow 2 version (please specify below)

### What happened

I am using Airflow 2.5.2, but this issue applies to all versions of Airflow. When I create a task and there is not enough quota for the Airflow executor to create the pod, the Kubernetes API returns an `ApiException` with status code 403 (`Reason: Forbidden`) and the message: `Pods ... is forbidden: exceeded quota: compute-resources, ...`. The Kubernetes executor puts the task back in the queue because the status code is not 400 or 422, from `kubernetes_executor.py`:

```python
# These codes indicate something is wrong with pod definition; otherwise we assume pod
# definition is ok, and that retrying may work
if e.status in (400, 422):
```

The problem is that the executor retries the task excessively, spamming the Kubernetes API, which in turn makes Kyverno write a large number of objects to etcd.

### What you think should happen instead

I want to be able to control how many times the scheduler re-queues a task, and the timeout between each attempt to re-run a task that was re-queued.

### How to reproduce

Run an Airflow task with insufficient memory and CPU in the ACRQ.

### Operating System

Red Hat Enterprise Linux 8.5 (Ootpa)

### Versions of Apache Airflow Providers

_No response_

### Deployment

Other 3rd-party Helm chart

### Deployment details

_No response_

### Anything else

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
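For illustration, here is a minimal sketch of the kind of bounded-retry decision being requested. The function name `should_requeue` and the `max_attempts` cap are hypothetical, not existing Airflow configuration; only the `(400, 422)` fatal-code check mirrors the snippet from `kubernetes_executor.py` above.

```python
# Hypothetical sketch of a bounded re-queue policy for pod-creation failures.
# MAX_ATTEMPTS and should_requeue() are illustrative names, not real Airflow
# settings or APIs.

FATAL_API_CODES = (400, 422)  # pod definition is wrong; retrying cannot help


def should_requeue(status: int, attempts: int, max_attempts: int = 5) -> bool:
    """Decide whether a failed pod creation should be put back on the queue.

    status       -- HTTP status code from the Kubernetes ApiException
    attempts     -- how many times this task has already been re-queued
    max_attempts -- proposed configurable cap on re-queue attempts
    """
    if status in FATAL_API_CODES:
        # Bad pod spec: fail immediately, as the executor already does today.
        return False
    # Transient error (e.g. 403 quota exceeded): retry, but only up to the cap,
    # so the scheduler stops hammering the Kubernetes API indefinitely.
    return attempts < max_attempts
```

A delay between attempts (the "timeout" requested above) could similarly be made configurable, e.g. a fixed or exponential backoff applied before each re-queue.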
