Squigilum opened a new issue #12995: URL: https://github.com/apache/airflow/issues/12995
**Apache Airflow version**: 2.0.0rc1

**Kubernetes version (if you are using kubernetes)** (use `kubectl version`): 1.19.4

**Environment**:

- **Cloud provider or hardware configuration**: Laptop with 6 cores and 32GB RAM
- **OS** (e.g. from /etc/os-release): Ubuntu 20.04.1 LTS
- **Kernel** (e.g. `uname -a`): 5.4.0-56-generic
- **Install tools**:
- **Others**:

**What happened**:

I am running the 2.0.0 release candidate in minikube using the celery executor. It was installed using the helm chart in git, with the executor changed and a persistent volume claim added for storing DAGs. I'm testing different scaling options by launching large numbers of tasks and evaluating how quickly and consistently they run.

The DAG is run manually through the web server. On most runs, either some of the tasks fail with no explanation, or some tasks are left in the 'queued' state and never run. The tasks in the 'queued' state are shown as 'active' in the flower dashboard but do not appear to be actually running.

As part of my testing I have increased the values of AIRFLOW__CORE__DAG_CONCURRENCY and AIRFLOW__CELERY__WORKER_CONCURRENCY. This seems like it might exacerbate the problem, but I have also reproduced it with the default settings.

**What you expected to happen**:

All tasks run successfully.

**What do you think went wrong?**

Initially I thought I was over-taxing the system, but resource monitoring has shown nothing indicating this. My system has 11GB of RAM free and 4 CPUs, and CPU utilization never went over 30%.

**How to reproduce it**:

Attached is a simple DAG that produces the issue on my setup.

[concurrent_workflow.zip](https://github.com/apache/airflow/files/5675042/concurrent_workflow.zip)

**Anything else we need to know**:

I haven't seen anything indicating an error in the logs, but would be happy to provide them if requested.

**How often does this problem occur? Once? Every time etc?**

The majority of my runs (75-90%) have resulted in between 1 and 4 tasks stuck in the 'queued' state. The failed tasks are less frequent (approximately 25%).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
