TreasureMaster opened a new issue, #41851:
URL: https://github.com/apache/airflow/issues/41851

   ### Apache Airflow version
   
   2.9.3
   
   ### If "Other Airflow 2 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   We have updated the version from 2.6.2 to 2.9.3. After the update the main 
page opens very slowly. The 'failed' button causes Airflow to fail due to a 
timeout. 
   [Start the discussion 
here](https://github.com/apache/airflow/discussions/41113)
   
   ### What you think should happen instead?
   
   We have 3200 Dags now. And their number is increasing. 
   The slowdown is caused by line 855 in the file `/airflow/www/views.py`:
   ```python
   status_count_failed = get_query_count(failed_dags, session=session)
   ```
   But not all of it, but only that part of it that filters dags by access 
rights. We commented out line 799 to prevent filtering by IN. This row:
   ```python
   dags_query = dags_query.where(DagModel.dag_id.in_(filter_dag_ids))
   ```
   This filter condition is not optimal when querying and is very slow. The 
slow part is highlighted in the query
   
![airflow-failed-sql](https://github.com/user-attachments/assets/7adf5f8d-ebd8-4c04-a7b6-8765dc94e206)
   
   It is converted from this line:
   ```sql
   WHERE NOT dag.is_subdag AND dag.is_active AND dag.dag_id IN 
(__[POSTCOMPILE_dag_id_1])
   ```
   The entire request takes between 60 and 70 seconds on our main page. But 
without this filtering part, the request takes just over 1 second.
   My dbeaver hangs during DB query due to lack of memory.
   
   Is it possible to optimize this query? Or split it into parts and filter by 
permissions it using Python.
   
   ### How to reproduce
   
   You need to have a large number of dags and dagruns.
   
   ### Operating System
   
   CentOS 7
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   Our infrastructure:
   1) database server
   2) server with Airflow webserver, scheduler and Redis
   3) three servers with workers
   
   Modified bitnami images are used. Additional libraries installed.
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to