tirkarthi opened a new pull request, #57425: URL: https://github.com/apache/airflow/pull/57425
The recent dag runs query was triggering access to the task_instances, task_instance_histories, deadlines and dag_run_note tables as part of `model_validate` using `DagRunResponse`. Each attribute access caused a query, resulting in n+1 queries for each dagrun in the recent runs response. The recent runs response also included a lot of fields that are unused in the UI. This PR simplifies the response by selectively querying only the needed fields and uses `DAGRunLightResponse`, with the `duration` field included, which covers all the fields required by the frontend.

Additionally, the `pending_actions` filter, which filters by required actions, had the condition `has_pending_actions.value is not False`. On the normal dags list page `has_pending_actions.value` is None, and since `None is not False` evaluates to True, the query was evaluated even where it was not required. Fixing this also improved performance.

With the removal of the n+1 queries and unused fields, the overall response time improved by up to 7-8x: a page that took 1 second to load for 15 dags and 2100 dagruns now loads in around 100ms. The improvement is more noticeable in large environments now that the n+1 queries are no longer executed. The HTTP response size was also reduced by removing unused fields from the response.

https://github.com/pydantic/pydantic/issues/8192
https://github.com/sqlalchemy/sqlalchemy/discussions/10120
https://docs.sqlalchemy.org/en/20/orm/queryguide/relationships.html#sqlalchemy.orm.noload

Closes #57418
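
The `pending_actions` condition described above can be illustrated with a minimal, standalone sketch. The function names below are hypothetical and are not taken from the Airflow codebase; the "strict" variant is only one possible way to avoid the unwanted evaluation and is not necessarily the check this PR uses.

```python
from typing import Optional


def applies_filter_original(value: Optional[bool]) -> bool:
    # Mirrors the condition quoted in the description:
    # `has_pending_actions.value is not False`.
    # Only an explicit False skips the filter, so None (filter not set) passes too.
    return value is not False


def applies_filter_strict(value: Optional[bool]) -> bool:
    # Hypothetical alternative (an assumption, not the PR's actual code):
    # run the pending-actions query only when the filter is explicitly True.
    return value is True


if __name__ == "__main__":
    for value in (True, False, None):
        print(value, applies_filter_original(value), applies_filter_strict(value))
```

The two checks differ only for `None`, which is the value seen on the normal dags list page, so that is the case where the extra query was being evaluated.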
