tirkarthi opened a new issue, #60265: URL: https://github.com/apache/airflow/issues/60265
### Apache Airflow version main (development) ### If "Other Airflow 3 version" selected, which one? _No response_ ### What happened? We noticed in our instance the API calls had varied amount of time for same payload. For example the API call to fetch latest_run will be slow on the first load of the dag details page but subsequent auto refresh calls were faster. To debug this we enabled sql logging and the query for the endpoint was simple without any indexing or n+1 issues. Later we commented out the dependencies for the endpoint `Depends(requires_access_dag(method="GET", access_entity=DagAccessEntity.RUN))` and found the timings significantly improved. Then later we debugged and found that calls to convert the user_id from decoded jwt to User object from database was affecting the timings which includes a call to fetch user and also a complex join call to fetch permissions which could be causing the delays. We wrote a simple dict to store the id with the User object since we store the users in db and id is not reused. Thus it immutably holds a User to an id forever without any future conflicts. We applied the changes to our custom auth manager from fab auth manager and then noticed significant performance improvements. The distribution of the histogram was also much narrower without much deviation. The frontend also became snappy since the auto refresh calls and other calls are much faster. Database: sqlite. Also reproducible with mysql in our instance. Workers: 4 (Detault as per airflow api-server command) ``` hey -H "Cookie: _token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxIiwianRpIjoiMjI1MGFiMjE0NDIzNDQ2Mzk5YTQ5MDM4ZmY1Y2RiNjIiLCJpc3MiOltdLCJhdWQiOiJhcGFjaGUtYWlyZmxvdyIsIm5iZiI6MTc2Nzg3OTIwMSwiZXhwIjoxNzY3ODgyODAxLCJpYXQiOjE3Njc4NzkyMDF9.TljAVQubDebbjXl3gSROZG4H5_X7yU_wpU-IR4PMDfnvzcg4aJE7tHbHCLEGzDYsoa2uLQJLqyxA3V-8sqCJEg" http://localhost:8000/ui/dags/example_complex/latest_run Summary: Total: 2.2426 secs Slowest: 1.1800 secs Fastest: 0.0623 secs Average: 0.4897 secs Requests/sec: 89.1818 Response time histogram: 0.062 [1] |■ 0.174 [11] |■■■■■■ 0.286 [36] |■■■■■■■■■■■■■■■■■■■■ 0.398 [72] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.509 [26] |■■■■■■■■■■■■■■ 0.621 [4] |■■ 0.733 [0] | 0.845 [7] |■■■■ 0.956 [13] |■■■■■■■ 1.068 [16] |■■■■■■■■■ 1.180 [14] |■■■■■■■■ Latency distribution: 10% in 0.2208 secs 25% in 0.2948 secs 50% in 0.3543 secs 75% in 0.8007 secs 90% in 1.0434 secs 95% in 1.1479 secs 99% in 1.1799 secs Details (average, fastest, slowest): DNS+dialup: 0.0003 secs, 0.0623 secs, 1.1800 secs DNS-lookup: 0.0001 secs, 0.0000 secs, 0.0008 secs req write: 0.0001 secs, 0.0000 secs, 0.0016 secs resp wait: 0.4890 secs, 0.0621 secs, 1.1787 secs resp read: 0.0002 secs, 0.0000 secs, 0.0019 secs Status code distribution: [200] 200 responses ``` Queries ``` [2026-01-08T13:35:14.606436Z] {base.py:2710} INFO - BEGIN (implicit) [2026-01-08T13:35:14.608468Z] {base.py:1846} INFO - SELECT ab_user.id, ab_user.first_name, ab_user.last_name, ab_user.username, ab_user.password, ab_user.active, ab_user.email, ab_user.last_login, ab_user.login_count, ab_user.fail_login_count, ab_user.created_on, ab_user.changed_on, ab_user.created_by_fk, ab_user.changed_by_fk FROM ab_user WHERE ab_user.id = %(id_1)s [2026-01-08T13:35:14.608612Z] {base.py:1846} INFO - [generated in 0.00025s] {'id_1': 1} [2026-01-08T13:35:14.616390Z] {base.py:1846} INFO - SELECT ab_user_1.id AS ab_user_1_id, ab_role.id AS ab_role_id, ab_role.name AS ab_role_name, ab_permission_1.id AS ab_permission_1_id, ab_permission_1.name AS ab_permission_1_name, ab_view_menu_1.id AS ab_view_menu_1_id, ab_view_menu_1.name AS ab_view_menu_1_name, ab_permission_view_1.id AS ab_permission_view_1_id, ab_permission_view_1.permission_id AS ab_permission_view_1_permission_id, ab_permission_view_1.view_menu_id AS ab_permission_view_1_view_menu_id FROM ab_user AS ab_user_1 INNER JOIN ab_user_role AS ab_user_role_1 ON ab_user_1.id = ab_user_role_1.user_id INNER JOIN ab_role ON ab_role.id = ab_user_role_1.role_id LEFT OUTER JOIN (ab_permission_view_role AS ab_permission_view_role_1 INNER JOIN ab_permission_view AS ab_permission_view_1 ON ab_permission_view_1.id = ab_permission_view_role_1.permission_view_id) ON ab_role.id = ab_permission_view_role_1.role_id LEFT OUTER JOIN ab_permission AS ab_permission_1 ON ab_permission_1.id = ab_permission_view_1.permission_id LEFT OUTER JOIN ab_view_menu AS ab_view_menu_1 ON ab_view_menu_1.id = ab_permission_view_1.view_menu_id WHERE ab_user_1.id IN (%(primary_keys_1)s) [2026-01-08T13:35:14.616567Z] {base.py:1846} INFO - [generated in 0.00048s] {'primary_keys_1': 1} [2026-01-08T13:35:14.622088Z] {base.py:2716} INFO - COMMIT [2026-01-08T13:35:14.627531Z] {base.py:2710} INFO - BEGIN (implicit) [2026-01-08T13:35:14.731046Z] {base.py:1846} INFO - SELECT team.name FROM dag_bundle INNER JOIN dag_bundle_team AS dag_bundle_team_1 ON dag_bundle.name = dag_bundle_team_1.dag_bundle_name INNER JOIN team ON team.name = dag_bundle_team_1.team_name INNER JOIN dag ON dag.bundle_name = dag_bundle.name WHERE dag.dag_id = %(dag_id_1)s [2026-01-08T13:35:14.731210Z] {base.py:1846} INFO - [generated in 0.00030s] {'dag_id_1': 'example_complex'} [2026-01-08T13:35:14.732375Z] {base.py:2716} INFO - COMMIT [2026-01-08T13:35:14.734832Z] {base.py:2710} INFO - BEGIN (implicit) [2026-01-08T13:35:14.736355Z] {base.py:1846} INFO - SELECT dag_run.id, dag_run.dag_id, dag_run.run_id, dag_run.end_date, dag_run.logical_date, dag_run.run_after, dag_run.start_date, dag_run.state FROM dag_run WHERE dag_run.dag_id = %(dag_id_1)s ORDER BY dag_run.run_after DESC LIMIT %(param_1)s [2026-01-08T13:35:14.736486Z] {base.py:1846} INFO - [generated in 0.00022s] {'dag_id_1': 'example_complex', 'param_1': 1} [2026-01-08T13:35:14.738814Z] {base.py:2716} INFO - COMMIT INFO: 127.0.0.1:58590 - "GET /ui/dags/example_complex/latest_run HTTP/1.1" 200 OK ``` After 1 iteration to ensure all workers have the cached value ``` hey -H "Cookie: _token=eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxIiwianRpIjoiMjI1MGFiMjE0NDIzNDQ2Mzk5YTQ5MDM4ZmY1Y2RiNjIiLCJpc3MiOltdLCJhdWQiOiJhcGFjaGUtYWlyZmxvdyIsIm5iZiI6MTc2Nzg3OTIwMSwiZXhwIjoxNzY3ODgyODAxLCJpYXQiOjE3Njc4NzkyMDF9.TljAVQubDebbjXl3gSROZG4H5_X7yU_wpU-IR4PMDfnvzcg4aJE7tHbHCLEGzDYsoa2uLQJLqyxA3V-8sqCJEg" http://localhost:8000/ui/dags/example_complex/latest_run Summary: Total: 0.4605 secs Slowest: 0.1938 secs Fastest: 0.0415 secs Average: 0.1033 secs Requests/sec: 434.3205 Response time histogram: 0.042 [1] |■ 0.057 [7] |■■■■■ 0.072 [9] |■■■■■■ 0.087 [28] |■■■■■■■■■■■■■■■■■■■■ 0.102 [56] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.118 [44] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.133 [33] |■■■■■■■■■■■■■■■■■■■■■■■■ 0.148 [11] |■■■■■■■■ 0.163 [9] |■■■■■■ 0.179 [1] |■ 0.194 [1] |■ Latency distribution: 10% in 0.0748 secs 25% in 0.0893 secs 50% in 0.1020 secs 75% in 0.1205 secs 90% in 0.1344 secs 95% in 0.1488 secs 99% in 0.1640 secs Details (average, fastest, slowest): DNS+dialup: 0.0002 secs, 0.0415 secs, 0.1938 secs DNS-lookup: 0.0001 secs, 0.0000 secs, 0.0024 secs req write: 0.0001 secs, 0.0000 secs, 0.0008 secs resp wait: 0.1027 secs, 0.0413 secs, 0.1924 secs resp read: 0.0002 secs, 0.0001 secs, 0.0018 secs Status code distribution: [200] 200 responses ``` ``` [2026-01-08T13:36:15.881633Z] {base.py:2710} INFO - BEGIN (implicit) [2026-01-08T13:36:15.881946Z] {base.py:1846} INFO - SELECT team.name FROM dag_bundle INNER JOIN dag_bundle_team AS dag_bundle_team_1 ON dag_bundle.name = dag_bundle_team_1.dag_bundle_name INNER JOIN team ON team.name = dag_bundle_team_1.team_name INNER JOIN dag ON dag.bundle_name = dag_bundle.name WHERE dag.dag_id = %(dag_id_1)s [2026-01-08T13:36:15.882026Z] {base.py:1846} INFO - [cached since 4.759s ago] {'dag_id_1': 'example_complex'} [2026-01-08T13:36:15.883135Z] {base.py:2716} INFO - COMMIT [2026-01-08T13:36:15.885019Z] {base.py:2710} INFO - BEGIN (implicit) [2026-01-08T13:36:15.885327Z] {base.py:1846} INFO - SELECT dag_run.id, dag_run.dag_id, dag_run.run_id, dag_run.end_date, dag_run.logical_date, dag_run.run_after, dag_run.start_date, dag_run.state FROM dag_run WHERE dag_run.dag_id = %(dag_id_1)s ORDER BY dag_run.run_after DESC LIMIT %(param_1)s [2026-01-08T13:36:15.885404Z] {base.py:1846} INFO - [cached since 4.757s ago] {'dag_id_1': 'example_complex', 'param_1': 1} [2026-01-08T13:36:15.887501Z] {base.py:2716} INFO - COMMIT INFO: 127.0.0.1:58600 - "GET /ui/dags/example_complex/latest_run HTTP/1.1" 200 OK ``` Query analysis ``` mysql> explain SELECT ab_user_1.id AS ab_user_1_id, ab_role.id AS ab_role_id, ab_role.name AS ab_role_name, ab_permission_1.id AS ab_permission_1_id, ab_permission_1.name AS ab_permission_1_name, ab_view_menu_1.id AS ab_view_menu_1_id, ab_view_menu_1.name AS ab_view_menu_1_name, ab_permission_view_1.id AS ab_permission_view_1_id, ab_permission_view_1.permission_id AS ab_permission_view_1_permission_id, ab_permission_view_1.view_menu_id AS ab_permission_view_1_view_menu_id FROM ab_user AS ab_user_1 INNER JOIN ab_user_role AS ab_user_role_1 ON ab_user_1.id = ab_user_role_1.user_id INNER JOIN ab_role ON ab_role.id = ab_user_role_1.role_id LEFT OUTER JOIN (ab_permission_view_role AS ab_permission_view_role_1 INNER JOIN ab_permission_view AS ab_permission_view_1 ON ab_permission_view_1.id = ab_permission_view_role_1.permission_view_id) ON ab_role.id = ab_permission_view_role_1.role_id LEFT OUTER JOIN ab_permission AS ab_permission_1 ON ab_permission_1.id = ab_permission_view_1.permissi on_id LEFT OUTER JOIN ab_view_menu AS ab_view_menu_1 ON ab_view_menu_1.id = ab_permission_view_1.view_menu_id WHERE ab_user_1.id IN (1) -> ; +----+-------------+---------------------------+------------+--------+-------------------------------------------------------+-------------+---------+---------------------------------------------------------+------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------------+------------+--------+-------------------------------------------------------+-------------+---------+---------------------------------------------------------+------+----------+--------------------------+ | 1 | SIMPLE | ab_user_1 | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | Using index | | 1 | SIMPLE | ab_user_role_1 | NULL | ref | user_id,role_id | user_id | 5 | const | 1 | 100.00 | Using where; Using index | | 1 | SIMPLE | ab_role | NULL | eq_ref | PRIMARY | PRIMARY | 4 | airflow_db.ab_user_role_1.role_id | 1 | 100.00 | NULL | | 1 | SIMPLE | ab_permission_view_role_1 | NULL | ref | permission_view_id,idx_role_id,idx_permission_view_id | idx_role_id | 5 | airflow_db.ab_user_role_1.role_id | 62 | 100.00 | NULL | | 1 | SIMPLE | ab_permission_view_1 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | airflow_db.ab_permission_view_role_1.permission_view_id | 1 | 100.00 | NULL | | 1 | SIMPLE | ab_permission_1 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | airflow_db.ab_permission_view_1.permission_id | 1 | 100.00 | NULL | | 1 | SIMPLE | ab_view_menu_1 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | airflow_db.ab_permission_view_1.view_menu_id | 1 | 100.00 | NULL | +----+-------------+---------------------------+------------+--------+-------------------------------------------------------+-------------+---------+---------------------------------------------------------+------+----------+--------------------------+ 7 rows in set, 1 warning (0.01 sec) mysql> explain analyze SELECT ab_user_1.id AS ab_user_1_id, ab_role.id AS ab_role_id, ab_role.name AS ab_role_name, ab_permission_1.id AS ab_permission_1_id, ab_permission_1.name AS ab_permission_1_name, ab_view_menu_1.id AS ab_view_menu_1_id, ab_view_menu_1.name AS ab_view_menu_1_name, ab_permission_view_1.id AS ab_permission_view_1_id, ab_permission_view_1.permission_id AS ab_permission_view_1_permission_id, ab_permission_view_1.view_menu_id AS ab_permission_view_1_view_menu_id FROM ab_user AS ab_user_1 INNER JOIN ab_user_role AS ab_user_role_1 ON ab_user_1.id = ab_user_role_1.user_id INNER JOIN ab_role ON ab_role.id = ab_user_role_1.role_id LEFT OUTER JOIN (ab_permission_view_role AS ab_permission_view_role_1 INNER JOIN ab_permission_view AS ab_permission_view_1 ON ab_permission_view_1.id = ab_permission_view_role_1.permission_view_id) ON ab_role.id = ab_permission_view_role_1.role_id LEFT OUTER JOIN ab_permission AS ab_permission_1 ON ab_permission_1.id = ab_permission_view_1.permission_id LEFT OUTER JOIN ab_view_menu AS ab_view_menu_1 ON ab_view_menu_1.id = ab_permission_view_1.view_menu_id WHERE ab_user_1.id IN (1) \G; *************************** 1. row *************************** EXPLAIN: -> Nested loop left join (cost=51.3 rows=62.2) (actual time=0.127..0.669 rows=99 loops=1) -> Nested loop left join (cost=29.5 rows=62.2) (actual time=0.123..0.524 rows=99 loops=1) -> Nested loop left join (cost=7.68 rows=62.2) (actual time=0.119..0.395 rows=99 loops=1) -> Nested loop inner join (cost=0.7 rows=1) (actual time=0.0137..0.0152 rows=1 loops=1) -> Filter: (ab_user_role_1.role_id is not null) (cost=0.35 rows=1) (actual time=0.00628..0.00747 rows=1 loops=1) -> Covering index lookup on ab_user_role_1 using user_id (user_id=1) (cost=0.35 rows=1) (actual time=0.00567..0.00673 rows=1 loops=1) -> Single-row index lookup on ab_role using PRIMARY (id=ab_user_role_1.role_id) (cost=0.35 rows=1) (actual time=0.0063..0.00635 rows=1 loops=1) -> Nested loop inner join (cost=28.8 rows=62.2) (actual time=0.105..0.368 rows=99 loops=1) -> Index lookup on ab_permission_view_role_1 using idx_role_id (role_id=ab_user_role_1.role_id) (cost=6.98 rows=62.2) (actual time=0.101..0.176 rows=99 loops=1) -> Single-row index lookup on ab_permission_view_1 using PRIMARY (id=ab_permission_view_role_1.permission_view_id) (cost=0.252 rows=1) (actual time=0.00161..0.00166 rows=1 loops=99) -> Single-row index lookup on ab_permission_1 using PRIMARY (id=ab_permission_view_1.permission_id) (cost=0.252 rows=1) (actual time=978e-6..0.00103 rows=1 loops=99) -> Single-row index lookup on ab_view_menu_1 using PRIMARY (id=ab_permission_view_1.view_menu_id) (cost=0.252 rows=1) (actual time=0.00114..0.00119 rows=1 loops=99) 1 row in set (0.01 sec) ERROR: No query specified ``` ```diff diff --git a/providers/fab/src/airflow/providers/fab/auth_manager/fab_auth_manager.py b/providers/fab/src/airflow/providers/fab/auth_manager/fab_auth_manager.py index c5486bcdc1..8df85f081f 100644 --- a/providers/fab/src/airflow/providers/fab/auth_manager/fab_auth_manager.py +++ b/providers/fab/src/airflow/providers/fab/auth_manager/fab_auth_manager.py @@ -176,6 +176,7 @@ class FabAuthManager(BaseAuthManager[User]): This auth manager is responsible for providing a backward compatible user management experience to users. """ + user_cache: dict[int, User] = {} appbuilder: AirflowAppBuilder | None = None def init_flask_resources(self) -> None: @@ -254,8 +255,15 @@ class FabAuthManager(BaseAuthManager[User]): return current_user def deserialize_user(self, token: dict[str, Any]) -> User: - with create_session() as session: - return session.scalars(select(User).where(User.id == int(token["sub"]))).one() + id = int(token["sub"]) + user = self.user_cache.get(id) + + if not user: + with create_session() as session: + user = session.scalars(select(User).where(User.id == id)).one() + self.user_cache[id] = user + + return user def serialize_user(self, user: User) -> dict[str, Any]: return {"sub": str(user.id)} ``` ### What you think should happen instead? _No response_ ### How to reproduce 1. Trigger a dagrun and then copy the token in cookie to get the latest dagrun as above. 2. Apply the patch and make one iteration to ensure the workers have their cache filled. Repeat the benchmark. ### Operating System Ubuntu 20.04 ### Versions of Apache Airflow Providers _No response_ ### Deployment Virtualenv installation ### Deployment details _No response_ ### Anything else? _No response_ ### Are you willing to submit PR? - [x] Yes I am willing to submit a PR! ### Code of Conduct - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
