Re: [PR] Optimize /ui/dags endpoint serialization [airflow]

via GitHub Mon, 09 Feb 2026 21:02:27 -0800


john-rodriguez-mgni commented on code in PR #61483:
URL: https://github.com/apache/airflow/pull/61483#discussion_r2785813067



##########
airflow-core/src/airflow/api_fastapi/core_api/routes/ui/dags.py:
##########
@@ -234,18 +234,48 @@ def get_dags(
             pending_actions_by_dag_id[dag_id].append(hitl_detail)
 
     # aggregate rows by dag_id
-    dag_runs_by_dag_id: dict[str, DAGWithLatestDagRunsResponse] = {
-        dag.dag_id: DAGWithLatestDagRunsResponse.model_validate(
-            {
-                **DAGResponse.model_validate(dag).model_dump(),
-                "asset_expression": dag.asset_expression,
-                "latest_dag_runs": [],
-                "pending_actions": pending_actions_by_dag_id[dag.dag_id],
-                "is_favorite": dag.dag_id in favorite_dag_ids,
-            }
+    # PERFORMANCE FIX: Validate once per DAG, then extend with extra fields
+    dag_runs_by_dag_id: dict[str, DAGWithLatestDagRunsResponse] = {}
+    for dag in dags:
+        # Validate ORM to DAGResponse once (this computes file_token)
+        dag_response = DAGResponse.model_validate(dag)
+        # Construct DAGWithLatestDagRunsResponse directly from validated response
+        # Use model_construct with explicit field copying to avoid model_dump() overhead
+        # which would convert DagTagResponse objects to dicts
+        dag_runs_by_dag_id[dag.dag_id] = DAGWithLatestDagRunsResponse.model_construct(
+            # Fields from DAGResponse (keep objects as-is, don't serialize to dict)
+            dag_id=dag_response.dag_id,
+            dag_display_name=dag_response.dag_display_name,
+            is_paused=dag_response.is_paused,
+            is_stale=dag_response.is_stale,
+            last_parsed_time=dag_response.last_parsed_time,
+            last_parse_duration=dag_response.last_parse_duration,
+            last_expired=dag_response.last_expired,
+            bundle_name=dag_response.bundle_name,
+            bundle_version=dag_response.bundle_version,
+            relative_fileloc=dag_response.relative_fileloc,
+            fileloc=dag_response.fileloc,
+            description=dag_response.description,
+            timetable_summary=dag_response.timetable_summary,
+            timetable_description=dag_response.timetable_description,
+            tags=dag_response.tags,  # Keep as DagTagResponse objects
+            max_active_tasks=dag_response.max_active_tasks,
+            max_active_runs=dag_response.max_active_runs,
+            max_consecutive_failed_dag_runs=dag_response.max_consecutive_failed_dag_runs,
+            has_task_concurrency_limits=dag_response.has_task_concurrency_limits,
+            has_import_errors=dag_response.has_import_errors,
+            next_dagrun_logical_date=dag_response.next_dagrun_logical_date,
+            next_dagrun_data_interval_start=dag_response.next_dagrun_data_interval_start,
+            next_dagrun_data_interval_end=dag_response.next_dagrun_data_interval_end,
+            next_dagrun_run_after=dag_response.next_dagrun_run_after,
+            owners=dag_response.owners,
+            file_token=dag_response.file_token,  # Computed field - already computed
+            # Extra fields for DAGWithLatestDagRunsResponse
+            asset_expression=dag.asset_expression,
+            latest_dag_runs=[],
+            pending_actions=pending_actions_by_dag_id[dag.dag_id],
+            is_favorite=dag.dag_id in favorite_dag_ids,

Review Comment:
   Thanks for the feedback, @pierrejeambrun! I've updated the PR with both suggestions:
   
   1. Module-level serializer: Replaced `@lru_cache` with a module-level `_FILE_TOKEN_SERIALIZER` constant.
   
   2. Single validation pass: Refactored to use a single `DAGWithLatestDagRunsResponse.model_validate()` call on a dict built from ORM attributes plus the extra fields. All fields go through Pydantic validation in one pass, which avoids the double-validation overhead.
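
For readers following along, here is a minimal sketch of the two changes described above, assuming Pydantic v2. It is not the code from the PR: `DagSummary`, `DagSummaryWithExtras`, `_StandInSerializer`, `BASE_FIELDS`, and `build_response` are hypothetical names, and the base64 stand-in only marks where a real token serializer would sit. Only the `_FILE_TOKEN_SERIALIZER` name and the "one `model_validate()` call on a dict of ORM attributes plus extras" shape come from the comment.

```python
import base64
from types import SimpleNamespace

from pydantic import BaseModel, ConfigDict, computed_field


class _StandInSerializer:
    """Hypothetical stand-in for a real token serializer (not Airflow's)."""

    def dumps(self, value: str) -> str:
        return base64.urlsafe_b64encode(value.encode()).decode()


# Built once at import time: no per-call construction and no cache lookup.
_FILE_TOKEN_SERIALIZER = _StandInSerializer()


class DagSummary(BaseModel):
    """Hypothetical, trimmed-down analogue of a base DAG response model."""

    model_config = ConfigDict(from_attributes=True)

    dag_id: str
    fileloc: str
    is_paused: bool

    @computed_field
    @property
    def file_token(self) -> str:
        # Computed from the module-level constant when accessed or dumped.
        return _FILE_TOKEN_SERIALIZER.dumps(self.fileloc)


class DagSummaryWithExtras(DagSummary):
    """Hypothetical analogue of the extended response with extra fields."""

    latest_dag_runs: list[str] = []
    is_favorite: bool = False


# Declared (non-computed) field names of the base model, resolved once.
BASE_FIELDS = tuple(DagSummary.model_fields)


def build_response(dag, favorite_dag_ids):
    # Read ORM attributes straight into a dict, add the extra fields,
    # then validate exactly once against the extended model.
    payload = {name: getattr(dag, name) for name in BASE_FIELDS}
    payload["latest_dag_runs"] = []
    payload["is_favorite"] = dag.dag_id in favorite_dag_ids
    return DagSummaryWithExtras.model_validate(payload)


# A SimpleNamespace stands in for an ORM row here.
orm_row = SimpleNamespace(dag_id="example", fileloc="/dags/example.py", is_paused=False)
response = build_response(orm_row, favorite_dag_ids={"example"})
```

Compared with `model_construct()`, this shape keeps validation on for every field, and the field list is derived from the model rather than copied by hand, so a newly added field is less likely to be missed.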
   
   Tested locally with ~25 DAGs: the /ui/dags endpoint responds in ~130ms.



