john-rodriguez-mgni commented on code in PR #61483:
URL: https://github.com/apache/airflow/pull/61483#discussion_r2785810835
##########
airflow-core/src/airflow/api_fastapi/core_api/routes/ui/dags.py:
##########
@@ -234,18 +234,48 @@ def get_dags(
pending_actions_by_dag_id[dag_id].append(hitl_detail)
# aggregate rows by dag_id
-    dag_runs_by_dag_id: dict[str, DAGWithLatestDagRunsResponse] = {
-        dag.dag_id: DAGWithLatestDagRunsResponse.model_validate(
-            {
-                **DAGResponse.model_validate(dag).model_dump(),
-                "asset_expression": dag.asset_expression,
-                "latest_dag_runs": [],
-                "pending_actions": pending_actions_by_dag_id[dag.dag_id],
-                "is_favorite": dag.dag_id in favorite_dag_ids,
-            }
+    # PERFORMANCE FIX: Validate once per DAG, then extend with extra fields
+    dag_runs_by_dag_id: dict[str, DAGWithLatestDagRunsResponse] = {}
+    for dag in dags:
+        # Validate ORM to DAGResponse once (this computes file_token)
+        dag_response = DAGResponse.model_validate(dag)
+        # Construct DAGWithLatestDagRunsResponse directly from validated response
+        # Use model_construct with explicit field copying to avoid model_dump() overhead
+        # which would convert DagTagResponse objects to dicts
+        dag_runs_by_dag_id[dag.dag_id] = DAGWithLatestDagRunsResponse.model_construct(
+            # Fields from DAGResponse (keep objects as-is, don't serialize to dict)
+            dag_id=dag_response.dag_id,
+            dag_display_name=dag_response.dag_display_name,
+            is_paused=dag_response.is_paused,
+            is_stale=dag_response.is_stale,
+            last_parsed_time=dag_response.last_parsed_time,
+            last_parse_duration=dag_response.last_parse_duration,
+            last_expired=dag_response.last_expired,
+            bundle_name=dag_response.bundle_name,
+            bundle_version=dag_response.bundle_version,
+            relative_fileloc=dag_response.relative_fileloc,
+            fileloc=dag_response.fileloc,
+            description=dag_response.description,
+            timetable_summary=dag_response.timetable_summary,
+            timetable_description=dag_response.timetable_description,
+            tags=dag_response.tags,  # Keep as DagTagResponse objects
+            max_active_tasks=dag_response.max_active_tasks,
+            max_active_runs=dag_response.max_active_runs,
+            max_consecutive_failed_dag_runs=dag_response.max_consecutive_failed_dag_runs,
+            has_task_concurrency_limits=dag_response.has_task_concurrency_limits,
+            has_import_errors=dag_response.has_import_errors,
+            next_dagrun_logical_date=dag_response.next_dagrun_logical_date,
+            next_dagrun_data_interval_start=dag_response.next_dagrun_data_interval_start,
+            next_dagrun_data_interval_end=dag_response.next_dagrun_data_interval_end,
+            next_dagrun_run_after=dag_response.next_dagrun_run_after,
+            owners=dag_response.owners,
+            file_token=dag_response.file_token,  # Computed field - already computed
+            # Extra fields for DAGWithLatestDagRunsResponse
+            asset_expression=dag.asset_expression,
+            latest_dag_runs=[],
+            pending_actions=pending_actions_by_dag_id[dag.dag_id],
+            is_favorite=dag.dag_id in favorite_dag_ids,
Review Comment:
Thanks for the feedback, @pierrejeambrun! I've updated the PR with both suggestions:
1. Module-level serializer: Replaced `@lru_cache` with a module-level `_FILE_TOKEN_SERIALIZER` constant (first sketch below).
2. Single validation pass: Refactored to use a single `DAGWithLatestDagRunsResponse.model_validate()` call on a dict built from ORM attributes + extra fields (second sketch below). This ensures all fields go through Pydantic validation in one pass while avoiding the double-validation overhead.
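For reference, the serializer change looks roughly like this (simplified sketch; the config key and helper name here are illustrative rather than verbatim from the PR):

```python
# Simplified sketch of the module-level serializer; the config lookup and
# helper name are illustrative, not copied from the PR.
from itsdangerous import URLSafeSerializer

from airflow.configuration import conf

# Built once at import time instead of memoizing per call with @lru_cache.
_FILE_TOKEN_SERIALIZER = URLSafeSerializer(conf.get_mandatory_value("api", "secret_key"))


def _file_token(relative_fileloc: str) -> str:
    # Hypothetical helper: signs the relative file location into a file token.
    return _FILE_TOKEN_SERIALIZER.dumps(relative_fileloc)
```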
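And the single-pass aggregation is roughly the following (abridged sketch; most of the shared DAGResponse fields are elided and the exact dict construction in the PR may differ):

```python
# Abridged sketch of the single model_validate() pass per DAG.
dag_runs_by_dag_id: dict[str, DAGWithLatestDagRunsResponse] = {}
for dag in dags:
    dag_runs_by_dag_id[dag.dag_id] = DAGWithLatestDagRunsResponse.model_validate(
        {
            # ORM attributes backing the shared DAGResponse fields
            "dag_id": dag.dag_id,
            "dag_display_name": dag.dag_display_name,
            "is_paused": dag.is_paused,
            # ... remaining DAGResponse fields ...
            # extra fields specific to DAGWithLatestDagRunsResponse
            "asset_expression": dag.asset_expression,
            "latest_dag_runs": [],
            "pending_actions": pending_actions_by_dag_id[dag.dag_id],
            "is_favorite": dag.dag_id in favorite_dag_ids,
        }
    )
```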
Tested locally with ~25 DAGs: the `/ui/dags` endpoint responds in ~130ms.
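A minimal way to reproduce a similar timing locally is sketched below (host, port, and token are placeholders; the real deployment and auth setup will differ):

```python
# Rough timing sketch for the /ui/dags endpoint; BASE_URL and TOKEN are placeholders.
import time

import httpx

BASE_URL = "http://localhost:8080"  # hypothetical local API server
TOKEN = "..."  # supply a valid access token for your instance

start = time.perf_counter()
resp = httpx.get(
    f"{BASE_URL}/ui/dags",
    params={"limit": 50},
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"HTTP {resp.status_code} in {elapsed_ms:.0f} ms ({len(resp.content)} bytes)")
```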