pierrejeambrun commented on code in PR #53429:
URL: https://github.com/apache/airflow/pull/53429#discussion_r2225910802


##########
airflow-core/src/airflow/api_fastapi/core_api/routes/ui/dags.py:
##########
@@ -191,3 +195,36 @@ def get_dags(
         total_entries=total_entries,
         dags=list(dag_runs_by_dag_id.values()),
     )
+
+
+@dags_router.get(
+    "/{dag_id}/latest_run",
+    responses=create_openapi_http_exception_doc([status.HTTP_404_NOT_FOUND]),
+    dependencies=[Depends(requires_access_dag(method="GET", access_entity=DagAccessEntity.RUN))],
+)
+def get_latest_run_info(dag_id: str, session: SessionDep) -> list[DAGRunLightResponse]:
+    """Get latest run."""
+    if dag_id == "~":
+        raise HTTPException(
+            status.HTTP_400_BAD_REQUEST,
+            "`~` was supplied as dag_id, but querying multiple dags is not supported.",

Review Comment:
   It feels like we're avoiding the real problem: adding a new, duplicated, over-specialized endpoint when fixing the public API is probably the best long-term fix. If the position is that it can't be done now because it's too complicated, and that this is a short-term workaround (while the public API is slow) to be removed once we improve the serialization side, that's a valid approach too. I just want to make sure we are aligned on the long-term goal.
   
   
   ```
           dag: DAG = dag_bag.get_dag(dag_id)
   ```
   
   Is that the only line that triggers Airflow deserialization and is absent from the new endpoint, making it fast? Since it's only done for 404 error handling, we could probably get away without it.
   
   Can you share the sample DAG you are using for testing purposes that yields a > 1s response time for a single request?
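   The alternative hinted at here (skip the deserialization that exists only to produce a 404) can be sketched as a plain existence query against the metadata database. This is an illustrative sketch only: the table and column names below are stand-ins, not Airflow's actual schema, and `dag_exists` is a hypothetical helper, not code from this PR.
   
   ```python
   import sqlite3
   
   def dag_exists(conn: sqlite3.Connection, dag_id: str) -> bool:
       # Cheap existence check: a single-row SELECT instead of loading and
       # deserializing the full serialized DAG just to decide on a 404.
       row = conn.execute(
           "SELECT 1 FROM dag WHERE dag_id = ? LIMIT 1", (dag_id,)
       ).fetchone()
       return row is not None
   
   # Minimal demo with an in-memory database standing in for the metadata DB.
   conn = sqlite3.connect(":memory:")
   conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
   conn.execute("INSERT INTO dag VALUES ('example_dag')")
   print(dag_exists(conn, "example_dag"))  # True
   print(dag_exists(conn, "missing"))      # False
   ```
   
   The endpoint would raise its 404 when the check returns False, never touching the serialized DAG blob.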




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
