dstandish commented on code in PR #53429:
URL: https://github.com/apache/airflow/pull/53429#discussion_r2223417179
##########
airflow-core/src/airflow/api_fastapi/core_api/routes/ui/dags.py:
##########
@@ -191,3 +195,36 @@ def get_dags(
total_entries=total_entries,
dags=list(dag_runs_by_dag_id.values()),
)
+
+
+@dags_router.get(
+    "/{dag_id}/latest_run",
+    responses=create_openapi_http_exception_doc([status.HTTP_404_NOT_FOUND]),
+    dependencies=[Depends(requires_access_dag(method="GET", access_entity=DagAccessEntity.RUN))],
+)
+def get_latest_run_info(dag_id: str, session: SessionDep) -> list[DAGRunLightResponse]:
+    """Get latest run."""
+    if dag_id == "~":
+        raise HTTPException(
+            status.HTTP_400_BAD_REQUEST,
+            "`~` was supplied as dag_id, but querying multiple dags is not supported.",
Review Comment:
> The public endpoint taking 1.3 seconds to fetch and serialize a single item (limit=1) is seriously concerning; maybe we should fix that instead of adding a new endpoint to work around the problem. Something is wrong for it to be that slow. Even if the serializers are different and the public one returns more fields than the private endpoint in this PR, hitting the DB, preloading options (which I think are missing), and serializing one item should take about the same time. Does your install have any specific setup? What is actually taking all that time? It might be worth looking into for public API improvement; I would expect a response time under 200 ms. If we fix it, we might not even need the new endpoint.
Part of the problem is that Airflow's deserialization process is very slow in general, and particularly for large dags. Large dags are exactly what I am working on improving now, to help users and make our product better.
Improving the deserialization process, and the public endpoint, may be a valid goal, but that is not the problem I am tackling here. I am making a pretty dramatic improvement in load time. You are essentially saying, "why don't you solve some other problem?"
Why not accept this optimization, and then, if and when we find time to tackle the broader issues with the public API and solve them satisfactorily, chop these specific endpoints if we no longer need them?
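For whoever does pick up the public-endpoint investigation, `cProfile` is one way to see where the 1.3 s actually goes. A self-contained sketch with a stand-in handler (none of these names are Airflow's; the workload is fabricated to give the profiler something to report):

```python
# Generic profiling sketch: wrap the suspect request handler in cProfile and
# print the hottest call sites by cumulative time. The handler below is a
# stand-in, not Airflow code.
import cProfile
import io
import pstats


def deserialize_dag(blob: dict) -> dict:
    # Stand-in for an expensive deserialization step.
    return {k: str(v) * 2 for k, v in blob.items()}


def handler() -> dict:
    blob = {f"task_{i}": i for i in range(10_000)}
    return deserialize_dag(blob)


profiler = cProfile.Profile()
profiler.enable()
handler()
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
report = out.getvalue()
```

Running the real endpoint under this kind of profile would show whether the time is in the DB round-trips, the deserialization, or the response serializer, which would settle the discussion above with data.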
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]