tirkarthi opened a new issue, #59082:
URL: https://github.com/apache/airflow/issues/59082

   ### Description
   
   While reading FastAPI discussions I noticed that returning a pydantic model from an endpoint can cause the model to be validated again, even though validation was already done when the model was constructed. The thread also mentions overhead from serialization through `jsonable_encoder`. To check this, I benchmarked several combinations for a simple "ui/dags" endpoint: returning the pydantic model, returning a response serialized with json after validation, and returning a response serialized with orjson after validation. Returning a json- or orjson-encoded response gives more than 2x the requests per second of returning the model (roughly 2.3x to 2.6x in the numbers below).
   
   Tradeoffs
   
   * Using orjson adds an extra dependency; to avoid it, FastAPI's built-in `JSONResponse` can be used instead.
   * Skipping validation for response could be 
   * Need to check how automatic API generation works when the return type is not specified in the endpoint function's signature.
   
   Save the code below as app.py and start it with 10 uvicorn workers using `uvicorn app:app --workers 10 --port 8000`. Then run a benchmark session against the endpoints, each of which returns a fixed response.
   
   Benchmarked endpoints (`/dags1` to `/dags5`)
   
   1. Return the pydantic model; no return type annotation.
   2. Return an orjson response; model in the return type annotation, but no response validation.
   3. Return an orjson response; no model in the return type annotation, but the response is validated with pydantic.
   4. Validate the response dict with pydantic and return the dict; model in the return type annotation.
   5. Return a json response (`JSONResponse`); response validated with pydantic, but no model in the return type annotation.
   
   https://github.com/fastapi/fastapi/issues/1359
   https://github.com/fastapi/fastapi/issues/1224#issuecomment-617243856
   
   ```
   seq 1 5 | xargs -I{} time hey -c 100 -n 10000 http://localhost:8000/dags\{\} | grep -iE 'Average:|Requests/sec'
     Average:   0.0137 secs
     Requests/sec:      5766.3920
   1.05user 0.47system 0:01.75elapsed 86%CPU (0avgtext+0avgdata 20088maxresident)k
   0inputs+0outputs (1major+4194minor)pagefaults 0swaps
     Average:   0.0059 secs
     Requests/sec:      15063.7350
   0.86user 0.34system 0:00.68elapsed 178%CPU (0avgtext+0avgdata 21236maxresident)k
   0inputs+0outputs (0major+4595minor)pagefaults 0swaps
     Average:   0.0063 secs
     Requests/sec:      13714.7256
   0.89user 0.35system 0:00.74elapsed 167%CPU (0avgtext+0avgdata 26076maxresident)k
   0inputs+0outputs (0major+5741minor)pagefaults 0swaps
     Average:   0.0134 secs
     Requests/sec:      7052.8467
   1.03user 0.41system 0:01.43elapsed 100%CPU (0avgtext+0avgdata 19932maxresident)k
   0inputs+0outputs (0major+4202minor)pagefaults 0swaps
     Average:   0.0070 secs
     Requests/sec:      13043.1176
   0.87user 0.38system 0:00.78elapsed 160%CPU (0avgtext+0avgdata 23416maxresident)k
   0inputs+0outputs (0major+5021minor)pagefaults 0swaps
   ```
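
   To make the comparison easier to read, the throughput figures above can be expressed relative to `/dags1`. This is a small sketch that assumes the five result blocks appear in the order `/dags1` to `/dags5` (which is how `seq 1 5 | xargs` runs them); `parse_rps` and `speedups` are illustrative helper names.

```python
import re

# "Requests/sec" lines copied from the benchmark output, in run order.
HEY_OUTPUT = """
  Requests/sec:      5766.3920
  Requests/sec:      15063.7350
  Requests/sec:      13714.7256
  Requests/sec:      7052.8467
  Requests/sec:      13043.1176
"""


def parse_rps(text: str) -> list[float]:
    """Extract every Requests/sec value from hey output."""
    return [float(m) for m in re.findall(r"Requests/sec:\s*([\d.]+)", text)]


def speedups(rps: list[float]) -> list[float]:
    """Throughput of each run relative to the first run, rounded to 2 places."""
    baseline = rps[0]
    return [round(value / baseline, 2) for value in rps]


rps = parse_rps(HEY_OUTPUT)
for n, (value, ratio) in enumerate(zip(rps, speedups(rps)), start=1):
    print(f"/dags{n}: {value:>10.1f} req/s  {ratio:.2f}x")
```

   With these numbers, the endpoints that return a response object directly (`/dags2`, `/dags3`, `/dags5`) come out at about 2.3x to 2.6x the baseline, while `/dags4`, which returns a dict under a model return annotation, is only about 1.2x.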
   
   app.py
   
   ```python
   from __future__ import annotations
   
   import orjson
   from fastapi import FastAPI
   from fastapi.responses import JSONResponse
   
   from airflow.api_fastapi.core_api.datamodels.monitor import HealthInfoResponse
   from airflow.api_fastapi.core_api.datamodels.ui.dags import (
       DAGWithLatestDagRunsCollectionResponse,
   )
   
   app = FastAPI()
   
   
   class ORJSONCustomResponse(JSONResponse):
       media_type = "application/json"
   
       def render(self, content) -> bytes:
           return orjson.dumps(
               content, option=orjson.OPT_NON_STR_KEYS | orjson.OPT_UTC_Z
           )
   
   
   @app.get("/health1")
   def get_health() -> HealthInfoResponse:
       res = {
           "metadatabase": {"status": "healthy"},
           "scheduler": {
               "status": "unhealthy",
               "latest_scheduler_heartbeat": "2025-12-04T15:14:35.876302+00:00",
           },
           "triggerer": {"status": None, "latest_triggerer_heartbeat": None},
           "dag_processor": {
               "status": "unhealthy",
               "latest_dag_processor_heartbeat": "2025-12-04T16:04:49.023929+00:00",
           },
       }
       return HealthInfoResponse.model_validate(res)
   
   
   @app.get("/health2", response_model=HealthInfoResponse, response_class=ORJSONCustomResponse)
   def get_health2() -> HealthInfoResponse:
       res = {
           "metadatabase": {"status": "healthy"},
           "scheduler": {
               "status": "unhealthy",
               "latest_scheduler_heartbeat": "2025-12-04T15:14:35.876302+00:00",
           },
           "triggerer": {"status": None, "latest_triggerer_heartbeat": None},
           "dag_processor": {
               "status": "unhealthy",
               "latest_dag_processor_heartbeat": "2025-12-04T16:04:49.023929+00:00",
           },
       }
       return ORJSONCustomResponse(res)
   
   
   @app.get("/health3")
   def get_health3():
       res = {
           "metadatabase": {"status": "healthy"},
           "scheduler": {
               "status": "unhealthy",
               "latest_scheduler_heartbeat": "2025-12-04T15:14:35.876302+00:00",
           },
           "triggerer": {"status": None, "latest_triggerer_heartbeat": None},
           "dag_processor": {
               "status": "unhealthy",
               "latest_dag_processor_heartbeat": "2025-12-04T16:04:49.023929+00:00",
           },
       }
       return res
   
   
   res = {
       "total_entries": 2,
       "dags": [
           {
               "dag_id": "gh58560",
               "dag_display_name": "gh58560",
               "is_paused": False,
               "is_stale": False,
               "last_parsed_time": "2025-12-04T16:04:24.620735Z",
               "last_parse_duration": 0.06622079999942798,
               "bundle_name": "dags-folder-1",
               "relative_fileloc": "gh58560.py",
               "fileloc": "/home/karthikeyan/airflow/dagsgh58560/gh58560.py",
               "timetable_description": "Never, external triggers only",
               "tags": [],
               "description": None,
               "next_dagrun": None,
               "next_dagrun_data_interval_start": None,
               "next_dagrun_data_interval_end": None,
               "next_dagrun_create_after": None,
               "last_expired": None,
               "timetable_summary": None,
               "asset_expression": None,
               "bundle_version": None,
               "max_active_tasks": 16,
               "max_active_runs": 16,
               "max_consecutive_failed_dag_runs": 0,
               "has_task_concurrency_limits": False,
               "has_import_errors": False,
               "owners": ["airflow"],
               "latest_dag_runs": [
                   {
                       "id": 1,
                       "dag_id": "gh58560",
                       "run_id": "manual__2025-12-04T14:57:33+00:00",
                       "logical_date": "2025-12-04T14:57:33Z",
                       "run_after": "2025-12-04T14:57:33Z",
                       "start_date": "2025-12-04T15:19:37.873413Z",
                       "end_date": "2025-12-04T15:19:37.873413Z",
                       "state": "success",
                   }
               ],
               "pending_actions": [],
               "is_favorite": False,
               "file_token": "eyJidW5kbGVfbmFtZSI6ImRhZ3MtZm9sZGVyLTEiLCJyZWxhdGl2ZV9maWxlbG9jIjoiZ2g1ODU2MC5weSJ9.bLgsjNGrIc5g1XjK1mzyqIOLPWM",
           },
           {
               "dag_id": "gh58560_1",
               "dag_display_name": "gh58560_1",
               "is_paused": False,
               "is_stale": False,
               "last_parsed_time": "2025-12-04T16:04:24.695772Z",
               "last_parse_duration": 0.09828795900102705,
               "bundle_name": "dags-folder-1",
               "relative_fileloc": "gh58560_1.py",
               "fileloc": "/home/karthikeyan/airflow/dagsgh58560/gh58560_1.py",
               "timetable_description": "Never, external triggers only",
               "tags": [],
               "description": None,
               "next_dagrun": None,
               "next_dagrun_data_interval_start": None,
               "next_dagrun_data_interval_end": None,
               "next_dagrun_create_after": None,
               "last_expired": None,
               "timetable_summary": None,
               "asset_expression": None,
               "bundle_version": None,
               "max_active_tasks": 16,
               "max_active_runs": 16,
               "max_consecutive_failed_dag_runs": 0,
               "has_task_concurrency_limits": False,
               "has_import_errors": False,
               "owners": ["airflow"],
               "latest_dag_runs": [],
               "pending_actions": [],
               "is_favorite": False,
               "file_token": "eyJidW5kbGVfbmFtZSI6ImRhZ3MtZm9sZGVyLTEiLCJyZWxhdGl2ZV9maWxlbG9jIjoiZ2g1ODU2MF8xLnB5In0.-xfPS1613MuegDAlA93RyV_K1kI",
           },
       ],
   }
   
   
   @app.get("/dags1")
   def get_dags1():
       # Case 1: return the validated pydantic model; no return annotation.
       return DAGWithLatestDagRunsCollectionResponse(
           total_entries=res["total_entries"], dags=res["dags"]
       )
   
   
   @app.get("/dags2")
   def get_dags2() -> DAGWithLatestDagRunsCollectionResponse:
       return ORJSONCustomResponse(res)
   
   
   @app.get("/dags3")
   def get_dags3():
       # Case 3: validate with pydantic, discard the model, return the dict via orjson.
       DAGWithLatestDagRunsCollectionResponse(
           total_entries=res["total_entries"], dags=res["dags"]
       )
       return ORJSONCustomResponse(res)
   
   
   @app.get("/dags4")
   def get_dags4() -> DAGWithLatestDagRunsCollectionResponse:
       # Case 4: validate with pydantic, then return the plain dict under a model annotation.
       DAGWithLatestDagRunsCollectionResponse(
           total_entries=res["total_entries"], dags=res["dags"]
       )
       return res
   
   
   @app.get("/dags5")
   def get_dags5():
       # Case 5: validate with pydantic, then return the dict via the built-in JSONResponse.
       DAGWithLatestDagRunsCollectionResponse(
           total_entries=res["total_entries"], dags=res["dags"]
       )
       return JSONResponse(res)
   
   ```
   
   ### Use case/motivation
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
