sonalprsd opened a new issue, #27782: URL: https://github.com/apache/airflow/issues/27782
### Description

Airflow/MWAA does not seem to have a scalable API for returning the status of a dagRun; the APIs `states-for-dag-run` and `list-runs` do not scale well. To fetch the dagRun status, every team seems to build a custom solution, either using sns_notification or writing the status to an external data store via Airflow callbacks.

The ask is to expose an API that returns dagRun status in optimized time, backed by an internal query operation rather than a scan.

Discussion: https://github.com/apache/airflow/discussions/27765

### Use case/motivation

My use case is to fetch the status of all active DAG runs and update the status tables in our system. This is done by a poller with a timeout of 150s, configured based on our SLA.

The `states-for-dag-run` API seems to perform a scan internally, so as the number of DAG runs in the system increases, the time to get the status of a dagRun increases with it. Initially, fetching the status of 100 runs took 2.5 minutes. After the number of dagRuns in the system grew by 50, fetching the status of 100 dagRuns takes more than 5 minutes.

### Related issues

NA

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
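
For illustration only, here is a minimal sketch of what the poller described above could look like if it used the batch endpoint of Airflow's stable REST API (`POST /api/v1/dags/~/dagRuns/list`) with a `states` filter, instead of calling `states-for-dag-run` once per run. Several things here are assumptions and not part of the original report: that the REST API is reachable in the environment (this may not be the case on MWAA), that the installed Airflow version accepts the `states` field in the request body, and that the helper names, base URL, and auth header are placeholders. It also makes no claim that this endpoint avoids the slowdown described above.

```python
import json
from urllib import request


def build_batch_payload(dag_ids, states=("queued", "running"),
                        page_limit=100, page_offset=0):
    """Build the JSON body for the batch dagRuns listing (hypothetical helper)."""
    return {
        "dag_ids": list(dag_ids),
        "states": list(states),      # assumed to be supported by the server version
        "page_limit": page_limit,
        "page_offset": page_offset,
    }


def extract_states(response_body):
    """Map (dag_id, dag_run_id) -> state from a dagRun collection response."""
    return {
        (run["dag_id"], run["dag_run_id"]): run["state"]
        for run in response_body.get("dag_runs", [])
    }


def fetch_active_runs(base_url, dag_ids, auth_header, timeout=30):
    """Fetch one page of active runs in a single request.

    base_url and auth_header are placeholders supplied by the caller.
    """
    body = json.dumps(build_batch_payload(dag_ids)).encode("utf-8")
    req = request.Request(
        f"{base_url}/api/v1/dags/~/dagRuns/list",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": auth_header},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return extract_states(json.load(resp))
```

A poller would call `fetch_active_runs` once per cycle, paging with `page_offset` if there are more active runs than `page_limit`, and then write the returned mapping to its status tables.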
