Whatsonyourmind commented on issue #22001:
URL: https://github.com/apache/airflow/issues/22001#issuecomment-4181008480

Strong +1 on this. We run 180+ DAGs with overlapping schedules and the per-DAG Gantt view makes it impossible to spot cross-DAG resource contention and idle gaps. The global view is critical for capacity planning.

While waiting for the native implementation, we built a workaround that pulls task instance data from the Airflow metadata DB and feeds it into a graph analysis API to identify cross-DAG bottlenecks, critical paths, and idle windows:

```bash
# Export task instances from a time window, then analyze the DAG dependency graph
curl -X POST https://oraclaw-api.onrender.com/api/v1/analyze/graph \
  -H "Content-Type: application/json" \
  -d '{
    "nodes": [
      {"id": "etl_events", "type": "dag", "avgDuration": 2700, "schedule": "0 */2 * * *", "pool": "default"},
      {"id": "etl_users", "type": "dag", "avgDuration": 480, "schedule": "0 */2 * * *", "pool": "default"},
      {"id": "ml_features", "type": "dag", "avgDuration": 1800, "schedule": "30 */2 * * *", "pool": "ml_pool"},
      {"id": "reporting_daily", "type": "dag", "avgDuration": 3600, "schedule": "0 6 * * *", "pool": "default"},
      {"id": "dbt_transform", "type": "dag", "avgDuration": 2400, "schedule": "0 */3 * * *", "pool": "default"}
    ],
    "edges": [
      {"from": "etl_events", "to": "ml_features", "type": "dataset_dependency"},
      {"from": "etl_users", "to": "ml_features", "type": "dataset_dependency"},
      {"from": "etl_events", "to": "dbt_transform", "type": "dataset_dependency"},
      {"from": "dbt_transform", "to": "reporting_daily", "type": "dataset_dependency"}
    ],
    "analysis": ["critical_path", "bottleneck", "communities"]
  }'
```

Returns structural insights you'd normally need the global Gantt to see:

```json
{
  "criticalPath": {
    "path": ["etl_events", "dbt_transform", "reporting_daily"],
    "totalDuration": 8700,
    "bottleneck": "etl_events"
  },
  "bottlenecks": [
    {"node": "etl_events", "fanOut": 2, "pageRank": 0.38, "impact": "high"}
  ],
  "communities": [
    {"cluster": "etl_pipeline", "nodes": ["etl_events", "etl_users", "dbt_transform"]},
    {"cluster": "downstream", "nodes": ["ml_features", "reporting_daily"]}
  ],
  "idleWindows": [
    {"pool": "default", "start": "04:00", "end": "06:00", "utilizationPct": 12}
  ]
}
```

We feed this into a scheduled report that shows which DAGs are on the critical path, where the idle windows are (ripe for moving non-critical DAGs into), and which DAGs to optimize first for maximum throughput gain.

It's not a visual Gantt, but it answers the same questions programmatically. ~22ms per analysis. Free 25 calls/day at [oraclaw-api.onrender.com](https://oraclaw-api.onrender.com), $9/mo for continuous re-analysis after each DAG run.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
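
The `criticalPath` and `fanOut` figures in the sample response above can be cross-checked locally without calling any external service. The following is a minimal sketch, not the commenter's implementation or the API's actual algorithm: it assumes the critical path is simply the chain of dependent DAGs with the largest summed `avgDuration`, and it ignores schedules, pools, and concurrency. The function name `critical_path` and the variable names are hypothetical; the durations and edges are copied from the request payload. It uses only the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Average durations in seconds, copied from the "nodes" list in the comment.
durations = {
    "etl_events": 2700,
    "etl_users": 480,
    "ml_features": 1800,
    "reporting_daily": 3600,
    "dbt_transform": 2400,
}

# (upstream, downstream) pairs, copied from the "edges" list in the comment.
edges = [
    ("etl_events", "ml_features"),
    ("etl_users", "ml_features"),
    ("etl_events", "dbt_transform"),
    ("dbt_transform", "reporting_daily"),
]


def critical_path(durations, edges):
    """Return (path, total_seconds) for the longest-duration chain in the DAG.

    Assumption: "critical path" means the dependency chain with the largest
    summed duration; scheduling offsets and pool limits are not modelled.
    """
    preds = {node: set() for node in durations}
    for upstream, downstream in edges:
        preds[downstream].add(upstream)

    best = {}    # node -> longest total duration of any chain ending at node
    parent = {}  # node -> predecessor on that chain (None for root nodes)
    # static_order() yields each node only after all of its predecessors.
    for node in TopologicalSorter(preds).static_order():
        prev = max(preds[node], key=lambda p: best[p], default=None)
        best[node] = durations[node] + (best[prev] if prev is not None else 0)
        parent[node] = prev

    # Walk back from the node with the largest accumulated duration.
    cursor = max(best, key=best.get)
    path = []
    while cursor is not None:
        path.append(cursor)
        cursor = parent[cursor]
    path.reverse()
    return path, best[path[-1]]


path, total = critical_path(durations, edges)
# Number of direct downstream DAGs per node.
fan_out = {n: sum(1 for up, _ in edges if up == n) for n in durations}

print(path, total)            # ['etl_events', 'dbt_transform', 'reporting_daily'] 8700
print(fan_out["etl_events"])  # 2
```

Under these assumptions the result agrees with the `path`, `totalDuration` (2700 + 2400 + 3600 = 8700), and `fanOut` values shown in the sample response. The `pageRank`, `communities`, and `idleWindows` fields are not reproduced here, since the comment does not describe how they are derived.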
