aksdevs commented on code in PR #62152:
URL: https://github.com/apache/airflow/pull/62152#discussion_r2831418650
##########
airflow-core/src/airflow/api_fastapi/core_api/routes/ui/dashboard.py:
##########
@@ -58,48 +58,44 @@ def historical_metrics(
"""Return cluster activity historical metrics."""
current_time = timezone.utcnow()
permitted_dag_ids = cast("set[str]", readable_dags_filter.value)
- # DagRuns
- dag_run_types = session.execute(
- select(DagRun.run_type, func.count(DagRun.run_id))
- .where(
- func.coalesce(DagRun.start_date, current_time) >= start_date,
- func.coalesce(DagRun.end_date, current_time) <=
func.coalesce(end_date, current_time),
- )
- .where(DagRun.dag_id.in_(permitted_dag_ids))
- .group_by(DagRun.run_type)
- ).all()
- dag_run_states = session.execute(
- select(DagRun.state, func.count(DagRun.run_id))
- .where(
- func.coalesce(DagRun.start_date, current_time) >= start_date,
- func.coalesce(DagRun.end_date, current_time) <=
func.coalesce(end_date, current_time),
- )
+ dag_run_date_filter = (
+ func.coalesce(DagRun.start_date, current_time) >= start_date,
+ func.coalesce(DagRun.end_date, current_time) <=
func.coalesce(end_date, current_time),
+ )
Review Comment:
@viiccwen I did changes locally which are similar to those of @iharsh02.
Benchmark results below.
Seeding 10,000 dag_runs and 80,000 task_instances ... done in 4.3s
==========================================================================================
SCENARIO 1: Wide date range (96% selectivity)
Query window: 2024-01-01 to 2024-03-01, 200 DAGs, 30 iterations
==========================================================================================
OLD (3q, COALESCE) : avg=79.5ms min=46.9ms p50=75.7ms
max=119.1ms
V1 (2q, SARGable) : avg=69.5ms min=44.9ms p50=67.8ms
max=100.4ms
V2 (1q, SA CTE+UNION ALL) : avg=118.0ms min=68.5ms p50=110.9ms
max=172.7ms
V3 (1q, raw SQL mat. CTE) : avg=97.7ms min=75.4ms p50=96.4ms
max=162.9ms
V4 (2q, COALESCE combined) : avg=74.8ms min=45.3ms p50=74.2ms
max=129.4ms
V5 (1q, raw COALESCE+CTE) : avg=115.1ms min=73.9ms p50=116.2ms
max=161.4ms
V6 (1q, SA mat. CTE+UNION) : avg=114.0ms min=69.2ms p50=113.3ms
max=169.3ms
OLD (3q, COALESCE) 0.0% vs OLD (79.5ms)
V1 (2q, SARGable) 12.5% vs OLD (69.5ms) <-- best
V2 (1q, SA CTE+UNION ALL) +-48.5% vs OLD (118.0ms)
V3 (1q, raw SQL mat. CTE) +-23.0% vs OLD (97.7ms)
V4 (2q, COALESCE combined) 5.9% vs OLD (74.8ms)
V5 (1q, raw COALESCE+CTE) +-44.8% vs OLD (115.1ms)
V6 (1q, SA mat. CTE+UNION) +-43.4% vs OLD (114.0ms)
==========================================================================================
SCENARIO 2: Narrow date range (~10% selectivity, 'last 7 days')
Query window: 2024-01-01 to 2024-01-08, 200 DAGs, 30 iterations
==========================================================================================
OLD (3q, COALESCE) : avg=72.9ms min=46.7ms p50=69.5ms
max=107.2ms
V1 (2q, SARGable) : avg=77.7ms min=44.3ms p50=76.6ms
max=114.1ms
V2 (1q, SA CTE+UNION ALL) : avg=92.4ms min=62.6ms p50=85.1ms
max=152.6ms
V3 (1q, raw SQL mat. CTE) : avg=106.3ms min=73.8ms p50=100.2ms
max=189.8ms
V4 (2q, COALESCE combined) : avg=65.1ms min=43.8ms p50=64.0ms
max=106.5ms
V5 (1q, raw COALESCE+CTE) : avg=101.1ms min=54.7ms p50=100.3ms
max=160.7ms
V6 (1q, SA mat. CTE+UNION) : avg=105.5ms min=80.2ms p50=100.8ms
max=141.2ms
OLD (3q, COALESCE) 0.0% vs OLD (72.9ms)
V1 (2q, SARGable) + -6.5% vs OLD (77.7ms)
V2 (1q, SA CTE+UNION ALL) +-26.7% vs OLD (92.4ms)
V3 (1q, raw SQL mat. CTE) +-45.8% vs OLD (106.3ms)
V4 (2q, COALESCE combined) 10.7% vs OLD (65.1ms) <-- best
V5 (1q, raw COALESCE+CTE) +-38.6% vs OLD (101.1ms)
V6 (1q, SA mat. CTE+UNION) +-44.6% vs OLD (105.5ms)
==========================================================================================
SCENARIO 3: Narrow date + 20 DAGs (restrictive permissions)
Query window: 2024-01-01 to 2024-01-08, 20 DAGs, 30 iterations
==========================================================================================
OLD (3q, COALESCE) : avg=10.8ms min=7.0ms p50=10.6ms
max=13.9ms
V1 (2q, SARGable) : avg=14.5ms min=7.4ms p50=15.6ms
max=21.1ms
V2 (1q, SA CTE+UNION ALL) : avg=11.7ms min=6.3ms p50=10.1ms
max=26.9ms
V3 (1q, raw SQL mat. CTE) : avg=9.2ms min=6.1ms p50=8.6ms
max=23.2ms
V4 (2q, COALESCE combined) : avg=15.0ms min=7.9ms p50=15.0ms
max=18.7ms
V5 (1q, raw COALESCE+CTE) : avg=12.6ms min=8.5ms p50=12.6ms
max=18.3ms
V6 (1q, SA mat. CTE+UNION) : avg=15.7ms min=8.5ms p50=15.6ms
max=24.8ms
OLD (3q, COALESCE) 0.0% vs OLD (10.8ms)
V1 (2q, SARGable) +-34.5% vs OLD (14.5ms)
V2 (1q, SA CTE+UNION ALL) + -8.2% vs OLD (11.7ms)
V3 (1q, raw SQL mat. CTE) 14.5% vs OLD (9.2ms) <-- best
V4 (2q, COALESCE combined) +-39.0% vs OLD (15.0ms)
V5 (1q, raw COALESCE+CTE) +-16.2% vs OLD (12.6ms)
V6 (1q, SA mat. CTE+UNION) +-45.4% vs OLD (15.7ms)
==========================================================================================
ADDING COVERING INDEXES
==========================================================================================
Done.
==========================================================================================
SCENARIO 2 + COVERING INDEXES: Narrow date (~10% selectivity)
Query window: 2024-01-01 to 2024-01-08, 200 DAGs, 30 iterations
==========================================================================================
OLD (3q, COALESCE) : avg=77.3ms min=44.3ms p50=77.5ms
max=108.6ms
V1 (2q, SARGable) : avg=84.0ms min=45.8ms p50=92.0ms
max=123.9ms
V2 (1q, SA CTE+UNION ALL) : avg=100.8ms min=73.5ms p50=94.9ms
max=152.6ms
V3 (1q, raw SQL mat. CTE) : avg=114.4ms min=65.4ms p50=113.8ms
max=178.2ms
V4 (2q, COALESCE combined) : avg=74.3ms min=45.0ms p50=78.2ms
max=108.5ms
V5 (1q, raw COALESCE+CTE) : avg=107.9ms min=66.2ms p50=111.9ms
max=149.5ms
V6 (1q, SA mat. CTE+UNION) : avg=106.9ms min=69.0ms p50=108.6ms
max=163.5ms
OLD (3q, COALESCE) 0.0% vs OLD (77.3ms)
V1 (2q, SARGable) + -8.7% vs OLD (84.0ms)
V2 (1q, SA CTE+UNION ALL) +-30.4% vs OLD (100.8ms)
V3 (1q, raw SQL mat. CTE) +-48.0% vs OLD (114.4ms)
V4 (2q, COALESCE combined) 3.9% vs OLD (74.3ms) <-- best
V5 (1q, raw COALESCE+CTE) +-39.6% vs OLD (107.9ms)
V6 (1q, SA mat. CTE+UNION) +-38.3% vs OLD (106.9ms)
==========================================================================================
SCENARIO 3 + COVERING INDEXES: Narrow date + 20 DAGs
Query window: 2024-01-01 to 2024-01-08, 20 DAGs, 30 iterations
==========================================================================================
OLD (3q, COALESCE) : avg=15.5ms min=8.1ms p50=15.2ms
max=29.4ms
V1 (2q, SARGable) : avg=12.2ms min=9.3ms p50=11.8ms
max=16.1ms
V2 (1q, SA CTE+UNION ALL) : avg=12.5ms min=7.3ms p50=11.9ms
max=22.3ms
V3 (1q, raw SQL mat. CTE) : avg=9.1ms min=6.1ms p50=9.1ms
max=14.7ms
V4 (2q, COALESCE combined) : avg=12.0ms min=7.1ms p50=11.1ms
max=20.6ms
V5 (1q, raw COALESCE+CTE) : avg=9.0ms min=5.0ms p50=8.9ms
max=19.0ms
V6 (1q, SA mat. CTE+UNION) : avg=10.1ms min=7.5ms p50=9.8ms
max=17.8ms
OLD (3q, COALESCE) 0.0% vs OLD (15.5ms)
V1 (2q, SARGable) 21.2% vs OLD (12.2ms)
V2 (1q, SA CTE+UNION ALL) 19.1% vs OLD (12.5ms)
V3 (1q, raw SQL mat. CTE) 41.2% vs OLD (9.1ms)
V4 (2q, COALESCE combined) 22.4% vs OLD (12.0ms)
V5 (1q, raw COALESCE+CTE) 42.0% vs OLD (9.0ms) <-- best
V6 (1q, SA mat. CTE+UNION) 34.7% vs OLD (10.1ms)
Closes #62021
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]