The GitHub Actions job "Tests" on airflow.git/dag-run-filter has failed.
Run started by GitHub user kaxil (triggered by kaxil).

Head commit for run:
62b346134676905afb94dea386a3f3ab855eb6b2 / Kaxil Naik <[email protected]>
Optimize DAG list query for users with limited access

When users have limited DAG access, the DAG list query was inefficiently
grouping all DagRuns in the database before filtering. This caused severe
performance degradation in large deployments where a user might access
only a few DAGs out of hundreds or thousands.

The fix filters both the main DAG query and the DagRun subquery by
accessible dag_ids before performing the expensive GROUP BY operation.

Before (queries all dagruns):

```sql
  SELECT ... FROM dag
  LEFT OUTER JOIN (
    SELECT dag_run.dag_id, max(dag_run.id) AS max_dag_run_id
    FROM dag_run
    GROUP BY dag_run.dag_id
  ) AS mrq ON dag.dag_id = mrq.dag_id
```

After (filters to accessible dags):

```sql
  SELECT ... FROM dag
  LEFT OUTER JOIN (
    SELECT dag_run.dag_id, max(dag_run.id) AS max_dag_run_id
    FROM dag_run
    WHERE dag_run.dag_id IN ('accessible_dag_1', 'accessible_dag_2')
    GROUP BY dag_run.dag_id
  ) AS mrq ON dag.dag_id = mrq.dag_id
  WHERE dag.dag_id IN ('accessible_dag_1', 'accessible_dag_2')
```
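
A minimal sketch of the two query shapes, run against an in-memory SQLite database rather than Airflow's real schema. Only the `dag` and `dag_run` table names and the `dag_id`, `id`, and `max_dag_run_id` columns come from the SQL above; the helper names (`build_db`, `latest_runs_before`, `latest_runs_after`) and the two-column tables are assumptions made for illustration. It shows that pushing the `IN (...)` filter into both the subquery and the outer query returns the same rows as filtering after the fact.

```python
import sqlite3


def build_db(num_dags, runs_per_dag):
    """Create a toy dag/dag_run schema (hypothetical, not Airflow's real tables)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY, dag_id TEXT)")
    dag_ids = [f"dag_{i}" for i in range(num_dags)]
    conn.executemany("INSERT INTO dag (dag_id) VALUES (?)", [(d,) for d in dag_ids])
    conn.executemany(
        "INSERT INTO dag_run (dag_id) VALUES (?)",
        [(d,) for d in dag_ids for _ in range(runs_per_dag)],
    )
    return conn


def latest_runs_before(conn, accessible):
    """Old shape: group every dag_run row, then filter the result in Python."""
    rows = conn.execute(
        """
        SELECT dag.dag_id, mrq.max_dag_run_id FROM dag
        LEFT OUTER JOIN (
          SELECT dag_run.dag_id, max(dag_run.id) AS max_dag_run_id
          FROM dag_run
          GROUP BY dag_run.dag_id
        ) AS mrq ON dag.dag_id = mrq.dag_id
        ORDER BY dag.dag_id
        """
    ).fetchall()
    allowed = set(accessible)
    return [row for row in rows if row[0] in allowed]


def latest_runs_after(conn, accessible):
    """New shape: filter by accessible dag_ids before the GROUP BY."""
    if not accessible:
        return []
    marks = ",".join("?" for _ in accessible)
    sql = f"""
        SELECT dag.dag_id, mrq.max_dag_run_id FROM dag
        LEFT OUTER JOIN (
          SELECT dag_run.dag_id, max(dag_run.id) AS max_dag_run_id
          FROM dag_run
          WHERE dag_run.dag_id IN ({marks})
          GROUP BY dag_run.dag_id
        ) AS mrq ON dag.dag_id = mrq.dag_id
        WHERE dag.dag_id IN ({marks})
        ORDER BY dag.dag_id
    """
    # The same id list is bound twice: once for the subquery, once for the outer query.
    return conn.execute(sql, list(accessible) + list(accessible)).fetchall()
```

Both helpers should return identical rows for the same list of accessible DAG ids; the difference is only in how many `dag_run` rows the database has to group.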

Performance impact: In a deployment with 100 DAGs (100 runs each) where
a user has access to only 2 DAGs, this reduces the subquery from grouping
10,000 rows down to 200 rows (a 50x reduction in rows grouped), and avoids
fetching 98 unnecessary DAG models.
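
The figures above can be reproduced with a few lines of arithmetic. This is only a sketch of the row counts in the example; the function name `rows_grouped` is hypothetical, and it assumes every DAG has the same number of runs.

```python
def rows_grouped(num_dags, runs_per_dag):
    """Rows the DagRun subquery must group, assuming uniform runs per DAG."""
    return num_dags * runs_per_dag


total_dags = 100
accessible_dags = 2
runs_per_dag = 100

rows_before = rows_grouped(total_dags, runs_per_dag)      # all DAGs grouped
rows_after = rows_grouped(accessible_dags, runs_per_dag)  # only accessible DAGs
reduction_factor = rows_before // rows_after
dag_models_skipped = total_dags - accessible_dags
```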

Fixes #57427

Report URL: https://github.com/apache/airflow/actions/runs/18892991993

With regards,
GitHub Actions via GitBox


