The GitHub Actions job "Tests" on airflow.git/fix/n-plus-1-query-tasks-tab has failed. Run started by GitHub user Arunodoy18 (triggered by Arunodoy18).
Head commit for run:
2634632e009f17618eb751d9e1d5e7058854b4f9 / Arunodoy18 <[email protected]>
Fix N+1 query issue in DAG Tasks tab

This change addresses a critical performance issue where the Tasks tab in
the DAG details view triggers N individual API calls for each task to fetch
recent task instances, causing severe performance degradation and timeouts
for DAGs with 200+ tasks.

Problem:
- Each TaskCard component independently called the API to fetch its task
  instances: /api/v2/dags/{dag_id}/dagRuns/~/taskInstances?task_id={task_id}
- For a DAG with 200 tasks, this resulted in 200+ sequential API calls
- Backend experienced SQLAlchemy timeouts due to excessive query load
- UI became unresponsive and unusable for large DAGs

Solution:
- Modified Tasks.tsx to batch-fetch all task instances for all tasks in a
  single API call using the existing batch endpoint:
  POST /api/v2/dags/~/dagRuns/~/taskInstances/list
- Task instances are grouped by task_id and passed as props to TaskCard
- Eliminated N+1 query pattern, reducing 200+ calls to just 1 call
- Maintained existing functionality including auto-refresh for pending tasks

Changes:
- Tasks.tsx: Added batch query using
  TaskInstanceService.getTaskInstancesBatch() with grouping logic to
  distribute instances to cards
- TaskCard.tsx: Modified to accept taskInstances as prop instead of
  fetching independently

Performance Impact:
- Reduces API calls from O(N) to O(1) where N is number of tasks
- For 200 tasks: 200 calls -> 1 call (99.5% reduction)
- Eliminates backend timeout issues
- Significantly improves UI responsiveness for large DAGs

Fixes: #[issue_number]

Report URL: https://github.com/apache/airflow/actions/runs/20518090563

With regards,
GitHub Actions via GitBox
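
For readers of this notification, the grouping step named in the commit message (task instances grouped by task_id and handed to each TaskCard) could look roughly like the sketch below. It is not the code from the commit: the TaskInstance shape, the groupByTaskId helper, and the sample data are assumptions made for illustration, and the real API response carries more fields.

```typescript
// Hypothetical, trimmed-down shape of one task instance in the batch response.
interface TaskInstance {
  task_id: string;
  dag_run_id: string;
  state: string | null;
}

// Group a flat list of task instances by task_id so that each card can be
// given only its own instances instead of fetching them itself.
function groupByTaskId(
  instances: TaskInstance[],
): Record<string, TaskInstance[]> {
  // Object.create(null) avoids collisions with inherited keys such as "constructor".
  const grouped: Record<string, TaskInstance[]> = Object.create(null);
  for (const ti of instances) {
    const bucket = grouped[ti.task_id];
    if (bucket === undefined) {
      grouped[ti.task_id] = [ti];
    } else {
      bucket.push(ti);
    }
  }
  return grouped;
}

// Stand-in for the result of the single batch call (made-up data).
const batch: TaskInstance[] = [
  { task_id: "extract", dag_run_id: "run_1", state: "success" },
  { task_id: "load", dag_run_id: "run_1", state: "running" },
  { task_id: "extract", dag_run_id: "run_2", state: "failed" },
];

const byTask = groupByTaskId(batch);
```

With a structure like this, a parent component would look up `byTask[task.task_id] ?? []` once per card, so the number of network requests stays at one regardless of how many tasks the DAG has.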
