The GitHub Actions job "Tests" on airflow.git has failed. Run started by GitHub user ashb (triggered by ashb).
Head commit for run:
22a9293ff8f48411d39074d9bc88af35abe9850f / Ash Berlin-Taylor <[email protected]>
Change approach to finding bad rows to LEFT OUTER JOIN. (#23528)

Use LEFT OUTER JOINs rather than sub-selects (two for the count, or one for the
CREATE TABLE).

For a _large_ database (27m TaskInstances, 2m DagRuns) this cuts the time from
about 10 minutes per table (we have 3) down to around 3 minutes per table. (All
times on Postgres.)

Before:

```sql
CREATE TABLE _airflow_moved__2_3__dangling__rendered_task_instance_fields AS
SELECT rendered_task_instance_fields.dag_id AS dag_id,
       rendered_task_instance_fields.task_id AS task_id,
       rendered_task_instance_fields.execution_date AS execution_date,
       rendered_task_instance_fields.rendered_fields AS rendered_fields,
       rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM rendered_task_instance_fields
WHERE NOT (
    EXISTS (
        SELECT 1
        FROM task_instance
        JOIN dag_run ON dag_run.dag_id = task_instance.dag_id
                    AND dag_run.run_id = task_instance.run_id
        WHERE rendered_task_instance_fields.dag_id = task_instance.dag_id
          AND rendered_task_instance_fields.task_id = task_instance.task_id
          AND rendered_task_instance_fields.execution_date = dag_run.execution_date
    )
)
```

After:

```sql
CREATE TABLE _airflow_moved__2_3__dangling__rendered_task_instance_fields AS
SELECT rendered_task_instance_fields.dag_id AS dag_id,
       rendered_task_instance_fields.task_id AS task_id,
       rendered_task_instance_fields.execution_date AS execution_date,
       rendered_task_instance_fields.rendered_fields AS rendered_fields,
       rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM rendered_task_instance_fields
LEFT OUTER JOIN dag_run
    ON rendered_task_instance_fields.dag_id = dag_run.dag_id
   AND rendered_task_instance_fields.execution_date = dag_run.execution_date
LEFT OUTER JOIN task_instance
    ON dag_run.dag_id = task_instance.dag_id
   AND dag_run.run_id = task_instance.run_id
   AND rendered_task_instance_fields.task_id = task_instance.task_id
WHERE task_instance.dag_id IS NULL
   OR dag_run.dag_id IS NULL;
```

Report URL: https://github.com/apache/airflow/actions/runs/2282777651

With regards,
GitHub Actions via GitBox

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
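
The two query shapes in the commit message above can be compared on a toy dataset. The following is only a minimal sketch, not the project's migration code: it uses Python's built-in sqlite3 rather than Postgres, a cut-down three-table schema, invented sample rows, and it assumes at most one dag_run per (dag_id, execution_date), without which the LEFT OUTER JOIN form could keep rows that NOT EXISTS would drop.

```python
import sqlite3

# Toy schema and rows (hypothetical, much smaller than the real Airflow tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, execution_date TEXT,
                      UNIQUE (dag_id, execution_date));
CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, run_id TEXT);
CREATE TABLE rendered_task_instance_fields (
    dag_id TEXT, task_id TEXT, execution_date TEXT);

INSERT INTO dag_run VALUES ('d1', 'r1', '2022-01-01'), ('d1', 'r2', '2022-01-02');
INSERT INTO task_instance VALUES ('d1', 't1', 'r1');
INSERT INTO rendered_task_instance_fields VALUES
    ('d1', 't1', '2022-01-01'),  -- dag_run and task_instance both exist
    ('d1', 't1', '2022-01-02'),  -- dag_run exists, task_instance is missing
    ('d1', 't1', '2022-01-03');  -- no dag_run at all
""")

# "Before" shape: correlated NOT EXISTS sub-select.
NOT_EXISTS_SQL = """
SELECT r.dag_id, r.task_id, r.execution_date
FROM rendered_task_instance_fields AS r
WHERE NOT EXISTS (
    SELECT 1
    FROM task_instance AS ti
    JOIN dag_run AS dr ON dr.dag_id = ti.dag_id AND dr.run_id = ti.run_id
    WHERE r.dag_id = ti.dag_id
      AND r.task_id = ti.task_id
      AND r.execution_date = dr.execution_date
)
ORDER BY r.execution_date
"""

# "After" shape: LEFT OUTER JOINs, keeping rows where either join found nothing.
LEFT_JOIN_SQL = """
SELECT r.dag_id, r.task_id, r.execution_date
FROM rendered_task_instance_fields AS r
LEFT OUTER JOIN dag_run AS dr
    ON r.dag_id = dr.dag_id AND r.execution_date = dr.execution_date
LEFT OUTER JOIN task_instance AS ti
    ON dr.dag_id = ti.dag_id AND dr.run_id = ti.run_id AND r.task_id = ti.task_id
WHERE ti.dag_id IS NULL OR dr.dag_id IS NULL
ORDER BY r.execution_date
"""

dangling_not_exists = conn.execute(NOT_EXISTS_SQL).fetchall()
dangling_left_join = conn.execute(LEFT_JOIN_SQL).fetchall()
print(dangling_not_exists == dangling_left_join)
```

On this sample data both queries select the same two dangling rows; the sketch says nothing about the timings quoted in the commit message, which depend on the database engine and table sizes.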
