Kamil Bregula created AIRFLOW-6965:
--------------------------------------

             Summary: The method is performed three times during one 
creation of the DAGRun file.
                 Key: AIRFLOW-6965
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6965
             Project: Apache Airflow
          Issue Type: Improvement
          Components: scheduler
    Affects Versions: 1.10.9
            Reporter: Kamil Bregula


Hello,

The task_instances query is executed three times while a single DAG run is 
created and processed. This is redundant. If we can limit the number of these 
queries, we can improve scheduler performance.

First query:

perform_file: 
[https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L792]

process_dags: 
[https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L853]

create_dag_run: 
[https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/jobs/scheduler_job.py#L726]

create_dagrun: 
[https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/jobs/scheduler_job.py#L638]

verify_integrity: 
[https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/models/dag.py#L1454]

get_task_instances: 
[https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/models/dagrun.py#L436]

Second query:

perform_file: 
[https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L792]

process_dags: 
[https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L853]

_process_task_instances: 
[https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L738]

update_state: 
[https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/jobs/scheduler_job.py#L685]

get_task_instances: 
[https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/models/dagrun.py#L292]

Third query:

perform_file: 
[https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L792]

process_dags: 
[https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L853]

_process_task_instances: 
[https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L738]

verify_integrity: 
[https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/jobs/scheduler_job.py#L684]

get_task_instances: 
[https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/models/dagrun.py#L436]
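
To illustrate the idea behind the three call chains above, here is a minimal sketch of how one fetched result could be passed along instead of being queried again. Everything in it is hypothetical: FakeDagRun, query_count, the optional task_instances parameter, and the two process_* helpers are stand-ins written for this example and are not Airflow's actual classes or signatures. It also assumes that the set of task instances does not change between the calls, which would have to be checked against the real verify_integrity behaviour.

```python
from typing import List, Optional


class FakeDagRun:
    """Hypothetical stand-in for a DAG run; this is not Airflow's DagRun."""

    def __init__(self, task_ids: List[str]) -> None:
        self._task_ids = task_ids
        self.query_count = 0  # number of simulated database round trips

    def get_task_instances(self) -> List[str]:
        # Simulates the task_instances query described in this issue.
        self.query_count += 1
        return list(self._task_ids)

    def verify_integrity(
        self, task_instances: Optional[List[str]] = None
    ) -> List[str]:
        # Reuse the caller's result when given; query only as a fallback.
        if task_instances is None:
            task_instances = self.get_task_instances()
        return task_instances

    def update_state(self, task_instances: Optional[List[str]] = None) -> str:
        # Same fallback pattern as verify_integrity.
        if task_instances is None:
            task_instances = self.get_task_instances()
        return "running" if task_instances else "success"


def process_without_reuse(run: FakeDagRun) -> int:
    """Mirrors the three chains above: each call issues its own query."""
    run.verify_integrity()  # first query (on DAG run creation)
    run.update_state()      # second query
    run.verify_integrity()  # third query
    return run.query_count


def process_with_reuse(run: FakeDagRun) -> int:
    """Fetches once and hands the same list to every later call."""
    task_instances = run.get_task_instances()
    run.verify_integrity(task_instances)
    run.update_state(task_instances)
    run.verify_integrity(task_instances)
    return run.query_count


if __name__ == "__main__":
    print(process_without_reuse(FakeDagRun(["extract", "load"])))
    print(process_with_reuse(FakeDagRun(["extract", "load"])))
```

Keeping the parameter optional means existing callers that pass nothing would still work in this sketch, while the scheduler path could pass the list it already holds.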

--
This message was sent by Atlassian Jira
(v8.3.4#803005)