[ https://issues.apache.org/jira/browse/AIRFLOW-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16805004#comment-16805004 ]

ASF GitHub Bot commented on AIRFLOW-4173:
-----------------------------------------

ashb commented on pull request #4993: [AIRFLOW-4173] Improve scheduler 
performance by avoid unnecessary actions in SchedulerJob.process_file()
URL: https://github.com/apache/airflow/pull/4993
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve scheduler performance by avoid Unnecessary actions in 
> SchedulerJob.process_file()
> -----------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-4173
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4173
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 1.10.2
>            Reporter: Xiaodong DENG
>            Assignee: Xiaodong DENG
>            Priority: Critical
>
> In the current implementation of *SchedulerJob.process_file()*
> (https://github.com/apache/airflow/blob/068ded96cd279dcd51f5b6d1e96f09205ecf40c8/airflow/jobs.py#L1722-L1734),
> the action '*dag = dagbag.get_dag(dag_id)*' is performed for every dag_id,
> even when dag_id points to a paused DAG. However, the result is not used
> afterwards if that DAG is paused, so the call is wasted work.
> We can do the `if DAG is paused` check first, before we invoke '*dag =
> dagbag.get_dag(dag_id)*'. This may bring a considerable improvement.
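
The reordering proposed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Airflow code: FakeDagBag, collect_unpaused_dags and paused_dag_ids are hypothetical names introduced here, and the real scheduler obtains the paused state differently. The sketch only shows the idea of checking the paused state before calling get_dag(), so that paused DAGs never trigger the lookup.

```python
from typing import Dict, Iterable, List, Set


class FakeDagBag:
    """Hypothetical stand-in for a DagBag; it counts get_dag() calls."""

    def __init__(self, dags: Dict[str, str]) -> None:
        self._dags = dags
        self.get_dag_calls = 0

    def get_dag(self, dag_id: str) -> str:
        # In the real scheduler this lookup is the step worth avoiding.
        self.get_dag_calls += 1
        return self._dags[dag_id]


def collect_unpaused_dags(
    dagbag: FakeDagBag,
    dag_ids: Iterable[str],
    paused_dag_ids: Set[str],
) -> List[str]:
    """Return DAGs for unpaused ids, skipping the lookup for paused ones."""
    dags = []
    for dag_id in dag_ids:
        # Check the paused state first (assumed to be known up front).
        if dag_id in paused_dag_ids:
            continue
        dags.append(dagbag.get_dag(dag_id))
    return dags


if __name__ == "__main__":
    bag = FakeDagBag({"a": "dag-a", "b": "dag-b", "c": "dag-c"})
    unpaused = collect_unpaused_dags(bag, ["a", "b", "c"], paused_dag_ids={"b"})
    print(unpaused, bag.get_dag_calls)
```

With this ordering, the number of get_dag() calls scales with the number of unpaused DAGs in the file rather than with all DAGs in it.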



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
