t oo created AIRFLOW-6934:
-----------------------------

             Summary: max_active_runs from different dag in dagbag stopping any 
task from running
                 Key: AIRFLOW-6934
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6934
             Project: Apache Airflow
          Issue Type: Bug
          Components: scheduler
    Affects Versions: 1.10.7
            Reporter: t oo


I have a single .py file that creates multiple DAG ids (it is a dynamic DAG 
generator, so 25 different DAG ids are created, including dagA and dagB). I 
have max_active_runs_per_dag = 5 in the .cfg. I then ran the airflow cli 
triggerdag for dagA with 7 different execution dates in parallel, and 
triggerdag for dagB with 4 different execution dates in parallel. In the UI, 
dagA showed red in the schedule column. There were tasks in the scheduled and 
queued states in both dagA and dagB, but no tasks in the running state (even 
over the last 3 hours!). The scheduler was still up, though, and was running 
tasks from dagC (which is created from a different .py than the one that 
creates dagA and dagB). I noticed this message printed frequently in the 
scheduler logs: "Number of active dag runs reached max_active_run."

From tracing the code, I think this is what happens:
_process_file 
(https://github.com/apache/airflow/blob/1.10.7/airflow/jobs/scheduler_job.py#L1512-L1588)
 runs at the level of a .py file (so it covers many different DAG ids).
It calls _process_dags.
For each DAG id from that .py, it calls _process_task_instances.
_process_task_instances has a counter (active_dag_runs) which is appended to 
for each DAG being iterated over; it breaks out of the loop (the loop which 
appends ids to a list) if the counter > max_active_runs_per_dag (from the 
.cfg). I couldn't see where task_instances_list gets used, though.
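
To make the suspicion above concrete, here is a toy model in plain Python. It 
is not Airflow code and does not reproduce scheduler_job.py; the function 
names and the data layout are hypothetical and chosen only for illustration. 
It compares a cap applied to each DAG separately with a cap whose counter is 
shared by every DAG generated from the same file, using the numbers from this 
report (7 runs for dagA, 4 for dagB, cap of 5).

```python
from typing import Dict, List, Tuple

# Mirrors max_active_runs_per_dag = 5 from the report.
MAX_ACTIVE_RUNS = 5

Run = Tuple[str, str]  # (dag_id, run label); hypothetical representation


def select_runs_per_dag(
    runs_by_dag: Dict[str, List[str]], cap: int = MAX_ACTIVE_RUNS
) -> List[Run]:
    """Expected behaviour: the cap is counted separately for each DAG."""
    selected: List[Run] = []
    for dag_id, runs in runs_by_dag.items():
        active: List[Run] = []  # counter is reset for every DAG
        for run in runs:
            if len(active) >= cap:
                break
            active.append((dag_id, run))
        selected.extend(active)
    return selected


def select_runs_shared_counter(
    runs_by_dag: Dict[str, List[str]], cap: int = MAX_ACTIVE_RUNS
) -> List[Run]:
    """Suspected behaviour: one counter spans all DAGs from the same file."""
    active: List[Run] = []  # counter is never reset between DAGs
    for dag_id, runs in runs_by_dag.items():
        for run in runs:
            if len(active) >= cap:
                break
            active.append((dag_id, run))
    return active


# Numbers taken from the report: 7 triggered runs for dagA, 4 for dagB.
runs_by_dag = {
    "dagA": [f"run{i}" for i in range(7)],
    "dagB": [f"run{i}" for i in range(4)],
}

per_dag = select_runs_per_dag(runs_by_dag)
shared = select_runs_shared_counter(runs_by_dag)

print("per-DAG cap:    ", len(per_dag), "runs considered")
print("shared counter: ", len(shared), "runs considered")
```

In this model the shared counter is used up by dagA, so no dagB run is ever 
considered. It would not by itself explain why dagA also had nothing in the 
running state, so it should be read as an illustration of the hypothesis 
rather than as a confirmed cause.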


I'm using LocalExecutor, v1.10.7.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
