[ https://issues.apache.org/jira/browse/AIRFLOW-111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shenghu Yang updated AIRFLOW-111:
---------------------------------
Description:
Description of Issue
In airflow.cfg, we set max_active_runs_per_dag = 1.
In our DAG, we set dag_args['concurrency'] = 8. However, when the scheduler
starts, this concurrency limit is not honored: the Airflow scheduler runs up to
'parallelism' (which we set to 25) task instances for the single dag_run.
What did you expect to happen?
dag_args['concurrency'] = 8 is honored, i.e. at most 8 task instances run
concurrently.
What happened instead?
When the DAG starts to run, the concurrency limit is not honored: the Airflow
scheduler/Celery worker runs up to 'parallelism' (which we set to 25) task
instances.
Here is how you can reproduce this issue on your machine:
Create a DAG that contains nothing but 25 parallel tasks.
Set dag_args['concurrency'] = 8 for the DAG.
Set the Airflow parallelism = 25 and max_active_runs_per_dag = 1.
Then run: airflow scheduler
All 25 task instances are scheduled to run, not 8.
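The expectation above can be written down as a small model. This is only a sketch of the limit the report expects, not Airflow's scheduler code; the helper name expected_running_limit is hypothetical, and it assumes the per-DAG concurrency and the global parallelism should both cap the number of running task instances.

```python
def expected_running_limit(runnable_tasks, parallelism, dag_concurrency):
    """Hypothetical helper (not an Airflow API).

    Models the expected upper bound on task instances running at once
    for a single DAG run: the smallest of the number of runnable tasks,
    the global 'parallelism' setting, and the DAG's 'concurrency'.
    """
    return min(runnable_tasks, parallelism, dag_concurrency)


# Values taken from this report.
expected = expected_running_limit(
    runnable_tasks=25, parallelism=25, dag_concurrency=8
)
observed = 25  # number of task instances the report says were scheduled

print("expected at most:", expected)
print("observed:", observed)
```

With the values from the report, the model gives a limit of 8, while 25 task instances were observed running.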
was:
Description of Issue
In our DAG, we set dag_args['concurrency'] = 8. However, when the scheduler
starts, this concurrency limit is not honored: the Airflow scheduler runs up to
'parallelism' (which we set to 25) jobs.
What did you expect to happen?
dag_args['concurrency'] = 8 is honored, i.e. at most 8 jobs run
concurrently.
What happened instead?
When the DAG starts to run, the concurrency limit is not honored: the Airflow
scheduler/Celery worker runs up to 'parallelism' (which we set to 25)
jobs.
Here is how you can reproduce this issue on your machine:
Create a DAG that contains nothing but 25 parallel jobs.
Set dag_args['concurrency'] = 8 for the DAG.
Set the Airflow parallelism to 25.
Then run: airflow scheduler
All 25 jobs are scheduled to run, not 8.
> DAG concurrency is not honored
> ------------------------------
>
> Key: AIRFLOW-111
> URL: https://issues.apache.org/jira/browse/AIRFLOW-111
> Project: Apache Airflow
> Issue Type: Sub-task
> Components: celery, scheduler
> Affects Versions: Airflow 1.6.2, Airflow 1.7.1.2
> Environment: Version of Airflow: 1.6.2
> Airflow configuration: Running a Scheduler with LocalExecutor
> Operating System: 3.13.0-74-generic #118-Ubuntu SMP Thu Dec 17 22:52:10 UTC
> 2015 x86_64 x86_64 x86_64 GNU/Linux
> Python Version: 2.7.6
> Screen shots of your DAG's status:
> Reporter: Shenghu Yang
> Fix For: Airflow 2.0