Hey Jason,

Are you running the SequentialExecutor? This is the default out-of-the-box
executor.
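If so, that would explain what you're seeing: the SequentialExecutor runs one
task instance at a time across all DAGs, so a long task in dag1 will hold up
dag2. A rough sketch of the airflow.cfg change to try (assuming a 1.7-era
config; note that LocalExecutor needs a non-SQLite metadata database):

    [core]
    # Default: runs task instances one at a time, serially
    # executor = SequentialExecutor
    # Runs task instances in parallel on the local machine
    executor = LocalExecutor

With LocalExecutor, the parallelism and dag_concurrency settings you already
have will then actually allow concurrent task instances.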
Cheers,
Chris

On Tue, May 31, 2016 at 12:59 PM, Jason Chen <chingchien.c...@gmail.com> wrote:

> Hi Chris,
>
> I made the changes and tried it out, but it doesn't seem to be working as
> expected. When a dag is running (a particular task inside that dag is
> taking a long time), a task from another dag seems "blocked".
>
> My settings:
>
> (1) airflow.cfg
>
> max_active_runs_per_dag = 16
> parallelism = 32
> dag_concurrency = 16
>
> (2) The dag1 python file is, in part, as below. Please note that inside
> this DAG, the first task (task1) is a long-running task.
>
> dag1 = DAG('dag1', schedule_interval=timedelta(minutes=15),
>            max_active_runs=1, default_args=args)
>
> The tasks then run in the order
> task1 (long running) --> task2 --> task3
>
> (3) The other dag (dag2) python file is, in part:
>
> dag2 = DAG('dag2', schedule_interval=timedelta(minutes=3),
>            max_active_runs=1, default_args=args)
>
> The tasks then run in the order
> taskA (short running task) --> taskB
>
> (4) Inside the upstart script file, this is the main part of how I start
> the airflow scheduler:
>
> env SCHEDULER_RUNS=0
> export SCHEDULER_RUNS
>
> script
>   exec >> ${AIRFLOW_HOME}/scheduler-log/airflow-scheduler.log 2>&1
>   exec /usr/local/bin/airflow scheduler -n ${SCHEDULER_RUNS}
> end script
>
> =========================
>
> What I observed:
>
> (a) task1 (of dag1) runs for about 20 mins, and during its run no other
> dag1 run is triggered. This is as expected.
>
> (b) taskA (of dag2) should be triggered to run every 3 mins. However, it
> is NOT triggered while task1 of dag1 is running. taskA seems to be
> queued/blocked and does not run; it executes only after task1 (of dag1)
> is done, so it looks like it is dispatched into a "gap" between task1 and
> task2 (of dag1). This does not look normal, as taskA (of dag2) should run
> no matter what happens in the other dag (dag1).
>
> Any suggestions?
> Thanks.
> Jason
>
>
> On Tue, May 31, 2016 at 9:02 AM, Chris Riccomini <criccom...@apache.org>
> wrote:
>
>> Hey Jason,
>>
>> The problem is max_active_runs_per_dag=1. Set it back to 16. You just
>> need max_active_runs=1 for the individual DAGs. This will allow multiple
>> (different) DAGs to run in parallel, but only one DAG of each type can
>> run at the same time.
>>
>> Cheers,
>> Chris
>>
>> On Fri, May 27, 2016 at 11:42 PM, Jason Chen <chingchien.c...@gmail.com>
>> wrote:
>>
>> > Hi Chris,
>> >
>> > Thanks for your reply. After setting it up, I observed how it works
>> > for a couple of days.
>> >
>> > I tried setting max_active_runs=1 in the DAG
>> > (dag = DAG(...max_active_runs=1...)) and it worked fine to avoid two
>> > runs at the same time.
>> > However, I noticed that other dags (not the dag that is running) are
>> > also "paused".
>> > My understanding is that "max_active_runs" is basically
>> > "max_active_runs_per_dag", so why can't another dag (with a different
>> > dag name) run at the same time as the first dag?
>> > I want the two dags to be able to run at the same time, with only one
>> > active run inside each dag.
>> > Thanks.
>> >
>> > Jason
>> >
>> > My other settings in airflow.cfg:
>> >
>> > max_active_runs_per_dag = 1
>> > parallelism = 32
>> > dag_concurrency = 16
>> >
>> >
>> > On Mon, May 16, 2016 at 8:57 PM, Chris Riccomini <criccom...@apache.org>
>> > wrote:
>> >
>> > > Hey Jason,
>> > >
>> > > For (2), by default, task1 will start running again. You'll have two
>> > > runs going at the same time.
>> > > If you want to prevent this, you can set max_active_runs to 1 in
>> > > your DAG.
>> > >
>> > > Cheers,
>> > > Chris
>> > >
>> > > On Mon, May 16, 2016 at 1:09 PM, Jason Chen <chingchien.c...@gmail.com>
>> > > wrote:
>> > >
>> > > > I have two questions:
>> > > >
>> > > > (1) In the airflow UI "Tree View", the tasks are listed along with
>> > > > times highlighted at the top (say, 08:30, 09:00, etc). What do
>> > > > those times mean? They don't look like the UTC times at which the
>> > > > tasks were running, and I know that overall airflow uses UTC time.
>> > > >
>> > > > (2) I have a DAG with two tasks: task1 --> task2
>> > > > Task1 runs hourly and could sometimes take longer than one hour to
>> > > > run. In such a setup, task1 will be triggered hourly; what happens
>> > > > if the previous task1 is still running? Will the "new" task1 be
>> > > > queued?
>> > > >
>> > > > Thanks.
>> > > > Jason
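(For reference, a minimal sketch of the setup Chris describes above -- the
global max_active_runs_per_dag left at 16 in airflow.cfg, and max_active_runs=1
set per DAG -- assuming Airflow 1.7-era APIs; the default_args dict here is a
hypothetical stand-in for the 'args' used earlier in the thread:

    from datetime import datetime, timedelta
    from airflow import DAG

    # Hypothetical default_args, standing in for 'args' in the thread
    args = {'owner': 'airflow', 'start_date': datetime(2016, 5, 1)}

    # max_active_runs=1 caps each individual DAG at a single active run;
    # with max_active_runs_per_dag = 16 in airflow.cfg, runs of *different*
    # DAGs can still be active at the same time.
    dag1 = DAG('dag1', schedule_interval=timedelta(minutes=15),
               max_active_runs=1, default_args=args)

    dag2 = DAG('dag2', schedule_interval=timedelta(minutes=3),
               max_active_runs=1, default_args=args)

Whether the two DAGs' tasks then actually execute concurrently still depends on
the executor, which is what the question at the top of this thread is getting
at.)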