Well, I’ve gone ahead and run the UPDATE query, and the scheduler is now picking up tasks.
When I cleared the tasks, every DAG run that had a cleared task in it was set to running. Because I’d backfilled them all, they were all `backfill_` dag runs. Inspection of various tasks via `task_failed_deps` indicated the tasks had all their dependencies met. After running the update query, they’re all `scheduled__` dag runs.

On May 29, 2018, 5:02 PM -0700, Maxime Beauchemin <maximebeauche...@gmail.com>, wrote:
> While this may work it's clearly not the prescribed way to do this.
> Clearing should just work.
>
> I'm trying to understand why the scheduler is not picking up the cleared
> task. Clearing should remove the task instance state and set the state of
> the related DAG Run to running so that the scheduler picks those up.
> Perhaps there's a conflict between the backfill and scheduler-related DAG
> Runs? Which DAG runs are set to running? The backfill or scheduler-related
> ones?
>
> Originally when I introduced DAG runs, backfill was operating without any
> consideration related to DAG runs (DAG runs were a scheduler-specific
> construct), later on Bolke added backfill-specific DAG runs and I'm not
> 100% sure how that works.
>
> Let's get to the bottom of this.
>
> Max
>
> On Fri, May 25, 2018 at 7:48 PM Ruiqin Yang <yrql...@gmail.com> wrote:
>
> > If you are sure the update query targets the desired rows, the behavior
> > should be the same.
> >
> > Scott Halgrim <scott.halg...@zapier.com.invalid> wrote on Fri, May 25, 2018 at 4:23 PM:
> >
> > > So far no ill effects from:
> > >
> > > update dag_run
> > > set run_id = concat('scheduled__', substring(run_id, 10, 19))
> > > where dag_id = 'daily'
> > > and execution_date > '2017-08-31' and execution_date < '2018-01-11'
> > > and run_id like 'backfill_%'
> > > order by execution_date;
> > >
> > > On May 25, 2018, 4:03 PM -0700, Scott Halgrim <scott.halg...@zapier.com>, wrote:
> > > > Oh wow, that will work? Thanks! Is there any reason for me not to just
> > > > run a mass UPDATE on those dag runs directly in the metadata database?
> > > >
> > > > On May 25, 2018, 4:01 PM -0700, Ruiqin Yang <yrql...@gmail.com>, wrote:
> > > > > Airflow is not going to schedule backfill DAG runs, by looking at the dag
> > > > > run ID (which will start by 'backfill__'). If you want the scheduler to
> > > > > schedule those tasks, you can click the DAG run and edit its name back to
> > > > > 'scheduled__<something>'
> > > > >
> > > > > Cheers,
> > > > > Kevin Y
> > > > >
> > > > > On Fri, May 25, 2018 at 3:53 PM, Scott Halgrim <scott.halg...@zapier.com.invalid> wrote:
> > > > >
> > > > > > I’ve got four months of dag runs that were scheduled dag runs, then I
> > > > > > backfilled them. And now when I clear a task from one of those the dag run
> > > > > > goes to “running,” but none of the tasks get scheduled (unless I manually
> > > > > > backfill each of them)
> > > > > >
> > > > > > What I really should have done here was just cleared a mid-dag task as
> > > > > > well as all downstream tasks for these dag runs, but, well, now I’m here
> > > > > > and I’m wondering what the best way to fix this.
> > > > > >
> > > > > > Thanks!
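
For anyone who wants to sanity-check the rename before touching the metadata database, here is a rough Python sketch of what the `concat('scheduled__', substring(run_id, 10, 19))` expression in the UPDATE does to each `run_id`. It doesn't talk to Airflow or the database at all; the helper names (`to_scheduled_run_id`, `preview_renames`) and the sample run IDs are made up for illustration, and it assumes backfill run IDs look like `backfill_` followed by a 19-character ISO timestamp.

```python
# Hypothetical helpers, not part of Airflow: they only mimic the string
# manipulation in the UPDATE query so the result can be previewed offline.

BACKFILL_PREFIX = "backfill_"     # 9 characters, so the timestamp starts at SQL position 10
SCHEDULED_PREFIX = "scheduled__"
TIMESTAMP_LENGTH = 19             # assumed format, e.g. "2017-09-01T00:00:00"


def to_scheduled_run_id(run_id):
    """Mirror concat('scheduled__', substring(run_id, 10, 19)) for one run_id.

    Returns None when run_id does not start with 'backfill_'. That is roughly
    the set of rows the LIKE 'backfill_%' filter excludes (in SQL the '_' is
    itself a single-character wildcard, so the real filter is slightly looser).
    """
    if not run_id.startswith(BACKFILL_PREFIX):
        return None
    start = len(BACKFILL_PREFIX)
    return SCHEDULED_PREFIX + run_id[start:start + TIMESTAMP_LENGTH]


def preview_renames(run_ids):
    """Return (old, new) pairs for the run IDs that would be rewritten."""
    pairs = []
    for old in run_ids:
        new = to_scheduled_run_id(old)
        if new is not None:
            pairs.append((old, new))
    return pairs


if __name__ == "__main__":
    # Sample values are assumptions, not rows from a real dag_run table.
    sample = ["backfill_2017-09-01T00:00:00", "scheduled__2017-09-02T00:00:00"]
    for old, new in preview_renames(sample):
        print(old, "->", new)
```

Running the same preview as a SELECT with the same WHERE clause against the real `dag_run` table would be the closer check; this only confirms the offsets in the `substring` call line up with the prefix length.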