Also, coming back to the mini-scheduler (Vikram - and still waiting for Ash to comment): I was thinking quite a bit about it, and I think it was generally a good idea to prioritize queuing the "next eligible" task after one of the tasks completes. I just think that we should not lock the database when prioritizing them. And IMHO this also boils down to the same thing:
IMHO - we should have one single scheduler processing loop, which is the only place where DAG runs are locked. And it should do everything - regular runs, backfill runs and mini-scheduler prioritized tasks. This way we limit the complexity of scheduling and the risk of deadlocks to only that single loop and a single SKIP LOCKED query. And this should also include some ways of prioritizing/limiting those against each other so that they do not starve each other.

I think this is the same design decision that we need to make for AIP-78 and AIP-72 - how we design that loop. And I would say it would be fantastic if we had that design upfront and could review and discuss it together here - I am happy, for example, to build and iterate on graphs and charts showing how it's going to work following such discussions. Because I think we should avoid the situation where only a handful of people understand "some" parts of the scheduler design and decisions, and we do not have the design decisions documented - just coded, without the context of what we tried to achieve and how (except the fantastic talk by Ash of course, which went through some of that - but quite long after the fact).

I think it will be worth it if we discuss it here, as this is a key component of Airflow.

J.

On Sat, Jul 13, 2024 at 10:33 PM Anand Bajpai <be.anandbaj...@gmail.com> wrote:

> Hi,
>
> I am very glad to see this AIP. I am sure, this is a long wished feature in
> the user community.
>
> I am in favor of considering that one scheduler is better suited to include
> both backfill and normal runs. Choices of using default or dedicated pools
> are there and should remain.
>
> I have been following the Airflow discussion for some time now. However
> this is the first opportunity for me to share my thoughts. I hope that I am
> able to express my line of thinking clearly.
>
>
> I would like to add that in my experience, there are these scenarios,
> where we may want to execute older runs of a DAG.
>
> - Enabling a new DAG which should start running from an older date with
>   latest run in parallel
> - Catching up on lagging DAG runs along with having latest run taking place
>   in parallel
> - Rerunning old DAG runs along with having latest run taking place in
>   parallel
>
> I would like to consider them together, kind of `backfill` (or `old runs`
> maybe) and they could benefit from the same implementation.
>
> Here is how I think of this to start with. As a user, I will have a new web
> page dedicated to active backfill runs on Airflow UI. On this page, I can
> view and control backfill runs on Airflow level. I would need the following
> (and maybe more) configuration to start a backfill run.
>
> *DAG* - List of enabled DAGs
>
> *Date Range* - `from date` cannot be before `start date` of DAG and `end
> date` cannot be after next execution date
>
> *Max backfill runs* - Default could be `Max active runs of DAG - 1` with
> minimum value of `1`. If DAG has max active runs set to 1, it will require
> more effort to control how backfill and latest runs proceed together.
> Options could be not allowing backfill on DAGs with max active runs as 1,
> setting backfill DAG runs priority to lowest value or let it all run with
> the same priority.
>
> *Run/Pause/Stop* - Option to control backfill runs.
>
> *Details* - Optionally showing error status in case the configuration of
> backfill is not correct etc
>
>
> The page will show a table of DAGs with active backfill runs and
> configuration, control options and status of them. A new backfill run fails
> to start if a backfill is already active on the same DAG.
>
> The backfill can be either via deploying a configuration file with the
> configuration and/or can be created on UI. Allowing through UI sounds
> easier to handle in a multi user environment. Allowing both options will
> need more work to handle conflicting wishes of users.
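
To make the configuration rules listed above easier to discuss, here is a minimal sketch of how they could be checked before a backfill starts. This is only an illustration: `BackfillRequest`, `default_max_backfill_runs` and `validate_backfill` are hypothetical names, not existing Airflow APIs, and the check that `from date` is not after `end date` is my own assumption rather than something stated in the proposal.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set


@dataclass
class BackfillRequest:
    """Hypothetical container for the proposed fields (not an Airflow class)."""
    dag_id: str
    from_date: date
    to_date: date
    max_backfill_runs: Optional[int] = None  # None means "use the default"


def default_max_backfill_runs(max_active_runs: int) -> int:
    # Proposed default: "Max active runs of DAG - 1", but never below 1.
    return max(1, max_active_runs - 1)


def validate_backfill(
    req: BackfillRequest,
    dag_start_date: date,
    next_execution_date: date,
    active_backfill_dag_ids: Set[str],
) -> List[str]:
    """Return a list of human-readable errors; an empty list means OK."""
    errors: List[str] = []
    if req.from_date < dag_start_date:
        errors.append("from date is before the DAG start date")
    if req.to_date > next_execution_date:
        errors.append("end date is after the next execution date")
    if req.from_date > req.to_date:  # assumption, not part of the proposal
        errors.append("from date is after end date")
    if req.dag_id in active_backfill_dag_ids:
        errors.append("a backfill is already active on this DAG")
    return errors
```

The returned error list is the kind of thing the proposed *Details* column could display when a backfill configuration is rejected.
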
>
> I understand it might be a challenge to show backfill and latest active
> runs in one view which we can try to address by showing a vertically split
> table of latest DAG runs based on natural and backfill runs.
>
> I know I have not covered priority, pool, reporting etc but if the initial
> idea looks interesting, we can include it in our follow up discussions.
>
> Kind Regards
> Anand
>
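
Coming back to the single scheduling loop proposed at the top of this message: below is a rough sketch of the "prioritizing/limiting so that things do not starve each other" part, just to have something concrete to discuss. It is not how the Airflow scheduler is implemented today. The in-memory queues stand in for the database, the quota numbers are invented, and `pick_batch` is a hypothetical name. In a real implementation the selection would presumably be the one place that issues a `SELECT ... FOR UPDATE SKIP LOCKED` query.

```python
from collections import deque
from typing import Deque, Dict, List

# Hypothetical per-iteration budget for each kind of work, in priority order.
QUOTAS: Dict[str, int] = {"follow_up": 4, "regular": 4, "backfill": 2}


def pick_batch(queues: Dict[str, Deque[str]], quotas: Dict[str, int]) -> List[str]:
    """Choose the DAG runs to examine in ONE iteration of the single loop.

    Every kind of work gets its guaranteed share first, so a large backlog
    of one kind cannot starve the others. Unused budget is then handed out
    in priority order to kinds that still have work waiting.
    """
    batch: List[str] = []
    spare = 0
    for kind, quota in quotas.items():
        queue = queues[kind]
        take = min(quota, len(queue))
        batch.extend(queue.popleft() for _ in range(take))
        spare += quota - take
    for kind in quotas:
        queue = queues[kind]
        while spare > 0 and queue:
            batch.append(queue.popleft())
            spare -= 1
    return batch


queues = {
    "follow_up": deque(["f1", "f2"]),                      # mini-scheduler style
    "regular": deque(["r1", "r2", "r3", "r4", "r5", "r6"]),
    "backfill": deque(["b1", "b2", "b3", "b4", "b5"]),
}
batch = pick_batch(queues, QUOTAS)
```

With these inputs the backfill runs still get their two slots even though regular runs have a backlog, and the two slots left unused by follow-up work go to regular runs. Whether fixed quotas, weights or something else is the right mechanism is exactly the kind of thing I would like us to discuss upfront.
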