> But regarding the scheduler pressure, I think I have a bit of a
> different observation and the deadlock problem which is a bit
> downplayed in the current proposal is - IMHO - crucial problem to be
> solved when we want to make backfilling more accessible
I think we're the unfortunate victims of overloaded terminology. The deadlock I refer to in the AIP is not a database deadlock. The word choice is not ideal, but that's what it's currently called in the code. It refers to the scenario where no tasks can be scheduled anymore -- not due to database deadlock, but because the backfill process somehow got into a state where no tasks can be scheduled. For example, see this comment <https://github.com/apache/airflow/blob/c09fcdf1d0e69497cf1b628df9ba3349eb688256/airflow/jobs/backfill_job_runner.py#L496-L499> and this comment <https://github.com/apache/airflow/blob/c09fcdf1d0e69497cf1b628df9ba3349eb688256/airflow/jobs/backfill_job_runner.py#L730-L732>.

It's not really clear exactly how this happens, but I have a suspicion that it's probably more common when we use params like `task_regex` to run backfill on only a subset of a dag, and in that scenario it's easier to imagine weird things happening. Incidentally, along with many other params, I am proposing initially to remove `task_regex`, and this is mentioned in the doc. My thought is that I don't really like having to deal with that complexity; it's confusing, not super well-defined behavior, and I suspect it is the likely cause of the deadlock concerns that can be found in the code. And I figured to just lead with that in the proposal and see if anyone has any objections.

> I think we should make sure that backfill runs are scheduled and
> queued in the executor in the same "scheduler loop" or get a better
> mechanism to avoid deadlocks.

Same thing here -- this is a different kind of deadlock. And on scheduling in the same scheduler loop, I agree; I did not propose otherwise. Indeed, part of the motivation is to reduce complexity and "two ways of doing things", and if I wrote a second scheduler *in the scheduler* for this, I would not be doing that! But there will still be *more* for the scheduler to do.
I think the main thing will be, essentially, creating the dag runs. There will need to be a different path for creating the dag runs, but once they are created, I expect the task instances would be managed the same as any other task instance. There are other comments about deadlocks, but again, it's a different thing from what I call out in the proposal, and I think I agree with you and we are on the same page -- the tasks in a backfill should be handled by the same mechanisms as normal tasks and therefore have the same *database* deadlock risks.

> But that also requires some
> mechanism to avoid starvation - for example we should only allow say
> max 30% of runs scheduled and queued within a single scheduler loop to
> be backfills.

I was trying to avoid taking on a larger question about scheduling priority. I think some ideas around that have been simmering for a while, but it sort of merits its own AIP. That said, one limit mechanism already available that I do not propose to remove is using a pool. Interestingly, right now backfill does not seem to respect the defined pool at all, though it does have an optional pool argument. I think we should keep pool, but apply it as an optional override -- so the defined pool will be respected unless it is overridden in backfill configuration. A user would then be able to run backfill tasks in a pool that limits their concurrency.

One problem with things like priority weight in Airflow is that they are not forward-looking; they are only evaluated in the current scheduler loop, with the tasks at hand. So you might schedule a bunch of low-priority things because that's all there is to schedule right now, but in 2 minutes the highest-priority thing comes up and now it can't be scheduled. This sort of complexity is why I was hoping to avoid folding it into this AIP, which I think has enough on its plate. I am open to adding some simple concurrency controls for backfill; I think it's a reasonable idea.
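To make the pool-as-override idea concrete, here is a minimal sketch of the resolution rule I have in mind. All names (`Backfill`, `pool_override`, `resolve_pool`) are hypothetical, not actual Airflow internals -- the point is only the precedence: the task's defined pool wins unless the backfill configuration explicitly overrides it.

```python
# Hypothetical sketch, not actual Airflow code: which pool should a
# backfill task instance run in?
from dataclasses import dataclass
from typing import Optional

@dataclass
class Backfill:
    # Optional pool set in the backfill configuration (hypothetical field).
    pool_override: Optional[str] = None

@dataclass
class Task:
    # Pool defined on the task itself.
    pool: str = "default_pool"

def resolve_pool(task: Task, backfill: Optional[Backfill]) -> str:
    """The task's defined pool is respected unless the backfill
    configuration explicitly overrides it."""
    if backfill is not None and backfill.pool_override:
        return backfill.pool_override
    return task.pool

print(resolve_pool(Task(pool="etl_pool"), None))                       # etl_pool
print(resolve_pool(Task(pool="etl_pool"), Backfill()))                 # etl_pool
print(resolve_pool(Task(pool="etl_pool"), Backfill("backfill_pool")))  # backfill_pool
```

With this shape, a user who wants to throttle a backfill just points it at a small pool; a user who does nothing gets the same pool behavior as a normal run.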
But I'm not sure exactly what would be the best thing, and I expect some thoughts on this will materialize over the course of the AIP's implementation. I do make small mention of this at the bottom of the doc, that it's under consideration. For now it is essentially (1) pool overrides and (2) pausing the backfill.

> I think it's crucial to design and describe how the "looping" process
> should look like for scheduler, whether we continue having
> mini-scheduler and how backfill scheduling processing should look like.

I think the mini-scheduler is sort of unrelated, because my expectation is that once tasks are created, they will be managed the same as other tasks. At a high level, my thought is that the "backfill part" of the scheduler will be in the dag run creation; when it comes to queuing and managing tasks, it would be handled by the normal process, which should remain more or less unchanged. I can add this to the AIP doc.
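To illustrate the loop shape I'm describing, here is a toy sketch (every name here is made up for illustration; none of this is actual Airflow scheduler code). The only backfill-specific step is materializing dag runs; once a run exists, its task instances flow through the same unchanged scheduling path as any other run.

```python
# Toy model of the proposed loop shape: backfill logic lives only in
# dag run creation; queuing treats all runs identically.
from dataclasses import dataclass, field

@dataclass
class DagRun:
    logical_date: str
    is_backfill: bool = False
    tasks_queued: bool = False

@dataclass
class SchedulerState:
    dag_runs: list = field(default_factory=list)
    # Backfill dates still awaiting dag run creation (hypothetical bookkeeping).
    backfill_dates: list = field(default_factory=list)

def scheduler_loop(state: SchedulerState) -> None:
    # Backfill-specific step: create dag runs for pending backfill dates.
    for date in state.backfill_dates:
        state.dag_runs.append(DagRun(logical_date=date, is_backfill=True))
    state.backfill_dates.clear()

    # Unchanged step: queue tasks for every run, backfill or not, the same way.
    for run in state.dag_runs:
        run.tasks_queued = True

state = SchedulerState(
    dag_runs=[DagRun("2024-06-01")],
    backfill_dates=["2024-01-01", "2024-01-02"],
)
scheduler_loop(state)
print(len(state.dag_runs))                              # 3
print(all(r.tasks_queued for r in state.dag_runs))      # True
```

The design point the sketch is meant to convey: no second scheduler, and no backfill-specific task handling -- just one extra source of dag runs feeding the existing loop.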