This looks good, Daniel. I am also happy that the scope is limited to moving backfill execution to the scheduler, and that the interface changes around data completeness will be part of the Data Awareness AIPs. That is a good separation, imo.

On Tue, 9 Jul 2024 at 15:13, Daniel Standish <daniel.stand...@astronomer.io.invalid> wrote:

> I put up a draft AIP for scheduler-managed backfill here:
>
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-78+Scheduler-managed+backfill
>
> Quick summary:
>
> TLDR: move backfill from CLI process to the scheduler
>
> Backfill currently is a CLI-only feature that in effect runs a scheduler
> locally in the CLI process. We don't have good visibility of backfill jobs
> in the web UI, and users without CLI access cannot access the feature.
> Additionally, it's not ideal to have a "second scheduler" from a project
> maintenance perspective.
>
> This AIP focuses specifically on moving management of backfill jobs to the
> scheduler. This will take something away from users. Previously you could
> run backfill in local mode which would not only schedule the backfill
> locally but run all the tasks locally as well. This will go away. And the
> scheduler will of course have more to do, to the extent that backfill is
> used. The scheduler will become somewhat more complex since it will have
> to manage backfill runs too.
>
> There are some interactions with other AIPs. E.g. backfill is
> fundamentally about data completeness. And the data awareness AIPs may
> change what that can mean in Airflow.
>
> I look forward to your feedback.
>
> Thanks