Thank you all for the feedback.
Will add more of the technical details and share as a next step.

On Tue, May 28, 2024 at 12:33 AM Amogh Desai <amoghdesai....@gmail.com>
wrote:

> Good proposal!
>
> I like the idea here but again talking in terms of timelines, do we make it
> in Airflow 2 if it's that critical or can it wait till Airflow 3? I think
> we should scope this out and add some technical data to back this up before
> making this an AIP.
>
> Thanks & Regards,
> Amogh Desai
>
>
> On Sun, May 26, 2024 at 4:25 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Yes. Long time awaited - and indeed some implementation details would be
> > needed to get it to AIP. And I also think one important decision to
> > consider - should it be targeting Airflow 2?
> >
> > On Sun, May 26, 2024 at 12:26 PM Elad Kalif <elad...@apache.org> wrote:
> >
> > > > In order for this to become a reality, Backfills need to be handled by
> > > > the Airflow Scheduler as a normal DAG execution
> > >
> > > I think it's a good idea.
> > > It should solve natively problems like
> > > https://github.com/apache/airflow/issues/11302
> > >
> > > On Fri, May 24, 2024 at 10:58 PM Vikram Koka
> > > <vik...@astronomer.io.invalid> wrote:
> > >
> > > > Fellow Airflowers,
> > > >
> > > > I am following up on some of the proposed changes in the Airflow 3
> > > > proposal
> > > > <https://docs.google.com/document/d/1MTr53101EISZaYidCUKcR6mRKshXGzW6DZFXGzetG3E/>,
> > > > where more information was requested by the community.
> > > >
> > > > One specific topic was "Running Backfills at scale". This is not yet a
> > > > full-fledged AIP, but a starting point for the discussion leading
> > > > towards an AIP with fully defined technical details.
> > > >
> > > > Backfills at scale
> > > >
> > > > Backfills in Airflow 2.x are treated as an exception and executed by an
> > > > incarnation of the BackfillJob, rather than the regular Airflow
> > > > Scheduler itself. This results in unexpected interactions with the other
> > > > DAGs being run by the main Airflow Scheduler at the same time, including
> > > > resource contention and possibly unexpected delays, because established
> > > > scalability configuration settings such as Concurrency are not
> > > > consistently applied. It also adds code-level complexity by having two
> > > > somewhat-similar implementations of scheduling logic.
> > > >
> > > >
> > > > However, with ML model training, backfills are a common operation and
> > > > need to be treated as a regular Airflow DAG / Task execution operation,
> > > > not as an exception. It is also not possible to run a backfill unless
> > > > you have direct access to the Airflow database or SSH access to the
> > > > Airflow server, which is not possible for many/most data engineers.
> > > >
> > > >
> > > > In order for this to become a reality, Backfills need to be handled by
> > > > the Airflow Scheduler as a normal DAG execution, building on the Dynamic
> > > > Task Mapping execution pattern, rather than as an exception.
> > > > Additionally, Backfill tasks will now ONLY be executed by the Airflow
> > > > Workers, for obvious reasons including scalability. A less obvious but
> > > > important reason is Security, since it is ideal to have data connections
> > > > to Enterprise data only happen through Airflow Workers, rather than
> > > > through any other Airflow system components.
> > > >
> > > >
> > > > As part of making Backfill support cleaner in Airflow, Backfill DAG
> > > > execution will also be supported in the Airflow REST API.
> > > >
> > > >
> > > > This proposal is purposefully light on exact implementation details but
> > > > will include at least:
> > > >
> > > >    - Making the Airflow Scheduler responsible for scheduling decisions
> > > >      on all DagRuns (instead of the current behavior, where it
> > > >      purposefully ignores backfill runs).
> > > >    - A new API endpoint to submit a "backfill request".
> > > >
> > > >
> > > > --
> > > >
> > > >
> > > > Best regards,
> > > > Vikram Koka, Ash Berlin-Taylor, Kaxil Naik, and Constance Martineau
> > > >
> > >
> >
>
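
To make the "backfill request" idea from the proposal quoted above a bit more concrete, here is a minimal sketch of what such a request might carry and how it could expand into one run per logical date. Everything in it is an assumption for illustration only: the `BackfillRequest` name, its fields, and the daily schedule are hypothetical and are not an existing Airflow API. The proposal itself deliberately leaves these details open.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class BackfillRequest:
    """Hypothetical payload for a "backfill request" (not an Airflow class)."""

    dag_id: str
    start_date: date  # first logical date to backfill (inclusive)
    end_date: date    # last logical date to backfill (inclusive)

    def logical_dates(self) -> list[date]:
        """Expand the range into one logical date per day (assumes a daily schedule)."""
        if self.end_date < self.start_date:
            raise ValueError("end_date must not be earlier than start_date")
        days = (self.end_date - self.start_date).days
        return [self.start_date + timedelta(days=i) for i in range(days + 1)]


# Example: a three-day backfill for a DAG with the made-up id "example_dag".
request = BackfillRequest("example_dag", date(2024, 5, 1), date(2024, 5, 3))
print([d.isoformat() for d in request.logical_dates()])
```

Under the proposal, each of these dates would presumably become an ordinary DagRun that the regular Scheduler picks up, so the usual concurrency settings would apply to backfills as well.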
