Ah, I guess I am a little bit late now.

In terms of the

> Isolating code execution and parsing of DAG files.
>

At Airbnb, we use `Docker runtime isolation for airflow tasks` plus
`Parsing Service` to totally isolate DAG parsing and task execution from
the airflow infra runtime.

`Docker runtime isolation for airflow tasks` (see email thread: [DISCUSS]
Docker runtime isolation for airflow tasks) introduces a docker layer which
wraps the dag parsing and task execution so each dag file can have its own
docker runtime.
`Parsing Service` totally removes the dag file parsing from the scheduler.
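
To illustrate the general idea only: below is a minimal sketch of parsing a
DAG file in a separate interpreter process, so that a crash or a slow import
in user code cannot take down the caller. This is not the actual
`Parsing Service` implementation. The helper name `parse_in_subprocess` and
the convention of reporting upper-case top-level variable names are
assumptions made up for the example.

```python
import json
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical child script (an assumption for this sketch): it executes one
# DAG file and prints the upper-case top-level names it defines as JSON.
CHILD = textwrap.dedent("""
    import json, runpy, sys
    namespace = runpy.run_path(sys.argv[1])
    names = sorted(k for k in namespace if k.isupper())
    print(json.dumps(names))
""")


def parse_in_subprocess(dag_path, timeout=30.0):
    """Run the DAG file in a separate interpreter and return what it reports.

    The caller only sees the JSON printed by the child; the user code itself
    is never imported into the calling process.
    """
    result = subprocess.run(
        [sys.executable, "-c", CHILD, dag_path],
        capture_output=True,
        text=True,
        timeout=timeout,  # a hanging DAG file raises TimeoutExpired
        check=True,       # a crashing DAG file raises CalledProcessError
    )
    # The DAG file may print to stdout itself; the JSON is the last line.
    return json.loads(result.stdout.strip().splitlines()[-1])


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write('DAG_ID = "example"\nhelper = 1\n')
    print(parse_in_subprocess(f.name))
```

Replacing the child interpreter with a container invocation would give each
DAG file its own runtime, which is the direction the Docker layer takes.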

These two features have been running in Airbnb's production for close to 1
year. I am working on open-sourcing them.

Ping


Best wishes

Ping Zhang


On Wed, Dec 1, 2021 at 11:43 AM Jarek Potiuk <[email protected]> wrote:

> Very good/important thoughts.
>
> From the discussions and looking at the (upcoming) proposals from
> Mateusz, we are going to have this all optional:
>
> We plan to have two config options:
>
> * DB Isolation mode for separating out DB access
> * Standalone DAG processor
>
> I totally agree that standalone/quick/dirty access mode for Airflow
> should be the default (so business as usual). Moreover - that will
> allow the introduction of the multi-tenant mode as "optional" in
> otherwise backwards-compatible Airflow - i.e. it could start to be
> available in 2.x line.
>
> Actually (and this is something up for discussion in the AIP) we could
> introduce "soft" multi-tenancy mode, where DB access will be still
> possible but flagged as a warning.
> This could give the user an option to switch gradually their DAGs to
> the multi-tenancy mode, if they are already using some direct DB
> access (for example in their callbacks or custom operator).
>
> Also, I think part of the AIP, and of the proof of concept while
> discussing it, should be initially rough, and later more comprehensive,
> performance testing of some "real-life" scenarios.
>
> J.
>
> On Wed, Dec 1, 2021 at 6:12 PM Ash Berlin-Taylor <[email protected]> wrote:
> >
> > I look forward to seeing these proposals etc.
> >
> > One thought I've just had is that we should be careful about two things
> > when taking on this work:
> >
> > 1. That performance is not impacted (specifically of the scheduler
> > "throughput") -- at least when only a single "tenant" is in use if not for
> > all.
> > 2. That we don't make the deployment story more complex for the small
> > deployments, nor for the "getting started on a laptop" initial user
> > experience.
> >
> > -ash
> >
> > On Fri, Nov 26 2021 at 18:23:32 +0100, Jarek Potiuk <[email protected]> wrote:
> >
> > Recording available here:
> > https://drive.google.com/file/d/1Irw7qxxeTOHZTfdvT5lAbGowIfm9DHzi/view
> >
> > On Fri, Nov 26, 2021 at 6:17 PM Jarek Potiuk <[email protected]> wrote:
> >
> > Thanks for the meeting this morning/afternoon :) ! It was very
> > productive, I believe.
> >
> > The notes are available here:
> > https://docs.google.com/document/d/19d0jQeARnWm8VTVc_qSIEDldQ81acRk0KuhVwAd96wo/edit
> >
> > The most important take is that it looks like, even if the use cases are
> > slightly different, we are all aligned on what needs to be done and how.
> >
> > Action points:
> >
> > * Composer team (Mateusz) will soon submit AIPs (they are close to
> >   being ready for proposing) for:
> >   * DB access isolation
> >   * Separating out DAG processor
> > * Cloudera team (Ian) will work on a follow-up Fine-grained resource
> >   access AIP - it can be implemented as next steps. The two AIPs above
> >   will implement "coarse" access level but in the way that the
> >   "fine-grained" access will be possible to be plugged-in.
> >
> > I recorded the meeting and I am waiting for the video to be processed -
> > I will send/add it to notes when I get it.
> >
> > J.
> >
> > On Fri, Nov 26, 2021 at 2:29 PM Jarek Potiuk <[email protected]> wrote:
> > >
> > > Reminder: the SIG meeting is today in ~2.5 hrs.
> > >
> > > Calendar link here:
> > > https://calendar.google.com/event?action=TEMPLATE&tmeid=N3ZmbGFxNGF1OXBtajc2ODU3bWduMWVvc2YgcG90aXVrLmFwYWNoZS5vcmdAbQ&tmsrc=potiuk.apache.org%40gmail.com
> > > Notes/material links will be added here
> > > https://docs.google.com/document/d/19d0jQeARnWm8VTVc_qSIEDldQ81acRk0KuhVwAd96wo/edit?usp=sharing
> > >
> > > I will record the meeting and post the link together with the notes.
> > >
> > > On Thu, Nov 25, 2021 at 3:31 PM Jarek Potiuk <[email protected]> wrote:
> > > >
> > > > Just a reminder -> multi-tenancy meeting tomorrow. Few people worked
> > > > on what will be presented tomorrow, and I am super excited we will be
> > > > able to kick that one off - it has been a long time on my waiting list
> > > > :)
> > > >
> > > > J.
> > > >
> > > > On Sat, Nov 20, 2021 at 10:14 AM Jarek Potiuk <[email protected]> wrote:
> > > > >
> > > > > The meeting is set for Friday 26th Nov 5 PM CET (4 PM UTC)
> > > > >
> > > > > This is the calendar link (google meet link there):
> > > > > https://calendar.google.com/event?action=TEMPLATE&tmeid=N3ZmbGFxNGF1OXBtajc2ODU3bWduMWVvc2YgcG90aXVrLmFwYWNoZS5vcmdAbQ&tmsrc=potiuk.apache.org%40gmail.com
> > > > >
> > > > > The initial agenda:
> > > > >
> > > > > 1) The goal of the group, intro about the "isolation" and various "scopes"
> > > > > of the multi-tenancy - Jarek Potiuk
> > > > >
> > > > > 2) The review of the example architecture that
> > > > > needs the "multitenancy" - this is from the Google Composer team -
> > > > > Mateusz Henc
> > > > >
> > > > > 3) Maybe others would like to get their case explained similarly
> > > > >
> > > > > 4) Discuss proposals on the scope of the AIP(s) we want to write
> > > > > and rough approach we can take for implementation and who will do
> > > > > what
> > > > >
> > > > > Google Meet call: meet.google.com/rxu-tvdz-vpv (edited)
> > > > >
> > > > > We will send more info/slides then. Anyone who would like to show/add
> > > > > something, please respond here :).
> > > > >
> > > > > J.
>
