Hello here,

New day - new prefix of the email.

I wanted to actually BOAST about the Extensible User Management AIP-56
(https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-56+Extensible+user+management)
- we completed the AIP a long time ago, but I think the actual completion
of the work is really happening now.

Big shoutout to Vincent leading it and everyone else who helped with it.

The soundness of the approach we took has finally been confirmed by a
3rd-party implementation of an LDAP Auth Manager by Emre Can (@emredjan) -
https://github.com/emredjan/airflow-ldap-auth-manager - which has just been
published and made available on our Ecosystem page.

I find it super exciting, and I feel somewhat proud - mostly personally -
even if I was mostly just gently pulling and pushing things here and there
after coming up with the initial direction. Not everyone followed it, but I
wanted to share the small success story of one of the more "hairy" things
we have done. Or rather, that Vincent mostly did, with a lot of people
helping.

I think this one is a really good example to follow: a long-term feature
done the right way, with a big, hairy long-term goal that has been
relentlessly pursued. It was done in parallel with the Airflow 3 work and
started a long time ago (almost 2 years ago to the day).

We are doing the same now under Amogh's leadership (the work was started by
Ash) with the isolation of the Task SDK - an even more "hairy" thing, and
an even more impactful one. And I hope we can take it as a good example for
things like state management, where we should take the time and make a
deliberate effort to form a longer-term vision, make decisions, and
relentlessly follow them to completion.

The story is pretty long, but I wanted to describe it here - even if you
do not read it today - to capture some of the things that made it happen,
so that we can refer to it in the future. I have thought for a looong time
about the day when I would write this email. I knew what I wanted to
write - and the day has finally come.

Brace yourselves, those who will venture into reading it.

----

It has been a long journey since the days when we mostly just *whined*
about FAB being such a pain. In Airflow 1, FAB RBAC was already a step up
compared to the earlier "everyone can do everything" approach - yes, that
was the default, I still remember it from Airflow 1.

This one somehow disappears from the memories even of those who were here
6 years ago. But it was a great improvement back then to have RBAC and
user management. One could think that it would only get more feature-rich
as something that Airflow has "built-in" - but it turned out to be a big
dependency hog, and for a long time FAB was not evolving as fast as Airflow
was with its enterprise integration needs. Groups were only recently added
to FAB (and we are not even using groups, even though we switched to a FAB
version that supports them). It was a pain for people to configure: they
had to switch between Airflow and FAB docs, and it was confusing whether a
given problem was an Airflow or a FAB issue. At some point we also
partially vendored the Security Manager into Airflow (and since then we
have needed to manually synchronise changes implemented in FAB to the
manager we have in Airflow). A lot of security issues came through FAB, and
often we had to wait for FAB or even contribute the fixes back. We worked
closely with Daniel, and that was great cooperation - we even had calls
when security issues were raised that affected FAB, Airflow and Superset,
where we discussed our responses together and synced the remediations. But
still - this was a lot, a lot, a lot of pain, and we had this recurring
whining: *"Oh we wish we did not have to depend on FAB".*

When we initially discussed it with Vincent we had a lot of back-and-forth
on how to design the Auth Manager. We all came up with the idea to do it
differently: rather than complicating the way RBAC is done with groups or
other features, we decided that it's not really Airflow's job to do
Authentication and Authorisation management - and to delegate it to those
who do it better.

We made some choices that I think stand the test of time, with a simple yet
powerful API design where, instead of thinking about what "more" we can do
in Airflow, we asked what we can do to make Airflow more of an "extensible
platform" - one that will be super-easy to extend. We believed that people
would come and do their own Auth Manager implementation some day. That day
has come.
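
To make the "extensible platform" idea a bit more concrete, here is a
minimal Python sketch of the delegation pattern. It is an illustration
only: `ToyAuthManager`, `GroupAuthManager`, `User` and `trigger_dag` are
hypothetical names made up for this example and are NOT the actual Airflow
Auth Manager API. The point is just that the core asks a question and a
pluggable implementation answers it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical user record; a real one would come from the ID system."""
    name: str
    groups: frozenset = frozenset()


class ToyAuthManager(ABC):
    """Hypothetical interface: the core only asks, the plugin decides."""

    @abstractmethod
    def can_trigger_dag(self, user: User, dag_id: str) -> bool:
        ...


class GroupAuthManager(ToyAuthManager):
    """Example plugin: decisions come from a group -> dag ids mapping."""

    def __init__(self, grants):
        # grants maps a group name to the set of dag ids its members may trigger
        self._grants = grants

    def can_trigger_dag(self, user, dag_id):
        return any(dag_id in self._grants.get(g, set()) for g in user.groups)


def trigger_dag(auth: ToyAuthManager, user: User, dag_id: str) -> str:
    # The core never inspects roles or groups itself; it delegates the decision.
    if not auth.can_trigger_dag(user, dag_id):
        return "403 Forbidden"
    return f"triggered {dag_id}"
```

Swapping `GroupAuthManager` for another implementation (LDAP-backed,
Keycloak-backed, whatever you have) would not require any change in
`trigger_dag`, which is the whole idea.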

Then the AWS team - mostly Vincent and Nick - dogfooded the (initially
experimental) AWS Auth Manager and polished its rough edges. That was
still way back in Airflow 2. It was a great POC of the idea, and one that
was absolutely necessary to move forward.

At the same time we (Vincent again!) implemented the FAB Auth Manager
interface and, step by step, moved all the FAB-y things out to the
provider, piece by piece, release by release, keeping the long-term vision
in mind and getting ever closer to it. That was **not an easy task** and it
had MANY bumps, but Vinc was relentlessly leading it and a number of people
were helping. (I had a very little part in it - mostly lurking from behind
Vincent's back - but I was in fact following very closely, ready to step in
and help any time with some difficult choices and decisions.) There were
many things to deal with around dependencies, issues, back-compatibility,
etc. And the work still continues there to complete some issues.

In the meantime, of course, Airflow 3 happened, with all the FastAPI
conversion (which turned out to be a great decision) and the new UI, where
we got rid of the FAB ties in the UI part - Widgets, Connections, Plugins,
some internal tooling. We got this out as a huge part of the Airflow 3
migration. Jens, Pierre, Karthikeyan and many others did that. Then we got
SimpleAuthManager as the default - which is "good" for demos and
development and shows the basic capabilities of Auth Managers with
different types of (hard-coded) users. It shows what Auth Managers can do
without even pretending that it can be used in production (actually it is
actively shouting in your face that it should not - still, some people want
to do it, of course).

Eventually - we are extremely close to cutting the final ties to FAB and
making the provider truly, truly optional, with its dependencies not
holding us back. There are literally a few more APIs for FAB Auth
management left to convert to FastAPI (and Yun-Ting Chiu - @chiuinggum -
has done and continues doing a great job there). The last remaining bridge
between the UI and the API for custom, old AuthBackends from Airflow 2 is
still to be added. Once we do it, the days when we whine "Oh I wish we did
not have to rely on FAB" will finally be gone. FAB was a great choice when
we started, but it turned out to be too much of a "gravity center", with a
number of choices that held us back.

Then the Keycloak implementation - which really was a "far-fetched idea"
(also Vincent's doing) - started. It is really nice and simple and
extremely powerful. Not everyone realizes it, but thanks to the way we
designed the Auth Manager, our enterprise users can do a LOOOOOOT more than
the "constrained" FAB RBAC allows. You can configure all the complexity
that Keycloak provides - mix and match authentication policies and rules -
and you are only limited by what Keycloak provides. For example, you could
configure that a given Dag or Dag group can be triggered by a group of
users between 9am and 5pm only - and at other times the trigger will be
rejected. Generally, any complexity of deciding what users should be
capable of can be implemented in a simple way by defining and combining
JavaScript policies in Keycloak. I really like how Keycloak does one thing
BEST -> ID management integrated with enterprise ID systems. And we now
have a really nice (simple, yet powerful) provider which people can use. If
you already used Keycloak, you could integrate it via FAB, but with the
Keycloak provider, the native integration is so much more powerful.
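
Just to illustrate the kind of rule I mean with the 9am-5pm example: in
Keycloak you would configure this in Keycloak itself, not in Python, so the
snippet below is only a sketch of the decision logic. The function name
`may_trigger`, its parameters, and the choice of a half-open window
(9:00 inclusive, 17:00 exclusive) are all my assumptions for the example.

```python
from datetime import datetime, time


def may_trigger(user_groups, allowed_group, now,
                start=time(9, 0), end=time(17, 0)):
    """Allow only members of allowed_group, and only when start <= now < end."""
    in_group = allowed_group in user_groups
    in_window = start <= now.time() < end
    return in_group and in_window
```

For example, `may_trigger({"ops"}, "ops", datetime(2025, 1, 6, 10, 0))`
allows the trigger, while the same call at 18:00, or for a user outside the
"ops" group, rejects it.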

We already managed to implement nice batch optimizations and evolved the
Auth Manager API. We can have a single call for all the Dags you are
attempting to display in the Dag view (avoiding the n+1 issue) and see only
those that you have access to. We have that in Keycloak (or maybe it is
about to be completed - but we know how to do it), and any implementation
of Auth Manager can use it as well. We are also discussing a permission
discovery API that would allow the new UI, for example, to know whether it
should gray out the "Trigger Run" button before you attempt to click it
when you do not have permissions - for better UX.
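
A small sketch of why the batch call matters. `CountingBackend` is a
hypothetical stand-in for a remote authorization service (it just counts
round-trips), and `check_one` / `check_many` are names I made up for this
example, not the real Auth Manager methods.

```python
class CountingBackend:
    """Hypothetical remote authorization service that counts round-trips."""

    def __init__(self, allowed):
        self._allowed = set(allowed)
        self.calls = 0

    def check_one(self, user, dag_id):
        # One round-trip per Dag.
        self.calls += 1
        return dag_id in self._allowed

    def check_many(self, user, dag_ids):
        # One round-trip for the whole list of Dags.
        self.calls += 1
        return {d for d in dag_ids if d in self._allowed}


def visible_dags_naive(backend, user, dag_ids):
    # n Dags -> n authorization calls (the n+1 problem on a list view).
    return [d for d in dag_ids if backend.check_one(user, d)]


def visible_dags_batched(backend, user, dag_ids):
    # n Dags -> a single authorization call, then a local filter.
    permitted = backend.check_many(user, dag_ids)
    return [d for d in dag_ids if d in permitted]
```

Both functions return the same list of visible Dags; the difference is only
in how many times the backend is asked, which is what hurts when the
backend is a network call away.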

For many of our users Keycloak is too heavy to rely on. What if you have
LDAP, you don't care about all the features of Keycloak, and you do not
want to bank on it? Fret not - as of a few days ago there is this 3rd-party
LDAP Auth Manager, so you can use it instead, without having to install and
run Keycloak for your company. The day might come when, if Emre would like
that and it proves to be useful, we get it contributed, and we might even
be super happy to take over maintenance of it - following our just-voted
new provider acceptance policy. And it's the first fully reusable, generic
Auth Manager by "someone else" that is functional and published.

And ... you can also roll your own - and Emrecan has promised to review and
update our docs to make it even easier to do your own Auth Manager, so I
hope we will soon have some other "simpler-purpose" Auth Managers.

Fingers crossed.

That was a journey I wanted to share.

Vinc, you rock

J.
