Hi folks,

Firstly, thanks Jarek for putting together such a thorough and well-thought-out 
proposal.

I am very much in support of the multi-tenancy proposal. Having discussed this 
with over 30 customers (AWS and non-AWS), there's a clear desire to shift focus 
from the complex management of multiple Airflow environments to enhancing their 
capabilities, such as enabling data quality checks and lineage. This proposal 
is a significant step towards achieving that goal.

Acknowledging that not every Airflow user has enough time to thoroughly review 
the AIP, I have drafted a user scenario that encapsulates what's possible with 
the implementation of multi-tenancy support:

---- Scenario: Multi-Tenancy in Apache Airflow at [Rocket] ----
[Rocket], a leading [mobile gaming platform], has adeptly structured its cloud 
operations using Apache Airflow to provide an efficient and secure multi-tenant 
environment for orchestrating their complex workflows. This approach caters to 
the diverse needs of their three main user groups: the Data Engineering team, 
the Data Science team, and the Data Analytics team.

All teams share basic Airflow components like the Scheduler and Webserver, 
providing centralized management with shared cost. Each team has its own 
distinct tenant cluster, offering self-sufficiency, flexibility, and isolation. 
The Data Engineering team builds ETL/ELT pipelines and produces user profile, 
telemetry, and marketing data. The Data Analytics team works with marketing data and 
user information to build comprehensive dashboards. The Data Science team uses 
Kubernetes as their execution environment for heavy-duty machine learning 
tasks, producing a churn prediction dataset.

Members of each team can only see and work with their own workflows. However, 
data engineers are granted access to all tenants, enabling them to assist with 
DAG troubleshooting and optimization across all teams. Upon logging in, users 
are presented with a tenant-specific view, displaying only the relevant DAGs 
and artifacts. For those with multi-tenant access, seamless navigation between 
different tenant views is available without the need for re-authentication.
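
As a rough illustration of the access rules above, here is a minimal Python sketch. It is not part of AIP-67 and does not use any Airflow API; the Dag and User classes, the visible_dags helper, and the tenant and DAG names are all hypothetical, chosen only to mirror the three teams in this scenario.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dag:
    dag_id: str
    tenant: str  # the tenant (team) that owns this DAG


@dataclass
class User:
    name: str
    tenants: set = field(default_factory=set)  # tenants the user may access


def visible_dags(user, dags, active_tenant):
    """Return the DAG ids shown in the user's view of `active_tenant`."""
    if active_tenant not in user.tenants:
        raise PermissionError(f"{user.name} has no access to {active_tenant}")
    return [d.dag_id for d in dags if d.tenant == active_tenant]


# Hypothetical tenants, one per team in the scenario.
ALL_TENANTS = {"data_engineering", "data_science", "data_analytics"}

dags = [
    Dag("etl_user_profiles", "data_engineering"),
    Dag("churn_prediction", "data_science"),
    Dag("marketing_dashboard", "data_analytics"),
]

# An analyst sees only their own tenant; a data engineer can switch between all.
analyst = User("analyst", {"data_analytics"})
engineer = User("engineer", set(ALL_TENANTS))
```

Here the engineer can call visible_dags with any of the three tenants, which stands in for switching tenant views without re-authenticating, while the analyst is rejected for any tenant other than data_analytics.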

This setup lets each team work independently with their own tools and data, 
while also getting help from data engineers when needed. It's secure, 
efficient, and user-friendly.

Image: https://imgur.com/gallery/uQNqiVc (highly recommend reviewing the image 
to understand the underlying setup)
-----------------------------------------------------------------------------------

I’d suggest that interested Airflow users review the scenario and share their 
support or concerns about this concept in this thread or on the AIP. For those 
interested in diving deeper into the details, the AIP is available here: 
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-tenant+deployment+of+Airflow+components

Thanks
Shubham
Product Manager - Amazon MWAA

From: Jarek Potiuk <ja...@potiuk.com>
Reply-To: "us...@airflow.apache.org" <us...@airflow.apache.org>
Date: Monday, March 11, 2024 at 4:05 PM
To: "dev@airflow.apache.org" <dev@airflow.apache.org>, 
"us...@airflow.apache.org" <us...@airflow.apache.org>
Subject: RE: [DISCUSS] DRAFT AIP-67 Multi-tenant deployment of Airflow components


I have iterated and already got a LOT of comments from a LOT of people (thanks 
to everyone who spent time on it). I'd say the document is out of draft already: 
it very much describes the idea of multi-tenancy that I hope we will be voting 
on some time in the future.

Taking into account that ~30% of people in our survey said they want 
"multi-tenancy", what I am REALLY interested in is to get honest feedback 
about the proposal. Mainly:

"Is this the multi-tenancy you were looking for?"

Or were you looking for different droids (err, tenants) maybe?

I do not want to exercise my Jedi skills to influence your opinion. That's why 
the document is there (and some people say it's nice, readable and pretty 
complete), so that you can judge for yourself and give feedback.

The document is here: 
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-tenant+deployment+of+Airflow+components

Feel free to comment here, or in the document. I would love to hear more 
voices, and I have some ideas about what to do next to validate the idea, so 
please engage for now, but also expect some follow-ups.

J.


On Wed, Mar 6, 2024 at 9:16 AM Jarek Potiuk <ja...@potiuk.com> wrote:
Sooo.. Seems that it's AIP time :D I've just published a Draft of AIP-67:

Multi-tenant deployment of Airflow components

https://cwiki.apache.org/confluence/display/AIRFLOW/%5BDRAFT%5D+AIP-67+Multi-tenant+deployment+of+Airflow+components

This AIP is a bit lighter in detail than the others you could see
from Jed, Nikolas and Maciej. This is really a DRAFT / High Level
idea of Multi-Tenancy that could be implemented as a follow-up to the
previous Multi-Tenancy steps that are already implemented (or being
implemented) right now.

Rather than describe all the details now, I decided to focus on
the concept of Multi-Tenancy that I wanted to propose: most of all
explaining the concept, comparing it to current ways of achieving some
forms of multi-tenancy, and showing the benefits and drawbacks of the
solution and the connected costs (i.e. what complexity we need to add to
achieve it).

When thinking about Multi-tenancy, I realized a few things:

* everyone might understand multi-tenancy differently
* some forms of multi-tenancy are achievable even today
* but, most of all, I started to ask myself: "Is what we can do
enough for some sufficiently numerous groups of users to call
it a useful feature for them?"

So before we get into more details, my aim is to make sure we are all
on the same page on what we CAN do as multi-tenancy, and eventually
to decide whether we SHOULD do it.

Have fun. Bring in comments and feedback.

More about all the currently active AIPs at today's Town Hall.

BTW. Do you think it's a surprise that 5 AIPs were announced just
before the Town Hall? I think not :D

J.
