Comment on Shubham's question:

> In addition, are there any other user scenarios, beyond multi-tenancy, that 
> Airflow users are looking to enable and that require this pluggability? 
> Asking as I haven't come across them. Overall, I believe we need more 
> information on your proposal before seeking feedback from the community. 
> Could we work together during February to develop a concrete proposal?

I am glad you asked. I think this is one of the things I wanted to
achieve by adding this page:
https://github.com/apache/airflow/blob/main/docs/apache-airflow/public-airflow-interface.rst
It will be live in 2.6, and one of its main parts is this one:

https://github.com/apache/airflow/blob/main/docs/apache-airflow/public-airflow-interface.rst#using-public-interface-to-extend-airflow-capabilities

It really explains what "Airflow as a Platform" is all about. I do
not think we already know all the parts that should be converted into
"Airflow extensibility". It is more of an incremental effort, where
from time to time we have those bright ideas: "Hey - this part can be
removed and delegated to others". I think this has never been
formulated explicitly, but for quite a while we have really been in
the mode where we think much more about what we can SPLIT OUT from
Airflow rather than what we can ADD to Airflow.

When you look at it, this is also the main idea behind the
OpenLineage integration, for example - we are adding OpenLineage
(which is really just an API) so that others can build
"everything-lineage" on top of it. So we are adding a minimum-possible
set of APIs and integrations to expose the lineage capability, so that
all the lineage "UI" and other use cases that lineage enables can be
done outside. We are in a strong position to do it - being sure that
when we expose it, others will implement the integration they care
about.

I think more and more (and it has been preached mostly by Ash, but
also by others) that we should be focusing solely on being an
extremely powerful and robust scheduler, and make sure we are exposing
everything that can be exposed as an external API (while still
providing a basic implementation that keeps Airflow a "finished"
product that can be used to handle basic cases).
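
To make the "expose an API, ship a basic implementation" idea a bit more
concrete, here is a minimal sketch in Python. It is purely illustrative:
`UserManagementProvider`, `SimpleUserManagementProvider` and `can_access`
are hypothetical names invented for this example, not existing Airflow
classes, and the permission strings simply mirror the "John can-edit dag1
and can-view dag2" example from Shubham's questions.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class UserManagementProvider(ABC):
    """Hypothetical extension point (NOT an existing Airflow API)."""

    @abstractmethod
    def is_authorized(self, user: str, action: str, resource: str) -> bool:
        """Return True if `user` may perform `action` on `resource`."""


class SimpleUserManagementProvider(UserManagementProvider):
    """Basic built-in implementation backed by an in-memory set of grants."""

    def __init__(self, grants: set[tuple[str, str, str]]) -> None:
        self._grants = grants

    def is_authorized(self, user: str, action: str, resource: str) -> bool:
        return (user, action, resource) in self._grants


def can_access(
    provider: UserManagementProvider, user: str, action: str, resource: str
) -> bool:
    """Core code talks only to the interface, never to a concrete provider."""
    return provider.is_authorized(user, action, resource)


# Assumed example data: John can edit dag1 and can view dag2.
provider = SimpleUserManagementProvider(
    {("john", "can_edit", "dag1"), ("john", "can_view", "dag2")}
)
```

An external implementation (for example one that delegates the decision
to an identity system) would subclass the same interface, and the core
would not need to change.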

BTW, we are now preparing for the Airflow Summit CFP (some
announcements will follow shortly, I do not want to spill too many
beans) and we have a very interesting broad category "Airflow and
...". And I think we should work in the direction that the `...` is
far bigger than Airflow itself.

J.

On Tue, Feb 14, 2023 at 12:34 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>
> Great idea Vikram, I love the idea of making this a provider/pluggable.
>
> In some ways, we already have a pluggable mechanism for Authentication with
> Auth Backends [1]. Where we will need a lot more work, I think, is:
>
> - Replacing Access Control provided by FAB with a base/core security model
>   (that is still resource-based) [2].
> - Extending this to the other Airflow components (scheduler, workers,
>   triggerer, CLI) or making them all driven by a single API that takes care
>   of Auth. This will also reduce a lot of code duplication across many of
>   the components.
> - For backwards compat, we could ship with a FAB provider that still uses
>   Flask-AppBuilder, in addition to our recommended provider that will have
>   more features; users/companies/stakeholders can build on top of that
>   provider to extend it further.
>
>
> References:
> [1]: 
> https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#auth-backends
> [2]: 
> https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/security/access-control.html
>
> On Tue, 14 Feb 2023 at 02:06, Mehta, Shubham <shu...@amazon.com.invalid> 
> wrote:
>>
>> Hi Vikram,
>> Thank you for taking the time to review the proposal. I appreciate your 
>> insights — I will make sure to reach out to you directly in the future for 
>> feedback as that would've undoubtedly saved us some time and effort.
>>
>> In regards to the separation of user management, I understand your concerns 
>> and, on a high-level, I agree with you. However, I think it would be 
>> beneficial to have more details on how it will work. Here are a few 
>> questions that come to mind:
>> 1. How will the user-id/group-id interface interact with Airflow 
>> resource-level permissions? What parts of "John can-edit dag1 and can-view 
>> dag2" be part of Airflow core? What will be exposed to the external system?
>> 2. Who will be responsible for managing the resource-level permissions? Will 
>> it be the external system?
>> 3. What are the limitations of this new pluggable model compared to FAB? 
>> Will there be restrictions on the granularity of resource access that 
>> Airflow admins can provide to their users?
>> 4. As Jarek pointed out, with this change we want to make authorization 
>> externally driven. Will this have a significant impact on Airflow 
>> performance as authorization will be required for fetching variables, 
>> executing tasks, etc.?
>> 5. What will the migration process look like for existing users to this 
>> non-FAB pluggable model?
>>
>> In addition, are there any other user scenarios, beyond multi-tenancy, that 
>> Airflow users are looking to enable and that require this pluggability? 
>> Asking as I haven't come across them. Overall, I believe we need more 
>> information on your proposal before seeking feedback from the community. 
>> Could we work together during February to develop a concrete proposal?
>>
>> Besides this, I would like to propose that we define the scope and long-term 
>> vision of "Airflow core". To achieve this, it may be helpful to first 
>> outline the perspectives of the Airflow PMCs. Recently, there have been 
>> discussions regarding the separation of executors into a separate package, 
>> the implementation of pluggable schedulers, and other related topics. 
>> Currently, these decisions and discussions are somewhat ad hoc and are made 
>> through the mailing list. I would be happy to collaborate and invest time in 
>> this effort.
>>
>> Regards
>> Shubham
>>
>> On 2023-02-13, 11:04 AM, "Jarek Potiuk" <ja...@potiuk.com> wrote:
>>
>>     CAUTION: This email originated from outside of the organization. Do not 
>> click links or open attachments unless you can confirm the sender and know 
>> the content is safe.
>>
>>
>>
>>     Hey Vikram,
>>
>>     I think it's brilliant and I wonder how it happened that it had not
>>     occurred to us earlier. I believe that is due to the natural
>>     tendency of "doing as we always did" rather than thinking
>>     completely out-of-the-box. Thanks Vikram for bringing it up.
>>
>>     The funny thing is that when I see this:
>>
>>     > However, I don't agree that this level of user management belongs in 
>> "Core Airflow".
>>
>>     I almost immediately think - NOOOOO, why, it's always been here, how
>>     can we remove it?
>>
>>     But then if you look a bit closer:
>>
>>     > think this is a time to consider the concept of a "user management 
>> provider" with a simple built-in implementation being the current Airflow 
>> functionality, enabling alternate more complex (but separate) 
>> implementations such as your proposal here as alternate user management 
>> providers.
>>
>>     Then it starts to make way more sense. Way more.
>>
>>     And when you look further:
>>
>>     >  Maybe, this also enables us to get rid of the Fab security manager 
>> from core Airflow?
>>
>>     My heart jumps and I am immediately sold on the idea.
>>
>>     When I was commenting on the doc initially, something was not right.
>>     I had a feeling it was probably the 5th time I was looking at and
>>     commenting on a similar document. And, well, it was, actually. Most of
>>     the things we discussed there are already implemented out there. We
>>     just need to make sure we expose enough of the API to use them. For
>>     example, we have Keycloak, which is an open source implementation of
>>     Identity and Access Management, with everything out there already
>>     integrated, and I've been part of a project that integrated just the
>>     authentication part. Now if we rethink the authorization and make it
>>     simpler and "externally driven", this will not only be faster IMHO,
>>     but will also allow enterprise users to integrate much better.
>>
>>     I believe following the path that Vikram outlined will be a good
>>     direction for everyone in the community - including all the Managed
>>     Service providers, who will have a far easier job integrating
>>     Airflow into their authentication models.
>>
>>     J.
>>
>>
>>
>>     On Mon, Feb 13, 2023 at 6:24 PM Vikram Koka
>>     <vik...@astronomer.io.invalid> wrote:
>>     >
>>     > Shubham and Vincent,
>>     >
>>     > Let me start by saying that I apologize for my delayed response to 
>> your original email.
>>     >
>>     > I appreciate the detailed write-up and the thought behind it. I 
>> completely agree with your use case and understand how this is applicable to 
>> enterprises with multiple data teams using Airflow.
>>     >
>>     > However, I don't agree that this level of user management belongs in 
>> "Core Airflow".
>>     >
>>     > I strongly believe that the core Airflow mission is for the community 
>> at large and for data practitioners either individuals or teams within 
>> enterprises. And therefore, I don't disagree with the intent of making it 
>> easier for enterprise teams to adopt Airflow. But, I think there is a never 
>> ending list of user management features which are needed to support 
>> Enterprise needs. We have already struggled with this over time and faced 
>> challenges with the Fab security manager and its integration in Airflow.
>>     >
>>     > I think we should use this opportunity and your use case to "separate 
>> the user management" from Core Airflow outside of the absolute basics. I 
>> think this is a time to consider the concept of a "user management provider" 
>> with a simple built-in implementation being the current Airflow 
>> functionality, enabling alternate more complex (but separate) 
>> implementations such as your proposal here as alternate user management 
>> providers. Maybe, this also enables us to get rid of the Fab security 
>> manager from core Airflow?
>>     >
>>     > Best regards,
>>     > Vikram
>>     >
>>     >
>>     > On Fri, Feb 3, 2023 at 8:22 AM Beck, Vincent 
>> <vincb...@amazon.com.invalid> wrote:
>>     >>
>>     >> Thanks
>>     >>
>>     >> On 2023-02-03, 10:55 AM, "Jarek Potiuk" <ja...@potiuk.com> wrote:
>>     >>
>>     >>
>>     >>     Added.
>>     >>
>>     >>     On Fri, Feb 3, 2023 at 3:53 PM Beck, Vincent
>>     >>     <vincb...@amazon.com.invalid> wrote:
>>     >>     >
>>     >>     > Thank you! 
>> https://cwiki.apache.org/confluence/display/~vin100.beck
>>     >>     >
>>     >>     > On 2023-02-02, 5:38 PM, "Jarek Potiuk" <ja...@potiuk.com> wrote:
>>     >>     >
>>     >>     >
>>     >>     >     What's your cwiki ID, Vincent (I'll add you without going 
>> into details yet)
>>     >>     >
>>     >>
>>
