Hey Subash, Jon

> Plugins, providers, and their associated Python libraries all need to execute 
> code in order to be installed which is a vulnerability.  Plugins in 
> particular are often developed/installed by the data engineers and not by 
> system administrators, leading us back to our original problem.

@John - it's you who introduced the User/Admin separation and
reasoning. I think you should follow the logical consequence of it and
introduce different levels of access for those two types of users to
manage the platform. You can control who has access to install things
and where. You are already managing the access control for
reconfiguring MWAA, and I am sure you do not give casual "users" the
ability to control certain aspects of the platform. I am sure you
could restrict the ability to install packages on the webserver to
admins only, and leave it open to users on the "scheduler/worker". Is
that not possible? It sounds like what you really need from your
description.

> @Jarek - you are right about the use/admin difference, it’s a disambiguation 
> that permeates beyond the airflow UI layer in MWAA - IAM auth is used for 
> determining authN and AuthZ, hence to secure the webserver from un-authorized 
> code, we would have to either a/ treat plugin updates as an elevated 
> permission activity, or b/ separate out the webserver intended 
> requirements/plugins from the ones required for DAGs so that the authZ can be 
> handled separately.

Correct. This is exactly what I propose. Have a separate
"providers/plugins" install which only admins can update. Any package
added by an "Admin" is added to both - webserver and
worker/scheduler. If you want DAG-only packages that are needed by
"Users", they can be added only to workers. Sounds pretty
straightforward.

> We stayed with the one-DAG-bad ideology to not add complexity to customers 
> and coaching them on "if you add to A it goes here, and if B it goes to 
> webserver". That’s is why we are now between rock and a hard place - not 
> being to open up all installs into webserver OR separate the DAG bag for 
> webserver and other entities.

No. This is different. It's not "what" you install but "who" installs
it. I just follow the distinction introduced by Josh - if your
corporate customers have two distinct types of users, "Admins" and
"Users", I think you should follow this and introduce those two
different types of users. When a package is added by an "Admin", it
should be added to both - webserver and worker/scheduler. If it is
added by a "User", then it is added only to "worker/scheduler".
Then if the admins (I guess those are the ones who need to configure
connections anyway) need a "connection type", they could add
the right provider themselves. Users will not be able to add them.
That completely solves the problem that Josh mentioned, I believe.
Please correct me if I am wrong.
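
To make the "who installs it" rule concrete, here is a minimal sketch
of the routing I have in mind. It is only an illustration, not
anything MWAA implements: the Role enum, the component names and the
install_targets() function are all hypothetical, and it assumes
user-added packages go to worker/scheduler only.

```python
from enum import Enum


class Role(Enum):
    """Hypothetical roles, mirroring the Admin/User split discussed above."""
    ADMIN = "admin"
    USER = "user"


def install_targets(role):
    """Return the components a package is installed on, based on who adds it.

    Assumption: admin-added packages go everywhere (including the
    webserver), user-added packages never reach the webserver.
    """
    if role is Role.ADMIN:
        return {"webserver", "scheduler", "worker"}
    return {"scheduler", "worker"}
```

With this rule a provider that supplies a "connection type" would have
to be added by an Admin, because only Admin installs reach the
webserver.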

> On 6/18/21, 1:36 PM, "Jarek Potiuk" <[email protected]> wrote:
>
>     CAUTION: This email originated from outside of the organization. Do not 
> click links or open attachments unless you can confirm the sender and know 
> the content is safe.
>
>
>
>     > That would certainly help a bit, but unfortunately it's not just the 
> packages.  It's the fact that authentication is tied to Python code that can 
> be patched by anyone with permission to execute code on the web server, which 
> in turn would give them access to packages or any anything else they'd like.
>
>     But in Airflow 2.0 the code provided by "DAG writers" is not executed
>     any more.  This is entirely gone together with Airflow 1.10.  This has
>     been handled by DAG serialization, which is the only option available
>     in 2.0. I do not see how the "Users" could add any code if "Admins"
>     control the packages that are installed in the webserver. Now if
>     Admin/User is the only problem then I think this is really
>     misunderstanding coming from the pre-DAG-serialization world of Apache
>     Airflow.
>
>     J.
>


-- 
+48 660 796 129
