Just for clarity, a correction to the last paragraph: if a package is
added by the "User", it is added only to "worker/scheduler".
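
To make the rule concrete, here is a minimal sketch of it in Python. The
names (Role, install_targets, the component strings) are made up for
illustration and are not part of Airflow or MWAA; it only assumes the two
roles and the two install locations discussed in this thread.

```python
from enum import Enum
from typing import FrozenSet

# Hypothetical component names, used only for this illustration.
WEBSERVER = "webserver"
WORKER_SCHEDULER = "worker/scheduler"


class Role(Enum):
    """The two kinds of people who may add packages (assumed, per this thread)."""
    ADMIN = "admin"
    USER = "user"


def install_targets(role: Role) -> FrozenSet[str]:
    """Return the components a package is installed on, based on who adds it."""
    if role is Role.ADMIN:
        # Admin-added packages (e.g. providers, plugins) go everywhere.
        return frozenset({WEBSERVER, WORKER_SCHEDULER})
    # User-added packages never reach the webserver.
    return frozenset({WORKER_SCHEDULER})


if __name__ == "__main__":
    for role in Role:
        print(role.value, "->", sorted(install_targets(role)))
```

The point is that the decision depends on "who" installs, not on "what"
is installed.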

J.

On Sat, Jun 19, 2021 at 1:03 AM Jarek Potiuk <[email protected]> wrote:
>
> Hey Subash, Jon
>
> > Plugins, providers, and their associated Python libraries all need to 
> > execute code in order to be installed which is a vulnerability.  Plugins in 
> > particular are often developed/installed by the data engineers and not by 
> > system administrators, leading us back to our original problem.
>
> @John - it's you who introduced the User/Admin separation and
> reasoning. I think you should follow the logical consequence of it and
> introduce different levels of access for those two types of users to
> manage the platform. You can control who has access to install things
> and where. You are already managing the access control for
> reconfiguring MWAA, and I am sure you do not give casual "users" the
> ability to control certain aspects of the platform. I am sure you
> could restrict the ability to install packages on the webserver to
> admins only, while keeping it open to users for the
> "scheduler/worker". Is that not possible? From your description, it
> sounds like what you really need.
>
> > @Jarek - you are right about the user/admin difference, it’s a 
> > disambiguation that permeates beyond the airflow UI layer in MWAA - IAM 
> > auth is used for determining authN and authZ, hence to secure the webserver 
> > from unauthorized code, we would have to either a/ treat plugin updates as 
> > an elevated permission activity, or b/ separate out the webserver intended 
> > requirements/plugins from the ones required for DAGs so that the authZ can 
> > be handled separately.
>
> Correct. This is exactly what I propose. Have a separate
> "providers/plugins" install which only admins can update. Any package
> added by an "Admin" is added to both webserver and worker/scheduler.
> If you want DAG-only packages that are needed by "Users", they can be
> added only to workers. Sounds pretty straightforward.
>
> > We stayed with the one-DAG-bag ideology to not add complexity to customers 
> > and coaching them on "if you add to A it goes here, and if B it goes to 
> > webserver". That is why we are now between a rock and a hard place - not 
> > being able to open up all installs into webserver OR separate the DAG bag for 
> > webserver and other entities.
>
> No. This is different. It's not "what" you install but "who" installs
> it. I just follow the distinction introduced by Josh - if your
> corporate customers have two distinct types of users, "Admins" and
> "Users", I think you should follow this and introduce those two
> different types of users. When a package is added by an "Admin", it
> should be added to both webserver and worker/scheduler. If it is
> added by the "User", then it is added only to "webserver/scheduler".
> Then if the admins (I guess those are the ones who need to configure
> connections anyway) need a "connection type", they can add the right
> provider themselves. Users will not be able to add them.
> That completely solves the problem that Josh mentioned, I believe.
> Please correct me if I am wrong.
>
> > On 6/18/21, 1:36 PM, "Jarek Potiuk" <[email protected]> wrote:
> >
> >     > That would certainly help a bit, but unfortunately it's not just the 
> > packages.  It's the fact that authentication is tied to Python code that 
> > can be patched by anyone with permission to execute code on the web server, 
> > which in turn would give them access to packages or anything else 
> > they'd like.
> >
> >     But in Airflow 2.0 the code provided by "DAG writers" is not executed
> >     in the webserver any more. This is entirely gone together with
> >     Airflow 1.10. It has been handled by DAG serialization, which is the
> >     only option available in 2.0. I do not see how the "Users" could add
> >     any code if "Admins" control the packages that are installed in the
> >     webserver. Now if Admin/User is the only problem, then I think this
> >     is really a misunderstanding coming from the pre-DAG-serialization
> >     world of Apache Airflow.
> >
> >     J.
> >
>
>
> --
> +48 660 796 129



-- 
+48 660 796 129
