potiuk commented on code in PR #33213: URL: https://github.com/apache/airflow/pull/33213#discussion_r1308050422
########## airflow/auth/managers/models/resource_details.py: ########## @@ -0,0 +1,31 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +from __future__ import annotations + +from dataclasses import dataclass + + +@dataclass +class ResourceDetails: + """ + Represents the details of a resource. + + All fields must be optional. These details can be used in authorization decision. + """ + + id: str | None = None Review Comment: I am not insisting - if you will make a decision to make one "is_authorized" method because you think it will be better, I have no big issue with it - but I would like to warn a bit from taking FAB as the "model" for resources. I think we want to break out from the model introduced by FAB because it was "bad". If you want to map the resources 1-1, yes you will have a lot of them. But I think we only have so many resources, because of FAB declarative method of defining the access via configuraiton. What we are planning to do is to implement AuthManager to define authentication as Code. it's very similar to what you'd compare Oozie and Airflow Workflows. If you want to convert oozie workflows to Airflow you'd end up with strange looking Airflow DAG, which you could much easier express in a Code in Airflow. The FAB "resource" definition has been traditionally very fine-grained because every permission in FAB was "declaratively" defined by assigning permission to role. This was really the only way to achieve it so if you wanted to decide if particular "sub-resource" is accessible or not in a fine grained way - you had no choice but to define fine-grained resources. And it was one of the FAB problems that it was too fine grained in definitiion. The only way you could give/not give access to particular view in FAB was that you combine "ACTION/METHOD" - ADD/DEL etc. with "RESOURCE" and either give or reject the access. For example if you wanted to decide to give/not give access to "DAG Code" you had to give access to RESOURCE_DAG_CODE. There was no other way. But. in our case we can define much more COARSE grained resources and decide in the Auth Manager implementation, whether it should be further fine-grained or not. This is for example why we only have access "per DAG" as resource - because we had no way to express it differeintly. You could not express a "group of DAGs" or "permission per tag" for that very reason because FAB model was "fixed" on resources and you could not express "Give permissions to all DAGS that have tag=x". But we can do soooo much better with AuthManagerAsCode. We can programmatically decide individually what you do in each case - you do not have to have separate resource for every single action you do in them - you can make the decision internally in the implementaiton of the Auth manager. For example instead of "DAG Code" resource have `auth_DAG` method. And this method can have both DagDetails and "accessing" field indicating which particular "part" of the DAG to access: ```python def auth_dag(dag_details: DagDetails, task_details: TaskDetails. | None, accessing_action: DagAccessAction, method: Method): .... ``` Dag details: ``` DagDetails: id: tags: queue/tenant: .... ``` Accessing Action: ``` DagAccessAction(Enum): code, dependencies. run, task_instance, import_error. ``` ``` MethodMethod): read, write, edit, delete ``` Now - our implementation might decide, based on whatever it wants -whether to allow or not specific action. It can be as simple as: ``` def auth_dag(dag_details: DagDetails, task_details: TaskDetails. | None, accessing_action: DagAccessAction, method: Method): if dag_details.queue == 'x': return True # Allow all DAG/TASK operations for DAG where queue is X ``` Or as complex as FAB - mapping the above to the original "resources" and looking up the matrix of permission in the configuration and have access or not to particular tasks beloning to DAG based on whatever criteria. Assumption here is that the permission set usually will be simple - a user can acces all dags that belong to the "queue" - if he belongs to admin group, "read only" if he is regular user, none if he is not there. I believe it is much simpler. I think if we do it this way, then implementor of each of the Auth Manager will make a decision on how complex, fine grained and configurable it is. If AWS Auth Manager want to **just** define 3 levels of users (Admin/ReadAll/None) and use "queue" to determine the group where those users are looked up - they will be able to do it. > My understanding is the number of resource type will not change, hence from [permissions.py](https://github.com/apache/airflow/blob/main/airflow/security/permissions.py#L19), there are 40 resources types and as such, we should have 40 methods? This looks really massive. Yes if you map 1-1. But we should IMHO group access logically and leave the decision on "fine-graining" acces to the implementor. Also a lot of those resources are not needed (user/admin/role goes away) Here is my breakdown (very quick). I split the FAB resources into logical resources that could be used as "Auth manager" entrypoints - each entrypoint will have corresponding parameters determinining - if needed - not only the id of the resource but also action we want to do and method used. User/Admin permissions - not needed in the new model ``` RESOURCE_ACTION = "Permissions" RESOURCE_ADMIN_MENU = "Admin" RESOURCE_AIRFLOW = "Airflow" RESOURCE_MY_PASSWORD = "My Password" RESOURCE_MY_PROFILE = "My Profile" RESOURCE_PASSWORD = "Passwords" RESOURCE_PERMISSION = "Permission Views" # Refers to a Perm <-> View mapping, not an MVC View. RESOURCE_LOGIN = "Logins" RESOURCE_ROLE = "Roles" RESOURCE_USER = "Users" RESOURCE_USER_STATS_CHART = "User Stats Chart" ``` Not needed because it can be implicit from other (connection/config etc). ``` RESOURCE_BROWSE = "Browse" RESOURCE_RESOURCE = "View Menus" ``` We likely not need it: ``` RESOURCE_DOCS = "Documentation" RESOURCE_DOCS_MENU = "Docs" ``` 1. Pools: ``` RESOURCE_POOL = "Pools" ``` 2. SLA: ``` RESOURCE_SLA_MISS = "SLA Misses" ``` 3. Jobs: ``` RESOURCE_JOB = "Jobs" ``` 4. Audit logs: ``` RESOURCE_AUDIT_LOG = "Audit Logs" ``` 5. Configuration: ``` RESOURCE_CONFIG = "Configurations" ``` 6. Connections: ``` RESOURCE_CONNECTION = "Connections" ``` 7. Variables: ``` RESOURCE_VARIABLE = "Variables" ``` 9. XComs: ``` RESOURCE_XCOM = "Xooms" ``` 10. View (read-only state of the installation): ``` RESOURCE_PLUGIN = "Plugins" RESOURCE_PROVIDER = "Providers" RESOURCE_TRIGGER = "Triggers" RESOURCE_WEBSITE = "Website" ``` 11. Dags: ``` RESOURCE_DAG = "DAGs" RESOURCE_DAG_CODE = "DAG Code" RESOURCE_DAG_DEPENDENCIES = "DAG Dependencies" RESOURCE_DAG_RUN = "DAG Runs" RESOURCE_DAG_WARNING = "DAG Warnings" RESOURCE_TASK_INSTANCE = "Task Instances" RESOURCE_TASK_LOG = "Task Logs" RESOURCE_TASK_RESCHEDULE = "Task Reschedules" RESOURCE_IMPORT_ERROR = "ImportError" ``` 12. Cluster activity: ``` RESOURCE_CLUSTER_ACTIVITY = "Cluster Activity" ``` 13. Dataset ``` RESOURCE_DATASET = "Datasets" ``` I can meet and discuss more tomorrow if needed, I just wanted to make sure that we do not "stick" to the current - IMHO broken model of FAB but we come up with something much more reasonable. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
