potiuk commented on code in PR #33213: URL: https://github.com/apache/airflow/pull/33213#discussion_r1305910630
########## airflow/auth/managers/models/resource_details.py: ########## @@ -0,0 +1,31 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +from __future__ import annotations + +from dataclasses import dataclass + + +@dataclass +class ResourceDetails: + """ + Represents the details of a resource. + + All fields must be optional. These details can be used in authorization decision. + """ + + id: str | None = None Review Comment: How about having a single class/protocol that have multiple `is_authorized_dag(DagDetails)`, `is_authorized_task(TaskDetails)` methods? I am not super insisting on that but I just try to paint a picture of someone implementing and Auth Manager. If we have one method defined as it is currently, it will look more or less like that (unless I am mistaken and you can do it simpler): ```python class MyAuthManager(): def is_authorized( self, action: ResourceAction, resource_type: str, resource_details: ResourceDetails | None = None, user: BaseUser | None = None, ) -> bool: if resource_type = DAG: dag_details = cast(DagDetails, details) # here dag authorisation code elif resource_type = TASK: task_details = cast(TaskDetails, details) # here task authorisation code elif ... ``` Every single AuthManager will have to something like this I believe. Whereas if you do multiple methods, then it will look quite a bit easier IMHO. ```python class MyAuthManager(): def is_authorized_dag( self, action: ResourceMethod, dag_details: DagDetails, user: BaseUser | None = None, ) -> bool: # here dag authorisation code def is_authorized_task( self, action: ResourceMethod, task_details: TaskDetails, user: BaseUser | None = None, ) -> bool: # here tas authorisation code ``` If you ask me - as implementor, I DEFINITELY prefer the second case. I think the single method/pass resource type is forcing the implementor to do all the stuff that in the second case would be much more easily done in the Auth framework - there is no switching of type of resource information and no checking for None. There are several other benefits of such solution IMHO: 1) when you add new resource type - you do not have to add another "if/else" - you just add another method. And you immediately know if the implementing class supports it - you can even check if it has the attribute defined and adapt in the framwork to serve the case that in the future some optional resources might not even be available if the method is not defined. In the first case, if you add a new resource type, you have to add yet another method "is_supported(resource_type)" and run in on the created instance of the AuthManager. In the second case it is enough to check just if the Class (not even instance) of the auth manager has appropriate method defined. 2) the code for authentication of each type is smaller and more localized by default. In the first case with single is_auth_method, most implementation will have all the code in that method but as long as we have good programmers, they will - very quickly separate them to `_is_authorized_dag()`, `_is_authorized_task()` anyway - as soon as their if/elif/elif method will exceed 100 or so lines. So what eventually will happen - you will have those separate methods **anyway** - just without the benefit of your framework understanding, inspecting them . 3) This also immediately encourages the nice "DRY" pattern. if you want to have some common code that you will have in both _dag and _task (and other) - the first thing to do you will extract it to a separate method that will be used in both. There is basically no other way. and it's good. In the first case those "bad" programmers will keep the code in the `is_authorized` method which will make it veeeery long and difficult to test and reason about. We have a few of those in our codebase... My favourite one (so favourite I even created an issue for that https://github.com/apache/airflow/issues/33249 ) is `backfill_job_runner` https://github.com/apache/airflow/blob/main/airflow/jobs/backfill_job_runner.py#L428 (but there are few others). By having a single `is_authorized` method - you are encouraging similar implementations I am afraid. By having them separate from day 0, you encourage more modular, focused and DRY implementations. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
