[GitHub] [airflow] potiuk commented on a diff in pull request #33213: Implement `is_authorized()` in auth manager

via GitHub Fri, 25 Aug 2023 09:51:02 -0700


potiuk commented on code in PR #33213:
URL: https://github.com/apache/airflow/pull/33213#discussion_r1305910630



##########
airflow/auth/managers/models/resource_details.py:
##########
@@ -0,0 +1,31 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from dataclasses import dataclass
+
+
+@dataclass
+class ResourceDetails:
+    """
+    Represents the details of a resource.
+
+    All fields must be optional. These details can be used in authorization 
decision.
+    """
+
+    id: str | None = None

Review Comment:
   How about having a single class/protocol that have multiple 
`is_authorized_dag(DagDetails)`,  `is_authorized_task(TaskDetails)` methods?
   
   I am not super insisting on that but I just try to paint a picture of 
someone implementing and Auth Manager. If we have one method defined as it is 
currently, it will look more or less like that (unless I am mistaken and you 
can do it simpler):
   
   ```python
   class MyAuthManager():
         def is_authorized(
           self,
           action: ResourceAction,
           resource_type: str,
           resource_details: ResourceDetails | None = None,
           user: BaseUser | None = None,
       ) -> bool:
          if resource_type = DAG:
              dag_details = cast(DagDetails, details)
              # here dag authorisation code
          elif resource_type = TASK:
              task_details = cast(TaskDetails, details)
              # here task authorisation code
         elif ...
   ```
   
   Every single AuthManager will have to something like this I believe.
   
   Whereas if you do multiple methods, then it will look quite a bit easier 
IMHO.
   
   ```python
   class MyAuthManager():
         def is_authorized_dag(
           self,
           action: ResourceMethod,
           resource_details: DagDetails,
           user: BaseUser | None = None,
       ) -> bool:
              # here dag authorisation code
   
         def is_authorized_task(
           self,
           action: ResourceMethod,
           resource_details: TaskDetails,
           user: BaseUser | None = None,
       ) -> bool:
              # here tas authorisation code
   ```
   
   If you ask me - as implementor, I DEFINITELY prefer the second case. I think 
the single method/pass resource type is forcing the implementor to do all the 
stuff that in the second case would be much more easily done in the Auth 
framework - there is no switching of type of resource information and no 
checking for None.
   
   There are several other benefits of such solution IMHO:
   
   1) when you add new resource type - you do not have to add another "if/else" 
- you just add another method. And you immediately know if the implementing 
class supports it - you can even check if it has the attribute defined and 
adapt in the framwork to serve the case that in the future some optional 
resources might not even be available if the method is not defined. In the 
first case, if you add a new resource type, you have to add yet another method 
"is_supported(resource_type)" and run in on the created instance of the 
AuthManager. In the second case it is enough to check just if the Class (not 
even instance) of the auth manager has appropriate method defined. 
   
   2) the code for authentication of each type is smaller and more localized by 
default. In the first case with single is_auth_method, most implementation will 
have all the code in that method but as long as we have good programmers, they 
will - very quickly separate them to `_is_authorized_dag()`, 
`_is_authorized_task()` anyway - as soon as their if/elif/elif method will 
exceed 100 or so lines. So what eventually will happen - you will have those 
separate methods **anyway** - just without the benefit of your framework 
understanding, inspecting them .
   
   3) This also immediately encourages the nice "DRY" pattern. if you want to 
have some common code that you will have in both _dag and _task (and other) - 
the first thing to do you will extract it to a separate method that will be 
used in both. There is basically no other way. and it's good. In the first case 
those "bad" programmers will keep the code in the `is_authorized` method which 
will make it veeeery long and difficult to test and reason about.  We have a 
few of those in our codebase... My favourite one (so favourite I even created 
an issue for that https://github.com/apache/airflow/issues/33249 ) is 
`backfill_job_runner` 
https://github.com/apache/airflow/blob/main/airflow/jobs/backfill_job_runner.py#L428
 (but there are few others). By having a single `is_authorized` method - you 
are encouraging similar implementations I am afraid. By having them separate 
from day 0, you encourage more modular, focused and DRY implementations.
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [airflow] potiuk commented on a diff in pull request #33213: Implement `is_authorized()` in auth manager

Reply via email to