villebro commented on a change in pull request #16991:
URL: https://github.com/apache/superset/pull/16991#discussion_r758153548
##########
File path: tests/common/query_context_generator.py
##########
@@ -248,12 +302,77 @@ def generate(
return {
"datasource": {"id": table.id, "type": table.type},
"queries": [
- get_query_object(
+ self.generate_query_object(
query_name, add_postprocessing_operations,
add_time_offsets,
)
],
"result_type": ChartDataResultType.FULL,
}
+ @staticmethod
+ def generate_query_object(
+ query_name: str,
+ add_postprocessing_operations: bool = False,
+ add_time_offsets: bool = False,
+ **kwargs: Any,
+ ) -> Dict[Any, Any]:
+ if query_name in QUERY_OBJECTS:
+ obj = QUERY_OBJECTS[query_name]
+
+ # apply overrides
+ if ":" in query_name:
+ parent_query_name = query_name.split(":")[0]
+ obj = {
+ **QUERY_OBJECTS[parent_query_name],
+ **obj,
+ }
+ else:
+ obj = DEFAULT_VALUES
+
+ query_object = copy.deepcopy(obj)
+ if add_postprocessing_operations:
+ query_object["post_processing"] =
_get_postprocessing_operation(query_name)
+ if add_time_offsets:
+ query_object["time_offsets"] = ["1 year ago"]
+
+ if kwargs:
+ dict_merge(query_object, kwargs)
+
+ return query_object
+
+ @staticmethod
+ def generate_sql_expression_metric(
+ column_name="name", table_name="superset.ab_permission", **kwargs: Any
+ ) -> Dict[str, Any]:
+ return dict_merge(
+ {
+ "aggregate": "COUNT",
+ "column": {
+ "certification_details": None,
+ "certified_by": None,
+ "column_name": column_name,
+ "description": None,
+ "expression": None,
+ "filterable": True,
+ "groupby": True,
+ "id": 572,
+ "is_certified": False,
+ "is_dttm": False,
+ "python_date_format": None,
+ "type": "TEXT",
+ "type_generic": 1,
+ "verbose_name": None,
+ "warning_markdown": None,
+ },
+ "expressionType": "SQL",
+ "hasCustomLabel": False,
+ "isNew": False,
+ "label": "COUNT({})".format(column_name),
Review comment:
nit:
```suggestion
"label": f"COUNT({column_name})",
```
##########
File path: superset/security/manager.py
##########
@@ -14,8 +14,10 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
-# pylint: disable=too-many-lines
+# pylint: disable=too-many-lines, invalid-name
Review comment:
I don't think we should be adding global disables like this here
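For context, `pylint` also supports scoping a disable to just the offending line, which keeps the rest of the module checked. A minimal sketch with a hypothetical name (illustrative only, not taken from this PR):
```python
# Hypothetical module-level name that would trip the invalid-name check.
# Scoping the disable to this one line avoids a file-wide
# "# pylint: disable=invalid-name" at the top of manager.py.
securityManagerOverride = None  # pylint: disable=invalid-name
```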
##########
File path: superset/charts/data/commands/get_data_command.py
##########
@@ -14,6 +14,8 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
+from __future__ import annotations
+
Review comment:
nit: I don't believe we need these, as we're no longer actively
supporting 3.7 (will be fully deprecated shortly)
##########
File path: superset/security/manager.py
##########
@@ -974,14 +976,14 @@ def set_perm( # pylint: disable=unused-argument
)
def raise_for_access(
- # pylint: disable=too-many-arguments,too-many-locals
+ # pylint: disable=too-many-arguments
self,
- database: Optional["Database"] = None,
- datasource: Optional["BaseDatasource"] = None,
- query: Optional["Query"] = None,
- query_context: Optional["QueryContext"] = None,
- table: Optional["Table"] = None,
- viz: Optional["BaseViz"] = None,
+ database: Optional[Database] = None,
+ datasource: Optional[BaseDatasource] = None,
+ query: Optional[Query] = None,
+ query_context: Optional[QueryContext] = None,
+ table: Optional[Table] = None,
+ viz: Optional[BaseViz] = None,
Review comment:
Is this nowadays the correct way of referring to imports from
`TYPE_CHECKING`?
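For reference, the mechanism the diff relies on here is PEP 563 (postponed evaluation of annotations): with `from __future__ import annotations` at the top of the module, annotations are never evaluated at runtime, so names imported only under `TYPE_CHECKING` can be referenced unquoted. A minimal sketch of that pattern (simplified signature, not the actual method):
```python
from __future__ import annotations  # PEP 563: annotations become lazy strings

from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Imported only for type checkers; never imported at runtime.
    from superset.models.core import Database


def raise_for_access(database: Optional[Database] = None) -> None:
    # The unquoted Database annotation is safe at runtime because it is
    # never evaluated under postponed annotation semantics.
    ...
```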
##########
File path: superset/security/manager.py
##########
@@ -1047,17 +1048,23 @@ def raise_for_access(
assert datasource
- if not (
- self.can_access_schema(datasource)
- or self.can_access("datasource_access", datasource.perm or "")
- or (
- feature_flag_manager.is_feature_enabled("DASHBOARD_RBAC")
- and self.can_access_based_on_dashboard(datasource)
- )
- ):
- raise SupersetSecurityException(
- self.get_datasource_access_error_object(datasource)
- )
+ self.raise_when_there_is_no_access_to(datasource)
+
+ def raise_when_there_is_no_access_to(self, datasource: BaseDatasource) -> None:
Review comment:
This name should IMO follow the existing naming conventions. In this
case `raise_for_datasource_access` would be more appropriate and more aligned
with the pre-existing `raise_for_dashboard_access`.
##########
File path: tests/common/query_context_generator.py
##########
@@ -248,12 +302,77 @@ def generate(
return {
"datasource": {"id": table.id, "type": table.type},
"queries": [
- get_query_object(
+ self.generate_query_object(
query_name, add_postprocessing_operations,
add_time_offsets,
)
],
"result_type": ChartDataResultType.FULL,
}
+ @staticmethod
+ def generate_query_object(
+ query_name: str,
+ add_postprocessing_operations: bool = False,
+ add_time_offsets: bool = False,
+ **kwargs: Any,
+ ) -> Dict[Any, Any]:
+ if query_name in QUERY_OBJECTS:
+ obj = QUERY_OBJECTS[query_name]
+
+ # apply overrides
+ if ":" in query_name:
+ parent_query_name = query_name.split(":")[0]
+ obj = {
+ **QUERY_OBJECTS[parent_query_name],
+ **obj,
+ }
+ else:
+ obj = DEFAULT_VALUES
+
+ query_object = copy.deepcopy(obj)
+ if add_postprocessing_operations:
+ query_object["post_processing"] =
_get_postprocessing_operation(query_name)
+ if add_time_offsets:
+ query_object["time_offsets"] = ["1 year ago"]
+
+ if kwargs:
+ dict_merge(query_object, kwargs)
+
+ return query_object
+
+ @staticmethod
+ def generate_sql_expression_metric(
+ column_name="name", table_name="superset.ab_permission", **kwargs: Any
+ ) -> Dict[str, Any]:
+ return dict_merge(
+ {
+ "aggregate": "COUNT",
+ "column": {
+ "certification_details": None,
+ "certified_by": None,
+ "column_name": column_name,
+ "description": None,
+ "expression": None,
+ "filterable": True,
+ "groupby": True,
+ "id": 572,
+ "is_certified": False,
+ "is_dttm": False,
+ "python_date_format": None,
+ "type": "TEXT",
+ "type_generic": 1,
+ "verbose_name": None,
+ "warning_markdown": None,
+ },
+ "expressionType": "SQL",
+ "hasCustomLabel": False,
+ "isNew": False,
+ "label": "COUNT({})".format(column_name),
+ "optionName": "metric_bvvzfjgdg7_eawlsdp84tq",
+ "sqlExpression": "(SELECT count(name) FROM
{})".format(table_name),
Review comment:
nit:
```suggestion
"sqlExpression": f"(SELECT count(name) FROM {table_name})",
```
##########
File path: superset/charts/data/query_context_validator.py
##########
@@ -0,0 +1,168 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=invalid-name, no-self-use, too-few-public-methods
+from __future__ import annotations
+
+from typing import List, Optional, Set, TYPE_CHECKING
+
+from superset.charts.data.commands.get_data_command import QueryContextValidator
+from superset.errors import ErrorLevel, SupersetError, SupersetErrorType
+from superset.exceptions import SupersetSecurityException
+from superset.sql_parse import ParsedQuery, Table
+
+if TYPE_CHECKING:
+ from superset import SupersetSecurityManager
+ from superset.common.query_context import QueryContext
+ from superset.common.query_object import QueryObject
+ from superset.connectors.base.models import BaseDatasource
+ from superset.connectors.sqla.models import SqlaTable
+ from superset.datasets.dao import DatasetDAO
+ from superset.models.core import Database
+ from superset.typing import Metric
+
+
+class QueryContextValidatorImpl(QueryContextValidator):
+ _dataset_dao: DatasetDAO
+ _security_manager: SupersetSecurityManager
+
+ def __init__(
+ self, dataset_dao: DatasetDAO, security_manager: SupersetSecurityManager
+ ):
+ self._dataset_dao = dataset_dao
+ self._security_manager = security_manager
+
+ def validate(self, query_context: QueryContext) -> None:
+ self._validate_actor_can_access(query_context)
+ self._validate_queries_context(query_context)
+
+ def _validate_actor_can_access(self, query_context: QueryContext) -> None:
+ sql_database: Optional[Database] = query_context.get_database()
+ if sql_database is not None:
+ self._validate_when_context_based_on_sql_database(
+ query_context, sql_database
+ )
+ else:
+ self._security_manager.raise_for_access(query_context=query_context)
+
+ def _validate_when_context_based_on_sql_database(
Review comment:
This has come up in my previous review comments. While I agree that it's good practice to break out logical chunks to improve readability and enforce separation of concerns, there is a point where abstraction actually decreases readability. Each nested call adds cognitive overhead for the reviewer and dilutes the relevance of other, more important functions/methods. As a rule of thumb, if a private method 1) is only used once, 2) consists of fewer than 5 lines of code, and 3) further branches out into another private method that also satisfies 1) and 2), I would consider just inlining the code and adding a comment describing what is happening (if needed).
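To illustrate the rule of thumb with hypothetical names (a generic sketch, not the actual code from this PR):
```python
# Before: a single-use private method that only forwards to another
# single-use private method -- two hops for the reader to follow.
class ValidatorBefore:
    def validate(self, database):
        self._validate_database(database)

    def _validate_database(self, database):
        if database is not None:
            self._raise_if_not_allowed(database)

    def _raise_if_not_allowed(self, database):
        raise PermissionError(f"no access to {database}")


# After: the same logic inlined at the call site, with a short comment
# carrying the intent instead of a method name.
class ValidatorAfter:
    def validate(self, database):
        # SQL-backed contexts are validated against the database directly.
        if database is not None:
            self._raise_if_not_allowed(database)

    def _raise_if_not_allowed(self, database):
        raise PermissionError(f"no access to {database}")
```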
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]