codeant-ai-for-open-source[bot] commented on code in PR #37371:
URL: https://github.com/apache/superset/pull/37371#discussion_r2738631824


##########
superset/security/manager.py:
##########
@@ -212,6 +213,182 @@ def freeze_value(value: Any) -> str:
     return json.dumps(value, sort_keys=True)
 
 
+# -----------------------------------------------------------------------------
+# Field comparison configuration for query_context_modified()
+# -----------------------------------------------------------------------------
+
+
+class FieldMapping(TypedDict):
+    """Configuration for comparing a field between requested and stored 
values."""
+
+    field: str
+    equivalent: list[str]
+    comparator: NotRequired[
+        Callable[
+            ["QueryContext", "Slice", set[str]],  # query_context, 
stored_chart, visible
+            bool,  # True if modified (should block)
+        ]
+    ]
+
+
+def _extract_orderby_column_name(orderby_item: Any) -> str | None:
+    """
+    Extract column/metric name from an orderby tuple element.
+
+    Returns None for SQL expressions which should be blocked for security.
+    """
+    if isinstance(orderby_item, str):
+        return orderby_item
+
+    if isinstance(orderby_item, dict):
+        # Block adhoc SQL expressions - potential injection vector
+        if orderby_item.get("expressionType") == "SQL":
+            return None
+        if label := orderby_item.get("label"):
+            return label
+        if col := orderby_item.get("column"):
+            if isinstance(col, dict):
+                return col.get("column_name")
+            return None
+
+    return None
+
+
+def _get_visible_columns(stored_chart: "Slice") -> set[str]:
+    """
+    Extract column/metric names visible in the chart.
+
+    Guest users can only sort by these columns (whitelist approach).
+    """
+    params = stored_chart.params_dict
+    visible: set[str] = set()
+
+    for col in params.get("columns") or []:
+        if isinstance(col, str):
+            visible.add(col)
+        elif isinstance(col, dict) and (label := col.get("label")):
+            visible.add(label)
+
+    for col in params.get("groupby") or []:
+        if isinstance(col, str):
+            visible.add(col)
+        elif isinstance(col, dict) and (label := col.get("label")):
+            visible.add(label)
+
+    for metric in params.get("metrics") or []:
+        if isinstance(metric, str):
+            visible.add(metric)
+        elif isinstance(metric, dict) and (label := metric.get("label")):
+            visible.add(label)
+
+    return visible
+
+
+def _validate_orderby_list(
+    orderby: Any,
+    visible_columns: set[str],
+) -> bool:
+    """
+    Validate a single orderby list against visible columns.
+
+    Returns True if invalid (should block), False if valid.
+    """
+    if orderby is not None and not isinstance(orderby, list):
+        return True
+    for orderby_tuple in orderby or []:
+        if not isinstance(orderby_tuple, (list, tuple)) or len(orderby_tuple) 
< 1:
+            return True
+        col_name = _extract_orderby_column_name(orderby_tuple[0])

Review Comment:
   **Suggestion:** `_validate_orderby_list` relies on 
`_extract_orderby_column_name` but treats dict-based orderby items whose 
`"column"` key is a plain string as invalid (because 
`_extract_orderby_column_name` returns None for that shape). Extract the column 
name for dicts with a string `"column"` key before delegating to 
`_extract_orderby_column_name`, and accept outer `orderby` that are tuples as 
well. [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Guest sorting blocked for dict-shaped orderby items.
   - ⚠️ Blocks valid column sorts with `{"column":"col"}` shape.
   - ⚠️ Affects embedded dashboard sort UX.
   ```
   </details>
   
   ```suggestion
       # Accept both lists and tuples as the outer container
       if orderby is not None and not isinstance(orderby, (list, tuple)):
           return True
       for orderby_tuple in orderby or []:
           if not isinstance(orderby_tuple, (list, tuple)) or 
len(orderby_tuple) < 1:
               return True
           first = orderby_tuple[0]
           # Handle common shapes:
           # - string column names
           # - dicts with 'label' or 'expressionType'
           # - dicts with 'column' which may be either a dict or a plain string
           col_name = None
           if isinstance(first, dict):
               # If column is a plain string (e.g., {"column": "col_name"}), 
accept it.
               col = first.get("column")
               if isinstance(col, str):
                   col_name = col
               else:
                   # Fall back to the shared extractor which handles dict 
shapes,
                   # including blocking SQL expressionType.
                   col_name = _extract_orderby_column_name(first)
           else:
               col_name = _extract_orderby_column_name(first)
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Inspect superset/security/manager.py: the helper `_validate_orderby_list` 
is added in
   the PR hunk starting at line 287 and its outer-type guard lives at line 296 
(`if orderby
   is not None and not isinstance(orderby, list): return True`).
   
   2. Consider an orderby element shaped as a dict with a plain `column` 
string, for example:
   `orderby = [["", {"column": "user_email"}]]` or a tuple-equivalent attached 
to
   `query.orderby` or `form_data["orderby"]`. Such shapes can be produced by 
code that
   encodes column references using `{"column": "col_name"}`.
   
   3. `_validate_orderby_list` currently delegates extraction to
   `_extract_orderby_column_name(first)` (line 301). 
`_extract_orderby_column_name` returns
   `None` when the `column` key is present but is a plain string (it only 
extracts when
   `column` is a dict and returns its `column_name`), causing `col_name` to be 
None.
   
   4. Because `col_name` is None, `_validate_orderby_list` returns True (line 
302-303),
   causing `_orderby_whitelist_compare` to treat the orderby as invalid and 
block the
   request. The blocked request surfaces as the guest-facing "Guest user cannot 
modify chart
   payload" when `query_context_modified()` is used.
   
   Note: The code already intentionally blocks dicts with `expressionType: 
"SQL"` (security).
   This suggestion only affects dicts where `column` is a plain string — a 
common, non-SQL
   shape that should be accepted and whitelisted if the column is visible.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/security/manager.py
   **Line:** 296:301
   **Comment:**
        *Logic Error: `_validate_orderby_list` relies on 
`_extract_orderby_column_name` but treats dict-based orderby items whose 
`"column"` key is a plain string as invalid (because 
`_extract_orderby_column_name` returns None for that shape). Extract the column 
name for dicts with a string `"column"` key before delegating to 
`_extract_orderby_column_name`, and accept outer `orderby` that are tuples as 
well.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/security/manager.py:
##########
@@ -212,6 +213,182 @@ def freeze_value(value: Any) -> str:
     return json.dumps(value, sort_keys=True)
 
 
+# -----------------------------------------------------------------------------
+# Field comparison configuration for query_context_modified()
+# -----------------------------------------------------------------------------
+
+
+class FieldMapping(TypedDict):
+    """Configuration for comparing a field between requested and stored 
values."""
+
+    field: str
+    equivalent: list[str]
+    comparator: NotRequired[
+        Callable[
+            ["QueryContext", "Slice", set[str]],  # query_context, 
stored_chart, visible
+            bool,  # True if modified (should block)
+        ]
+    ]
+
+
+def _extract_orderby_column_name(orderby_item: Any) -> str | None:
+    """
+    Extract column/metric name from an orderby tuple element.
+
+    Returns None for SQL expressions which should be blocked for security.
+    """
+    if isinstance(orderby_item, str):
+        return orderby_item
+
+    if isinstance(orderby_item, dict):
+        # Block adhoc SQL expressions - potential injection vector
+        if orderby_item.get("expressionType") == "SQL":
+            return None
+        if label := orderby_item.get("label"):
+            return label
+        if col := orderby_item.get("column"):
+            if isinstance(col, dict):
+                return col.get("column_name")
+            return None
+
+    return None
+
+
+def _get_visible_columns(stored_chart: "Slice") -> set[str]:
+    """
+    Extract column/metric names visible in the chart.
+
+    Guest users can only sort by these columns (whitelist approach).
+    """
+    params = stored_chart.params_dict
+    visible: set[str] = set()
+
+    for col in params.get("columns") or []:
+        if isinstance(col, str):
+            visible.add(col)
+        elif isinstance(col, dict) and (label := col.get("label")):
+            visible.add(label)
+
+    for col in params.get("groupby") or []:
+        if isinstance(col, str):
+            visible.add(col)
+        elif isinstance(col, dict) and (label := col.get("label")):
+            visible.add(label)
+
+    for metric in params.get("metrics") or []:
+        if isinstance(metric, str):
+            visible.add(metric)
+        elif isinstance(metric, dict) and (label := metric.get("label")):
+            visible.add(label)
+
+    return visible
+
+
+def _validate_orderby_list(
+    orderby: Any,
+    visible_columns: set[str],
+) -> bool:
+    """
+    Validate a single orderby list against visible columns.
+
+    Returns True if invalid (should block), False if valid.
+    """
+    if orderby is not None and not isinstance(orderby, list):
+        return True
+    for orderby_tuple in orderby or []:
+        if not isinstance(orderby_tuple, (list, tuple)) or len(orderby_tuple) 
< 1:
+            return True
+        col_name = _extract_orderby_column_name(orderby_tuple[0])
+        if col_name is None or col_name not in visible_columns:
+            return True
+    return False
+
+
+def _orderby_whitelist_compare(
+    query_context: "QueryContext",
+    stored_chart: "Slice",
+    visible_columns: set[str],
+) -> bool:
+    """
+    Compare orderby using whitelist approach.
+
+    Allows sorting by any visible column, blocks hidden columns and SQL 
expressions.
+    Returns True if modified (request should be blocked).
+
+    Defensive barriers:
+    - If orderby is not a list, block (fail-closed)
+    - If orderby element is not a tuple/list, block (fail-closed)
+    """
+    form_data = query_context.form_data or {}
+
+    # Check form_data orderby
+    if _validate_orderby_list(form_data.get("orderby"), visible_columns):
+        return True
+
+    # Check queries orderby
+    for query in query_context.queries:
+        if _validate_orderby_list(getattr(query, "orderby", None), 
visible_columns):

Review Comment:
   **Suggestion:** The comparator iterates `query_context.queries` and directly 
validates `query.orderby`; if a `query.orderby` is a tuple it will be 
considered invalid and block the request. Normalize tuple orderby values from 
each `query` to a list before calling `_validate_orderby_list`. [possible bug]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Embedded guest dashboard sorting blocked (server-side queries).
   - ⚠️ Stored QueryContext deserialization may be rejected.
   - ⚠️ Affects guest UX when query.orderby shape varies.
   ```
   </details>
   
   ```suggestion
           orderby = getattr(query, "orderby", None)
           if isinstance(orderby, tuple):
               orderby = list(orderby)
           if _validate_orderby_list(orderby, visible_columns):
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Open superset/security/manager.py and inspect 
`_orderby_whitelist_compare` (hunk
   beginning around line 312). The loop over `query_context.queries` starts at 
line 329 and
   performs the validation call at line 330: `if 
_validate_orderby_list(getattr(query,
   "orderby", None), visible_columns):`.
   
   2. Produce a QueryContext where one of the `query` objects has its 
`.orderby` attribute
   set to a Python tuple (this can occur if server-side code constructs Query 
objects with
   tuple-valued orderby, or if a stored `query_context` decoded from JSON is 
converted into
   tuple structures before being attached to `query_context.queries`).
   
   3. When `_validate_orderby_list` (defined at lines 287-304) receives a tuple 
for
   `orderby`, the first guard (`if orderby is not None and not 
isinstance(orderby, list):
   return True`) treats it as invalid and returns True.
   
   4. The `_orderby_whitelist_compare` loop then returns True (line 330), 
causing
   `query_context_modified()` to report modification and
   `SupersetSecurityManager.raise_for_access()` to raise a 
`SupersetSecurityException`,
   blocking the request (guest sorting blocked).
   
   Note: This reproducer traces the exact call site that validates 
`query.orderby` (line
   330). If all code paths that build `query.orderby` use lists, the issue 
won't surface. If
   some server-side code uses tuples, this is a real problem.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/security/manager.py
   **Line:** 330:330
   **Comment:**
        *Possible Bug: The comparator iterates `query_context.queries` and 
directly validates `query.orderby`; if a `query.orderby` is a tuple it will be 
considered invalid and block the request. Normalize tuple orderby values from 
each `query` to a list before calling `_validate_orderby_list`.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to