codeant-ai-for-open-source[bot] commented on code in PR #37371:
URL: https://github.com/apache/superset/pull/37371#discussion_r2738631824
##########
superset/security/manager.py:
##########
@@ -212,6 +213,182 @@ def freeze_value(value: Any) -> str:
return json.dumps(value, sort_keys=True)
+# -----------------------------------------------------------------------------
+# Field comparison configuration for query_context_modified()
+# -----------------------------------------------------------------------------
+
+
+class FieldMapping(TypedDict):
+ """Configuration for comparing a field between requested and stored
values."""
+
+ field: str
+ equivalent: list[str]
+ comparator: NotRequired[
+ Callable[
+ ["QueryContext", "Slice", set[str]], # query_context,
stored_chart, visible
+ bool, # True if modified (should block)
+ ]
+ ]
+
+
+def _extract_orderby_column_name(orderby_item: Any) -> str | None:
+ """
+ Extract column/metric name from an orderby tuple element.
+
+ Returns None for SQL expressions which should be blocked for security.
+ """
+ if isinstance(orderby_item, str):
+ return orderby_item
+
+ if isinstance(orderby_item, dict):
+ # Block adhoc SQL expressions - potential injection vector
+ if orderby_item.get("expressionType") == "SQL":
+ return None
+ if label := orderby_item.get("label"):
+ return label
+ if col := orderby_item.get("column"):
+ if isinstance(col, dict):
+ return col.get("column_name")
+ return None
+
+ return None
+
+
+def _get_visible_columns(stored_chart: "Slice") -> set[str]:
+ """
+ Extract column/metric names visible in the chart.
+
+ Guest users can only sort by these columns (whitelist approach).
+ """
+ params = stored_chart.params_dict
+ visible: set[str] = set()
+
+ for col in params.get("columns") or []:
+ if isinstance(col, str):
+ visible.add(col)
+ elif isinstance(col, dict) and (label := col.get("label")):
+ visible.add(label)
+
+ for col in params.get("groupby") or []:
+ if isinstance(col, str):
+ visible.add(col)
+ elif isinstance(col, dict) and (label := col.get("label")):
+ visible.add(label)
+
+ for metric in params.get("metrics") or []:
+ if isinstance(metric, str):
+ visible.add(metric)
+ elif isinstance(metric, dict) and (label := metric.get("label")):
+ visible.add(label)
+
+ return visible
+
+
+def _validate_orderby_list(
+ orderby: Any,
+ visible_columns: set[str],
+) -> bool:
+ """
+ Validate a single orderby list against visible columns.
+
+ Returns True if invalid (should block), False if valid.
+ """
+ if orderby is not None and not isinstance(orderby, list):
+ return True
+ for orderby_tuple in orderby or []:
+ if not isinstance(orderby_tuple, (list, tuple)) or len(orderby_tuple)
< 1:
+ return True
+ col_name = _extract_orderby_column_name(orderby_tuple[0])
Review Comment:
**Suggestion:** `_validate_orderby_list` relies on
`_extract_orderby_column_name` but treats dict-based orderby items whose
`"column"` key is a plain string as invalid (because
`_extract_orderby_column_name` returns None for that shape). Extract the column
name for dicts with a string `"column"` key before delegating to
`_extract_orderby_column_name`, and accept outer `orderby` that are tuples as
well. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Guest sorting blocked for dict-shaped orderby items.
- ⚠️ Blocks valid column sorts with `{"column":"col"}` shape.
- ⚠️ Affects embedded dashboard sort UX.
```
</details>
```suggestion
# Accept both lists and tuples as the outer container
if orderby is not None and not isinstance(orderby, (list, tuple)):
return True
for orderby_tuple in orderby or []:
if not isinstance(orderby_tuple, (list, tuple)) or
len(orderby_tuple) < 1:
return True
first = orderby_tuple[0]
# Handle common shapes:
# - string column names
# - dicts with 'label' or 'expressionType'
# - dicts with 'column' which may be either a dict or a plain string
col_name = None
if isinstance(first, dict):
# If column is a plain string (e.g., {"column": "col_name"}),
accept it.
col = first.get("column")
if isinstance(col, str):
col_name = col
else:
# Fall back to the shared extractor which handles dict
shapes,
# including blocking SQL expressionType.
col_name = _extract_orderby_column_name(first)
else:
col_name = _extract_orderby_column_name(first)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Inspect superset/security/manager.py: the helper `_validate_orderby_list`
is added in
the PR hunk starting at line 287 and its outer-type guard lives at line 296
(`if orderby
is not None and not isinstance(orderby, list): return True`).
2. Consider an orderby element shaped as a dict with a plain `column`
string, for example:
`orderby = [["", {"column": "user_email"}]]` or a tuple-equivalent attached
to
`query.orderby` or `form_data["orderby"]`. Such shapes can be produced by
code that
encodes column references using `{"column": "col_name"}`.
3. `_validate_orderby_list` currently delegates extraction to
`_extract_orderby_column_name(first)` (line 301).
`_extract_orderby_column_name` returns
`None` when the `column` key is present but is a plain string (it only
extracts when
`column` is a dict and returns its `column_name`), causing `col_name` to be
None.
4. Because `col_name` is None, `_validate_orderby_list` returns True (line
302-303),
causing `_orderby_whitelist_compare` to treat the orderby as invalid and
block the
request. The blocked request surfaces as the guest-facing "Guest user cannot
modify chart
payload" when `query_context_modified()` is used.
Note: The code already intentionally blocks dicts with `expressionType:
"SQL"` (security).
This suggestion only affects dicts where `column` is a plain string — a
common, non-SQL
shape that should be accepted and whitelisted if the column is visible.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/security/manager.py
**Line:** 296:301
**Comment:**
*Logic Error: `_validate_orderby_list` relies on
`_extract_orderby_column_name` but treats dict-based orderby items whose
`"column"` key is a plain string as invalid (because
`_extract_orderby_column_name` returns None for that shape). Extract the column
name for dicts with a string `"column"` key before delegating to
`_extract_orderby_column_name`, and accept outer `orderby` that are tuples as
well.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
superset/security/manager.py:
##########
@@ -212,6 +213,182 @@ def freeze_value(value: Any) -> str:
return json.dumps(value, sort_keys=True)
+# -----------------------------------------------------------------------------
+# Field comparison configuration for query_context_modified()
+# -----------------------------------------------------------------------------
+
+
+class FieldMapping(TypedDict):
+ """Configuration for comparing a field between requested and stored
values."""
+
+ field: str
+ equivalent: list[str]
+ comparator: NotRequired[
+ Callable[
+ ["QueryContext", "Slice", set[str]], # query_context,
stored_chart, visible
+ bool, # True if modified (should block)
+ ]
+ ]
+
+
+def _extract_orderby_column_name(orderby_item: Any) -> str | None:
+ """
+ Extract column/metric name from an orderby tuple element.
+
+ Returns None for SQL expressions which should be blocked for security.
+ """
+ if isinstance(orderby_item, str):
+ return orderby_item
+
+ if isinstance(orderby_item, dict):
+ # Block adhoc SQL expressions - potential injection vector
+ if orderby_item.get("expressionType") == "SQL":
+ return None
+ if label := orderby_item.get("label"):
+ return label
+ if col := orderby_item.get("column"):
+ if isinstance(col, dict):
+ return col.get("column_name")
+ return None
+
+ return None
+
+
+def _get_visible_columns(stored_chart: "Slice") -> set[str]:
+ """
+ Extract column/metric names visible in the chart.
+
+ Guest users can only sort by these columns (whitelist approach).
+ """
+ params = stored_chart.params_dict
+ visible: set[str] = set()
+
+ for col in params.get("columns") or []:
+ if isinstance(col, str):
+ visible.add(col)
+ elif isinstance(col, dict) and (label := col.get("label")):
+ visible.add(label)
+
+ for col in params.get("groupby") or []:
+ if isinstance(col, str):
+ visible.add(col)
+ elif isinstance(col, dict) and (label := col.get("label")):
+ visible.add(label)
+
+ for metric in params.get("metrics") or []:
+ if isinstance(metric, str):
+ visible.add(metric)
+ elif isinstance(metric, dict) and (label := metric.get("label")):
+ visible.add(label)
+
+ return visible
+
+
+def _validate_orderby_list(
+ orderby: Any,
+ visible_columns: set[str],
+) -> bool:
+ """
+ Validate a single orderby list against visible columns.
+
+ Returns True if invalid (should block), False if valid.
+ """
+ if orderby is not None and not isinstance(orderby, list):
+ return True
+ for orderby_tuple in orderby or []:
+ if not isinstance(orderby_tuple, (list, tuple)) or len(orderby_tuple)
< 1:
+ return True
+ col_name = _extract_orderby_column_name(orderby_tuple[0])
+ if col_name is None or col_name not in visible_columns:
+ return True
+ return False
+
+
+def _orderby_whitelist_compare(
+ query_context: "QueryContext",
+ stored_chart: "Slice",
+ visible_columns: set[str],
+) -> bool:
+ """
+ Compare orderby using whitelist approach.
+
+ Allows sorting by any visible column, blocks hidden columns and SQL
expressions.
+ Returns True if modified (request should be blocked).
+
+ Defensive barriers:
+ - If orderby is not a list, block (fail-closed)
+ - If orderby element is not a tuple/list, block (fail-closed)
+ """
+ form_data = query_context.form_data or {}
+
+ # Check form_data orderby
+ if _validate_orderby_list(form_data.get("orderby"), visible_columns):
+ return True
+
+ # Check queries orderby
+ for query in query_context.queries:
+ if _validate_orderby_list(getattr(query, "orderby", None),
visible_columns):
Review Comment:
**Suggestion:** The comparator iterates `query_context.queries` and directly
validates `query.orderby`; if a `query.orderby` is a tuple it will be
considered invalid and block the request. Normalize tuple orderby values from
each `query` to a list before calling `_validate_orderby_list`. [possible bug]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Embedded guest dashboard sorting blocked (server-side queries).
- ⚠️ Stored QueryContext deserialization may be rejected.
- ⚠️ Affects guest UX when query.orderby shape varies.
```
</details>
```suggestion
orderby = getattr(query, "orderby", None)
if isinstance(orderby, tuple):
orderby = list(orderby)
if _validate_orderby_list(orderby, visible_columns):
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open superset/security/manager.py and inspect
`_orderby_whitelist_compare` (hunk
beginning around line 312). The loop over `query_context.queries` starts at
line 329 and
performs the validation call at line 330: `if
_validate_orderby_list(getattr(query,
"orderby", None), visible_columns):`.
2. Produce a QueryContext where one of the `query` objects has its
`.orderby` attribute
set to a Python tuple (this can occur if server-side code constructs Query
objects with
tuple-valued orderby, or if a stored `query_context` decoded from JSON is
converted into
tuple structures before being attached to `query_context.queries`).
3. When `_validate_orderby_list` (defined at lines 287-304) receives a tuple
for
`orderby`, the first guard (`if orderby is not None and not
isinstance(orderby, list):
return True`) treats it as invalid and returns True.
4. The `_orderby_whitelist_compare` loop then returns True (line 330),
causing
`query_context_modified()` to report modification and
`SupersetSecurityManager.raise_for_access()` to raise a
`SupersetSecurityException`,
blocking the request (guest sorting blocked).
Note: This reproducer traces the exact call site that validates
`query.orderby` (line
330). If all code paths that build `query.orderby` use lists, the issue
won't surface. If
some server-side code uses tuples, this is a real problem.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/security/manager.py
**Line:** 330:330
**Comment:**
*Possible Bug: The comparator iterates `query_context.queries` and
directly validates `query.orderby`; if a `query.orderby` is a tuple it will be
considered invalid and block the request. Normalize tuple orderby values from
each `query` to a list before calling `_validate_orderby_list`.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]