codeant-ai-for-open-source[bot] commented on code in PR #41218:
URL: https://github.com/apache/superset/pull/41218#discussion_r3440344055
##########
superset/security/manager.py:
##########
@@ -520,42 +520,87 @@ def _native_filter_request_modified(query_context:
"QueryContext") -> bool:
)
-def query_context_modified(query_context: "QueryContext") -> bool:
+def _collect_sortable_identifiers(
+ stored_chart: "Slice",
+ stored_query_context: Optional[dict[str, Any]],
+) -> set[str]:
"""
- Check if a query context has been modified.
-
- This is used to ensure guest users don't modify the payload and fetch data
- different from what was shared with them in dashboards.
+ Frozen column names and metric labels/definitions a guest may legitimately
+ sort by: every column or metric the stored chart already references.
+
+ Order-by only changes the ordering of the result, not which data is read,
so
+ any column or metric already part of the chart is a safe sort target. A
term
+ that is not present in the stored chart (for example a free-form
``random()``
+ expression) cannot be validated and must be rejected. Order-by entries are
+ ``(column_or_metric, ascending)`` pairs, so only their first element is
+ collected.
"""
- form_data = query_context.form_data
- stored_chart = query_context.slice_
+ allowed: set[str] = set()
- # Native-filter data requests have no associated chart (no slice_id).
Rather
- # than accepting any payload, constrain them to the column(s) the
dashboard's
- # native filter is allowed to target; other chartless paths keep prior
- # behavior (see _native_filter_request_modified).
- if stored_chart is None:
- return _native_filter_request_modified(query_context)
+ def add(values: Any) -> None:
+ for value in values or []:
+ allowed.add(freeze_value(value))
- if form_data is None:
- return False
+ def add_orderby(entries: Any) -> None:
+ for entry in entries or []:
+ if isinstance(entry, (list, tuple)) and entry:
+ allowed.add(freeze_value(entry[0]))
- # cannot request a different chart
- if form_data.get("slice_id") != stored_chart.id:
- return True
+ params = stored_chart.params_dict
+ for key in ("columns", "groupby", "metrics", "all_columns"):
+ add(params.get(key))
+ add_orderby(params.get("orderby"))
- stored_query_context = (
- json.loads(cast(str, stored_chart.query_context))
- if stored_chart.query_context
- else None
- )
+ if stored_query_context:
+ for query in stored_query_context.get("queries") or []:
+ for key in ("columns", "groupby", "metrics"):
+ add(query.get(key))
+ add_orderby(query.get("orderby"))
+
+ return allowed
- # compare columns and metrics in form_data with stored values
+
+def _orderby_modified(
+ query_context: "QueryContext",
+ stored_chart: "Slice",
+ stored_query_context: Optional[dict[str, Any]],
+) -> bool:
+ """
+ Whether any order-by clause sorts by a term the stored chart does not
already
+ reference.
+
+ A guest reordering an embedded chart by one of its existing columns or
+ metrics is legitimate and must not read as tampering; introducing a new
+ expression is not, and is rejected.
+ """
+ allowed = _collect_sortable_identifiers(stored_chart, stored_query_context)
+ form_data = query_context.form_data or {}
+ requested = list(form_data.get("orderby") or [])
+ for query in query_context.queries:
+ requested.extend(getattr(query, "orderby", None) or [])
+
+ for entry in requested:
+ term = entry[0] if isinstance(entry, (list, tuple)) and entry else
entry
+ if freeze_value(term) not in allowed:
+ return True
Review Comment:
**Suggestion:** The new order-by check validates only the first element of
each entry but never validates the entry shape. A malformed payload like
`orderby=["gender"]` or `orderby=[["gender"]]` can pass this guard (because the
term is allowed) and then fail later when query code unpacks each order-by item
as `(column, ascending)`, causing a runtime error for guest requests. Reject
order-by entries unless they are exactly two-item list/tuple pairs with a
boolean direction. [api mismatch]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Malformed orderby can crash chart compile/query execution.
- ❌ Guest payload guard fails to reject structurally invalid input.
- ⚠️ Embedded/ChartData callers can trigger unexpected 500 errors.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Trigger chart compilation via `_compile_chart` in
`superset/mcp_service/chart/compile.py:44-55` with a form_data payload
containing a
malformed order-by clause, e.g. `form_data["orderby"] = ["gender"]` and a
valid
metric/column `"gender"` on the chosen dataset.
2. `_compile_chart` builds a `QueryContext` by calling
`QueryContextFactory().create(...)`
and passes `queries=[{"columns": columns, "metrics": metrics, "orderby":
form_data.get("orderby", []), ...}]` (compile.py:42-55) so the malformed
`["gender"]` list
becomes the `orderby` field for the first `QueryObject`.
3. `ChartDataCommand.validate()` in
`superset/commands/chart/data/get_data_command.py:72-73` calls
`QueryContext.raise_for_access()`, which delegates to
`security_manager.raise_for_access(..., query_context=query_context)`; in
`SupersetSecurityManager.raise_for_access`
(`superset/security/manager.py:3436-3450`)
guest users invoke `query_context_modified(query_context)`, which calls
`_orderby_modified` (`superset/security/manager.py:84-107`). Inside
`_orderby_modified`,
the loop at lines 582‑585 iterates `entry = "gender"`, computes `term =
entry` (not a
list/tuple), finds `freeze_value(term)` in the allowed set from
`_collect_sortable_identifiers`, and therefore does not flag the malformed
shape as
tampering.
4. `ChartDataCommand.run()` then executes the query via
`QueryContext.get_payload` →
`QueryContextProcessor.get_df_payload`
(`superset/common/query_context_processor.py:82-90`) → datasource query
building, which
ultimately reaches `superset/models/helpers.py:3170-3172` where `for
orig_col, ascending
in orderby:` attempts to unpack each `orderby` element. Because `orderby`
contains a bare
string `"gender"` instead of `(column, bool)`, Python raises a `ValueError`
during
unpacking, causing a 500 error for this request instead of being rejected
early as an
invalid/tampered guest payload.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=83c77c93a5d940e3868c710ba5b07c4f&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=83c77c93a5d940e3868c710ba5b07c4f&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/security/manager.py
**Line:** 582:585
**Comment:**
*Api Mismatch: The new order-by check validates only the first element
of each entry but never validates the entry shape. A malformed payload like
`orderby=["gender"]` or `orderby=[["gender"]]` can pass this guard (because the
term is allowed) and then fail later when query code unpacks each order-by item
as `(column, ascending)`, causing a runtime error for guest requests. Reject
order-by entries unless they are exactly two-item list/tuple pairs with a
boolean direction.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41218&comment_hash=0976d53aaaa81935ccfc712039aba97b9b6d9ebbb43bc7e5098e242d2e877adf&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41218&comment_hash=0976d53aaaa81935ccfc712039aba97b9b6d9ebbb43bc7e5098e242d2e877adf&reaction=dislike'>👎</a>
##########
superset/security/manager.py:
##########
@@ -520,42 +520,87 @@ def _native_filter_request_modified(query_context:
"QueryContext") -> bool:
)
-def query_context_modified(query_context: "QueryContext") -> bool:
+def _collect_sortable_identifiers(
+ stored_chart: "Slice",
+ stored_query_context: Optional[dict[str, Any]],
+) -> set[str]:
"""
- Check if a query context has been modified.
-
- This is used to ensure guest users don't modify the payload and fetch data
- different from what was shared with them in dashboards.
+ Frozen column names and metric labels/definitions a guest may legitimately
+ sort by: every column or metric the stored chart already references.
+
+ Order-by only changes the ordering of the result, not which data is read,
so
+ any column or metric already part of the chart is a safe sort target. A
term
+ that is not present in the stored chart (for example a free-form
``random()``
+ expression) cannot be validated and must be rejected. Order-by entries are
+ ``(column_or_metric, ascending)`` pairs, so only their first element is
+ collected.
"""
- form_data = query_context.form_data
- stored_chart = query_context.slice_
+ allowed: set[str] = set()
- # Native-filter data requests have no associated chart (no slice_id).
Rather
- # than accepting any payload, constrain them to the column(s) the
dashboard's
- # native filter is allowed to target; other chartless paths keep prior
- # behavior (see _native_filter_request_modified).
- if stored_chart is None:
- return _native_filter_request_modified(query_context)
+ def add(values: Any) -> None:
+ for value in values or []:
+ allowed.add(freeze_value(value))
- if form_data is None:
- return False
+ def add_orderby(entries: Any) -> None:
+ for entry in entries or []:
+ if isinstance(entry, (list, tuple)) and entry:
+ allowed.add(freeze_value(entry[0]))
- # cannot request a different chart
- if form_data.get("slice_id") != stored_chart.id:
- return True
+ params = stored_chart.params_dict
+ for key in ("columns", "groupby", "metrics", "all_columns"):
+ add(params.get(key))
Review Comment:
**Suggestion:** Sortable identifiers are collected from
`columns/groupby/metrics/all_columns` only, but many legacy chart payloads
store a single metric under `metric` (singular). For charts without stored
`query_context`, this misses the chart's actual metric and wrongly flags
legitimate guest sorts by that metric as tampering. Include legacy singular
metric fields when building the allowed sort target set. [incomplete
implementation]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Embedded guests blocked on legacy metric-based charts.
- ⚠️ Legitimate metric sorts misclassified as payload tampering.
- ⚠️ Inconsistent behavior between legacy and modern chart configs.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Consider a legacy Big Number-style chart whose saved form_data (stored on
the `Slice`)
uses the singular `"metric"` field instead of the newer `"metrics"` list;
this legacy
behavior is documented in `_compile_chart` in
`superset/mcp_service/chart/compile.py:33-36`, which explicitly treats
`form_data["metric"]` as the metric when `metrics` is empty.
2. When this chart is embedded and loaded as a guest, the backend
reconstructs a
`QueryContext` (via `QueryContextFactory.create`,
`superset/common/query_context_factory.py:88-19`), and
`query_context_modified(query_context)` in
`superset/security/manager.py:152-195` is
invoked from `SupersetSecurityManager.raise_for_access`
(`security/manager.py:3436-3450`)
to enforce guest payload integrity.
3. Inside `query_context_modified`,
`_collect_sortable_identifiers(stored_chart,
stored_query_context)` is called (`security/manager.py:44-81`). For charts
without a
stored `query_context` JSON, `stored_query_context` is `None`, so the
allowed set is built
only from `stored_chart.params_dict`. The loop at lines 550‑551 iterates
keys `("columns",
"groupby", "metrics", "all_columns")` and never reads
`params_dict["metric"]`, so the
chart's actual legacy metric is omitted from the `allowed` sort identifiers.
4. If the chart's query uses that metric in `orderby` (for example, Big
Number's
`sort_by_metric` logic sets `query_obj["orderby"] =
[(query_obj["metrics"][0], False)]` in
`superset/viz.py:1277-1279`), then `_orderby_modified`
(`security/manager.py:84-107`) will
see each order-by term's `freeze_value(term)` not present in the `allowed`
set and return
`True`. This causes `query_context_modified` to return `True` and
`SupersetSecurityManager.raise_for_access` to raise a
`SupersetSecurityException` with
message "Guest user cannot modify chart payload"
(`security/manager.py:3436-3450`),
wrongly blocking legitimate guest views/sorts on such legacy charts.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=e4cb6dfef3e04ef8a46b74e4e9088684&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=e4cb6dfef3e04ef8a46b74e4e9088684&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/security/manager.py
**Line:** 550:551
**Comment:**
*Incomplete Implementation: Sortable identifiers are collected from
`columns/groupby/metrics/all_columns` only, but many legacy chart payloads
store a single metric under `metric` (singular). For charts without stored
`query_context`, this misses the chart's actual metric and wrongly flags
legitimate guest sorts by that metric as tampering. Include legacy singular
metric fields when building the allowed sort target set.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41218&comment_hash=912bde1944837a6c7753a34c3c7238f66b708c0a69f8e9623cc59748f360b408&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41218&comment_hash=912bde1944837a6c7753a34c3c7238f66b708c0a69f8e9623cc59748f360b408&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]