codeant-ai-for-open-source[bot] commented on code in PR #40392:
URL: https://github.com/apache/superset/pull/40392#discussion_r3319513354
##########
superset/connectors/sqla/models.py:
##########
@@ -867,6 +870,57 @@ def values_for_column(self, column_name: str, limit: int =
10000) -> list[Any]:
raise NotImplementedError()
+def validate_stored_expression(
+ database: Database,
+ catalog: str | None,
+ schema: str | None,
+ expression: str | None,
+) -> None:
+ """
+ Apply the adhoc-expression validator to a stored column or metric
expression.
+
+ Wrapping in a synthetic ``SELECT <expr>`` reuses the column-position parser
+ rules already enforced for adhoc expressions, so the same policy on
+ sub-queries, set operations, and multi-statement SQL applies to stored
+ expressions when they are saved.
+ """
+ if not expression:
+ return
+ engine = database.backend
+ wrapped = f"SELECT {expression}"
+
+ try:
+ parsed = SQLStatement(wrapped, engine)
+ except SupersetParseError as ex:
+ raise SupersetSecurityException(
+ SupersetError(
+ error_type=SupersetErrorType.ADHOC_SUBQUERY_NOT_ALLOWED_ERROR,
+ message=_(
+ "Custom SQL fields cannot be parsed as a single SQL
statement."
+ ),
+ level=ErrorLevel.ERROR,
+ )
+ ) from ex
Review Comment:
**Suggestion:** This new validator parses stored expressions as raw SQL
without processing Jinja template syntax first, so valid existing metric/column
expressions like `{{ current_username() }}` or `{{ url_param(...) }}` will now
fail dataset updates with a parse/security error. Preserve current behavior by
handling templated expressions before SQL parsing (or safely replacing Jinja
blocks before parsing) so supported stored Jinja expressions are not rejected.
[api mismatch]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ UpdateDatasetCommand rejects Jinja-based stored column expressions.
- ❌ UpdateDatasetCommand rejects Jinja-based stored metric expressions.
- ⚠️ Dataset edit API fails when Jinja expressions present.
- ⚠️ Existing Jinja datasets become unsaveable when updating metadata.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Verify that stored dataset columns explicitly support Jinja expressions
by inspecting
`tests/integration_tests/sqla_models_tests.py:160-31`, where a `SqlaTable`
is created and
a `TableColumn` is added with `expression=\"case when '{{ current_username()
}}' = 'abc'
then 'yes' else 'no' end'\" and the query generated via
`table.get_sqla_query()` executes
correctly with Jinja templating applied.
2. Use the existing unit-test pattern in
`tests/unit_tests/commands/dataset/update_test.py:59-97`
(`test_update_dataset_accepts_benign_expression`), but change the payload so
that
`payload['columns'][0]['expression']` is the same Jinja expression from step
1 (e.g.
`\"case when '{{ current_username() }}' = 'abc' then 'yes' else 'no' end\"`)
instead of
the plain `CASE` expression.
3. Run `UpdateDatasetCommand(1, payload).validate()` as done in
`update_test.py:59-97`;
inside `UpdateDatasetCommand.validate` in
`superset/commands/dataset/update.py:98-124`,
`_validate_semantics` calls `_validate_expressions` (line ~292) which
resolves `database`,
`catalog`, and `schema` and then calls `validate_stored_expression(database,
catalog,
schema, expression)` for each item with a non-empty `expression` (see
`update.py:55-61`
where `validate_stored_expression` is invoked).
4. In `validate_stored_expression` at
`superset/connectors/sqla/models.py:873-83`, the
function builds `wrapped = f"SELECT {expression}"` (line 890) and
instantiates
`SQLStatement(wrapped, engine)` (line 893) without running the expression
through a Jinja
`BaseTemplateProcessor`; `SQLStatement` in `superset/sql/parse.py:520-181`
uses
`sqlglot.parse` on the raw string and raises `SupersetParseError` when it
encounters the
Jinja tokens (`{{ current_username() }}`), which is caught and re-raised as
`SupersetSecurityException` with message `"Custom SQL fields cannot be
parsed as a single
SQL statement."`. `_validate_expressions` catches this and appends a
`ValidationError` on
`columns.0.expression`, causing `UpdateDatasetCommand.validate()` (and thus
dataset
updates via this command) to fail for stored columns/metrics that currently
and
intentionally use Jinja expressions.
```
</details>
[Fix in
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=069370558e1e4824941e4aab6379a270&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
| [Fix in VSCode
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=069370558e1e4824941e4aab6379a270&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/connectors/sqla/models.py
**Line:** 890:903
**Comment:**
*Api Mismatch: This new validator parses stored expressions as raw SQL
without processing Jinja template syntax first, so valid existing metric/column
expressions like `{{ current_username() }}` or `{{ url_param(...) }}` will now
fail dataset updates with a parse/security error. Preserve current behavior by
handling templated expressions before SQL parsing (or safely replacing Jinja
blocks before parsing) so supported stored Jinja expressions are not rejected.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40392&comment_hash=7e6c248d2d97e5669cccfdbbdef88faf6dba82ebdaf519b10054f8b55fd5c56d&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40392&comment_hash=7e6c248d2d97e5669cccfdbbdef88faf6dba82ebdaf519b10054f8b55fd5c56d&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]