codeant-ai-for-open-source[bot] commented on code in PR #38716:
URL: https://github.com/apache/superset/pull/38716#discussion_r3481484645
##########
superset/models/helpers.py:
##########
@@ -2724,6 +2724,40 @@ def adhoc_column_to_sqla(
) -> tuple[ColumnElement, Optional[GenericDataType]]:
raise NotImplementedError()
+ def find_adhoc_column_and_convert_to_sqla(
+ self,
+ columns: list[Column],
+ label: str,
+ template_processor: Optional[BaseTemplateProcessor] = None,
+ ) -> Optional[ColumnElement]:
+ """
+ Find an adhoc column by its label and convert it to a SQLAlchemy
column.
+
+ This helper method searches through a list of columns for an adhoc
column
+ with a matching label and converts it to a SQLAlchemy column
expression.
+
+ Args:
+ columns: List of columns to search through
+ label: The label to match against adhoc columns
+ template_processor: Optional template processor for SQL templating
+
+ Returns:
+ SQLAlchemy column element if found, None otherwise
+ """
+ adhoc_col = None
+ for c in columns:
+ if utils.is_adhoc_column(c):
+ if c.get("label") == label:
+ adhoc_col = c
+ break
+ if adhoc_col:
+ sqla_col, _ = self.adhoc_column_to_sqla(
+ col=adhoc_col,
+ template_processor=template_processor,
+ )
+ return sqla_col
+ return None
Review Comment:
**Suggestion:** The new helper directly calls `adhoc_column_to_sqla` without
handling `ColumnNotFoundException`, unlike the existing filter path that
rejects invalid adhoc filters gracefully. If a matched adhoc column requires
type probing (for example with time grain) and fails resolution, this can
bubble up as an unhandled query failure instead of being treated as a rejected
filter/order-by column. Catch `ColumnNotFoundException` in this helper (or in
the new call sites) and return `None` so callers can keep the existing graceful
behavior. [possible bug]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Invalid adhoc label crashes dataset queries with 404.
- ⚠️ Users see hard failures instead of graceful filter rejection.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Define an adhoc SQL column on a `SqlaTable` dataset (model at
`superset/connectors/sqla/models.py:1294`) whose expression cannot be
resolved by the
database, for example by referencing a non-existent physical column, and
give it a label
like `"invalid_expr"` so it appears in the query `columns` list passed into
`get_sqla_query` (`superset/models/helpers.py:3136`).
2. Build a query object similar to those in
`tests/integration_tests/sqla_models_tests.py:190`, but include the adhoc
column dict in
`columns` and a filter or order-by that refers to it by label string, e.g.
`filter=[{"col": "invalid_expr", "op": "IN", "val": ["x"]}]` or
`orderby=[("invalid_expr",
True)]`, then call `table.get_sqla_query(**query_obj)`.
3. In `get_sqla_query`, the string `"invalid_expr"` does not match a
physical column or
metric, so in the order-by path it hits the branch at `3313–3319` or in the
filter path it
hits the new branch at `3538 elif col_obj is None and isinstance(flt_col,
str):`, both of
which call `self.find_adhoc_column_and_convert_to_sqla(columns=columns,
label=...,
template_processor=...)` defined at `superset/models/helpers.py:2727`.
4. Inside `find_adhoc_column_and_convert_to_sqla`, once the adhoc dict is
found, it calls
`self.adhoc_column_to_sqla(col=adhoc_col,
template_processor=template_processor)` at
`2754` without a `try/except`; the concrete implementation in
`SqlaTable.adhoc_column_to_sqla`
(`superset/connectors/sqla/models.py:1800–1827`) may
raise `ColumnNotFoundException` when `get_columns_description` fails
(wrapped at `1818`),
and unlike the direct adhoc filter branch in `get_sqla_query` which catches
`ColumnNotFoundException` at `3512–3514`, this helper lets the exception
bubble up,
causing the entire `get_sqla_query` call and thus the chart query to fail
with a 404 error
instead of treating the adhoc reference as a rejected filter/order-by.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=7fe75e9104824578bb300e9cc37df7ce&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=7fe75e9104824578bb300e9cc37df7ce&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/models/helpers.py
**Line:** 2753:2759
**Comment:**
*Possible Bug: The new helper directly calls `adhoc_column_to_sqla`
without handling `ColumnNotFoundException`, unlike the existing filter path
that rejects invalid adhoc filters gracefully. If a matched adhoc column
requires type probing (for example with time grain) and fails resolution, this
can bubble up as an unhandled query failure instead of being treated as a
rejected filter/order-by column. Catch `ColumnNotFoundException` in this helper
(or in the new call sites) and return `None` so callers can keep the existing
graceful behavior.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38716&comment_hash=c17eaf88630365d6632fe500375cdf92de355a6ceede6c2fc32e521c67ce20b9&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38716&comment_hash=c17eaf88630365d6632fe500375cdf92de355a6ceede6c2fc32e521c67ce20b9&reaction=dislike'>👎</a>
##########
superset/models/helpers.py:
##########
@@ -3506,6 +3535,15 @@ def get_sqla_query( # pylint:
disable=too-many-arguments,too-many-locals,too-ma
template_processor=template_processor
)
is_metric_filter = True
+ elif col_obj is None and isinstance(flt_col, str):
+ sqla_col = self.find_adhoc_column_and_convert_to_sqla(
+ columns=columns,
+ label=flt_col,
+ template_processor=template_processor,
+ )
+ if sqla_col is not None:
+ applied_adhoc_filters_columns.append(flt_col)
+
applied_template_filters.append(get_column_name(flt_col))
Review Comment:
**Suggestion:** The new label-based adhoc filter path sets `sqla_col` but
never sets `adhoc_generic_type`, so filter value coercion falls back to
`GenericDataType.STRING`. This causes numeric/date adhoc column filters to be
quoted as strings and can return incorrect results. Resolve this by returning
the generic type from the adhoc lookup helper (or by calling
`adhoc_column_to_sqla(..., force_type_check=True)` in this branch) and
assigning it to `adhoc_generic_type` before `filter_values_handler` runs. [type
error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Adhoc numeric/date filter comparisons use string coercion semantics.
- ⚠️ Table and other charts show incorrect filtered result sets.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Use the SQLA dataset model `SqlaTable` defined at
`superset/connectors/sqla/models.py:1294` and construct a query object as in
`tests/integration_tests/sqla_models_tests.py:190`, but include an adhoc SQL
column in
`columns` with a numeric expression, for example `{"expressionType": "SQL",
"sqlExpression": "CAST(num AS BIGINT)", "label": "num_expr"}`.
2. In the same query object, add a filter clause so that `flt["col"]` is the
string
`"num_expr"` and `flt["op"]`/`flt["val"]` represent a numeric comparison
(e.g. `{"col":
"num_expr", "op": "IN", "val": ["1", "2"]}`), then call
`table.get_sqla_query(**query_obj)` which dispatches to `get_sqla_query` in
`superset/models/helpers.py:3136`.
3. Inside `get_sqla_query`, in the filter loop around
`superset/models/helpers.py:3494`,
the filter column `"num_expr"` is not an adhoc dict and is not a physical
column, so
`col_obj` stays `None` and execution reaches the new branch at `3538 elif
col_obj is None
and isinstance(flt_col, str):`, which calls
`self.find_adhoc_column_and_convert_to_sqla(...)` to set `sqla_col` but
never updates
`adhoc_generic_type` (initialized at `3497` and only assigned in the
`is_adhoc_column(flt_col)` branch at `3505`).
4. Later in the same loop, when computing `target_generic_type` at
`3597–3605`, `col_spec`
is `None` (no `TableColumn` for adhoc expressions) and `adhoc_generic_type`
is still
`None`, so the `else` branch at `3606–3607` sets `target_generic_type =
GenericDataType.STRING`; as a result `filter_values_handler` (see
`superset/models/helpers.py:2596–2618`) coerces numeric filter values as
strings and
generates predicates like `CAST(num AS BIGINT) IN ('1', '2')` instead of
numeric
comparisons, which can change filter semantics for non-string adhoc columns.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=f4344bfa4a554dcc9c8001c659647b64&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=f4344bfa4a554dcc9c8001c659647b64&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/models/helpers.py
**Line:** 3538:3546
**Comment:**
*Type Error: The new label-based adhoc filter path sets `sqla_col` but
never sets `adhoc_generic_type`, so filter value coercion falls back to
`GenericDataType.STRING`. This causes numeric/date adhoc column filters to be
quoted as strings and can return incorrect results. Resolve this by returning
the generic type from the adhoc lookup helper (or by calling
`adhoc_column_to_sqla(..., force_type_check=True)` in this branch) and
assigning it to `adhoc_generic_type` before `filter_values_handler` runs.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38716&comment_hash=a1143fbd23a4df9fc3bdf5d283382b27a25efefdb7ccb89e9d5dd823ade8a3b8&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38716&comment_hash=a1143fbd23a4df9fc3bdf5d283382b27a25efefdb7ccb89e9d5dd823ade8a3b8&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]