codeant-ai-for-open-source[bot] commented on code in PR #40224:
URL: https://github.com/apache/superset/pull/40224#discussion_r3260080523
##########
tests/unit_tests/connectors/sqla/utils_test.py:
##########
@@ -137,3 +137,38 @@ def test_get_virtual_table_metadata_multiple(mocker: MockerFixture) -> None:
     with pytest.raises(SupersetSecurityException) as excinfo:
         get_virtual_table_metadata(dataset)
     assert str(excinfo.value) == "Only single queries supported"
+
+
+def test_get_virtual_table_metadata_renders_jinja(mocker: MockerFixture) -> None:
+    """Regression for #25839: Jinja templates in a virtual dataset's SQL must
+    be rendered via the template processor before SQL parsing. Otherwise the
+    raw Jinja tokens reach sqlglot and the parser rejects them as a syntax
+    error (the user-visible symptom is "Invalid SQL" when clicking
+    "SYNC COLUMNS FROM SOURCE" on a dataset that uses {{ from_dttm }} etc.).
+    """
+    mocker.patch(
+        "superset.connectors.sqla.utils.get_columns_description",
+        return_value=[{"name": "rendered_col", "type": "INTEGER"}],
+    )
+
+    raw_sql = "SELECT * FROM tbl WHERE ts > '{{ from_dttm }}'"
+    rendered_sql = "SELECT * FROM tbl WHERE ts > '2024-01-01 00:00:00'"
+
+    dataset = mocker.MagicMock(sql=raw_sql)
+    dataset.database.db_engine_spec.engine = "postgresql"
+    dataset.template_params_dict = {}
+    dataset.get_template_processor().process_template.return_value = rendered_sql
+
+    # If Jinja rendering is skipped, sqlglot tries to parse the raw {{ ... }}
+    # and raises SupersetGenericDBErrorException / SupersetParseError.
+    assert get_virtual_table_metadata(dataset) == [
+        {"name": "rendered_col", "type": "INTEGER"}
+    ]
Review Comment:
**Suggestion:** The regression test never verifies which SQL string is
passed into `get_columns_description`, so it can still pass if the function
renders Jinja for parsing but then sends the original unrendered SQL
downstream. Capture the patched `get_columns_description` mock and assert it
was called with `rendered_sql` as the query argument to ensure the rendered
statement is used end-to-end. [incomplete implementation]
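A minimal sketch of the stronger assertion, assuming the call shape quoted in this review (`get_columns_description(database, catalog, schema, query)` with positional arguments). The two functions below are simplified, hypothetical stand-ins so the example runs without Superset installed; they are not the real implementations, and the helper is swapped in with `unittest.mock` rather than pytest-mock.

```python
from unittest import mock


def get_columns_description(database, catalog, schema, query):
    """Placeholder for the real helper; it is always patched in this sketch."""
    raise NotImplementedError


def get_virtual_table_metadata(dataset):
    """Simplified stand-in: render Jinja, then describe the rendered query."""
    sql = dataset.get_template_processor().process_template(
        dataset.sql, **dataset.template_params_dict
    )
    return get_columns_description(
        dataset.database, dataset.catalog, dataset.schema, sql
    )


def check_rendered_sql_reaches_columns_description():
    raw_sql = "SELECT * FROM tbl WHERE ts > '{{ from_dttm }}'"
    rendered_sql = "SELECT * FROM tbl WHERE ts > '2024-01-01 00:00:00'"
    columns = [{"name": "rendered_col", "type": "INTEGER"}]

    dataset = mock.MagicMock(sql=raw_sql)
    dataset.template_params_dict = {}
    dataset.get_template_processor().process_template.return_value = rendered_sql

    stub = mock.MagicMock(return_value=columns)
    # Swap the module-level helper for the stub for the duration of the call.
    with mock.patch.dict(globals(), {"get_columns_description": stub}):
        result = get_virtual_table_metadata(dataset)

    # Existing check: the stubbed column list is returned unchanged.
    assert result == columns
    # Added guard: the query argument must be the rendered statement.
    stub.assert_called_once_with(
        dataset.database, dataset.catalog, dataset.schema, rendered_sql
    )
    return stub.call_args.args[-1]
```

In the actual test, `mocker.patch(...)` returns the mock it installs, so its return value can be assigned to a variable and checked the same way after the call.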
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Virtual dataset metadata may still use unrendered Jinja SQL.
- ⚠️ Column sync regression guard misses downstream SQL argument.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open `tests/unit_tests/connectors/sqla/utils_test.py` and inspect
   `test_get_virtual_table_metadata_renders_jinja` at lines 149–166, where
   `mocker.patch("superset.connectors.sqla.utils.get_columns_description",
   return_value=[{"name": "rendered_col", "type": "INTEGER"}])` stubs
   `get_columns_description` to return a fixed list regardless of the `query`
   argument.
2. Open `superset/connectors/sqla/utils.py` and inspect
   `get_virtual_table_metadata` at lines 99–137: it currently calls
   `sql = dataset.get_template_processor().process_template(dataset.sql,
   **dataset.template_params_dict)` (around line 108) and then passes this
   `sql` into both `SQLScript(sql, engine=db_engine_spec.engine)` and
   `get_columns_description(dataset.database, dataset.catalog, dataset.schema,
   sql)` (around lines 137–62).
3. Consider a regression consistent with issue #25839 where only the parsing
   path uses rendered SQL: change `get_virtual_table_metadata` so that
   `SQLScript` continues to receive the rendered `sql`, but the final
   `get_columns_description(...)` call (lines 137–62) is modified to pass
   `dataset.sql` instead of `sql`, meaning that metadata retrieval runs
   against the raw `"SELECT * FROM tbl WHERE ts > '{{ from_dttm }}'"`.
4. In this regressed state, a real virtual dataset using Jinja (e.g. via
   `SqlaTable.external_metadata` at `superset/connectors/sqla/models.py:8–16`)
   would still fail when the user clicks "SYNC COLUMNS FROM SOURCE", because
   the DB backend sees unrendered Jinja; however, running `pytest
   tests/unit_tests/connectors/sqla/utils_test.py::test_get_virtual_table_metadata_renders_jinja
   -v` will still pass since the test only checks the returned column list and
   never asserts that `get_columns_description` was called with `rendered_sql`
   as its `query` argument.
```
</details>
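The regression described in the reproduction steps can be illustrated with a small self-contained sketch. Everything here is hypothetical: `regressed_get_virtual_table_metadata` is a simplified stand-in that renders the template but forwards `dataset.sql`, and the helper is injected as a parameter instead of being patched, purely to keep the example short. It shows that the return-value check alone passes while a call-argument check fails.

```python
from unittest import mock

RAW_SQL = "SELECT * FROM tbl WHERE ts > '{{ from_dttm }}'"
RENDERED_SQL = "SELECT * FROM tbl WHERE ts > '2024-01-01 00:00:00'"
COLUMNS = [{"name": "rendered_col", "type": "INTEGER"}]


def regressed_get_virtual_table_metadata(dataset, get_columns_description):
    """Hypothetical regression: renders Jinja but forwards the raw SQL."""
    rendered = dataset.get_template_processor().process_template(
        dataset.sql, **dataset.template_params_dict
    )
    _ = rendered  # in this scenario the rendered SQL would only feed the parser
    return get_columns_description(
        dataset.database, dataset.catalog, dataset.schema, dataset.sql
    )


def run_checks():
    dataset = mock.MagicMock(sql=RAW_SQL)
    dataset.template_params_dict = {}
    dataset.get_template_processor().process_template.return_value = RENDERED_SQL
    stub = mock.MagicMock(return_value=COLUMNS)

    result = regressed_get_virtual_table_metadata(dataset, stub)

    # Weak check (what the current test does): compare the returned columns.
    weak_check_passes = result == COLUMNS

    # Strong check (what the review asks for): compare the query argument.
    try:
        stub.assert_called_once_with(
            dataset.database, dataset.catalog, dataset.schema, RENDERED_SQL
        )
        strong_check_passes = True
    except AssertionError:
        strong_check_passes = False

    return weak_check_passes, strong_check_passes
```

Calling `run_checks()` on this regressed stand-in yields a passing weak check and a failing strong check, which is the gap the suggestion aims to close.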
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
For additional commands, e-mail: [email protected]