codeant-ai-for-open-source[bot] commented on code in PR #40224:
URL: https://github.com/apache/superset/pull/40224#discussion_r3260080523


##########
tests/unit_tests/connectors/sqla/utils_test.py:
##########
@@ -137,3 +137,38 @@ def test_get_virtual_table_metadata_multiple(mocker: MockerFixture) -> None:
     with pytest.raises(SupersetSecurityException) as excinfo:
         get_virtual_table_metadata(dataset)
     assert str(excinfo.value) == "Only single queries supported"
+
+
+def test_get_virtual_table_metadata_renders_jinja(mocker: MockerFixture) -> None:
+    """Regression for #25839: Jinja templates in a virtual dataset's SQL must
+    be rendered via the template processor before SQL parsing. Otherwise the
+    raw Jinja tokens reach sqlglot and the parser rejects them as a syntax
+    error (the user-visible symptom is "Invalid SQL" when clicking
+    "SYNC COLUMNS FROM SOURCE" on a dataset that uses {{ from_dttm }} etc.).
+    """
+    mocker.patch(
+        "superset.connectors.sqla.utils.get_columns_description",
+        return_value=[{"name": "rendered_col", "type": "INTEGER"}],
+    )
+
+    raw_sql = "SELECT * FROM tbl WHERE ts > '{{ from_dttm }}'"
+    rendered_sql = "SELECT * FROM tbl WHERE ts > '2024-01-01 00:00:00'"
+
+    dataset = mocker.MagicMock(sql=raw_sql)
+    dataset.database.db_engine_spec.engine = "postgresql"
+    dataset.template_params_dict = {}
+    dataset.get_template_processor().process_template.return_value = rendered_sql
+
+    # If Jinja rendering is skipped, sqlglot tries to parse the raw {{ ... }}
+    # and raises SupersetGenericDBErrorException / SupersetParseError.
+    assert get_virtual_table_metadata(dataset) == [
+        {"name": "rendered_col", "type": "INTEGER"}
+    ]

Review Comment:
   **Suggestion:** The regression test never verifies which SQL string is 
passed into `get_columns_description`, so it can still pass if the function 
renders Jinja for parsing but then sends the original unrendered SQL 
downstream. Capture the patched `get_columns_description` mock and assert it 
was called with `rendered_sql` as the query argument to ensure the rendered 
statement is used end-to-end. [incomplete implementation]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Virtual dataset metadata may still use unrendered Jinja SQL.
   - ⚠️ Column sync regression guard misses downstream SQL argument.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Open `tests/unit_tests/connectors/sqla/utils_test.py` and inspect
   `test_get_virtual_table_metadata_renders_jinja` at lines 149–166, where
   `mocker.patch("superset.connectors.sqla.utils.get_columns_description",
   return_value=[{"name": "rendered_col", "type": "INTEGER"}])` stubs
   `get_columns_description` to return a fixed list regardless of the `query`
   argument.

   2. Open `superset/connectors/sqla/utils.py` and inspect
   `get_virtual_table_metadata` at lines 99–137: it currently calls `sql =
   dataset.get_template_processor().process_template(dataset.sql,
   **dataset.template_params_dict)` (around line 108) and then passes this
   `sql` into both `SQLScript(sql, engine=db_engine_spec.engine)` and
   `get_columns_description(dataset.database, dataset.catalog, dataset.schema,
   sql)` (around lines 137–62).

   3. Consider a regression consistent with issue #25839 where only the
   parsing path uses rendered SQL: change `get_virtual_table_metadata` so that
   `SQLScript` continues to receive the rendered `sql`, but the final
   `get_columns_description(...)` call (lines 137–62) is modified to pass
   `dataset.sql` instead of `sql`, meaning that metadata retrieval runs
   against the raw `"SELECT * FROM tbl WHERE ts > '{{ from_dttm }}'"`.

   4. In this regressed state, a real virtual dataset using Jinja (e.g. via
   `SqlaTable.external_metadata` at `superset/connectors/sqla/models.py:8–16`)
   would still fail when the user clicks "SYNC COLUMNS FROM SOURCE", because
   the DB backend sees unrendered Jinja. However, running
   `pytest tests/unit_tests/connectors/sqla/utils_test.py::test_get_virtual_table_metadata_renders_jinja -v`
   will still pass, since the test only checks the returned column list and
   never asserts that `get_columns_description` was called with `rendered_sql`
   as its `query` argument.
   ```
   </details>
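
The gap described in the reproduction steps can be illustrated with a small, self-contained sketch. It does not import Superset: `fixed_metadata`, `regressed_metadata`, `make_dataset`, `weak_check`, and `strict_check` are hypothetical stand-ins, and the downstream describer is injected as a plain argument instead of being patched, purely so the example runs on its own.

```python
from unittest import mock

RAW_SQL = "SELECT * FROM tbl WHERE ts > '{{ from_dttm }}'"
RENDERED_SQL = "SELECT * FROM tbl WHERE ts > '2024-01-01 00:00:00'"
COLUMNS = [{"name": "rendered_col", "type": "INTEGER"}]


def fixed_metadata(dataset, describe):
    """Hypothetical stand-in for the fixed behaviour: rendered SQL goes downstream."""
    sql = dataset.get_template_processor().process_template(
        dataset.sql, **dataset.template_params_dict
    )
    return describe(dataset.database, dataset.catalog, dataset.schema, sql)


def regressed_metadata(dataset, describe):
    """Hypothetical regression: Jinja is rendered, but raw SQL goes downstream."""
    dataset.get_template_processor().process_template(
        dataset.sql, **dataset.template_params_dict
    )
    return describe(dataset.database, dataset.catalog, dataset.schema, dataset.sql)


def make_dataset():
    """Build a mocked dataset shaped like the one in the test under review."""
    dataset = mock.MagicMock(sql=RAW_SQL)
    dataset.template_params_dict = {}
    dataset.get_template_processor().process_template.return_value = RENDERED_SQL
    return dataset


def weak_check(func):
    """Mirrors the current test: only the returned column list is compared."""
    describe = mock.MagicMock(return_value=COLUMNS)
    return func(make_dataset(), describe) == COLUMNS


def strict_check(func):
    """Adds the suggested check on the SQL string handed to the describer."""
    describe = mock.MagicMock(return_value=COLUMNS)
    result = func(make_dataset(), describe)
    # In this sketch the query is the last positional argument (an assumption).
    return result == COLUMNS and describe.call_args.args[-1] == RENDERED_SQL
```

Under these assumptions, `weak_check` accepts both variants, while `strict_check` rejects the regressed one. In the real test, the equivalent change would presumably be to keep the mock returned by `mocker.patch(...)` and assert on its call arguments; whether the query is passed positionally or by keyword should be confirmed against the actual call site.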
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

