Re: [PR] feat(sql): schema-qualified table denylist + information_schema/lo_* defaults [superset]

via GitHub Mon, 22 Jun 2026 18:42:24 -0700


codeant-ai-for-open-source[bot] commented on code in PR #41120:
URL: https://github.com/apache/superset/pull/41120#discussion_r3456475011



##########
tests/unit_tests/models/helpers_test.py:
##########
@@ -2960,3 +2960,54 @@ def 
test_process_sql_expression_no_gate_when_denylists_empty(
         template_processor=None,
     )
     assert result is not None
+
+
+def test_resolve_denylist_schema_uses_query_aware_resolution(
+    mocker: MockerFixture, database: Database
+) -> None:
+    """A datasource without an explicit schema resolves the denylist schema
+    through the query-aware ``get_default_schema_for_query`` (matching the SQL
+    Lab / executor gate), not the static inspector-based 
``get_default_schema``."""
+    from superset.connectors.sqla.models import SqlaTable
+
+    query_aware = mocker.patch.object(
+        database, "get_default_schema_for_query", return_value="resolved"
+    )
+    static = mocker.patch.object(database, "get_default_schema")
+    table = SqlaTable(database=database, schema=None, table_name="t")
+
+    assert table._resolve_denylist_schema("SELECT 1") == "resolved"
+    query_aware.assert_called_once()
+    static.assert_not_called()
+
+
+def test_resolve_denylist_schema_memoizes_across_expressions(
+    mocker: MockerFixture, database: Database
+) -> None:
+    """The resolved schema is cached per datasource so adhoc-expression
+    validation, which runs once per column/metric/order-by, does not re-resolve
+    (and re-probe) the schema for every expression."""
+    from superset.connectors.sqla.models import SqlaTable
+
+    query_aware = mocker.patch.object(
+        database, "get_default_schema_for_query", return_value="resolved"
+    )
+    table = SqlaTable(database=database, schema=None, table_name="t")
+
+    for _ in range(3):
+        assert table._resolve_denylist_schema("SELECT 1") == "resolved"
+    query_aware.assert_called_once()

Review Comment:
   **Suggestion:** This test claims to verify memoization across different 
adhoc expressions, but it calls `_resolve_denylist_schema` three times with the 
exact same SQL. If the implementation accidentally caches per SQL string 
instead of per datasource instance, this test will still pass and miss the 
regression. Use different expression shapes in the loop (eg 
column/metric/order-by style SQL) while still asserting a single 
schema-resolution call. [incomplete implementation]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ Adhoc-expression validation may re-probe schema per expression.
   - ⚠️ SQL Lab denylist gate performance regressions may go undetected.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Open `_resolve_denylist_schema` in `superset/models/helpers.py:1175-24`, 
which caches
   the resolved schema on `self._denylist_default_schema` so it should be 
shared across all
   expressions for the same datasource instance.
   
   2. Observe the call site in `superset/models/helpers.py:1260`, where
   `_resolve_denylist_schema(sql_to_check)` is invoked during adhoc-expression 
validation for
   each expression; here `sql_to_check` varies across columns/metrics/order-bys 
in a real
   request.
   
   3. Open `tests/unit_tests/models/helpers_test.py:47-52` and note the 
docstring of
   `test_resolve_denylist_schema_memoizes_across_expressions`, which states 
that the resolved
   schema is cached per datasource so validation "does not re-resolve (and 
re-probe) the
   schema for every expression."
   
   4. Inspect the test body at `tests/unit_tests/models/helpers_test.py:55-62` 
and see that
   inside the loop it calls `table._resolve_denylist_schema("SELECT 1")` three 
times with the
   same SQL literal, then only asserts `query_aware.assert_called_once()`, 
meaning the test
   never exercises or asserts behavior for multiple *different* SQL expressions 
as actually
   used at the call site in step 2, so it cannot detect regressions where 
memoization is
   keyed per SQL string instead of per datasource instance.
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=1fce5fdd14b54a859bd282df43e3a613&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=1fce5fdd14b54a859bd282df43e3a613&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** tests/unit_tests/models/helpers_test.py
   **Line:** 2997:2999
   **Comment:**
        *Incomplete Implementation: This test claims to verify memoization 
across different adhoc expressions, but it calls `_resolve_denylist_schema` 
three times with the exact same SQL. If the implementation accidentally caches 
per SQL string instead of per datasource instance, this test will still pass 
and miss the regression. Use different expression shapes in the loop (eg 
column/metric/order-by style SQL) while still asserting a single 
schema-resolution call.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41120&comment_hash=d11d00a5aac112ce030d0446f4398176bba901d5057a819cc886d8568b2fc9c6&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41120&comment_hash=d11d00a5aac112ce030d0446f4398176bba901d5057a819cc886d8568b2fc9c6&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat(sql): schema-qualified table denylist + information_schema/lo_* defaults [superset]

Reply via email to