codeant-ai-for-open-source[bot] commented on code in PR #41120:
URL: https://github.com/apache/superset/pull/41120#discussion_r3422648596


##########
superset/sql_lab.py:
##########
@@ -433,15 +433,18 @@ def execute_sql_statements(  # noqa: C901
         db_engine_spec.engine,
         set(),
     )
-    if disallowed_tables and 
parsed_script.check_tables_present(disallowed_tables):
-        # Report only the tables actually found in the query
-        found_tables = set()
-        for statement in parsed_script.statements:
-            present = {table.table.lower() for table in statement.tables}
-            for table in disallowed_tables:
-                if table.lower() in present:
-                    found_tables.add(table)
-        raise SupersetDisallowedSQLTableException(found_tables or 
disallowed_tables)
+    if disallowed_tables:
+        # Report only the denylisted tables actually referenced in the query,
+        # honoring schema-qualified entries (e.g. 
``information_schema.tables``).
+        # Resolve the effective default schema so an unqualified reference that
+        # the connection resolves to it at runtime (e.g. Postgres ``public``)
+        # still matches a qualified entry, even when no schema was selected.
+        effective_schema = query.schema or 
database.get_default_schema(query.catalog)
+        found_tables = parsed_script.get_disallowed_tables(
+            disallowed_tables, effective_schema

Review Comment:
   **Suggestion:** The effective schema is derived with `query.schema or 
database.get_default_schema(...)`, which does not follow the engine-specific 
runtime resolution used elsewhere (`get_default_schema_for_query`). On engines 
that do not support dynamic schema switching, `query.schema` can differ from 
the actual execution schema, so unqualified table references may be matched 
against the wrong schema and the denylist can be bypassed (or legitimate 
queries can be blocked). Use the same query-aware schema resolution path used 
by RLS (`database.get_default_schema_for_query(query)`) when computing the 
schema passed to `get_disallowed_tables`. [security]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ SQL Lab disallowed-table checks can mis-evaluate unqualified tables.
   - ❌ Potential bypass of DISALLOWED_SQL_TABLES on static-schema engines.
   - ⚠️ Legitimate SQL Lab queries may be rejected due to mismatch.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. A user submits a SQL Lab query in the UI targeting a database whose 
engine spec does
   not override `BaseEngineSpec.supports_dynamic_schema` (defaults to `False` at
   `superset/db_engine_specs/base.py:565`), and selects a schema in the SQL Lab 
form that
   differs from the database's actual default schema. The query object is 
created in
   `SqlLabExecutionContext.create_query` at
   `superset/sqllab/sqllab_execution_context.py:153-176`, which instantiates 
`Query` with
   `schema=self.schema` and `catalog=self.catalog`, and these fields are stored 
on the ORM
   model (`superset/models/sql_lab.py:141-143`).
   
   2. The SQL Lab execution path invokes `execute_sql_statements` for that 
`query_id`
   (wrapper call at `superset/sql_lab.py:189`, implementation starting at
   `superset/sql_lab.py:391`). Inside this function, the code loads the `Query` 
via
   `get_query(query_id=query_id)` and retrieves `database = query.database` and
   `db_engine_spec = database.db_engine_spec` (`superset/sql_lab.py:17-21` in 
the function
   body).
   
   3. After parsing the SQL into an `SQLScript`, `execute_sql_statements` 
computes
   `disallowed_tables` from `app.config["DISALLOWED_SQL_TABLES"]` and, if 
non-empty, derives
   `effective_schema = query.schema or 
database.get_default_schema(query.catalog)`
   (`superset/sql_lab.py:42-52` / diff lines 432-442). This manual resolution 
bypasses the
   engine-specific `get_default_schema_for_query` logic implemented in
   `Database.get_default_schema_for_query` (`superset/models/core.py:12-30`), 
which in turn
   relies on `BaseEngineSpec.get_default_schema_for_query`
   (`superset/db_engine_specs/base.py:37-51`) to ignore `query.schema` when
   `supports_dynamic_schema` is `False` and instead use schema from the URI or 
inspector.
   
   4. The function then calls 
`parsed_script.get_disallowed_tables(disallowed_tables,
   effective_schema)` (`superset/sql_lab.py:53-55`), which uses the 
`default_schema` argument
   to interpret unqualified table references (`superset/sql/parse.py:519-535`). 
On engines
   with `supports_dynamic_schema=False`, if the user-set `query.schema` differs 
from the
   actual execution schema, `effective_schema` will incorrectly be 
`query.schema` instead of
   the true default schema that `get_default_schema_for_query` would return. As 
a result,
   unqualified table names in the query are matched against the wrong schema: 
entries in
   `DISALLOWED_SQL_TABLES` that should match can be missed (bypassing the 
denylist), or,
   conversely, legitimate queries can be blocked when `query.schema` does not 
reflect the
   schema actually used at runtime.
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=8ee4aa35c37f4f1eb3492a7be940130c&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=8ee4aa35c37f4f1eb3492a7be940130c&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/sql_lab.py
   **Line:** 442:444
   **Comment:**
        *Security: The effective schema is derived with `query.schema or 
database.get_default_schema(...)`, which does not follow the engine-specific 
runtime resolution used elsewhere (`get_default_schema_for_query`). On engines 
that do not support dynamic schema switching, `query.schema` can differ from 
the actual execution schema, so unqualified table references may be matched 
against the wrong schema and the denylist can be bypassed (or legitimate 
queries can be blocked). Use the same query-aware schema resolution path used 
by RLS (`database.get_default_schema_for_query(query)`) when computing the 
schema passed to `get_disallowed_tables`.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41120&comment_hash=a83b7b834f2aa1378232b9854430fca2e94c4eac642f61f9b1bd1b5ac3365589&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41120&comment_hash=a83b7b834f2aa1378232b9854430fca2e94c4eac642f61f9b1bd1b5ac3365589&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to