codeant-ai-for-open-source[bot] commented on code in PR #41120:
URL: https://github.com/apache/superset/pull/41120#discussion_r3422648596
##########
superset/sql_lab.py:
##########
@@ -433,15 +433,18 @@ def execute_sql_statements( # noqa: C901
db_engine_spec.engine,
set(),
)
- if disallowed_tables and
parsed_script.check_tables_present(disallowed_tables):
- # Report only the tables actually found in the query
- found_tables = set()
- for statement in parsed_script.statements:
- present = {table.table.lower() for table in statement.tables}
- for table in disallowed_tables:
- if table.lower() in present:
- found_tables.add(table)
- raise SupersetDisallowedSQLTableException(found_tables or
disallowed_tables)
+ if disallowed_tables:
+ # Report only the denylisted tables actually referenced in the query,
+ # honoring schema-qualified entries (e.g.
``information_schema.tables``).
+ # Resolve the effective default schema so an unqualified reference that
+ # the connection resolves to it at runtime (e.g. Postgres ``public``)
+ # still matches a qualified entry, even when no schema was selected.
+ effective_schema = query.schema or
database.get_default_schema(query.catalog)
+ found_tables = parsed_script.get_disallowed_tables(
+ disallowed_tables, effective_schema
Review Comment:
**Suggestion:** The effective schema is derived with `query.schema or
database.get_default_schema(...)`, which does not follow the engine-specific
runtime resolution used elsewhere (`get_default_schema_for_query`). On engines
that do not support dynamic schema switching, `query.schema` can differ from
the actual execution schema, so unqualified table references may be matched
against the wrong schema and the denylist can be bypassed (or legitimate
queries can be blocked). Use the same query-aware schema resolution path used
by RLS (`database.get_default_schema_for_query(query)`) when computing the
schema passed to `get_disallowed_tables`. [security]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ SQL Lab disallowed-table checks can mis-evaluate unqualified tables.
- ❌ Potential bypass of DISALLOWED_SQL_TABLES on static-schema engines.
- ⚠️ Legitimate SQL Lab queries may be rejected due to mismatch.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. A user submits a SQL Lab query in the UI targeting a database whose
engine spec does
not override `BaseEngineSpec.supports_dynamic_schema` (defaults to `False` at
`superset/db_engine_specs/base.py:565`), and selects a schema in the SQL Lab
form that
differs from the database's actual default schema. The query object is
created in
`SqlLabExecutionContext.create_query` at
`superset/sqllab/sqllab_execution_context.py:153-176`, which instantiates
`Query` with
`schema=self.schema` and `catalog=self.catalog`, and these fields are stored
on the ORM
model (`superset/models/sql_lab.py:141-143`).
2. The SQL Lab execution path invokes `execute_sql_statements` for that
`query_id`
(wrapper call at `superset/sql_lab.py:189`, implementation starting at
`superset/sql_lab.py:391`). Inside this function, the code loads the `Query`
via
`get_query(query_id=query_id)` and retrieves `database = query.database` and
`db_engine_spec = database.db_engine_spec` (`superset/sql_lab.py:17-21` in
the function
body).
3. After parsing the SQL into an `SQLScript`, `execute_sql_statements`
computes
`disallowed_tables` from `app.config["DISALLOWED_SQL_TABLES"]` and, if
non-empty, derives
`effective_schema = query.schema or
database.get_default_schema(query.catalog)`
(`superset/sql_lab.py:42-52` / diff lines 432-442). This manual resolution
bypasses the
engine-specific `get_default_schema_for_query` logic implemented in
`Database.get_default_schema_for_query` (`superset/models/core.py:12-30`),
which in turn
relies on `BaseEngineSpec.get_default_schema_for_query`
(`superset/db_engine_specs/base.py:37-51`) to ignore `query.schema` when
`supports_dynamic_schema` is `False` and instead use schema from the URI or
inspector.
4. The function then calls
`parsed_script.get_disallowed_tables(disallowed_tables,
effective_schema)` (`superset/sql_lab.py:53-55`), which uses the
`default_schema` argument
to interpret unqualified table references (`superset/sql/parse.py:519-535`).
On engines
with `supports_dynamic_schema=False`, if the user-set `query.schema` differs
from the
actual execution schema, `effective_schema` will incorrectly be
`query.schema` instead of
the true default schema that `get_default_schema_for_query` would return. As
a result,
unqualified table names in the query are matched against the wrong schema:
entries in
`DISALLOWED_SQL_TABLES` that should match can be missed (bypassing the
denylist), or,
conversely, legitimate queries can be blocked when `query.schema` does not
reflect the
schema actually used at runtime.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=8ee4aa35c37f4f1eb3492a7be940130c&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=8ee4aa35c37f4f1eb3492a7be940130c&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/sql_lab.py
**Line:** 442:444
**Comment:**
*Security: The effective schema is derived with `query.schema or
database.get_default_schema(...)`, which does not follow the engine-specific
runtime resolution used elsewhere (`get_default_schema_for_query`). On engines
that do not support dynamic schema switching, `query.schema` can differ from
the actual execution schema, so unqualified table references may be matched
against the wrong schema and the denylist can be bypassed (or legitimate
queries can be blocked). Use the same query-aware schema resolution path used
by RLS (`database.get_default_schema_for_query(query)`) when computing the
schema passed to `get_disallowed_tables`.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41120&comment_hash=a83b7b834f2aa1378232b9854430fca2e94c4eac642f61f9b1bd1b5ac3365589&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41120&comment_hash=a83b7b834f2aa1378232b9854430fca2e94c4eac642f61f9b1bd1b5ac3365589&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]