codeant-ai-for-open-source[bot] commented on code in PR #37395:
URL: https://github.com/apache/superset/pull/37395#discussion_r2721267166
##########
superset/utils/rls.py:
##########
@@ -98,14 +98,20 @@ def get_predicates_for_table(
if not dataset:
return []
+ # Use internal method with include_guest_rls=False to prevent double RLS
+ # application in virtual datasets. Guest RLS will be applied to the outer
+ # query via get_sqla_row_level_filters() on the virtual dataset itself.
+ # See Issue #37359.
return [
str(
predicate.compile(
dialect=database.get_dialect(),
compile_kwargs={"literal_binds": True},
)
)
- for predicate in dataset.get_sqla_row_level_filters()
+ for predicate in dataset._get_sqla_row_level_filters_internal(
+ include_guest_rls=False
+ )
Review Comment:
**Suggestion:** Logic/security regression: the new code unconditionally
calls the internal
`_get_sqla_row_level_filters_internal(include_guest_rls=False)` which disables
guest RLS for every call to `get_predicates_for_table`. That can cause RLS
rules not to be applied in contexts that previously relied on guest RLS (for
example SQL Lab or other callers), opening a security/regression hole. Restore
the previous public API call so guest RLS continues to be applied where
expected. [security]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ RLS enforcement missing for some guest scenarios.
- ⚠️ SQL Lab queries may miss guest-level filters.
- ⚠️ Cache key generation for virtual datasets affected.
- ⚠️ Applies to apply_rls() callers in helpers.py.
```
</details>
```suggestion
for predicate in dataset.get_sqla_row_level_filters()
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Start Superset with EMBEDDED_SUPERSET enabled and define guest RLS rules
(see
guest RLS handling in superset/connectors/sqla/models.py:751-796).
2. Execute any code path that calls get_predicates_for_table() in
superset/utils/rls.py (the caller is apply_rls() in this file). Example
entry:
- a SQL execution path that reaches apply_rls()
(superset/models/helpers.py:2052-2057
and/or superset/models/helpers.py:3198 per PR analysis).
3. At superset/utils/rls.py:101-115, get_predicates_for_table() now always
calls
dataset._get_sqla_row_level_filters_internal(include_guest_rls=False)
(line 112),
which omits guest RLS clauses even when the caller previously expected
them.
4. Observe that RLS predicates coming from guest rules are absent from the
returned
list (no guest predicate strings are compiled at
superset/utils/rls.py:106-111),
causing queries or cache-key logic that relied on guest RLS to behave
incorrectly
(e.g., missing row filtering or incorrect cache key composition).
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/utils/rls.py
**Line:** 112:114
**Comment:**
*Security: Logic/security regression: the new code unconditionally
calls the internal
`_get_sqla_row_level_filters_internal(include_guest_rls=False)` which disables
guest RLS for every call to `get_predicates_for_table`. That can cause RLS
rules not to be applied in contexts that previously relied on guest RLS (for
example SQL Lab or other callers), opening a security/regression hole. Restore
the previous public API call so guest RLS continues to be applied where
expected.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]