YuriyKrasilnikov opened a new issue, #37359:
URL: https://github.com/apache/superset/issues/37359

   ### Bug description
   
   **Summary:**
   PR #36061 "fix: RLS in virtual datasets" introduced a regression causing **double RLS application** for guest users with embedded dashboards. This is similar to #34894 (double time filter application), which was fixed in PR #35890.
   
   **Steps to Reproduce:**
   1. Create a physical dataset for table `my_table`
   2. Create a virtual dataset with SQL referencing that table:
      ```sql
      SELECT * FROM my_schema.my_table WHERE some_column = 'value'
      ```
   3. Create a dashboard with a chart using this virtual dataset
   4. Generate a guest token with RLS clause: `tenant_id = 123`
   5. Open the embedded dashboard with this guest token
   6. Observe SQL error with incorrect table alias
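
For step 4, a minimal sketch of the guest token request body may help. It assumes the payload shape described in Superset's embedding documentation (`user`, `resources`, `rls`); the helper name, the guest user fields, and the dashboard ID placeholder are hypothetical, and the token would still have to be requested from a running Superset instance.

```python
import json


def build_guest_token_payload(dashboard_id: str, rls_clause: str) -> dict:
    """Build a guest token request body carrying a single RLS clause.

    Hypothetical helper for illustration; the field layout is an assumption
    based on Superset's embedding documentation.
    """
    return {
        "user": {"username": "guest", "first_name": "Guest", "last_name": "User"},
        "resources": [{"type": "dashboard", "id": dashboard_id}],
        # The clause from step 4; Superset appends it to queries for this guest.
        "rls": [{"clause": rls_clause}],
    }


# "<embedded-dashboard-uuid>" is a placeholder, not a real ID.
payload = build_guest_token_payload("<embedded-dashboard-uuid>", "tenant_id = 123")
print(json.dumps(payload, indent=2))
```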
   
   **Expected behavior:**
   RLS should be applied once - either to the underlying table OR to the virtual dataset output.
   
   **Actual behavior:**
   RLS is applied twice:
   1. First by `apply_rls()` in `models/helpers.py:2047-2057` (wraps underlying table in subquery)
   2. Second by `get_sqla_row_level_filters()` in `models/helpers.py:3198` (adds WHERE clause to virtual dataset query)
   
   Generated SQL example:
   ```sql
   SELECT ... FROM (
     SELECT * FROM my_schema.my_table
     WHERE tenant_id = 123  -- First RLS applied to underlying table (correct)
   ) AS `my_schema.my_table`
   WHERE my_table.tenant_id = 123  -- Second RLS applied with WRONG alias!
   GROUP BY ...
   ```
   
   The second WHERE clause uses `my_table.tenant_id` but the subquery is aliased as `my_schema.my_table`, causing SQL errors like:
   ```
   Unknown expression or function identifier 'my_table.tenant_id'
   ```
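
The alias mismatch can be reproduced outside Superset. The sketch below uses SQLite as a stand-in engine (an assumption; the error above comes from a different database, so the message text will differ) and SQLite's built-in `main` schema in place of `my_schema`. It only illustrates that a subquery aliased as `"main.my_table"` cannot be referenced as `my_table`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (tenant_id INTEGER, some_column TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(123, "value"), (456, "value")],
)

# Inner query: RLS already applied to the underlying table.
inner = "SELECT * FROM main.my_table WHERE tenant_id = 123"

# Outer filter refers to `my_table`, but the subquery alias is "main.my_table".
bad_sql = f'SELECT * FROM ({inner}) AS "main.my_table" WHERE my_table.tenant_id = 123'
# Same filter, qualified with the alias that actually exists.
good_sql = f'SELECT * FROM ({inner}) AS "main.my_table" WHERE "main.my_table".tenant_id = 123'


def try_query(sql: str):
    """Return the rows, or the engine's error message if the query fails."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


bad_result = try_query(bad_sql)
good_result = try_query(good_sql)
print(bad_result)
print(good_result)
```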
   
   **Root Cause:**
   - `get_from_clause()` in `models/helpers.py:2047-2057` calls `apply_rls()`, which wraps tables referenced in virtual dataset SQL with subqueries containing RLS predicates
   - `get_sqla_query()` in `models/helpers.py:3198` calls `get_sqla_row_level_filters()`, which adds guest token RLS to the outer WHERE clause
   - For guest users with embedded dashboards, both mechanisms fire independently, causing double application with mismatched aliases
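
The interaction described in this list can be modeled with a toy sketch. This is not Superset's code: `wrap_with_rls`, `build_sql`, and the `dedupe` flag are hypothetical names that only mimic the two mechanisms, and the guard shown is one possible way to get single application, not a proposed patch.

```python
def wrap_with_rls(table: str, clause: str) -> str:
    """Inner mechanism: wrap the referenced table in an RLS-filtered subquery."""
    return f'(SELECT * FROM {table} WHERE {clause}) AS "{table}"'


def build_sql(table: str, guest_clause: str, dedupe: bool) -> str:
    """Toy query builder that applies guest RLS inside and, optionally, outside."""
    from_clause = wrap_with_rls(table, guest_clause)
    applied_inner = {guest_clause}  # clauses already enforced in the subquery

    outer_filters = []
    # Outer mechanism: adds the same clause again unless the guard skips it.
    if not (dedupe and guest_clause in applied_inner):
        short_name = table.split(".")[-1]  # drops the schema, so the alias no longer matches
        outer_filters.append(f"{short_name}.{guest_clause}")

    sql = f"SELECT * FROM {from_clause}"
    if outer_filters:
        sql += " WHERE " + " AND ".join(outer_filters)
    return sql


double = build_sql("my_schema.my_table", "tenant_id = 123", dedupe=False)
single = build_sql("my_schema.my_table", "tenant_id = 123", dedupe=True)
print(double)
print(single)
```

With `dedupe=False` the clause appears twice and the outer copy is qualified with `my_table`, mirroring the generated SQL above; with `dedupe=True` only the inner subquery carries it.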
   
   **Related:**
   - PR #35890 fixed an identical pattern for time filters (double time filter application)
   - PR #36061 introduced this regression (merged 2025-11-14)
   
   ### Screenshots/recordings
   
   N/A - backend SQL generation issue
   
   ### Superset version
   
   6.0.0
   
   ### Python version
   
   3.11
   
   ### Node version
   
   18 or greater
   
   ### Browser
   
   Not applicable
   
   ### Additional context
   
   - Feature flags enabled: `EMBEDDED_SUPERSET`
   - This is a regression - worked correctly in Superset 5.x
   - No configuration-level workaround is available
   - Partial workaround: use physical datasets instead of virtual datasets (not always possible)
   
   ### Checklist
   
   - [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
   - [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
   - [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

