[PR] feat: Explorable protocol [superset]

via GitHub Mon, 24 Nov 2025 13:22:53 -0800


betodealmeida opened a new pull request, #36245:
URL: https://github.com/apache/superset/pull/36245


   <!---
   Please write the PR title following the conventions at 
https://www.conventionalcommits.org/en/v1.0.0/
   Example:
   fix(dashboard): load charts correctly
   -->
   
   ### SUMMARY
   <!--- Describe the change below, including rationale and design decisions -->
   
   This PR introduces a clean, type-safe protocol for data sources in Superset 
that can be "explored" (visualized). While the immediate implementation focuses 
on SQL datasources (dataserts and queries), the protocol lays the groundwork 
for first-class [semantic layer 
support](https://github.com/apache/superset/issues/35003).
   
   #### 🎯 Core Protocol (`superset/explorables/base.py`)
   
   Introduced the `Explorable` Protocol defining the interface for any data 
source that can be visualized in Superset:
   
   ```python
   @runtime_checkable
   class Explorable(Protocol):
       def get_query_result(self, query_object: QueryObject) -> QueryResult: ...
       def get_query_str(self, query_obj: QueryObjectDict) -> str: ...
       def has_drill_by_columns(self, column_names: list[str]) -> bool: ...
   
       @property
       def uid(self) -> str: ...
       @property
       def type(self) -> str: ...
       @property
       def columns(self) -> list[Any]: ...
       @property
       def cache_timeout(self) -> int | None: ...
       @property
       def is_rls_supported(self) -> bool: ...
       # ... other properties
   ```
   
   **Key design decisions:**
   - **Minimal interface**: Only includes methods/properties needed for 
exploration
   - **No SQL-specific concepts**: `database`, `schema`, etc. are not in the 
protocol
   - **Runtime checkable**: Enables protocol checks without explicit inheritance
   - **Required `is_rls_supported`**: Makes RLS support explicit (can return 
`False`)i
   
   #### 🔧 QueryContext Integration
   
   Updated `QueryContext` to work with `Explorable` instead of just 
`BaseDatasource`:
   
   ```python
   # Before
   datasource: BaseDatasource
   
   # After
   datasource: Explorable
   ```
   
   This enables polymorphic handling of different data source types while 
maintaining full backward compatibility.
   
   #### 📊 Drill-By Abstraction
   
   **Before**: Drill-by logic was SQL-specific and lived in the security manager
   ```python
   # Security manager had SQL queries like:
   drillable_columns = {
       row[0] for row in session.query(TableColumn.column_name)
       .filter(TableColumn.table_id == datasource_id)
       .filter(TableColumn.groupby).all()
   }
   ```
   
   **After**: Drill-by is datasource-driven through the protocol
   ```python
   # Security manager (clean and datasource-agnostic)
   def has_drill_by_access(...):
       # ...
       and datasource.has_drill_by_columns(dimensions)
   
   # BaseDatasource implements the SQL-specific logic
   def has_drill_by_columns(self, column_names: list[str]) -> bool:
       drillable_columns = {
           row[0] for row in db.session.query(TableColumn.column_name)
           .filter(TableColumn.table_id == self.id)
           .filter(TableColumn.groupby).all()
       }
       return set(column_names).issubset(drillable_columns)
   ```
   
   ### Benefits (even without semantic layers)
   
   #### ✅ Immediate Value
   
   1. **Clearer boundaries**: SQL-specific code is now clearly scoped and 
doesn't leak into generic components
   2. **Better maintainability**: Security manager is simpler and focuses on 
authorization logic
   4. **Documentation**: The protocol serves as living documentation of what it 
means to be "explorable"
   5. **Testability**: Easier to mock and test with a defined protocol interface
   
   #### ✅ Architecture Improvements
   
   1. **Single responsibility**: Each component does one thing well
   2. **Dependency inversion**: High-level modules (security, query context) 
depend on abstractions
   3. **Open/closed principle**: New data source types (like semantic views) 
can be added without modifying existing code
   
   ### Future Semantic Layer Support
   
   When semantic layers are implemented, they'll simply implement the 
`Explorable` protocol:
   
   ```python
   class SemanticView:
       @property
       def uid(self) -> str:
           return f"semantic_view_{self.id}"
   
       def has_drill_by_columns(self, column_names: list[str]) -> bool:
           # Check against semantic layer dimension definitions
           return all(col in self.dimensions for col in column_names)
   
       @property
       def is_rls_supported(self) -> bool:
           return True  # Semantic layer handles RLS internally
   
       # ... implement other protocol methods
   ```
   
   **Zero changes needed** to:
   - QueryContext
   - Security manager
   - Chart data API
   - Any other component that works with `Explorable`
   
   ### Testing
   
   All existing tests pass with the new protocol. The test suite validates:
   - SQL datasource functionality remains unchanged
   - Type checking passes (`mypy`)
   - Security manager correctly handles both SQL and future non-SQL datasources
   - Drill-by logic works through the new abstraction
   
   ### Backward Compatibility
   
   This is a 100% backward-compatible refactor:
   - All existing SQL datasources work exactly as before
   - No API changes for charts, dashboards, or SQL Lab
   - No database migrations needed
   - No configuration changes required
   
   The protocol is purely additive - it formalizes what was already there, 
making it explicit and extensible.
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [ ] Has associated issue:
   - [ ] Required feature flags:
   - [ ] Changes UI
   - [ ] Includes DB Migration (follow approval process in 
[SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] feat: Explorable protocol [superset]

Reply via email to