YuriyKrasilnikov commented on PR #37371:
URL: https://github.com/apache/superset/pull/37371#issuecomment-3791086817
## Response to codeant-ai bot review
### The bot's suggestion is incorrect
The bot suggested supporting `{"column": "string"}` format. After **thorough
investigation of Superset's type system**, this format **does not exist**:
```python
# superset/superset_typing.py:51-57
class AdhocMetric(TypedDict, total=False):
column: AdhocMetricColumn | None # ← DICT or None, NEVER string!
# superset-frontend/packages/superset-ui-core/src/query/types/Metric.ts
export interface AdhocMetricSimple {
column: { column_name?: string, ... }; // ← Always DICT
}
```
### Data Flow Analysis
```
Frontend (formData)
↓
saveModalActions.ts:172 → JSON.stringify(formData)
↓
/api/v1/chart/{id} (PUT/POST)
↓
charts/schemas.py → fields.Raw() ← NO STRUCTURE VALIDATION
↓
DATABASE: slices.params (MediumText)
↓
[Query Request]
↓
query_context_factory.py → ChartDAO.find_by_id()
↓
stored_chart.params_dict → json_to_dict(self.params)
↓
security/manager.py:_orderby_whitelist_compare() ← OUR BARRIERS
```
**Key point**: `fields.Raw()` in schema accepts ANY input without
validation. Barriers must be in security code.
### However, we added defensive barriers anyway
Even though the bot's specific scenario is invalid, we applied **fail-closed
defensive coding**. This protects against malformed API requests, 500 errors,
and potential bypasses.
### What we added
**1. Validate orderby is a list** (`_orderby_whitelist_compare`):
```python
if form_orderby is not None and not isinstance(form_orderby, list):
return True # block invalid format
```
**Attack prevented:**
```python
form_data = {"orderby": "malicious"} # string, not list
# Without barrier: iterates chars "m","a","l"... all skip → PASSES!
# With barrier: blocked immediately
```
**2. Validate each element is tuple/list**
**3. Validate nested column is dict** (already existed)
### Similar patterns in Superset codebase
Our barriers follow existing Superset patterns:
| Location | Pattern | Code |
|----------|---------|------|
| `security/manager.py:812` | isinstance before access | `if
isinstance(datasource, BaseDatasource):` |
| `charts/client_processing.py:112` | Type check before use | `if not
isinstance(df, pd.DataFrame):` |
| `models/slice.py:374` | Safe conversion | `if id_or_uuid.isdigit(): return
int()` |
| `common/query_object.py:279` | Validation method | `def validate(): ...
raise QueryObjectValidationError` |
| `views/base.py:337` | try/except with fallback | `except
json.JSONDecodeError: return fallback` |
### Why defensive barriers even if bot was wrong
| Reason | Explanation |
|--------|-------------|
| **Security-critical code** | Must be fail-closed (block unknown, don't
pass) |
| **Schema accepts Raw()** | No structure validation at API level |
| **Defense in depth** | Don't rely on "frontend won't send this" |
| **Prevents 500 errors** | Graceful block instead of crash |
### Tests added (20 new tests)
- `test_extract_orderby_column_name_*` - unit tests for extraction
- `test_get_visible_columns_*` - unit tests for whitelist
- `test_query_context_modified_orderby_*` - integration tests for barriers
- Including:
`test_query_context_modified_orderby_string_instead_of_list_blocked`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]