codeant-ai-for-open-source[bot] commented on code in PR #38403:
URL: https://github.com/apache/superset/pull/38403#discussion_r2943415328
##########
superset/mcp_service/chart/schemas.py:
##########
@@ -590,6 +590,111 @@ class MixedTimeseriesChartConfig(BaseModel):
filters: List[FilterConfig] | None = None
+class BigNumberChartConfig(BaseModel):
+ model_config = ConfigDict(extra="forbid")
+
+ chart_type: Literal["big_number"] = Field(
+ ...,
+ description=(
+ "Chart type discriminator - MUST be 'big_number'. "
+ "Creates Big Number charts that display a single prominent "
+ "metric value. Set show_trendline=True with a temporal_column "
+ "for a number with trendline, or leave show_trendline=False "
+ "for a standalone number."
+ ),
+ )
+ metric: ColumnRef = Field(
+ ...,
+ description=(
+ "The metric to display as a big number. "
+ "Must include an aggregate function (e.g., SUM, COUNT)."
+ ),
+ )
+ temporal_column: str | None = Field(
+ None,
+ description=(
+ "Temporal column for the trendline x-axis. "
+ "Required when show_trendline is True."
+ ),
+ min_length=1,
+ max_length=255,
+ pattern=r"^[a-zA-Z0-9_][a-zA-Z0-9_\s\-\.]*$",
+ )
Review Comment:
**Suggestion:** `temporal_column` is accepted as raw user input and passed
through to `granularity_sqla` without the same sanitization used for other
column identifiers. This can allow SQL-keyword payloads or malformed values
that break query generation or create an injection surface; sanitize it with
`sanitize_user_input(..., check_sql_keywords=True)` in a field validator.
[security]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Big Number trendline requests can fail at compile stage.
- ⚠️ Temporal identifiers lack consistent SQL-keyword sanitization path.
```
</details>
```suggestion
temporal_column: str | None = Field(
None,
description=(
"Temporal column for the trendline x-axis. "
"Required when show_trendline is True."
),
min_length=1,
max_length=255,
pattern=r"^[a-zA-Z0-9_][a-zA-Z0-9_\s\-\.]*$",
)
@field_validator("temporal_column")
@classmethod
def sanitize_temporal_column(cls, v: str | None) -> str | None:
"""Sanitize temporal column name to prevent injection attacks."""
return sanitize_user_input(
v,
"Temporal column",
max_length=255,
allow_empty=True,
check_sql_keywords=True,
)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Call MCP `generate_chart`
(`superset/mcp_service/chart/tool/generate_chart.py:124`)
with `config.chart_type='big_number'`, `show_trendline=true`, and
`temporal_column='SELECT'` (or another keyword-looking identifier).
2. Request parsing succeeds because `BigNumberChartConfig.temporal_column` in
`superset/mcp_service/chart/schemas.py:613-622` only applies regex/length
checks and has
no sanitizer validator.
3. The value is copied directly into query form_data at
`superset/mcp_service/chart/chart_utils.py:674`
(`form_data["granularity_sqla"] =
config.temporal_column`).
4. Dataset validation does not enforce Big Number column existence/types
(`superset/mcp_service/chart/validation/dataset_validator.py:56-57`,
`178-197`), so this
reaches compile/query execution and can fail generation with
`CHART_COMPILE_FAILED` in
`generate_chart.py:296-336`.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/schemas.py
**Line:** 613:622
**Comment:**
*Security: `temporal_column` is accepted as raw user input and passed
through to `granularity_sqla` without the same sanitization used for other
column identifiers. This can allow SQL-keyword payloads or malformed values
that break query generation or create an injection surface; sanitize it with
`sanitize_user_input(..., check_sql_keywords=True)` in a field validator.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38403&comment_hash=e84c3d426a6358489f77be1cdcfbe96b9ec1ea21aede831d4c3a090755525231&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38403&comment_hash=e84c3d426a6358489f77be1cdcfbe96b9ec1ea21aede831d4c3a090755525231&reaction=dislike'>👎</a>
##########
superset/mcp_service/chart/schemas.py:
##########
@@ -590,6 +590,111 @@ class MixedTimeseriesChartConfig(BaseModel):
filters: List[FilterConfig] | None = None
+class BigNumberChartConfig(BaseModel):
+ model_config = ConfigDict(extra="forbid")
+
+ chart_type: Literal["big_number"] = Field(
+ ...,
+ description=(
+ "Chart type discriminator - MUST be 'big_number'. "
+ "Creates Big Number charts that display a single prominent "
+ "metric value. Set show_trendline=True with a temporal_column "
+ "for a number with trendline, or leave show_trendline=False "
+ "for a standalone number."
+ ),
+ )
+ metric: ColumnRef = Field(
+ ...,
+ description=(
+ "The metric to display as a big number. "
+ "Must include an aggregate function (e.g., SUM, COUNT)."
+ ),
+ )
+ temporal_column: str | None = Field(
+ None,
+ description=(
+ "Temporal column for the trendline x-axis. "
+ "Required when show_trendline is True."
+ ),
+ min_length=1,
+ max_length=255,
+ pattern=r"^[a-zA-Z0-9_][a-zA-Z0-9_\s\-\.]*$",
+ )
+ time_grain: TimeGrain | None = Field(
+ None,
+ description=(
+ "Time granularity for trendline data. "
+ "Common values: PT1H (hour), P1D (day), P1W (week), "
+ "P1M (month), P1Y (year)."
+ ),
+ )
+ show_trendline: bool = Field(
+ False,
+ description=(
+ "Show a trendline below the big number. "
+ "Requires 'temporal_column' to be set."
+ ),
+ )
+ subheader: str | None = Field(
+ None,
+ description="Subtitle text displayed below the big number",
+ max_length=500,
+ )
Review Comment:
**Suggestion:** `subheader` is user-controlled display text but is not
sanitized before being stored and returned in chart configuration. This
introduces an XSS risk when rendered by clients; sanitize it using the existing
`sanitize_user_input` path used elsewhere in this module. [security]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ User subtitle text persists unsanitized in chart params.
- ⚠️ External clients may render unsafe subtitle content.
```
</details>
```suggestion
subheader: str | None = Field(
None,
description="Subtitle text displayed below the big number",
max_length=500,
)
@field_validator("subheader")
@classmethod
def sanitize_subheader(cls, v: str | None) -> str | None:
"""Sanitize subtitle text to prevent XSS attacks."""
return sanitize_user_input(v, "Subheader", max_length=500,
allow_empty=True)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Call MCP `generate_chart`
(`superset/mcp_service/chart/tool/generate_chart.py:124`)
with Big Number config containing `subheader` HTML/script-like text.
2. `BigNumberChartConfig` accepts it as-is because `subheader` in
`superset/mcp_service/chart/schemas.py:638-642` has only max length, no
sanitizing
validator.
3. Mapping layer copies raw text into chart form_data
(`superset/mcp_service/chart/chart_utils.py:663-664`), and save mode
persists it in chart
params (`generate_chart.py:106-113`).
4. Response also returns raw `form_data` to clients
(`generate_chart.py:728-730`), so any
downstream renderer not escaping this field can surface XSS-like content.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/schemas.py
**Line:** 638:642
**Comment:**
*Security: `subheader` is user-controlled display text but is not
sanitized before being stored and returned in chart configuration. This
introduces an XSS risk when rendered by clients; sanitize it using the existing
`sanitize_user_input` path used elsewhere in this module.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38403&comment_hash=502280911f19727ea12588671747b529f3cd9462880a8c4d32958bec46fa2ef7&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38403&comment_hash=502280911f19727ea12588671747b529f3cd9462880a8c4d32958bec46fa2ef7&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]