codeant-ai-for-open-source[bot] commented on code in PR #38725:
URL: https://github.com/apache/superset/pull/38725#discussion_r2955084916
##########
superset/mcp_service/chart/schemas.py:
##########
@@ -660,6 +660,32 @@ class XYChartConfig(BaseModel):
filters: List[FilterConfig] | None = None
row_limit: int = Field(10000, description="Max data points", ge=1,
le=50000)
+ @model_validator(mode="before")
+ @classmethod
+ def handle_legacy_field_names(cls, data: Any) -> Any:
+ """Map Superset-native field names to MCP schema field names.
+
+ LLMs sometimes use native Superset field names instead of MCP field
names.
+ This validator remaps known aliases before extra="forbid" validation
runs.
+ """
+ if not isinstance(data, dict):
+ return data
+ # groupby: list[str] -> group_by: ColumnRef (take first item only)
+ # Always pop groupby to avoid extra="forbid" rejection; only map it if
+ # group_by is not already set (explicit group_by takes precedence).
+ if "groupby" in data:
+ groupby_val = data.pop("groupby")
+ if "group_by" not in data:
+ if isinstance(groupby_val, list) and groupby_val:
+ data["group_by"] = {"name": str(groupby_val[0])}
+ elif isinstance(groupby_val, str):
+ data["group_by"] = {"name": groupby_val}
+ # adhoc_filters: incompatible format with MCP FilterConfig - strip
silently
Review Comment:
**Suggestion:** Converting `groupby` entries with `str(...)` can turn
non-string payloads into invalid column names (for example dicts become
stringified objects), causing avoidable validation failures. Extract only valid
string/dict name values instead of blindly coercing. [type error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Structured legacy groupby requests fail XY schema validation.
- ❌ generate_explore_link cannot return usable explore URL.
- ⚠️ LLM retries may drop grouping semantics.
```
</details>
```suggestion
if isinstance(groupby_val, list) and groupby_val:
first_groupby = groupby_val[0]
if isinstance(first_groupby, str):
data["group_by"] = {"name":
first_groupby}
elif isinstance(first_groupby, dict) and
isinstance(
first_groupby.get("name"), str
):
data["group_by"] = {"name":
first_groupby["name"]}
elif isinstance(groupby_val, str):
data["group_by"] = {"name": groupby_val}
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Invoke `generate_explore_link`
(`superset/mcp_service/explore/tool/generate_explore_link.py:51-54`) or
`generate_chart`
(`superset/mcp_service/chart/tool/generate_chart.py:131-134`) with
`config.chart_type="xy"` and legacy `groupby=[{"name":"region"}]`.
2. During request parsing, `handle_legacy_field_names()` coerces first list
item using
`str(groupby_val[0])` (`superset/mcp_service/chart/schemas.py:679-683`),
producing
`"{'name': 'region'}"`.
3. `ColumnRef.name` validation requires regex
`^[a-zA-Z0-9_][a-zA-Z0-9_\s\-\.]*$`
(`superset/mcp_service/chart/schemas.py:385-390`), so stringified dict fails
schema
validation.
4. `parse_request` converts that `ValidationError` into tool-level failure
(`superset/mcp_service/utils/schema_utils.py:470-474`, `154-171`), blocking
chart
URL/chart generation.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/schemas.py
**Line:** 679:683
**Comment:**
*Type Error: Converting `groupby` entries with `str(...)` can turn
non-string payloads into invalid column names (for example dicts become
stringified objects), causing avoidable validation failures. Extract only valid
string/dict name values instead of blindly coercing.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38725&comment_hash=c5383bbaf04c838c015cbece0f1514c5e5e83cb56ef9e928c185393dba4ce809&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38725&comment_hash=c5383bbaf04c838c015cbece0f1514c5e5e83cb56ef9e928c185393dba4ce809&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]