Re: [PR] feat(mcp): chart type plugin registry for extensible generate_chart [superset]

via GitHub Thu, 11 Jun 2026 12:36:38 -0700


codeant-ai-for-open-source[bot] commented on code in PR #39922:
URL: https://github.com/apache/superset/pull/39922#discussion_r3398655586



##########
superset/mcp_service/chart/plugins/pivot_table.py:
##########
@@ -0,0 +1,158 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Pivot table chart type plugin."""
+
+from __future__ import annotations
+
+from typing import Any
+
+from superset.mcp_service.chart.chart_utils import (
+    _pivot_table_what,
+    _summarize_filters,
+    map_pivot_table_config,
+)
+from superset.mcp_service.chart.plugin import BaseChartPlugin
+from superset.mcp_service.chart.schemas import ColumnRef, PivotTableChartConfig
+from superset.mcp_service.chart.validation.dataset_validator import 
DatasetValidator
+from superset.mcp_service.common.error_schemas import ChartGenerationError
+
+
+class PivotTableChartPlugin(BaseChartPlugin):
+    """Plugin for pivot_table chart type."""
+
+    chart_type = "pivot_table"
+    display_name = "Pivot Table"
+    native_viz_types = {
+        "pivot_table_v2": "Pivot Table",
+    }
+
+    def pre_validate(
+        self,
+        config: dict[str, Any],
+    ) -> ChartGenerationError | None:
+        missing_fields = []
+
+        if not config.get("rows"):
+            missing_fields.append("'rows' (row grouping columns)")

Review Comment:
   **Suggestion:** The pre-validation only checks `rows`, but the schema 
explicitly allows aliases like `groupby` and `dimension`. Requests using those 
valid aliases are rejected before Pydantic can normalize them, causing false 
"missing required fields" errors for otherwise valid payloads. Make 
pre-validation honor the same aliases as the schema. [api mismatch]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Pivot table requests using groupby aliases rejected early.
   - ⚠️ Valid configs fail before Pydantic schema normalization.
   - ⚠️ LLM clients following schema docs see spurious errors.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Invoke the MCP `generate_chart` tool
   (`superset/mcp_service/chart/tool/generate_chart.py:220-279`) with a pivot 
table request
   whose `config` uses a valid alias for rows, e.g. `{\"chart_type\": 
\"pivot_table\",
   \"groupby\": [{\"name\": \"region\"}], \"metrics\": [{\"name\": \"revenue\",
   \"aggregate\": \"SUM\"}]}` and no explicit `rows` key.
   
   2. The request flows into `SchemaValidator.validate_request()` at
   `superset/mcp_service/chart/validation/schema_validator.py:44-57`, which 
calls
   `_pre_validate(request_data)` at `schema_validator.py:64-79`; 
`_pre_validate` extracts
   `config` and `chart_type` and dispatches to 
`_pre_validate_chart_type(chart_type, config)`
   at `schema_validator.py:27-31`.
   
   3. `_pre_validate_chart_type` loads the `PivotTableChartPlugin` from the 
registry and
   calls `plugin.pre_validate(config)` at `schema_validator.py:59-77`; inside
   `PivotTableChartPlugin.pre_validate`
   (`superset/mcp_service/chart/plugins/pivot_table.py:44-69`), the code checks 
only
   `config.get("rows")` at line 50 and, because the caller used `groupby`, 
appends `"'rows'
   (row grouping columns)"` to `missing_fields` and returns a 
`ChartGenerationError` with
   `error_code="MISSING_PIVOT_FIELDS"`.
   
   4. Because pre-validation returns an error, 
`SchemaValidator.validate_request()` exits
   early at `schema_validator.py:51-52`, and Pydantic is never invoked to apply 
the `rows`
   field aliases (`AliasChoices("rows", "groupby", "dimension")` defined on
   `PivotTableChartConfig.rows` in 
`superset/mcp_service/chart/schemas.py:106-112`), so a
   request that would be valid per the schema is rejected with a false "missing 
required
   fields" error.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=3844a3b331a541c6a00e1a6f433edeee&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=3844a3b331a541c6a00e1a6f433edeee&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/plugins/pivot_table.py
   **Line:** 50:51
   **Comment:**
        *Api Mismatch: The pre-validation only checks `rows`, but the schema 
explicitly allows aliases like `groupby` and `dimension`. Requests using those 
valid aliases are rejected before Pydantic can normalize them, causing false 
"missing required fields" errors for otherwise valid payloads. Make 
pre-validation honor the same aliases as the schema.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=be9276764d5b8175ff4b6302fc3758373ab43aaec93a5bfa18b33b4a09bc8dcb&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=be9276764d5b8175ff4b6302fc3758373ab43aaec93a5bfa18b33b4a09bc8dcb&reaction=dislike'>👎</a>



##########
superset/mcp_service/chart/plugins/mixed_timeseries.py:
##########
@@ -0,0 +1,170 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Mixed timeseries chart type plugin."""
+
+from __future__ import annotations
+
+from typing import Any
+
+from superset.mcp_service.chart.chart_utils import (
+    _mixed_timeseries_what,
+    _summarize_filters,
+    map_mixed_timeseries_config,
+)
+from superset.mcp_service.chart.plugin import BaseChartPlugin
+from superset.mcp_service.chart.schemas import ColumnRef, 
MixedTimeseriesChartConfig
+from superset.mcp_service.chart.validation.dataset_validator import 
DatasetValidator
+from superset.mcp_service.common.error_schemas import ChartGenerationError
+
+
+class MixedTimeseriesChartPlugin(BaseChartPlugin):
+    """Plugin for mixed_timeseries chart type."""
+
+    chart_type = "mixed_timeseries"
+    display_name = "Mixed Timeseries"
+    native_viz_types = {
+        "mixed_timeseries": "Mixed Timeseries Chart",
+    }
+
+    def pre_validate(
+        self,
+        config: dict[str, Any],
+    ) -> ChartGenerationError | None:
+        missing_fields = []
+
+        if "x" not in config and "x_axis" not in config:
+            missing_fields.append("'x' (X-axis temporal column)")
+        if not config.get("y") and not config.get("metrics"):
+            missing_fields.append("'y' (primary Y-axis metrics)")
+        if not config.get("y_secondary") and not config.get("metrics_b"):
+            missing_fields.append("'y_secondary' (secondary Y-axis metrics)")
+
+        if missing_fields:
+            return ChartGenerationError(
+                error_type="missing_mixed_timeseries_fields",
+                message=(
+                    f"Mixed timeseries chart missing required fields: "
+                    f"{', '.join(missing_fields)}"
+                ),
+                details=(
+                    "Mixed timeseries charts require an x-axis, primary 
metrics, "
+                    "and secondary metrics"
+                ),
+                suggestions=[
+                    "Add 'x' field: {'name': 'date_column'}",
+                    "Add 'y' field: [{'name': 'revenue', 'aggregate': 'SUM'}]",
+                    "Add 'y_secondary': [{'name': 'orders', 'aggregate': 
'COUNT'}]",
+                    "Optional: 'primary_kind' and 'secondary_kind' for chart 
types",
+                ],
+                error_code="MISSING_MIXED_TIMESERIES_FIELDS",
+            )
+
+        for field_name in ["y", "y_secondary"]:
+            if not isinstance(config.get(field_name, []), list):
+                return ChartGenerationError(
+                    error_type=f"invalid_{field_name}_format",
+                    message=f"'{field_name}' must be a list of metrics",
+                    details=(
+                        f"The '{field_name}' field must be an array of metric "
+                        "specifications"
+                    ),
+                    suggestions=[
+                        f"Wrap in array: '{field_name}': "
+                        "[{'name': 'col', 'aggregate': 'SUM'}]",
+                    ],
+                    error_code=f"INVALID_{field_name.upper()}_FORMAT",
+                )
+
+        return None
+
+    def extract_column_refs(self, config: Any) -> list[ColumnRef]:
+        if not isinstance(config, MixedTimeseriesChartConfig):
+            return []
+        refs: list[ColumnRef] = [config.x]
+        refs.extend(config.y)
+        refs.extend(config.y_secondary)
+        if config.group_by:
+            refs.extend(config.group_by)
+        if config.group_by_secondary:
+            refs.extend(config.group_by_secondary)
+        if config.filters:
+            for f in config.filters:
+                refs.append(ColumnRef(name=f.column))
+        return refs
+
+    def to_form_data(
+        self, config: Any, dataset_id: int | str | None = None
+    ) -> dict[str, Any]:
+        return map_mixed_timeseries_config(config, dataset_id=dataset_id)
+
+    def generate_name(self, config: Any, dataset_name: str | None = None) -> 
str:
+        what = _mixed_timeseries_what(config)
+        context = _summarize_filters(config.filters)
+        return self._with_context(what, context)
+
+    def resolve_viz_type(self, config: Any) -> str:
+        return "mixed_timeseries"
+
+    def normalize_column_refs(self, config: Any, dataset_context: Any) -> Any:
+        config_dict = config.model_dump()
+
+        def _norm_single(key: str) -> None:
+            if config_dict.get(key):
+                config_dict[key]["name"] = 
DatasetValidator._get_canonical_column_name(
+                    config_dict[key]["name"], dataset_context
+                )
+
+        def _norm_list(key: str) -> None:
+            if config_dict.get(key):
+                for col in config_dict[key]:
+                    if col.get("saved_metric"):
+                        col["name"] = 
DatasetValidator._get_canonical_metric_name(
+                            col["name"], dataset_context
+                        )
+                    else:
+                        col["name"] = 
DatasetValidator._get_canonical_column_name(
+                            col["name"], dataset_context
+                        )

Review Comment:
   **Suggestion:** Non-saved metrics are normalized as if they always have a 
`name`, but mixed-timeseries metrics can be SQL expressions with `name=None`. 
This path passes `None` into canonical column lookup, which raises and aborts 
normalization for the entire chart config. [null pointer]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Mixed-timeseries charts with SQL metrics skip name normalization.
   - ⚠️ generate_chart pipeline loses case-fixing for mixed-timeseries.
   - ⚠️ Normalization errors logged but silently ignored for these configs.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Send a Mixed Timeseries chart request to the MCP `generate_chart` tool
   (`superset/mcp_service/chart/tool/generate_chart.py:220-57`) with 
`config.chart_type =
   "mixed_timeseries"` and at least one primary or secondary metric (`y` or 
`y_secondary`)
   defined as a SQL-expression `ColumnRef` (populate `sql_expression` and 
`label`, omit
   `name`) as allowed by `MixedTimeseriesChartConfig`
   (`superset/mcp_service/chart/schemas.py:18-59` and `ColumnRef` at 
`schemas.py:36-123`).
   
   2. The request flows through 
`ValidationPipeline.validate_request_with_warnings()`
   (`superset/mcp_service/chart/validation/pipeline.py:60-80`), which, after 
schema and
   dataset validation, calls `_normalize_column_names()` (`pipeline.py:10-53`) 
to
   canonicalize column names using the dataset context.
   
   3. `_normalize_column_names()` calls 
`DatasetValidator.normalize_column_names(config,
   request.dataset_id, dataset_context)`
   (`superset/mcp_service/chart/validation/dataset_validator.py:120-167`), 
which resolves the
   `"mixed_timeseries"` plugin and invokes
   `MixedTimeseriesChartPlugin.normalize_column_refs(config, dataset_context)`
   (`superset/mcp_service/chart/plugins/mixed_timeseries.py:123-150`).
   
   4. In `normalize_column_refs`, `config.model_dump()` yields dict entries 
where
   SQL-expression metrics have `col["name"] = None` and `col["saved_metric"] = 
False`; the
   `_norm_list` helper at `mixed_timeseries.py:132-142` iterates the metrics, 
and for such a
   metric the `else` branch at lines 139-142 passes `None` to
   `DatasetValidator._get_canonical_column_name()`. That method
   (`dataset_validator.py:53-84`) calls `column_name.lower()`, raising 
`AttributeError`,
   which is caught by `_normalize_column_names()` (`pipeline.py:49-53`) and 
results in a
   warning and a fallback to the unnormalized request, so no column names in 
the Mixed
   Timeseries config are normalized.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=141bd0ac77b14fd89c9c988c7f93beba&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=141bd0ac77b14fd89c9c988c7f93beba&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/plugins/mixed_timeseries.py
   **Line:** 139:142
   **Comment:**
        *Null Pointer: Non-saved metrics are normalized as if they always have 
a `name`, but mixed-timeseries metrics can be SQL expressions with `name=None`. 
This path passes `None` into canonical column lookup, which raises and aborts 
normalization for the entire chart config.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=25904df3b7502e840a901dcd827b9d996d09c591ff6b831f611597fc7910db30&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=25904df3b7502e840a901dcd827b9d996d09c591ff6b831f611597fc7910db30&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat(mcp): chart type plugin registry for extensible generate_chart [superset]

Reply via email to