Re: [PR] feat(mcp): chart type plugin registry for extensible generate_chart [superset]

via GitHub Thu, 21 May 2026 03:23:06 -0700


codeant-ai-for-open-source[bot] commented on code in PR #39922:
URL: https://github.com/apache/superset/pull/39922#discussion_r3280417277



##########
superset/mcp_service/chart/plugins/big_number.py:
##########
@@ -0,0 +1,220 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Big number chart type plugin."""
+
+from __future__ import annotations
+
+from typing import Any
+
+from superset.mcp_service.chart.chart_utils import (
+    _big_number_chart_what,
+    _summarize_filters,
+    is_column_truly_temporal,
+    map_big_number_config,
+)
+from superset.mcp_service.chart.plugin import BaseChartPlugin
+from superset.mcp_service.chart.schemas import BigNumberChartConfig, ColumnRef
+from superset.mcp_service.chart.validation.dataset_validator import 
DatasetValidator
+from superset.mcp_service.common.error_schemas import ChartGenerationError
+
+
+class BigNumberChartPlugin(BaseChartPlugin):
+    """Plugin for big_number chart type."""
+
+    chart_type = "big_number"
+    display_name = "Big Number"
+    native_viz_types = {
+        "big_number": "Big Number with Trendline",
+        "big_number_total": "Big Number",
+    }
+
+    def pre_validate(
+        self,
+        config: dict[str, Any],
+    ) -> ChartGenerationError | None:
+        if "metric" not in config:
+            return ChartGenerationError(
+                error_type="missing_metric",
+                message="Big Number chart missing required field: metric",
+                details=(
+                    "Big Number charts require a 'metric' field "
+                    "specifying the value to display"
+                ),
+                suggestions=[
+                    "Add 'metric' with name and aggregate: "
+                    "{'name': 'revenue', 'aggregate': 'SUM'}",
+                    "The aggregate function is required (SUM, COUNT, AVG, MIN, 
MAX)",
+                    "Example: {'chart_type': 'big_number', "
+                    "'metric': {'name': 'sales', 'aggregate': 'SUM'}}",
+                ],
+                error_code="MISSING_BIG_NUMBER_METRIC",
+            )
+
+        metric = config.get("metric", {})
+        if not isinstance(metric, dict):
+            return ChartGenerationError(
+                error_type="invalid_metric_type",
+                message="Big Number metric must be a dict with 'name' and 
'aggregate'",
+                details=(
+                    f"The 'metric' field must be an object, got 
{type(metric).__name__}"
+                ),
+                suggestions=[
+                    "Use a dict: {'name': 'col', 'aggregate': 'SUM'}",
+                    "Valid aggregates: SUM, COUNT, AVG, MIN, MAX",
+                ],
+                error_code="INVALID_BIG_NUMBER_METRIC_TYPE",
+            )
+        if not metric.get("aggregate") and not metric.get("saved_metric"):
+            return ChartGenerationError(
+                error_type="missing_metric_aggregate",
+                message=(
+                    "Big Number metric must include an aggregate function "
+                    "or reference a saved metric"
+                ),
+                details=(
+                    "The metric must have an 'aggregate' field or 
'saved_metric': true"
+                ),
+                suggestions=[
+                    "Add 'aggregate': {'name': 'col', 'aggregate': 'SUM'}",
+                    "Or use a saved metric: {'name': 'metric', 'saved_metric': 
true}",
+                    "Valid aggregates: SUM, COUNT, AVG, MIN, MAX",
+                ],
+                error_code="MISSING_BIG_NUMBER_AGGREGATE",
+            )
+
+        show_trendline = config.get("show_trendline", False)
+        temporal_column = config.get("temporal_column")
+        if show_trendline and not temporal_column:
+            return ChartGenerationError(
+                error_type="missing_temporal_column",
+                message="Trendline requires a temporal column",
+                details=(
+                    "When 'show_trendline' is True, "
+                    "a 'temporal_column' must be specified"
+                ),
+                suggestions=[
+                    "Add 'temporal_column': 'date_column_name'",
+                    "Or set 'show_trendline': false for number only",
+                    "Use get_dataset_info to find temporal columns",
+                ],
+                error_code="MISSING_TEMPORAL_COLUMN",
+            )
+
+        return None
+
+    def extract_column_refs(self, config: Any) -> list[ColumnRef]:
+        if not isinstance(config, BigNumberChartConfig):
+            return []
+        refs: list[ColumnRef] = [config.metric]
+        # temporal_column is a str field, not a ColumnRef — validate it exists
+        if config.temporal_column:
+            refs.append(ColumnRef(name=config.temporal_column))
+        if config.filters:
+            for f in config.filters:
+                refs.append(ColumnRef(name=f.column))
+        return refs
+
+    def to_form_data(
+        self, config: Any, dataset_id: int | str | None = None
+    ) -> dict[str, Any]:
+        return map_big_number_config(config)
+
+    def post_map_validate(
+        self,
+        config: Any,
+        form_data: dict[str, Any],
+        dataset_id: int | str | None = None,
+    ) -> ChartGenerationError | None:
+        """Verify the trendline temporal column is a real temporal SQL type.
+
+        This check was previously baked into map_config_to_form_data() in
+        chart_utils.py as a special case. Moving it here keeps the dispatcher
+        clean and makes the constraint explicit and discoverable.
+        """
+        if not isinstance(config, BigNumberChartConfig):
+            return None
+        if not (config.show_trendline and config.temporal_column):
+            return None
+
+        if not is_column_truly_temporal(config.temporal_column, dataset_id):
+            return ChartGenerationError(
+                error_type="non_temporal_trendline_column",
+                message=(
+                    f"Big Number trendline requires a temporal SQL column; "
+                    f"'{config.temporal_column}' is not temporal."
+                ),
+                details=(
+                    f"Column '{config.temporal_column}' does not have a 
temporal "
+                    f"SQL type (DATE, DATETIME, TIMESTAMP). The trendline 
requires "
+                    f"a true temporal column for DATE_TRUNC to work."
+                ),
+                suggestions=[
+                    "Use get_dataset_info to find columns with temporal SQL 
types",
+                    "Set 'show_trendline': false to use any column as the 
metric",
+                    "If the column contains dates stored as integers, "
+                    "consider casting it in a virtual dataset",
+                ],
+                error_code="NON_TEMPORAL_TRENDLINE_COLUMN",
+            )
+
+        return None
+
+    def generate_name(self, config: Any, dataset_name: str | None = None) -> 
str:
+        what = _big_number_chart_what(config)
+        context = _summarize_filters(getattr(config, "filters", None))
+        return self._with_context(what, context)
+
+    def resolve_viz_type(self, config: Any) -> str:
+        show_trendline = getattr(config, "show_trendline", False)
+        temporal_column = getattr(config, "temporal_column", None)
+        if show_trendline and temporal_column:
+            return "big_number"
+        return "big_number_total"
+
+    def normalize_column_refs(self, config: Any, dataset_context: Any) -> Any:
+        config_dict = config.model_dump()
+
+        if config_dict.get("metric") and not 
config_dict["metric"].get("saved_metric"):
+            config_dict["metric"]["name"] = 
DatasetValidator._get_canonical_column_name(
+                config_dict["metric"]["name"], dataset_context
+            )

Review Comment:
   **Suggestion:** Saved metrics are skipped during name normalization, so 
case-insensitive validation can pass but the emitted metric string can still 
use the wrong casing and fail later in Superset's exact-name metric lookup. 
Normalize saved metric names against dataset metrics (not just raw columns) so 
the final form_data always uses canonical metric names. [api mismatch]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Big Number charts with mis-cased saved metrics fail.
   - ⚠️ Validation passes but Superset query errors at runtime.
   - ⚠️ LLM clients see confusing success-then-fail behavior.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Define a dataset in Superset with a saved metric whose name differs in 
case from the
   desired client input, e.g. metric_name="Revenue" (saved metric) on a dataset 
that
   ValidationPipeline will load via DatasetValidator._get_dataset_context at
   `superset/mcp_service/chart/validation/dataset_validator.py:197-252`.
   
   2. Call the MCP tool `generate_chart` at
   `superset/mcp_service/chart/tool/generate_chart.py:95` with a Big Number 
config using the
   saved metric but with different casing, e.g. `"metric": {"name": "revenue",
   "saved_metric": true}` in `GenerateChartRequest.config` 
(chart_type="big_number").
   
   3. ValidationPipeline.validate_request_with_warnings at
   `superset/mcp_service/chart/validation/pipeline.py:90-55` invokes
   DatasetValidator.validate_against_dataset at `dataset_validator.py:55-72`, 
which calls
   `_validate_saved_metrics` at `dataset_validator.py:485-46`; this checks
   `col_ref.name.lower()` against `available_metrics` lowercased, so 
`"revenue"` is accepted
   as a valid saved metric even though its canonical name is `"Revenue"`.
   
   4. The same pipeline then normalizes column names via
   DatasetValidator.normalize_column_names at `dataset_validator.py:94-141`, 
which delegates
   to BigNumberChartPlugin.normalize_column_refs at
   `superset/mcp_service/chart/plugins/big_number.py:189-203`; because the 
metric has
   `"saved_metric": true`, the guard at lines 192-195 skips 
`_get_canonical_column_name`, so
   the metric name remains `"revenue"` and is never canonicalized to 
`"Revenue"`.
   
   5. After validation, `generate_chart` calls `map_config_to_form_data` at
   `superset/mcp_service/chart/chart_utils.py:93-131`, which routes to
   BigNumberChartPlugin.to_form_data at `big_number.py:132-135`; 
`map_big_number_config` at
   `chart_utils.py:763-799` builds form_data using 
`create_metric_object(config.metric)` at
   `chart_utils.py:11-51`, which returns the raw `col.name` string `"revenue"` 
for saved
   metrics.
   
   6. When Superset executes the query, its SQLA connector resolves metrics via 
a
   `metrics_by_name` dict (see docstring and usage at `chart_utils.py:491-499` 
and
   `superset/connectors/sqla/models.py:1758-38`); because the dict keys are the 
canonical
   metric names (e.g. `"Revenue"`), the mis-cased `"revenue"` key is missing 
and Superset
   raises `QueryObjectValidationError("Metric '%(metric)s' does not exist")`, 
causing the Big
   Number chart generation to fail at runtime even though validation passed.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=9fb46a540a414dd2b72d5acc26f21fe1&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=9fb46a540a414dd2b72d5acc26f21fe1&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/plugins/big_number.py
   **Line:** 192:195
   **Comment:**
        *Api Mismatch: Saved metrics are skipped during name normalization, so 
case-insensitive validation can pass but the emitted metric string can still 
use the wrong casing and fail later in Superset's exact-name metric lookup. 
Normalize saved metric names against dataset metrics (not just raw columns) so 
the final form_data always uses canonical metric names.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=73a3370023c419db03ad37c214ad624fd371db98a58137840dfe6b1137ad3410&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=73a3370023c419db03ad37c214ad624fd371db98a58137840dfe6b1137ad3410&reaction=dislike'>👎</a>



##########
superset/mcp_service/chart/plugins/pie.py:
##########
@@ -0,0 +1,128 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Pie chart type plugin."""
+
+from __future__ import annotations
+
+from typing import Any
+
+from superset.mcp_service.chart.chart_utils import (
+    _pie_chart_what,
+    _summarize_filters,
+    map_pie_config,
+)
+from superset.mcp_service.chart.plugin import BaseChartPlugin
+from superset.mcp_service.chart.schemas import ColumnRef, PieChartConfig
+from superset.mcp_service.chart.validation.dataset_validator import 
DatasetValidator
+from superset.mcp_service.common.error_schemas import ChartGenerationError
+
+
+class PieChartPlugin(BaseChartPlugin):
+    """Plugin for pie chart type."""
+
+    chart_type = "pie"
+    display_name = "Pie / Donut Chart"
+    native_viz_types = {
+        "pie": "Pie Chart",
+    }
+
+    def pre_validate(
+        self,
+        config: dict[str, Any],
+    ) -> ChartGenerationError | None:
+        missing_fields = []
+
+        if "dimension" not in config:
+            missing_fields.append("'dimension' (category column for slices)")
+        if "metric" not in config:
+            missing_fields.append("'metric' (value metric for slice sizes)")
+
+        if missing_fields:
+            return ChartGenerationError(
+                error_type="missing_pie_fields",
+                message=(
+                    f"Pie chart missing required fields: {', 
'.join(missing_fields)}"
+                ),
+                details=(
+                    "Pie charts require a dimension (categories) and a metric 
(values)"
+                ),
+                suggestions=[
+                    "Add 'dimension' field: {'name': 'category_column'}",
+                    "Add 'metric' field: {'name': 'value_column', 'aggregate': 
'SUM'}",
+                    "Example: {'chart_type': 'pie', 'dimension': {'name': 
'product'}, "
+                    "'metric': {'name': 'revenue', 'aggregate': 'SUM'}}",
+                ],
+                error_code="MISSING_PIE_FIELDS",
+            )
+
+        return None
+
+    def extract_column_refs(self, config: Any) -> list[ColumnRef]:
+        if not isinstance(config, PieChartConfig):
+            return []
+        refs: list[ColumnRef] = [config.dimension, config.metric]
+        if config.filters:
+            for f in config.filters:
+                refs.append(ColumnRef(name=f.column))
+        return refs
+
+    def to_form_data(
+        self, config: Any, dataset_id: int | str | None = None
+    ) -> dict[str, Any]:
+        return map_pie_config(config)
+
+    def generate_name(self, config: Any, dataset_name: str | None = None) -> 
str:
+        what = _pie_chart_what(config)
+        context = _summarize_filters(config.filters)
+        return self._with_context(what, context)
+
+    def resolve_viz_type(self, config: Any) -> str:
+        return "pie"
+
+    def normalize_column_refs(self, config: Any, dataset_context: Any) -> Any:
+        config_dict = config.model_dump()
+
+        if config_dict.get("dimension"):
+            config_dict["dimension"]["name"] = (
+                DatasetValidator._get_canonical_column_name(
+                    config_dict["dimension"]["name"], dataset_context
+                )
+            )
+        if config_dict.get("metric") and not 
config_dict["metric"].get("saved_metric"):
+            config_dict["metric"]["name"] = 
DatasetValidator._get_canonical_column_name(
+                config_dict["metric"]["name"], dataset_context

Review Comment:
   **Suggestion:** The metric normalization path excludes saved metrics, which 
leaves user-provided casing unchanged even though downstream metric resolution 
is exact-name based; this can cause "metric not found" failures at query time. 
Normalize saved metric names to canonical dataset metric names before building 
form_data. [api mismatch]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Pie charts using mis-cased saved metrics error.
   - ⚠️ Dataset validation greenlights configs that later fail.
   - ⚠️ Users get runtime errors instead of clear validation feedback.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure a dataset that has a saved metric with a mixed-case name, e.g.
   metric_name="Total_Sales" exposed via DatasetContext.available_metrics in
   DatasetValidator._get_dataset_context at
   `superset/mcp_service/chart/validation/dataset_validator.py:197-252`.
   
   2. Invoke the MCP `generate_chart` tool at
   `superset/mcp_service/chart/tool/generate_chart.py:95` with a Pie chart 
config whose
   metric references that saved metric but with different casing, e.g. 
`"metric": {"name":
   "total_sales", "saved_metric": true}` under `"chart_type": "pie"`.
   
   3. ValidationPipeline.validate_request_with_warnings at
   `superset/mcp_service/chart/validation/pipeline.py:90-55` calls
   DatasetValidator.validate_against_dataset at `dataset_validator.py:55-72`, 
which in turn
   calls `_validate_saved_metrics` at `dataset_validator.py:485-46`; this uses 
a lowercase
   set of metric names, so `"total_sales"` is accepted as a valid saved metric 
despite its
   canonical name being `"Total_Sales"`.
   
   4. The same pipeline then normalizes column names using
   DatasetValidator.normalize_column_names at `dataset_validator.py:94-141`, 
which looks up
   the PieChart plugin and calls PieChartPlugin.normalize_column_refs at
   `superset/mcp_service/chart/plugins/pie.py:97-111`; the metric normalization 
block at
   lines 106-108 is guarded by `not config_dict["metric"].get("saved_metric")`, 
so saved
   metrics bypass `_get_canonical_column_name` and retain the user-provided 
casing
   `"total_sales"`.
   
   5. After validation, `generate_chart` maps the typed config to Superset 
form_data by
   calling map_config_to_form_data at 
`superset/mcp_service/chart/chart_utils.py:93-131`,
   which dispatches to PieChartPlugin.to_form_data at `pie.py:84-87`; 
`map_pie_config` at
   `chart_utils.py:735-760` calls `create_metric_object(config.metric)` at
   `chart_utils.py:11-51`, returning the bare string `"total_sales"` for the 
saved metric.
   
   6. Superset's SQLA connector later resolves metrics via the 
`metrics_by_name` dictionary,
   as documented in `create_metric_object` at `chart_utils.py:491-499` and used 
at
   `superset/connectors/sqla/models.py:1758-38`; because that dict is keyed by 
the canonical
   saved metric name `"Total_Sales"`, the mis-cased `"total_sales"` key is not 
found and the
   engine raises `QueryObjectValidationError("Metric '%(metric)s' does not 
exist")`, causing
   Pie chart execution to fail even though dataset validation succeeded.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=a028e087b37e4aedabb0d3bc02ccef82&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=a028e087b37e4aedabb0d3bc02ccef82&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/plugins/pie.py
   **Line:** 106:108
   **Comment:**
        *Api Mismatch: The metric normalization path excludes saved metrics, 
which leaves user-provided casing unchanged even though downstream metric 
resolution is exact-name based; this can cause "metric not found" failures at 
query time. Normalize saved metric names to canonical dataset metric names 
before building form_data.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=c6d30a15af813f3529dbfea28c4f244f205c8a6a795fcd816a6daa2931a764f3&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=c6d30a15af813f3529dbfea28c4f244f205c8a6a795fcd816a6daa2931a764f3&reaction=dislike'>👎</a>



##########
superset/mcp_service/chart/plugins/pivot_table.py:
##########
@@ -0,0 +1,153 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Pivot table chart type plugin."""
+
+from __future__ import annotations
+
+from typing import Any
+
+from superset.mcp_service.chart.chart_utils import (
+    _pivot_table_what,
+    _summarize_filters,
+    map_pivot_table_config,
+)
+from superset.mcp_service.chart.plugin import BaseChartPlugin
+from superset.mcp_service.chart.schemas import ColumnRef, PivotTableChartConfig
+from superset.mcp_service.chart.validation.dataset_validator import 
DatasetValidator
+from superset.mcp_service.common.error_schemas import ChartGenerationError
+
+
+class PivotTableChartPlugin(BaseChartPlugin):
+    """Plugin for pivot_table chart type."""
+
+    chart_type = "pivot_table"
+    display_name = "Pivot Table"
+    native_viz_types = {
+        "pivot_table_v2": "Pivot Table",
+    }
+
+    def pre_validate(
+        self,
+        config: dict[str, Any],
+    ) -> ChartGenerationError | None:
+        missing_fields = []
+
+        if "rows" not in config:
+            missing_fields.append("'rows' (row grouping columns)")
+        if "metrics" not in config:
+            missing_fields.append("'metrics' (aggregation metrics)")
+
+        if missing_fields:
+            return ChartGenerationError(
+                error_type="missing_pivot_fields",
+                message=(
+                    f"Pivot table missing required fields: {', 
'.join(missing_fields)}"
+                ),
+                details="Pivot tables require row groupings and metrics",
+                suggestions=[
+                    "Add 'rows' field: [{'name': 'category'}]",
+                    "Add 'metrics' field: [{'name': 'sales', 'aggregate': 
'SUM'}]",
+                    "Optional 'columns' for cross-tabulation: [{'name': 
'region'}]",
+                ],
+                error_code="MISSING_PIVOT_FIELDS",
+            )
+
+        if not isinstance(config.get("rows", []), list):
+            return ChartGenerationError(
+                error_type="invalid_rows_format",
+                message="Rows must be a list of columns",
+                details="The 'rows' field must be an array of column 
specifications",
+                suggestions=[
+                    "Wrap row columns in array: 'rows': [{'name': 
'category'}]",
+                ],
+                error_code="INVALID_ROWS_FORMAT",
+            )
+
+        if not isinstance(config.get("metrics", []), list):
+            return ChartGenerationError(
+                error_type="invalid_metrics_format",
+                message="Metrics must be a list",
+                details="The 'metrics' field must be an array of metric 
specifications",
+                suggestions=[
+                    "Wrap metrics in array: 'metrics': [{'name': 'sales', "
+                    "'aggregate': 'SUM'}]",
+                ],
+                error_code="INVALID_METRICS_FORMAT",
+            )
+
+        return None
+
+    def extract_column_refs(self, config: Any) -> list[ColumnRef]:
+        if not isinstance(config, PivotTableChartConfig):
+            return []
+        refs: list[ColumnRef] = list(config.rows)
+        refs.extend(config.metrics)
+        if config.columns:
+            refs.extend(config.columns)
+        if config.filters:
+            for f in config.filters:
+                refs.append(ColumnRef(name=f.column))
+        return refs
+
+    def to_form_data(
+        self, config: Any, dataset_id: int | str | None = None
+    ) -> dict[str, Any]:
+        return map_pivot_table_config(config)
+
+    def generate_name(self, config: Any, dataset_name: str | None = None) -> 
str:
+        what = _pivot_table_what(config)
+        context = _summarize_filters(config.filters)
+        return self._with_context(what, context)
+
+    def resolve_viz_type(self, config: Any) -> str:
+        return "pivot_table_v2"
+
+    def normalize_column_refs(self, config: Any, dataset_context: Any) -> Any:
+        config_dict = config.model_dump()
+
+        def _norm_col_list(key: str) -> None:
+            if config_dict.get(key):
+                for col in config_dict[key]:
+                    col["name"] = DatasetValidator._get_canonical_column_name(
+                        col["name"], dataset_context
+                    )
+
+        _norm_col_list("rows")
+        _norm_col_list("metrics")
+        _norm_col_list("columns")

Review Comment:
   **Suggestion:** Metrics are normalized with a helper that prefers matching 
dataset columns before metrics, so a saved metric that case-insensitively 
collides with a column name can be rewritten to the column identifier and stop 
resolving as a saved metric. Handle metric normalization with metric-aware 
precedence (or branch on `saved_metric`) to preserve metric identity. [logic 
error]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Pivot table charts can fail on valid saved metrics.
   - ⚠️ Column-first canonicalization corrupts saved metric identity.
   - ⚠️ Users get runtime metric-not-found errors after validation.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Consider a dataset where a physical column and a saved metric share the 
same
   case-insensitive name but differ in canonical casing, e.g. available_columns 
includes
   {"name": "Revenue"} and available_metrics includes {"name": "REVENUE"} as 
built in
   DatasetValidator._get_dataset_context at
   `superset/mcp_service/chart/validation/dataset_validator.py:197-252`.
   
   2. A client calls the MCP `generate_chart` tool at
   `superset/mcp_service/chart/tool/generate_chart.py:95` with a pivot_table 
config
   referencing the saved metric using a lowercased name, e.g. `"metrics": 
[{"name":
   "revenue", "saved_metric": true}]` in `PivotTableChartConfig`
   (`superset/mcp_service/chart/schemas.py:243-261`).
   
   3. ValidationPipeline.validate_request_with_warnings at
   `superset/mcp_service/chart/validation/pipeline.py:90-55` runs
   DatasetValidator.validate_against_dataset at `dataset_validator.py:55-72`, 
which calls
   `_validate_saved_metrics` at `dataset_validator.py:485-46`; this checks
   `col_ref.name.lower()` against `available_metrics` lowercased, so 
`"revenue"` is accepted
   as a valid saved metric in spite of the ambiguous column name.
   
   4. In the column-normalization layer, 
ValidationPipeline._normalize_column_names at
   `validation/pipeline.py:137-181` calls 
DatasetValidator.normalize_column_names at
   `dataset_validator.py:94-141`, which looks up the PivotTableChartPlugin and 
invokes its
   normalize_column_refs at 
`superset/mcp_service/chart/plugins/pivot_table.py:120-134`;
   `_norm_col_list("metrics")` at lines 123-131 runs 
`_get_canonical_column_name` at
   `dataset_validator.py:46-77` for each metric, and because that helper checks
   `available_columns` before `available_metrics`, the shared lowercase name 
`"revenue"` is
   canonicalized to the column's name `"Revenue"` instead of the saved metric's 
canonical
   name `"REVENUE"`.
   
   5. The now-normalized config is used in `generate_chart` to build Superset 
form_data via
   map_config_to_form_data at `chart_utils.py:93-131`, which routes to
   PivotTableChartPlugin.to_form_data at `pivot_table.py:107-110`; 
`map_pivot_table_config`
   at `superset/mcp_service/chart/chart_utils.py:841-871` builds the 
`"metrics"` list by
   calling `create_metric_object(col)` at `chart_utils.py:11-51`, which returns 
the string
   `"Revenue"` for the saved metric because `col.saved_metric` is still True.
   
   6. At query execution time, Superset resolves metric strings via 
`metrics_by_name` (see
   docstring at `chart_utils.py:491-499` and access pattern at
   `superset/connectors/sqla/models.py:1758-38`); this dict is keyed by the 
canonical saved
   metric name `"REVENUE"`, so the rewritten `"Revenue"` key is missing, causing
   `QueryObjectValidationError("Metric '%(metric)s' does not exist")` and 
breaking pivot
   table chart execution for datasets where saved metric names collide 
case-insensitively
   with column names.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=1badcaf515e440168bb0dcac688e5a28&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=1badcaf515e440168bb0dcac688e5a28&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/plugins/pivot_table.py
   **Line:** 123:132
   **Comment:**
        *Logic Error: Metrics are normalized with a helper that prefers 
matching dataset columns before metrics, so a saved metric that 
case-insensitively collides with a column name can be rewritten to the column 
identifier and stop resolving as a saved metric. Handle metric normalization 
with metric-aware precedence (or branch on `saved_metric`) to preserve metric 
identity.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=48ccfc1cb5281dff153f0dba0498d86ec39f73064840381f85af6a1739c7ad33&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39922&comment_hash=48ccfc1cb5281dff153f0dba0498d86ec39f73064840381f85af6a1739c7ad33&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat(mcp): chart type plugin registry for extensible generate_chart [superset]

Reply via email to