Re: [PR] feat(mcp): chart type plugin registry for extensible generate_chart [superset]

via GitHub Mon, 01 Jun 2026 11:56:18 -0700


gabotorresruiz commented on code in PR #39922:
URL: https://github.com/apache/superset/pull/39922#discussion_r3331063305



##########
superset/mcp_service/chart/plugin.py:
##########
@@ -0,0 +1,255 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+ChartTypePlugin protocol and BaseChartPlugin base class.
+
+Each chart type owns its pre-validation, column extraction, form_data mapping,
+and post-map validation in a single plugin class. This eliminates the previous
+pattern of 4 separate dispatch points (schema_validator.py, dataset_validator.py,
+chart_utils.py, pipeline.py) that had to be updated in sync whenever a new chart
+type was added.
+"""
+
+from __future__ import annotations
+
+from typing import Any, Protocol, runtime_checkable
+
+from superset.mcp_service.chart.schemas import ColumnRef
+from superset.mcp_service.common.error_schemas import ChartGenerationError
+
+
+@runtime_checkable
+class ChartTypePlugin(Protocol):
+    """
+    Protocol that every chart-type plugin must satisfy.
+
+    Implementing all nine methods in a single class guarantees that adding a
+    new chart type requires only one new file — the plugin — rather than edits
+    across multiple separate files.
+    """
+
+    #: Discriminator value matching ChartConfig's chart_type field.
+    chart_type: str
+
+    #: Human-readable name shown to users (e.g. "Line / Bar / Area / Scatter").
+    display_name: str
+
+    #: Maps every Superset-internal viz_type this plugin can produce to a
+    #: user-facing display name, e.g. {"echarts_timeseries_line": "Line Chart"}.
+    #: Used by the registry to resolve display names for existing charts without
+    #: needing a separate JSON mapping file.
+    native_viz_types: dict[str, str]
+
+    def pre_validate(
+        self,
+        config: dict[str, Any],
+    ) -> ChartGenerationError | None:
+        """
+        Early validation of the raw config dict before Pydantic parsing.
+
+        Called by SchemaValidator before attempting to parse the request.
+        Should check that required top-level keys are present and well-typed.
+
+        Returns None if valid, ChartGenerationError if invalid.
+        """
+        ...
+
+    def extract_column_refs(
+        self,
+        config: Any,
+    ) -> list[ColumnRef]:
+        """
+        Extract all column references from a parsed chart config.
+
+        Called by DatasetValidator to validate that all referenced columns exist
+        in the dataset. Must cover every field that holds a column name,
+        including filters.
+
+        Returns a list of ColumnRef objects (may be empty).
+        """
+        ...
+
+    def to_form_data(
+        self,
+        config: Any,
+        dataset_id: int | str | None = None,
+    ) -> dict[str, Any]:
+        """
+        Map a parsed chart config to Superset's internal form_data dict.
+
+        Replaces the if/elif chain in chart_utils.map_config_to_form_data().
+
+        Returns a Superset form_data dict ready for caching and rendering.
+        """
+        ...
+
+    def post_map_validate(
+        self,
+        config: Any,
+        form_data: dict[str, Any],
+        dataset_id: int | str | None = None,
+    ) -> ChartGenerationError | None:
+        """
+        Validate the mapped form_data after to_form_data() runs.
+
+        Use this for cross-field constraints that can only be checked once
+        form_data is assembled (e.g. BigNumber trendline requires a temporal
+        column whose type must be verified against the dataset).
+
+        Returns None if valid, ChartGenerationError if invalid.
+        """
+        ...
+
+    def normalize_column_refs(
+        self,
+        config: Any,
+        dataset_context: Any,
+    ) -> Any:
+        """
+        Return a new config with column names normalized to canonical dataset casing.
+
+        Called by DatasetValidator.normalize_column_names(). The default
+        implementation (in BaseChartPlugin) returns the config unchanged; plugins
+        with column fields override this to fix case sensitivity mismatches.
+
+        Returns a new config object (or the original if no normalization needed).
+        """
+        ...
+
+    def get_runtime_warnings(
+        self,
+        config: Any,
+        dataset_id: int | str,
+    ) -> list[str]:
+        """
+        Return chart-type-specific runtime warnings (performance, compatibility).
+
+        Called by RuntimeValidator to collect per-type warnings. Warnings are
+        informational only — they never block chart generation. The default
+        implementation returns an empty list; plugins override this to emit
+        chart-type-specific warnings (e.g. XY cardinality checks).
+
+        Returns a list of warning message strings (may be empty).
+        """
+        ...
+
+    def generate_name(
+        self,
+        config: Any,
+        dataset_name: str | None = None,
+    ) -> str:
+        """
+        Return a descriptive chart name for the given config.
+
+        Called by chart_utils.generate_chart_name(). The name should follow
+        the standard format conventions documented in that function. Plugins
+        that do not override this return the generic fallback "Chart".
+        """
+        ...
+
+    def resolve_viz_type(self, config: Any) -> str:
+        """
+        Return the Superset-internal viz_type string for this config.
+
+        Called by chart_utils._resolve_viz_type(). The returned string must
+        match a registered Superset viz plugin (e.g. "echarts_timeseries_line").
+        Plugins that do not override this return "unknown".
+        """
+        ...
+
+    def schema_error_hint(self) -> ChartGenerationError | None:
+        """
+        Return a user-friendly error for Pydantic discriminated-union parse failures.
+
+        Called by SchemaValidator when Pydantic cannot parse the config union and
+        the chart_type is known. Returning None falls back to the generic error.
+        """
+        ...
+
+
+class BaseChartPlugin:
+    """
+    Base class providing sensible defaults for all ChartTypePlugin methods.
+
+    Concrete plugins extend this and override only what they need.
+    """
+
+    chart_type: str = ""
+    display_name: str = ""
+    native_viz_types: dict[str, str] = {}
+
+    def pre_validate(
+        self,
+        config: dict[str, Any],
+    ) -> ChartGenerationError | None:
+        return None
+
+    def extract_column_refs(
+        self,
+        config: Any,
+    ) -> list[ColumnRef]:
+        return []
+
+    def to_form_data(
+        self,
+        config: Any,
+        dataset_id: int | str | None = None,
+    ) -> dict[str, Any]:
+        raise NotImplementedError(
+            f"{self.__class__.__name__}.to_form_data() is not implemented"
+        )
+
+    def post_map_validate(
+        self,
+        config: Any,
+        form_data: dict[str, Any],
+        dataset_id: int | str | None = None,
+    ) -> ChartGenerationError | None:
+        return None
+
+    def normalize_column_refs(

Review Comment:
   Future-direction note, not a blocker: every plugin's override of
`normalize_column_refs` and `extract_column_refs` follows the same ~25-line pattern
(model_dump, branch on saved_metric, canonicalize, normalize filters,
model_validate). With 7 plugins that's ~200 lines of near-identical code, and
every future chart type pays the same cost. A declarative `column_fields` spec
on `BaseChartPlugin` could let the base class handle the loop once. Worth
considering if the registry gains more plugins.
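
   A minimal sketch of what such a declarative spec might look like. It works on
   plain dicts (the shape `model_dump` would produce) so it runs without Pydantic.
   The class name, the `name` / `saved_metric` / `filters[].column` keys, and both
   method names are assumptions for illustration, not the PR's actual schema or API.

```python
from __future__ import annotations

import copy
from collections.abc import Iterator
from typing import Any


class DeclarativeColumnsPlugin:
    """Hypothetical base class: subclasses only list the keys that hold columns."""

    # Config keys whose value is one column dict or a list of column dicts.
    column_fields: tuple[str, ...] = ()

    def _column_dicts(self, data: dict[str, Any]) -> Iterator[dict[str, Any]]:
        for key in self.column_fields:
            value = data.get(key)
            if value is None:
                continue
            for item in (value if isinstance(value, list) else [value]):
                # Assumed convention: saved metrics name a metric, not a
                # dataset column, so they are skipped.
                if not item.get("saved_metric"):
                    yield item

    def extract_column_names(self, data: dict[str, Any]) -> list[str]:
        """Collect column names from the declared fields, then from filters."""
        names = [item["name"] for item in self._column_dicts(data)]
        names.extend(flt["column"] for flt in data.get("filters") or [])
        return names

    def normalize_column_names(
        self, data: dict[str, Any], dataset_columns: list[str]
    ) -> dict[str, Any]:
        """Return a copy of data with column names in the dataset's casing."""
        canonical = {name.lower(): name for name in dataset_columns}
        result = copy.deepcopy(data)
        for item in self._column_dicts(result):
            item["name"] = canonical.get(item["name"].lower(), item["name"])
        for flt in result.get("filters") or []:
            flt["column"] = canonical.get(flt["column"].lower(), flt["column"])
        return result


class XYPlugin(DeclarativeColumnsPlugin):
    # The only per-plugin code left: which fields hold column references.
    column_fields = ("x", "y", "group_by")
```

   With this shape, a new chart type would declare its `column_fields` tuple and
   inherit both the extraction and the normalization loop from the base class.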



##########
superset/mcp_service/chart/registry.py:
##########
@@ -0,0 +1,256 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+ChartTypeRegistry — central registry mapping chart_type strings to plugins.
+
+Replaces the four previously-scattered dispatch locations:
+  - schema_validator.py: chart_type_validators dict
+  - dataset_validator.py: isinstance branches in _extract_column_references()
+  - chart_utils.py: if/elif chain in map_config_to_form_data()
+  - dataset_validator.py: isinstance branches in normalize_column_names()
+
+Usage::
+
+    from superset.mcp_service.chart.registry import get_registry
+
+    plugin = get_registry().get("xy")
+    if plugin is None:
+        raise ValueError("Unknown chart type: xy")
+    form_data = plugin.to_form_data(config, dataset_id)
+"""
+
+from __future__ import annotations
+
+import logging
+import sys
+import threading
+from collections.abc import Callable, Iterable
+from dataclasses import dataclass, field
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    from superset.mcp_service.chart.plugin import ChartTypePlugin
+
+logger = logging.getLogger(__name__)
+
+_REGISTRY: dict[str, "ChartTypePlugin"] = {}
+_plugins_loaded = False
+_plugins_load_failed = False
+_plugins_lock = threading.RLock()
+
+# ---------------------------------------------------------------------------
+# Plugin filter — replaced atomically by configure() at app startup.
+# Default: all registered plugins visible (no disabled set, no callable).
+# ---------------------------------------------------------------------------
+
+PluginEnabledFunc = Callable[[str], bool]
+
+
+@dataclass(frozen=True)
+class _PluginFilterConfig:
+    disabled_plugins: frozenset[str] = field(default_factory=frozenset)
+    enabled_func: PluginEnabledFunc | None = None
+
+
+_filter_config: _PluginFilterConfig = _PluginFilterConfig()
+
+
+def _ensure_plugins_loaded() -> None:
+    """Lazily import the plugins package to populate _REGISTRY.
+
+    Called before every registry lookup so the registry is always populated,
+    even when callers (tests, chart_utils, validators) import this module
+    directly without first importing app.py.
+    """
+    global _plugins_loaded, _plugins_load_failed
+    if _plugins_loaded or _plugins_load_failed:
+        return
+    with _plugins_lock:
+        if not _plugins_loaded and not _plugins_load_failed:
+            registry_before_import = dict(_REGISTRY)
+            try:
+                import superset.mcp_service.chart.plugins  # noqa: F401
+
+                _plugins_loaded = True
+            except Exception:  # noqa: BLE001 — plugin import may raise anything
+                _REGISTRY.clear()
+                _REGISTRY.update(registry_before_import)
+                _plugins_load_failed = True
+                logger.exception(
+                    "Failed to load built-in chart type plugins; "
+                    "further lookups will return None"
+                )
+
+
+def configure(
+    disabled: Iterable[str] | None = None,
+    enabled_func: PluginEnabledFunc | None = None,
+) -> None:
+    """Set runtime plugin filters. Called once during app initialization.
+
+    Replaces the filter config atomically with a single assignment so concurrent
+    readers always observe a consistent (disabled_plugins, enabled_func) pair.
+
+    Args:
+        disabled: chart_type strings to suppress. Accepts any iterable (set,
+            frozenset, list, tuple). Ignored when enabled_func is provided.
+        enabled_func: callable(chart_type) -> bool.  When set, overrides
+            ``disabled``.  Must be cheap and in-process — no network I/O per
+            call.  On exception the registry fails *closed* (plugin hidden).
+    """
+    global _filter_config
+
+    if enabled_func is not None and not callable(enabled_func):
+        raise TypeError("enabled_func must be callable or None")
+
+    new_config = _PluginFilterConfig(
+        disabled_plugins=frozenset(disabled or ()),
+        enabled_func=enabled_func,
+    )
+    _filter_config = new_config
+
+    if new_config.disabled_plugins:
+        logger.info(
+            "MCP chart plugins disabled: %s", sorted(new_config.disabled_plugins)
+        )
+    if new_config.enabled_func is not None:
+        logger.info(
+            "MCP chart plugin dynamic filter configured: %r", new_config.enabled_func
+        )
+
+
+def _is_plugin_enabled(chart_type: str) -> bool:
+    """Return True if the plugin is currently enabled (not filtered out)."""
+    config = _filter_config  # read once — atomic reference in CPython
+    if config.enabled_func is not None:
+        try:
+            return bool(config.enabled_func(chart_type))
+        except Exception:  # noqa: BLE001 — operator-supplied callable may raise anything
+            logger.warning(
+                "MCP_CHART_PLUGIN_ENABLED_FUNC raised for chart_type=%r; "
+                "failing closed (plugin hidden)",
+                chart_type,
+                exc_info=True,
+            )
+            return False
+    return chart_type not in config.disabled_plugins
+
+
+def register(plugin: "ChartTypePlugin") -> None:
+    """Register a chart type plugin in the global registry."""
+    if not plugin.chart_type:
+        raise ValueError(f"{type(plugin).__name__} must define a non-empty chart_type")
+    with _plugins_lock:
+        if plugin.chart_type in _REGISTRY:
+            logger.warning(
+                "Overwriting existing plugin for chart_type=%r", plugin.chart_type
+            )
+        _REGISTRY[plugin.chart_type] = plugin
+    logger.debug("Registered chart plugin: %r", plugin.chart_type)
+
+
+def get(chart_type: str) -> "ChartTypePlugin | None":
+    """Return the plugin for chart_type, or None if unknown or disabled."""
+    _ensure_plugins_loaded()
+    if chart_type not in _REGISTRY or not _is_plugin_enabled(chart_type):
+        return None
+    return _REGISTRY[chart_type]
+
+
+def all_types() -> list[str]:
+    """Return enabled registered chart type strings in insertion order."""
+    _ensure_plugins_loaded()
+    return [ct for ct in _REGISTRY if _is_plugin_enabled(ct)]
+
+
+def is_registered(chart_type: str) -> bool:
+    """Return True if chart_type has a registered plugin, regardless of enabled state.
+
+    Use this to distinguish an unknown chart type from a disabled one.
+    Use is_enabled() to check whether the plugin is currently available.
+    """
+    _ensure_plugins_loaded()
+    return chart_type in _REGISTRY
+
+
+def is_enabled(chart_type: str) -> bool:
+    """Return True if chart_type is registered AND currently enabled."""
+    _ensure_plugins_loaded()
+    return chart_type in _REGISTRY and _is_plugin_enabled(chart_type)
+
+
+def display_name_for_viz_type(viz_type: str) -> str | None:
+    """Return the user-facing display name for a Superset-internal viz_type.
+
+    Searches every registered plugin's ``native_viz_types`` mapping.
+    Returns None if no plugin recognises the viz_type.
+
+    Example::
+
+        display_name_for_viz_type("echarts_timeseries_line")  # "Line Chart"
+        display_name_for_viz_type("pivot_table_v2")           # "Pivot Table"
+        display_name_for_viz_type("unknown_type")             # None
+    """
+    _ensure_plugins_loaded()
+    for plugin in _REGISTRY.values():

Review Comment:
   If two plugins ever declare the same `viz_type` in their `native_viz_types`,
the first one in iteration order silently wins and the other is unreachable here.
None of the 7 built-in plugins collide today, but it's worth catching this at
`register()` time: check the new plugin's `native_viz_types` keys against
already-registered ones and `logger.warning` (or raise) on collision. Cheap
insurance for future plugin authors.



##########
superset/initialization/__init__.py:
##########
@@ -776,6 +776,7 @@ def init_app(self) -> None:
         # conditionally
         self.configure_feature_flags()
         self.check_guest_token_secret()
+        self.configure_mcp_chart_registry()

Review Comment:
   `flask_singleton.py` re-runs `registry.configure(...)` after the MCP config
overlay, so this earlier call is effectively overwritten in the MCP path. Both
calls produce the same values today because `get_mcp_config()` round-trips the
operator overrides, but maintaining two configure sites is a footgun if either
side changes how config keys are merged. Consider keeping only the call in
`flask_singleton.py`, since that's where the MCP-specific config is finalized.
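
   A rough sketch of the single-call-site idea: one helper that reads the
   finalized config and calls `configure` exactly once. The helper name and
   `MCP_CHART_PLUGINS_DISABLED` are hypothetical. `MCP_CHART_PLUGIN_ENABLED_FUNC`
   appears in a `registry.py` log message, and treating it as the config key is
   an assumption.

```python
from __future__ import annotations

from collections.abc import Callable, Mapping
from typing import Any

# Hypothetical key for the disabled set; not confirmed by the PR.
DISABLED_KEY = "MCP_CHART_PLUGINS_DISABLED"
# Name taken from the registry's log message; assumed to be the config key.
ENABLED_FUNC_KEY = "MCP_CHART_PLUGIN_ENABLED_FUNC"


def configure_registry_from_config(
    config: Mapping[str, Any],
    configure: Callable[..., None],
) -> None:
    """Apply the plugin filters from the already-merged config, once."""
    configure(
        disabled=config.get(DISABLED_KEY),
        enabled_func=config.get(ENABLED_FUNC_KEY),
    )
```

   Called only from the place where the MCP config is finalized, this would leave
   a single spot to update if the key-merging rules change.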



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

