eudaimos opened a new issue, #40733:
URL: https://github.com/apache/superset/issues/40733

   ### Bug description
   
   Multiple `superset.mcp_service` tools return a `2 validation errors for UserInfo` Pydantic error in their response when the calling user has Superset roles assigned (e.g., `Admin` and `Gamma`). The Pydantic `UserInfo` model in the MCP response schema expects role names as strings, but the code populates the field with the raw `Role` ORM objects, so validation fails when the response is serialized.
   
   ### Environment
   
   - Apache Superset image: 
`apachesuperset.docker.scarf.sh/apache/superset:6.1.0rc1`
   - MCP server: `superset.mcp_service` running via `fastmcp` `streamable-http` 
transport (in-process with the Flask app), started from the same image as a 
sidecar container.
   - Calling user has two Superset roles assigned: `Admin` and `Gamma`.
   
   ### How to reproduce
   
   Any of the following tool calls reliably produce the error:
   
   1. `get_dataset_info` against any dataset:
      ```json
      {"identifier": <dataset_id>}
      ```
   2. `get_chart_info` against any chart:
      ```json
      {"identifier": <chart_id>}
      ```
   3. `generate_chart` for a table chart with aggregates set:
      ```json
      {
        "dataset_id": <id>,
        "save_chart": true,
        "config": {
          "chart_type": "table",
          "columns": [{"name": "col", "aggregate": "MAX"}]
        }
      }
      ```
   
   Each call returns:
   
   ```
   2 validation errors for UserInfo
   roles.0
     Input should be a valid string [type=string_type, input_value=Admin, input_type=Role]
       For further information visit https://errors.pydantic.dev/2.11/v/string_type
   roles.1
     Input should be a valid string [type=string_type, input_value=Gamma, input_type=Role]
       For further information visit https://errors.pydantic.dev/2.11/v/string_type
   ```
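
The error shape can be reproduced outside Superset with a few lines of Pydantic. The sketch below is a minimal stand-in, not the actual Superset code: `FakeRole`, `UserInfoSketch`, and `role_error_types` are hypothetical names, and the real `UserInfo` schema presumably has more fields. It assumes Pydantic v2, as the `errors.pydantic.dev/2.11` links in the output suggest.

```python
from pydantic import BaseModel, ValidationError


class FakeRole:
    """Hypothetical stand-in for Superset's Role ORM object."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __repr__(self) -> str:
        return self.name


class UserInfoSketch(BaseModel):
    """Hypothetical stand-in for the MCP UserInfo response schema."""

    username: str
    roles: list[str]


def role_error_types(roles: list) -> list[str]:
    """Return the Pydantic error types raised when building the model."""
    try:
        UserInfoSketch(username="alice", roles=roles)
    except ValidationError as exc:
        return [err["type"] for err in exc.errors()]
    return []
```

Under these assumptions, each non-string entry in `roles` yields its own `string_type` error, which matches the `roles.0` and `roles.1` entries in the output above.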
   
   ### Expected behavior
   
   The MCP response model should accept the user's role names. The simplest fix 
is to coerce each `Role` ORM object to its `.name` (string) before populating 
`UserInfo.roles`, e.g.:
   
   ```python
   UserInfo(
       ...,
       roles=[role.name if hasattr(role, "name") else role for role in user.roles],
   )
   ```
   
   …or change the `UserInfo` Pydantic model to accept `list[str | Role]` with a 
validator that extracts `.name`.
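
A rough sketch of the validator variant, assuming Pydantic v2 and its `field_validator` decorator with `mode="before"`. `FakeRole` and `UserInfoWithCoercion` are hypothetical names used for illustration only; the real `UserInfo` model and `Role` class live in Superset and may differ. Here the field stays `list[str]` and the validator normalizes the input before type checking, which avoids importing the ORM class into the schema module.

```python
from pydantic import BaseModel, field_validator


class FakeRole:
    """Hypothetical stand-in for Superset's Role ORM object."""

    def __init__(self, name: str) -> None:
        self.name = name


class UserInfoWithCoercion(BaseModel):
    """Hypothetical variant of the UserInfo schema that tolerates Role objects."""

    username: str
    roles: list[str]

    @field_validator("roles", mode="before")
    @classmethod
    def _roles_to_names(cls, value):
        # Runs before type validation, so ORM objects can be reduced to strings.
        if not isinstance(value, (list, tuple, set)):
            return value  # Let Pydantic report the type error as usual.
        # Objects with a `.name` attribute become that name; plain strings pass through.
        return [getattr(item, "name", item) for item in value]
```

With this in place, `UserInfoWithCoercion(username="alice", roles=[FakeRole("Admin"), "Gamma"])` validates, and `roles` holds plain strings.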
   
   ### Actual behavior
   
   Pydantic validation fails on the response model and the tool returns an 
error response — even when the underlying operation succeeded.
   
   ### Notable side effect
   
   `generate_chart` with `save_chart: true` and `aggregate` set on columns 
**actually saves the chart to the Superset database successfully** — the error 
is purely in serializing the response back to the MCP client. Verified by 
listing charts after the failed-looking call:
   
   ```json
   {"id": 278, "slice_name": "BC Q1 — ... (gold)", "viz_type": "table"}
   ```
   
   …and confirming `get_chart_data` returns the expected rows. So the bug produces a false-negative response: the client thinks the call failed and may retry, but the side effect (chart creation) has already happened.
   
   This makes the error particularly confusing: the only way to know the chart 
was saved is to call `list_charts` after every failed `generate_chart` and 
check whether the chart name now exists.
   
   ### Impact
   
   - `get_dataset_info` is effectively broken for users with roles assigned (observed here with two roles); there's no other MCP path to introspect a dataset's column metadata.
   - `get_chart_info` is effectively broken for the same reason.
   - `generate_chart` works but reports failure on success, causing retry 
storms and confusing both human users and LLM clients that interpret the 
response.
   
   Workaround: have an LLM client treat the `UserInfo` validation error as 
"success but response invalid", and always verify by calling `list_charts` / 
`list_datasets` after the affected operations. This is brittle and shouldn't be 
required.
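
For illustration, the client-side check described in the workaround could be sketched as below. This is not part of Superset or any MCP client library: `is_userinfo_serialization_error`, `chart_saved_despite_error`, and the `list_charts` callable are hypothetical, and the sketch assumes `list_charts` returns dictionaries with a `slice_name` key, as in the listing shown earlier.

```python
from typing import Callable, Iterable, Mapping


def is_userinfo_serialization_error(message: str) -> bool:
    """Heuristic match on the error text reported in this issue."""
    return "validation error" in message and "for UserInfo" in message


def chart_saved_despite_error(
    error_message: str,
    slice_name: str,
    list_charts: Callable[[], Iterable[Mapping[str, object]]],
) -> bool:
    """Return True if the failed-looking call actually created the chart.

    `list_charts` is a hypothetical callable wrapping the MCP `list_charts`
    tool; it is only invoked when the error looks like the UserInfo bug.
    """
    if not is_userinfo_serialization_error(error_message):
        return False  # A different failure: do not assume the chart exists.
    return any(chart.get("slice_name") == slice_name for chart in list_charts())
```

A client would call `chart_saved_despite_error` before retrying `generate_chart`, and skip the retry when it returns `True`. Matching on chart names is fragile if names are not unique, which is part of why this workaround is brittle.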


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

