codeant-ai-for-open-source[bot] commented on code in PR #40328:
URL: https://github.com/apache/superset/pull/40328#discussion_r3280730045


##########
superset/mcp_service/dashboard/schemas.py:
##########
@@ -974,3 +1171,58 @@ def serialize_dashboard_object(dashboard: Any) -> 
DashboardInfo:
             else [],
         )
     )
+
+
+def _sanitize_dashboard_layout_for_llm_context(
+    layout: DashboardLayout,
+) -> DashboardLayout:
+    """Wrap layout text fields before LLM exposure."""
+    payload = layout.model_dump(mode="python")
+    payload["dashboard_title"] = sanitize_for_llm_context(
+        payload.get("dashboard_title"),
+        field_path=("dashboard_title",),
+    )
+    payload["tabs"] = [
+        {
+            **tab,
+            "name": sanitize_for_llm_context(
+                tab.get("name"),
+                field_path=("tabs", str(index), "name"),
+            ),
+        }
+        for index, tab in enumerate(payload.get("tabs", []))
+    ]
+    payload["charts"] = [
+        {
+            **chart,
+            "slice_name": sanitize_for_llm_context(
+                chart.get("slice_name"),
+                field_path=("charts", str(index), "slice_name"),
+            ),
+            "tab_path": [
+                sanitize_for_llm_context(
+                    name,
+                    field_path=("charts", str(index), "tab_path", 
str(part_index)),
+                )
+                for part_index, name in enumerate(chart.get("tab_path", []) or 
[])
+            ],
+        }
+        for index, chart in enumerate(payload.get("charts", []))
+    ]
+    return DashboardLayout.model_validate(payload)
+
+
+def dashboard_layout_serializer(dashboard: "Dashboard") -> DashboardLayout:
+    """Serialize a Dashboard model to a parsed DashboardLayout."""
+    position_json_str = getattr(dashboard, "position_json", None)
+    tabs, charts = _extract_layout_from_position(position_json_str)
+    return _sanitize_dashboard_layout_for_llm_context(
+        DashboardLayout(
+            id=dashboard.id,
+            dashboard_title=dashboard.dashboard_title or "Untitled",
+            uuid=str(dashboard.uuid) if dashboard.uuid else None,
+            tabs=tabs,
+            charts=charts,
+            has_layout=bool(position_json_str),

Review Comment:
   **Suggestion:** `has_layout` is derived from `bool(position_json_str)`, so 
malformed or whitespace-only `position_json` values are reported as `true` even 
when parsing returns no tabs/charts. This creates an inconsistent contract 
(`has_layout=true` with empty layout data) and can mislead callers into 
treating broken layouts as valid. Compute `has_layout` from successful parsed 
layout state instead (for example, based on 
`_parse_position_json`/`_extract_layout_from_position` results). [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ MCP get_dashboard_layout may report layouts for invalid JSON.
   - ⚠️ Agents misinterpret corrupted layouts as valid, confusing responses.
   - ⚠️ Callers must infer validity from empty tabs/charts manually.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Note the layout extraction logic in
   `superset/mcp_service/dashboard/schemas.py:_extract_layout_from_position` 
(around lines
   829–885 in the PR), which calls `_parse_position_json(position_json_str)` 
and returns
   `(tabs, charts)`; when `position_json_str` is invalid JSON or not a dict,
   `_parse_position_json` returns `None` and `_extract_layout_from_position` 
returns `tabs =
   []`, `charts = []`.
   
   2. Observe the serializer `dashboard_layout_serializer` in
   `superset/mcp_service/dashboard/schemas.py` (lines 1215–1228), which builds a
   `DashboardLayout` with `tabs` and `charts` from
   `_extract_layout_from_position(position_json_str)` but sets
   `has_layout=bool(position_json_str)` (line 1226), based solely on the raw 
string's
   truthiness.
   
   3. The unit test `test_extract_layout_handles_invalid_json` in
   
`superset/tests/unit_tests/mcp_service/dashboard/tool/test_get_dashboard_layout.py:87-90`
   demonstrates that invalid `position_json` values like `"{ not json"` are 
expected inputs:
   `_extract_layout_from_position("{ not json")` returns `tabs == []` and 
`charts == []`,
   proving the code anticipates invalid JSON in the DB.
   
   4. In a real run, if a dashboard row has `position_json` set to a non-empty 
but invalid
   JSON string (as in step 3), calling the MCP tool `get_dashboard_layout` 
(registered in
   `superset/mcp_service/app.py:609` and implemented in
   `superset/mcp_service/dashboard/tool/get_dashboard_layout.py:55-92`) will 
invoke
   `dashboard_layout_serializer`; this will emit a `DashboardLayout` where 
`tabs == []`,
   `charts == []`, but `has_layout` is `True` (because 
`bool(position_json_str)` is `True`
   for any non-empty string), creating an inconsistent contract that reports a 
layout as
   present even though parsing produced no usable layout data.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=012ceeb50216409c96f7dbbba9592bef&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=012ceeb50216409c96f7dbbba9592bef&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/dashboard/schemas.py
   **Line:** 1226:1226
   **Comment:**
        *Logic Error: `has_layout` is derived from `bool(position_json_str)`, 
so malformed or whitespace-only `position_json` values are reported as `true` 
even when parsing returns no tabs/charts. This creates an inconsistent contract 
(`has_layout=true` with empty layout data) and can mislead callers into 
treating broken layouts as valid. Compute `has_layout` from successful parsed 
layout state instead (for example, based on 
`_parse_position_json`/`_extract_layout_from_position` results).
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40328&comment_hash=f8bf83dc1e1e2b61a40f6dc3605fed981020b4898e3ad020415748fa59bddfd1&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40328&comment_hash=f8bf83dc1e1e2b61a40f6dc3605fed981020b4898e3ad020415748fa59bddfd1&reaction=dislike'>👎</a>



##########
superset/mcp_service/dashboard/tool/get_dashboard_layout.py:
##########
@@ -0,0 +1,122 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Get dashboard layout FastMCP tool
+
+Companion to get_dashboard_info: returns the parsed dashboard layout
+(tabs and chart positions) extracted from position_json. Use this
+when get_dashboard_info's omitted_fields hint indicates position_json
+was stripped and structured layout data is needed for analysis.
+"""
+
+import logging
+from datetime import datetime, timezone
+
+from fastmcp import Context
+from sqlalchemy.orm import subqueryload
+from superset_core.mcp.decorators import tool, ToolAnnotations
+
+from superset.extensions import event_logger
+from superset.mcp_service.dashboard.schemas import (
+    dashboard_layout_serializer,
+    DashboardError,
+    DashboardLayout,
+    GetDashboardLayoutRequest,
+)
+from superset.mcp_service.mcp_core import ModelGetInfoCore
+
+logger = logging.getLogger(__name__)
+
+
+@tool(
+    tags=["discovery"],
+    class_permission_name="Dashboard",
+    annotations=ToolAnnotations(
+        title="Get dashboard layout",
+        readOnlyHint=True,
+        destructiveHint=False,
+    ),
+)
+async def get_dashboard_layout(
+    request: GetDashboardLayoutRequest, ctx: Context
+) -> DashboardLayout | DashboardError:
+    """
+    Get parsed dashboard layout by ID, UUID, or slug.
+
+    Returns the tabs and chart positions extracted from the dashboard's
+    position_json. get_dashboard_info omits position_json to keep responses
+    small; call this tool when you need the structured layout (e.g. to
+    explain which charts live under which tab, or to locate a chart by
+    its parent tab).
+
+    Example usage:
+    ```json
+    {
+        "identifier": 123
+    }
+    ```
+    """
+    await ctx.info("Retrieving dashboard layout: identifier=%s" % 
(request.identifier,))
+
+    try:
+        from superset.daos.dashboard import DashboardDAO
+        from superset.models.dashboard import Dashboard
+
+        eager_options = [subqueryload(Dashboard.slices)]

Review Comment:
   **Suggestion:** The tool eagerly loads `Dashboard.slices`, but the 
serializer only reads `id`, `dashboard_title`, `uuid`, and `position_json`. 
This adds an unnecessary extra relationship load for every request and can 
significantly increase query cost and memory usage on large dashboards. Remove 
this eager load for this tool path. [performance]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ Each get_dashboard_layout call eagerly loads all dashboard slices.
   - ⚠️ Large dashboards incur unnecessary extra SQL queries.
   - ⚠️ Increased memory usage for unused `Dashboard.slices` collections.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. The MCP tool `get_dashboard_layout` is registered and exposed in
   `superset/mcp_service/app.py` at line 609 (`get_dashboard_layout` is 
included in the tools
   tuple) and implemented in
   `superset/mcp_service/dashboard/tool/get_dashboard_layout.py:55-92`; MCP 
clients call this
   tool to fetch parsed dashboard layout information.
   
   2. In `get_dashboard_layout` (file
   `superset/mcp_service/dashboard/tool/get_dashboard_layout.py`, lines 76-92), 
the code
   imports `DashboardDAO` and `Dashboard`, then defines `eager_options =
   [subqueryload(Dashboard.slices)]` (line 80) and passes this list as 
`query_options` when
   constructing `ModelGetInfoCore` (lines 83-21).
   
   3. `ModelGetInfoCore` in `superset/mcp_service/mcp_core.py` uses 
`self.query_options` when
   building the base query: `_base_filtered_query` (lines 51-69) calls
   `query.options(*self.query_options)` if any query options are provided. 
Passing
   `subqueryload(Dashboard.slices)` forces SQLAlchemy to execute additional 
queries and fully
   load `Dashboard.slices` for each dashboard returned.
   
   4. The serializer configured for this tool, `dashboard_layout_serializer` in
   `superset/mcp_service/dashboard/schemas.py:36-49`, only accesses 
`dashboard.id`,
   `dashboard.dashboard_title`, `dashboard.uuid`, and 
`dashboard.position_json`, and never
   touches `dashboard.slices`; thus, every call to `get_dashboard_layout` pays 
the cost of
   eagerly loading all chart slices (extra SQL and memory) without using them, 
which is
   particularly wasteful for dashboards with many charts, as demonstrated by 
the tool-level
   tests in
   
`superset/tests/unit_tests/mcp_service/dashboard/tool/test_get_dashboard_layout.py`
 that
   exercise `get_dashboard_layout` as the main entrypoint.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=9e3583acd2b944ba91e963d144bbc374&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=9e3583acd2b944ba91e963d144bbc374&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/dashboard/tool/get_dashboard_layout.py
   **Line:** 80:80
   **Comment:**
        *Performance: The tool eagerly loads `Dashboard.slices`, but the 
serializer only reads `id`, `dashboard_title`, `uuid`, and `position_json`. 
This adds an unnecessary extra relationship load for every request and can 
significantly increase query cost and memory usage on large dashboards. Remove 
this eager load for this tool path.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40328&comment_hash=a21064661809464f91125028c88293e836112ac76a6d5d9c81a8297920ae7231&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40328&comment_hash=a21064661809464f91125028c88293e836112ac76a6d5d9c81a8297920ae7231&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to