codeant-ai-for-open-source[bot] commented on code in PR #38416:
URL: https://github.com/apache/superset/pull/38416#discussion_r2930413744


##########
superset/mcp_service/dashboard/tool/list_dashboards.py:
##########
@@ -108,34 +124,49 @@ def _serialize_dashboard(
         """Serialize dashboard object (field filtering handled by 
model_serializer)."""
         return serialize_dashboard_object(obj)
 
-    tool = ModelListCore(
-        dao_class=DashboardDAO,
-        output_schema=DashboardInfo,
-        item_serializer=_serialize_dashboard,
-        filter_type=DashboardFilter,
-        default_columns=DEFAULT_DASHBOARD_COLUMNS,
-        search_columns=[
-            "dashboard_title",
-            "slug",
-            "uuid",
-        ],
-        list_field_name="dashboards",
-        output_list_schema=DashboardList,
-        all_columns=all_columns,
-        sortable_columns=DASHBOARD_SORTABLE_COLUMNS,
-        logger=logger,
-    )
-
-    with event_logger.log_context(action="mcp.list_dashboards.query"):
-        result = tool.run_tool(
-            filters=request.filters,
-            search=request.search,
-            select_columns=request.select_columns,
-            order_column=request.order_column,
-            order_direction=request.order_direction,
-            page=max(request.page - 1, 0),
-            page_size=request.page_size,
+    # Two-pass approach when sorting by popularity_score
+    if request.order_column == "popularity_score":
+        with 
event_logger.log_context(action="mcp.list_dashboards.popularity_sort"):
+            result = _list_dashboards_by_popularity(
+                request, DashboardDAO, _serialize_dashboard, all_columns, ctx
+            )
+    else:
+        list_core = ModelListCore(
+            dao_class=DashboardDAO,
+            output_schema=DashboardInfo,
+            item_serializer=_serialize_dashboard,
+            filter_type=DashboardFilter,
+            default_columns=DEFAULT_DASHBOARD_COLUMNS,
+            search_columns=DASHBOARD_SEARCH_COLUMNS,
+            list_field_name="dashboards",
+            output_list_schema=DashboardList,
+            all_columns=all_columns,
+            sortable_columns=DASHBOARD_SORTABLE_COLUMNS,
+            logger=logger,
         )
+
+        # Strip computed fields before passing to DAO query
+        dao_columns = request.select_columns
+        if dao_columns:
+            dao_columns = [c for c in dao_columns if c != "popularity_score"]

Review Comment:
   **Suggestion:** When `popularity_score` is requested via `select_columns`, 
the DAO query removes it but does not ensure `id` is still loaded. Since score 
attachment depends on object IDs, rows loaded without `id` cannot be scored and 
the response will miss/empty `popularity_score`. Keep `id` in `dao_columns` 
whenever `popularity_score` is requested. [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ list_dashboards cannot reliably return requested popularity_score.
   - ⚠️ MCP ranking clients get incomplete dashboard relevance data.
   ```
   </details>
   
   ```suggestion
           dao_columns = request.select_columns
           if dao_columns:
               dao_columns = [c for c in dao_columns if c != "popularity_score"]
               if "popularity_score" in request.select_columns and "id" not in 
dao_columns:
                   dao_columns = list(dao_columns) + ["id"]
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Call MCP tool `list_dashboards` with
   `select_columns=["dashboard_title","popularity_score"]` (tool is registered 
in
   `superset/mcp_service/app.py:406` and invoked in tests via
   `client.call_tool("list_dashboards", ...)` at
   
`tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py:102-105`).
   
   2. In `superset/mcp_service/dashboard/tool/list_dashboards.py:149-157`, 
`popularity_score`
   is stripped and `ModelListCore.run_tool` receives 
`select_columns=["dashboard_title"]`.
   
   3. `ModelListCore.run_tool` forwards `columns=columns_to_load` to DAO
   (`superset/mcp_service/mcp_core.py:165-174`); DAO then queries only 
requested columns
   (`superset/daos/base.py:643-645`), so rows may not contain `id`. Serializer 
sets
   `id=getattr(..., "id", None)` 
(`superset/mcp_service/dashboard/schemas.py:528`), yielding
   `id=None`.
   
   4. Score attachment path
   (`superset/mcp_service/dashboard/tool/list_dashboards.py:165-168`) skips 
rows with `id is
   None`, so scores are not mapped. Final response filtering uses 
`result.columns_requested`
   (`list_dashboards.py:178-185`, `dashboard/schemas.py:378`), so requested 
popularity
   metadata is not reliably returned.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/dashboard/tool/list_dashboards.py
   **Line:** 149:151
   **Comment:**
        *Logic Error: When `popularity_score` is requested via 
`select_columns`, the DAO query removes it but does not ensure `id` is still 
loaded. Since score attachment depends on object IDs, rows loaded without `id` 
cannot be scored and the response will miss/empty `popularity_score`. Keep `id` 
in `dao_columns` whenever `popularity_score` is requested.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=2450534cd6e37e892eec8df57fcfddc76ff28bd232cda84a61f26f2870e77b2b&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=2450534cd6e37e892eec8df57fcfddc76ff28bd232cda84a61f26f2870e77b2b&reaction=dislike'>👎</a>



##########
superset/mcp_service/dataset/tool/list_datasets.py:
##########
@@ -116,32 +129,50 @@ def _serialize_dataset(
             """Serialize dataset (filtering via model_serializer)."""
             return serialize_dataset_object(obj)
 
-        # Create tool with standard serialization
-        tool = ModelListCore(
-            dao_class=DatasetDAO,
-            output_schema=DatasetInfo,
-            item_serializer=_serialize_dataset,
-            filter_type=DatasetFilter,
-            default_columns=DEFAULT_DATASET_COLUMNS,
-            search_columns=["schema", "sql", "table_name", "uuid"],
-            list_field_name="datasets",
-            output_list_schema=DatasetList,
-            all_columns=all_columns,
-            sortable_columns=DATASET_SORTABLE_COLUMNS,
-            logger=logger,
-        )
-
-        with event_logger.log_context(action="mcp.list_datasets.query"):
-            result = tool.run_tool(
-                filters=request.filters,
-                search=request.search,
-                select_columns=request.select_columns,
-                order_column=request.order_column,
-                order_direction=request.order_direction,
-                page=max(request.page - 1, 0),
-                page_size=request.page_size,
+        # Two-pass approach when sorting by popularity_score
+        if request.order_column == "popularity_score":
+            with 
event_logger.log_context(action="mcp.list_datasets.popularity_sort"):
+                result = _list_datasets_by_popularity(
+                    request, DatasetDAO, _serialize_dataset, all_columns, ctx
+                )
+        else:
+            # Create tool with standard serialization
+            list_core = ModelListCore(
+                dao_class=DatasetDAO,
+                output_schema=DatasetInfo,
+                item_serializer=_serialize_dataset,
+                filter_type=DatasetFilter,
+                default_columns=DEFAULT_DATASET_COLUMNS,
+                search_columns=DATASET_SEARCH_COLUMNS,
+                list_field_name="datasets",
+                output_list_schema=DatasetList,
+                all_columns=all_columns,
+                sortable_columns=DATASET_SORTABLE_COLUMNS,
+                logger=logger,
             )
 
+            # Strip computed fields before passing to DAO query
+            dao_columns = request.select_columns
+            if dao_columns:
+                dao_columns = [c for c in dao_columns if c != 
"popularity_score"]
+
+            with event_logger.log_context(action="mcp.list_datasets.query"):
+                result = list_core.run_tool(
+                    filters=request.filters,
+                    search=request.search,
+                    select_columns=dao_columns,
+                    order_column=request.order_column,
+                    order_direction=request.order_direction,
+                    page=max(request.page - 1, 0),
+                    page_size=request.page_size,
+                )
+
+            # Attach popularity scores if requested in select_columns
+            if request.select_columns and "popularity_score" in 
request.select_columns:
+                if ds_ids := [d.id for d in result.datasets if d.id is not 
None]:
+                    scores = compute_dataset_popularity(ds_ids)
+                    attach_popularity_scores(result.datasets, scores)

Review Comment:
   **Suggestion:** When `select_columns` includes `popularity_score` (without 
sorting by it), the code strips that column before calling `run_tool`, and 
`run_tool` then overwrites `columns_requested` without `popularity_score`. 
Serialization filters by `columns_requested`, so the computed score is silently 
dropped from the response. Preserve the original requested columns after 
attaching scores. [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ⚠️ `list_datasets` omits explicitly requested `popularity_score`.
   - ❌ MCP ranking workflows lose returned score visibility.
   ```
   </details>
   
   ```suggestion
               # Attach popularity scores if requested in select_columns
               if request.select_columns and "popularity_score" in 
request.select_columns:
                   if ds_ids := [d.id for d in result.datasets if d.id is not 
None]:
                       scores = compute_dataset_popularity(ds_ids)
                       attach_popularity_scores(result.datasets, scores)
                   result.columns_requested = request.select_columns
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Invoke MCP tool `list_datasets` (tool is registered in
   `superset/mcp_service/app.py:408-410`; tests call it via
   `Client.call_tool("list_datasets", ...)` at
   `tests/unit_tests/mcp_service/dataset/tool/test_dataset_tools.py:50-52`).
   
   2. Send request with `select_columns=["id","table_name","popularity_score"]` 
and
   non-popularity ordering (code path in
   `superset/mcp_service/dataset/tool/list_datasets.py:133-168` enters 
non-popularity
   branch).
   
   3. In that branch, `popularity_score` is removed before DAO call 
(`dao_columns = [c for c
   in dao_columns if c != "popularity_score"]` at `list_datasets.py:155-157`), 
and
   `ModelListCore.run_tool()` stores stripped columns as `columns_requested`
   (`superset/mcp_service/mcp_core.py:154-160,210`).
   
   4. Scores are attached (`list_datasets.py:171-174`), but final JSON 
filtering uses
   `result.columns_requested` (`list_datasets.py:186-195`), and `DatasetInfo` 
serializer
   drops non-requested fields 
(`superset/mcp_service/dataset/schemas.py:153-176`), so
   `popularity_score` is omitted from output.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/dataset/tool/list_datasets.py
   **Line:** 170:174
   **Comment:**
        *Logic Error: When `select_columns` includes `popularity_score` 
(without sorting by it), the code strips that column before calling `run_tool`, 
and `run_tool` then overwrites `columns_requested` without `popularity_score`. 
Serialization filters by `columns_requested`, so the computed score is silently 
dropped from the response. Preserve the original requested columns after 
attaching scores.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=1903195305c79ed12a141dab0d3b89a72cbfd322fcf3ffe8c252f9886db60a4e&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=1903195305c79ed12a141dab0d3b89a72cbfd322fcf3ffe8c252f9886db60a4e&reaction=dislike'>👎</a>



##########
superset/mcp_service/chart/tool/list_charts.py:
##########
@@ -106,34 +129,50 @@ def _serialize_chart(
         """Serialize chart object (field filtering handled by 
model_serializer)."""
         return serialize_chart_object(cast(ChartLike | None, obj))
 
-    tool = ModelListCore(
-        dao_class=ChartDAO,
-        output_schema=ChartInfo,
-        item_serializer=_serialize_chart,
-        filter_type=ChartFilter,
-        default_columns=DEFAULT_CHART_COLUMNS,
-        search_columns=[
-            "slice_name",
-            "description",
-        ],
-        list_field_name="charts",
-        output_list_schema=ChartList,
-        all_columns=all_columns,
-        sortable_columns=CHART_SORTABLE_COLUMNS,
-        logger=logger,
-    )
-
     try:
-        with event_logger.log_context(action="mcp.list_charts.query"):
-            result = tool.run_tool(
-                filters=request.filters,
-                search=request.search,
-                select_columns=request.select_columns,
-                order_column=request.order_column,
-                order_direction=request.order_direction,
-                page=max(request.page - 1, 0),
-                page_size=request.page_size,
+        # Two-pass approach when sorting by popularity_score
+        if request.order_column == "popularity_score":
+            with 
event_logger.log_context(action="mcp.list_charts.popularity_sort"):
+                result = _list_charts_by_popularity(
+                    request, ChartDAO, _serialize_chart, all_columns, ctx
+                )
+        else:
+            list_core = ModelListCore(
+                dao_class=ChartDAO,
+                output_schema=ChartInfo,
+                item_serializer=_serialize_chart,
+                filter_type=ChartFilter,
+                default_columns=DEFAULT_CHART_COLUMNS,
+                search_columns=CHART_SEARCH_COLUMNS,
+                list_field_name="charts",
+                output_list_schema=ChartList,
+                all_columns=all_columns,
+                sortable_columns=CHART_SORTABLE_COLUMNS,
+                logger=logger,
             )
+
+            # Strip computed fields before passing to DAO query
+            dao_columns = request.select_columns
+            if dao_columns:
+                dao_columns = [c for c in dao_columns if c != 
"popularity_score"]
+
+            with event_logger.log_context(action="mcp.list_charts.query"):
+                result = list_core.run_tool(
+                    filters=request.filters,
+                    search=request.search,
+                    select_columns=dao_columns,
+                    order_column=request.order_column,
+                    order_direction=request.order_direction,
+                    page=max(request.page - 1, 0),
+                    page_size=request.page_size,
+                )
+
+            # Attach popularity scores if requested in select_columns
+            if request.select_columns and "popularity_score" in 
request.select_columns:
+                if chart_ids := [c.id for c in result.charts if c.id is not 
None]:
+                    scores = compute_chart_popularity(chart_ids)
+                    attach_popularity_scores(result.charts, scores)

Review Comment:
   **Suggestion:** The response field filtering is based on 
`result.columns_requested`, but this value is built from `dao_columns` after 
`popularity_score` is stripped out. When clients request `select_columns` 
including `popularity_score`, the score is computed and attached, but then 
filtered out of the JSON output. Preserve the original requested columns for 
response filtering while still removing computed fields from the DAO query. 
[logic error]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ⚠️ list_charts drops requested popularity_score column.
   - ❌ MCP relevance ranking loses chart score signal.
   ```
   </details>
   
   ```suggestion
               # Strip computed fields before passing to DAO query
                           requested_columns = request.select_columns or 
DEFAULT_CHART_COLUMNS
                           dao_columns = [c for c in requested_columns if c != 
"popularity_score"]
   
                           with 
event_logger.log_context(action="mcp.list_charts.query"):
                               result = list_core.run_tool(
                                   filters=request.filters,
                                   search=request.search,
                                   select_columns=dao_columns,
                                   order_column=request.order_column,
                                   order_direction=request.order_direction,
                                   page=max(request.page - 1, 0),
                                   page_size=request.page_size,
                               )
   
                           # Keep serialization filtering aligned with the 
actual request
                           result.columns_requested = list(requested_columns)
   
                           # Attach popularity scores if requested in 
select_columns
                           if "popularity_score" in requested_columns:
                               if chart_ids := [c.id for c in result.charts if 
c.id is not None]:
                                   scores = compute_chart_popularity(chart_ids)
                                   attach_popularity_scores(result.charts, 
scores)
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Start MCP service where `list_charts` is registered by import in
   `superset/mcp_service/app.py:393-399` (`from superset.mcp_service.chart.tool 
import ...
   list_charts`).
   
   2. Call MCP tool `list_charts` with
   `select_columns=["id","slice_name","popularity_score"]` and a non-popularity 
sort (so code
   follows normal branch in 
`superset/mcp_service/chart/tool/list_charts.py:140-174`).
   
   3. In that branch, `popularity_score` is removed before DAO call at
   `list_charts.py:155-163`; then `ModelListCore.run_tool()` sets 
`columns_requested` from
   that stripped list (`superset/mcp_service/mcp_core.py:158-162,210`).
   
   4. Code still computes and attaches the score (`list_charts.py:171-174`), 
but final
   serialization uses `result.columns_requested` (`list_charts.py:184-191`), 
and `ChartInfo`
   serializer drops fields not in `select_columns` context
   (`superset/mcp_service/chart/schemas.py:153-155`), so `popularity_score` is 
omitted from
   JSON.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/chart/tool/list_charts.py
   **Line:** 154:174
   **Comment:**
        *Logic Error: The response field filtering is based on 
`result.columns_requested`, but this value is built from `dao_columns` after 
`popularity_score` is stripped out. When clients request `select_columns` 
including `popularity_score`, the score is computed and attached, but then 
filtered out of the JSON output. Preserve the original requested columns for 
response filtering while still removing computed fields from the DAO query.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=3bb52c7d283ba9d486909b756b9820852f1591b224c9d711bdfd463999791971&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=3bb52c7d283ba9d486909b756b9820852f1591b224c9d711bdfd463999791971&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to