codeant-ai-for-open-source[bot] commented on code in PR #38416:
URL: https://github.com/apache/superset/pull/38416#discussion_r2930413744
##########
superset/mcp_service/dashboard/tool/list_dashboards.py:
##########
@@ -108,34 +124,49 @@ def _serialize_dashboard(
"""Serialize dashboard object (field filtering handled by
model_serializer)."""
return serialize_dashboard_object(obj)
- tool = ModelListCore(
- dao_class=DashboardDAO,
- output_schema=DashboardInfo,
- item_serializer=_serialize_dashboard,
- filter_type=DashboardFilter,
- default_columns=DEFAULT_DASHBOARD_COLUMNS,
- search_columns=[
- "dashboard_title",
- "slug",
- "uuid",
- ],
- list_field_name="dashboards",
- output_list_schema=DashboardList,
- all_columns=all_columns,
- sortable_columns=DASHBOARD_SORTABLE_COLUMNS,
- logger=logger,
- )
-
- with event_logger.log_context(action="mcp.list_dashboards.query"):
- result = tool.run_tool(
- filters=request.filters,
- search=request.search,
- select_columns=request.select_columns,
- order_column=request.order_column,
- order_direction=request.order_direction,
- page=max(request.page - 1, 0),
- page_size=request.page_size,
+ # Two-pass approach when sorting by popularity_score
+ if request.order_column == "popularity_score":
+ with
event_logger.log_context(action="mcp.list_dashboards.popularity_sort"):
+ result = _list_dashboards_by_popularity(
+ request, DashboardDAO, _serialize_dashboard, all_columns, ctx
+ )
+ else:
+ list_core = ModelListCore(
+ dao_class=DashboardDAO,
+ output_schema=DashboardInfo,
+ item_serializer=_serialize_dashboard,
+ filter_type=DashboardFilter,
+ default_columns=DEFAULT_DASHBOARD_COLUMNS,
+ search_columns=DASHBOARD_SEARCH_COLUMNS,
+ list_field_name="dashboards",
+ output_list_schema=DashboardList,
+ all_columns=all_columns,
+ sortable_columns=DASHBOARD_SORTABLE_COLUMNS,
+ logger=logger,
)
+
+ # Strip computed fields before passing to DAO query
+ dao_columns = request.select_columns
+ if dao_columns:
+ dao_columns = [c for c in dao_columns if c != "popularity_score"]
Review Comment:
**Suggestion:** When `popularity_score` is requested via `select_columns`,
the DAO query removes it but does not ensure `id` is still loaded. Since score
attachment depends on object IDs, rows loaded without `id` cannot be scored and
the response will miss/empty `popularity_score`. Keep `id` in `dao_columns`
whenever `popularity_score` is requested. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ list_dashboards cannot reliably return requested popularity_score.
- ⚠️ MCP ranking clients get incomplete dashboard relevance data.
```
</details>
```suggestion
dao_columns = request.select_columns
if dao_columns:
dao_columns = [c for c in dao_columns if c != "popularity_score"]
if "popularity_score" in request.select_columns and "id" not in
dao_columns:
dao_columns = list(dao_columns) + ["id"]
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Call MCP tool `list_dashboards` with
`select_columns=["dashboard_title","popularity_score"]` (tool is registered
in
`superset/mcp_service/app.py:406` and invoked in tests via
`client.call_tool("list_dashboards", ...)` at
`tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py:102-105`).
2. In `superset/mcp_service/dashboard/tool/list_dashboards.py:149-157`,
`popularity_score`
is stripped and `ModelListCore.run_tool` receives
`select_columns=["dashboard_title"]`.
3. `ModelListCore.run_tool` forwards `columns=columns_to_load` to DAO
(`superset/mcp_service/mcp_core.py:165-174`); DAO then queries only
requested columns
(`superset/daos/base.py:643-645`), so rows may not contain `id`. Serializer
sets
`id=getattr(..., "id", None)`
(`superset/mcp_service/dashboard/schemas.py:528`), yielding
`id=None`.
4. Score attachment path
(`superset/mcp_service/dashboard/tool/list_dashboards.py:165-168`) skips
rows with `id is
None`, so scores are not mapped. Final response filtering uses
`result.columns_requested`
(`list_dashboards.py:178-185`, `dashboard/schemas.py:378`), so requested
popularity
metadata is not reliably returned.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/dashboard/tool/list_dashboards.py
**Line:** 149:151
**Comment:**
*Logic Error: When `popularity_score` is requested via
`select_columns`, the DAO query removes it but does not ensure `id` is still
loaded. Since score attachment depends on object IDs, rows loaded without `id`
cannot be scored and the response will miss/empty `popularity_score`. Keep `id`
in `dao_columns` whenever `popularity_score` is requested.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=2450534cd6e37e892eec8df57fcfddc76ff28bd232cda84a61f26f2870e77b2b&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=2450534cd6e37e892eec8df57fcfddc76ff28bd232cda84a61f26f2870e77b2b&reaction=dislike'>👎</a>
##########
superset/mcp_service/dataset/tool/list_datasets.py:
##########
@@ -116,32 +129,50 @@ def _serialize_dataset(
"""Serialize dataset (filtering via model_serializer)."""
return serialize_dataset_object(obj)
- # Create tool with standard serialization
- tool = ModelListCore(
- dao_class=DatasetDAO,
- output_schema=DatasetInfo,
- item_serializer=_serialize_dataset,
- filter_type=DatasetFilter,
- default_columns=DEFAULT_DATASET_COLUMNS,
- search_columns=["schema", "sql", "table_name", "uuid"],
- list_field_name="datasets",
- output_list_schema=DatasetList,
- all_columns=all_columns,
- sortable_columns=DATASET_SORTABLE_COLUMNS,
- logger=logger,
- )
-
- with event_logger.log_context(action="mcp.list_datasets.query"):
- result = tool.run_tool(
- filters=request.filters,
- search=request.search,
- select_columns=request.select_columns,
- order_column=request.order_column,
- order_direction=request.order_direction,
- page=max(request.page - 1, 0),
- page_size=request.page_size,
+ # Two-pass approach when sorting by popularity_score
+ if request.order_column == "popularity_score":
+ with
event_logger.log_context(action="mcp.list_datasets.popularity_sort"):
+ result = _list_datasets_by_popularity(
+ request, DatasetDAO, _serialize_dataset, all_columns, ctx
+ )
+ else:
+ # Create tool with standard serialization
+ list_core = ModelListCore(
+ dao_class=DatasetDAO,
+ output_schema=DatasetInfo,
+ item_serializer=_serialize_dataset,
+ filter_type=DatasetFilter,
+ default_columns=DEFAULT_DATASET_COLUMNS,
+ search_columns=DATASET_SEARCH_COLUMNS,
+ list_field_name="datasets",
+ output_list_schema=DatasetList,
+ all_columns=all_columns,
+ sortable_columns=DATASET_SORTABLE_COLUMNS,
+ logger=logger,
)
+ # Strip computed fields before passing to DAO query
+ dao_columns = request.select_columns
+ if dao_columns:
+ dao_columns = [c for c in dao_columns if c !=
"popularity_score"]
+
+ with event_logger.log_context(action="mcp.list_datasets.query"):
+ result = list_core.run_tool(
+ filters=request.filters,
+ search=request.search,
+ select_columns=dao_columns,
+ order_column=request.order_column,
+ order_direction=request.order_direction,
+ page=max(request.page - 1, 0),
+ page_size=request.page_size,
+ )
+
+ # Attach popularity scores if requested in select_columns
+ if request.select_columns and "popularity_score" in
request.select_columns:
+ if ds_ids := [d.id for d in result.datasets if d.id is not
None]:
+ scores = compute_dataset_popularity(ds_ids)
+ attach_popularity_scores(result.datasets, scores)
Review Comment:
**Suggestion:** When `select_columns` includes `popularity_score` (without
sorting by it), the code strips that column before calling `run_tool`, and
`run_tool` then overwrites `columns_requested` without `popularity_score`.
Serialization filters by `columns_requested`, so the computed score is silently
dropped from the response. Preserve the original requested columns after
attaching scores. [logic error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ⚠️ `list_datasets` omits explicitly requested `popularity_score`.
- ❌ MCP ranking workflows lose returned score visibility.
```
</details>
```suggestion
# Attach popularity scores if requested in select_columns
if request.select_columns and "popularity_score" in
request.select_columns:
if ds_ids := [d.id for d in result.datasets if d.id is not
None]:
scores = compute_dataset_popularity(ds_ids)
attach_popularity_scores(result.datasets, scores)
result.columns_requested = request.select_columns
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Invoke MCP tool `list_datasets` (tool is registered in
`superset/mcp_service/app.py:408-410`; tests call it via
`Client.call_tool("list_datasets", ...)` at
`tests/unit_tests/mcp_service/dataset/tool/test_dataset_tools.py:50-52`).
2. Send request with `select_columns=["id","table_name","popularity_score"]`
and
non-popularity ordering (code path in
`superset/mcp_service/dataset/tool/list_datasets.py:133-168` enters
non-popularity
branch).
3. In that branch, `popularity_score` is removed before DAO call
(`dao_columns = [c for c
in dao_columns if c != "popularity_score"]` at `list_datasets.py:155-157`),
and
`ModelListCore.run_tool()` stores stripped columns as `columns_requested`
(`superset/mcp_service/mcp_core.py:154-160,210`).
4. Scores are attached (`list_datasets.py:171-174`), but final JSON
filtering uses
`result.columns_requested` (`list_datasets.py:186-195`), and `DatasetInfo`
serializer
drops non-requested fields
(`superset/mcp_service/dataset/schemas.py:153-176`), so
`popularity_score` is omitted from output.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/dataset/tool/list_datasets.py
**Line:** 170:174
**Comment:**
*Logic Error: When `select_columns` includes `popularity_score`
(without sorting by it), the code strips that column before calling `run_tool`,
and `run_tool` then overwrites `columns_requested` without `popularity_score`.
Serialization filters by `columns_requested`, so the computed score is silently
dropped from the response. Preserve the original requested columns after
attaching scores.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=1903195305c79ed12a141dab0d3b89a72cbfd322fcf3ffe8c252f9886db60a4e&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=1903195305c79ed12a141dab0d3b89a72cbfd322fcf3ffe8c252f9886db60a4e&reaction=dislike'>👎</a>
##########
superset/mcp_service/chart/tool/list_charts.py:
##########
@@ -106,34 +129,50 @@ def _serialize_chart(
"""Serialize chart object (field filtering handled by
model_serializer)."""
return serialize_chart_object(cast(ChartLike | None, obj))
- tool = ModelListCore(
- dao_class=ChartDAO,
- output_schema=ChartInfo,
- item_serializer=_serialize_chart,
- filter_type=ChartFilter,
- default_columns=DEFAULT_CHART_COLUMNS,
- search_columns=[
- "slice_name",
- "description",
- ],
- list_field_name="charts",
- output_list_schema=ChartList,
- all_columns=all_columns,
- sortable_columns=CHART_SORTABLE_COLUMNS,
- logger=logger,
- )
-
try:
- with event_logger.log_context(action="mcp.list_charts.query"):
- result = tool.run_tool(
- filters=request.filters,
- search=request.search,
- select_columns=request.select_columns,
- order_column=request.order_column,
- order_direction=request.order_direction,
- page=max(request.page - 1, 0),
- page_size=request.page_size,
+ # Two-pass approach when sorting by popularity_score
+ if request.order_column == "popularity_score":
+ with
event_logger.log_context(action="mcp.list_charts.popularity_sort"):
+ result = _list_charts_by_popularity(
+ request, ChartDAO, _serialize_chart, all_columns, ctx
+ )
+ else:
+ list_core = ModelListCore(
+ dao_class=ChartDAO,
+ output_schema=ChartInfo,
+ item_serializer=_serialize_chart,
+ filter_type=ChartFilter,
+ default_columns=DEFAULT_CHART_COLUMNS,
+ search_columns=CHART_SEARCH_COLUMNS,
+ list_field_name="charts",
+ output_list_schema=ChartList,
+ all_columns=all_columns,
+ sortable_columns=CHART_SORTABLE_COLUMNS,
+ logger=logger,
)
+
+ # Strip computed fields before passing to DAO query
+ dao_columns = request.select_columns
+ if dao_columns:
+ dao_columns = [c for c in dao_columns if c !=
"popularity_score"]
+
+ with event_logger.log_context(action="mcp.list_charts.query"):
+ result = list_core.run_tool(
+ filters=request.filters,
+ search=request.search,
+ select_columns=dao_columns,
+ order_column=request.order_column,
+ order_direction=request.order_direction,
+ page=max(request.page - 1, 0),
+ page_size=request.page_size,
+ )
+
+ # Attach popularity scores if requested in select_columns
+ if request.select_columns and "popularity_score" in
request.select_columns:
+ if chart_ids := [c.id for c in result.charts if c.id is not
None]:
+ scores = compute_chart_popularity(chart_ids)
+ attach_popularity_scores(result.charts, scores)
Review Comment:
**Suggestion:** The response field filtering is based on
`result.columns_requested`, but this value is built from `dao_columns` after
`popularity_score` is stripped out. When clients request `select_columns`
including `popularity_score`, the score is computed and attached, but then
filtered out of the JSON output. Preserve the original requested columns for
response filtering while still removing computed fields from the DAO query.
[logic error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ⚠️ list_charts drops requested popularity_score column.
- ❌ MCP relevance ranking loses chart score signal.
```
</details>
```suggestion
# Strip computed fields before passing to DAO query
requested_columns = request.select_columns or
DEFAULT_CHART_COLUMNS
dao_columns = [c for c in requested_columns if c !=
"popularity_score"]
with
event_logger.log_context(action="mcp.list_charts.query"):
result = list_core.run_tool(
filters=request.filters,
search=request.search,
select_columns=dao_columns,
order_column=request.order_column,
order_direction=request.order_direction,
page=max(request.page - 1, 0),
page_size=request.page_size,
)
# Keep serialization filtering aligned with the
actual request
result.columns_requested = list(requested_columns)
# Attach popularity scores if requested in
select_columns
if "popularity_score" in requested_columns:
if chart_ids := [c.id for c in result.charts if
c.id is not None]:
scores = compute_chart_popularity(chart_ids)
attach_popularity_scores(result.charts,
scores)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Start MCP service where `list_charts` is registered by import in
`superset/mcp_service/app.py:393-399` (`from superset.mcp_service.chart.tool
import ...
list_charts`).
2. Call MCP tool `list_charts` with
`select_columns=["id","slice_name","popularity_score"]` and a non-popularity
sort (so code
follows normal branch in
`superset/mcp_service/chart/tool/list_charts.py:140-174`).
3. In that branch, `popularity_score` is removed before DAO call at
`list_charts.py:155-163`; then `ModelListCore.run_tool()` sets
`columns_requested` from
that stripped list (`superset/mcp_service/mcp_core.py:158-162,210`).
4. Code still computes and attaches the score (`list_charts.py:171-174`),
but final
serialization uses `result.columns_requested` (`list_charts.py:184-191`),
and `ChartInfo`
serializer drops fields not in `select_columns` context
(`superset/mcp_service/chart/schemas.py:153-155`), so `popularity_score` is
omitted from
JSON.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/tool/list_charts.py
**Line:** 154:174
**Comment:**
*Logic Error: The response field filtering is based on
`result.columns_requested`, but this value is built from `dao_columns` after
`popularity_score` is stripped out. When clients request `select_columns`
including `popularity_score`, the score is computed and attached, but then
filtered out of the JSON output. Preserve the original requested columns for
response filtering while still removing computed fields from the DAO query.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=3bb52c7d283ba9d486909b756b9820852f1591b224c9d711bdfd463999791971&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38416&comment_hash=3bb52c7d283ba9d486909b756b9820852f1591b224c9d711bdfd463999791971&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]