codeant-ai-for-open-source[bot] commented on code in PR #39916:
URL: https://github.com/apache/superset/pull/39916#discussion_r3198685089
##########
superset/mcp_service/chart/ascii_charts.py:
##########
@@ -79,7 +79,11 @@ def _generate_ascii_bar_chart(data: list[Any], width: int,
height: int) -> str:
label_val = None
for _key, val in row.items():
- if isinstance(val, (int, float)) and numeric_val is None:
+ if (
+ isinstance(val, (int, float))
+ and not _is_nan_value(val)
+ and numeric_val is None
+ ):
Review Comment:
**Suggestion:** This numeric extraction condition now accepts booleans as
numeric values (`True`/`False` are subclasses of `int`) and, because of the
`numeric_val is None` gate, it can lock onto a boolean field and ignore the
real metric in the same row. In datasets where a boolean column appears before
the metric column, bars will be computed from 0/1 instead of the actual values.
Exclude booleans from numeric detection (for example, require `not
isinstance(val, bool)`) before assigning `numeric_val`. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ MCP get_chart_preview bar charts show 0/1 instead of metrics.
- ⚠️ ASCII bar/column previews misrepresent numeric values for users.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure any Superset chart whose `viz_type` is `"bar"`, `"column"`, or
`"echarts_timeseries_bar"` (these are routed to the bar renderer in
`superset/mcp_service/chart/ascii_charts.py:50-51`) using a dataset where
the first column
in the query result row is BOOLEAN (e.g. `is_active`) and the second is a
numeric metric
(e.g. `total_sales`).
2. Use the MCP `get_chart_preview` tool, which executes
`ChartPreviewStrategy.generate()`
in `superset/mcp_service/chart/tool/get_chart_preview.py:3-35`. That method
runs a
`ChartDataCommand`, obtains `data = result["queries"][0].get("data", [])` at
`get_chart_preview.py:26-28`, and then calls `ascii_chart =
generate_ascii_chart(data,
self.chart.viz_type or "table", ...)` at `get_chart_preview.py:30-35`.
3. Inside `generate_ascii_chart` in
`superset/mcp_service/chart/ascii_charts.py:34-53`,
the `"bar"`/`"column"`/`"echarts_timeseries_bar"` chart types are dispatched
to
`_generate_ascii_bar_chart(data, width, height)` at `ascii_charts.py:50-51`.
4. In `_generate_ascii_bar_chart` (`ascii_charts.py:66-93`), each result row
dict is
iterated in insertion order (`for _key, val in row.items():` at line 81).
For a row like
`{"is_active": True, "total_sales": 100.0}`, the first `val` is the boolean
`True`. The
condition at lines 82-86:
`if (isinstance(val, (int, float)) and not _is_nan_value(val) and
numeric_val is
None):`
evaluates to True because `True` is an instance of `int` and
`_is_nan_value(True)`
(implemented at `ascii_charts.py:23-27`) returns False. This sets
`numeric_val = True`
(1) at line 87, and because `numeric_val` is no longer `None`, the actual
metric
`total_sales` is ignored. The `values` list appended at lines 91-93 thus
contains 0/1
flags instead of real metric values, and the horizontal/vertical bar
chart produced by
`_generate_horizontal_bar_chart`/`_generate_vertical_bar_chart` uses
these incorrect
0/1-based values in the MCP ASCII preview.
```
</details>
[Fix in
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt=This%20is%20a%20comment%20left%20during%20a%20code%20review.%0A%0A%2A%2APath%3A%2A%2A%20superset%2Fmcp_service%2Fchart%2Fascii_charts.py%0A%2A%2ALine%3A%2A%2A%2082%3A86%0A%2A%2AComment%3A%2A%2A%0A%09%2ALogic%20Error%3A%20This%20numeric%20extraction%20condition%20now%20accepts%20booleans%20as%20numeric%20values%20%28%60True%60%2F%60False%60%20are%20subclasses%20of%20%60int%60%29%20and%2C%20because%20of%20the%20%60numeric_val%20is%20None%60%20gate%2C%20it%20can%20lock%20onto%20a%20boolean%20field%20and%20ignore%20the%20real%20metric%20in%20the%20same%20row.%20In%20datasets%20where%20a%20boolean%20column%20appears%20before%20the%20metric%20column%2C%20bars%20will%20be%20computed%20from%200%2F1%20instead%20of%20the%20actual%20values.%20Exclude%20booleans%20from%20numeric%20detection%20%28for%20example%2C%20require%20%60not%20isinstance%28val%2C%20bool%29%60%29%20before%20assigning%20%60numeric_val%60.%0A%0AValidate%2
0the%20correctness%20of%20the%20flagged%20issue.%20If%20correct%2C%20How%20can%20I%20resolve%20this%3F%20If%20you%20propose%20a%20fix%2C%20implement%20it%20and%20please%20make%20it%20concise.%0AOnce%20fix%20is%20implemented%2C%20also%20check%20other%20comments%20on%20the%20same%20PR%2C%20and%20ask%20user%20if%20the%20user%20wants%20to%20fix%20the%20rest%20of%20the%20comments%20as%20well.%20if%20said%20yes%2C%20then%20fetch%20all%20the%20comments%20validate%20the%20correctness%20and%20implement%20a%20minimal%20fix%0A)
| [Fix in VSCode
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt=This%20is%20a%20comment%20left%20during%20a%20code%20review.%0A%0A%2A%2APath%3A%2A%2A%20superset%2Fmcp_service%2Fchart%2Fascii_charts.py%0A%2A%2ALine%3A%2A%2A%2082%3A86%0A%2A%2AComment%3A%2A%2A%0A%09%2ALogic%20Error%3A%20This%20numeric%20extraction%20condition%20now%20accepts%20booleans%20as%20numeric%20values%20%28%60True%60%2F%60False%60%20are%20subclasses%20of%20%60int%60%29%20and%2C
%20because%20of%20the%20%60numeric_val%20is%20None%60%20gate%2C%20it%20can%20lock%20onto%20a%20boolean%20field%20and%20ignore%20the%20real%20metric%20in%20the%20same%20row.%20In%20datasets%20where%20a%20boolean%20column%20appears%20before%20the%20metric%20column%2C%20bars%20will%20be%20computed%20from%200%2F1%20instead%20of%20the%20actual%20values.%20Exclude%20booleans%20from%20numeric%20detection%20%28for%20example%2C%20require%20%60not%20isinstance%28val%2C%20bool%29%60%29%20before%20assigning%20%60numeric_val%60.%0A%0AValidate%20the%20correctness%20of%20the%20flagged%20issue.%20If%20correct%2C%20How%20can%20I%20resolve%20this%3F%20If%20you%20propose%20a%20fix%2C%20implement%20it%20and%20please%20make%20it%20concise.%0AOnce%20fix%20is%20implemented%2C%20also%20check%20other%20comments%20on%20the%20same%20PR%2C%20and%20ask%20user%20if%20the%20user%20wants%20to%20fix%20the%20rest%20of%20the%20comments%20as%20well.%20if%20said%20yes%2C%20then%20fetch%20all%20the%20comments%20validate
%20the%20correctness%20and%20implement%20a%20minimal%20fix%0A)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/ascii_charts.py
**Line:** 82:86
**Comment:**
*Logic Error: This numeric extraction condition now accepts booleans as
numeric values (`True`/`False` are subclasses of `int`) and, because of the
`numeric_val is None` gate, it can lock onto a boolean field and ignore the
real metric in the same row. In datasets where a boolean column appears before
the metric column, bars will be computed from 0/1 instead of the actual values.
Exclude booleans from numeric detection (for example, require `not
isinstance(val, bool)`) before assigning `numeric_val`.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39916&comment_hash=9cdd8908b30bb4d594881b9a411500ae4417df932bd8135914fefe52d8d5df04&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39916&comment_hash=9cdd8908b30bb4d594881b9a411500ae4417df932bd8135914fefe52d8d5df04&reaction=dislike'>👎</a>
##########
superset/mcp_service/chart/ascii_charts.py:
##########
@@ -274,7 +284,11 @@ def _extract_time_series_data(data: list[Any]) ->
tuple[list[float], list[str]]:
label_val = None
for key, val in row.items():
- if isinstance(val, (int, float)) and numeric_val is None:
+ if (
+ isinstance(val, (int, float))
+ and not _is_nan_value(val)
+ and numeric_val is None
+ ):
Review Comment:
**Suggestion:** The same extraction logic in time-series parsing can also
select boolean columns as numeric points and then stop at the first match due
`numeric_val is None`, which can replace the intended metric series with 0/1
values when boolean fields come first in row order. This breaks
line/time-series charts for valid mixed-schema datasets. Exclude booleans from
the numeric branch so only true numeric measures are used. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ MCP get_chart_preview line charts use boolean 0/1 values.
- ⚠️ Time-series ASCII previews misrepresent metric trends for users.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a Superset chart whose `viz_type` is `"line"` or
`"echarts_timeseries_line"`
(routed to the line renderer in
`superset/mcp_service/chart/ascii_charts.py:52-53`) using
a dataset whose query returns rows with a BOOLEAN column first (e.g.
`is_active`) and a
numeric metric second (e.g. `total_sales`) along with a temporal label
column (e.g.
`date`).
2. Invoke the MCP `get_chart_preview` tool so that
`ChartPreviewStrategy.generate()` in
`superset/mcp_service/chart/tool/get_chart_preview.py:3-35` runs
`ChartDataCommand`,
populates `data = result["queries"][0].get("data", [])` at lines 26-28, and
calls
`generate_ascii_chart(data, self.chart.viz_type or "table", ...)` at lines
30-35.
3. In `generate_ascii_chart`
(`superset/mcp_service/chart/ascii_charts.py:34-54`), when
`chart_type` is `"line"` or `"echarts_timeseries_line"`, control flows to
`_generate_ascii_line_chart(data, width, height)` as seen at line 52.
`_generate_ascii_line_chart` then calls `values, labels =
_extract_time_series_data(data)`
at `ascii_charts.py:253`.
4. Inside `_extract_time_series_data` (`ascii_charts.py:275-307`), each row
dict is
iterated (`for key, val in row.items():` at line 286). For a row such as
`{"is_active":
True, "total_sales": 100.0, "date": "2024-01-01"}`, the first `val` is the
boolean `True`.
The numeric-branch condition at lines 287-291:
`if (isinstance(val, (int, float)) and not _is_nan_value(val) and
numeric_val is
None):`
evaluates to True because `True` is an `int` and `_is_nan_value(True)`
returns False
(implementation at `ascii_charts.py:23-27` uses
`math.isnan(float(value))`). This sets
`numeric_val = True` at line 292 and prevents later numeric fields from
being
considered (`numeric_val is None` gate). As a result, the `values` list
appended at
lines 303-305 contains 0/1 boolean flags instead of the intended metric
(e.g.
`total_sales`), so the line chart drawn by `_create_enhanced_line_chart`
(`ascii_charts.py:310-378`) and returned to the MCP `get_chart_preview`
caller has an
incorrect time-series shape.
```
</details>
[Fix in
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt=This%20is%20a%20comment%20left%20during%20a%20code%20review.%0A%0A%2A%2APath%3A%2A%2A%20superset%2Fmcp_service%2Fchart%2Fascii_charts.py%0A%2A%2ALine%3A%2A%2A%20287%3A291%0A%2A%2AComment%3A%2A%2A%0A%09%2ALogic%20Error%3A%20The%20same%20extraction%20logic%20in%20time-series%20parsing%20can%20also%20select%20boolean%20columns%20as%20numeric%20points%20and%20then%20stop%20at%20the%20first%20match%20due%20%60numeric_val%20is%20None%60%2C%20which%20can%20replace%20the%20intended%20metric%20series%20with%200%2F1%20values%20when%20boolean%20fields%20come%20first%20in%20row%20order.%20This%20breaks%20line%2Ftime-series%20charts%20for%20valid%20mixed-schema%20datasets.%20Exclude%20booleans%20from%20the%20numeric%20branch%20so%20only%20true%20numeric%20measures%20are%20used.%0A%0AValidate%20the%20correctness%20of%20the%20flagged%20issue.%20If%20correct%2C%20How%20can%20I%20resolve%20this%3F%20If%20you%20propose%20a%20fix%2
C%20implement%20it%20and%20please%20make%20it%20concise.%0AOnce%20fix%20is%20implemented%2C%20also%20check%20other%20comments%20on%20the%20same%20PR%2C%20and%20ask%20user%20if%20the%20user%20wants%20to%20fix%20the%20rest%20of%20the%20comments%20as%20well.%20if%20said%20yes%2C%20then%20fetch%20all%20the%20comments%20validate%20the%20correctness%20and%20implement%20a%20minimal%20fix%0A)
| [Fix in VSCode
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt=This%20is%20a%20comment%20left%20during%20a%20code%20review.%0A%0A%2A%2APath%3A%2A%2A%20superset%2Fmcp_service%2Fchart%2Fascii_charts.py%0A%2A%2ALine%3A%2A%2A%20287%3A291%0A%2A%2AComment%3A%2A%2A%0A%09%2ALogic%20Error%3A%20The%20same%20extraction%20logic%20in%20time-series%20parsing%20can%20also%20select%20boolean%20columns%20as%20numeric%20points%20and%20then%20stop%20at%20the%20first%20match%20due%20%60numeric_val%20is%20None%60%2C%20which%20can%20replace%20the%20intended%20metric%20series%20with%200%2F1%20values%20w
hen%20boolean%20fields%20come%20first%20in%20row%20order.%20This%20breaks%20line%2Ftime-series%20charts%20for%20valid%20mixed-schema%20datasets.%20Exclude%20booleans%20from%20the%20numeric%20branch%20so%20only%20true%20numeric%20measures%20are%20used.%0A%0AValidate%20the%20correctness%20of%20the%20flagged%20issue.%20If%20correct%2C%20How%20can%20I%20resolve%20this%3F%20If%20you%20propose%20a%20fix%2C%20implement%20it%20and%20please%20make%20it%20concise.%0AOnce%20fix%20is%20implemented%2C%20also%20check%20other%20comments%20on%20the%20same%20PR%2C%20and%20ask%20user%20if%20the%20user%20wants%20to%20fix%20the%20rest%20of%20the%20comments%20as%20well.%20if%20said%20yes%2C%20then%20fetch%20all%20the%20comments%20validate%20the%20correctness%20and%20implement%20a%20minimal%20fix%0A)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/ascii_charts.py
**Line:** 287:291
**Comment:**
*Logic Error: The same extraction logic in time-series parsing can also
select boolean columns as numeric points and then stop at the first match due
`numeric_val is None`, which can replace the intended metric series with 0/1
values when boolean fields come first in row order. This breaks
line/time-series charts for valid mixed-schema datasets. Exclude booleans from
the numeric branch so only true numeric measures are used.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39916&comment_hash=2b37dfc11f193bd9dfbe452e730fc0e49f82f5cad5266fc740a6a01dd422e7ac&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39916&comment_hash=2b37dfc11f193bd9dfbe452e730fc0e49f82f5cad5266fc740a6a01dd422e7ac&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]