codeant-ai-for-open-source[bot] commented on code in PR #38513:
URL: https://github.com/apache/superset/pull/38513#discussion_r2910686606
##########
superset/views/core.py:
##########
@@ -206,6 +210,19 @@ def generate_json(
payload = viz_obj.get_payload()
return self.send_data_payload_response(viz_obj, payload)
+ @staticmethod
+ def _generate_xlsx(viz_obj: BaseViz) -> FlaskResponse:
+ import pandas as pd
+
+ from superset.utils.excel import df_to_excel
+
+ payload = viz_obj.get_df_payload()
+ df = payload.get("df")
+ if df is None:
+ df = pd.DataFrame()
+ xlsx_data = df_to_excel(df, index=False)
+ return XlsxResponse(xlsx_data,
headers=generate_download_headers("xlsx"))
Review Comment:
**Suggestion:** The XLSX export for legacy charts calls `df_to_excel`
directly on the raw dataframe from `viz_obj.get_df_payload()` without applying
`apply_column_types`, so if the dataframe contains timezone-aware datetimes (or
other types Excel can't handle) the `to_excel` call will raise at runtime,
whereas the newer chart data API path normalizes these columns before writing;
pulling the `coltypes` from the payload and passing the dataframe through
`apply_column_types` first will align behavior and prevent these runtime
errors. [type error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Legacy /superset/explore_json/ XLSX downloads can 500 on tz data.
- ⚠️ Legacy XLSX export diverges from /api/v1/chart/data behavior.
- ⚠️ Affects dashboard users downloading legacy chart data.
```
</details>
```suggestion
@staticmethod
def _generate_xlsx(viz_obj: BaseViz) -> FlaskResponse:
import pandas as pd
from superset.utils.excel import apply_column_types, df_to_excel
payload = viz_obj.get_df_payload()
df = payload.get("df")
if df is None:
df = pd.DataFrame()
coltypes = []
else:
coltypes = payload.get("coltypes") or []
if coltypes:
df = apply_column_types(df, coltypes)
xlsx_data = df_to_excel(df, index=False)
return XlsxResponse(xlsx_data,
headers=generate_download_headers("xlsx"))
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a dataset with a timezone-aware datetime column (e.g. a
TIMESTAMP WITH TIME
ZONE column mapped to a Superset dataset) and build a legacy visualization
using that
dataset, such as `WorldMapViz` in `superset/viz.py:1265-1326`.
2. From the Explore UI for that legacy chart (served by `Superset.explore` in
`superset/views/core.py:424-599`), use the chart menu to choose "Download as
Excel" so the
frontend issues a request to `/superset/explore_json/` with `?xlsx=true`
(the legacy
export endpoint, referenced throughout the frontend tests under
`superset-frontend/src/explore/exploreUtils/getExploreUrl.test.ts`).
3. The request is handled by `Superset.explore_json` in
`superset/views/core.py:287-369`,
which inspects `request.args`, sets `response_type` to
`ChartDataResultFormat.XLSX`,
constructs `viz_obj = get_viz(...)`, and then calls
`self.generate_json(viz_obj,
response_type)` at line 368.
4. In `Superset.generate_json` at `superset/views/core.py:190-211`, the XLSX
branch calls
`self._generate_xlsx(viz_obj)` (line 198); `_generate_xlsx` at lines 213-224
calls
`viz_obj.get_df_payload()` (implemented in `BaseViz.get_df_payload` at
`superset/viz.py:520-646`) and passes the returned `df` directly into
`df_to_excel` from
`superset/utils/excel.py:43-53` without running `apply_column_types`; if the
dataframe
contains a `DatetimeTZDtype` column (timezone-aware datetimes),
pandas/xlsxwriter raises
at `df.to_excel(...)` because Excel does not support timezone-aware
datetimes, causing the
legacy `/superset/explore_json/` XLSX download to fail with a 500 error,
whereas the
modern `/api/v1/chart/data` XLSX path normalizes via
`excel.apply_column_types` in
`QueryContextProcessor.get_data` at
`superset/common/query_context_processor.py:245-264`.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/views/core.py
**Line:** 213:224
**Comment:**
*Type Error: The XLSX export for legacy charts calls `df_to_excel`
directly on the raw dataframe from `viz_obj.get_df_payload()` without applying
`apply_column_types`, so if the dataframe contains timezone-aware datetimes (or
other types Excel can't handle) the `to_excel` call will raise at runtime,
whereas the newer chart data API path normalizes these columns before writing;
pulling the `coltypes` from the payload and passing the dataframe through
`apply_column_types` first will align behavior and prevent these runtime errors.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38513&comment_hash=ffa2f82558eb4aed63044e0d18cf8428c0a6bfddeeb695d5533431e6ce6ee7e5&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38513&comment_hash=ffa2f82558eb4aed63044e0d18cf8428c0a6bfddeeb695d5533431e6ce6ee7e5&reaction=dislike'>👎</a>
##########
superset/views/core.py:
##########
@@ -290,13 +307,13 @@ def explore_json(
response_type = response_option
break
- # Verify user has permission to export CSV file
- if (
- response_type == ChartDataResultFormat.CSV
- and not security_manager.can_access("can_csv", "Superset")
- ):
+ # Verify user has permission to export data files
+ if response_type in (
+ ChartDataResultFormat.CSV,
+ ChartDataResultFormat.XLSX,
+ ) and not security_manager.can_access("can_csv", "Superset"):
return json_error_response(
- _("You don't have the rights to download as csv"),
+ _("You don't have the rights to export data"),
status=403,
Review Comment:
**Suggestion:** The new permission check for CSV/XLSX export in
`explore_json` always uses the legacy `can_csv` permission and ignores the
`GRANULAR_EXPORT_CONTROLS` feature flag, so in deployments where granular
export controls are enabled and roles have `can_export_data` but not `can_csv`,
users will see 403 errors when exporting even though the UI indicates they are
allowed; the check should mirror the existing logic in `explore` by switching
to `can_export_data` when granular controls are enabled. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Authorized users blocked exporting via /superset/explore_json/.
- ⚠️ Permission semantics diverge from /api/v1/chart/data endpoint.
- ⚠️ UI shows download allowed but backend returns 403.
```
</details>
```suggestion
# Verify user has permission to export data files
if response_type in (
ChartDataResultFormat.CSV,
ChartDataResultFormat.XLSX,
):
if is_feature_enabled("GRANULAR_EXPORT_CONTROLS"):
can_export = security_manager.can_access("can_export_data",
"Superset")
else:
can_export = security_manager.can_access("can_csv",
"Superset")
if not can_export:
return json_error_response(
_("You don't have the rights to export data"),
status=403,
)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Enable granular export controls by setting the `GRANULAR_EXPORT_CONTROLS`
feature flag
to true (feature flag referenced at `superset/charts/data/api.py:407-415` and
`superset/views/core.py:509-515`), and create a role that has the
`can_export_data`
permission on the `Superset` view but does NOT have `can_csv`.
2. Log in as a user with that role and open a legacy chart in Explore
(handled by
`Superset.explore` in `superset/views/core.py:424-599`), where the UI-level
download
permission is determined using `slice_download_perm`; when
`GRANULAR_EXPORT_CONTROLS` is
enabled, this uses `security_manager.can_access("can_export_data",
"Superset")` as seen at
lines 509-515, so the UI will show the "Download as CSV/XLSX" options as
allowed.
3. From the chart menu, trigger a legacy CSV or XLSX download, which the
frontend (legacy
chart path; see multiple references to `/superset/explore_json/` in the
frontend tests
under `superset-frontend/src/explore/exploreUtils/exploreUtils.test.tsx` and
`plugins/legacy-preset-chart-*/src/utils/explore.ts`) sends to
`/superset/explore_json/`
with `?csv=true` or `?xlsx=true`.
4. The request is handled by `Superset.explore_json` in
`superset/views/core.py:287-369`;
after computing `response_type` from query args (lines 300-308), it executes
the
permission block at lines 310-317, which unconditionally checks
`security_manager.can_access("can_csv", "Superset")` even when
`GRANULAR_EXPORT_CONTROLS`
is enabled; because the role lacks `can_csv`, the condition fails and
`explore_json`
returns a 403 JSON error `"You don't have the rights to export data"`, while
the newer
`/api/v1/chart/data` export path in `ChartDataRestApi._send_chart_response`
(`superset/charts/data/api.py:407-416`) would have allowed the same user via
`can_export_data`.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/views/core.py
**Line:** 310:317
**Comment:**
*Logic Error: The new permission check for CSV/XLSX export in
`explore_json` always uses the legacy `can_csv` permission and ignores the
`GRANULAR_EXPORT_CONTROLS` feature flag, so in deployments where granular
export controls are enabled and roles have `can_export_data` but not `can_csv`,
users will see 403 errors when exporting even though the UI indicates they are
allowed; the check should mirror the existing logic in `explore` by switching
to `can_export_data` when granular controls are enabled.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38513&comment_hash=6b5e2e0f8dc44499632af8f882f79d0ebef102cd3939fd4cbbbc60f22226c3dd&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38513&comment_hash=6b5e2e0f8dc44499632af8f882f79d0ebef102cd3939fd4cbbbc60f22226c3dd&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]