codeant-ai-for-open-source[bot] commented on code in PR #38616:
URL: https://github.com/apache/superset/pull/38616#discussion_r2929299301
##########
tests/unit_tests/charts/test_client_processing.py:
##########
@@ -2788,3 +2788,45 @@ def
test_apply_client_processing_csv_format_default_na_behavior():
assert (
"Alice," in lines[2]
) # Second data row should have empty last_name (NA converted to null)
+
+
+@with_config({"CSV_EXPORT": {"sep": ";", "decimal": ","}})
+def test_apply_client_processing_csv_format_custom_delimiter():
+ """
+ Test that apply_client_processing respects CSV_EXPORT sep and decimal
config.
+ Without the fix, pd.read_csv() uses default comma separator and fails to
parse
+ semicolon-delimited CSV correctly, causing HTTP 500 in email reports.
+ """
+ csv_data = "name;value\nfoo;1,5\nbar;2,0"
+
+ result = {
+ "queries": [
+ {
+ "result_format": ChartDataResultFormat.CSV,
+ "data": csv_data,
+ }
+ ]
+ }
+
+ form_data = {
+ "datasource": "1__table",
+ "viz_type": "table",
+ "slice_id": 1,
+ "url_params": {},
+ "metrics": [],
+ "groupby": [],
+ "columns": ["name", "value"],
+ "extra_form_data": {},
+ "force": False,
+ "result_format": "csv",
+ "result_type": "results",
+ }
+
+ processed_result = apply_client_processing(result, form_data)
+
+ output_data = processed_result["queries"][0]["data"]
+ lines = output_data.strip().split("\n")
+ # Should have header + 2 data rows, with correct column parsing
+ assert len(lines) == 3
+ # name and value should be separate columns, not merged into one
+ assert processed_result["queries"][0]["colnames"] == ["name", "value"]
Review Comment:
**Suggestion:** This test does not actually verify that the configured
delimiter/decimal are preserved in the generated CSV output, so it can pass
even when email CSV exports are still emitted with default comma and dot
formatting. Strengthen the assertions to validate the serialized CSV content
and numeric typing of the parsed `value` column. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Scheduled CSV emails may ignore configured separator/decimal output.
- ⚠️ Current unit test misses CSV serialization regressions.
```
</details>
```suggestion
output_data = processed_result["queries"][0]["data"]
lines = output_data.strip().split("\n")
# Should have header + 2 data rows, with correct column parsing
assert len(lines) == 3
# name and value should be separate columns, not merged into one
assert processed_result["queries"][0]["colnames"] == ["name", "value"]
# value should be parsed as numeric using decimal="," config
assert processed_result["queries"][0]["coltypes"] == [
GenericDataType.STRING,
GenericDataType.NUMERIC,
]
# serialized CSV should honor configured separator/decimal for email
exports
assert lines[0] == "name;value"
assert "foo;1,5" in lines[1]
assert "bar;2,0" in lines[2]
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a report with non-default CSV settings
(`CSV_EXPORT={"sep":";","decimal":","}`), then trigger scheduled CSV
generation through
`ReportScheduleStateMachine._get_csv_data()` in
`superset/commands/report/execute.py:439-443`, which calls
`ChartDataRestApi.get_data`
with `type=post_processed` at `execute.py:226-230`.
2. In chart response handling, `_send_chart_response()` applies
post-processing when
`result_type == POST_PROCESSED` (`superset/charts/data/api.py:404-405`),
invoking
`apply_client_processing()`.
3. In `apply_client_processing()`, CSV is parsed with configured
`sep`/`decimal`
(`superset/charts/client_processing.py:348-356`) but serialized back using
default
`processed_df.to_csv(...)` without config (`client_processing.py:397`),
yielding comma/dot
formatted output.
4. Run the current unit test
`test_apply_client_processing_csv_format_custom_delimiter`
(`tests/unit_tests/charts/test_client_processing.py:2794-2832`): it only
asserts row count
and `colnames` (`2827-2832`), so it still passes even when output CSV
formatting is wrong.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** tests/unit_tests/charts/test_client_processing.py
**Line:** 2827:2832
**Comment:**
*Logic Error: This test does not actually verify that the configured
delimiter/decimal are preserved in the generated CSV output, so it can pass
even when email CSV exports are still emitted with default comma and dot
formatting. Strengthen the assertions to validate the serialized CSV content
and numeric typing of the parsed `value` column.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38616&comment_hash=2171f418a011e134c41f6f12c424bdcfde87be2166f47790c63e9b2a0b3dad05&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38616&comment_hash=2171f418a011e134c41f6f12c424bdcfde87be2166f47790c63e9b2a0b3dad05&reaction=dislike'>👎</a>
##########
superset/charts/client_processing.py:
##########
@@ -345,10 +345,15 @@ def apply_client_processing( # noqa: C901
# reports to avoid unwanted conversions
# This allows users to control which values should be treated as
null/NA
na_values = current_app.config["REPORTS_CSV_NA_NAMES"]
+ csv_export_config = current_app.config.get("CSV_EXPORT", {})
Review Comment:
**Suggestion:** The CSV parser now respects configured delimiter/decimal,
but the CSV writer still uses pandas defaults, so processed email report output
is re-serialized with `,` and `.` instead of the configured format. This breaks
the intended contract for localized CSV exports and can corrupt round-tripping
for semicolon/comma-decimal setups. Reuse the same `sep` and `decimal` values
when calling `to_csv`. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Scheduled CSV reports ignore configured output separators.
- ⚠️ Locale decimal formatting changes in emailed CSV files.
- ⚠️ Behavior diverges from GUI CSV export formatting.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]