codeant-ai-for-open-source[bot] commented on code in PR #41103:
URL: https://github.com/apache/superset/pull/41103#discussion_r3418614107
##########
superset/commands/report/execute.py:
##########
@@ -506,9 +509,88 @@ def _get_pdf(self) -> bytes:
return pdf
+ def _get_chart_data_request_payload(
+ self,
+ result_format: ChartDataResultFormat,
+ ) -> dict[str, Any]:
+ """
+ Build the same POST payload shape used by frontend exports.
+ """
+ try:
+ query_context = json.loads(self._report_schedule.chart.query_context)
+ except (TypeError, json.JSONDecodeError) as ex:
+ raise ReportScheduleExecuteUnexpectedError(
+ "Chart has no valid query context saved."
+ ) from ex
+
+ if not isinstance(query_context, dict):
+ raise ReportScheduleExecuteUnexpectedError(
+ "Chart has no valid query context saved."
+ )
+
+ result_type = ChartDataResultType.POST_PROCESSED.value
+ force = bool(self._report_schedule.force_screenshot)
+ query_context["result_format"] = result_format.value
+ query_context["result_type"] = result_type
+ query_context["force"] = force
+
+ form_data = query_context.get("form_data")
+ if isinstance(form_data, dict):
+ form_data["result_format"] = result_format.value
+ form_data["result_type"] = result_type
+ form_data["force"] = force
+
+ if form_data.get("server_pagination"):
+ row_limit = form_data.get("row_limit") or 0
+ queries = query_context.get("queries")
+ if isinstance(queries, list):
+ data_query_updated = False
+ download_queries = []
+ for query in queries:
+ if isinstance(query, dict) and query.get("is_rowcount"):
+ continue
+ if isinstance(query, dict) and not data_query_updated:
+ query = {
+ **query,
+ "row_limit": row_limit,
+ "row_offset": 0,
+ }
+ data_query_updated = True
+ download_queries.append(query)
+ query_context["queries"] = download_queries
+
+ return query_context
+
+ @staticmethod
+ def _post_chart_data(
+ chart_url: str,
+ auth_cookies: Optional[dict[str, str]],
+ request_payload: dict[str, Any],
+ ) -> Optional[bytes]:
+ if not auth_cookies:
+ return None
+
+ cookie_str = ";".join([f"{key}={val}" for key, val in auth_cookies.items()])
+ request_body = urllib.parse.urlencode(
+ {"form_data": json.dumps(request_payload)}
+ ).encode("utf-8")
+ request = urllib.request.Request(
+ chart_url,
+ data=request_body,
+ headers={
+ "Content-Type": "application/x-www-form-urlencoded",
+ "Cookie": cookie_str,
+ },
+ method="POST",
+ )
+ response = urllib.request.build_opener().open(request)
+ content = response.read()
+ if response.getcode() != 200:
+ raise URLError(response.getcode())
Review Comment:
**Suggestion:** The HTTP response object returned by
`urllib.request.build_opener().open(request)` is never closed. In a
long-running report worker this can leak sockets/file descriptors across many
exports and eventually cause request failures. Wrap the open call in a context
manager (`with ... as response`) so the connection is always released,
including on exceptions. [resource leak]
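A minimal sketch of the suggested change, written as a standalone function rather than the PR's static method. The `opener` parameter is hypothetical and exists only so the function can be exercised without a network call, and the final `return content` is an assumption, since the quoted diff ends at the status check.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Optional
from urllib.error import URLError


def post_chart_data(
    chart_url: str,
    auth_cookies: Optional[dict[str, str]],
    request_payload: dict[str, Any],
    opener: Any = None,  # hypothetical: injectable opener, not part of the PR
) -> Optional[bytes]:
    """POST the chart data payload and always release the connection."""
    if not auth_cookies:
        return None

    cookie_str = ";".join(f"{key}={val}" for key, val in auth_cookies.items())
    request_body = urllib.parse.urlencode(
        {"form_data": json.dumps(request_payload)}
    ).encode("utf-8")
    request = urllib.request.Request(
        chart_url,
        data=request_body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Cookie": cookie_str,
        },
        method="POST",
    )

    opener = opener or urllib.request.build_opener()
    # The context manager closes the response when the block exits,
    # whether it exits normally or through the URLError raised below.
    with opener.open(request) as response:
        content = response.read()
        if response.getcode() != 200:
            raise URLError(response.getcode())
    # Assumed return value: the diff excerpt is cut off before this point.
    return content
```

With this shape the response is closed deterministically at the end of the `with` block instead of whenever the garbage collector reclaims it.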
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Scheduled CSV report exports leak HTTP connections each execution.
- ⚠️ Long-running report workers risk exhausting sockets/descriptors.
- ⚠️ Subsequent chart data exports may fail with URLError.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a scheduled report whose `report_format` is CSV so it uses the report execution command in `superset/commands/report/execute.py` and is run periodically by the scheduler (see `run()` method at lines 13–31, which constructs and runs `ReportScheduleStateMachine` at lines 28–30).
2. When the schedule fires, `ReportScheduleStateMachine.run()` eventually enters `ReportSuccessState.next()` (class defined around line 1102; the `next()` method calls `self.send()` at lines 28–29 in the 1149–1188 block), which invokes `send()` at lines 880–887.
3. The `send()` method at lines 880–887 calls `_get_notification_content()` at line 886; for a chart report with `report_format == ReportDataFormat.CSV` the `_get_notification_content()` branch at lines 774–778 calls `csv_data = self._get_csv_data()`.
4. `_get_csv_data()` at lines 592–610 builds the chart data request payload and calls `self._post_chart_data(...)` at lines 610–614, which performs the HTTP request in `_post_chart_data()` at lines 565–590 by executing `response = urllib.request.build_opener().open(request)` (line 586), reading `content = response.read()` (line 587), and returning without ever closing `response`, leaving the underlying HTTP connection/socket open until garbage collection. Repeating this scheduled CSV export many times in the long-running worker will accumulate unclosed HTTPResponse objects and sockets, eventually exhausting file descriptors or connections and causing future exports to fail with networking errors.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]