codeant-ai-for-open-source[bot] commented on code in PR #40331:
URL: https://github.com/apache/superset/pull/40331#discussion_r3281242806


##########
superset/commands/chart/data/streaming_export_command.py:
##########
@@ -66,7 +69,29 @@ def _get_sql_and_database(self) -> tuple[str, Any, str | 
None, str | None]:
         # Note: datasource should already be attached to a session from 
query_context
         datasource = self._query_context.datasource
         query_obj = self._query_context.queries[0]
-        sql_query = datasource.get_query_str(query_obj.to_dict())
+        query_dict = query_obj.to_dict()
+
+        # When ALLOW_FULL_CSV_EXPORT is enabled, raise the row limit so a
+        # "full" export is not silently capped at SQL_MAX_ROW. The ceiling is
+        # TABLE_VIZ_MAX_ROW_SERVER (a bounded, predictable maximum) rather than
+        # truly unlimited.
+        if is_feature_enabled("ALLOW_FULL_CSV_EXPORT"):
+            query_dict["row_limit"] = app.config["TABLE_VIZ_MAX_ROW_SERVER"]

Review Comment:
   **Suggestion:** The row-limit override is applied for every streaming CSV 
export whenever the feature flag is enabled, not just for explicit "full CSV" 
exports. This makes regular CSV exports unexpectedly bypass `SQL_MAX_ROW` and 
can return much larger datasets than requested. Gate this override on an 
explicit full-export signal from the request (or only honor the caller-provided 
`row_limit`) so normal CSV exports keep their intended cap. [incorrect 
condition logic]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Dashboard "Export to .CSV" ignores chart-configured row_limit.
   - ❌ Streaming CSV uses TABLE_VIZ_MAX_ROW_SERVER for all exports.
   - ⚠️ Large unintended exports increase database and network load.
   - ⚠️ Explore "Export to .CSV" streaming also over-returns rows.
   - ⚠️ Feature flag semantics leak into non-full export behavior.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Enable the `ALLOW_FULL_CSV_EXPORT` feature flag in Superset configuration 
so
   `superset.is_feature_enabled("ALLOW_FULL_CSV_EXPORT")` returns True, and 
ensure
   `TABLE_VIZ_MAX_ROW_SERVER` is significantly larger than the typical chart 
`row_limit`
   (e.g., 1_000_000 vs 10_000).
   
   2. On a dashboard with a large table chart, configure the chart's 
`row_limit` to a modest
   value (e.g., 10_000) and save. This `row_limit` is part of the chart form 
data used in
   `exportTable` in
   `superset-frontend/src/dashboard/components/gridComponents/Chart/Chart.tsx` 
lines 66–105,
   where the non-full path keeps `exportFormData = formData`.
   
   3. With the chart loaded and `queriesResponse` reflecting a large underlying 
dataset
   (rowcount ≥ `CSV_STREAMING_ROW_THRESHOLD`), click the regular "Export to 
.CSV" menu item
   (not "Export to full .CSV") in the dashboard header. This invokes 
`exportCSV` →
   `exportTable('csv', false)` in `Chart.tsx` lines 178–183, which:
   
      - Leaves `exportFormData.row_limit` at the chart's configured limit 
(e.g., 10_000),
   
      - Computes `actualRowCount` from `queriesResponse` and, when it exceeds
      `streamingThreshold`, sets `shouldUseStreaming = true`, triggering 
`exportChart` with
      `resultType: 'full', resultFormat: 'csv'` and an `onStartStreamingExport` 
callback.
   
   4. The client's streaming export handler posts the chart form data (including
   `row_limit=10_000`) to `/api/v1/chart/data` handled by 
`ChartDataRestApi.data` in
   `superset/charts/data/api.py` lines 115–163, which builds a `QueryContext` 
via
   `_create_query_context_from_form` and runs `ChartDataCommand`. When the 
result is CSV and
   large, `_send_chart_response` (lines 257–236 of the same file) calls
   `_should_use_streaming` and, since `actual_row_count >= threshold`, 
delegates to
   `_create_streaming_csv_response` (lines 230–260), which constructs
   `StreamingCSVExportCommand(query_context, chunk_size)`.
   
   5. Inside `StreamingCSVExportCommand._get_sql_and_database` in
   `superset/commands/chart/data/streaming_export_command.py` lines 61–99, the 
code converts
   the first query object to a dict (`query_dict = query_obj.to_dict()`), then
   unconditionally executes:
   
      - `if is_feature_enabled("ALLOW_FULL_CSV_EXPORT"):`
   
        ` query_dict["row_limit"] = app.config["TABLE_VIZ_MAX_ROW_SERVER"]`
   
      Since the flag is enabled, the chart's intended `row_limit` (10_000) is 
overwritten
      with `TABLE_VIZ_MAX_ROW_SERVER` (e.g., 1_000_000) for this streaming 
export, even
      though the user chose the regular "Export to .CSV" action.
   
   6. The modified `query_dict` is passed to 
`datasource.get_query_str_extended` /
   `get_query_str`, generating SQL with the larger limit; `_get_row_limit` in
   `StreamingCSVExportCommand` returns `None` (lines 101–108), so
   `BaseStreamingCSVExportCommand._execute_query_and_stream` in
   `superset/commands/streaming_export/base.py` lines 97–151 streams all rows 
up to the
   higher SQL limit. The resulting CSV contains far more rows than the chart's 
configured
   `row_limit`, demonstrating that the ALLOW_FULL_CSV_EXPORT override applies 
to all
   streaming CSV exports, not only explicit "full CSV" exports.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=4d5ebf995737494b974bbc79fe5ebd15&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=4d5ebf995737494b974bbc79fe5ebd15&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/commands/chart/data/streaming_export_command.py
   **Line:** 78:79
   **Comment:**
        *Incorrect Condition Logic: The row-limit override is applied for every 
streaming CSV export whenever the feature flag is enabled, not just for 
explicit "full CSV" exports. This makes regular CSV exports unexpectedly bypass 
`SQL_MAX_ROW` and can return much larger datasets than requested. Gate this 
override on an explicit full-export signal from the request (or only honor the 
caller-provided `row_limit`) so normal CSV exports keep their intended cap.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40331&comment_hash=803ae7e776f3e24424e914963dd53d43af0957599987421d8383bd99543251d7&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40331&comment_hash=803ae7e776f3e24424e914963dd53d43af0957599987421d8383bd99543251d7&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to