codeant-ai-for-open-source[bot] commented on code in PR #40499:
URL: https://github.com/apache/superset/pull/40499#discussion_r3319498553


##########
superset/sql/parse.py:
##########
@@ -1667,6 +1709,7 @@ def transpile_to_dialect(
     # Get source dialect (default to generic if not specified)
     source_dialect = SQLGLOT_DIALECTS.get(source_engine) if source_engine else 
Dialect
 
+    _check_script_length(sql, source_engine)

Review Comment:
   **Suggestion:** Calling the length gate before the `try` block changes this 
function's error contract: over-limit input now raises `SupersetParseError` 
instead of `QueryClauseValidationException`. Existing callers (for example in 
query sanitization and example import paths) only catch 
`QueryClauseValidationException`, so this new exception escapes and can turn 
expected validation handling into unhandled failures. Move the length check 
inside the same `try`/exception-wrapping flow (or explicitly wrap 
`SupersetParseError`) so the function keeps returning a consistent exception 
type. [api mismatch]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Example dataset import crashes on over-long virtual dataset SQL.
   - ⚠️ Length-cap errors bypass QueryClauseValidationException-based fallback 
handling.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Note the transpile helper in `superset/sql/parse.py:1684-1750` where
   `transpile_to_dialect()` calls `_check_script_length(sql, source_engine)` on 
line 1712
   (per PR hunk) before entering the `try` block that wraps `sqlglot.parse_one` 
and raises
   `QueryClauseValidationException` on `ParseError` or generic `Exception`.
   
   2. Observe the example importer in 
`superset/commands/importers/v1/examples.py:55-101`:
   `transpile_virtual_dataset_sql()` calls `transpile_to_dialect(sql, 
target_engine,
   source_engine)` inside a `try` and only catches 
`QueryClauseValidationException`, logging
   a warning and falling back to the original SQL in that case.
   
   3. Construct a dataset config with an over-long `sql` field (e.g. `config = 
{"table_name":
   "big_ds", "sql": "a" * 1_000_001, "source_db_engine": "postgresql"}`) and 
ensure the
   target database engine differs (e.g. a `Database` with 
`db_engine_spec.engine = "mysql"`
   so `transpile_virtual_dataset_sql()` at `examples.py:71-80` actually calls
   `transpile_to_dialect`).
   
   4. Call `transpile_virtual_dataset_sql(config, database_id)` either directly 
or via
   `ImportExamplesCommand._import()` invoked from 
`load_examples_from_configs()` in
   `superset/examples/utils.py:93-119`; `_check_script_length` in
   `superset/sql/parse.py:1712` raises `SupersetParseError` before the `try`, 
this exception
   is not caught by the `except QueryClauseValidationException` block in
   `examples.py:92-100`, so it propagates up, causing 
`ImportExamplesCommand.run()` at
   `examples.py:120-131` to raise `CommandException` instead of logging a 
warning and keeping
   the original SQL. This demonstrates the changed error contract for over-cap 
inputs
   compared to other parse failures, which are consistently wrapped into
   `QueryClauseValidationException`.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=efd6216218b9412cadcfac2db4a8a19d&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=efd6216218b9412cadcfac2db4a8a19d&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/sql/parse.py
   **Line:** 1712:1712
   **Comment:**
        *Api Mismatch: Calling the length gate before the `try` block changes 
this function's error contract: over-limit input now raises 
`SupersetParseError` instead of `QueryClauseValidationException`. Existing 
callers (for example in query sanitization and example import paths) only catch 
`QueryClauseValidationException`, so this new exception escapes and can turn 
expected validation handling into unhandled failures. Move the length check 
inside the same `try`/exception-wrapping flow (or explicitly wrap 
`SupersetParseError`) so the function keeps returning a consistent exception 
type.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40499&comment_hash=263ad46f09592ae353909bcd1da72f72316354bd7cb839a7fbd7ad97a47b6bda&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40499&comment_hash=263ad46f09592ae353909bcd1da72f72316354bd7cb839a7fbd7ad97a47b6bda&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to