codeant-ai-for-open-source[bot] commented on code in PR #40935:
URL: https://github.com/apache/superset/pull/40935#discussion_r3398301994
##########
superset/mcp_service/dataset/tool/create_virtual_dataset.py:
##########
@@ -131,7 +163,6 @@ async def create_virtual_dataset(
)
Review Comment:
**🟠Architect Review — HIGH**
create_virtual_dataset now performs a CreateDatasetCommand followed by an
UpdateDatasetCommand; if the update step raises DatasetInvalidError or
DatasetUpdateFailedError, the function returns id=None and an error, but the
dataset created in the first step has already been committed. This leaves a
persisted "partial" dataset that the caller believes failed and makes
subsequent retries with the same database/schema/catalog and dataset_name fail
the uniqueness check ("dataset already exists").
**Suggestion:** Make the operation atomic from the tool caller's perspective,
either by wrapping create and update in a single transaction or by compensating
(e.g., deleting the newly created dataset) when the update fails. Also adjust
error handling to distinguish create failures from update failures, or return
the created dataset id so callers can reconcile state.
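A minimal sketch of the compensating approach, assuming a hypothetical in-memory `FakeDatasetStore` in place of Superset's real commands. `DatasetExistsError`, `DatasetUpdateError`, `ToolResult`, and `create_then_update` are illustrative names only, not the project's API. The point is the control flow: delete the row committed by the create step when the update step fails, and report which step failed.

```python
from dataclasses import dataclass, field
from typing import Optional


class DatasetExistsError(Exception):
    """Stand-in for the uniqueness-check failure (hypothetical)."""


class DatasetUpdateError(Exception):
    """Stand-in for DatasetInvalidError / DatasetUpdateFailedError (hypothetical)."""


@dataclass
class FakeDatasetStore:
    """Hypothetical in-memory store; every method 'commits' immediately."""

    rows: dict = field(default_factory=dict)  # dataset id -> properties
    next_id: int = 1

    def create(self, name: str) -> int:
        if any(row["name"] == name for row in self.rows.values()):
            raise DatasetExistsError(name)
        dataset_id = self.next_id
        self.next_id += 1
        self.rows[dataset_id] = {"name": name}
        return dataset_id

    def update(self, dataset_id: int, props: dict) -> None:
        # Simulated validation failure in the second step.
        if not isinstance(props.get("description", ""), str):
            raise DatasetUpdateError("description must be a string")
        self.rows[dataset_id].update(props)

    def delete(self, dataset_id: int) -> None:
        self.rows.pop(dataset_id, None)


@dataclass
class ToolResult:
    id: Optional[int]
    error: Optional[str] = None
    failed_step: Optional[str] = None  # "create" or "update"


def create_then_update(store: FakeDatasetStore, name: str, props: dict) -> ToolResult:
    try:
        dataset_id = store.create(name)
    except DatasetExistsError as ex:
        return ToolResult(None, f"dataset already exists: {ex}", "create")
    try:
        store.update(dataset_id, props)
    except DatasetUpdateError as ex:
        # Compensate: remove the dataset committed by the create step so a
        # retry with the same name does not hit the uniqueness check.
        store.delete(dataset_id)
        return ToolResult(None, str(ex), "update")
    return ToolResult(dataset_id)


store = FakeDatasetStore()
bad = create_then_update(store, "sales_v", {"description": 123})
good = create_then_update(store, "sales_v", {"description": "Sales view"})
```

In this sketch the retry succeeds because the failed attempt leaves nothing behind. If the compensating delete could itself fail, returning the created dataset id alongside the error would be the safer fallback.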
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is an **Architect / Logical Review** comment left during a code review.
These reviews are first-class, important findings — not optional suggestions.
Do NOT dismiss this as a 'big architectural change' just because the title says
architect review; most of these can be resolved with a small, localized fix
once the intent is understood.
**Path:** superset/mcp_service/dataset/tool/create_virtual_dataset.py
**Line:** 87:163
**Comment:**
*HIGH: create_virtual_dataset now performs a CreateDatasetCommand
followed by an UpdateDatasetCommand; if the update step raises
DatasetInvalidError or DatasetUpdateFailedError, the function returns id=None
and an error, but the dataset created in the first step has already been
committed. This leaves a persisted "partial" dataset that the caller believes
failed and makes subsequent retries with the same database/schema/catalog and
dataset_name fail the uniqueness check ("dataset already exists").*
Validate the correctness of the flagged issue. If it is correct, how can I
resolve it? If you propose a fix, implement it and keep it concise.
If a suggested approach is provided above, use it as the authoritative
instruction. If no explicit code suggestion is given, you MUST still draft and
apply your own minimal, localized fix — do not punt back with 'no suggestion
provided, review manually'. Keep the change as small as possible: add a guard
clause, gate on a loading state, reorder an await, wrap in a conditional, etc.
Do not refactor surrounding code or expand scope beyond the finding.
Once the fix is implemented, also check the other comments on the same PR and
ask the user whether they want the rest of the comments fixed as well. If they
say yes, fetch all the comments, validate their correctness, and implement a
minimal fix for each.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]