codeant-ai-for-open-source[bot] commented on code in PR #37932:
URL: https://github.com/apache/superset/pull/37932#discussion_r2799467494
##########
superset/commands/dataset/duplicate.py:
##########
@@ -66,6 +66,7 @@ def run(self) -> Model:
table = SqlaTable(table_name=table_name, owners=owners)
table.database = database
table.schema = self._base_model.schema
+ table.catalog = self._base_model.catalog
Review Comment:
**Suggestion:** When duplicating a virtual dataset whose `catalog` is not explicitly set, `validate()` checks name uniqueness against the database's default catalog, yet the duplicated dataset itself is saved with a `None` catalog. This is inconsistent with both the uniqueness logic and the `CreateDatasetCommand` behavior, and can cause later duplicate-name checks to misbehave for catalog-aware databases. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Dataset duplication may overlook existing duplicates in catalogs.
- ⚠️ Catalog-aware DBs get inconsistent uniqueness vs stored metadata.
- ⚠️ Dataset management UI may allow logically duplicate datasets.
```
</details>
```suggestion
table.catalog = self._base_model.catalog or database.get_default_catalog()
```
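To make the mismatch concrete, here is a minimal standalone sketch (stub objects, not Superset's actual `Database`/`SqlaTable` models) of how the validated and stored catalogs diverge when the base catalog is unset:
```python
# Standalone illustration of the reported inconsistency
# (FakeDatabase is a stub, not Superset code).
class FakeDatabase:
    def get_default_catalog(self) -> str:
        return "hive"  # hypothetical default for a catalog-aware backend

db = FakeDatabase()
base_catalog = None  # virtual dataset created without an explicit catalog

validated_against = base_catalog or db.get_default_catalog()     # "hive"
stored_without_fallback = base_catalog                            # None
stored_with_fallback = base_catalog or db.get_default_catalog()  # "hive"

assert validated_against != stored_without_fallback  # the flagged mismatch
assert validated_against == stored_with_fallback     # what the suggestion restores
```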
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a catalog-aware database (e.g. Trino) in Superset with a non-null default catalog (the `Database` model lives in `superset/models/core.py`; its `get_default_catalog()` is what `superset/commands/dataset/duplicate.py:121` calls).
2. Create a virtual dataset on this database so that its `SqlaTable` record has `catalog` unset/NULL (this dataset later becomes `self._base_model` inside `DuplicateDatasetCommand` in `superset/commands/dataset/duplicate.py`).
3. From the Datasets list UI, click "Duplicate" for that virtual dataset and submit a new name; this triggers `DuplicateDatasetCommand.run()` at `superset/commands/dataset/duplicate.py:51`, which calls `validate()` at line 105.
4. In `validate()`, uniqueness is checked using `catalog = self._base_model.catalog or database.get_default_catalog()` at line 121 (i.e. the database default catalog), but in `run()` the new `SqlaTable` is persisted with `table.catalog = self._base_model.catalog` at line 69 (NULL). The stored catalog therefore does not match the catalog used for uniqueness checks; later duplicate-name validation (which uses the resolved catalog) can overlook this dataset on catalog-aware databases, allowing logically duplicate datasets pointing at the same catalog/schema/table to be created.
```
</details>
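As a self-contained illustration of step 4 (hypothetical names; Superset's real check queries `SqlaTable` rows, but the keying logic is analogous), a duplicate-name check keyed on the resolved catalog misses a row stored with a NULL catalog:
```python
# Hypothetical in-memory model of the duplicate-name check;
# DEFAULT_CATALOG stands in for database.get_default_catalog().
DEFAULT_CATALOG = "system"

# Datasets stored as (catalog, schema, table_name) tuples; the duplicated
# dataset was persisted with a NULL catalog.
existing = [(None, "analytics", "orders_copy")]

def find_duplicates(schema: str, table_name: str) -> list:
    resolved = DEFAULT_CATALOG  # base catalog is None, so fall back
    return [row for row in existing if row == (resolved, schema, table_name)]

# The NULL-catalog row is invisible to the resolved-catalog check, so a
# second "orders_copy" in the same catalog/schema would be allowed:
assert find_duplicates("analytics", "orders_copy") == []
```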
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/commands/dataset/duplicate.py
**Line:** 69:69
**Comment:**
*Logic Error: When duplicating a virtual dataset whose `catalog` is not explicitly set, `validate()` checks name uniqueness against the database's default catalog, yet the duplicated dataset itself is saved with a `None` catalog. This is inconsistent with both the uniqueness logic and the `CreateDatasetCommand` behavior, and can cause later duplicate-name checks to misbehave for catalog-aware databases.
Validate the correctness of the flagged issue. If it is correct, how can I resolve it? If you propose a fix, implement it and keep it concise.
```
</details>