codeant-ai-for-open-source[bot] commented on code in PR #38396:
URL: https://github.com/apache/superset/pull/38396#discussion_r2884457678
##########
superset/commands/semantic_layer/create.py:
##########
@@ -67,3 +69,28 @@ def validate(self) -> None:
# Validate configuration against the plugin
cls = registry[sl_type]
cls.from_configuration(self._properties["configuration"])
+
+
+class CreateSemanticViewCommand(BaseCommand):
+ def __init__(self, data: dict[str, Any]):
+ self._properties = data.copy()
+
+ @transaction(
+ on_error=partial(
+ on_error,
+ catches=(SQLAlchemyError, ValueError),
+ reraise=SemanticViewCreateFailedError,
+ )
+ )
+ def run(self) -> Model:
+ self.validate()
+ if isinstance(self._properties.get("configuration"), dict):
+ self._properties["configuration"] = json.dumps(
+ self._properties["configuration"]
+ )
+ return SemanticViewDAO.create(attributes=self._properties)
+
+ def validate(self) -> None:
+ layer_uuid = self._properties.get("semantic_layer_uuid")
+ if not SemanticLayerDAO.find_by_uuid(layer_uuid):
+ raise SemanticLayerNotFoundError()
Review Comment:
**Suggestion:** The semantic view creation command does not enforce
uniqueness of `(name, semantic_layer_uuid, configuration)` even though
`SemanticViewDAO.validate_uniqueness` and the `/semantic_layer/<uuid>/views`
endpoint treat that combination as unique, so callers can create duplicate
views for the same layer and configuration, leading to inconsistent behavior
and confusing "already added" flags; adding a DAO-backed uniqueness check in
`validate` and raising `ValueError` (which is converted into
`SemanticViewCreateFailedError` by the transaction wrapper) prevents this.
[possible bug]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Duplicate semantic views created via `/api/v1/semantic_view/` bulk
endpoint.
- ⚠️ AddSemanticViewModal UI misleads users about existing duplicate views.
- ⚠️ Semantic layer view listing silently contains duplicate database rows.
```
</details>
```suggestion
def validate(self) -> None:
layer_uuid = self._properties.get("semantic_layer_uuid")
if not SemanticLayerDAO.find_by_uuid(layer_uuid):
raise SemanticLayerNotFoundError()
name: str = self._properties.get("name", "")
configuration = self._properties.get("configuration")
if not SemanticViewDAO.validate_uniqueness(name, layer_uuid,
configuration):
# Trigger SemanticViewCreateFailedError via the transaction
error handler
raise ValueError(
f"Semantic view '{name}' already exists for this layer and
configuration"
)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Create a semantic layer via the REST API: send `POST
/api/v1/semantic_layer/`
(implemented in `superset/semantic_layers/api.py:561-605` as
`SemanticLayerRestApi.post`)
with a valid body so that a `SemanticLayer` row is created in
`superset/semantic_layers/models.py:110-152`. Note its returned `uuid`.
2. Call the bulk semantic view creation endpoint `POST
/api/v1/semantic_view/`
(implemented in `SemanticViewRestApi.post` at
`superset/semantic_layers/api.py:166-255`)
with a JSON body containing two identical entries in the `views` array, for
example:
```json
{
"views": [
{
"name": "orders_view",
"semantic_layer_uuid": "<layer_uuid_from_step_1>",
"configuration": {"foo": "bar"}
},
{
"name": "orders_view",
"semantic_layer_uuid": "<layer_uuid_from_step_1>",
"configuration": {"foo": "bar"}
}
]
}
```
The loop at `api.py:227-238` loads each entry with
`SemanticViewPostSchema` and calls
`CreateSemanticViewCommand(item).run()`.
3. Inside `CreateSemanticViewCommand.run`
(`superset/commands/semantic_layer/create.py:74-91`), `self.validate()` is
executed, and
`validate` at `create.py:93-96` only checks that
`SemanticLayerDAO.find_by_uuid(layer_uuid)` returns a layer. It never calls
`SemanticViewDAO.validate_uniqueness` defined at
`superset/daos/semantic_layer.py:126-158`, so both
`SemanticViewDAO.create(...)` calls
succeed, inserting two `SemanticView` rows with the same `(name,
semantic_layer_uuid,
configuration)` in `superset/semantic_layers/models.py:154-183`.
4. Later, fetch available views for that layer and runtime configuration via
`POST
/api/v1/semantic_layer/<uuid>/views` (`SemanticLayerRestApi.views` at
`superset/semantic_layers/api.py:500-559`) with body `{"runtime_data":
{"foo": "bar"}}`.
This endpoint loads existing views with
`SemanticViewDAO.find_by_semantic_layer`
(`semantic_layer.py:112-124`), builds `existing_keys` keyed by `(name,
json.dumps(configuration))` at `api.py:542-549`, and marks `already_added`
based on that
key at `api.py:552-556`. Because the database now contains duplicate rows
for the same
`(name, layer_uuid, configuration)`, the endpoint treats that combination as
a single
logical view (boolean `already_added`), while the underlying data has
multiple entries,
breaking the intended uniqueness contract and leading to confusing behavior
in the UI
(e.g. `AddSemanticViewModal` in
`superset-frontend/src/features/semanticViews/AddSemanticViewModal.tsx:304-319`
which
relies on this flag).
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/commands/semantic_layer/create.py
**Line:** 93:96
**Comment:**
*Possible Bug: The semantic view creation command does not enforce
uniqueness of `(name, semantic_layer_uuid, configuration)` even though
`SemanticViewDAO.validate_uniqueness` and the `/semantic_layer/<uuid>/views`
endpoint treat that combination as unique, so callers can create duplicate
views for the same layer and configuration, leading to inconsistent behavior
and confusing "already added" flags; adding a DAO-backed uniqueness check in
`validate` and raising `ValueError` (which is converted into
`SemanticViewCreateFailedError` by the transaction wrapper) prevents this.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38396&comment_hash=a7ba0991852ba24c74119663d42eb7bd7fa15bb582a073b6d0a6c30bc2d20f5d&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38396&comment_hash=a7ba0991852ba24c74119663d42eb7bd7fa15bb582a073b6d0a6c30bc2d20f5d&reaction=dislike'>👎</a>
##########
superset/semantic_layers/api.py:
##########
@@ -161,10 +170,89 @@ class SemanticViewRestApi(BaseSupersetModelRestApi):
allow_browser_login = True
class_permission_name = "SemanticView"
method_permission_name = MODEL_API_RW_METHOD_PERMISSION_MAP
- include_route_methods = {"put"}
+ include_route_methods = {"put", "post", "delete"}
edit_model_schema = SemanticViewPutSchema()
+ @expose("/", methods=("POST",))
+ @protect()
+ @statsd_metrics
+ @event_logger.log_this_with_context(
+ action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.post",
+ log_to_statsd=False,
+ )
+ @requires_json
+ def post(self) -> Response:
+ """Bulk create semantic views.
+ ---
+ post:
+ summary: Create semantic views
+ requestBody:
+ required: true
+ content:
+ application/json:
+ schema:
+ type: object
+ properties:
+ views:
+ type: array
+ items:
+ type: object
+ properties:
+ name:
+ type: string
+ semantic_layer_uuid:
+ type: string
+ configuration:
+ type: object
+ description:
+ type: string
+ cache_timeout:
+ type: integer
+ responses:
+ 201:
+ description: Semantic views created
+ 400:
+ $ref: '#/components/responses/400'
+ 401:
+ $ref: '#/components/responses/401'
+ 422:
+ $ref: '#/components/responses/422'
+ """
+ body = request.json or {}
+ views_data = body.get("views", [])
+ if not views_data:
+ return self.response_400(message="No views provided")
+
+ schema = SemanticViewPostSchema()
+ created = []
+ errors = []
+ for view_data in views_data:
+ try:
+ item = schema.load(view_data)
+ except ValidationError as error:
+ errors.append({"name": view_data.get("name"), "error":
error.messages})
+ continue
+ try:
+ new_model = CreateSemanticViewCommand(item).run()
+ created.append({"uuid": str(new_model.uuid), "name":
new_model.name})
+ except SemanticLayerNotFoundError:
+ errors.append(
+ {"name": view_data.get("name"), "error": "Semantic layer
not found"}
+ )
+ except SemanticViewCreateFailedError as ex:
+ logger.error(
+ "Error creating semantic view: %s",
+ str(ex),
+ exc_info=True,
+ )
+ errors.append({"name": view_data.get("name"), "error":
str(ex)})
+
+ result: dict[str, Any] = {"created": created}
+ if errors:
+ result["errors"] = errors
+ return self.response(201, result=result)
Review Comment:
**Suggestion:** The bulk create endpoint for semantic views always returns
HTTP 201, even when every requested view fails validation or creation and none
are actually created, which misleads clients into treating a complete failure
as success and contradicts the documented 422 error response for invalid
payloads. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Bulk semantic view creation reports success on total failure.
- ⚠️ Clients cannot rely on status codes for validation errors.
```
</details>
```suggestion
result: dict[str, Any] = {"created": created}
if errors:
result["errors"] = errors
status = 201 if created else 422
return self.response(status, result=result)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Start Superset with the SEMANTIC_LAYERS feature enabled so that
`SemanticViewRestApi`
is registered (see `superset/semantic_layers/api.py:166-175` where
`SemanticViewRestApi`
is defined and `resource_name = "semantic_view"`).
2. Send an HTTP POST request to the semantic view bulk-create endpoint
handled by
`SemanticViewRestApi.post` (`superset/semantic_layers/api.py:177-255`),
typically exposed
as `/api/v1/semantic_view/`, with a JSON body containing only invalid views,
for example:
`{"views": [{"name": "invalid", "semantic_layer_uuid": "non-existent"}]}`.
3. Inside `post`, each `view_data` is validated with
`SemanticViewPostSchema().load(view_data)` (`api.py:227-233`) and/or
`CreateSemanticViewCommand(item).run()` (`api.py:237-238`), which in turn
validates the
semantic layer UUID via `SemanticLayerDAO.find_by_uuid` in
`CreateSemanticViewCommand.validate`
(`superset/commands/semantic_layer/create.py:74-96`);
for an invalid UUID this raises `SemanticLayerNotFoundError` and the view is
not created,
so `created` remains an empty list while an entry is appended to `errors`
(`api.py:239-249`).
4. After processing all requested views, the code at
`superset/semantic_layers/api.py:251-254` executes:
`result: dict[str, Any] = {"created": created}` (with `created == []`),
attaches
`errors` if any, and calls `return self.response(201, result=result)`,
causing the
endpoint to return HTTP 201 Created even though no semantic views were
actually created
and the result payload only contains errors, contradicting the documented
possibility
of a 422 response (`api.py:212-220`) for invalid input.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/semantic_layers/api.py
**Line:** 251:254
**Comment:**
*Logic Error: The bulk create endpoint for semantic views always
returns HTTP 201, even when every requested view fails validation or creation
and none are actually created, which misleads clients into treating a complete
failure as success and contradicts the documented 422 error response for
invalid payloads.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38396&comment_hash=39b6b944514b3a2afb2271aee737c90f17e16f2365cc13b48316c54cc11d0db8&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38396&comment_hash=39b6b944514b3a2afb2271aee737c90f17e16f2365cc13b48316c54cc11d0db8&reaction=dislike'>👎</a>
##########
superset/commands/semantic_layer/delete.py:
##########
@@ -54,3 +56,26 @@ def validate(self) -> None:
self._model = SemanticLayerDAO.find_by_uuid(self._uuid)
if not self._model:
raise SemanticLayerNotFoundError()
+
+
+class DeleteSemanticViewCommand(BaseCommand):
+ def __init__(self, pk: int):
+ self._pk = pk
+ self._model: SemanticView | None = None
+
+ @transaction(
+ on_error=partial(
+ on_error,
+ catches=(SQLAlchemyError,),
+ reraise=SemanticViewDeleteFailedError,
+ )
+ )
+ def run(self) -> None:
+ self.validate()
+ assert self._model
+ SemanticViewDAO.delete([self._model])
+
+ def validate(self) -> None:
+ self._model = SemanticViewDAO.find_by_id(self._pk)
+ if not self._model:
+ raise SemanticViewNotFoundError()
Review Comment:
**Suggestion:** The delete command for semantic views looks up the model
using `SemanticViewDAO.find_by_id(self._pk)` but `SemanticViewDAO` is
configured with `id_column_name = "uuid"`, while the API passes an integer `pk`
corresponding to the `id` column; this mismatch means the lookup will always
fail and deletes will always return "not found" even for existing views. [logic
error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ SemanticViewRestApi.delete always returns 404 for existing views.
- ❌ Semantic view records cannot be deleted via REST API.
- ⚠️ Semantic view CRUD UI cannot remove obsolete views.
```
</details>
```suggestion
def validate(self) -> None:
# `pk` is the integer surrogate ID, so search by the `id` column
explicitly
self._model = SemanticViewDAO.find_by_id(self._pk, id_column="id")
if not self._model:
raise SemanticViewNotFoundError()
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Create a semantic view via the bulk create endpoint handled by
`SemanticViewRestApi.post` at `superset/semantic_layers/api.py:177-254`,
which ultimately
calls `CreateSemanticViewCommand` and persists a `SemanticView` row in
`semantic_views`
(model defined at `superset/semantic_layers/models.py:154-175` with `uuid`
as primary key
and separate integer `id` at line 164).
2. Note that the created row has both a UUID primary key
(`SemanticView.uuid`) and an
auto-incremented integer `id` (`SemanticView.id`), per the model definition
at
`superset/semantic_layers/models.py:163-165`.
3. Call the DELETE endpoint exposed on `SemanticViewRestApi` at
`superset/semantic_layers/api.py:329-360` (route `@expose("/<pk>",
methods=("DELETE",))`)
using the integer `pk` equal to the `SemanticView.id` from step 2. This
triggers
`SemanticViewRestApi.delete`, which calls
`DeleteSemanticViewCommand(pk).run()` at line
357.
4. Inside `DeleteSemanticViewCommand.run` in
`superset/commands/semantic_layer/delete.py:73-76`, `self.validate()` invokes
`DeleteSemanticViewCommand.validate` at lines 78-81, which calls
`SemanticViewDAO.find_by_id(self._pk)`. Because
`SemanticViewDAO.id_column_name = "uuid"`
(`superset/daos/semantic_layer.py:106-110`) and `BaseDAO.find_by_id`
defaults to that
column (`superset/daos/base.py:287-308`), the lookup compares the UUID
column to the
integer `pk` and returns `None`. `validate` then raises
`SemanticViewNotFoundError`, which
`SemanticViewRestApi.delete` catches at
`superset/semantic_layers/api.py:359-360`,
returning a 404 "not found" even though the view exists and the caller
provided the
correct integer `id`.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/commands/semantic_layer/delete.py
**Line:** 78:81
**Comment:**
*Logic Error: The delete command for semantic views looks up the model
using `SemanticViewDAO.find_by_id(self._pk)` but `SemanticViewDAO` is
configured with `id_column_name = "uuid"`, while the API passes an integer `pk`
corresponding to the `id` column; this mismatch means the lookup will always
fail and deletes will always return "not found" even for existing views.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38396&comment_hash=fa0b64fd941bf955505bbcf6ad6c699a5556222d1eadf8f0639374a1bb6ac30f&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38396&comment_hash=fa0b64fd941bf955505bbcf6ad6c699a5556222d1eadf8f0639374a1bb6ac30f&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]