codeant-ai-for-open-source[bot] commented on code in PR #40746:
URL: https://github.com/apache/superset/pull/40746#discussion_r3407770291
##########
superset/mcp_service/user/schemas.py:
##########
@@ -278,7 +306,11 @@ def serialize_user_object(
     user_roles = getattr(user, "roles", None)
     if user_roles is not None:
         try:
-            roles = [r.name for r in user_roles if hasattr(r, "name")]
+            roles = [
+                escape_llm_context_delimiters(r.name)
+                for r in user_roles
+                if hasattr(r, "name") and isinstance(r.name, str)
+            ]
Review Comment:
**Suggestion:** The role extraction in `serialize_user_object` is
all-or-nothing: if any single role object raises `DetachedInstanceError` or
`AttributeError`, the list comprehension aborts and `roles` is set to `None`,
dropping all otherwise valid role names. Handle exceptions per role item (like
the new `UserInfo` validator does) so one bad role does not erase the full role
list. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ get_user_info may omit valid roles when one detaches.
- ⚠️ list_users may redact all roles due to one bad role.
- ⚠️ LLM clients get incomplete permission context for affected users.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open `superset/mcp_service/user/schemas.py` and locate
   `serialize_user_object` at lines 291–293, and the roles comprehension and
   exception handler at lines 45–56 (diff lines 304–316).
2. In `tests/unit_tests/mcp_service/user/test_schemas.py`, note how
   `DetachedRole` is defined in
   `test_user_info_ignores_role_with_detached_instance` at lines 62–67 to
   simulate an ORM role whose `.name` property raises `DetachedInstanceError`.
3. Create a FAB-like user object in a test (or REPL) with
   `user.roles = [role_good, role_detached]`, where `role_good.name == "Admin"`
   and `role_detached` is the `DetachedRole` described above, then call
   `serialize_user_object(user, include_sensitive=True, include_roles=True)`
   from `superset/mcp_service/user/schemas.py:291–293`.
4. When the comprehension at diff lines 309–313 executes, accessing
   `role_detached.name` raises `DetachedInstanceError`. The
   `except (AttributeError, DetachedInstanceError)` block at diff lines 314–315
   then runs and sets `roles = None`, so the returned `UserInfo.roles` (used by
   `get_user_info` at `superset/mcp_service/user/tool/get_user_info.py:16–18`
   and `list_users` at `superset/mcp_service/user/tool/list_users.py:12–19`)
   drops the valid `"Admin"` role instead of keeping it and skipping only the
   bad role.
```
</details>
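A minimal sketch of the per-role handling suggested above, not the PR's actual code. The helper name `extract_role_names` and its `escape` parameter are hypothetical; the local `DetachedInstanceError` class is a stand-in for `sqlalchemy.orm.exc.DetachedInstanceError` so the snippet runs without SQLAlchemy, and `escape` stands in for `escape_llm_context_delimiters`.

```python
from typing import Any, Callable, Iterable


class DetachedInstanceError(Exception):
    """Stand-in for sqlalchemy.orm.exc.DetachedInstanceError (assumption)."""


def extract_role_names(
    user_roles: Iterable[Any],
    escape: Callable[[str], str] = lambda s: s,  # hypothetical escaping hook
) -> list[str]:
    """Collect role names one role at a time.

    Each attribute access is wrapped individually, so a role that raises
    is skipped without discarding names already collected.
    """
    names: list[str] = []
    for role in user_roles:
        try:
            name = role.name
        except (AttributeError, DetachedInstanceError):
            # Skip only the offending role; keep processing the rest.
            continue
        if isinstance(name, str):
            names.append(escape(name))
    return names
```

Because `hasattr` only swallows `AttributeError`, a property that raises `DetachedInstanceError` would still propagate out of a comprehension guarded by `hasattr(r, "name")`; moving the `try` inside the loop is what limits the damage to one role.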
##########
tests/unit_tests/mcp_service/user/test_schemas.py:
##########
@@ -0,0 +1,179 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Unit tests for user-related MCP schemas."""
+
+from unittest.mock import MagicMock
+
+import pytest
+from pydantic import ValidationError
+from sqlalchemy.orm.exc import DetachedInstanceError
+
+from superset.mcp_service.user.schemas import (
+    sanitize_for_llm_context,
+    serialize_user_object,
+    UserInfo,
+)
+
+
+def test_user_info_rejects_bare_string_for_roles() -> None:
+    """A plain string must not be silently split into individual characters."""
+    with pytest.raises(ValidationError):
+        UserInfo(roles="Admin")
+
+
+def test_user_info_preserves_empty_roles_list() -> None:
+    """Empty roles should remain [] so callers can distinguish it from None."""
+    info = UserInfo(roles=[])
+    assert info.roles == []
+
+
+def test_user_info_coerces_role_objects_to_names() -> None:
+    """Role-like ORM objects must be converted to their .name strings."""
+    role_admin = MagicMock()
+    role_admin.name = "Admin"
+    role_alpha = MagicMock()
+    role_alpha.name = "Alpha"
+
+    info = UserInfo(roles=[role_admin, role_alpha])
+
+    assert info.roles == ["Admin", "Alpha"]
+
+
+def test_user_info_ignores_role_with_detached_instance() -> None:
+    """Detached ORM roles must not blow up serialization."""
+    role_good = MagicMock()
+    role_good.name = "Admin"
+
+    class DetachedRole:
+        @property
+        def name(self):
+            raise DetachedInstanceError()
+
+    role_detached = DetachedRole()
+
+    info = UserInfo(roles=[role_good, role_detached])
+
+    assert info.roles == ["Admin"]
+
+
+def test_serialize_user_object_round_trip_with_empty_roles() -> None:
+    """serialize_user_object must produce UserInfo.roles == [] for empty roles."""
+    user = MagicMock()
+    user.id = 1
+    user.username = "admin"
+    user.first_name = "Admin"
+    user.last_name = "User"
+    user.active = True
+    user.email = "[email protected]"
+    user.changed_on = None
+    user.roles = []
+
+    info = serialize_user_object(user, include_sensitive=True, include_roles=True)
+
+    assert info is not None
+    assert info.roles == []
+    assert info.username == "admin"
+    assert info.first_name == sanitize_for_llm_context(
+        "Admin", field_path=("first_name",)
+    )
+    assert info.last_name == sanitize_for_llm_context(
+        "User", field_path=("last_name",)
+    )
+    assert info.active is True
+    assert info.email == "[email protected]"
+
+
+def test_serialize_user_object_round_trip_with_role_objects() -> None:
+    """Full from_attributes path through serialize_user_object -> UserInfo."""
+    role_admin = MagicMock()
+    role_admin.name = "Admin"
+
+    user = MagicMock()
+    user.id = 1
+    user.username = "admin"
+    user.first_name = "Admin"
+    user.last_name = "User"
+    user.active = True
+    user.email = "[email protected]"
+    user.changed_on = None
+    user.roles = [role_admin]
+
+    info = serialize_user_object(user, include_sensitive=True, include_roles=True)
+
+    assert info is not None
+    assert info.roles == ["Admin"]
+    assert info.username == "admin"
+    assert info.first_name == sanitize_for_llm_context(
+        "Admin", field_path=("first_name",)
+    )
+    assert info.last_name == sanitize_for_llm_context(
+        "User", field_path=("last_name",)
+    )
+    assert info.active is True
+    assert info.email == "[email protected]"
+
+
+def test_serialize_user_object_skips_roles_when_include_roles_false() -> None:
+    """serialize_user_object must return roles=None when include_roles=False."""
+    role_admin = MagicMock()
+    role_admin.name = "Admin"
+
+    user = MagicMock()
+    user.id = 1
+    user.username = "admin"
+    user.first_name = "Admin"
+    user.last_name = "User"
+    user.active = True
+    user.email = "[email protected]"
+    user.changed_on = None
+    user.roles = [role_admin]
+
+    info = serialize_user_object(user, include_sensitive=True, include_roles=False)
+
+    assert info is not None
+    assert info.roles is None
+    assert info.email == "[email protected]"
+
+
+def test_serialize_user_object_skips_email_when_include_sensitive_false() -> None:
+    """serialize_user_object must return email=None when include_sensitive=False."""
+    role_admin = MagicMock()
+    role_admin.name = "Admin"
+
+    user = MagicMock()
+    user.id = 1
+    user.username = "admin"
+    user.first_name = "Admin"
+    user.last_name = "User"
+    user.active = True
+    user.email = "[email protected]"
+    user.changed_on = None
+    user.roles = [role_admin]
+
+    info = serialize_user_object(user, include_sensitive=False, include_roles=True)
+
+    assert info is not None
+    assert info.email is None
+    assert info.roles == ["Admin"]
Review Comment:
**Suggestion:** This assertion contradicts the serializer contract and
current implementation: when `include_sensitive=False`, both sensitive fields
(`email` and `roles`) are redacted. Expecting `roles == ["Admin"]` will cause
this test to fail and enforces incorrect behavior. Update the expectation to
`roles is None`. [logic error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Unit test fails enforcing behavior opposite documented contract.
- ⚠️ Confuses whether roles are treated as sensitive metadata.
- ⚠️ Can block CI for MCP user schema changes.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open `superset/mcp_service/user/schemas.py` and inspect
   `serialize_user_object` at lines 291–293 and 45–71 (diff lines 291–317):
   `roles` is only populated when `include_sensitive and include_roles` is
   true; otherwise `roles` remains `None` and is passed as such to `UserInfo`.
2. Open `tests/unit_tests/mcp_service/user/tool/test_user_tools.py` and note
   `test_get_user_info_redacts_sensitive_when_denied` at lines 87–105, which
   asserts `data["email"] is None` and `data["roles"] is None` when
   `user_can_view_data_model_metadata` is patched to return `False`, confirming
   the contract that roles are sensitive and redacted with
   `include_sensitive=False`.
3. Open `tests/unit_tests/mcp_service/user/test_schemas.py` and locate
   `test_serialize_user_object_skips_email_when_include_sensitive_false` around
   diff lines 153–172, where a user with a single `"Admin"` role is passed to
   `serialize_user_object(user, include_sensitive=False, include_roles=True)`.
4. Run `pytest
   tests/unit_tests/mcp_service/user/test_schemas.py::test_serialize_user_object_skips_email_when_include_sensitive_false`;
   the function returns a `UserInfo` with `info.email is None` and
   `info.roles is None` per the implementation in
   `superset/mcp_service/user/schemas.py`, causing the assertion at diff line
   172 (`assert info.roles == ["Admin"]`) to fail and incorrectly demand
   behavior that contradicts both the serializer docstring and the tool-level
   privacy tests.
```
</details>
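To make the gating described in step 1 concrete, here is a simplified sketch. It assumes, as the comment states, that `roles` is populated only when both flags are true; `roles_for_output` is a hypothetical stand-in written for illustration and is not the body of `serialize_user_object`.

```python
from types import SimpleNamespace
from typing import Any, Optional


def roles_for_output(
    user: Any, include_sensitive: bool, include_roles: bool
) -> Optional[list[str]]:
    """Return role names only when both flags allow it (assumed contract)."""
    if not (include_sensitive and include_roles):
        # Roles are treated as sensitive here, so either flag being
        # False redacts them entirely.
        return None
    user_roles = getattr(user, "roles", None)
    if user_roles is None:
        return None
    return [
        r.name for r in user_roles if isinstance(getattr(r, "name", None), str)
    ]


# Illustrative user with a single "Admin" role.
user = SimpleNamespace(roles=[SimpleNamespace(name="Admin")])
redacted = roles_for_output(user, include_sensitive=False, include_roles=True)
visible = roles_for_output(user, include_sensitive=True, include_roles=True)
```

Under that assumption, the flagged test would assert `info.roles is None` for `include_sensitive=False`, matching the tool-level privacy test cited in step 2.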
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]