Re: [PR] fix(mcp): generic auth errors, required token expiry, and safer auth logging [superset]

via GitHub Mon, 01 Jun 2026 17:27:14 -0700


codeant-ai-for-open-source[bot] commented on code in PR #40646:
URL: https://github.com/apache/superset/pull/40646#discussion_r3337894066



##########
superset/mcp_service/middleware.py:
##########
@@ -511,7 +511,7 @@ async def on_list_tools(
                 try:
                     user = get_user_from_request()
                 except ValueError as exc:
-                    if "No authenticated user found" in str(exc):
+                    if "Authentication required" in str(exc):

Review Comment:
   **Suggestion:** The fail-open branch is keyed off a substring match on 
exception text, which is attacker-influenced in some `ValueError` paths (for 
example, JWT-derived usernames included in "user not found" messages). A 
crafted credential error containing this phrase will be misclassified as "no 
auth configured" and incorrectly return the full tools list. Replace this 
message-substring check with a stable signal (dedicated exception type, 
structured error code, or exact sentinel constant from the auth layer) so 
invalid credentials always stay fail-closed. [security]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Invalid JWT user can still see full tools/list output.
   - ⚠️ Unauthorized actors can enumerate MCP tools and metadata.
   - ⚠️ Violates intended fail-closed behavior for bad credentials.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Start the MCP server so that the middleware stack built in
   `superset/mcp_service/server.py:699-711` is active; this stack includes
   `RBACToolVisibilityMiddleware`, which filters every `tools/list` request.
   
   2. Enable JWT auth so `_resolve_user_from_jwt_context()` is used by
   `get_user_from_request()` (`superset/mcp_service/auth.py:259-55`), and issue 
an access
   token whose username claim (e.g., `preferred_username` or `sub`) is the 
string
   `"Authentication required. No valid credentials provided."` — this value is 
returned
   unchanged by `default_user_resolver()` in 
`superset/mcp_service/mcp_config.py:8-27`.
   
   3. Ensure there is no Superset user with that username so that
   `_resolve_user_from_jwt_context()` raises a `ValueError` at
   `superset/mcp_service/auth.py:307-311` with the message `JWT authenticated 
user
   'Authentication required. No valid credentials provided.' not found in 
Superset database.
   Ensure the user exists before granting MCP access.`; note that the 
attacker-controlled
   username is interpolated into the error string.
   
   4. From an MCP client, call `tools/list` using this JWT; the request reaches
   `RBACToolVisibilityMiddleware.on_list_tools()` at
   `superset/mcp_service/middleware.py:498-523`, where 
`get_user_from_request()` raises the
   `ValueError`, `except ValueError as exc:` is entered, and the substring 
check `if
   "Authentication required" in str(exc):` at line 514 evaluates True because 
the injected
   username contains that phrase. The middleware incorrectly treats this as the 
"no auth
   configured" fail-open case and returns the full `tools` list instead of 
failing closed
   (returning `[]`) for invalid credentials.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=e1dd35732eaf480a96103b5d4b6d8211&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=e1dd35732eaf480a96103b5d4b6d8211&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/middleware.py
   **Line:** 514:514
   **Comment:**
        *Security: The fail-open branch is keyed off a substring match on 
exception text, which is attacker-influenced in some `ValueError` paths (for 
example, JWT-derived usernames included in "user not found" messages). A 
crafted credential error containing this phrase will be misclassified as "no 
auth configured" and incorrectly return the full tools list. Replace this 
message-substring check with a stable signal (dedicated exception type, 
structured error code, or exact sentinel constant from the auth layer) so 
invalid credentials always stay fail-closed.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40646&comment_hash=65cdfd91479f2ff29ef81308aa971d3b3cb848029779dd07948110d2d1fe8187&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40646&comment_hash=65cdfd91479f2ff29ef81308aa971d3b3cb848029779dd07948110d2d1fe8187&reaction=dislike'>👎</a>



##########
superset/mcp_service/auth.py:
##########
@@ -421,29 +421,28 @@ def get_user_from_request() -> User:
     if hasattr(g, "user") and g.user:
         return g.user
 
-    # No auth source available — raise with diagnostic details
+    # No auth source available. Keep the client-facing message generic so it
+    # does not disclose server configuration; the detailed diagnostics are
+    # logged server-side only.
     auth_enabled = current_app.config.get("MCP_AUTH_ENABLED", False)
     jwt_configured = bool(
         current_app.config.get("MCP_JWKS_URI")
         or current_app.config.get("MCP_JWT_PUBLIC_KEY")
         or current_app.config.get("MCP_JWT_SECRET")
     )
-    details = [
-        f"No JWT access token in MCP request context "
-        f"(MCP_AUTH_ENABLED={auth_enabled}, "
-        f"JWT keys configured={jwt_configured})",
-        "No API key in Authorization header",
-        "MCP_DEV_USERNAME is not configured",
-        "g.user was not set by external middleware",
-    ]
-    configured_prefixes = current_app.config.get("FAB_API_KEY_PREFIXES", 
["sst_"])
-    prefix_example = configured_prefixes[0] if configured_prefixes else "sst_"
-    raise ValueError(
-        "No authenticated user found. Tried:\n"
-        + "\n".join(f"  - {d}" for d in details)
-        + f"\n\nEither pass a valid API key (Bearer {prefix_example}...), "
-        "JWT token, or configure MCP_DEV_USERNAME for development."
+    dev_username_configured = bool(current_app.config.get("MCP_DEV_USERNAME"))
+    logger.warning(
+        "MCP authentication failed: no valid credentials provided "
+        "(no JWT access token, no API key, no g.user from middleware)"
     )

Review Comment:
   **Suggestion:** This logs expected unauthenticated requests at WARNING, 
which conflicts with the existing fail-open/no-noise flow for `tools/list` and 
will produce noisy warning spam in normal dev/internal deployments. Lower this 
to DEBUG (or gate WARNING to true credential failures only) so only real auth 
failures trigger warning-level alerts. [incorrect log level]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ tools/list RBAC middleware emits spurious auth warning logs.
   - ⚠️ Dev/internal MCP deployments see noisy warning-level auth spam.
   - ⚠️ Alerting on WARNING may fire on normal unauthenticated traffic.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Start the MCP server with no auth sources configured: `MCP_AUTH_ENABLED` 
False or
   unset, no `MCP_JWKS_URI`/`MCP_JWT_PUBLIC_KEY`/`MCP_JWT_SECRET`, no 
`MCP_DEV_USERNAME`, and
   no external middleware setting `g.user` (affects `get_user_from_request()` at
   `superset/mcp_service/auth.py:379`–:96).
   
   2. From an MCP client (e.g. Claude, Cursor), trigger a `tools/list` call, 
which hits
   `RBACToolVisibilityMiddleware.on_list_tools()` in
   `superset/mcp_service/middleware.py:19`–:57`. This method calls 
`get_user_from_request()`
   at `middleware.py:33`.
   
   3. Inside `get_user_from_request()` (auth.py:53–96), JWT and API-key 
resolvers return
   `None`, `MCP_DEV_USERNAME` is unset, and `g.user` is absent, so execution 
reaches the
   fallback at auth.py:75–96. There it executes the added `logger.warning("MCP 
authentication
   failed: no valid credentials provided ...")` at approximately `auth.py:85` 
before raising
   `ValueError("Authentication required. No valid credentials provided.")`.
   
   4. Back in `RBACToolVisibilityMiddleware.on_list_tools()`, the `ValueError` 
is caught at
   `middleware.py:34`–:45`. Because the message contains `"Authentication 
required"`, the
   branch at `middleware.py:35`–:38` treats this as an expected 
unauthenticated/dev scenario
   and returns the original tool list **without logging anything**. However, 
the warning from
   step 3 has already been emitted, so every unauthenticated `tools/list` call 
in
   dev/internal deployments produces a WARNING-level log entry despite being an 
expected,
   fail-open path.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=abe2bc4c53644fa99cf747124acf7851&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=abe2bc4c53644fa99cf747124acf7851&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/auth.py
   **Line:** 434:437
   **Comment:**
        *Incorrect Log Level: This logs expected unauthenticated requests at 
WARNING, which conflicts with the existing fail-open/no-noise flow for 
`tools/list` and will produce noisy warning spam in normal dev/internal 
deployments. Lower this to DEBUG (or gate WARNING to true credential failures 
only) so only real auth failures trigger warning-level alerts.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40646&comment_hash=68b335be2d9a07c987bde283a4a33fc818102a153d14465b7d4acb8dd92b2cfe&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40646&comment_hash=68b335be2d9a07c987bde283a4a33fc818102a153d14465b7d4acb8dd92b2cfe&reaction=dislike'>👎</a>



##########
superset/mcp_service/utils/error_sanitization.py:
##########
@@ -63,8 +66,24 @@ def _get_generic_error_message(error_str: str) -> str | None:
     return None
 
 
-def _sanitize_validation_error(error: Exception) -> str:
-    """SECURITY FIX: Sanitize validation errors to prevent disclosure."""
+def _sanitize_validation_error(error: Exception, log_original: bool = True) -> 
str:
+    """SECURITY FIX: Sanitize validation errors to prevent disclosure.
+
+    Args:
+        error: The original exception to sanitize for client-facing output.
+        log_original: When True (default), log the original (unsanitized)
+            error server-side at INFO level before returning the sanitized
+            version. This preserves full diagnostics for operators while the
+            client only ever receives the sanitized message. Set to False to
+            suppress the server-side log (e.g. when the caller already logged).
+    """
+    if log_original:
+        logger.info(
+            "Sanitizing validation error (%s): %s",
+            type(error).__name__,
+            error,
+        )

Review Comment:
   **Suggestion:** The new server-side logging writes the raw exception text 
before any escaping, so attacker-controlled values containing newlines/tabs can 
inject forged log lines and break log integrity. Sanitize control characters 
(like the JWT log sanitizer does) before logging the original message, or log a 
structured/escaped representation. [security]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ Validation pipeline logs attacker-influenced errors without escaping.
   - ⚠️ generate_chart tool errors can inject multi-line messages into logs.
   - ⚠️ Log-line structure for MCP validation errors can be corrupted.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Run the chart validation pipeline through the MCP generate-chart tool: an 
MCP client
   calls `generate_chart` in 
`superset/mcp_service/chart/tool/generate_chart.py:98`–:169`,
   which invokes 
`ValidationPipeline.validate_request_with_warnings(request.model_dump())` at
   `generate_chart.py:16`–:18`.
   
   2. Inside `ValidationPipeline.validate_request_with_warnings()`
   (`superset/mcp_service/chart/validation/pipeline.py:31`–:101`), if any 
unexpected
   exception is raised in schema/dataset/runtime validation or normalization, 
the `except
   Exception as e:` block at `pipeline.py:87`–:101` is entered. This block 
first logs
   `logger.exception("Validation pipeline error")` and then calls
   `_sanitize_validation_error(e)` at `pipeline.py:93`–:94`.
   
   3. `_sanitize_validation_error()` in
   `superset/mcp_service/utils/error_sanitization.py:69`–:128` is invoked with 
its default
   `log_original=True`. Before any sanitization, it executes the new logging 
block at lines
   80–85: `logger.info("Sanitizing validation error (%s): %s", 
type(error).__name__, error)`,
   which interpolates `str(error)` directly into the log message without 
escaping control
   characters.
   
   4. The unit test `test_sanitize_logs_original_by_default` in
   
`superset/tests/unit_tests/mcp_service/utils/test_error_sanitization.py:13`–:29`
 confirms
   this behavior: it calls `_sanitize_validation_error(ValueError("Invalid 
reference to table
   secret_revenue in query"))`, then inspects `caplog.records` and asserts that 
the INFO log
   message contains the raw substring `"secret_revenue"`. If the `error` 
message instead
   contained attacker-controlled newlines or tabs (e.g. from a Pydantic or 
database error
   that embeds user-supplied text), those control characters would also be 
logged unescaped
   at INFO level, allowing crafted exception text to inject extra lines or 
confuse
   line-oriented log parsers.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=0a7136850c6948f883031b0f92b2098e&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=0a7136850c6948f883031b0f92b2098e&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/mcp_service/utils/error_sanitization.py
   **Line:** 80:85
   **Comment:**
        *Security: The new server-side logging writes the raw exception text 
before any escaping, so attacker-controlled values containing newlines/tabs can 
inject forged log lines and break log integrity. Sanitize control characters 
(like the JWT log sanitizer does) before logging the original message, or log a 
structured/escaped representation.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40646&comment_hash=c8f3ec967e4d83f25e87218e93d2e6bd71af67ae10a152f4e04e0dd895fd0576&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40646&comment_hash=c8f3ec967e4d83f25e87218e93d2e6bd71af67ae10a152f4e04e0dd895fd0576&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] fix(mcp): generic auth errors, required token expiry, and safer auth logging [superset]

Reply via email to