codeant-ai-for-open-source[bot] commented on code in PR #37067:
URL: https://github.com/apache/superset/pull/37067#discussion_r2718586758
##########
superset/db_engine_specs/base.py:
##########
@@ -499,6 +510,8 @@ def start_oauth2_dance(cls, database: Database) -> None:
# UUID to the original tab, and the second tab will use it when
sending the
# message.
"tab_id": tab_id,
+ # PKCE code verifier stored in state to be retrieved during token
exchange
+ "code_verifier": code_verifier,
Review Comment:
**Suggestion:** The PKCE code verifier is being embedded into the OAuth2
`state` payload, which is a signed but unencrypted JWT sent as a query
parameter; this allows any interceptor who can see the authorization code to
also decode the state and obtain the `code_verifier`, defeating the security
guarantees of PKCE and re‑enabling authorization code interception attacks.
[security]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Database OAuth2 authentication compromise risk
- ❌ OAuth2 redirect flow (DatabaseRestApi.oauth2) security degraded
- ⚠️ Token exchange endpoint can be abused if intercepted
- ⚠️ Unit tests currently assume state contains code_verifier
```
</details>
```suggestion
# PKCE code verifier is managed server-side and not stored in
OAuth2 state
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Trigger a DB OAuth2 authorization flow by calling
BaseEngineSpec.start_oauth2_dance
(superset/db_engine_specs/base.py:472-516). This method generates a PKCE
code_verifier
(base.py:490) and inserts it into the OAuth2 state dict under the
"code_verifier" key
(base.py:513-514).
2. The state dict is encoded and sent to the provider via
BaseEngineSpec.get_oauth2_authorization_uri, which calls
encode_oauth2_state(state)
(base.py:572-576 -> encode_oauth2_state defined in
superset/utils/oauth2.py:152). The
code_verifier is therefore embedded into the encoded state that becomes a
query parameter
on the authorization request URL.
3. Inspect superset/utils/oauth2.py: around lines 152-165 —
encode_oauth2_state includes
state keys in the JWT payload and (per grep results) will add
"code_verifier" to the
encoded payload when present (utils/oauth2.py:163 shows
payload["code_verifier"] =
state["code_verifier"]). Because the state is a signed JWT but not
encrypted, any party
that can observe the authorization request/redirect (e.g., network traffic,
logs, browser
history, referrers) and can decode the JWT (signed but not encrypted) can
obtain the
code_verifier.
4. An attacker who intercepts the authorization code and the state (both are
returned to
the redirect URI) can extract the code_verifier from the state and use it to
redeem the
intercepted authorization code at the token endpoint. The token exchange
path that accepts
a code_verifier exists (BaseEngineSpec.get_oauth2_token, base.py:589-614)
and is called by
the server-side command that completes the flow
(superset/commands/database/oauth2.py:50-58 passes code_verifier from state
into token
request). This demonstrates the concrete end-to-end leak: start_oauth2_dance
stores
code_verifier in state -> encode_oauth2_state places it in JWT payload ->
token exchange
consumes it from the received state.
5. Conclusion / why this is a problem: Embedding the PKCE secret
(code_verifier) into the
client-visible state defeats PKCE's protection model; PKCE requires the
verifier to remain
confidential between the client and the authorization server exchange step.
The current
implementation places that secret into a signed-but-unencrypted state JWT
that accompanies
the authorization code, enabling an interceptor to both obtain the
authorization code and
the secret needed to redeem it.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/db_engine_specs/base.py
**Line:** 513:514
**Comment:**
*Security: The PKCE code verifier is being embedded into the OAuth2
`state` payload, which is a signed but unencrypted JWT sent as a query
parameter; this allows any interceptor who can see the authorization code to
also decode the state and obtain the `code_verifier`, defeating the security
guarantees of PKCE and re‑enabling authorization code interception attacks.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]