codeant-ai-for-open-source[bot] commented on code in PR #38454:
URL: https://github.com/apache/superset/pull/38454#discussion_r2892206607
##########
superset/utils/oauth2.py:
##########
@@ -176,6 +176,171 @@ def refresh_oauth2_token(
return token.access_token
+def save_user_provider_token(
+ user_id: int,
+ provider: str,
+ token_response: dict[str, Any],
+) -> None:
+ """
+ Upsert a UserAuthToken record for the given user and login provider.
+
+ Called after a successful Superset OAuth login when the provider has
+ ``save_token: True`` set in ``OAUTH_PROVIDERS``.
+ """
+ # pylint: disable=import-outside-toplevel
+ from superset.models.core import UserAuthToken
+
+ token = (
+ db.session.query(UserAuthToken)
+ .filter_by(user_id=user_id, provider=provider)
+ .one_or_none()
+ )
+ if token is None:
+ token = UserAuthToken(user_id=user_id, provider=provider)
+
+ token.access_token = token_response.get("access_token")
+ expires_in = token_response.get("expires_in")
+ if expires_in is not None:
+ token.access_token_expiration = datetime.now() + timedelta(
+ seconds=int(expires_in)
+ )
+ else:
+ token.access_token_expiration = None
+ token.refresh_token = token_response.get("refresh_token")
+
+ db.session.add(token)
+ db.session.commit()
+
+
+def get_upstream_provider_token(provider: str, user_id: int) -> str | None:
+ """
+ Return a valid access token stored for the given login provider and user.
+
+ If the stored token is expired and a refresh token is available, the token
+ is refreshed and the new access token is returned. Returns ``None`` when
no
+ token is found or when the token cannot be refreshed.
+ """
+ # pylint: disable=import-outside-toplevel
+ from superset.models.core import UserAuthToken
+
+ token: UserAuthToken | None = (
+ db.session.query(UserAuthToken)
+ .filter_by(user_id=user_id, provider=provider)
+ .one_or_none()
+ )
+ if token is None:
+ return None
+
+ if token.access_token and (
+ token.access_token_expiration is None
+ or datetime.now() < token.access_token_expiration
+ ):
+ return token.access_token
+
+ if token.refresh_token:
+ return _refresh_upstream_provider_token(token, provider)
+
+ # Expired with no refresh token โ discard stale record
+ db.session.delete(token)
+ db.session.commit()
+ return None
+
+
+def _refresh_upstream_provider_token(
+ token: "UserAuthToken",
+ provider: str,
+) -> str | None:
+ """
+ Refresh an upstream provider token using the stored refresh token.
+
+ Looks up the provider's token endpoint via Flask-AppBuilder's remote app
+ registry, issues a refresh-grant request, persists the new token, and
+ returns the new access token (or ``None`` if the refresh fails).
+ """
+ import requests # pylint: disable=import-outside-toplevel
+
+ provider_configs: list[dict[str, Any]] = app.config.get("OAUTH_PROVIDERS",
[])
+ provider_config = next(
+ (p for p in provider_configs if p.get("name") == provider), None
+ )
+ if not provider_config:
+ logger.warning(
+ "Cannot refresh upstream token: provider '%s' not found in
OAUTH_PROVIDERS",
+ provider,
+ )
+ return None
+
+ # Retrieve the token endpoint from the registered remote app's server
metadata
+ from flask_appbuilder import current_app as fab_app # pylint:
disable=import-outside-toplevel
+
+ sm = fab_app.appbuilder.sm
Review Comment:
**Suggestion:** The security manager attribute used to look up the OAuth
remote app is incorrect, so the code will always fall back to the empty
default, never find a remote app, and therefore never be able to refresh
upstream provider tokens, leaving users with permanently expired tokens. [logic
error]
<details>
<summary><b>Severity Level:</b> Critical ๐จ</summary>
```mdx
- โ Upstream OAuth tokens never refreshed; expire after first lifetime.
- โ ๏ธ Databases using upstream tokens fall back or fail auth.
```
</details>
```suggestion
remote_app = getattr(sm, "oauth_remotes", {}).get(provider)
```
<details>
<summary><b>Steps of Reproduction โ
</b></summary>
```mdx
1. Configure an OAuth login provider with token saving enabled in
`superset/config.py`,
e.g.:
`OAUTH_PROVIDERS = [{"name": "keycloak", "remote_app": {...},
"save_token": True}]`
and map it for a database via `DATABASE_OAUTH2_UPSTREAM_PROVIDERS =
{"trino":
"keycloak"}`.
2. Start Superset and log in via this provider; the overridden
`auth_user_oauth()` in
`superset/security/manager.py` (added in this PR, exact line not shown)
calls
`save_user_provider_token()` in `superset/utils/oauth2.py:179-213`,
persisting a
`UserAuthToken` row in `superset/models/core.py:1370-1390` with both
`access_token`
and `refresh_token`.
3. After the `user_auth_tokens.access_token_expiration` column has passed
(or after
manually expiring it in the database), issue a query against the mapped
Trino
database from SQL Lab or a dashboard, which triggers
`Database._get_sqla_engine()`
in `superset/models/core.py` (exact line not shown) and, because an
upstream
provider is configured, it calls `get_upstream_provider_token(provider,
user_id)`
in `superset/utils/oauth2.py:215-246`.
4. Inside `get_upstream_provider_token()`, the expired token with a nonโnull
`refresh_token` causes a call to `_refresh_upstream_provider_token(token,
provider)`
in `superset/utils/oauth2.py:249-341`, where
`remote_app = getattr(sm, "oauth_remoteapp", {}).get(provider)` executes.
`SupersetSecurityManager` actually exposes `sm.oauth_remotes` (not
`oauth_remoteapp`),
so `getattr(sm, "oauth_remoteapp", {})` returns `{}`, `remote_app` is
`None`,
the function logs "remote app not registered" and returns `None`. As a
result,
`get_upstream_provider_token()` returns `None`, the upstream token is
never
refreshed, and the database connection must fall back to a second OAuth
dance
(or fails if no DBโlevel OAuth is configured).
```
</details>
<details>
<summary><b>Prompt for AI Agent ๐ค </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/utils/oauth2.py
**Line:** 276:276
**Comment:**
*Logic Error: The security manager attribute used to look up the OAuth
remote app is incorrect, so the code will always fall back to the empty
default, never find a remote app, and therefore never be able to refresh
upstream provider tokens, leaving users with permanently expired tokens.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38454&comment_hash=1105379d2cd291d05d68aa655a0c10877d2b6c900a41d7fa58fb06cab82009d4&reaction=like'>๐</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38454&comment_hash=1105379d2cd291d05d68aa655a0c10877d2b6c900a41d7fa58fb06cab82009d4&reaction=dislike'>๐</a>
##########
superset/models/core.py:
##########
@@ -509,17 +510,23 @@ def _get_sqla_engine( # pylint: disable=too-many-locals
# noqa: C901
if user and user.email:
effective_username = user.email.split("@")[0]
- oauth2_config = self.get_oauth2_config()
- access_token = (
- get_oauth2_access_token(
- oauth2_config,
- self.id,
- g.user.id,
- self.db_engine_spec,
+ upstream_provider = app.config.get(
+ "DATABASE_OAUTH2_UPSTREAM_PROVIDERS", {}
+ ).get(self.db_engine_spec.engine_name)
+ if upstream_provider and hasattr(g, "user") and hasattr(g.user, "id"):
+ access_token = get_upstream_provider_token(upstream_provider,
g.user.id)
+ else:
+ oauth2_config = self.get_oauth2_config()
+ access_token = (
+ get_oauth2_access_token(
+ oauth2_config,
+ self.id,
+ g.user.id,
+ self.db_engine_spec,
+ )
+ if oauth2_config and hasattr(g, "user") and hasattr(g.user,
"id")
+ else None
Review Comment:
**Suggestion:** The code uses `self.db_engine_spec.engine_name` to look up
the upstream OAuth provider, but engine specs expose the engine identifier via
the `engine` attribute, so this will either raise an AttributeError or fail to
match the configured key (e.g. "trino"), causing the upstream token forwarding
configuration to be ignored and potentially breaking database connections that
rely on it. [logic error]
<details>
<summary><b>Severity Level:</b> Critical ๐จ</summary>
```mdx
- โ ๏ธ Upstream login token never forwarded to configured databases.
- โ ๏ธ Trino OAuth2 reuse broken; extra OAuth dance required.
- โ ๏ธ New `DATABASE_OAUTH2_UPSTREAM_PROVIDERS` config effectively ignored.
- โ ๏ธ SQL Lab and charts lose intended single-sign-on behavior.
```
</details>
```suggestion
upstream_provider = app.config.get(
"DATABASE_OAUTH2_UPSTREAM_PROVIDERS", {}
).get(self.db_engine_spec.engine)
if upstream_provider and hasattr(g, "user") and hasattr(g.user,
"id"):
access_token = get_upstream_provider_token(upstream_provider,
g.user.id)
else:
oauth2_config = self.get_oauth2_config()
access_token = (
get_oauth2_access_token(
oauth2_config,
self.id,
g.user.id,
self.db_engine_spec,
)
if oauth2_config and hasattr(g, "user") and hasattr(g.user,
"id")
else None
)
```
<details>
<summary><b>Steps of Reproduction โ
</b></summary>
```mdx
1. Configure an upstream login provider mapping in `superset/config.py`,
using the engine
identifier as documented in the PR description, e.g.
`DATABASE_OAUTH2_UPSTREAM_PROVIDERS =
{"trino": "my_oidc_provider"}`.
2. Ensure there is a Trino database in Superset (row in `dbs` mapped to
`TrinoEngineSpec`), so that `Database.db_engine_spec` is that spec and its
`engine`
attribute is `"trino"` while its `engine_name` is a human-readable name like
`"Trino"`
(defined on the engine spec class in `superset/db_engine_specs/trino.py`).
3. Log in to Superset via the configured OAuth provider so that a login
access token is
stored (the rest of this PR wires that into the new `UserAuthToken` model in
`superset/models/core.py` and the OAuth utilities).
4. Run any query against the Trino database (for example via SQL Lab or a
chart), which
calls `Database.get_sqla_engine()` and then `Database._get_sqla_engine()` in
`superset/models/core.py`. At lines 513โ519, `upstream_provider` is computed
with
`.get(self.db_engine_spec.engine_name)`, so the code looks up `"Trino"` in
`DATABASE_OAUTH2_UPSTREAM_PROVIDERS` instead of `"trino"`, returns `None`,
and falls back
to the per-database OAuth2 flow (`get_oauth2_access_token`) instead of
forwarding the
upstream login token. The upstream-token-forwarding feature therefore never
activates,
even though the configuration key `"trino"` matches
`self.db_engine_spec.engine`.
```
</details>
<details>
<summary><b>Prompt for AI Agent ๐ค </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/models/core.py
**Line:** 513:528
**Comment:**
*Logic Error: The code uses `self.db_engine_spec.engine_name` to look
up the upstream OAuth provider, but engine specs expose the engine identifier
via the `engine` attribute, so this will either raise an AttributeError or fail
to match the configured key (e.g. "trino"), causing the upstream token
forwarding configuration to be ignored and potentially breaking database
connections that rely on it.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38454&comment_hash=140aba91fdc5065b9e5fd9d615a359c004807fe971ccce3e077eb006a9dffa84&reaction=like'>๐</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38454&comment_hash=140aba91fdc5065b9e5fd9d615a359c004807fe971ccce3e077eb006a9dffa84&reaction=dislike'>๐</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]