codeant-ai-for-open-source[bot] commented on code in PR #38454:
URL: https://github.com/apache/superset/pull/38454#discussion_r2892206607


##########
superset/utils/oauth2.py:
##########
@@ -176,6 +176,171 @@ def refresh_oauth2_token(
     return token.access_token
 
 
+def save_user_provider_token(
+    user_id: int,
+    provider: str,
+    token_response: dict[str, Any],
+) -> None:
+    """
+    Upsert a UserAuthToken record for the given user and login provider.
+
+    Called after a successful Superset OAuth login when the provider has
+    ``save_token: True`` set in ``OAUTH_PROVIDERS``.
+    """
+    # pylint: disable=import-outside-toplevel
+    from superset.models.core import UserAuthToken
+
+    token = (
+        db.session.query(UserAuthToken)
+        .filter_by(user_id=user_id, provider=provider)
+        .one_or_none()
+    )
+    if token is None:
+        token = UserAuthToken(user_id=user_id, provider=provider)
+
+    token.access_token = token_response.get("access_token")
+    expires_in = token_response.get("expires_in")
+    if expires_in is not None:
+        token.access_token_expiration = datetime.now() + timedelta(
+            seconds=int(expires_in)
+        )
+    else:
+        token.access_token_expiration = None
+    token.refresh_token = token_response.get("refresh_token")
+
+    db.session.add(token)
+    db.session.commit()
+
+
+def get_upstream_provider_token(provider: str, user_id: int) -> str | None:
+    """
+    Return a valid access token stored for the given login provider and user.
+
+    If the stored token is expired and a refresh token is available, the token
+    is refreshed and the new access token is returned.  Returns ``None`` when 
no
+    token is found or when the token cannot be refreshed.
+    """
+    # pylint: disable=import-outside-toplevel
+    from superset.models.core import UserAuthToken
+
+    token: UserAuthToken | None = (
+        db.session.query(UserAuthToken)
+        .filter_by(user_id=user_id, provider=provider)
+        .one_or_none()
+    )
+    if token is None:
+        return None
+
+    if token.access_token and (
+        token.access_token_expiration is None
+        or datetime.now() < token.access_token_expiration
+    ):
+        return token.access_token
+
+    if token.refresh_token:
+        return _refresh_upstream_provider_token(token, provider)
+
+    # Expired with no refresh token โ€” discard stale record
+    db.session.delete(token)
+    db.session.commit()
+    return None
+
+
+def _refresh_upstream_provider_token(
+    token: "UserAuthToken",
+    provider: str,
+) -> str | None:
+    """
+    Refresh an upstream provider token using the stored refresh token.
+
+    Looks up the provider's token endpoint via Flask-AppBuilder's remote app
+    registry, issues a refresh-grant request, persists the new token, and
+    returns the new access token (or ``None`` if the refresh fails).
+    """
+    import requests  # pylint: disable=import-outside-toplevel
+
+    provider_configs: list[dict[str, Any]] = app.config.get("OAUTH_PROVIDERS", 
[])
+    provider_config = next(
+        (p for p in provider_configs if p.get("name") == provider), None
+    )
+    if not provider_config:
+        logger.warning(
+            "Cannot refresh upstream token: provider '%s' not found in 
OAUTH_PROVIDERS",
+            provider,
+        )
+        return None
+
+    # Retrieve the token endpoint from the registered remote app's server 
metadata
+    from flask_appbuilder import current_app as fab_app  # pylint: 
disable=import-outside-toplevel
+
+    sm = fab_app.appbuilder.sm

Review Comment:
   **Suggestion:** The security manager attribute used to look up the OAuth 
remote app is incorrect, so the code will always fall back to the empty 
default, never find a remote app, and therefore never be able to refresh 
upstream provider tokens, leaving users with permanently expired tokens. [logic 
error]
   
   <details>
   <summary><b>Severity Level:</b> Critical ๐Ÿšจ</summary>
   
   ```mdx
   - โŒ Upstream OAuth tokens never refreshed; expire after first lifetime.
   - โš ๏ธ Databases using upstream tokens fall back or fail auth.
   ```
   </details>
   
   ```suggestion
   remote_app = getattr(sm, "oauth_remotes", {}).get(provider)
   ```
   <details>
   <summary><b>Steps of Reproduction โœ… </b></summary>
   
   ```mdx
   1. Configure an OAuth login provider with token saving enabled in 
`superset/config.py`,
   e.g.:
   
      `OAUTH_PROVIDERS = [{"name": "keycloak", "remote_app": {...}, 
"save_token": True}]`
   
      and map it for a database via `DATABASE_OAUTH2_UPSTREAM_PROVIDERS = 
{"trino":
      "keycloak"}`.
   
   2. Start Superset and log in via this provider; the overridden 
`auth_user_oauth()` in
   
      `superset/security/manager.py` (added in this PR, exact line not shown) 
calls
   
      `save_user_provider_token()` in `superset/utils/oauth2.py:179-213`, 
persisting a
   
      `UserAuthToken` row in `superset/models/core.py:1370-1390` with both 
`access_token`
   
      and `refresh_token`.
   
   3. After the `user_auth_tokens.access_token_expiration` column has passed 
(or after
   
      manually expiring it in the database), issue a query against the mapped 
Trino
   
      database from SQL Lab or a dashboard, which triggers 
`Database._get_sqla_engine()`
   
      in `superset/models/core.py` (exact line not shown) and, because an 
upstream
   
      provider is configured, it calls `get_upstream_provider_token(provider, 
user_id)`
   
      in `superset/utils/oauth2.py:215-246`.
   
   4. Inside `get_upstream_provider_token()`, the expired token with a nonโ€‘null
   
      `refresh_token` causes a call to `_refresh_upstream_provider_token(token, 
provider)`
   
      in `superset/utils/oauth2.py:249-341`, where
   
      `remote_app = getattr(sm, "oauth_remoteapp", {}).get(provider)` executes.
   
      `SupersetSecurityManager` actually exposes `sm.oauth_remotes` (not 
`oauth_remoteapp`),
   
      so `getattr(sm, "oauth_remoteapp", {})` returns `{}`, `remote_app` is 
`None`,
   
      the function logs "remote app not registered" and returns `None`. As a 
result,
   
      `get_upstream_provider_token()` returns `None`, the upstream token is 
never
   
      refreshed, and the database connection must fall back to a second OAuth 
dance
   
      (or fails if no DBโ€‘level OAuth is configured).
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent ๐Ÿค– </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/utils/oauth2.py
   **Line:** 276:276
   **Comment:**
        *Logic Error: The security manager attribute used to look up the OAuth 
remote app is incorrect, so the code will always fall back to the empty 
default, never find a remote app, and therefore never be able to refresh 
upstream provider tokens, leaving users with permanently expired tokens.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38454&comment_hash=1105379d2cd291d05d68aa655a0c10877d2b6c900a41d7fa58fb06cab82009d4&reaction=like'>๐Ÿ‘</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38454&comment_hash=1105379d2cd291d05d68aa655a0c10877d2b6c900a41d7fa58fb06cab82009d4&reaction=dislike'>๐Ÿ‘Ž</a>



##########
superset/models/core.py:
##########
@@ -509,17 +510,23 @@ def _get_sqla_engine(  # pylint: disable=too-many-locals  
# noqa: C901
             if user and user.email:
                 effective_username = user.email.split("@")[0]
 
-        oauth2_config = self.get_oauth2_config()
-        access_token = (
-            get_oauth2_access_token(
-                oauth2_config,
-                self.id,
-                g.user.id,
-                self.db_engine_spec,
+        upstream_provider = app.config.get(
+            "DATABASE_OAUTH2_UPSTREAM_PROVIDERS", {}
+        ).get(self.db_engine_spec.engine_name)
+        if upstream_provider and hasattr(g, "user") and hasattr(g.user, "id"):
+            access_token = get_upstream_provider_token(upstream_provider, 
g.user.id)
+        else:
+            oauth2_config = self.get_oauth2_config()
+            access_token = (
+                get_oauth2_access_token(
+                    oauth2_config,
+                    self.id,
+                    g.user.id,
+                    self.db_engine_spec,
+                )
+                if oauth2_config and hasattr(g, "user") and hasattr(g.user, 
"id")
+                else None

Review Comment:
   **Suggestion:** The code uses `self.db_engine_spec.engine_name` to look up 
the upstream OAuth provider, but engine specs expose the engine identifier via 
the `engine` attribute, so this will either raise an AttributeError or fail to 
match the configured key (e.g. "trino"), causing the upstream token forwarding 
configuration to be ignored and potentially breaking database connections that 
rely on it. [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Critical ๐Ÿšจ</summary>
   
   ```mdx
   - โš ๏ธ Upstream login token never forwarded to configured databases.
   - โš ๏ธ Trino OAuth2 reuse broken; extra OAuth dance required.
   - โš ๏ธ New `DATABASE_OAUTH2_UPSTREAM_PROVIDERS` config effectively ignored.
   - โš ๏ธ SQL Lab and charts lose intended single-sign-on behavior.
   ```
   </details>
   
   ```suggestion
           upstream_provider = app.config.get(
               "DATABASE_OAUTH2_UPSTREAM_PROVIDERS", {}
           ).get(self.db_engine_spec.engine)
           if upstream_provider and hasattr(g, "user") and hasattr(g.user, 
"id"):
               access_token = get_upstream_provider_token(upstream_provider, 
g.user.id)
           else:
               oauth2_config = self.get_oauth2_config()
               access_token = (
                   get_oauth2_access_token(
                       oauth2_config,
                       self.id,
                       g.user.id,
                       self.db_engine_spec,
                   )
                   if oauth2_config and hasattr(g, "user") and hasattr(g.user, 
"id")
                   else None
               )
   ```
   <details>
   <summary><b>Steps of Reproduction โœ… </b></summary>
   
   ```mdx
   1. Configure an upstream login provider mapping in `superset/config.py`, 
using the engine
   identifier as documented in the PR description, e.g. 
`DATABASE_OAUTH2_UPSTREAM_PROVIDERS =
   {"trino": "my_oidc_provider"}`.
   
   2. Ensure there is a Trino database in Superset (row in `dbs` mapped to
   `TrinoEngineSpec`), so that `Database.db_engine_spec` is that spec and its 
`engine`
   attribute is `"trino"` while its `engine_name` is a human-readable name like 
`"Trino"`
   (defined on the engine spec class in `superset/db_engine_specs/trino.py`).
   
   3. Log in to Superset via the configured OAuth provider so that a login 
access token is
   stored (the rest of this PR wires that into the new `UserAuthToken` model in
   `superset/models/core.py` and the OAuth utilities).
   
   4. Run any query against the Trino database (for example via SQL Lab or a 
chart), which
   calls `Database.get_sqla_engine()` and then `Database._get_sqla_engine()` in
   `superset/models/core.py`. At lines 513โ€“519, `upstream_provider` is computed 
with
   `.get(self.db_engine_spec.engine_name)`, so the code looks up `"Trino"` in
   `DATABASE_OAUTH2_UPSTREAM_PROVIDERS` instead of `"trino"`, returns `None`, 
and falls back
   to the per-database OAuth2 flow (`get_oauth2_access_token`) instead of 
forwarding the
   upstream login token. The upstream-token-forwarding feature therefore never 
activates,
   even though the configuration key `"trino"` matches 
`self.db_engine_spec.engine`.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent ๐Ÿค– </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/models/core.py
   **Line:** 513:528
   **Comment:**
        *Logic Error: The code uses `self.db_engine_spec.engine_name` to look 
up the upstream OAuth provider, but engine specs expose the engine identifier 
via the `engine` attribute, so this will either raise an AttributeError or fail 
to match the configured key (e.g. "trino"), causing the upstream token 
forwarding configuration to be ignored and potentially breaking database 
connections that rely on it.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38454&comment_hash=140aba91fdc5065b9e5fd9d615a359c004807fe971ccce3e077eb006a9dffa84&reaction=like'>๐Ÿ‘</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38454&comment_hash=140aba91fdc5065b9e5fd9d615a359c004807fe971ccce3e077eb006a9dffa84&reaction=dislike'>๐Ÿ‘Ž</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to