codeant-ai-for-open-source[bot] commented on code in PR #38469:
URL: https://github.com/apache/superset/pull/38469#discussion_r3014804280
##########
superset/models/core.py:
##########
@@ -1360,6 +1355,29 @@ class DatabaseUserOAuth2Tokens(Model,
AuditMixinNullable):
refresh_token = Column(encrypted_field_factory.create(Text), nullable=True)
+class UpstreamOAuthToken(Model, AuditMixinNullable):
+ """
+ Store OAuth tokens from the Superset login provider, for forwarding to
databases.
+ """
+
+ __tablename__ = "upstream_oauth_tokens"
+ __table_args__ = (
+ sqla.Index("idx_upstream_oauth_tokens_user_provider", "user_id",
"provider"),
Review Comment:
**Suggestion:** The new token table lacks a uniqueness constraint on
`(user_id, provider)` even though reads use `.one_or_none()` in token lookup
paths. Concurrent login/token-save operations can insert duplicates, causing
`MultipleResultsFound` at query time and breaking token retrieval; enforce
uniqueness at the schema level. [race condition]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ SQL Lab queries fail for affected upstream-OAuth databases.
- ❌ Dashboards using those databases can error for impacted users.
- ⚠️ Intermittent, hard-to-reproduce failures under concurrent logins.
```
</details>
```suggestion
UniqueConstraint("user_id", "provider"),
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure upstream OAuth forwarding as described in the PR
(`DATABASE_OAUTH2_UPSTREAM_PROVIDERS` mapping) and log in via OAuth so that
`auth_user_oauth()` persists a row in `UpstreamOAuthToken` (implementation in
`superset/utils/oauth2.py`, where it queries `UpstreamOAuthToken` by
`user_id` and
`provider` using `.one_or_none()` before insert/update).
2. Trigger two concurrent OAuth login flows for the same Superset user and
provider (e.g.
two browsers or tabs hitting the login callback simultaneously), causing two
`auth_user_oauth()` executions to hit the `UpstreamOAuthToken` lookup code
path at roughly
the same time.
3. In each concurrent request, the `.one_or_none()` query on
`UpstreamOAuthToken`
(user_id=X, provider=Y) returns `None` because neither transaction has
committed yet; both
code paths then proceed to create and `db.session.add()` a new
`UpstreamOAuthToken` row
without any database-level uniqueness constraint on `(user_id, provider)`,
resulting in
duplicate rows after commit.
4. Later, when the same user issues a query against an upstream‑OAuth
database,
`_get_sqla_engine()` in `superset/models/core.py` (around lines 511–517)
calls
`get_access_token_for_database(self, g.user.id)` in
`superset/utils/oauth2.py`, which
again does a `.one_or_none()` on `UpstreamOAuthToken` filtered by `(user_id,
provider)`;
with duplicates present, SQLAlchemy raises `MultipleResultsFound`, causing
the engine
creation and the user's query (e.g. via SQL Lab or a dashboard) to fail with
a 500 error.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/models/core.py
**Line:** 1365:1365
**Comment:**
*Race Condition: The new token table lacks a uniqueness constraint on
`(user_id, provider)` even though reads use `.one_or_none()` in token lookup
paths. Concurrent login/token-save operations can insert duplicates, causing
`MultipleResultsFound` at query time and breaking token retrieval; enforce
uniqueness at the schema level.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38469&comment_hash=8021bff751097983619de52d8f3b3c003bc615267bda9e4365547f81bc837b49&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38469&comment_hash=8021bff751097983619de52d8f3b3c003bc615267bda9e4365547f81bc837b49&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]