codeant-ai-for-open-source[bot] commented on code in PR #41421:
URL: https://github.com/apache/superset/pull/41421#discussion_r3473313906
##########
superset/db_engine_specs/databricks.py:
##########
@@ -474,6 +545,76 @@ class
DatabricksNativeEngineSpec(DatabricksDynamicBaseEngineSpec):
supports_dynamic_catalog = True
supports_cross_catalog_queries = True
+ # OAuth 2.0 support
+ supports_oauth2 = True
+ oauth2_exception = OAuth2RedirectError
+ oauth2_scope = "sql"
+
+ # OAuth2 endpoints are determined dynamically based on cloud provider
+ oauth2_authorization_request_uri = "" # Set dynamically
+ oauth2_token_request_uri = "" # Set dynamically
+
+ @classmethod
+ def get_oauth2_authorization_uri(
+ cls,
+ config: "OAuth2ClientConfig",
+ state: "OAuth2State",
+ code_verifier: str | None = None,
+ ) -> str:
+ """
+ Return URI for initial OAuth2 request with dynamic endpoint detection.
+ """
+ from superset import db
+ from superset.models.core import Database
+
+ # Get the database to detect cloud provider
+ database_id = state["database_id"]
+ if database := db.session.get(Database, database_id):
+ provider = cls._detect_cloud_provider(database)
+ # Update config with the correct authorization URI for the cloud
provider
+ from typing import cast
+
+ config = cast(
+ "OAuth2ClientConfig",
+ dict(config)
+ | {
+ "authorization_request_uri":
cls._oauth2_endpoints[provider][
+ "authorization_request_uri"
+ ]
+ },
+ )
+
+ return super().get_oauth2_authorization_uri(config, state,
code_verifier)
+
+ @classmethod
+ def get_oauth2_token(
+ cls,
+ config: "OAuth2ClientConfig",
+ code: str,
+ code_verifier: str | None = None,
+ ) -> "OAuth2TokenResponse":
+ """
+ Exchange authorization code for refresh/access tokens with dynamic
endpoint.
+
+ The token request URI is resolved when the OAuth2 config is built (see
+ `get_oauth2_config`), so it already targets the correct cloud provider
and
+ account. Only fall back to the AWS endpoint when no URI is configured.
+ """
+ from typing import cast
+
+ if not config.get("token_request_uri"):
+ config = cast(
+ "OAuth2ClientConfig",
+ dict(config)
+ | {
+ "token_request_uri": cls._oauth2_endpoints["aws"][
+ "token_request_uri"
+ ]
+ },
+ )
Review Comment:
**Suggestion:** The fallback token URI uses the AWS template with an
unsubstituted `{}` placeholder when `token_request_uri` is missing, so token
exchange can POST to an invalid URL (`.../accounts/{}/v1/token`). The fallback
must be a fully resolved URL, or the code should fail fast when required
account/tenant identifiers are unavailable. [api mismatch]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Token exchange may POST to `.../accounts/{}/v1/token`.
- ⚠️ Misconfigured OAuth2 client silently fails instead of clear error.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a Databricks legacy database with OAuth2 so that
`Database.get_oauth2_config()` (superset/models/core.py:1333-1352) returns a
configuration
that either omits `"token_request_uri"` or sets it to a falsey value, while
still enabling
OAuth2 for the engine (`supports_oauth2 = True` at
`superset/db_engine_specs/databricks.py:548-552`).
2. Complete the OAuth authorization step in the browser so that the provider
redirects
back to Superset, which runs the `DatabaseOAuth2Command` and calls
`self._database.get_oauth2_config()` followed by
`self._database.db_engine_spec.get_oauth2_token(oauth2_config, code,
code_verifier)` at
`superset/commands/database/oauth2.py:13-39`.
3. In `DatabricksNativeEngineSpec.get_oauth2_token()`
(superset/db_engine_specs/databricks.py:589-616), because
`config.get("token_request_uri")` is missing/falsey, the branch at 605-614
executes and
rewrites `config` using `config = cast(...` so that
`config["token_request_uri"]` is set
to `cls._oauth2_endpoints["aws"]["token_request_uri"]` defined at 288-291 as
`"https://accounts.cloud.databricks.com/oidc/accounts/{}/v1/token"`, which
still contains
the `{}` placeholder.
4. The method then calls `super().get_oauth2_token(config, code,
code_verifier)` which
executes `BaseEngineSpec.get_oauth2_token()`
(superset/db_engine_specs/base.py:787-821),
reading `uri = config["token_request_uri"]` and posting the token request
with
`requests.post(uri, ...)` to
`https://accounts.cloud.databricks.com/oidc/accounts/{}/v1/token`, causing
the OAuth token
exchange to fail against an invalid URL even though the rest of the flow is
correct.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=2eb23d56248a425e89c0ba4b2beb16ff&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=2eb23d56248a425e89c0ba4b2beb16ff&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/db_engine_specs/databricks.py
**Line:** 605:614
**Comment:**
*Api Mismatch: The fallback token URI uses the AWS template with an
unsubstituted `{}` placeholder when `token_request_uri` is missing, so token
exchange can POST to an invalid URL (`.../accounts/{}/v1/token`). The fallback
must be a fully resolved URL, or the code should fail fast when required
account/tenant identifiers are unavailable.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=af6047575e3d1705bf1013130d0667fec87b3af3ae651cb8cd444db6fc973587&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=af6047575e3d1705bf1013130d0667fec87b3af3ae651cb8cd444db6fc973587&reaction=dislike'>👎</a>
##########
superset/db_engine_specs/databricks.py:
##########
@@ -474,6 +545,76 @@ class
DatabricksNativeEngineSpec(DatabricksDynamicBaseEngineSpec):
supports_dynamic_catalog = True
supports_cross_catalog_queries = True
+ # OAuth 2.0 support
+ supports_oauth2 = True
+ oauth2_exception = OAuth2RedirectError
+ oauth2_scope = "sql"
Review Comment:
**Suggestion:** OAuth2 is enabled but there is no enforcement that result
caching is per-user for these connections. Since OAuth2 tokens are
user-specific, shared cache entries can leak one user's query results to
another user when `per_user_caching` is not enabled. Force per-user caching for
OAuth2-enabled Databricks connections and prevent disabling it in the UI.
[cache]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Query cache may leak Databricks results between users.
- ⚠️ OAuth2 databases lack enforced per-user cache segregation.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Create a Databricks database using either `DatabricksNativeEngineSpec`
(superset/db_engine_specs/databricks.py:515-547) or
`DatabricksPythonConnectorEngineSpec`
(superset/db_engine_specs/databricks.py:729-827), both of which set
`supports_oauth2 =
True` and `oauth2_scope = "sql"` at lines 548-551 and 829-833, and configure
OAuth2 so
`Database.is_oauth2_enabled()` (superset/models/core.py:1316-1332) returns
True.
2. Leave per-user caching disabled by ensuring the database `extra` does not
contain
`"per_user_caching": true` (field defined at
`superset/databases/schemas.py:895`),
`database.impersonate_user` remains False (models/core.py:212), and feature
flags
`CACHE_IMPERSONATION` / `CACHE_QUERY_BY_USER` are False, which is a
plausible production
configuration.
3. Have User A and User B, with different permissions in Databricks, run the
same chart or
query backed by this database so that the query cache key is built in
`Viz._get_cache_key`
(superset/viz.py:495-514) or `QueryObject.cache_key`
(superset/common/query_object.py:480-490), both of which call
`add_impersonation_cache_key_if_needed(self.datasource.database,
cache_dict)`.
4. In `add_impersonation_cache_key_if_needed`
(superset/utils/cache_keys.py:42-50),
because `database.impersonate_user` is False,
`feature_flag_manager.is_feature_enabled("CACHE_QUERY_BY_USER")` is False,
and
`extra.get("per_user_caching", False)` is False, the condition at 42-49
fails and no
`impersonation_key` is added, so cache entries are keyed only on query shape
and
datasource metadata, not the user.
5. Meanwhile, the Databricks connection for each user is created with a
user-specific
OAuth2 access token via `Database._get_sqla_engine()`
(superset/models/core.py:567-577)
and `DatabricksDynamicBaseEngineSpec.impersonate_user()`
(superset/db_engine_specs/databricks.py:328-349), meaning the same SQL can
legitimately
return different results per user; however, due to the missing impersonation
key, User B
can receive User A's cached results for the same query, exposing data across
users.
6. There is no logic in `databricks.py` or in the database Marshmallow
schemas to
automatically enforce `"per_user_caching": true` or to disable the per-user
caching toggle
in the UI when `supports_oauth2` is True, so this unsafe configuration is
neither
prevented nor automatically corrected for OAuth2-enabled Databricks
connections.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=8cc0444d09c94988b1af51d7db527b03&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=8cc0444d09c94988b1af51d7db527b03&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/db_engine_specs/databricks.py
**Line:** 548:551
**Comment:**
*Cache: OAuth2 is enabled but there is no enforcement that result
caching is per-user for these connections. Since OAuth2 tokens are
user-specific, shared cache entries can leak one user's query results to
another user when `per_user_caching` is not enabled. Force per-user caching for
OAuth2-enabled Databricks connections and prevent disabling it in the UI.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=4f70a60eda1f7c6e31542d76daf0d3d91700794a2f281e3121308a9cbf5338b9&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=4f70a60eda1f7c6e31542d76daf0d3d91700794a2f281e3121308a9cbf5338b9&reaction=dislike'>👎</a>
##########
superset/db_engine_specs/databricks.py:
##########
@@ -685,6 +826,76 @@ class
DatabricksPythonConnectorEngineSpec(DatabricksDynamicBaseEngineSpec):
supports_dynamic_schema = supports_catalog = supports_dynamic_catalog =
True
+ # OAuth 2.0 support
+ supports_oauth2 = True
+ oauth2_exception = OAuth2RedirectError
+ oauth2_scope = "sql"
+
+ # OAuth2 endpoints are determined dynamically based on cloud provider
+ oauth2_authorization_request_uri = "" # Set dynamically
+ oauth2_token_request_uri = "" # Set dynamically
+
+ @classmethod
+ def get_oauth2_authorization_uri(
+ cls,
+ config: "OAuth2ClientConfig",
+ state: "OAuth2State",
+ code_verifier: str | None = None,
+ ) -> str:
+ """
+ Return URI for initial OAuth2 request with dynamic endpoint detection.
+ """
+ from superset import db
+ from superset.models.core import Database
+
+ # Get the database to detect cloud provider
+ database_id = state["database_id"]
+ if database := db.session.get(Database, database_id):
+ provider = cls._detect_cloud_provider(database)
+ # Update config with the correct authorization URI for the cloud
provider
+ from typing import cast
+
+ config = cast(
+ "OAuth2ClientConfig",
+ dict(config)
+ | {
+ "authorization_request_uri":
cls._oauth2_endpoints[provider][
+ "authorization_request_uri"
+ ]
+ },
+ )
Review Comment:
**Suggestion:** The Python connector implementation has the same endpoint
override bug: it replaces the configured authorization URI with a template
containing `{}` instead of a concrete account/tenant value. This breaks OAuth
login for providers that require a resolved path segment and ignores explicit
URIs configured by administrators. [api mismatch]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Databricks Python connector OAuth authorize URL keeps {} literal.
- ❌ OAuth login fails for Databricks SQL Python connector.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a Databricks database to use the Python connector engine
(`DatabricksPythonConnectorEngineSpec` at
`superset/db_engine_specs/databricks.py:729-827`) and provide OAuth2 client
settings in
`encrypted_extra["oauth2_client_info"]` so `Database.get_oauth2_config()`
(superset/models/core.py:1333-1352) returns a config with a fully-resolved
`authorization_request_uri`.
2. Execute a chart or SQL Lab query against this database, causing an
OAuth-related driver
error that leads `_handle_oauth2_error()`
(superset/models/core.py:1364-1372) to trigger
`Database.start_oauth2_dance()` (superset/models/core.py:1354-1362).
3. `Database.start_oauth2_dance()` calls
`BaseEngineSpec.start_oauth2_dance()`
(superset/db_engine_specs/base.py:651-722), which builds the OAuth2 `state`
and invokes
`DatabricksPythonConnectorEngineSpec.get_oauth2_authorization_uri(oauth2_config,
state,
code_verifier)` (superset/db_engine_specs/databricks.py:838-868).
4. In `DatabricksPythonConnectorEngineSpec.get_oauth2_authorization_uri`,
the code at
852-865 computes `provider = cls._detect_cloud_provider(database)`
(superset/db_engine_specs/databricks.py:302-326) and then replaces
`config["authorization_request_uri"]` via the `config = cast(...` block at
lines 858-865
with `cls._oauth2_endpoints[provider]["authorization_request_uri"]`, one of
the templates
defined at 286-300 such as
`https://login.microsoftonline.com/{}/oauth2/v2.0/authorize` or
`https://accounts.gcp.databricks.com/oidc/accounts/{}/v1/authorize`, leaving
the `{}`
placeholder unformatted.
5. `super().get_oauth2_authorization_uri(config, state, code_verifier)` then
delegates to
`BaseEngineSpec.get_oauth2_authorization_uri()`
(superset/db_engine_specs/base.py:757-785), which uses the unformatted
`authorization_request_uri` directly, causing the OAuth redirect to target
an invalid
endpoint like `https://login.microsoftonline.com/{}/oauth2/v2.0/authorize`
and breaking
OAuth login for the Python connector.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=7c4149f95a3c484da6bf59753c962cce&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=7c4149f95a3c484da6bf59753c962cce&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/db_engine_specs/databricks.py
**Line:** 858:866
**Comment:**
*Api Mismatch: The Python connector implementation has the same
endpoint override bug: it replaces the configured authorization URI with a
template containing `{}` instead of a concrete account/tenant value. This
breaks OAuth login for providers that require a resolved path segment and
ignores explicit URIs configured by administrators.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=24a653fbae8e6ce1dfce57b6bbf7207c893e9d734090070311ad48fb8d8a6604&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=24a653fbae8e6ce1dfce57b6bbf7207c893e9d734090070311ad48fb8d8a6604&reaction=dislike'>👎</a>
##########
superset/db_engine_specs/databricks.py:
##########
@@ -685,6 +826,76 @@ class
DatabricksPythonConnectorEngineSpec(DatabricksDynamicBaseEngineSpec):
supports_dynamic_schema = supports_catalog = supports_dynamic_catalog =
True
+ # OAuth 2.0 support
+ supports_oauth2 = True
+ oauth2_exception = OAuth2RedirectError
+ oauth2_scope = "sql"
+
+ # OAuth2 endpoints are determined dynamically based on cloud provider
+ oauth2_authorization_request_uri = "" # Set dynamically
+ oauth2_token_request_uri = "" # Set dynamically
+
+ @classmethod
+ def get_oauth2_authorization_uri(
+ cls,
+ config: "OAuth2ClientConfig",
+ state: "OAuth2State",
+ code_verifier: str | None = None,
+ ) -> str:
+ """
+ Return URI for initial OAuth2 request with dynamic endpoint detection.
+ """
+ from superset import db
+ from superset.models.core import Database
+
+ # Get the database to detect cloud provider
+ database_id = state["database_id"]
+ if database := db.session.get(Database, database_id):
+ provider = cls._detect_cloud_provider(database)
+ # Update config with the correct authorization URI for the cloud
provider
+ from typing import cast
+
+ config = cast(
+ "OAuth2ClientConfig",
+ dict(config)
+ | {
+ "authorization_request_uri":
cls._oauth2_endpoints[provider][
+ "authorization_request_uri"
+ ]
+ },
+ )
+
+ return super().get_oauth2_authorization_uri(config, state,
code_verifier)
+
+ @classmethod
+ def get_oauth2_token(
+ cls,
+ config: "OAuth2ClientConfig",
+ code: str,
+ code_verifier: str | None = None,
+ ) -> "OAuth2TokenResponse":
+ """
+ Exchange authorization code for refresh/access tokens with dynamic
endpoint.
+
+ The token request URI is resolved when the OAuth2 config is built (see
+ `get_oauth2_config`), so it already targets the correct cloud provider
and
+ account. Only fall back to the AWS endpoint when no URI is configured.
+ """
+ from typing import cast
+
+ if not config.get("token_request_uri"):
+ config = cast(
+ "OAuth2ClientConfig",
+ dict(config)
+ | {
+ "token_request_uri": cls._oauth2_endpoints["aws"][
+ "token_request_uri"
+ ]
+ },
+ )
Review Comment:
**Suggestion:** The Python connector token fallback has the same
unresolved-template issue and can send token requests to `.../{}/...` instead
of a real endpoint. Ensure fallback URIs are concrete and
provider/account-aware before calling `super().get_oauth2_token(...)`. [api
mismatch]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Python connector token fallback uses unresolved AWS template URL.
- ⚠️ Databricks OAuth2 setup without token URI breaks login.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a Databricks database to use
`DatabricksPythonConnectorEngineSpec`
(superset/db_engine_specs/databricks.py:729-827) with OAuth2 enabled but
provide an OAuth2
client configuration that lacks a valid `"token_request_uri"` so that
`Database.get_oauth2_config()` (superset/models/core.py:1333-1352) returns a
config where
`config.get("token_request_uri")` is falsey.
2. After the user authorizes in the browser and is redirected back, the
OAuth2 callback
handler `DatabaseOAuth2Command.run()` calls
`self._database.db_engine_spec.get_oauth2_token(oauth2_config, code,
code_verifier)` at
`superset/commands/database/oauth2.py:35-39`.
3. In `DatabricksPythonConnectorEngineSpec.get_oauth2_token()`
(superset/db_engine_specs/databricks.py:871-897), because
`config.get("token_request_uri")` is falsey, the code at 886-895 executes
and mutates
`config` using `config = cast(...` so that `"token_request_uri"` is set from
`cls._oauth2_endpoints["aws"]["token_request_uri"]`
(superset/db_engine_specs/databricks.py:288-291), which is the unresolved
template
`"https://accounts.cloud.databricks.com/oidc/accounts/{}/v1/token"`.
4. The method then calls `BaseEngineSpec.get_oauth2_token(config, code,
code_verifier)`
(superset/db_engine_specs/base.py:787-821), which POSTs to `uri =
config["token_request_uri"]` without any further formatting, so the token
exchange goes to
`.../accounts/{}/v1/token` and fails, breaking OAuth2 login or refresh for
the Python
connector when the token URI is not explicitly configured.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=c25aa502ab63406990f333f3d5df2313&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=c25aa502ab63406990f333f3d5df2313&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/db_engine_specs/databricks.py
**Line:** 886:895
**Comment:**
*Api Mismatch: The Python connector token fallback has the same
unresolved-template issue and can send token requests to `.../{}/...` instead
of a real endpoint. Ensure fallback URIs are concrete and
provider/account-aware before calling `super().get_oauth2_token(...)`.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=9ba879abbfa5d7d325cdbfe18d1584dbc16877e6e990c68df4522caf27d36fcb&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=9ba879abbfa5d7d325cdbfe18d1584dbc16877e6e990c68df4522caf27d36fcb&reaction=dislike'>👎</a>
##########
superset/db_engine_specs/databricks.py:
##########
@@ -474,6 +545,76 @@ class
DatabricksNativeEngineSpec(DatabricksDynamicBaseEngineSpec):
supports_dynamic_catalog = True
supports_cross_catalog_queries = True
+ # OAuth 2.0 support
+ supports_oauth2 = True
+ oauth2_exception = OAuth2RedirectError
+ oauth2_scope = "sql"
+
+ # OAuth2 endpoints are determined dynamically based on cloud provider
+ oauth2_authorization_request_uri = "" # Set dynamically
+ oauth2_token_request_uri = "" # Set dynamically
+
+ @classmethod
+ def get_oauth2_authorization_uri(
+ cls,
+ config: "OAuth2ClientConfig",
+ state: "OAuth2State",
+ code_verifier: str | None = None,
+ ) -> str:
+ """
+ Return URI for initial OAuth2 request with dynamic endpoint detection.
+ """
+ from superset import db
+ from superset.models.core import Database
+
+ # Get the database to detect cloud provider
+ database_id = state["database_id"]
+ if database := db.session.get(Database, database_id):
+ provider = cls._detect_cloud_provider(database)
+ # Update config with the correct authorization URI for the cloud
provider
+ from typing import cast
+
+ config = cast(
+ "OAuth2ClientConfig",
+ dict(config)
+ | {
+ "authorization_request_uri":
cls._oauth2_endpoints[provider][
+ "authorization_request_uri"
+ ]
+ },
+ )
Review Comment:
**Suggestion:** `get_oauth2_authorization_uri` overwrites the configured
`authorization_request_uri` with a provider template that still contains an
unsubstituted `{}` placeholder. This causes OAuth redirects to invalid
authorize endpoints (for example `.../accounts/{}/v1/authorize`) and also
discards any admin-provided fully resolved URI. Keep the already-resolved URI
from config, or format the template with the required account/tenant identifier
before calling `super()`. [api mismatch]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Databricks legacy OAuth redirect URL includes unresolved {} placeholder.
- ❌ Users cannot complete OAuth login for Databricks legacy connector.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a Databricks database using the legacy connector
(`DatabricksNativeEngineSpec` at
`superset/db_engine_specs/databricks.py:515-547`) and add
an OAuth2 client under `encrypted_extra["oauth2_client_info"]` so
`Database.get_oauth2_config()` (superset/models/core.py:1333-1352) returns a
config with a
fully-resolved `authorization_request_uri`.
2. Trigger an OAuth2 flow by running a query against this database so that an
OAuth-related error leads `Database._handle_oauth2_error()`
(superset/models/core.py:1364-1372) to call `Database.start_oauth2_dance()`
(superset/models/core.py:1354-1362).
3. `Database.start_oauth2_dance()` delegates to
`BaseEngineSpec.start_oauth2_dance()`
(superset/db_engine_specs/base.py:651-722), which builds a JWT `state`
including
`database_id` and calls
`DatabricksNativeEngineSpec.get_oauth2_authorization_uri(oauth2_config,
state,
code_verifier)` (superset/db_engine_specs/databricks.py:557-587).
4. Inside `DatabricksNativeEngineSpec.get_oauth2_authorization_uri`, the
code at lines
570-585 detects the cloud via `_detect_cloud_provider()`
(superset/db_engine_specs/databricks.py:302-326) and then overwrites
`config["authorization_request_uri"]` using `config = cast(...` at 577-585
with
`cls._oauth2_endpoints[provider]["authorization_request_uri"]` (templates
with `{}`
defined at 286-300), discarding any admin-provided fully-resolved URI.
5. `super().get_oauth2_authorization_uri(config, state, code_verifier)` then
runs
`BaseEngineSpec.get_oauth2_authorization_uri()`
(superset/db_engine_specs/base.py:757-785), which uses `uri =
config["authorization_request_uri"]` verbatim and returns `urljoin(uri, "?" +
urlencode(params))`, so the frontend receives an `OAuth2RedirectError`
(base.py:722)
pointing to an invalid URL like
`https://accounts.cloud.databricks.com/oidc/accounts/{}/v1/authorize`,
breaking the OAuth
redirect.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=2e950a17ea74414cbf7a41f6d9aafa1f&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=2e950a17ea74414cbf7a41f6d9aafa1f&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/db_engine_specs/databricks.py
**Line:** 577:585
**Comment:**
*Api Mismatch: `get_oauth2_authorization_uri` overwrites the configured
`authorization_request_uri` with a provider template that still contains an
unsubstituted `{}` placeholder. This causes OAuth redirects to invalid
authorize endpoints (for example `.../accounts/{}/v1/authorize`) and also
discards any admin-provided fully resolved URI. Keep the already-resolved URI
from config, or format the template with the required account/tenant identifier
before calling `super()`.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=2336ac80e1345191d2120c7dd7e316e8c8a780a79c0fe9d7ba009fc8c4c39838&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=2336ac80e1345191d2120c7dd7e316e8c8a780a79c0fe9d7ba009fc8c4c39838&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]