codeant-ai-for-open-source[bot] commented on code in PR #41421:
URL: https://github.com/apache/superset/pull/41421#discussion_r3476352263


##########
superset/db_engine_specs/databricks.py:
##########
@@ -474,6 +576,74 @@ class 
DatabricksNativeEngineSpec(DatabricksDynamicBaseEngineSpec):
     supports_dynamic_catalog = True
     supports_cross_catalog_queries = True
 
+    # OAuth 2.0 support
+    supports_oauth2 = True
+    oauth2_exception = OAuth2RedirectError
+    oauth2_scope = "sql"
+
+    # OAuth2 endpoints are determined dynamically based on cloud provider
+    oauth2_authorization_request_uri = ""  # Set dynamically
+    oauth2_token_request_uri = ""  # Set dynamically
+
+    @classmethod
+    def get_oauth2_authorization_uri(
+        cls,
+        config: "OAuth2ClientConfig",
+        state: "OAuth2State",
+        code_verifier: str | None = None,
+    ) -> str:
+        """
+        Return URI for initial OAuth2 request with dynamic endpoint detection.
+
+        A fully-resolved `authorization_request_uri` from 
`DATABASE_OAUTH2_CLIENTS`
+        is preserved; only fall back to the auto-detected, account-resolved
+        endpoint when none is configured.
+        """
+        if not config.get("authorization_request_uri"):
+            from superset import db
+            from superset.models.core import Database
+
+            # Get the database to detect cloud provider
+            database_id = state["database_id"]
+            if database := db.session.get(Database, database_id):
+                provider = cls._detect_cloud_provider(database)
+                from typing import cast
+
+                config = cast(
+                    "OAuth2ClientConfig",
+                    dict(config)
+                    | {
+                        "authorization_request_uri": 
cls._resolve_oauth2_endpoint(
+                            database, provider, "authorization_request_uri"
+                        )
+                    },
+                )
+
+        return super().get_oauth2_authorization_uri(config, state, 
code_verifier)
+
+    @classmethod
+    def get_oauth2_token(
+        cls,
+        config: "OAuth2ClientConfig",
+        code: str,
+        code_verifier: str | None = None,
+    ) -> "OAuth2TokenResponse":
+        """
+        Exchange authorization code for refresh/access tokens.
+
+        The token request URI is resolved when the OAuth2 config is built (see
+        `get_oauth2_config`) and already targets the correct cloud provider and
+        account. There is no database context here to auto-detect it, so fail
+        fast rather than POST to an unresolved endpoint when it is missing.
+        """
+        if not config.get("token_request_uri"):
+            raise OAuth2Error(
+                "Databricks OAuth2 token endpoint is not configured: provide a 
"
+                "fully-resolved `token_request_uri` in 
DATABASE_OAUTH2_CLIENTS."
+            )
+
+        return super().get_oauth2_token(config, code, code_verifier)

Review Comment:
   **Suggestion:** This fails OAuth2 code exchange for the auto-detected flow: 
`DATABASE_OAUTH2_CLIENTS` can omit `token_request_uri`, but this method now 
hard-fails when it is missing. During callback, `OAuth2StoreTokenCommand` 
reloads config via `database.get_oauth2_config()`, which still has an empty 
token URI for Databricks, so token exchange always raises `OAuth2Error` instead 
of using the detected provider/account endpoint. Resolve `token_request_uri` 
dynamically (from database `cloud_provider` + `account_id`/`tenant_id`) before 
exchange, or carry the resolved token endpoint through the OAuth state. 
[incomplete implementation]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Databricks (legacy) OAuth2 logins fail with default config.
   - ❌ Tokens never stored by `OAuth2StoreTokenCommand.run`.
   - ❌ Queries against OAuth-enabled Databricks databases cannot run.
   - ⚠️ Behavior contradicts docs promising auto-detected token endpoint.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure global Databricks OAuth2 as documented at
   
`docs/user_docs_versioned_docs/version-6.0.0/configuration/databases.mdx:5-23`, 
defining
   `DATABASE_OAUTH2_CLIENTS` entries for `"Databricks (legacy)"` with `id`, 
`secret`, and
   `scope` but **no** `token_request_uri` (docs explicitly state endpoints are
   auto-detected).
   
   2. Create a Databricks database using the `"Databricks (legacy)"` engine so 
that its
   `db_engine_spec` is `DatabricksNativeEngineSpec`
   (`superset/db_engine_specs/databricks.py:546-553`), and set `ENGINE 
PARAMETERS` JSON to
   include `cloud_provider` and `account_id`/`tenant_id` as described in the 
docs
   (`databases.mdx:45-63`).
   
   3. Trigger the OAuth2 flow by running a query that requires OAuth for this 
database;
   `Database.start_oauth2_dance()` (`superset/models/core.py:27-35`) calls
   `BaseEngineSpec.start_oauth2_dance()` 
(`superset/db_engine_specs/base.py:11-83`), which
   obtains `oauth2_config = database.get_oauth2_config()` (`base.py:73-75). For 
global config
   without `token_request_uri`, `BaseEngineSpec.get_oauth2_config()` 
(`base.py:86-115`) sets
   `config["token_request_uri"]` to `cls.oauth2_token_request_uri`, which for
   `DatabricksNativeEngineSpec` is the empty string
   (`superset/db_engine_specs/databricks.py:25-27`).
   
   4. After the user authorizes with Databricks and the provider redirects back 
to Superset
   at `/api/v1/database/oauth2/` (`superset/databases/api.py:40-88`),
   `OAuth2StoreTokenCommand.run()` 
(`superset/commands/database/oauth2.py:7-58`) reloads
   `oauth2_config = self._database.get_oauth2_config()` (line 13-15, still with 
empty
   `token_request_uri`) and calls 
`self._database.db_engine_spec.get_oauth2_token(...)` (line
   35-39). For the legacy engine this resolves to
   `DatabricksNativeEngineSpec.get_oauth2_token()`
   (`superset/db_engine_specs/databricks.py:65-86`), which executes `if not
   config.get("token_request_uri"):` at line 639 and raises `OAuth2Error(...)` 
at lines
   640-643, aborting the token exchange instead of using any auto-detected 
provider/account
   endpoint. No tokens are stored and the OAuth2 login flow cannot complete for 
the
   documented default configuration.
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=88f0674e117542b098554f99455905df&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=88f0674e117542b098554f99455905df&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/databricks.py
   **Line:** 639:645
   **Comment:**
        *Incomplete Implementation: This fails OAuth2 code exchange for the 
auto-detected flow: `DATABASE_OAUTH2_CLIENTS` can omit `token_request_uri`, but 
this method now hard-fails when it is missing. During callback, 
`OAuth2StoreTokenCommand` reloads config via `database.get_oauth2_config()`, 
which still has an empty token URI for Databricks, so token exchange always 
raises `OAuth2Error` instead of using the detected provider/account endpoint. 
Resolve `token_request_uri` dynamically (from database `cloud_provider` + 
`account_id`/`tenant_id`) before exchange, or carry the resolved token endpoint 
through the OAuth state.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=77822a7bbd119e9a581b5274ce031f4a20696399f177886590b529f48f524632&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=77822a7bbd119e9a581b5274ce031f4a20696399f177886590b529f48f524632&reaction=dislike'>👎</a>



##########
superset/db_engine_specs/databricks.py:
##########
@@ -685,6 +855,74 @@ class 
DatabricksPythonConnectorEngineSpec(DatabricksDynamicBaseEngineSpec):
 
     supports_dynamic_schema = supports_catalog = supports_dynamic_catalog = 
True
 
+    # OAuth 2.0 support
+    supports_oauth2 = True
+    oauth2_exception = OAuth2RedirectError
+    oauth2_scope = "sql"
+
+    # OAuth2 endpoints are determined dynamically based on cloud provider
+    oauth2_authorization_request_uri = ""  # Set dynamically
+    oauth2_token_request_uri = ""  # Set dynamically
+
+    @classmethod
+    def get_oauth2_authorization_uri(
+        cls,
+        config: "OAuth2ClientConfig",
+        state: "OAuth2State",
+        code_verifier: str | None = None,
+    ) -> str:
+        """
+        Return URI for initial OAuth2 request with dynamic endpoint detection.
+
+        A fully-resolved `authorization_request_uri` from 
`DATABASE_OAUTH2_CLIENTS`
+        is preserved; only fall back to the auto-detected, account-resolved
+        endpoint when none is configured.
+        """
+        if not config.get("authorization_request_uri"):
+            from superset import db
+            from superset.models.core import Database
+
+            # Get the database to detect cloud provider
+            database_id = state["database_id"]
+            if database := db.session.get(Database, database_id):
+                provider = cls._detect_cloud_provider(database)
+                from typing import cast
+
+                config = cast(
+                    "OAuth2ClientConfig",
+                    dict(config)
+                    | {
+                        "authorization_request_uri": 
cls._resolve_oauth2_endpoint(
+                            database, provider, "authorization_request_uri"
+                        )
+                    },
+                )
+
+        return super().get_oauth2_authorization_uri(config, state, 
code_verifier)
+
+    @classmethod
+    def get_oauth2_token(
+        cls,
+        config: "OAuth2ClientConfig",
+        code: str,
+        code_verifier: str | None = None,
+    ) -> "OAuth2TokenResponse":
+        """
+        Exchange authorization code for refresh/access tokens.
+
+        The token request URI is resolved when the OAuth2 config is built (see
+        `get_oauth2_config`) and already targets the correct cloud provider and
+        account. There is no database context here to auto-detect it, so fail
+        fast rather than POST to an unresolved endpoint when it is missing.
+        """
+        if not config.get("token_request_uri"):
+            raise OAuth2Error(
+                "Databricks OAuth2 token endpoint is not configured: provide a 
"
+                "fully-resolved `token_request_uri` in 
DATABASE_OAUTH2_CLIENTS."
+            )
+
+        return super().get_oauth2_token(config, code, code_verifier)

Review Comment:
   **Suggestion:** The same token-exchange break exists in the Python connector 
spec: when admins rely on auto-detection (no explicit `token_request_uri`), 
callback token exchange always errors out because this method requires a fully 
resolved URI but does not resolve one. Implement the same dynamic endpoint 
resolution used for authorization (or pass the resolved token URI from the 
initial step) so OAuth2 login can complete. [incomplete implementation]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Databricks Python connector OAuth2 flow fails by default.
   - ❌ Personal tokens for Databricks cannot be stored for users.
   - ❌ OAuth2-based per-user Databricks queries cannot complete.
   - ⚠️ Docs promise auto-detected token endpoint, code does not.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure global Databricks OAuth2 as per
   
`docs/user_docs_versioned_docs/version-6.0.0/configuration/databases.mdx:5-23`, 
defining
   `DATABASE_OAUTH2_CLIENTS["Databricks"]` with `id`, `secret`, and `scope` but 
**no**
   `token_request_uri`, relying on the documented auto-detection (docs lines 
39-69 state
   endpoints are auto-detected unless explicitly overridden).
   
   2. Create a Databricks database using the `"Databricks"` engine so that 
`db_engine_spec`
   is `DatabricksPythonConnectorEngineSpec`
   (`superset/db_engine_specs/databricks.py:758-763`), configure its hostname 
and `ENGINE
   PARAMETERS` (`cloud_provider` + `account_id`/`tenant_id`) as suggested in 
the docs
   (`databases.mdx:45-63`).
   
   3. Run a query that requires OAuth2; `Database.start_oauth2_dance()`
   (`superset/models/core.py:27-35`) calls `BaseEngineSpec.start_oauth2_dance()`
   (`superset/db_engine_specs/base.py:11-83`), which uses 
`database.get_oauth2_config()`
   (`models/core.py:15-25`). With only global config, 
`BaseEngineSpec.get_oauth2_config()`
   (`base.py:86-115`) sets `config["token_request_uri"]` from
   `db_engine_spec_config.get("token_request_uri", 
cls.oauth2_token_request_uri)`. For
   `DatabricksPythonConnectorEngineSpec`, `cls.oauth2_token_request_uri` is 
defined as an
   empty string (`superset/db_engine_specs/databricks.py:25-27` in its OAuth2 
section), so
   `config["token_request_uri"]` is `""`. The authorization URL succeeds because
   `DatabricksPythonConnectorEngineSpec.get_oauth2_authorization_uri()` 
(`databricks.py:24-23
   in the later hunk) dynamically resolves only the `authorization_request_uri`.
   
   4. After Databricks redirects back to Superset at `/api/v1/database/oauth2/`
   (`superset/databases/api.py:40-88`), `OAuth2StoreTokenCommand.run()`
   (`superset/commands/database/oauth2.py:7-58`) reloads `oauth2_config =
   self._database.get_oauth2_config()` (line 13-15, still with empty 
`token_request_uri`) and
   then calls `self._database.db_engine_spec.get_oauth2_token(...)` (line 
35-39). For the
   Python connector this resolves to 
`DatabricksPythonConnectorEngineSpec.get_oauth2_token()`
   (`superset/db_engine_specs/databricks.py:25-45 in the second hunk), which 
performs `if not
   config.get("token_request_uri"):` at line 918 and raises `OAuth2Error(...)` 
at lines
   919-921. Because `_resolve_oauth2_endpoint()` is never used for 
`token_request_uri`, the
   token endpoint is never auto-detected and the OAuth2 exchange for the Python 
connector
   always fails under the documented default configuration.
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=c8551f966e7048b2a0c844b4a5ec00e8&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=c8551f966e7048b2a0c844b4a5ec00e8&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/databricks.py
   **Line:** 918:924
   **Comment:**
        *Incomplete Implementation: The same token-exchange break exists in the 
Python connector spec: when admins rely on auto-detection (no explicit 
`token_request_uri`), callback token exchange always errors out because this 
method requires a fully resolved URI but does not resolve one. Implement the 
same dynamic endpoint resolution used for authorization (or pass the resolved 
token URI from the initial step) so OAuth2 login can complete.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=214943266bd026f814ffe42f26db87a987d60b312cc0d88c7236e72ffb622e44&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41421&comment_hash=214943266bd026f814ffe42f26db87a987d60b312cc0d88c7236e72ffb622e44&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to