martimors opened a new issue, #33388: URL: https://github.com/apache/superset/issues/33388
### Bug description

The `[databricks]` extra of the `apache-superset` package installs the deprecated and archived `sqlalchemy-databricks` package, which no longer works with Databricks after the introduction of Unity Catalog and the deprecation (and soon removal) of the Hive metastore. The specific failure is that hidden partitioning columns of Delta tables cannot be ignored when running `DESCRIBE ...` queries.

The new driver, aptly named `databricks-sqlalchemy` for maximum confusion, is a drop-in replacement for the old one, is supported by Databricks, and should work out of the box. However, I have found one issue that was not there with the old driver: the datasource is created successfully, but the webserver never returns and the request times out.

Steps to reproduce:

1. From a stock `apache/superset` Docker image, install the new Databricks connector with SQLAlchemy 1.x support (not the SQLAlchemy 2.x one, because it won't work with Superset):

   ```sh
   pip install databricks-sqlalchemy~=1
   ```

2. In Superset, create a datasource using a SQLAlchemy URI like this:

   ```
   databricks://token:xxxxxxx...@xxxxxxxxxxx.azuredatabricks.net:443
   ```

   Include the extra engine parameters to identify a compute cluster to use:

   ```json
   {"connect_args": {"http_path": "/sql/1.0/warehouses/XXXXXXXXXXXX"}}
   ```

   Lastly, tick the box to allow selecting a catalog.

3. Create the datasource and observe the browser hanging forever, with a spinner in the UI that never disappears.

4. Refresh the page and observe that the datasource was created successfully after all.

I have noticed that while this is going on, the Databricks cluster receives `SHOW SCHEMAS` requests constantly in a loop. The server logs loop until the request eventually returns a 201, if the client hasn't already timed out: [logs.txt](https://github.com/user-attachments/files/20099380/logs.txt)

I suspect that Superset is pre-fetching all schemas in all catalogs, maybe for testing the connection or populating some cache. This is not ideal in our environment, because we have hundreds of catalogs and would like to load only what we need, when we need it.

### Screenshots/recordings

_No response_

### Superset version

4.1.2

### Python version

3.10

### Node version

Not applicable

### Browser

Not applicable

### Additional context

_No response_

### Checklist

- [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
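For reference, the dialect itself can be sanity-checked outside Superset. A minimal sketch, assuming `databricks-sqlalchemy~=1` with SQLAlchemy 1.4 and the same placeholder token, host, and warehouse path as in the steps above:

```python
# Connectivity check outside Superset; placeholders mirror the issue above.
from sqlalchemy import create_engine, inspect, text

engine = create_engine(
    "databricks://token:xxxxxxx...@xxxxxxxxxxx.azuredatabricks.net:443",
    connect_args={"http_path": "/sql/1.0/warehouses/XXXXXXXXXXXX"},
)

with engine.connect() as conn:
    # A trivial query confirms the dialect can reach the warehouse at all.
    print(conn.execute(text("SELECT 1")).scalar())

# A single schema listing is one round trip and returns quickly on its own.
print(inspect(engine).get_schema_names())
```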
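To make the suspected pre-fetch concrete: this is NOT Superset's actual code, just a hypothetical illustration of the access pattern the looping logs suggest, using raw Databricks SQL:

```python
# Hypothetical sketch of the suspected per-catalog schema pre-fetch.
from sqlalchemy import create_engine, text

engine = create_engine(
    "databricks://token:xxxxxxx...@xxxxxxxxxxx.azuredatabricks.net:443",
    connect_args={"http_path": "/sql/1.0/warehouses/XXXXXXXXXXXX"},
)

with engine.connect() as conn:
    catalogs = [row[0] for row in conn.execute(text("SHOW CATALOGS"))]
    for catalog in catalogs:
        # One warehouse round trip per catalog. With hundreds of catalogs,
        # this alone takes long enough for the original request to time out,
        # which would match the repeated SHOW SCHEMAS entries in logs.txt.
        conn.execute(text(f"SHOW SCHEMAS IN {catalog}")).fetchall()
```

If this is roughly what happens on datasource creation, deferring the per-catalog listing until a catalog is actually selected would avoid the timeout.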