yadavay-amzn opened a new pull request, #56627:
URL: https://github.com/apache/spark/pull/56627

   ### What changes were proposed in this pull request?
   
   Route ThriftServer JDBC metadata operations (getCatalogs, getSchemas, 
getTables, getColumns) through CatalogManager so they honor DataSource V2 
catalogs and the default catalog. Populate TABLE_CAT with the real catalog 
name. Introduce a new conf `spark.sql.thriftServer.catalogMetadata.enabled` 
(default true) with legacy fallback when disabled.
   
   ### Why are the changes needed?
   
   With `spark.sql.catalog.*` and `spark.sql.defaultCatalog` set, JDBC/BI 
clients get inconsistent metadata because the metadata operations use the V1 
SessionCatalog directly and ignore any configured DSv2 catalogs.
   
   ### Design notes
   
   (a) A null `catalogName` resolves to the CURRENT catalog (consistent with 
Spark's own unspecified-to-current resolution and with Trino/Snowflake 
behavior), not all catalogs.
   
   (b) getCatalogs returns CatalogManager.listCatalogs() (including 
spark_catalog, sorted alphabetically).
   
   (c) TABLE_CAT was previously empty or null; it is now populated with the 
actual catalog name. The conf defaults to ON and serves as an escape hatch for 
clients that relied on parsing an empty TABLE_CAT.
   
   (d) KNOWN LIMITATION: listCatalogs() returns only ALREADY-LOADED catalogs, 
so catalogs that are configured but never accessed will not be listed. This is 
documented; we do not eagerly load catalogs.
   
   (e) V2-specific metadata authorization is deferred to a follow-up. Existing 
Hive authorization hooks are unchanged, and getCatalogs/getSchemas already pass 
null privilege objects.
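
   Notes (a), (b), and (d) can be illustrated with a small toy model. This is 
only a sketch under stated assumptions: `CatalogManagerModel` and its methods 
are hypothetical names invented for illustration, not Spark's CatalogManager 
API, and the catalog names `lake` and `archive` are made up.

```python
from typing import List, Optional


class CatalogManagerModel:
    """Toy stand-in for the behavior described above; not Spark's real API."""

    def __init__(self, configured: List[str]):
        # spark_catalog is always available and starts out as the current catalog.
        self.configured = set(configured) | {"spark_catalog"}
        self.loaded = {"spark_catalog"}
        self.current = "spark_catalog"

    def catalog(self, name: str) -> str:
        # Accessing a configured catalog loads it lazily.
        if name not in self.configured:
            raise KeyError(f"catalog not configured: {name}")
        self.loaded.add(name)
        return name

    def set_current_catalog(self, name: str) -> None:
        self.current = self.catalog(name)

    def list_catalogs(self) -> List[str]:
        # Notes (b) and (d): only already-loaded catalogs, sorted alphabetically.
        return sorted(self.loaded)

    def resolve(self, catalog_name: Optional[str]) -> str:
        # Note (a): a null catalog name means the current catalog, not all catalogs.
        return self.current if catalog_name is None else self.catalog(catalog_name)


manager = CatalogManagerModel(configured=["lake", "archive"])
before = manager.list_catalogs()   # ['spark_catalog']
manager.set_current_catalog("lake")
resolved = manager.resolve(None)   # 'lake'
after = manager.list_catalogs()    # ['lake', 'spark_catalog']
```

   In this model `archive` is configured but never accessed, so it does not 
appear in the listing until something resolves it by name.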
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. TABLE_CAT now reflects the catalog name (gated by the new conf). New 
conf: `spark.sql.thriftServer.catalogMetadata.enabled`.
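
   As a rough illustration of the gating, the sketch below models how TABLE_CAT 
might be chosen from the conf. The helper `table_cat` is hypothetical, and the 
empty string stands in for the legacy value, which the description says was 
empty or null.

```python
CONF_KEY = "spark.sql.thriftServer.catalogMetadata.enabled"


def table_cat(conf: dict, catalog_name: str) -> str:
    """Return the TABLE_CAT value a client would see in this toy model."""
    # The conf defaults to true, matching the default stated in this PR.
    enabled = str(conf.get(CONF_KEY, "true")).lower() == "true"
    # Assumption: the legacy path is modeled as an empty string.
    return catalog_name if enabled else ""


default_value = table_cat({}, "lake")
legacy_value = table_cat({CONF_KEY: "false"}, "lake")
```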
   
   ### How was this patch tested?
   
   SparkMetadataOperationSuite covers: default spark_catalog path, configured 
in-memory DSv2 catalog path, and conf-disabled legacy path (getCatalogs, 
getSchemas, getTables, getColumns).
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Yes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

