[
https://issues.apache.org/jira/browse/SPARK-50759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan resolved SPARK-50759.
---------------------------------
Fix Version/s: 4.1.0
Resolution: Fixed
Issue resolved by pull request 50085
[https://github.com/apache/spark/pull/50085]
> Spark catalog api bug when working with non-hms based catalog
> -------------------------------------------------------------
>
> Key: SPARK-50759
> URL: https://issues.apache.org/jira/browse/SPARK-50759
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0, 4.0.0, 3.5.4
> Reporter: Sunny malik
> Assignee: Sunny malik
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.1.0
>
>
> Hi
> I am encountering issues while working with a REST-based catalog. My Spark
> session is configured with a default catalog that uses the REST-based
> implementation.
> The {{SparkSession.catalog}} API does not function correctly with the
> REST-based catalog. This issue has been tested and observed in Spark 3.4.
> ----------------------------------------------------------------------------------
> ${SPARK_HOME}/bin/spark-shell --master local[*] \
>   --driver-memory 2g \
>   --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
>   --conf spark.sql.catalog.iceberg.uri=https://xx.xxx.xxxx.domain.com \
>   --conf spark.sql.warehouse.dir=$SQL_WAREHOUSE_DIR \
>   --conf spark.sql.defaultCatalog=iceberg \
>   --conf spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog \
>   --conf spark.sql.catalog.iceberg.catalog-impl=org.apache.iceberg.rest.RESTCatalog
> scala> spark.catalog.currentCatalog
> res1: String = iceberg
> scala> spark.sql("select * from restDb.restTable").show
> +---+----------+
> | id|      data|
> +---+----------+
> |  1|some_value|
> +---+----------+
> scala> spark.catalog.tableExists("restDb.restTable")
> res3: Boolean = true
> scala> spark.catalog.tableExists("restDb", "restTable")
> res4: Boolean = false
> ----------------------------------------------------------------------------------
>
> The API spark.catalog.tableExists(String databaseName, String tableName)
> works only with the HMS-based session catalog
> ([https://github.com/apache/spark/blob/5a91172c019c119e686f8221bbdb31f59d3d7776/sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala#L224]),
>
> whereas spark.catalog.tableExists(String tableName)
> works with both HMS and non-HMS based catalogs.
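>
> As a workaround on affected versions, the two-argument overload can be avoided
> by passing a qualified name to the single-argument overload. A minimal sketch,
> assuming the spark-shell session and the restDb.restTable table shown above
> (the helper name tableExistsQualified is only illustrative):
>
> // Sketch only: forwards to the single-argument overload, which resolves the
> // name through the current (possibly non-HMS, e.g. REST) catalog.
> def tableExistsQualified(spark: org.apache.spark.sql.SparkSession,
>                          db: String, table: String): Boolean =
>   spark.catalog.tableExists(s"$db.$table")
>
> // Usage in the session above: tableExistsQualified(spark, "restDb", "restTable")
> // returns true, matching spark.catalog.tableExists("restDb.restTable").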
>
>
> Suggested resolutions:
> 1. Have spark.catalog.tableExists(String databaseName, String tableName)
> throw a runtime exception if the session catalog is a non-HMS based catalog
> (see the sketch after this list).
> 2. Deprecate the HMS-specific API in a newer Spark release, since Spark
> already has an API that works with both HMS and non-HMS based catalogs.
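>
> A minimal sketch of resolution 1, assuming the current catalog name can be
> compared against the built-in session catalog name "spark_catalog"; this is
> only illustrative and is not the change made in pull request 50085:
>
> // Sketch only: fail fast instead of silently returning false when the
> // two-argument overload is used against a non-HMS (e.g. REST) catalog.
> def tableExistsHmsOnly(spark: org.apache.spark.sql.SparkSession,
>                        dbName: String, tableName: String): Boolean = {
>   if (spark.catalog.currentCatalog() != "spark_catalog") {
>     throw new UnsupportedOperationException(
>       "tableExists(db, table) supports only the HMS-backed session catalog; " +
>       s"use tableExists($dbName.$tableName) instead")
>   }
>   spark.catalog.tableExists(dbName, tableName)
> }
>
> // With spark.sql.defaultCatalog=iceberg (as configured above) this throws
> // instead of silently returning false as in res4.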
>
> Thanks
> Sunny