RussellSpitzer opened a new issue #2488:
URL: https://github.com/apache/iceberg/issues/2488


   Our current configuration of the Spark 3 session catalog allows you to set 
the metastore URI either by inheriting it from the Hadoop Configuration:
   
   ```
   spark.hadoop.hive.metastore.uris=thrift://localhost:9083
   ```
   
   or by specifying it for the catalog itself
   
   ```
   spark.sql.catalog.spark_catalog.uri=thrift://localhost:9083 
   ```
   
   Alternatively, a user can use a non-Hive catalog for either the Spark 
session or the Iceberg tables. The key issue is that if these catalogs differ, 
we can end up in a lot of inconsistent situations.
   
   For example:
   Say we configure only "spark_catalog.uri". This sets the Iceberg metastore 
to that value but leaves the Spark session catalog on its default value (in my 
local case, Derby). This means that almost all database calls are made against 
Derby and are invisible to Iceberg, so I can end up with weird behavior like
   
   ```scala
   scala> spark.sql("CREATE DATABASE catset")
   21/04/16 10:18:56 WARN ObjectStore: Failed to get database catset, returning NoSuchObjectException
   res4: org.apache.spark.sql.DataFrame = []
   scala> spark.sql("CREATE TABLE catset.foo (x int) USING iceberg")
   java.lang.RuntimeException: Metastore operation failed for catset.foo
   ```
   
   I have no problem creating the database, but my CREATE TABLE command uses 
the Iceberg catalog, which doesn't have the database. So I get the "catset" 
does not exist error.
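
   One way to avoid this mismatch is presumably to point both settings at the 
same metastore. The fragment below is a sketch, assuming a Hive metastore at the 
same thrift://localhost:9083 address used in the examples above. The first two 
lines (catalog implementation and type) are the usual Iceberg session catalog 
settings and are not quoted from this report.

   ```
   spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
   spark.sql.catalog.spark_catalog.type=hive
   spark.sql.catalog.spark_catalog.uri=thrift://localhost:9083
   spark.hadoop.hive.metastore.uris=thrift://localhost:9083
   ```

   With both values equal, CREATE DATABASE and CREATE TABLE ... USING iceberg 
should be resolved against the same metastore.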
   
   
   
   I think to address this we need to disallow configuring the Spark session 
catalog with a different catalog type than the Iceberg catalog. This means we 
would only allow the Spark session catalog to be coupled with a Hive metastore 
that is also configured for the delegate session catalog.
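
   The kind of check proposed here could be sketched roughly as follows. This 
is only an illustration, not Iceberg code: `CatalogConfigCheck` and `validate` 
are hypothetical names, the settings are passed in as a plain Map, and the only 
property keys used are the two quoted earlier in this issue. It also assumes 
that leaving both keys unset counts as consistent, since both sides then fall 
back to the same defaults.

   ```scala
   // Hypothetical helper (not part of Iceberg or Spark): flags configurations in
   // which the Iceberg catalog and the Spark session catalog could end up
   // pointing at different metastores.
   object CatalogConfigCheck {
     // Property names taken from the examples in this issue.
     val HadoopUriKey = "spark.hadoop.hive.metastore.uris"
     val CatalogUriKey = "spark.sql.catalog.spark_catalog.uri"

     /** Returns Left(reason) for a mismatched configuration, Right(()) otherwise. */
     def validate(conf: Map[String, String]): Either[String, Unit] =
       (conf.get(CatalogUriKey).map(_.trim), conf.get(HadoopUriKey).map(_.trim)) match {
         // Both set, but to different metastores.
         case (Some(catalogUri), Some(hadoopUri)) if catalogUri != hadoopUri =>
           Left(s"$CatalogUriKey ($catalogUri) differs from $HadoopUriKey ($hadoopUri)")
         // The case described above: only the catalog URI is set, so the
         // session catalog stays on its default metastore.
         case (Some(catalogUri), None) =>
           Left(s"$CatalogUriKey is set to $catalogUri but $HadoopUriKey is not set")
         // Same URI on both sides, only the Hadoop value (inherited by the
         // catalog), or neither set (assumed consistent).
         case _ =>
           Right(())
       }
   }
   ```

   For the configuration in the example above, 
`CatalogConfigCheck.validate(Map("spark.sql.catalog.spark_catalog.uri" -> "thrift://localhost:9083"))` 
returns a `Left`, which a real implementation might turn into an exception at 
catalog initialization.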

