andywalner opened a new issue, #9866:
URL: https://github.com/apache/gravitino/issues/9866
### What would you like to be improved?
The Gravitino Spark connector requires users to manually set
`spark.sql.catalogImplementation=hive` when using the `hive` catalog provider.
This is a leaky abstraction: users shouldn't need to know about underlying
Spark/Hive implementation details when Gravitino is meant to be the unified
federation layer.
**Current behavior:**
The `hive` provider catalog appears in `SHOW CATALOGS`, but querying it
fails with:
```
Caused by: java.lang.AssertionError: assertion failed: Require setting
spark.sql.catalogImplementation to `hive` to enable hive support.
	at org.apache.kyuubi.spark.connector.hive.HiveTableCatalog.initialize(HiveTableCatalog.scala:124)
	at org.apache.gravitino.spark.connector.hive.GravitinoHiveCatalog.createAndInitSparkCatalog(GravitinoHiveCatalog.java:42)
```
**Expected behavior:**
The `GravitinoSparkPlugin` should automatically set
`spark.sql.catalogImplementation=hive` when it detects a catalog that uses
the `hive` provider, just as `lakehouse-iceberg` works without requiring any
additional Spark-level configuration.
Users should only need:
```python
spark = SparkSession.builder \
    .config("spark.plugins",
            "org.apache.gravitino.spark.connector.plugin.GravitinoSparkPlugin") \
    .config("spark.sql.gravitino.uri", "http://localhost:8090") \
    .config("spark.sql.gravitino.metalake", "my_metalake") \
    .getOrCreate()
```
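Until the plugin applies this automatically, a possible workaround is to pass the Hive flag explicitly alongside the Gravitino settings. The sketch below only assembles the configuration as a plain dict; the Gravitino config keys are the ones shown in this issue, while the URI and metalake values are placeholders, and `getOrCreate()` is left commented out because it needs a running Spark and Gravitino deployment:

```python
# Workaround sketch: set spark.sql.catalogImplementation=hive manually --
# this is the step the issue asks GravitinoSparkPlugin to automate.
conf = {
    "spark.plugins":
        "org.apache.gravitino.spark.connector.plugin.GravitinoSparkPlugin",
    "spark.sql.gravitino.uri": "http://localhost:8090",   # placeholder URI
    "spark.sql.gravitino.metalake": "my_metalake",        # placeholder name
    # Manual setting currently required for `hive`-provider catalogs:
    "spark.sql.catalogImplementation": "hive",
}

# With pyspark installed and the servers running, it would be applied as:
# from pyspark.sql import SparkSession
# builder = SparkSession.builder
# for key, value in conf.items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
```

Note that `spark.sql.catalogImplementation` must be in place before the first `SparkSession` is created; setting it on an already-running session has no effect, which is part of why handling it inside the plugin would be more robust.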
### Environment
- Gravitino version: 1.1.0
- Spark version: 3.5.3