andywalner opened a new issue, #9866:
URL: https://github.com/apache/gravitino/issues/9866
### What would you like to be improved?
The Gravitino Spark connector requires users to manually set
`spark.sql.catalogImplementation=hive` when using the `hive` catalog provider.
This is a leaky abstraction: users shouldn't need to know about underlying
Spark/Hive implementation details when Gravitino is meant to be the unified
federation layer.
**Current behavior:**
The `hive` provider catalog appears in `SHOW CATALOGS`, but querying it
fails with:
```
Caused by: java.lang.AssertionError: assertion failed: Require setting
spark.sql.catalogImplementation to `hive` to enable hive support.
	at org.apache.kyuubi.spark.connector.hive.HiveTableCatalog.initialize(HiveTableCatalog.scala:124)
	at org.apache.gravitino.spark.connector.hive.GravitinoHiveCatalog.createAndInitSparkCatalog(GravitinoHiveCatalog.java:42)
```
**Expected behavior:**
The `GravitinoSparkPlugin` should automatically set
`spark.sql.catalogImplementation=hive` when it detects a catalog that uses
the `hive` provider, just as `lakehouse-iceberg` works without requiring any
additional Spark-level configuration.
Users should only need:
```python
spark = SparkSession.builder \
    .config("spark.plugins",
            "org.apache.gravitino.spark.connector.plugin.GravitinoSparkPlugin") \
    .config("spark.sql.gravitino.uri", "http://localhost:8090") \
    .config("spark.sql.gravitino.metalake", "my_metalake") \
    .getOrCreate()
```
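Until the plugin applies this automatically, a possible workaround is to pass the Hive flag explicitly alongside the Gravitino settings. The sketch below only assembles the configuration as a plain dict; the Gravitino config keys are the ones shown in this issue, while the URI and metalake values are placeholders, and `getOrCreate()` is left commented out because it needs a running Spark and Gravitino deployment:

```python
# Workaround sketch: set spark.sql.catalogImplementation=hive manually --
# this is the step the issue asks GravitinoSparkPlugin to automate.
conf = {
    "spark.plugins":
        "org.apache.gravitino.spark.connector.plugin.GravitinoSparkPlugin",
    "spark.sql.gravitino.uri": "http://localhost:8090",   # placeholder URI
    "spark.sql.gravitino.metalake": "my_metalake",        # placeholder name
    # Manual setting currently required for `hive`-provider catalogs:
    "spark.sql.catalogImplementation": "hive",
}

# With pyspark installed and the servers running, it would be applied as:
# from pyspark.sql import SparkSession
# builder = SparkSession.builder
# for key, value in conf.items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
```

Note that `spark.sql.catalogImplementation` must be in place before the first `SparkSession` is created; setting it on an already-running session has no effect, which is part of why handling it inside the plugin would be more robust.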
### Environment
- Gravitino version: 1.1.0
- Spark version: 3.5.3