Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21122#discussion_r186256056
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -1354,7 +1354,8 @@ class HiveDDLSuite
     val indexName = tabName + "_index"
     withTable(tabName) {
       // Spark SQL does not support creating index. Thus, we have to use Hive client.
-      val client = spark.sharedState.externalCatalog.asInstanceOf[HiveExternalCatalog].client
+      val client =
+        spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client
--- End diff --
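For context on the change above, here is a minimal, self-contained sketch of why the cast now has to go through `unwrapped`. All types below are toy stand-ins, not Spark's real classes: the assumption is that the shared state wraps the external catalog in a delegating class, so casting the wrapper itself to `HiveExternalCatalog` would throw a `ClassCastException` at runtime.

```scala
// Toy types only; Spark's real classes have more members and may differ.
trait ExternalCatalog {
  def tableExists(db: String, table: String): Boolean
}

class HiveExternalCatalog extends ExternalCatalog {
  val client: AnyRef = new AnyRef // stand-in for the real Hive client
  override def tableExists(db: String, table: String): Boolean = true
}

// Hypothetical delegating wrapper with an escape hatch to the delegate.
class ExternalCatalogWithListener(delegate: ExternalCatalog) extends ExternalCatalog {
  def unwrapped: ExternalCatalog = delegate
  override def tableExists(db: String, table: String): Boolean =
    delegate.tableExists(db, table)
}

object UnwrappedDemo extends App {
  val catalog = new ExternalCatalogWithListener(new HiveExternalCatalog)
  // catalog.asInstanceOf[HiveExternalCatalog].client  // would throw ClassCastException
  val client = catalog.unwrapped.asInstanceOf[HiveExternalCatalog].client // cast the delegate
  println(client != null) // true
}
```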
Ideally, Hive should not be a first-class citizen in Spark: it's just a data
source and a catalog, nothing more. We want to narrow the scope of Hive usage
in Spark, and keeping it confined to `HiveExternalCatalog` is a good choice.
This is still an ongoing effort. As @rdblue pointed out, there are still 2
places where Spark uses Hive directly:
1. `HiveSessionStateBuilder`. It's mostly for ADD JAR; once we move the
ADD JAR functionality to `ExternalCatalog`, we can fix it.
2. `SaveAsHiveFile`. It's the data source part, so it should be allowed to
use Hive there. One thing we can improve is Hive client reuse: we cast to
`HiveExternalCatalog` to get the existing Hive client, but maybe there is a
better way to do it without the ugly casting (see the sketch below).
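On that last point, reusing the toy types from the sketch above, one hedged possibility is to confine the cast to a single named helper so callers never write `asInstanceOf` themselves. The helper name and shape here are hypothetical, not an existing Spark API.

```scala
// Hypothetical helper, sketched against the toy types above; not Spark's API.
object HiveClientAccess {
  // Returns the shared Hive client if the underlying catalog is Hive-backed,
  // keeping the one remaining cast in a single place.
  def hiveClient(catalog: ExternalCatalogWithListener): Option[AnyRef] =
    catalog.unwrapped match {
      case hive: HiveExternalCatalog => Some(hive.client)
      case _ => None // e.g. an in-memory catalog: no Hive client to reuse
    }
}
```

Callers would then match on the `Option` (or `getOrElse` a freshly created client) instead of casting inline, which keeps the Hive dependency visible at exactly one seam.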
---