yaooqinn commented on a change in pull request #31460:
URL: https://github.com/apache/spark/pull/31460#discussion_r569889222
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala
##########
@@ -220,31 +219,16 @@ object SharedState extends Logging {
}
/**
- * Load hive-site.xml into hadoopConf and determine the warehouse path we want to use, based on
- * the config from both hive and Spark SQL. Finally set the warehouse config value to sparkConf.
+ * Determine the warehouse path by spark conf, hadoop configuration and the initial options from
+ * the very first created SparkSession instance.
*/
- def loadHiveConfFile(
+ def determineWarehouse(
sparkConf: SparkConf,
hadoopConf: Configuration,
initialConfigs: scala.collection.Map[String, String] = Map.empty)
: scala.collection.Map[String, String] = {
- def containsInSparkConf(key: String): Boolean = {
- sparkConf.contains(key) || sparkConf.contains("spark.hadoop." + key) ||
- (key.startsWith("hive") && sparkConf.contains("spark." + key))
- }
-
val hiveWarehouseKey = "hive.metastore.warehouse.dir"
- val configFile = Utils.getContextOrSparkClassLoader.getResourceAsStream("hive-site.xml")
- if (configFile != null) {
- logInfo(s"loading hive config file: $configFile")
- val hadoopConfTemp = new Configuration()
- hadoopConfTemp.clear()
- hadoopConfTemp.addResource(configFile)
- for (entry <- hadoopConfTemp.asScala if !containsInSparkConf(entry.getKey)) {
- hadoopConf.set(entry.getKey, entry.getValue)
Review comment:
Given the current usage restrictions of Hive in Spark, this change has no
practically meaningful side effect on documented behaviors. In some
undocumented areas, however, there are side effects. For example, if
`hive-site.xml` is unreachable at the start of a Spark app but is added later
through some APIs and loaded dynamically, its configurations will not be
added anymore.
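
To make the precedence rule in the removed lines concrete, here is a minimal, dependency-free sketch. It is an illustration only: plain `Map`s stand in for `SparkConf` and Hadoop `Configuration`, and the object name `HiveSiteMergeSketch`, the `merge` helper, and the sample keys and values are all hypothetical. Only `containsInSparkConf` mirrors logic visible in the diff above.

```scala
// Hypothetical model of the removed merge step; not Spark's actual code.
object HiveSiteMergeSketch {
  // Same check as the removed containsInSparkConf helper, over a plain Map.
  def containsInSparkConf(sparkConf: Map[String, String], key: String): Boolean =
    sparkConf.contains(key) || sparkConf.contains("spark.hadoop." + key) ||
      (key.startsWith("hive") && sparkConf.contains("spark." + key))

  // Copy hive-site.xml entries into hadoopConf unless Spark conf already sets them.
  // `hiveSite` is None when the file cannot be found at the time of the call.
  def merge(
      sparkConf: Map[String, String],
      hadoopConf: Map[String, String],
      hiveSite: Option[Map[String, String]]): Map[String, String] =
    hiveSite.fold(hadoopConf) { entries =>
      hadoopConf ++ entries.filter { case (k, _) => !containsInSparkConf(sparkConf, k) }
    }

  def main(args: Array[String]): Unit = {
    // Assumed sample values, chosen only for illustration.
    val sparkConf = Map("spark.hadoop.hive.exec.scratchdir" -> "/tmp/spark")
    val hiveSite = Map(
      "hive.exec.scratchdir" -> "/tmp/hive",
      "hive.metastore.uris" -> "thrift://example:9083")

    // File reachable when the merge runs: only keys Spark does not override are copied.
    val loaded = merge(sparkConf, Map.empty, Some(hiveSite))
    assert(loaded == Map("hive.metastore.uris" -> "thrift://example:9083"))

    // File not reachable when the merge runs: nothing is copied.
    val missing = merge(sparkConf, Map.empty, None)
    assert(missing.isEmpty)
  }
}
```

In this model the result depends entirely on whether the file can be found at the moment the merge runs, which is the kind of timing-dependent behavior the comment describes for the undocumented case.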