gatorsmile commented on a change in pull request #23709:
[SPARK-26794][SQL]SparkSession enableHiveSupport does not point to hive but
in-memory while the SparkContext exists
URL: https://github.com/apache/spark/pull/23709#discussion_r252961561
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala
##########
@@ -40,25 +40,39 @@ import org.apache.spark.util.{MutableURLClassLoader, Utils}
/**
* A class that holds all state shared across sessions in a given [[SQLContext]].
*/
-private[sql] class SharedState(val sparkContext: SparkContext) extends Logging {
+private[sql] class SharedState(
+ val sparkContext: SparkContext,
+ initialConfigs: scala.collection.Map[String, String])
+ extends Logging {
+ private val conf = sparkContext.conf.clone()
+ private val hadoopConf = new Configuration(sparkContext.hadoopConfiguration)
+
+ // If `SparkSession` is instantiated using an existing `SparkContext` instance and no existing
+ // `SharedState`, all `SparkSession` level configurations have higher priority to generate a
+ // `SharedState` instance. This will be done only once then shared across `SparkSession`s
+ initialConfigs.foreach { case (k, v) =>
+ logDebug(s"Applying initial SparkSession options to SparkConf/HadoopConf: $k -> $v")
+ conf.set(k, v)
+ hadoopConf.set(k, v)
+ }
// Load hive-site.xml into hadoopConf and determine the warehouse path we want to use, based on
// the config from both hive and Spark SQL. Finally set the warehouse config value to sparkConf.
val warehousePath: String = {
val configFile = Utils.getContextOrSparkClassLoader.getResource("hive-site.xml")
if (configFile != null) {
logInfo(s"loading hive config file: $configFile")
- sparkContext.hadoopConfiguration.addResource(configFile)
+ hadoopConf.addResource(configFile)
}
// hive.metastore.warehouse.dir only stay in hadoopConf
- sparkContext.conf.remove("hive.metastore.warehouse.dir")
+ conf.remove("hive.metastore.warehouse.dir")
Review comment:
It sounds like this also changes the original semantics: the key is now removed from the cloned `conf` rather than from `sparkContext.conf` itself.
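To make the concern concrete, here is a minimal sketch of the difference between removing a key from a shared configuration and removing it from a private copy. `FakeConf` and `CloneSemantics` are hypothetical stand-ins written for illustration only; they are not Spark classes, and the real `SparkConf` may behave differently in other respects.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a SparkConf-like object (an assumption, not Spark's API):
// a mutable key/value store that can be copied.
final class FakeConf(private val settings: mutable.Map[String, String] = mutable.Map.empty) {
  def set(k: String, v: String): FakeConf = { settings(k) = v; this }
  def remove(k: String): FakeConf = { settings.remove(k); this }
  def contains(k: String): Boolean = settings.contains(k)
  // Returns an independent copy; later changes to the copy do not touch the original.
  def copy(): FakeConf = new FakeConf(settings.clone())
}

object CloneSemantics {
  val key = "hive.metastore.warehouse.dir"

  // Mirrors the removed line in the diff: the shared conf itself loses the key.
  def removeInPlace(shared: FakeConf): Unit = shared.remove(key)

  // Mirrors the added line in the diff: only a private copy loses the key.
  def removeFromCopy(shared: FakeConf): FakeConf = shared.copy().remove(key)
}
```

With `removeFromCopy`, any other code that reads the shared configuration would still see the key, which is the kind of behavior change the comment above is asking about.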