[ https://issues.apache.org/jira/browse/HIVE-7436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069486#comment-14069486 ]

Xuefu Zhang commented on HIVE-7436:
-----------------------------------

[~chengxiang li], thanks for sharing your thoughts. I gave this a little more thought, and here is what I have in mind:

1. We don't want to read Spark config from a spark-site.xml file. Instead, we want to expose a few basic Hive configuration properties for it, such as hive.server2.spark.masterurl for the Spark master. The reason is that we don't want to require that such a configuration file be available; everything can be done in Hive itself. We don't necessarily want to get every configuration from spark-site.xml. This is also how Tez is handled, by the way.

2. I think the SparkContext should be per user session. A singleton available to everyone is not acceptable, because we want separation between users so that one user's state, such as custom UDFs, is not accidentally shared with other users. We don't want to create it per query because of the startup cost. I think per user session is a reasonable compromise. When a user session expires, its resources are released so that they become available to other users.

Let me know if this makes sense. Please note that SparkContext currently isn't thread safe, so we may need to ask Spark to change that. (I think an old JIRA is already there.)

> Load Spark configuration into Hive driver
> -----------------------------------------
>
>                 Key: HIVE-7436
>                 URL: https://issues.apache.org/jira/browse/HIVE-7436
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>         Attachments: HIVE-7436-Spark.1.patch, HIVE-7436-Spark.2.patch
>
>
> Load Spark configuration into the Hive driver. There are 3 ways to set up Spark
> configuration:
> #  Configure properties in the Spark configuration file (spark-defaults.conf).
> #  Java properties.
> #  System environment.
> Spark supports configuration through the system environment only for compatibility
> with previous scripts; we won't support it in Hive on Spark. Hive on Spark loads
> defaults from Java properties, then loads properties from the configuration file,
> which overrides existing properties.
> Configuration steps:
> # Create spark-defaults.conf, and place it in the /etc/spark/conf
> configuration directory.
>     Please refer to [http://spark.apache.org/docs/latest/configuration.html]
> for the configuration of spark-defaults.conf.
> # Create the $SPARK_CONF_DIR environment variable and set it to the location
> of spark-defaults.conf.
>     export SPARK_CONF_DIR=/etc/spark/conf
> # Add $SPARK_CONF_DIR to the $HADOOP_CLASSPATH environment variable.
>     export HADOOP_CLASSPATH=$SPARK_CONF_DIR:$HADOOP_CLASSPATH
> NO PRECOMMIT TESTS. This is for spark-branch only.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
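
For illustration only, the load order given in the quoted issue description (defaults from Java properties first, then spark-defaults.conf overriding them) might be sketched as below. This is not taken from the attached patches. The class name SparkConfSketch and the method load are hypothetical, and the sketch assumes that only keys starting with "spark." are of interest and that the file can be parsed with java.util.Properties.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch, not the code from the HIVE-7436 patches.
class SparkConfSketch {

    // Collects "spark.*" entries from Java properties first, then from the
    // contents of a spark-defaults.conf style file, letting file entries
    // override entries that are already present.
    static Map<String, String> load(Properties javaProps, Reader confFile) {
        Map<String, String> conf = new HashMap<>();

        // Step 1: defaults from Java properties.
        for (String name : javaProps.stringPropertyNames()) {
            if (name.startsWith("spark.")) {
                conf.put(name, javaProps.getProperty(name));
            }
        }

        // Step 2: properties from the configuration file override step 1.
        Properties fileProps = new Properties();
        try {
            fileProps.load(confFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String name : fileProps.stringPropertyNames()) {
            if (name.startsWith("spark.")) {
                // Properties.load keeps trailing whitespace in values, so trim it.
                conf.put(name, fileProps.getProperty(name).trim());
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        // Stand-in for System.getProperties(); the values are made up.
        Properties javaProps = new Properties();
        javaProps.setProperty("spark.master", "local");

        // Stand-in for the contents of /etc/spark/conf/spark-defaults.conf.
        Reader file = new StringReader("spark.master   spark://host:7077\n");

        Map<String, String> conf = load(javaProps, file);
        System.out.println(conf.get("spark.master"));
    }
}
```

In this sketch the file wins whenever a key is set in both places, which matches the order stated in the description. A Hive-side property such as hive.server2.spark.masterurl, as proposed in the comment, would be a separate input and is not modeled here.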