[ https://issues.apache.org/jira/browse/HIVE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147714#comment-14147714 ]
Rui Li commented on HIVE-7382:
------------------------------

Hi [~xuefuz], I hit the same problem Szehon mentioned. After some digging, I think this is because in local-cluster mode Spark launches separate JVMs for the executor backends, so it needs to run some scripts to determine the proper class path (and probably other things); please refer to {{CommandUtils.buildCommandSeq}}, which is called when {{ExecutorRunner}} tries to launch an executor backend. Therefore local-cluster mode requires an installation of Spark, with spark.home or spark.test.home properly set.

I think this is fine if local-cluster is used only for Spark unit tests, but it shouldn't be used for user applications, because it isn't really "local" in the sense that it requires a Spark installation.

To verify my guess, I ran some Hive queries (not tests) on Spark without setting spark.home. They ran fine in standalone and local modes, but hit the same error in local-cluster mode. To make them work, I had to export SPARK_HOME properly. (Please note that setting spark.home, or spark.testing plus spark.test.home, in SparkConf won't help.) See the sketch after the issue details below. What's your opinion?

> Create a MiniSparkCluster and set up a testing framework [Spark Branch]
> -----------------------------------------------------------------------
>
>                 Key: HIVE-7382
>                 URL: https://issues.apache.org/jira/browse/HIVE-7382
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Rui Li
>              Labels: Spark-M1
>
> To automatically test Hive functionality over the Spark execution engine, we need to create a test framework that can execute Hive queries with Spark as the backend. For that, we should create a MiniSparkCluster, similar to those for other execution engines.
> Spark has a way to create a local cluster with a few processes on the local machine, where each process is a worker node. It's fairly close to a real Spark cluster. Our mini cluster can be based on that.
> For more info, please refer to the design doc on the wiki.
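To make the failure mode concrete, here is a minimal sketch (the object name, app name, and resource sizes are hypothetical) of an application using the local-cluster master. Unlike the local master, the local-cluster[numWorkers,coresPerWorker,memoryMB] string makes Spark fork separate worker JVMs, whose executor launch command {{ExecutorRunner}} builds from the Spark installation directory, so SPARK_HOME must be exported in the environment before launching the application; setting spark.home in SparkConf is not enough.

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo app: 2 worker processes, 1 core each, 1024 MB per worker.
// Spark forks a separate JVM per worker, and ExecutorRunner builds the
// executor launch command from the Spark installation, so this requires
//   export SPARK_HOME=/path/to/spark
// in the environment before launch; setting spark.home (or spark.testing
// plus spark.test.home) in SparkConf here will not help.
object LocalClusterDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("local-cluster-demo")
      .setMaster("local-cluster[2,1,1024]")
    val sc = new SparkContext(conf)
    // Any job that needs executors fails at launch if SPARK_HOME is unset;
    // the same job runs fine with .setMaster("local[2]").
    println(sc.parallelize(1 to 100).reduce(_ + _))
    sc.stop()
  }
}
{code}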