Hi Mich,
I never managed to run Hive on Spark with a Spark master other than local, so I am afraid I do not have a definitive answer here. A few things are worth trying, though. First, start Hive as

hive --hiveconf hive.root.logger=DEBUG,console

so that you can see what the exact error is. Beyond that I cannot be of much help, as I think I reached the same point (it would only work when setting spark.master=local) before abandoning the attempt.

Cheers

> On 27 Nov 2015, at 01:59, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>
> Hi Sophia,
>
> There is no hadoop-2.6 profile; I believe you should use hadoop-2.4, as shown below:
>
> mvn -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests clean package
>
> Also, if you are building it for the Hive on Spark engine, you should not include the Hadoop jar files in your build.
>
> For example, I tried to build Spark 1.3 from source (I read that this version works OK with Hive, having tried Spark 1.5.2 unsuccessfully).
>
> The following command created the tar file:
>
> ./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"
>
> which produced spark-1.3.0-bin-hadoop2-without-hive.tar.gz
>
> Now I have other issues making Hive use the Spark execution engine (which requires Hive 1.1 or above).
>
> In Hive I do:
>
> set spark.home=/usr/lib/spark;
> set hive.execution.engine=spark;
> set spark.master=spark://127.0.0.1:7077;
> set spark.eventLog.enabled=true;
> set spark.eventLog.dir=/usr/lib/spark/logs;
> set spark.executor.memory=512m;
> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
> use asehadoop;
> select count(1) from t;
>
> I get the following:
>
> OK
> Time taken: 0.753 seconds
> Query ID = hduser_20151127003523_e9863e84-9a81-4351-939c-36b3bef36478
> Total jobs = 1
> Launching Job 1 out of 1
> In order to change the average load for a reducer (in bytes):
> set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
> set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
> set mapreduce.job.reduces=<number>
> Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
>
> HTH,
>
> Mich
>
> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only; if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free; therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.
>
> From: Sofia [mailto:sofia.panagiot...@taiger.com]
> Sent: 18 November 2015 16:50
> To: user@hive.apache.org
> Subject: Hive version with Spark
>
> Hello
>
> After various failed attempts to use my Hive (1.2.1) with my Spark (Spark 1.4.1 built for Hadoop 2.2.0), I decided to try building Spark with Hive again.
> I would like to know the latest Hive version that can be used to build Spark at this point.
>
> When downloading the Spark 1.5 source and trying:
>
> mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-1.2.1 -Phive-thriftserver -DskipTests clean package
>
> I get:
>
> The requested profile "hive-1.2.1" could not be activated because it does not exist.
>
> Thank you
> Sofia
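
P.S. One more suggestion regarding the "requested profile could not be activated" error at the bottom of the thread: before choosing the -P flags, you can ask Maven to list every profile the Spark POMs actually define. This is only a sketch from my side and assumes you run it from the top of the Spark source checkout:

# Run from the root of the Spark source tree; prints all profiles defined in the POM hierarchy
mvn help:all-profiles

Any -P flag that does not appear in that list (hive-1.2.1 apparently does not, going by the error above) would need to be dropped or replaced in the build command.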