Re: hive on spark - why is it so hard?

2017-10-02 Thread Jörn Franke
You should try TEZ+LLAP. Additionally, you will need to compare different configurations. Finally, any generic comparison is meaningless: you should use the queries, data, and file formats that your users will actually be using later.
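For reference, the engine is a per-session Hive setting, so the same representative query can be run under each candidate. A minimal sketch, assuming Hive 2.x property names:

    -- run the same user-representative query under each engine
    SET hive.execution.engine=tez;    -- Tez (LLAP adds hive.llap.execution.mode=all)
    SET hive.execution.engine=spark;  -- Hive-on-Spark
    SET hive.execution.engine=mr;     -- legacy MapReduce baseline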

Re: hive on spark - why is it so hard?

2017-10-01 Thread Stephen Sprague
so... I made some progress after much copying of jar files around (as alluded to by Gopal previously on this thread). Following the instructions here: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started and doing this as instructed will leave off about a dozen or
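For context, the jar step on that Getting Started page amounts to linking a few Spark jars into Hive's lib directory. A sketch of that step; the exact jar names depend on the installed Spark/Scala versions:

    # sketch of the jar-linking step from the Getting Started page;
    # exact jar names depend on the installed Spark/Scala versions
    ln -s $SPARK_HOME/jars/scala-library-*.jar        $HIVE_HOME/lib/
    ln -s $SPARK_HOME/jars/spark-core_*.jar           $HIVE_HOME/lib/
    ln -s $SPARK_HOME/jars/spark-network-common_*.jar $HIVE_HOME/lib/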

Re: hive on spark - why is it so hard?

2017-09-27 Thread Stephen Sprague
ok.. getting further. It seems now I have to deploy Hive to all nodes in the cluster. I don't think I had to do that before, but it's not a big deal to do it now. For me: HIVE_HOME=/usr/lib/apache-hive-2.3.0-bin/ and SPARK_HOME=/usr/lib/spark-2.2.0-bin-hadoop2.6 on all three nodes now. I started
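Using the paths from this message, the per-node setup is just two environment variables (a sketch; whether they go in hive-env.sh or a login profile is a local choice):

    # on each of the three nodes, using the paths from this message
    export HIVE_HOME=/usr/lib/apache-hive-2.3.0-bin/
    export SPARK_HOME=/usr/lib/spark-2.2.0-bin-hadoop2.6
    export PATH="$HIVE_HOME/bin:$SPARK_HOME/bin:$PATH"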

Re: hive on spark - why is it so hard?

2017-09-27 Thread Stephen Sprague
thanks. I haven't had a chance to dig into this again today, but I do appreciate the pointer. I'll keep you posted.

Re: hive on spark - why is it so hard?

2017-09-27 Thread Sahil Takiar
You can try increasing the value of hive.spark.client.connect.timeout. I would also suggest taking a look at the HoS Remote Driver logs. The driver gets launched in a YARN container (assuming you are running Spark in yarn-client mode), so you just have to find the logs for that container. --Sahil
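Concretely, that is one Hive setting plus one YARN command. A sketch; the timeout value below is illustrative, not a recommendation:

    # in the Hive session: raise the Spark client connect timeout (milliseconds)
    hive> SET hive.spark.client.connect.timeout=30000;

    # then pull the remote driver's logs from its YARN container
    yarn logs -applicationId <application_id>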

Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
I _seem_ to be getting closer. Maybe it's just wishful thinking. Here's where I'm at now:

2017-09-26T21:10:38,892 INFO [stderr-redir-1] client.SparkClientImpl: 17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with CreateSubmissionResponse:
2017-09-26T21:10:38,892 INFO

Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
oh, I missed Gopal's reply. oy... that sounds foreboding. I'll keep you posted on my progress.

Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
well, this is the spark-submit line from above: 2017-09-26T14:04:45,678 INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3 main] client.SparkClientImpl: Running client driver with argv: /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit and that's pretty clearly v2.2. I do have other versions of

Re: hive on spark - why is it so hard?

2017-09-26 Thread Gopal Vijayaraghavan
Hi,

> org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.

I get inexplicable errors with Hive-on-Spark unless I do a three-step build. Build Hive first, use that version to
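The full recipe is cut off here, but the Getting Started page linked earlier in the thread describes the related step of building Spark without its bundled Hive jars. A hedged sketch; the profile names vary by Spark/Hadoop version:

    # sketch: build a Spark distribution without Hive jars, per the
    # Getting Started page; profile names vary by Spark/Hadoop version
    ./dev/make-distribution.sh --name hadoop2-without-hive --tgz \
        -Pyarn -Phadoop-2.6 -Phadoop-provided -Pparquet-provided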

Re: hive on spark - why is it so hard?

2017-09-26 Thread Sahil Takiar
Are you sure you are using Spark 2.2.0? Based on the stack trace, it looks like your call to spark-submit is using an older version of Spark (looks like some early 1.x version). Do you have SPARK_HOME set locally? Do you have older versions of Spark installed locally? --Sahil
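A quick way to check both questions from the client machine (a sketch):

    # quick checks for a stale Spark on the client
    echo "$SPARK_HOME"
    which spark-submit
    spark-submit --version   # prints the version banner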

Re: hive on spark - why is it so hard?

2017-09-26 Thread Stephen Sprague
thanks Sahil. Here it is.

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/scheduler/SparkListenerInterface
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:344)
    at
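One way to check whether the Spark jars visible to Hive actually contain the missing class (a sketch; spark-core is where SparkListenerInterface normally lives):

    # sketch: confirm the class is present in the spark-core jar Hive sees
    unzip -l $SPARK_HOME/jars/spark-core_*.jar | grep SparkListenerInterface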

Re: hive on spark - why is it so hard?

2017-09-26 Thread Sahil Takiar
Hey Stephen, Can you send the full stack trace for the NoClassDefFoundError? For Hive 2.3.0, we only support Spark 2.0.0. Hive may work with more recent versions of Spark, but we only test with Spark 2.0.0. --Sahil