Re: hive on spark - why is it so hard?

2017-09-27 Thread Stephen Sprague
Ok... getting further. It seems I now have to deploy Hive to all nodes in the
cluster - I don't think I had to do that before, but it's not a big deal to
do it now.

For me:
HIVE_HOME=/usr/lib/apache-hive-2.3.0-bin/
SPARK_HOME=/usr/lib/spark-2.2.0-bin-hadoop2.6

on all three nodes now.
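
(FWIW, the deploy is nothing fancy - roughly this on every node, via a
profile.d snippet of my own choosing; only the two paths above matter:)

  # same env on all three nodes
  export HIVE_HOME=/usr/lib/apache-hive-2.3.0-bin
  export SPARK_HOME=/usr/lib/spark-2.2.0-bin-hadoop2.6
  export PATH=$HIVE_HOME/bin:$SPARK_HOME/bin:$PATH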

I started the Spark master on the namenode, and I started two Spark slaves on
two datanodes of the cluster.
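
(For the record, these are the stock standalone scripts; the master hostname
below is a guess based on my logs, and 7077 is Spark's default master port:)

  # on the namenode
  $SPARK_HOME/sbin/start-master.sh

  # on each of the two datanodes, pointing the worker at the master
  $SPARK_HOME/sbin/start-slave.sh spark://dwrdevnn1.sv2.trulia.com:7077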

So far, so good.

Now I run my usual test command:

$ hive --hiveconf hive.root.logger=DEBUG,console -e 'set
hive.execution.engine=spark; select date_key, count(*) from
fe_inventory.merged_properties_hist group by 1 order by 1;'

I get a little further now and find the stderr via the Spark Web UI (nice);
it reports this:

17/09/27 20:47:35 INFO WorkerWatcher: Successfully connected to
spark://Worker@172.19.79.127:40145
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
        at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoSuchFieldError: SPARK_RPC_SERVER_ADDRESS
        at org.apache.hive.spark.client.rpc.RpcConfiguration.<init>(RpcConfiguration.java:47)
        at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:134)
        at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:516)
        ... 6 more



Searching around the internet, I find this is probably a compatibility issue.

I know, I know - no surprise here.

So I guess I just got to the point where everybody else is... build Spark
w/o Hive.
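
(If I'm reading the Hive-on-Spark getting-started doc right, that means
packaging a Spark distribution with the Hive jars left out - the profile list
below is my best guess for a Spark 2.2 / Hadoop 2.6 combo, so adjust to taste:)

  # from the Spark source tree: build a distribution without Hive
  ./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz \
      "-Pyarn,hadoop-provided,hadoop-2.6,parquet-provided"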

Lemme see what happens next.





On Wed, Sep 27, 2017 at 7:41 PM, Stephen Sprague  wrote:

> Thanks. I haven't had a chance to dig into this again today, but I do
> appreciate the pointer. I'll keep you posted.
>
> On Wed, Sep 27, 2017 at 10:14 AM, Sahil Takiar 
> wrote:
>
>> You can try increasing the value of hive.spark.client.connect.timeout.
>> Would also suggest taking a look at the HoS Remote Driver logs. The driver
>> gets launched in a YARN container (assuming you are running Spark in
>> yarn-client mode), so you just have to find the logs for that container.
>>
>> --Sahil
>>
>> On Tue, Sep 26, 2017 at 9:17 PM, Stephen Sprague 
>> wrote:
>>
>>> I _seem_ to be getting closer. Maybe it's just wishful thinking.
>>> Here's where I'm at now.
>>>
>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>>> 17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with
>>> CreateSubmissionResponse:
>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: {
>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>>> "action" : "CreateSubmissionResponse",
>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>>> "message" : "Driver successfully submitted as driver-20170926211038-0003",
>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>>> "serverSparkVersion" : "2.2.0",
>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>>> "submissionId" : "driver-20170926211038-0003",
>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>>> "success" : true
>>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: }
>>> 2017-09-26T21:10:45,701 DEBUG [IPC Client (425015667) connection to
>>> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
>>> Client (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
>>> from dwr: closed
>>> 2017-09-26T21:10:45,702 DEBUG [IPC Client (425015667) connection to
>>> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
>>> Client (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
>>> from dwr: stopped, remaining connections 0
>>> 2017-09-26T21:12:06,719 ERROR [2337b36e-86ca-47cd-b1ae-f0b32571b97e
>>> main] client.SparkClientImpl: Timed out waiting for client to connect.
>>> Possible reasons include network issues, errors in remote driver or the
>>> cluster has no available resources, etc.
>>> Please check YARN or Spark driver's logs for further information.
>>> java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException:
>>> Timed out waiting for client connection.
>>>         at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>>> ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
>>>         at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:108)
>>> [hive-exec-2.3.0.jar:2.3.0]
>>> at 
>>> 

Re: Hive query starts own session for LLAP

2017-09-27 Thread Gopal Vijayaraghavan

> Now we need an explanation of "map" -- can you supply it?

The "map" mode runs all tasks with a TableScan operator inside LLAP instances 
and all other tasks in Tez YARN containers. This is the LLAP + Tez hybrid mode, 
which introduces some complexity in debugging a single query.

The "only" mode is so far the best option since, the LlapDecider runs very late 
in the optimizer order the earlier optimizers need to hedge their bets on 
whether LLAP will finally be used for a vertex or not. The "only" mode sort of 
short-cuts that by assuring all optimizers that it is "LLAP or Bust!".
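
If you want to experiment with the modes per-session, it is just a config
knob, e.g. (the table name here is only a stand-in):

  # force every vertex of this one query into LLAP
  hive --hiveconf hive.llap.execution.mode=only \
       -e 'select count(*) from my_db.my_table;'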

Cheers,
Gopal





Re: hive on spark - why is it so hard?

2017-09-27 Thread Stephen Sprague
Thanks. I haven't had a chance to dig into this again today, but I do
appreciate the pointer. I'll keep you posted.

On Wed, Sep 27, 2017 at 10:14 AM, Sahil Takiar 
wrote:

> You can try increasing the value of hive.spark.client.connect.timeout.
> Would also suggest taking a look at the HoS Remote Driver logs. The driver
> gets launched in a YARN container (assuming you are running Spark in
> yarn-client mode), so you just have to find the logs for that container.
>
> --Sahil
>
> On Tue, Sep 26, 2017 at 9:17 PM, Stephen Sprague 
> wrote:
>
>> I _seem_ to be getting closer. Maybe it's just wishful thinking. Here's
>> where I'm at now.
>>
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> 17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with
>> CreateSubmissionResponse:
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: {
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "action" : "CreateSubmissionResponse",
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "message" : "Driver successfully submitted as driver-20170926211038-0003",
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "serverSparkVersion" : "2.2.0",
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "submissionId" : "driver-20170926211038-0003",
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
>> "success" : true
>> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: }
>> 2017-09-26T21:10:45,701 DEBUG [IPC Client (425015667) connection to
>> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
>> Client (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
>> from dwr: closed
>> 2017-09-26T21:10:45,702 DEBUG [IPC Client (425015667) connection to
>> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
>> Client (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
>> from dwr: stopped, remaining connections 0
>> 2017-09-26T21:12:06,719 ERROR [2337b36e-86ca-47cd-b1ae-f0b32571b97e
>> main] client.SparkClientImpl: Timed out waiting for client to connect.
>> Possible reasons include network issues, errors in remote driver or the
>> cluster has no available resources, etc.
>> Please check YARN or Spark driver's logs for further information.
>> java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException:
>> Timed out waiting for client connection.
>>         at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
>>         at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:108) [hive-exec-2.3.0.jar:2.3.0]
>>         at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) [hive-exec-2.3.0.jar:2.3.0]
>>         at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:101) [hive-exec-2.3.0.jar:2.3.0]
>>         at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:97) [hive-exec-2.3.0.jar:2.3.0]
>>         at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:73) [hive-exec-2.3.0.jar:2.3.0]
>>         at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) [hive-exec-2.3.0.jar:2.3.0]
>>         at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:115) [hive-exec-2.3.0.jar:2.3.0]
>>         at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:126) [hive-exec-2.3.0.jar:2.3.0]
>>         at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.getSparkMemoryAndCores(SetSparkReducerParallelism.java:236) [hive-exec-2.3.0.jar:2.3.0]
>>
>>
>> I'll dig some more tomorrow.
>>
>> On Tue, Sep 26, 2017 at 8:23 PM, Stephen Sprague 
>> wrote:
>>
>>> Oh, I missed Gopal's reply. Oy... that sounds foreboding. I'll keep
>>> you posted on my progress.
>>>
>>> On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan wrote:
>>>
 Hi,

 > org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a
 > spark session: org.apache.hadoop.hive.ql.metadata.HiveException:
 > Failed to create spark client.

 I get inexplicable errors with Hive-on-Spark unless I do a three step
 build.

 Build Hive first, use that version to build Spark, use that Spark
 version to rebuild Hive.

 I have to do this to make it work because Spark contains Hive jars and
 Hive contains Spark jars in the class-path.

 And specifically I have to edit the pom.xml files, 

Re: hive on spark - why is it so hard?

2017-09-27 Thread Sahil Takiar
You can try increasing the value of hive.spark.client.connect.timeout.
Would also suggest taking a look at the HoS Remote Driver logs. The driver
gets launched in a YARN container (assuming you are running Spark in
yarn-client mode), so you just have to find the logs for that container.
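
For example (the default timeout is 1000ms; the 30s below is just an
illustration, and the YARN application id is a placeholder - put your own
query in -e):

  # bump the Hive-on-Spark client handshake timeout for one session
  hive --hiveconf hive.spark.client.connect.timeout=30000ms -e '...'

  # then grab the Remote Driver's container logs from YARN
  yarn logs -applicationId application_15064XXXXXXXX_0001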

--Sahil

On Tue, Sep 26, 2017 at 9:17 PM, Stephen Sprague  wrote:

> I _seem_ to be getting closer. Maybe it's just wishful thinking. Here's
> where I'm at now.
>
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> 17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with
> CreateSubmissionResponse:
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: {
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "action" : "CreateSubmissionResponse",
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "message" : "Driver successfully submitted as driver-20170926211038-0003",
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "serverSparkVersion" : "2.2.0",
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "submissionId" : "driver-20170926211038-0003",
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl:
> "success" : true
> 2017-09-26T21:10:38,892  INFO [stderr-redir-1] client.SparkClientImpl: }
> 2017-09-26T21:10:45,701 DEBUG [IPC Client (425015667) connection to
> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
> Client (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
> from dwr: closed
> 2017-09-26T21:10:45,702 DEBUG [IPC Client (425015667) connection to
> dwrdevnn1.sv2.trulia.com/172.19.73.136:8020 from dwr] ipc.Client: IPC
> Client (425015667) connection to dwrdevnn1.sv2.trulia.com/172.19.73.136:8020
> from dwr: stopped, remaining connections 0
> 2017-09-26T21:12:06,719 ERROR [2337b36e-86ca-47cd-b1ae-f0b32571b97e main]
> client.SparkClientImpl: Timed out waiting for client to connect.
> Possible reasons include network issues, errors in remote driver or the
> cluster has no available resources, etc.
> Please check YARN or Spark driver's logs for further information.
> java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException:
> Timed out waiting for client connection.
>         at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
>         at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:108) [hive-exec-2.3.0.jar:2.3.0]
>         at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) [hive-exec-2.3.0.jar:2.3.0]
>         at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:101) [hive-exec-2.3.0.jar:2.3.0]
>         at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:97) [hive-exec-2.3.0.jar:2.3.0]
>         at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:73) [hive-exec-2.3.0.jar:2.3.0]
>         at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) [hive-exec-2.3.0.jar:2.3.0]
>         at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:115) [hive-exec-2.3.0.jar:2.3.0]
>         at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:126) [hive-exec-2.3.0.jar:2.3.0]
>         at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.getSparkMemoryAndCores(SetSparkReducerParallelism.java:236) [hive-exec-2.3.0.jar:2.3.0]
>
>
> I'll dig some more tomorrow.
>
> On Tue, Sep 26, 2017 at 8:23 PM, Stephen Sprague 
> wrote:
>
>> Oh, I missed Gopal's reply. Oy... that sounds foreboding. I'll keep you
>> posted on my progress.
>>
>> On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan 
>> wrote:
>>
>>> Hi,
>>>
>>> > org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a
>>> > spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed
>>> > to create spark client.
>>>
>>> I get inexplicable errors with Hive-on-Spark unless I do a three step
>>> build.
>>>
>>> Build Hive first, use that version to build Spark, use that Spark
>>> version to rebuild Hive.
>>>
>>> I have to do this to make it work because Spark contains Hive jars and
>>> Hive contains Spark jars in the class-path.
>>>
>>> And specifically I have to edit the pom.xml files, instead of passing in
>>> params with -Dspark.version, because the installed pom files don't get
>>> replacements from the build args.
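>>>
>>> Roughly, as a sketch (the directory names, profiles, and flags below are
>>> placeholders - the real point is the manual pom edits):
>>>
>>>   # 1. build + install Hive so its jars land in the local maven repo
>>>   (cd hive && mvn clean install -DskipTests)
>>>
>>>   # 2. build Spark against that Hive: hand-edit <hive.version> in
>>>   #    spark's pom.xml first, then package a distribution
>>>   (cd spark && ./dev/make-distribution.sh --name custom --tgz -Pyarn)
>>>
>>>   # 3. rebuild Hive against that Spark: hand-edit <spark.version> in
>>>   #    hive's pom.xml first
>>>   (cd hive && mvn clean install -DskipTests)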
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>
>


-- 
Sahil Takiar
Software Engineer at Cloudera
takiar.sa...@gmail.com | (510) 673-0309