Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

2019-04-08 Thread Jeff Zhang
It is supposed to be fixed in 0.9.0-SNAPSHOT as well. If you still hit
this issue in master, then it is a bug; please file a ticket and
describe the details. Thanks

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

2019-04-08 Thread Y. Ethan Guo
I'm still partially hitting this issue in 0.9.0-SNAPSHOT for Spark
interpreters with other names, so I'm not sure the ZEPPELIN-3986 issue
is completely resolved. I'm using multiple Spark interpreters with
different Spark confs that share the same SPARK_SUBMIT_OPTIONS,
including a `--jars` option, and it seems that only one of them picks
up the jars. Shall we follow up on the ticket and see how to fix it?
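
For concreteness, the setup is roughly the following (paths are
placeholders): all Spark interpreters in the group share one
zeppelin-env.sh.

# zeppelin-env.sh shared by the "spark" and "spark_abc" interpreters
export SPARK_SUBMIT_OPTIONS="--jars /path/to/libA.jar,/path/to/libB.jar"

Only the "spark" interpreter ends up with the jars on its classpath. As
an untested workaround, I may try setting
spark.jars=/path/to/libA.jar,/path/to/libB.jar directly in each
interpreter's properties, since that is passed as a Spark conf rather
than through SPARK_SUBMIT_OPTIONS.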

Thanks,
- Ethan

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

2019-04-08 Thread Jeff Zhang
Hi Ethan,

This behavior is not expected. You may be hitting the following issue,
which is fixed in 0.8.2:
https://jira.apache.org/jira/browse/ZEPPELIN-3986

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

2019-04-08 Thread Y. Ethan Guo
Hi Jeff, Dave,

Thanks for the suggestion. I was able to successfully run the Spark
interpreter in yarn cluster mode on another machine running Zeppelin.
The previous problem was probably due to network issues.

I have two observations:
(1) I'm able to use the "--jars" option in SPARK_SUBMIT_OPTIONS with the
default "spark" interpreter configured for yarn cluster mode. I verified
that the jars are pushed to the driver and executors by successfully
running a job that uses classes from those jars. However, if I create a
new "spark_abc" interpreter under the spark interpreter group, the new
interpreter doesn't seem to pick up SPARK_SUBMIT_OPTIONS and the --jars
option, leading to errors about packages/classes in the jars being
inaccessible. (A quick way to check what each interpreter actually
received is sketched below.)

(2) Once I restart the Spark interpreters in the interpreter settings,
the corresponding Spark jobs on the yarn cluster first transition from
the "RUNNING" state back to the "ACCEPTED" state, and then end up in the
"FAILED" state.

I'm wondering whether the above behavior is expected and whether these
are known limitations of the current 0.9.0-SNAPSHOT version.
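
For reference, here is how I checked what each interpreter actually
received (the log location is an assumption and may differ by setup;
libA.jar stands in for one of my jar names):

# Look for the --jars argument in the spark-submit command that Zeppelin
# logged when launching each interpreter:
grep -- "--jars" "$ZEPPELIN_HOME"/logs/*.log

# Confirm the jars were localized into the YARN containers:
yarn logs -applicationId <application_id> | grep libA.jar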

Thanks,
- Ethan

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

2019-04-07 Thread Dave Boyd
From the connection refused message, I wonder if it is an SSL error. I
note that none of the SSL information (truststore, keystore, etc.) is
set, and I would think the YARN cluster requires some form of
authentication.
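
If SSL were in play, the stores would normally be passed along these
lines (just a sketch with placeholder paths; spark.ssl.* are standard
Spark settings, and whether they apply to this particular connection is
an assumption on my part):

# Append SSL settings to the existing submit options (placeholder paths)
export SPARK_SUBMIT_OPTIONS="$SPARK_SUBMIT_OPTIONS \
  --conf spark.ssl.enabled=true \
  --conf spark.ssl.trustStore=/path/to/truststore.jks \
  --conf spark.ssl.trustStorePassword=changeit"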

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

2019-04-07 Thread Jeff Zhang
It looks like the interpreter process cannot connect to the Zeppelin
server process. I guess it is due to some network issue. Can you check
whether the node in the yarn cluster can connect to the Zeppelin server
host?
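
For example, from the YARN node that ran the interpreter, you could test
the intpEventServerAddress reported in your log (nc is just one option;
any TCP check works):

# Test whether the Zeppelin event server address is reachable from the
# YARN node:
nc -vz 172.17.0.1 45128

# Note: 172.17.0.1 is usually a Docker bridge address, which is
# generally not reachable from other hosts; if so, the interpreter
# running on the cluster cannot call back to the Zeppelin server.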

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

2019-04-07 Thread Y. Ethan Guo
Hi Jeff,

Given that this PR is merged, I'm trying to see if I can run yarn
cluster mode from a master build. I built Zeppelin master at this
commit:

commit 3655c12b875884410224eca5d6155287d51916ac
Author: Jongyoul Lee 
Date:   Mon Apr 1 15:37:57 2019 +0900
[MINOR] Refactor CronJob class (#3335)

While I can successfully run the Spark interpreter in yarn client mode,
I'm having trouble making yarn cluster mode work. Specifically, while
the interpreter job was accepted in yarn, it failed after 1-2 minutes
with the exception below. Do you have any idea why this is happening?

DEBUG [2019-04-07 06:57:00,314] ({main} Logging.scala[logDebug]:58) -
Created SSL options for fs: SSLOptions{enabled=false, keyStore=None,
keyStorePassword=None, trustStore=None, trustStorePassword=None,
protocol=None, enabledAlgorithms=Set()}
 INFO [2019-04-07 06:57:00,323] ({main} Logging.scala[logInfo]:54) -
Starting the user application in a separate Thread
 INFO [2019-04-07 06:57:00,350] ({main} Logging.scala[logInfo]:54) -
Waiting for spark context initialization...
 INFO [2019-04-07 06:57:00,403] ({Driver}
RemoteInterpreterServer.java[<init>]:148) - Starting remote interpreter
server on port 0, intpEventServerAddress: 172.17.0.1:45128
ERROR [2019-04-07 06:57:00,408] ({Driver} Logging.scala[logError]:91) -
User class threw exception:
org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused (Connection refused)
org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused (Connection refused)
at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:154)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:139)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.main(RemoteInterpreterServer.java:285)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:635)
Caused by: java.net.ConnectException: Connection refused (Connection
refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
... 8 more

Thanks,
- Ethan

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

2019-02-27 Thread Jeff Zhang
Here's the PR
https://github.com/apache/zeppelin/pull/3308

-- 
Best Regards

Jeff Zhang


Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

2019-02-27 Thread Y. Ethan Guo
Hi All,

I'm trying to use the new yarn cluster mode feature to run Spark 2.4.0
jobs on Zeppelin 0.8.1. I've set the SPARK_HOME, SPARK_SUBMIT_OPTIONS,
and HADOOP_CONF_DIR env variables in zeppelin-env.sh so that the Spark
interpreter can be started in the cluster, and I used `--jars` in
SPARK_SUBMIT_OPTIONS to add local jars. However, when I try to import a
class from those jars in a Spark paragraph, the interpreter complains
that it cannot find the package and class ("<console>:23: error: object
... is not a member of package ..."). It looks like the jars are not
properly shipped.

I followed the instructions here to add the jars, but it seems that
this does not work in cluster mode. The issue appears to be related to
this bug: https://jira.apache.org/jira/browse/ZEPPELIN-3986. Is there
any update on fixing it? What is the right way to add local jars in
yarn cluster mode? Any help is much appreciated.


Here's the SPARK_SUBMIT_OPTIONS I used (packages and jars paths omitted):

export SPARK_SUBMIT_OPTIONS="--driver-memory 12G --packages ... --jars ...
--repositories
https://repository.cloudera.com/artifactory/public/,https://repository.cloudera.com/content/repositories/releases/,http://repo.spring.io/plugins-release/
"

Thanks,
- Ethan
-- 
Best,
- Ethan