Re: Zeppelin - Spark Driver location

2018-03-14 Thread Jeff Zhang
spark-submit only runs when you run the first paragraph that uses the spark
interpreter. After that, each paragraph sends its code to the running spark
app for execution.
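
As a rough illustration of the behavior described above (launch once on the
first paragraph, then reuse the running application), here is a toy Python
model. It is not Zeppelin's code: LazyInterpreter, fake_launch, and the
paragraph strings are hypothetical names made up for this sketch, and the
launcher is only a stand-in for the real spark-submit call.

```python
# Toy model only, not Zeppelin's implementation. All names are hypothetical.
class LazyInterpreter:
    """Launches a backing app on first use, then reuses it."""

    def __init__(self, launch):
        self._launch = launch   # callable standing in for "run spark-submit"
        self._app = None        # handle to the running Spark application
        self.launch_count = 0

    def run_paragraph(self, code):
        if self._app is None:
            # First paragraph: the only point where the launcher runs.
            self._app = self._launch()
            self.launch_count += 1
        # Every paragraph, including the first, goes to the same app.
        return self._app(code)


def fake_launch():
    """Stand-in for spark-submit; returns a callable that 'executes' code."""
    executed = []

    def app(code):
        executed.append(code)
        return f"ran paragraph {len(executed)}: {code}"

    return app


interp = LazyInterpreter(fake_launch)
first = interp.run_paragraph("sc.version")
second = interp.run_paragraph("sc.parallelize(range(3)).count()")
print(first)
print(second)
print("launches:", interp.launch_count)
```

Running two paragraphs leaves the launch counter at one, which is the point
of the model: the second paragraph never triggers another launch.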

>>> Also spark standalone cluster mode should work even before this new
>>> release, right?
I didn't verify that, and I'm not sure whether anyone else has verified it.


On Thu, Mar 15, 2018 at 4:32 AM, ankit jain wrote:

> Also spark standalone cluster moder should work even before this new
> release, right?
>
> On Wed, Mar 14, 2018 at 8:43 AM, ankit jain 
> wrote:
>
>> Hi Jhang,
>> Not clear on that - I thought spark-submit was done when we run a
>> paragraph, how does the .sh file come into play?
>>
>> Thanks
>> Ankit
>>
>> On Tue, Mar 13, 2018 at 5:43 PM, Jeff Zhang  wrote:
>>
>>>
>>> spark-submit is called in bin/interpreter.sh,  I didn't try standalone
>>> cluster mode. It is expected to run driver in separate host, but didn't
>>> guaranteed zeppelin support this.
>>>
>>> On Wed, Mar 14, 2018 at 8:34 AM, Ankit Jain wrote:
>>>
 Hi Jhang,
 What is the expected behavior with standalone cluster mode? Should we
 see separate driver processes in the cluster(one per user) or multiple
 SparkSubmit processes?

 I was trying to dig in Zeppelin code & didn’t see where Zeppelin does
 the Spark-submit to the cluster? Can you please point to it?

 Thanks
 Ankit

 On Mar 13, 2018, at 5:25 PM, Jeff Zhang  wrote:


 ZEPPELIN-2898  is
 for yarn cluster model.  And Zeppelin have integration test for yarn mode,
 so guaranteed it would work. But don't' have test for standalone, so not
 sure the behavior of standalone mode.


 On Wed, Mar 14, 2018 at 8:06 AM, Ruslan Dautkhanov wrote:

> https://github.com/apache/zeppelin/pull/2577 pronounces yarn-cluster
> in it's title so I assume it's only yarn-cluster.
> Never used standalone-cluster myself.
>
> Which distro of Hadoop do you use?
> Cloudera desupported standalone in CDH 5.5 and will remove in CDH 6.
>
> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>
>
>
> --
> Ruslan Dautkhanov
>
> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
> jhonderson2...@gmail.com> wrote:
>
>> Does this new feature work only for yarn-cluster ?. Or for spark
>> standalone too ?
>>
> On Tue, Mar 13, 2018 at 6:34 PM, Ruslan Dautkhanov <
>> dautkha...@gmail.com> wrote:
>>
> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>
>>> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged end
>>> of September so not sure if you have that.
>>>
>>> Check out
>>> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
>>> how to set this up.
>>>
>>>
>>> --
>>> Ruslan Dautkhanov
>>>
>>> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
>>> jhonderson2...@gmail.com> wrote:
>>>
>> Hi zeppelin users !

 I am working with zeppelin pointing to a spark in standalone. I am
 trying to figure out a way to make zeppelin runs the spark driver 
 outside
 of client process that submits the application.

 According with the documentation (
 http://spark.apache.org/docs/2.1.1/spark-standalone.html):

 *For standalone clusters, Spark currently supports two deploy
 modes. In client mode, the driver is launched in the same process as 
 the
 client that submits the application. In cluster mode, however, the 
 driver
 is launched from one of the Worker processes inside the cluster, and 
 the
 client process exits as soon as it fulfills its responsibility of
 submitting the application without waiting for the application to 
 finish.*

 The problem is that, even when I set the properties for
 spark-standalone cluster and deploy mode in cluster, the driver still 
 run
 inside zeppelin machine (according with spark UI/executors page). 
 These are
 properties that I am setting for the spark interpreter:

 master: spark://:7077
 spark.submit.deployMode: cluster
 spark.executor.memory: 16g

 Any ideas would be appreciated.

 Thank you

 Details:
 Spark version: 2.1.1
 Zeppelin version: 0.8.0 (merged at September 2017 version)

>>>
>>
>>
>> --
>> Thanks & Regards,
>> Ankit.
>>
>
>
>
> --
> Thanks & Regards,
> Ankit.
>


Re: Zeppelin - Spark Driver location

2018-03-14 Thread ankit jain
Also spark standalone cluster mode should work even before this new
release, right?

On Wed, Mar 14, 2018 at 8:43 AM, ankit jain  wrote:

> Hi Jhang,
> Not clear on that - I thought spark-submit was done when we run a
> paragraph, how does the .sh file come into play?
>
> Thanks
> Ankit
>
> On Tue, Mar 13, 2018 at 5:43 PM, Jeff Zhang  wrote:
>
>>
>> spark-submit is called in bin/interpreter.sh,  I didn't try standalone
>> cluster mode. It is expected to run driver in separate host, but didn't
>> guaranteed zeppelin support this.
>>
>> On Wed, Mar 14, 2018 at 8:34 AM, Ankit Jain wrote:
>>
>>> Hi Jhang,
>>> What is the expected behavior with standalone cluster mode? Should we
>>> see separate driver processes in the cluster(one per user) or multiple
>>> SparkSubmit processes?
>>>
>>> I was trying to dig in Zeppelin code & didn’t see where Zeppelin does
>>> the Spark-submit to the cluster? Can you please point to it?
>>>
>>> Thanks
>>> Ankit
>>>
>>> On Mar 13, 2018, at 5:25 PM, Jeff Zhang  wrote:
>>>
>>>
>>> ZEPPELIN-2898  is
>>> for yarn cluster model.  And Zeppelin have integration test for yarn mode,
>>> so guaranteed it would work. But don't' have test for standalone, so not
>>> sure the behavior of standalone mode.
>>>
>>>
>>> On Wed, Mar 14, 2018 at 8:06 AM, Ruslan Dautkhanov wrote:
>>>
 https://github.com/apache/zeppelin/pull/2577 pronounces yarn-cluster
 in it's title so I assume it's only yarn-cluster.
 Never used standalone-cluster myself.

 Which distro of Hadoop do you use?
 Cloudera desupported standalone in CDH 5.5 and will remove in CDH 6.
 https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html



 --
 Ruslan Dautkhanov

 On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
 jhonderson2...@gmail.com> wrote:

> Does this new feature work only for yarn-cluster ?. Or for spark
> standalone too ?
>
 On Tue, Mar 13, 2018 at 6:34 PM, Ruslan Dautkhanov <
> dautkha...@gmail.com> wrote:
>
 > Zeppelin version: 0.8.0 (merged at September 2017 version)
>>
>> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged end
>> of September so not sure if you have that.
>>
>> Check out
>> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
>> how to set this up.
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
>> jhonderson2...@gmail.com> wrote:
>>
> Hi zeppelin users !
>>>
>>> I am working with zeppelin pointing to a spark in standalone. I am
>>> trying to figure out a way to make zeppelin runs the spark driver 
>>> outside
>>> of client process that submits the application.
>>>
>>> According with the documentation (
>>> http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>>>
>>> *For standalone clusters, Spark currently supports two deploy modes.
>>> In client mode, the driver is launched in the same process as the client
>>> that submits the application. In cluster mode, however, the driver is
>>> launched from one of the Worker processes inside the cluster, and the
>>> client process exits as soon as it fulfills its responsibility of
>>> submitting the application without waiting for the application to 
>>> finish.*
>>>
>>> The problem is that, even when I set the properties for
>>> spark-standalone cluster and deploy mode in cluster, the driver still 
>>> run
>>> inside zeppelin machine (according with spark UI/executors page). These 
>>> are
>>> properties that I am setting for the spark interpreter:
>>>
>>> master: spark://:7077
>>> spark.submit.deployMode: cluster
>>> spark.executor.memory: 16g
>>>
>>> Any ideas would be appreciated.
>>>
>>> Thank you
>>>
>>> Details:
>>> Spark version: 2.1.1
>>> Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>
>>
>
>
> --
> Thanks & Regards,
> Ankit.
>



-- 
Thanks & Regards,
Ankit.


Re: Zeppelin - Spark Driver location

2018-03-14 Thread ankit jain
Hi Zhang,
I'm not clear on that. I thought spark-submit was run each time we run a
paragraph. How does the .sh file come into play?

Thanks
Ankit

On Tue, Mar 13, 2018 at 5:43 PM, Jeff Zhang  wrote:

>
> spark-submit is called in bin/interpreter.sh,  I didn't try standalone
> cluster mode. It is expected to run driver in separate host, but didn't
> guaranteed zeppelin support this.
>
> On Wed, Mar 14, 2018 at 8:34 AM, Ankit Jain wrote:
>
>> Hi Jhang,
>> What is the expected behavior with standalone cluster mode? Should we see
>> separate driver processes in the cluster(one per user) or multiple
>> SparkSubmit processes?
>>
>> I was trying to dig in Zeppelin code & didn’t see where Zeppelin does the
>> Spark-submit to the cluster? Can you please point to it?
>>
>> Thanks
>> Ankit
>>
>> On Mar 13, 2018, at 5:25 PM, Jeff Zhang  wrote:
>>
>>
>> ZEPPELIN-2898  is
>> for yarn cluster model.  And Zeppelin have integration test for yarn mode,
>> so guaranteed it would work. But don't' have test for standalone, so not
>> sure the behavior of standalone mode.
>>
>>
>> On Wed, Mar 14, 2018 at 8:06 AM, Ruslan Dautkhanov wrote:
>>
>>> https://github.com/apache/zeppelin/pull/2577 pronounces yarn-cluster in
>>> it's title so I assume it's only yarn-cluster.
>>> Never used standalone-cluster myself.
>>>
>>> Which distro of Hadoop do you use?
>>> Cloudera desupported standalone in CDH 5.5 and will remove in CDH 6.
>>> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>>>
>>>
>>>
>>> --
>>> Ruslan Dautkhanov
>>>
>>> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
>>> jhonderson2...@gmail.com> wrote:
>>>
 Does this new feature work only for yarn-cluster ?. Or for spark
 standalone too ?

>>> On Tue, Mar 13, 2018 at 6:34 PM, Ruslan Dautkhanov <
 dautkha...@gmail.com> wrote:

>>> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>
> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged end of
> September so not sure if you have that.
>
> Check out
> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
> how to set this up.
>
>
> --
> Ruslan Dautkhanov
>
> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
> jhonderson2...@gmail.com> wrote:
>
 Hi zeppelin users !
>>
>> I am working with zeppelin pointing to a spark in standalone. I am
>> trying to figure out a way to make zeppelin runs the spark driver outside
>> of client process that submits the application.
>>
>> According with the documentation (
>> http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>>
>> *For standalone clusters, Spark currently supports two deploy modes.
>> In client mode, the driver is launched in the same process as the client
>> that submits the application. In cluster mode, however, the driver is
>> launched from one of the Worker processes inside the cluster, and the
>> client process exits as soon as it fulfills its responsibility of
>> submitting the application without waiting for the application to 
>> finish.*
>>
>> The problem is that, even when I set the properties for
>> spark-standalone cluster and deploy mode in cluster, the driver still run
>> inside zeppelin machine (according with spark UI/executors page). These 
>> are
>> properties that I am setting for the spark interpreter:
>>
>> master: spark://:7077
>> spark.submit.deployMode: cluster
>> spark.executor.memory: 16g
>>
>> Any ideas would be appreciated.
>>
>> Thank you
>>
>> Details:
>> Spark version: 2.1.1
>> Zeppelin version: 0.8.0 (merged at September 2017 version)
>>
>


-- 
Thanks & Regards,
Ankit.


Re: Zeppelin - Spark Driver location

2018-03-13 Thread Vannson, Raphael
Hello Jhon,

Conceptually this makes sense, since Zeppelin creates a spark application for 
the execution runtime underneath its frontend process.

Having said this, depending on how Zeppelin is implemented, it might be 
required for the driver to be collocated with the zeppelin process on the same 
host (remember the Zeppelin notebook process needs to “talk” to the spark 
driver process, this might be done via a child process).
I can certainly see how a collocated design would be simpler to implement for 
the Zeppelin contributors, who may have considered the functionality you 
describe but deferred it to a later release.

So this is not a definitive answer (I don’t know the actual answer) but I would 
not expect this kind of setup to be supported yet.
(I tried to make it work and could not get the spark kernel to start so I just 
reverted to a client deploy mode instead of cluster – since this option was 
acceptable to me).
I would be curious to see whether that is possible, though, and how it would be 
configured.

I hope this helps (a bit).

Best,
Raphael

From: Jhon Anderson Cardenas Diaz 
Reply-To: "us...@zeppelin.apache.org" 
Date: Tuesday, March 13, 2018 at 4:24 PM
To: "dev@zeppelin.apache.org" , 
"us...@zeppelin.apache.org" 
Subject: Zeppelin - Spark Driver location

Hi zeppelin users !

I am working with zeppelin pointing to a spark in standalone. I am trying to 
figure out a way to make zeppelin runs the spark driver outside of client 
process that submits the application.

According with the documentation 
(http://spark.apache.org/docs/2.1.1/spark-standalone.html):

For standalone clusters, Spark currently supports two deploy modes. In client 
mode, the driver is launched in the same process as the client that submits the 
application. In cluster mode, however, the driver is launched from one of the 
Worker processes inside the cluster, and the client process exits as soon as it 
fulfills its responsibility of submitting the application without waiting for 
the application to finish.

The problem is that, even when I set the properties for spark-standalone 
cluster and deploy mode in cluster, the driver still run inside zeppelin 
machine (according with spark UI/executors page). These are properties that I 
am setting for the spark interpreter:

master: spark://:7077
spark.submit.deployMode: cluster
spark.executor.memory: 16g

Any ideas would be appreciated.

Thank you

Details:
Spark version: 2.1.1
Zeppelin version: 0.8.0 (merged at September 2017 version)


Re: Zeppelin - Spark Driver location

2018-03-13 Thread Ruslan Dautkhanov
 > Zeppelin version: 0.8.0 (merged at September 2017 version)

https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged at the end of
September, so I'm not sure whether your build includes it.

Check out
https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235 for how to
set this up.



-- 
Ruslan Dautkhanov

On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
jhonderson2...@gmail.com> wrote:

> Hi zeppelin users !
>
> I am working with zeppelin pointing to a spark in standalone. I am trying
> to figure out a way to make zeppelin runs the spark driver outside of
> client process that submits the application.
>
> According with the documentation (
> http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>
> *For standalone clusters, Spark currently supports two deploy modes.
> In client mode, the driver is launched in the same process as the client
> that submits the application. In cluster mode, however, the driver is
> launched from one of the Worker processes inside the cluster, and the
> client process exits as soon as it fulfills its responsibility of
> submitting the application without waiting for the application to finish.*
>
> The problem is that, even when I set the properties for spark-standalone
> cluster and deploy mode in cluster, the driver still run inside zeppelin
> machine (according with spark UI/executors page). These are properties that
> I am setting for the spark interpreter:
>
> master: spark://:7077
> spark.submit.deployMode: cluster
> spark.executor.memory: 16g
>
> Any ideas would be appreciated.
>
> Thank you
>
> Details:
> Spark version: 2.1.1
> Zeppelin version: 0.8.0 (merged at September 2017 version)
>


Re: Zeppelin - Spark Driver location

2018-03-13 Thread Ankit Jain
Hi Zhang,
What is the expected behavior with standalone cluster mode? Should we see 
separate driver processes in the cluster (one per user) or multiple SparkSubmit 
processes?

I was digging through the Zeppelin code and didn't see where Zeppelin does the 
spark-submit to the cluster. Can you please point me to it?

Thanks
Ankit

> On Mar 13, 2018, at 5:25 PM, Jeff Zhang  wrote:
> 
> 
> ZEPPELIN-2898 is for yarn cluster model.  And Zeppelin have integration test 
> for yarn mode, so guaranteed it would work. But don't' have test for 
> standalone, so not sure the behavior of standalone mode. 
> 
> 
> On Wed, Mar 14, 2018 at 8:06 AM, Ruslan Dautkhanov wrote:
>> https://github.com/apache/zeppelin/pull/2577 pronounces yarn-cluster in it's 
>> title so I assume it's only yarn-cluster.
>> Never used standalone-cluster myself. 
>> 
>> Which distro of Hadoop do you use?
>> Cloudera desupported standalone in CDH 5.5 and will remove in CDH 6.
>> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>> 
>> 
>> 
>> -- 
>> Ruslan Dautkhanov
>> 
>>> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz 
>>>  wrote:
>> 
>>> Does this new feature work only for yarn-cluster ?. Or for spark standalone 
>>> too ?
>> 
>>> On Tue, Mar 13, 2018 at 6:34 PM, Ruslan Dautkhanov 
>>> wrote:
>> 
 > Zeppelin version: 0.8.0 (merged at September 2017 version)
 
 https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged end of 
 September so not sure if you have that.
 
 Check out 
 https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235 how to 
 set this up.
 
>> 
 
 -- 
 Ruslan Dautkhanov
 
>> 
> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz 
>  wrote:
>> 
> Hi zeppelin users !
> 
> I am working with zeppelin pointing to a spark in standalone. I am trying 
> to figure out a way to make zeppelin runs the spark driver outside of 
> client process that submits the application.
> 
> According with the documentation 
> (http://spark.apache.org/docs/2.1.1/spark-standalone.html):
> 
> For standalone clusters, Spark currently supports two deploy modes. In 
> client mode, the driver is launched in the same process as the client 
> that submits the application. In cluster mode, however, the driver is 
> launched from one of the Worker processes inside the cluster, and the 
> client process exits as soon as it fulfills its responsibility of 
> submitting the application without waiting for the application to finish.
> 
> The problem is that, even when I set the properties for spark-standalone 
> cluster and deploy mode in cluster, the driver still run inside zeppelin 
> machine (according with spark UI/executors page). These are properties 
> that I am setting for the spark interpreter:
> 
> master: spark://:7077
> spark.submit.deployMode: cluster
> spark.executor.memory: 16g
> 
> Any ideas would be appreciated.
> 
> Thank you
> 
> Details:
> Spark version: 2.1.1
> Zeppelin version: 0.8.0 (merged at September 2017 version)


Re: Zeppelin - Spark Driver location

2018-03-13 Thread Jeff Zhang
spark-submit is called in bin/interpreter.sh. I didn't try standalone
cluster mode. It is expected to run the driver on a separate host, but there is
no guarantee that Zeppelin supports this.
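
To make the spark-submit step a bit more concrete, here is a hedged Python
sketch of how the interpreter properties from the original question would map
onto a plain spark-submit command line. It does not reproduce what
bin/interpreter.sh actually does. It only uses the standard spark-submit
options --master, --deploy-mode, and --conf. The function name
spark_submit_args and the resource name "app.jar" are hypothetical, and the
master host is a placeholder because it is elided in the thread.

```python
import shlex


def spark_submit_args(props, app_resource):
    """Build a spark-submit argument list from interpreter-style properties."""
    props = dict(props)  # copy so the caller's dict is left untouched
    args = ["spark-submit"]
    master = props.pop("master", None)
    if master:
        args += ["--master", master]
    deploy_mode = props.pop("spark.submit.deployMode", None)
    if deploy_mode:
        args += ["--deploy-mode", deploy_mode]
    # Any remaining property is passed through as a generic --conf pair.
    for key in sorted(props):
        args += ["--conf", f"{key}={props[key]}"]
    args.append(app_resource)
    return args


props = {
    "master": "spark://<master-host>:7077",  # placeholder: host elided in the thread
    "spark.submit.deployMode": "cluster",
    "spark.executor.memory": "16g",
}
args = spark_submit_args(props, "app.jar")  # "app.jar" is a hypothetical resource
print(" ".join(shlex.quote(a) for a in args))
```

With a plain spark-submit like this, cluster deploy mode asks the standalone
master to start the driver on a worker. Whether Zeppelin's own launch path
honors the same property is exactly the open question in this thread.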

On Wed, Mar 14, 2018 at 8:34 AM, Ankit Jain wrote:

> Hi Jhang,
> What is the expected behavior with standalone cluster mode? Should we see
> separate driver processes in the cluster(one per user) or multiple
> SparkSubmit processes?
>
> I was trying to dig in Zeppelin code & didn’t see where Zeppelin does the
> Spark-submit to the cluster? Can you please point to it?
>
> Thanks
> Ankit
>
> On Mar 13, 2018, at 5:25 PM, Jeff Zhang  wrote:
>
>
> ZEPPELIN-2898  is
> for yarn cluster model.  And Zeppelin have integration test for yarn mode,
> so guaranteed it would work. But don't' have test for standalone, so not
> sure the behavior of standalone mode.
>
>
> On Wed, Mar 14, 2018 at 8:06 AM, Ruslan Dautkhanov wrote:
>
>> https://github.com/apache/zeppelin/pull/2577 pronounces yarn-cluster in
>> it's title so I assume it's only yarn-cluster.
>> Never used standalone-cluster myself.
>>
>> Which distro of Hadoop do you use?
>> Cloudera desupported standalone in CDH 5.5 and will remove in CDH 6.
>>
>> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
>> jhonderson2...@gmail.com> wrote:
>>
>>> Does this new feature work only for yarn-cluster ?. Or for spark
>>> standalone too ?
>>>
>> On Tue, Mar 13, 2018 at 6:34 PM, Ruslan Dautkhanov <
>>> dautkha...@gmail.com> wrote:
>>>
>> > Zeppelin version: 0.8.0 (merged at September 2017 version)

 https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged end of
 September so not sure if you have that.

 Check out
 https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
 how to set this up.


 --
 Ruslan Dautkhanov

 On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
 jhonderson2...@gmail.com> wrote:

>>> Hi zeppelin users !
>
> I am working with zeppelin pointing to a spark in standalone. I am
> trying to figure out a way to make zeppelin runs the spark driver outside
> of client process that submits the application.
>
> According with the documentation (
> http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>
> *For standalone clusters, Spark currently supports two deploy modes.
> In client mode, the driver is launched in the same process as the client
> that submits the application. In cluster mode, however, the driver is
> launched from one of the Worker processes inside the cluster, and the
> client process exits as soon as it fulfills its responsibility of
> submitting the application without waiting for the application to finish.*
>
> The problem is that, even when I set the properties for
> spark-standalone cluster and deploy mode in cluster, the driver still run
> inside zeppelin machine (according with spark UI/executors page). These 
> are
> properties that I am setting for the spark interpreter:
>
> master: spark://:7077
> spark.submit.deployMode: cluster
> spark.executor.memory: 16g
>
> Any ideas would be appreciated.
>
> Thank you
>
> Details:
> Spark version: 2.1.1
> Zeppelin version: 0.8.0 (merged at September 2017 version)
>



Re: Zeppelin - Spark Driver location

2018-03-13 Thread Jeff Zhang
ZEPPELIN-2898 is for
yarn cluster mode. Zeppelin has an integration test for yarn mode, so
it is guaranteed to work there. But there is no test for standalone, so I'm not
sure about the behavior of standalone mode.


On Wed, Mar 14, 2018 at 8:06 AM, Ruslan Dautkhanov wrote:

> https://github.com/apache/zeppelin/pull/2577 pronounces yarn-cluster in
> it's title so I assume it's only yarn-cluster.
> Never used standalone-cluster myself.
>
> Which distro of Hadoop do you use?
> Cloudera desupported standalone in CDH 5.5 and will remove in CDH 6.
>
> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>
>
>
> --
> Ruslan Dautkhanov
>
> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
> jhonderson2...@gmail.com> wrote:
>
>> Does this new feature work only for yarn-cluster ?. Or for spark
>> standalone too ?
>>
> On Tue, Mar 13, 2018 at 6:34 PM, Ruslan Dautkhanov 
>> wrote:
>>
> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>
>>> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged end of
>>> September so not sure if you have that.
>>>
>>> Check out
>>> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235 how
>>> to set this up.
>>>
>>>
>>> --
>>> Ruslan Dautkhanov
>>>
>>> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
>>> jhonderson2...@gmail.com> wrote:
>>>
>> Hi zeppelin users !

 I am working with zeppelin pointing to a spark in standalone. I am
 trying to figure out a way to make zeppelin runs the spark driver outside
 of client process that submits the application.

 According with the documentation (
 http://spark.apache.org/docs/2.1.1/spark-standalone.html):

 *For standalone clusters, Spark currently supports two deploy modes.
 In client mode, the driver is launched in the same process as the client
 that submits the application. In cluster mode, however, the driver is
 launched from one of the Worker processes inside the cluster, and the
 client process exits as soon as it fulfills its responsibility of
 submitting the application without waiting for the application to finish.*

 The problem is that, even when I set the properties for
 spark-standalone cluster and deploy mode in cluster, the driver still run
 inside zeppelin machine (according with spark UI/executors page). These are
 properties that I am setting for the spark interpreter:

 master: spark://:7077
 spark.submit.deployMode: cluster
 spark.executor.memory: 16g

 Any ideas would be appreciated.

 Thank you

 Details:
 Spark version: 2.1.1
 Zeppelin version: 0.8.0 (merged at September 2017 version)

>>>


Re: Zeppelin - Spark Driver location

2018-03-13 Thread Jhon Anderson Cardenas Diaz
Does this new feature work only for yarn-cluster, or for spark standalone
too?

On Tue, Mar 13, 2018 at 6:34 PM, Ruslan Dautkhanov 
wrote:

> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>
> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged end of
> September so not sure if you have that.
>
> Check out
> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235 how
> to set this up.
>
>
>
> --
> Ruslan Dautkhanov
>
> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
> jhonderson2...@gmail.com> wrote:
>
>> Hi zeppelin users !
>>
>> I am working with zeppelin pointing to a spark in standalone. I am trying
>> to figure out a way to make zeppelin runs the spark driver outside of
>> client process that submits the application.
>>
>> According with the documentation (
>> http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>>
>> *For standalone clusters, Spark currently supports two deploy modes.
>> In client mode, the driver is launched in the same process as the client
>> that submits the application. In cluster mode, however, the driver is
>> launched from one of the Worker processes inside the cluster, and the
>> client process exits as soon as it fulfills its responsibility of
>> submitting the application without waiting for the application to finish.*
>>
>> The problem is that, even when I set the properties for spark-standalone
>> cluster and deploy mode in cluster, the driver still run inside zeppelin
>> machine (according with spark UI/executors page). These are properties that
>> I am setting for the spark interpreter:
>>
>> master: spark://:7077
>> spark.submit.deployMode: cluster
>> spark.executor.memory: 16g
>>
>> Any ideas would be appreciated.
>>
>> Thank you
>>
>> Details:
>> Spark version: 2.1.1
>> Zeppelin version: 0.8.0 (merged at September 2017 version)
>>
>
>


Zeppelin - Spark Driver location

2018-03-13 Thread Jhon Anderson Cardenas Diaz
Hi zeppelin users !

I am working with Zeppelin pointed at a Spark standalone cluster. I am trying
to figure out a way to make Zeppelin run the Spark driver outside of the
client process that submits the application.

According to the documentation (
http://spark.apache.org/docs/2.1.1/spark-standalone.html):

*For standalone clusters, Spark currently supports two deploy modes.
In client mode, the driver is launched in the same process as the client
that submits the application. In cluster mode, however, the driver is
launched from one of the Worker processes inside the cluster, and the
client process exits as soon as it fulfills its responsibility of
submitting the application without waiting for the application to finish.*

The problem is that, even when I set the properties for a Spark standalone
cluster and set the deploy mode to cluster, the driver still runs on the Zeppelin
machine (according to the Spark UI executors page). These are the properties that
I am setting for the Spark interpreter:

master: spark://:7077
spark.submit.deployMode: cluster
spark.executor.memory: 16g
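
One way to double-check where the driver ends up is to look at the effective
Spark configuration from inside a paragraph. The sketch below is a minimal,
hedged example: describe_driver is a hypothetical helper, the sample values
are made up, and it assumes that in a PySpark paragraph the pairs would come
from sc.getConf().getAll().

```python
def describe_driver(conf_pairs):
    """Summarise deploy mode and driver host from (key, value) conf pairs."""
    conf = dict(conf_pairs)
    # spark-submit defaults to client mode when no deploy mode is set.
    mode = conf.get("spark.submit.deployMode", "client")
    host = conf.get("spark.driver.host", "unknown")
    return f"deployMode={mode}, driver host={host}"


# In a PySpark paragraph one would pass sc.getConf().getAll() instead.
# These sample pairs are invented for illustration.
sample = [
    ("spark.submit.deployMode", "client"),
    ("spark.driver.host", "zeppelin-host"),  # hypothetical host name
]
summary = describe_driver(sample)
print(summary)
```

If the reported deploy mode is still client, or the driver host is the
Zeppelin machine, the cluster setting did not take effect for the launched
application.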

Any ideas would be appreciated.

Thank you

Details:
Spark version: 2.1.1
Zeppelin version: 0.8.0 (merged at September 2017 version)