Re: Zeppelin - Spark Driver location

Ankit Jain Tue, 13 Mar 2018 18:03:20 -0700

Hi Zhang,
What is the expected behavior in standalone cluster mode? Should we see 
separate driver processes in the cluster (one per user), or multiple SparkSubmit 
processes?


I was digging through the Zeppelin code and didn't see where Zeppelin runs 
spark-submit against the cluster. Can you please point me to it?

Thanks
Ankit

> On Mar 13, 2018, at 5:25 PM, Jeff Zhang <[email protected]> wrote:
> 
> 
> ZEPPELIN-2898 is for yarn-cluster mode. Zeppelin has an integration test for 
> yarn mode, so that is guaranteed to work. But there is no test for 
> standalone, so I'm not sure how standalone mode behaves.
> 
> 
> Ruslan Dautkhanov <[email protected]> wrote on Wed, Mar 14, 2018 at 8:06 AM:
>> https://github.com/apache/zeppelin/pull/2577 mentions yarn-cluster in its 
>> title, so I assume it covers only yarn-cluster.
>> I have never used standalone-cluster myself.
>> 
>> Which distro of Hadoop do you use?
>> Cloudera dropped support for standalone in CDH 5.5 and will remove it in CDH 6.
>> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>> 
>> 
>> 
>> -- 
>> Ruslan Dautkhanov
>> 
>>> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz 
>>> <[email protected]> wrote:
>> 
>>> Does this new feature work only for yarn-cluster, or for Spark standalone 
>>> too?
>> 
>>> On Tue, Mar 13, 2018 at 18:34, Ruslan Dautkhanov <[email protected]> 
>>> wrote:
>> 
>>>> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>> 
>>>> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged at the end of 
>>>> September, so I'm not sure whether your build includes it.
>>>> 
>>>> See 
>>>> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235 for how 
>>>> to set this up.
>>>> 
>> 
>>>> 
>>>> -- 
>>>> Ruslan Dautkhanov
>>>> 
>> 
>>>>> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz 
>>>>> <[email protected]> wrote:
>> 
>>>>> Hi Zeppelin users!
>>>>> 
>>>>> I am working with Zeppelin pointed at a Spark standalone cluster. I am 
>>>>> trying to figure out a way to make Zeppelin run the Spark driver outside 
>>>>> of the client process that submits the application.
>>>>> 
>>>>> According to the documentation 
>>>>> (http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>>>>> 
>>>>> For standalone clusters, Spark currently supports two deploy modes. In 
>>>>> client mode, the driver is launched in the same process as the client 
>>>>> that submits the application. In cluster mode, however, the driver is 
>>>>> launched from one of the Worker processes inside the cluster, and the 
>>>>> client process exits as soon as it fulfills its responsibility of 
>>>>> submitting the application without waiting for the application to finish.
>>>>> 
>>>>> The problem is that, even when I set the properties for a Spark standalone 
>>>>> cluster with the deploy mode set to cluster, the driver still runs on the 
>>>>> Zeppelin machine (according to the Executors page of the Spark UI). These 
>>>>> are the properties that I am setting for the Spark interpreter:
>>>>> 
>>>>> master: spark://<master-name>:7077
>>>>> spark.submit.deployMode: cluster
>>>>> spark.executor.memory: 16g
>>>>> 
>>>>> Any ideas would be appreciated.
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Details:
>>>>> Spark version: 2.1.1
>>>>> Zeppelin version: 0.8.0 (merged at September 2017 version)
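
One way to check whether the standalone cluster itself places the driver on a worker, independently of Zeppelin, is to submit a small test application directly with spark-submit using the same settings as the interpreter properties above. The sketch below only assembles that command line. It is not how Zeppelin launches its interpreter; the function name, the main class, and the jar path are hypothetical placeholders, and only the --master, --deploy-mode, --conf and --class options of spark-submit are assumed.

```python
from typing import Dict, List


def build_spark_submit_args(
    master: str,
    deploy_mode: str,
    conf: Dict[str, str],
    main_class: str,
    app_jar: str,
) -> List[str]:
    """Assemble a spark-submit command line as an argv list (hypothetical helper)."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unsupported deploy mode: {deploy_mode!r}")

    args = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    # Each Spark property is passed as a separate --conf key=value pair.
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    # The application class and jar come last.
    args += ["--class", main_class, app_jar]
    return args


# Values taken from the interpreter properties in the thread; the class and
# jar are placeholders for any small test application.
args = build_spark_submit_args(
    master="spark://<master-name>:7077",
    deploy_mode="cluster",
    conf={"spark.executor.memory": "16g"},
    main_class="com.example.MyApp",   # hypothetical
    app_jar="/path/to/my-app.jar",    # hypothetical
)
print(" ".join(args))
```

If a job submitted this way shows its driver on a worker in the Spark UI while the Zeppelin interpreter's driver stays on the Zeppelin host, that would suggest the deploy mode setting is not being applied by Zeppelin for standalone masters, rather than a cluster configuration problem.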
