Re: Zeppelin - Spark Driver location

Jeff Zhang Tue, 13 Mar 2018 17:44:12 -0700

spark-submit is called in bin/interpreter.sh. I haven't tried standalone
cluster mode. The driver is expected to run on a separate host, but I can't
guarantee that Zeppelin supports this.
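
As a rough illustration (not Zeppelin's actual code), the interpreter
properties from the original question would correspond to spark-submit
arguments roughly as sketched below. The helper name build_spark_submit_args,
the class name org.example.Main, and the jar path are hypothetical; only
--master, --deploy-mode, --conf, and --class are real spark-submit options.

```python
# Hypothetical helper, for illustration only: it maps Zeppelin-style
# interpreter properties onto a spark-submit argument list. It does not
# reproduce what bin/interpreter.sh actually does.
def build_spark_submit_args(props, app_class, app_jar):
    remaining = dict(props)  # copy so the caller's dict is not modified
    args = ["spark-submit"]

    master = remaining.pop("master", None)
    if master:
        args += ["--master", master]

    deploy_mode = remaining.pop("spark.submit.deployMode", None)
    if deploy_mode:
        args += ["--deploy-mode", deploy_mode]

    # Every other property is passed through as a --conf key=value pair.
    for key, value in sorted(remaining.items()):
        args += ["--conf", f"{key}={value}"]

    # The application class and jar come last (both are placeholders here).
    args += ["--class", app_class, app_jar]
    return args


# Properties taken from the original question in this thread.
props = {
    "master": "spark://<master-name>:7077",
    "spark.submit.deployMode": "cluster",
    "spark.executor.memory": "16g",
}
cmd = build_spark_submit_args(props, "org.example.Main", "/path/to/app.jar")
print(" ".join(cmd))
```

Comparing a command line like this with the SparkSubmit process that is
actually running on the Zeppelin host may help show whether the deploy mode
setting is being passed through at all.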


Ankit Jain <[email protected]> wrote on Wed, Mar 14, 2018 at 8:34 AM:

> Hi Jeff,
> What is the expected behavior with standalone cluster mode? Should we see
> separate driver processes in the cluster (one per user) or multiple
> SparkSubmit processes?
>
> I was digging through the Zeppelin code and didn't see where Zeppelin runs
> spark-submit against the cluster. Can you please point me to it?
>
> Thanks
> Ankit
>
> On Mar 13, 2018, at 5:25 PM, Jeff Zhang <[email protected]> wrote:
>
>
> ZEPPELIN-2898 <https://issues.apache.org/jira/browse/ZEPPELIN-2898> is
> for yarn cluster mode. Zeppelin has an integration test for yarn mode,
> so that is guaranteed to work. But there is no test for standalone, so I am
> not sure about the behavior of standalone mode.
>
>
> Ruslan Dautkhanov <[email protected]> wrote on Wed, Mar 14, 2018 at 8:06 AM:
>
>> https://github.com/apache/zeppelin/pull/2577 mentions yarn-cluster in
>> its title, so I assume it's only for yarn-cluster.
>> I have never used standalone-cluster myself.
>>
>> Which distro of Hadoop do you use?
>> Cloudera dropped support for standalone in CDH 5.5 and will remove it in CDH 6.
>>
>> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
>> [email protected]> wrote:
>>
>>> Does this new feature work only for yarn-cluster, or for Spark
>>> standalone too?
>>>
>>> On Tue, Mar 13, 2018 at 18:34, Ruslan Dautkhanov <
>>> [email protected]> wrote:
>>>
>>>> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>>
>>>> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged at the end
>>>> of September, so I'm not sure if you have that.
>>>>
>>>> Check out
>>>> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
>>>> for how to set this up.
>>>>
>>>>
>>>> --
>>>> Ruslan Dautkhanov
>>>>
>>>> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Zeppelin users!
>>>>>
>>>>> I am working with Zeppelin pointing to Spark in standalone mode. I am
>>>>> trying to figure out a way to make Zeppelin run the Spark driver outside
>>>>> of the client process that submits the application.
>>>>>
>>>>> According to the documentation (
>>>>> http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>>>>>
>>>>> *For standalone clusters, Spark currently supports two deploy modes.
>>>>> In client mode, the driver is launched in the same process as the client
>>>>> that submits the application. In cluster mode, however, the driver is
>>>>> launched from one of the Worker processes inside the cluster, and the
>>>>> client process exits as soon as it fulfills its responsibility of
>>>>> submitting the application without waiting for the application to finish.*
>>>>>
>>>>> The problem is that, even when I set the properties for a Spark
>>>>> standalone cluster with the deploy mode set to cluster, the driver still
>>>>> runs on the Zeppelin machine (according to the Executors page of the
>>>>> Spark UI). These are the properties that I am setting for the Spark
>>>>> interpreter:
>>>>>
>>>>> master: spark://<master-name>:7077
>>>>> spark.submit.deployMode: cluster
>>>>> spark.executor.memory: 16g
>>>>>
>>>>> Any ideas would be appreciated.
>>>>>
>>>>> Thank you
>>>>>
>>>>> Details:
>>>>> Spark version: 2.1.1
>>>>> Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>>>
>>>>
