Re: Does Spark uses data locality information from HDFS when running in standalone mode?

Mich Talebzadeh Sun, 05 Jun 2016 06:11:03 -0700

I use YARN because I run Hive on the Spark engine in yarn-cluster mode, plus
other stuff. If I turned off YARN, half of my applications would not work. I
don't see supporting YARN as a great concern. However, you may have other reasons.




Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 5 June 2016 at 13:40, Marco Capuccini <marco.capucc...@farmbio.uu.se>
wrote:

> I meant when running in standalone cluster mode, where Hadoop data nodes
> run on the same nodes where the Spark workers run. I don’t want to support
> YARN as well in my infrastructure, and since I already set up a standalone
> Spark cluster, I was wondering if running only HDFS in the same cluster
> would be enough.
>
> Regards
> Marco
>
> On 05 Jun 2016, at 12:17, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
> Well, in standalone mode you are running your Spark code on one physical
> node, so the assumption would be that there is an HDFS node running on the
> same host.
>
> When you are running Spark in yarn-client mode, YARN is part of
> Hadoop core, and YARN will know about the datanodes from
> $HADOOP_HOME/etc/hadoop/slaves
>
> HTH
>
> Dr Mich Talebzadeh
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 5 June 2016 at 10:50, Marco Capuccini <marco.capucc...@farmbio.uu.se>
> wrote:
>
>> Dear all,
>>
>> Does Spark use data locality information from HDFS when running in
>> standalone mode? Or is running on YARN mandatory for that purpose? I
>> can't find this information in the docs, and on Google I am only finding
>> conflicting opinions on it.
>>
>> Regards
>> Marco Capuccini
>>
>
>
>
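
A note on the original question, offered as a hedged sketch rather than an
answer taken from the thread: Spark's locality-aware scheduling is tuned
through the spark.locality.wait family of properties, which are set per
application and are not YARN-specific settings. The values below are
illustrative assumptions (3s is the documented default for
spark.locality.wait). Whether tasks actually reach NODE_LOCAL on a standalone
cluster co-located with HDFS datanodes is best confirmed from the Locality
Level column of the task table in the Spark web UI.

```
# conf/spark-defaults.conf (illustrative values, not taken from the thread)
# How long to wait for a more data-local slot before falling back
# to a less local one (process-local -> node-local -> rack-local -> any).
spark.locality.wait        3s
# Optional per-level overrides; each defaults to spark.locality.wait.
spark.locality.wait.node   3s
spark.locality.wait.rack   3s
```

Raising the wait trades scheduling latency for a better chance of running a
task on the node that holds its HDFS block; setting it to 0 skips the wait.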
