I use YARN because I run Hive on the Spark engine in yarn-cluster mode, plus other things. If I turned off YARN, half of my applications would not work. I don't see supporting YARN as a great concern. However, you may have other reasons.
Dr Mich Talebzadeh

LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 5 June 2016 at 13:40, Marco Capuccini <marco.capucc...@farmbio.uu.se> wrote:

> I meant when running in standalone cluster mode, where the Hadoop data nodes
> run on the same nodes as the Spark workers. I don’t want to support
> YARN as well in my infrastructure, and since I have already set up a standalone
> Spark cluster, I was wondering if running only HDFS in the same cluster
> would be enough.
>
> Regards
> Marco
>
> On 05 Jun 2016, at 12:17, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
> Well, in standalone mode you are running your Spark code on one physical
> node, so the assumption would be that there is an HDFS node running on the
> same host.
>
> When you are running Spark in yarn-client mode, YARN is part of
> Hadoop core, and YARN will know about the datanodes from
> $HADOOP_HOME/etc/hadoop/slaves
>
> HTH
>
> Dr Mich Talebzadeh
>
> On 5 June 2016 at 10:50, Marco Capuccini <marco.capucc...@farmbio.uu.se>
> wrote:
>
>> Dear all,
>>
>> Does Spark use data locality information from HDFS when running in
>> standalone mode? Or is running on YARN mandatory for that purpose? I
>> can't find this information in the docs, and on Google I am only finding
>> contrasting opinions on that.
>>
>> Regards
>> Marco Capuccini