YARN supports such scenarios via Node Labels and the CapacityScheduler. The
following links should help you further with your requirements.

https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
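
As a rough sketch (the label name "spark_only", the queue name "sparkq", the
node names and the capacities below are only placeholders, and the
capacity-scheduler properties are written as key = value for brevity rather
than as full <property> blocks), the steps would look something like this.
Node labels also have to be enabled in yarn-site.xml first, as described in
the first link.

1. Define a label and assign it to the machines you want to use:

   yarn rmadmin -addToClusterNodeLabels "spark_only(exclusive=true)"
   yarn rmadmin -replaceLabelsOnNode "node1=spark_only node2=spark_only"

2. In capacity-scheduler.xml, give a queue access to that label, e.g.:

   yarn.scheduler.capacity.root.queues = default,sparkq
   yarn.scheduler.capacity.root.sparkq.capacity = 20
   yarn.scheduler.capacity.root.sparkq.accessible-node-labels = spark_only
   yarn.scheduler.capacity.root.sparkq.accessible-node-labels.spark_only.capacity = 100
   yarn.scheduler.capacity.root.sparkq.default-node-label-expression = spark_only

3. Submit the application to that queue, pinning the application master and
   the executors to the label:

   spark-submit --master yarn --queue sparkq \
     --conf spark.yarn.am.nodeLabelExpression=spark_only \
     --conf spark.yarn.executor.nodeLabelExpression=spark_only \
     --class your.main.Class your-app.jar

With an exclusive label, only applications submitted with that label
expression will get containers on those nodes, so the Spark job will run on
just that subset of the cluster regardless of where the data lives.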


Regards,
Ajay


On Tue, Feb 7, 2017 at 3:45 PM, Alvaro Brandon <[email protected]>
wrote:

> Hello all:
>
> I have the following scenario.
> - I have a cluster of 50 machines with Hadoop and Spark installed on them.
> I want to launch one Spark application through spark-submit. However, I
> want this application to run on only a subset of these machines,
> disregarding data locality (e.g. 10 machines).
>
> Is this possible? Is there any option in YARN that allows such a thing?
>
>
