YARN supports such scenarios via node labels and the CapacityScheduler. A rough sketch is below, and the links after it should help you further with your requirements.
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html

Regards,
Ajay

On Tue, Feb 7, 2017 at 3:45 PM, Alvaro Brandon <[email protected]> wrote:
> Hello all:
>
> I have the following scenario.
> - I have a cluster of 50 machines with Hadoop and Spark installed on them.
> - I want to launch one Spark application through spark-submit. However I
> want this application to run on only a subset of these machines,
> disregarding data locality (e.g. 10 machines).
>
> Is this possible? Is there any option in YARN that allows such a thing?
>
