How about hacking your way around it? Start with the maximum number of workers and keep killing them off after each run.
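
Something along those lines (an untested sketch; run-benchmark.sh stands in for whatever driver you use, and it assumes passwordless SSH plus exactly one standalone Worker JVM per host listed in conf/slaves):

    NUM=$(grep -c . conf/slaves)              # start with every worker up
    while [ "$NUM" -ge 1 ]; do
        ./run-benchmark.sh                    # hypothetical driver; sees $NUM live workers
        HOST=$(sed -n "${NUM}p" conf/slaves)  # last host still running a worker
        # kill the standalone Worker JVM on that host before the next run
        ssh "$HOST" "jps | awk '/ Worker$/ {print \$1}' | xargs kill"
        NUM=$((NUM - 1))
    done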
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>


On Thu, Mar 13, 2014 at 2:00 AM, Pierre Borckmans <pierre.borckm...@realimpactanalytics.com> wrote:

> Thanks Patrick.
>
> I could try that.
>
> But the idea was to be able to write a fully automated benchmark, varying
> the dataset size, the number of workers, the memory, … without having to
> stop/start the cluster each time.
>
> I was thinking something like SparkConf.set("spark.max_number_workers", n)
> would be useful in this context, but maybe too specific to be implemented.
>
> Thanks anyway,
>
> Cheers
>
> Pierre
>
>
> On 12 Mar 2014, at 22:50, Patrick Wendell <pwend...@gmail.com> wrote:
>
> > Hey Pierre,
> >
> > Currently, modifying the "slaves" file is the best way to do this,
> > because in general we expect that users will want to launch workers on
> > every slave.
> >
> > I think you could hack something together pretty easily to allow this.
> > For instance, if you modify the line in slaves.sh from this:
> >
> > for slave in `cat "$HOSTLIST" | sed "s/#.*$//;/^$/d"`; do
> >
> > to this:
> >
> > for slave in `cat "$HOSTLIST" | head -n $NUM_SLAVES | sed "s/#.*$//;/^$/d"`; do
> >
> > Then you could just set NUM_SLAVES before you stop/start. Not sure if
> > this helps much, but maybe it's a bit faster.
> >
> > - Patrick
> >
> > On Wed, Mar 12, 2014 at 10:18 AM, Pierre Borckmans
> > <pierre.borckm...@realimpactanalytics.com> wrote:
> >> Hi there!
> >>
> >> I was performing some tests for benchmarking purposes, among other
> >> things to observe the evolution of performance versus the number of
> >> workers.
> >>
> >> In that context, I was wondering if there is an easy way to choose the
> >> number of workers used in standalone mode, without having to change
> >> the "slaves" file, dispatch it, and restart the cluster?
> >>
> >> Cheers,
> >>
> >> Pierre
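
With Patrick's one-line change in place, the stop/start cycle itself scripts cleanly, which gets most of the way to the automated benchmark Pierre describes (again only a sketch: run-benchmark.sh is a hypothetical driver, and the sbin/ paths assume the standard standalone-script layout):

    for NUM_SLAVES in 1 2 4 8; do
        export NUM_SLAVES           # read by the modified slaves.sh
        ./sbin/start-all.sh         # master + the first $NUM_SLAVES hosts in conf/slaves
        ./run-benchmark.sh          # hypothetical driver
        ./sbin/stop-all.sh          # stops the same subset, since NUM_SLAVES is still set
    done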