Thanks, Patrick. I could try that.
But the idea was to be able to write a fully automated benchmark, varying the dataset size, the number of workers, the memory, and so on, without having to stop/start the cluster by hand each time. I was thinking something like SparkConf.set("spark.max_number_workers", n) would be useful in this context, but maybe that is too specific to be implemented.

Thanks anyway,

Cheers,
Pierre

On 12 Mar 2014, at 22:50, Patrick Wendell <pwend...@gmail.com> wrote:

> Hey Pierre,
>
> Currently, modifying the "slaves" file is the best way to do this,
> because in general we expect that users will want to launch workers on
> every slave.
>
> I think you could hack something together pretty easily to allow this.
> For instance, if you modify this line in slaves.sh:
>
> for slave in `cat "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
>
> to this:
>
> for slave in `cat "$HOSTLIST"| head -n $NUM_SLAVES | sed "s/#.*$//;/^$/d"`; do
>
> then you could just set NUM_SLAVES before you stop/start. Not sure if
> this helps much, but maybe it's a bit faster.
>
> - Patrick
>
> On Wed, Mar 12, 2014 at 10:18 AM, Pierre Borckmans
> <pierre.borckm...@realimpactanalytics.com> wrote:
>> Hi there!
>>
>> I was performing some tests for benchmarking purposes, among other things
>> to observe the evolution of performance versus the number of workers.
>>
>> In that context, I was wondering: is there an easy way to choose the
>> number of workers to be used in standalone mode, without having to change
>> the "slaves" file, dispatch it, and restart the cluster?
>>
>> Cheers,
>>
>> Pierre
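For reference, Patrick's NUM_SLAVES hack can be wrapped into the kind of automated sweep Pierre describes. The sketch below is untested and makes several assumptions: slaves.sh has been patched as above; the cluster is managed with sbin/stop-all.sh and sbin/start-all.sh; conf/slaves contains one hostname per line with no comment or blank lines (since the patched loop applies head -n before sed, raw lines are what get counted); and run-benchmark.sh is a hypothetical driver script standing in for the actual benchmark job.

    #!/usr/bin/env bash
    # Sweep the number of standalone workers; assumes slaves.sh honors
    # NUM_SLAVES as in Patrick's patch above.
    SPARK_HOME=${SPARK_HOME:-/opt/spark}   # adjust to your installation
    SLAVES_FILE="$SPARK_HOME/conf/slaves"

    # Count the listed slaves, using the same filtering slaves.sh applies.
    TOTAL=$(sed "s/#.*$//;/^$/d" "$SLAVES_FILE" | wc -l)

    for n in 1 2 4 8; do
      # Stop every worker left over from the previous iteration, then
      # start only the first n hosts listed in conf/slaves.
      NUM_SLAVES="$TOTAL" "$SPARK_HOME/sbin/stop-all.sh"
      NUM_SLAVES="$n" "$SPARK_HOME/sbin/start-all.sh"
      sleep 30   # crude wait for the workers to register with the master
      ./run-benchmark.sh --workers "$n"   # hypothetical benchmark driver
    done

Passing NUM_SLAVES as an environment variable works because stop-all.sh and start-all.sh invoke slaves.sh as a child process, so the patched head -n $NUM_SLAVES sees the value; the cluster still restarts between runs, but the whole sweep runs unattended.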