Yes, I am talking about a standalone single-node cluster. No, I am not increasing parallelism. I just wanted to know whether this is expected. Does message passing across the workers account for the slowdown?
I am running SparkKMeans, just to validate one prediction model. I am using several data sets. I am running in standalone mode, and I am varying the workers from 1 to 16.

On Sat, Feb 21, 2015 at 8:14 PM, Sean Owen <so...@cloudera.com> wrote:
> I can imagine a few reasons. Adding workers might cause fewer tasks to
> execute locally (?), so you may be executing more remotely.
>
> Are you increasing parallelism? For trivial jobs, chopping them up
> further may cause you to pay more overhead of managing so many small
> tasks, for no speedup in execution time.
>
> Can you provide any more specifics, though? You haven't said what
> you're running, what mode, how many workers, how long it takes, etc.
>
> On Sat, Feb 21, 2015 at 2:37 PM, Deep Pradhan <pradhandeep1...@gmail.com>
> wrote:
> > Hi,
> > I have been running some jobs in my local single-node standalone
> > cluster. I am varying the worker instances for the same job, and the
> > time taken for the job to complete increases with the number of
> > workers. I repeated some experiments varying the number of nodes in a
> > cluster too, and the same behavior is seen.
> > Can the idea of worker instances be extrapolated to the nodes in a
> > cluster?
> >
> > Thank You
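
To make the overhead argument quoted above concrete, here is a rough back-of-the-envelope sketch in Python. It is not a measurement and not how Spark schedules work internally. The function name, the fixed per-worker overhead, and all the numbers are hypothetical assumptions chosen only for illustration: the job's compute time shrinks only until the worker count reaches the machine's physical cores, while each extra worker adds a fixed cost.

```python
def estimated_runtime(work_s, workers, physical_cores, per_worker_overhead_s):
    """Toy cost model (hypothetical, not Spark's actual behavior).

    work_s: total compute time of the job on a single core, in seconds
    workers: number of worker instances started on the machine
    physical_cores: cores actually available on the single node
    per_worker_overhead_s: assumed fixed cost added by each worker
        (process startup, coordination, data exchange between workers)
    """
    # Extra workers beyond the physical cores add no compute capacity.
    usable = min(workers, physical_cores)
    return work_s / usable + per_worker_overhead_s * workers


# A small job (8 s of work) on an assumed 4-core machine,
# with an assumed 0.5 s overhead per worker.
for n in (1, 4, 16):
    print(n, estimated_runtime(8.0, n, 4, 0.5))
```

Under these assumed numbers the estimate improves from 1 to 4 workers and then gets worse at 16, because the overhead term keeps growing after the cores are saturated. For a small enough job the overhead term can dominate from the start, which would match runtime increasing with every added worker.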