When the computation is simple and the input file small, then it is time efficient to use fewer workers (even one) to avoid communication overhead between the workers.
When we are dealing with millions of users and edges and more complex algorithms, running the code in just a worker is impossible. Computation time is much higher (than the communication time between workers), thus we can use more workers to split the computation and get it done faster. Of course for different algorithms and input datasets, different number of workers may be applied. Hope this helps, On Tue, Jul 23, 2013 at 8:34 AM, Wonbae Kim <[email protected]> wrote: > Hi I'm working on hadoop system with 9 slave nodes. > I ran the PageRank example with 1 worker and 8 workers. > However, I couldn't figure out the advantage of 8 workers case. > Even the giraph timer says it's faster when using 1 worker. > Please let me know any problem with my configuration of giraph or hadoop > if you know the cause. > > Thank you. > -- Maria Stylianou marsty5.wordpress.com
