I completely agree, and I have just fixed the graph to plot the full range of values. Thanks for confirming the experiment setup. If anyone has further tips, I would love to hear them.
- Danny

On Fri, Mar 4, 2011 at 8:02 PM, Ted Dunning <[email protected]> wrote:

> I generically dislike graphs with offset references.
>
> Your run-time graph has a baseline of 1500 seconds which makes it tricky
> for the reader to understand your (entirely correct) statement that having
> more than 8 machines isn't helpful. The way you plotted the graph
> coincidentally looks like nearly perfect speedup across the entire range.
>
> No comments about your setup. My guess is that you could tune hadoop to
> get a better result due to lower overheads but the results won't be
> categorically different. Iterative algorithms on stock hadoop are just
> plain problematic.
>
> On Fri, Mar 4, 2011 at 4:33 PM, Danny Bickson <[email protected]> wrote:
>
>> I would love to get any feedback you or others may have about the setup of
>> this experiment.
