On Wed, Nov 2, 2011 at 9:37 AM, Jake Mannix <[email protected]> wrote: > I think if you focus on the "lossless" graph algorithms, most of > them will indeed scale pretty poorly on MapReduce. If you're willing > to do lossy stuff, like doing straightforward dimensional reduction on > the adjacency matrix and analyzing the geometry of the projected graph, > there is still lots of stuff you can do via MR, but for traditional things > like > "shortest path", finding min-cuts, hubs and cycles, etc, MR can > incur a lot of overhead, esp. as the graph size grows to the point at > which doing it in memory on a single machine is impossible (ie the > point at which MR is even supposed to be used).
FWIW on modern multicore processors, we (quite unexpectedly) found that MR's more efficient use of cache outperformed state of the art concurrent minimal spanning tree algorithms. Quite large intermediate volumes of data are generated, though. Too much to stream successfully across a network. So machine-local storage would be needed... Robert
