I think if you focus on the "lossless" graph algorithms, most of them will indeed scale pretty poorly on MapReduce. If you're willing to do lossy stuff, like doing straightforward dimensional reduction on the adjacency matrix and analyzing the geometry of the projected graph, there is still lots of stuff you can do via MR, but for traditional things like "shortest path", finding min-cuts, hubs and cycles, etc, MR can incur a lot of overhead, esp. as the graph size grows to the point at which doing it in memory on a single machine is impossible (ie the point at which MR is even supposed to be used).
+1 -jake On Wed, Nov 2, 2011 at 2:24 AM, Sebastian Schelter <[email protected]> wrote: > As you might know I recently started an experimental graph mining > module. I was already concerned at the beginning of this whether > MapReduce is really a suitable platform for (most) graph algorithms. > > I'm not content with the performance of the algorithms after some > testing and I'm pretty sure the future of large scale graph processing > is not on MapReduce (but hopefully on a Pregel like platform such as > Giraph). > > As we're currently removing clutter and trying to concentrate on the > core algorithms, I suggest to remove all graph algorithms with the > exception of PageRank. > > If no one objects with this, I'll start the cleanup in a few days. > > --sebastian >
