It seems like the bigger thing we keep discussing (and needing) is a distributed memory layer. Does each of these tools invent its own, or is there a good, open (ASL-compatible) implementation out there somewhere that we could use? Given such a layer, wouldn't it be fairly straightforward to implement both graph-based and matrix-based approaches? Thinking aloud (and perhaps a bit crazy), I wonder if one could simply implement a Hadoop filesystem backed by distributed memory (and perhaps persistable to disk), thereby allowing existing code to simply work.
--Grant

On Sep 9, 2011, at 10:36 AM, Jake Mannix wrote:

> On Fri, Sep 9, 2011 at 7:01 AM, Benson Margulies <[email protected]> wrote:
>
>> I've since reached the conclusion that the thing I'm trying to compare
>> it to is a 'data grid', e.g. gigaspaces.
>>
>> We want a large, evolving, data structure, which is essentially cached
>> in memory split over nodes.
>>
>
> I should mention that Giraph certainly allows for the graph to change (both
> in edge values, and in actual graph structure). But it's currently a very
> BSP-specific paradigm: run _this_ algorithm, via BSP, over _this_ initial
> data set, until _this_ many iterations have run, then exit. You could hack
> it to do other things, but it wasn't the original intent, from what I can
> tell.
>
> -jake
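
For readers unfamiliar with the BSP paradigm Jake describes (run one algorithm over one initial data set for a fixed number of iterations, then exit), here is a minimal single-process sketch in Python. It is not Giraph's API: `run_bsp`, `propagate_max`, and the toy graph are hypothetical names invented for illustration, and the sketch keeps the graph structure fixed, whereas Giraph reportedly lets it change.

```python
from collections import defaultdict


def run_bsp(graph, values, compute, max_supersteps):
    """Run `compute` on every vertex for a fixed number of supersteps.

    graph:  dict mapping vertex -> list of neighbouring vertices
    values: dict mapping vertex -> initial value
    """
    values = dict(values)          # work on a copy of the initial data set
    inbox = defaultdict(list)      # messages delivered at the start of a superstep
    for superstep in range(max_supersteps):
        outbox = defaultdict(list)
        for vertex, neighbours in graph.items():
            new_value, outgoing = compute(
                superstep, vertex, values[vertex], inbox[vertex], neighbours
            )
            values[vertex] = new_value
            for target, message in outgoing:
                outbox[target].append(message)
        # Barrier: messages sent in this superstep become visible only in the next.
        inbox = outbox
    return values


def propagate_max(superstep, vertex, value, messages, neighbours):
    """Example vertex program: spread the largest value through the graph."""
    new_value = max([value] + messages)
    if superstep == 0 or new_value != value:
        return new_value, [(n, new_value) for n in neighbours]
    return new_value, []           # nothing changed, so stay quiet


# Hypothetical three-vertex chain: a - b - c
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
initial = {"a": 3, "b": 1, "c": 7}
result = run_bsp(graph, initial, propagate_max, max_supersteps=3)
```

The whole job is one call with a fixed algorithm, data set, and iteration count, which is the shape Jake points out. A long-lived, evolving in-memory structure of the kind Benson describes would need something beyond this loop.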
