On Sep 16, 2011, at 12:27 AM, Ted Dunning wrote:

> Actually, I don't think that these really provide a distributed memory
> layer.
>
> What they provide is multiple iterations without having to renegotiate JVM launches,
> local memory that persists across iterations and decent message passing.
> (and of course some level of synchronization).
>
> And that is plenty for us.
>

That sounds a lot like a distributed memory layer (i.e. the JVM stays up with its memory) with a message-passing layer on top of it. It smells to me like it does for memory what the map-reduce + DFS abstraction did for disk-based batch processing, i.e. it gave a base platform + API that made it easy for people to build large-scale, distributed, disk-based, batch-oriented systems. We need a base platform for large-scale, distributed, memory-based systems so that it is easy to write implementations on top of it.

> On Fri, Sep 16, 2011 at 12:14 AM, Jake Mannix <[email protected]> wrote:
>
>> A big "distributed memory layer" does indeed sound great, however. Spark
>> and Giraph both provide their own, although the former seems to lean more
>> toward "read-only, with allowed side-effects", and very general purpose,
>> while the latter is couched in the language of graphs, and computation is
>> specifically BSP (currently), but allows for fairly arbitrary mutation (and
>> persisting final results back to HDFS).
>>
