Re: Apache Giraph?

Jake Mannix Fri, 16 Sep 2011 15:44:56 -0700

On Fri, Sep 16, 2011 at 3:36 PM, Ted Dunning <[email protected]> wrote:


> Returning something halves performance or worse since you can't fire and
> forget.  In Pregel style, you should expect the message to be processed in
> the next super step and a value returned in the super step after that.
>

I guess it depends on what you're doing.  Sometimes you may want something
returned which doesn't depend on what you're sending over (i.e. it was
computed in the _previous_ superstep), cutting at least 3 supersteps down
to 2.

But of course, you're right: the right way to do this is to just have the
response from the previous step "sent back" at the same time as you're
sending out your current message.  Then it's never 2 steps - you're sending
out your message now, the other side processes it during the next superstep,
and it often can send the response as soon as it has done so.  Async is
definitely right here.
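
To make the timing concrete, here is a toy sketch of the fire-and-forget
pattern.  It is plain Python, not Giraph code: run_bsp, compute and the
send callback are hypothetical names made up for illustration, and the
only rule modelled is that a message sent in superstep s is delivered in
superstep s + 1.  Vertex "A" sends one request per superstep without
waiting, and "B" replies as soon as it has processed each one.

```python
from collections import defaultdict


def run_bsp(vertices, compute, num_supersteps):
    """Toy BSP loop: messages sent in superstep s are delivered in s + 1."""
    inbox = defaultdict(list)
    log = []
    for step in range(num_supersteps):
        outbox = defaultdict(list)

        def send(dst, payload):
            # Fire and forget: nothing is returned to the sender.
            outbox[dst].append(payload)

        for v in vertices:
            compute(v, step, inbox[v], send, log)
        inbox = outbox  # barrier: this step's output is next step's input
    return log


def compute(vertex, step, messages, send, log):
    # "A" keeps sending requests instead of blocking on each reply.
    if vertex == "A" and step < 3:
        send("B", ("request", "A", step))
    for kind, sender, tag in messages:
        if kind == "request":
            # Reply right after processing; it arrives one superstep later.
            send(sender, ("reply", vertex, tag))
        elif kind == "reply":
            # Record (superstep the request was sent, superstep reply arrived).
            log.append((tag, step))


replies = run_bsp(["A", "B"], compute, 5)
print(replies)
```

In this sketch each reply shows up two supersteps after its request went
out, but because the requests overlap, one reply arrives per superstep
once the pipeline is full.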


>
> On Fri, Sep 16, 2011 at 2:31 PM, Jake Mannix <[email protected]>
> wrote:
>
> > On Fri, Sep 16, 2011 at 1:24 PM, Ted Dunning <[email protected]>
> > wrote:
> >
> > > Well, distributed memory to me would have fetch and store operations.
> > > Here we can send a message, but we can't actually fetch or store data
> > > without cooperation.
> > >
> >
> > Funny you mention that - I've been considering suggesting that Giraph
> > modify the "sendMsg()" method contract to not be void, but return
> > something too...
> >
> >  -jake
> >
> >
> > > On Fri, Sep 16, 2011 at 4:45 AM, Grant Ingersoll <[email protected]>
> > > wrote:
> > >
> > > >
> > > > On Sep 16, 2011, at 12:27 AM, Ted Dunning wrote:
> > > >
> > > > > Actually, I don't think that these really provide a distributed
> > > > > memory layer.
> > > > >
> > > > > What they provide is multiple iterations without having to
> > > > > renegotiate JVM launches, local memory that persists across
> > > > > iterations and decent message passing (and of course some level
> > > > > of synchronization).
> > > > >
> > > > > And that is plenty for us.
> > > > >
> > > >
> > > > That sounds a lot like a distributed memory layer (i.e. the JVM stays
> > > > up w/ its memory) and then a msg passing layer on top of it.  It
> > > > smells to me like it does for memory what the map-reduce + DFS
> > > > abstraction did for that space, i.e. it gave a base platform + API
> > > > that made it easy for people to build large scale distributed,
> > > > disk-based, batch oriented systems.  We need a base platform for
> > > > large-scale, distributed memory-based systems so that it is easy to
> > > > write implementations on top of it.
> > > >
> > > >
> > > > > On Fri, Sep 16, 2011 at 12:14 AM, Jake Mannix <[email protected]>
> > > > > wrote:
> > > > >
> > > > >> A big "distributed memory layer" does indeed sound great, however.
> > > > >> Spark and Giraph both provide their own, although the former seems
> > > > >> to lean more toward "read-only, with allowed side-effects", and very
> > > > >> general purpose, while the latter is couched in the language of
> > > > >> graphs, and computation is specifically BSP (currently), but allows
> > > > >> for fairly arbitrary mutation (and persisting final results back to
> > > > >> HDFS).
> > > > >>
> > > >
> > > >
> > > >
> > >
> >
>
