On Fri, Sep 9, 2011 at 8:03 AM, Avery Ching <ach...@apache.org> wrote:
> The GraphLab model is more asynchronous than BSP They allow you to update
> your neighbors rather than the BSP model of messaging per superstep. Rather
> than one massive barrier in BSP, they implement this with vertex locking.
> They also all a vertex to modify the state of its neighbors. We could
> certainly add something similar as an alternative computing model, perhaps
> without locking. Here's one idea:
> 1) No explicit supersteps (asynchronous)
This sounds interesting, esp. for streaming algorithms, although I was
thinking something slightly less ambitions to start out: still have
supersteps (effectively) which describe when each vertex has had a chance to
send all messages it wants for this iteration, and has processed all inbound
> 2) All vertices execute compute() (and may or may not send messages)
> 3) Vertices can examine their neighbors or any vertex in the graph (issue
> RPCs to get their state)
"or any vertex in the graph" sounds pretty scary, but yes, powerful. I like
it when my seemingly radical ideas get made look not so scary by comparison!
> 4) When messages are received by a vertex, compute() is executed on it (and
> state is locally locked to compute only)
> 5) Vertices stlll vote to halt when done, indicating the end of the
> 6) Combiners can still be used to reduce the number of messages sent (and
> the number of times compute is executed).
> This could be fun. And provide an interesting comparison platform barrier
> based vs vertex based synchronization.
Yeah, I think locking is an implementation detail which might be even
avoidable: if Vertices are effectively given a messageQueue which they can
process from, we could interpolate between buffering and processing messages
synchonously. The per-mapper threading model could get... interesting!