There was another question related to this in the recent past as well.
On 1/23/12 3:10 PM, Gavan Hood wrote:
Yes thanks Claudio, that is what my impression is as well. Thanks for the
From: Claudio Martella [mailto:claudio.marte...@gmail.com]
Sent: Tuesday, 24 January 2012 8:38 AM
Subject: Re: How is this use case supported
Giraph is a batch processing engine, not a database. What you would do is the
same thing you would do with MapReduce. As you said, you feed a snapshot of
your constantly changing graph into Giraph and then work with the output later
in your pipeline. Personally, I don't see room for transactions inside
Giraph; you'd have to manage that yourself when using its output to update your DB.
Does it help?
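
To make the snapshot pattern described above concrete, here is a minimal sketch in plain Python. It does not use Giraph or any Giraph API: GraphStore, snapshot, write_back and batch_components are hypothetical names invented for illustration, and the connected-components labelling only stands in for whatever the real batch job would compute. The point is that the job sees a frozen copy of the graph, and the results are tagged with the snapshot version so that reconciling them with later updates is handled outside the batch engine.

```python
import copy


class GraphStore:
    """Hypothetical stand-in for the live, constantly updated graph DB."""

    def __init__(self):
        self.edges = {}     # vertex -> set of neighbouring vertices
        self.results = {}   # vertex -> (snapshot_version, computed value)
        self.version = 0    # bumped on every mutation

    def add_edge(self, u, v):
        # Undirected edge; both endpoints become known vertices.
        self.edges.setdefault(u, set()).add(v)
        self.edges.setdefault(v, set()).add(u)
        self.version += 1

    def snapshot(self):
        # Frozen copy for the batch job, plus the version it was taken at.
        return self.version, copy.deepcopy(self.edges)

    def write_back(self, snapshot_version, values):
        # Tag each result with the snapshot version so readers can tell
        # whether it predates later updates to the live graph.
        for vertex, value in values.items():
            self.results[vertex] = (snapshot_version, value)


def batch_components(edges):
    """Stand-in for the batch job: label every vertex with the smallest
    vertex id in its connected component, by repeated label propagation."""
    labels = {v: v for v in edges}
    changed = True
    while changed:
        changed = False
        for v, neighbours in edges.items():
            best = min([labels[v]] + [labels[n] for n in neighbours])
            if best < labels[v]:
                labels[v] = best
                changed = True
    return labels


store = GraphStore()
store.add_edge(1, 2)
store.add_edge(2, 3)
store.add_edge(4, 5)

version, snap = store.snapshot()    # input handed to the batch job
store.add_edge(3, 4)                # live graph keeps changing meanwhile

labels = batch_components(snap)     # job only ever sees the snapshot
store.write_back(version, labels)   # update the DB from the job output
```

After the run, store.version is ahead of the version stored with each result, which is how the surrounding pipeline (not the batch job) can detect that the results are stale and decide whether to re-run or merge.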
On Mon, Jan 23, 2012 at 11:29 PM, Gavan Hood <gwh...@simul-tech.com> wrote:
I have been wondering how Giraph can support a large graph that is
constantly being updated by multiple jobs running simultaneously.
The output of these jobs is continually adding and modifying edges /
vertices in the graph. Some notion of transactional concurrency would be
needed as well in this environment.
From what I can see it appears that Giraph may be well suited to working
with snapshots of such a system rather than the root implementation, but I
feel that I might be missing a core design pattern.