Giraph is a batch processing engine, not a database. You would use it
the same way you would use MapReduce. As you said, you feed a snapshot of
your constantly changing graph to Giraph and then work with the output
later in your pipeline. Personally, I don't see room for transactions
inside Giraph; you would have to manage them yourself when you take
its output and update your DB.
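To make the pattern concrete, here is a rough sketch in plain Python. It does not use any Giraph API: GraphStore, snapshot, apply_results and batch_components are all hypothetical names, the in-memory store stands in for your DB, and the connected-components function stands in for whatever batch job you would actually run. The version counter is just one assumed way to notice that the graph changed while the job was running.

```python
import copy


class GraphStore:
    """Hypothetical stand-in for the live graph DB (not a Giraph API)."""

    def __init__(self):
        self.edges = {}    # vertex -> set of neighbouring vertices
        self.labels = {}   # vertex -> result written back by the last batch job
        self.version = 0   # bumped on every structural write

    def add_edge(self, u, v):
        # Undirected edge; concurrent writers would need real locking in a DB.
        self.edges.setdefault(u, set()).add(v)
        self.edges.setdefault(v, set()).add(u)
        self.version += 1

    def snapshot(self):
        # Frozen copy handed to the batch job, plus the version it was taken at.
        return self.version, copy.deepcopy(self.edges)

    def apply_results(self, base_version, labels):
        # Write results back only for vertices that still exist.
        # Returns True if the graph changed since the snapshot (results are stale).
        stale = base_version != self.version
        for vertex, label in labels.items():
            if vertex in self.edges:
                self.labels[vertex] = label
        return stale


def batch_components(edges):
    """Stand-in batch job: label each vertex with the smallest id in its component."""
    labels = {}
    for start in sorted(edges):
        if start in labels:
            continue
        labels[start] = start
        stack = [start]
        while stack:
            u = stack.pop()
            for w in edges[u]:
                if w not in labels:
                    labels[w] = start
                    stack.append(w)
    return labels
```

You would take a snapshot, keep accepting writes on the live store, run the batch job on the frozen copy, and then call apply_results with the snapshot version. If it reports the results as stale, it is up to your own pipeline to decide whether to accept them, merge them, or rerun the job on a fresh snapshot.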
Does that help?
On Mon, Jan 23, 2012 at 11:29 PM, Gavan Hood <gwh...@simul-tech.com> wrote:
> Hi all,
> I have been wondering how Giraph can support a large graph that is constantly
> being updated by multiple jobs running simultaneously.
> The output of these jobs is continually adding and modifying edges / vertices
> in the graph. Some notion of transactional concurrency would be needed as
> well in this environment.
> From what I can see it appears that Giraph may be well suited to working with
> snapshots of such a system rather than the root implementation, but I feel
> that I might be missing a core design pattern.