One thing I would update below, Claudio, is to add a statement
that Giraph is currently incubating, with appropriate links back to the
Incubator website and policies.
On Nov 21, 2011, at 12:43 PM, Claudio Martella wrote:
> Hi devs,
> FOSDEM has announced a devroom completely dedicated to Graph Processing:
> I'm going to submit for a talk there. Here's the draft, feedback is welcome :)
> Title: "Apache Giraph: distributed graph processing in the cloud."
> Abstract: Web and online social graphs have been rapidly growing in
> size and scale during the past decade. In 2008, Google estimated that
> the number of web pages reached over a trillion. Online social
> networking and email sites, including Yahoo!, Google, Microsoft,
> Facebook, LinkedIn, and Twitter, have hundreds of millions of users
> and are expected to grow much more in the future. Processing these
> graphs plays a big role in delivering relevant and personalized information to
> users, such as results from a search engine or news in an online
> social networking site.
> The Apache Giraph (http://incubator.apache.org/giraph) project is a
> fault-tolerant in-memory distributed graph processing system which runs
> on top of a standard Hadoop cluster and is capable of running any
> standard Bulk Synchronous Parallel (BSP) operation over any large
> generic data set which can be represented as a graph. Apache Giraph is
> a loose implementation of Google Pregel.
> Giraph entered the ASF Incubator in July 2011, where it has enlisted
> the aid of committers from Yahoo!, Facebook, LinkedIn, and Twitter.
> The talk will present why running MapReduce jobs for graph processing
> can be a problem, introducing the reason why Google designed Pregel
> in the first place. Later, the BSP model will be presented, focusing on how
> it can be used to implement a distributed graph processing engine.
> The last part of the talk will be dedicated to Apache Giraph, with a
> description of the programming model (i.e. the API, some typical
> examples such as PageRank and Single Source Shortest Path) along with
> a technical overview of how the architecture of Giraph works and how
> it leverages the Hadoop infrastructure.
> Claudio Martella
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA