Interesting. In our community, someone's thinking about asynchronous
message processing for more efficient iteration, too.
As I mentioned to you before, the projects differ in slogan but not in kind. The
technical issues are nothing, Avery.
It would be nice if we could keep talking, for the sake of
collaborative competition. http://s.apache.org/HamaVsGiraph
On Wed, Sep 14, 2011 at 2:47 AM, Avery Ching <ach...@apache.org> wrote:
> Hi Vinod,
> Edward and I have chatted about this at times. It sounds better in theory
> (both BSP based and adding support for MRv2) than in practice I think
> (underlying implementations are quite different). Actually, I also believe
> that in the future, Giraph is not going to solely be BSP-based graph
> computing. We are also thinking about other underlying computing models
> (i.e. streaming (asynchronous) graph processing - see
> But I think today, the issues are the following:
> 1) Giraph runs completely as a MapReduce job on Hadoop today. This needs
> to be maintained to support our current users, who will not likely move to
> MRv2 for at least a year.
> 2) The internals of Giraph are implemented differently from Hama's and would
> take some time to port.
> 3) If we have various graph processing computing models (BSP based, streams
> or asynchronous, or a combination), then being on Hama brings little value
> for Giraph.
> Perhaps more practically, I wonder if it would be possible for someone from
> the Hama team to refactor our code a bit to support Hama-style BSP in
> Giraph? Certainly would be a pretty cool project...
> On 9/13/11 4:49 AM, Edward J. Yoon wrote:
>> Quite a while ago, I implemented a clone of Google Pregel simply using
>> BSPLib and decided to focus on BSP computing engine.
>> The Hama and Giraph projects differ in slogan but not in kind.
>> If we are to collaborate, Giraph should be implemented on top of the
>> Hama BSP computing engine.
>> Otherwise, we will be back to square one again.
>> 1. http://markmail.org/thread/4czcgtjupjvpqcqi
>> On Sun, Sep 11, 2011 at 11:22 PM, Vinod Kumar Vavilapalli
>> <vino...@hortonworks.com> wrote:
>>> Crosspost to hama-dev and giraph-dev.
>>> It was only this morning that I was looking at HAMA-431, the port of
>>> Hama to YARN. And one of the tweets reminded me of JIRA issue GIRAPH-13,
>>> which is about porting Giraph to YARN.
>>> I was also looking at the Giraph proposal for entry into Apache
>>> There is an interesting section there:
>>> Relationships with Other Apache Products
>>> Giraph has some overlapping functionality with Apache Hama. However, there
>>> are some significant differences. Giraph focuses on graph-based bulk
>>> synchronous parallel (BSP) computing, while Apache Hama is more for general
>>> purposed BSP computing. Giraph runs on the Hadoop infrastructure, while
>>> Apache Hama uses its own computing framework.
>>> I agree with the point about Hama being a general-purpose BSP and Giraph
>>> being completely graph oriented. But the latter one about the infrastructure
>>> is going to be moot with both Giraph and Hama trying to be ported over to YARN.
>>> So here's my billion dollar question: Is it possible to implement
>>> graph-based APIs over Hama's BSP APIs which both run over a single
>>> Apache BSP implementation over YARN?
>>> I also do see the email thread regarding Hama and Giraph's future
>>> collaboration when Hadoop NextGen aka YARN comes in:
>>> http://s.apache.org/HamaVsGiraph. So are we ready for this yet?
>>> Disclaimer: I come from the Hadoop world, have no idea of Giraph's APIs or
>>> internals except that I see a bsp package in Giraph's source tree. I do know
>>> a tiny bit about Hama's APIs and internals but my expertise is only two
>>> (An elephant maintainer trying to see if a Giraffe can be made to ride
>>> a hippopotamus riding over an elephant)
Best Regards, Edward J. Yoon