Re: compared with MapReduce ,what is the advantage of HAMA?

changguanghui Fri, 23 Sep 2011 23:11:09 -0700

I think it may be important to find algorithms or problems that are better
suited to HAMA. Then people can compare the results of HAMA and MapReduce
directly, because many people want to know why and when they should choose
HAMA.


-----Original Message-----
From: Thomas Jungblut [mailto:[email protected]]
Sent: September 23, 2011 19:39
To: [email protected]
Subject: Re: compared with MapReduce ,what is the advantage of HAMA?

Hi,
to clearly state the advantage: you have less overhead.
Let me illustrate this with an algorithm for mindist search, which I renamed
to graph exploration. The same applies to Shortest Paths, too.
I wrote about it here:
http://codingwiththomas.blogspot.com/2011/04/graph-exploration-with-hadoop-mapreduce.html

Basically the algorithm groups the components of the graph and assigns the
lowest key of the group as an identifier for the component.
Graph problems are usually solved in MapReduce with a technique called
"Message Passing".
So you send messages to other vertices in every map step. Then you have to
shuffle, sort and reduce the vertices to compute the result.
This can't be done in a single iteration, so you have to chain several
map/reduce jobs.
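
To make the chained-job pattern above concrete, here is a minimal sketch in
plain Python. It does not use Hadoop; the function names (map_phase,
shuffle_and_sort, reduce_phase, label_components) are made up for this
illustration, and it assumes an undirected graph given as an adjacency dict
in which every vertex appears as a key. Each pass through the loop stands in
for one full map/reduce job.

```python
from collections import defaultdict


def map_phase(labels, adjacency):
    """Emit (target_vertex, candidate_label) pairs, as a mapper would."""
    for vertex, label in labels.items():
        yield vertex, label  # keep the vertex's own label in play
        for neighbor in adjacency[vertex]:
            yield neighbor, label  # "message" sent to each neighbor


def shuffle_and_sort(pairs):
    """Group values by key and order the keys (stand-in for shuffle/sort)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(grouped):
    """Each vertex adopts the lowest label it has seen."""
    return {vertex: min(candidates) for vertex, candidates in grouped}


def label_components(adjacency):
    """Repeat map -> shuffle/sort -> reduce until no label changes."""
    labels = {vertex: vertex for vertex in adjacency}
    jobs = 0
    while True:
        new_labels = reduce_phase(shuffle_and_sort(map_phase(labels, adjacency)))
        jobs += 1  # one simulated map/reduce job per iteration
        if new_labels == labels:
            return labels, jobs
        labels = new_labels


graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
labels, jobs = label_components(graph)
print(labels, jobs)
```

Note that every iteration re-emits and re-groups the label of every vertex,
even when nothing around it changed. In a real Hadoop job this regrouping is
the shuffle and sort step that runs once per chained job.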

For each iteration you incur the overhead of sorting and shuffling.
Additionally, all of this has to go through the disk.

Hama provides a message passing interface, so you don't have to take care of
writing each message to HDFS.
Each iteration, which in MapReduce is a full job execution, is called a
superstep in BSP.
Each superstep is faster than a full job execution in Hadoop, because you
don't have the overhead of spilling to disk, job setup, sorting and
shuffling.
In addition you can put your whole graph into RAM, which speeds up the
computation further. Hadoop does not offer this capability yet.
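
The superstep idea can be sketched the same way. The following plain-Python
simulation is not Hama's API; the name bsp_label_components and the
inbox/outbox dictionaries are assumptions made for this illustration, and it
again assumes an undirected adjacency dict in which every vertex is a key.
The graph and the labels stay in memory, and only vertices whose label
actually changed send messages in the next superstep.

```python
from collections import defaultdict


def bsp_label_components(adjacency):
    """Label each vertex with the lowest vertex id in its component."""
    labels = {vertex: vertex for vertex in adjacency}

    # Superstep 0: every vertex announces its own id to its neighbors.
    inbox = defaultdict(list)
    for vertex, neighbors in adjacency.items():
        for neighbor in neighbors:
            inbox[neighbor].append(labels[vertex])
    supersteps = 1

    # Later supersteps: only vertices that received messages do any work.
    while inbox:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < labels[vertex]:
                labels[vertex] = best
                for neighbor in adjacency[vertex]:
                    outbox[neighbor].append(best)
        inbox = outbox  # stands in for the barrier synchronization
        supersteps += 1
    return labels, supersteps


graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
labels, supersteps = bsp_label_components(graph)
print(labels, supersteps)
```

The loop ends as soon as no messages are in flight, so there is no separate
convergence check and no regrouping of vertices that did not change.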

But I also want to point out some facts that are not so positive:
Currently no benchmarks against Hadoop or other frameworks like Giraph or
GoldenORB exist, so we can't say: we are the best/fastest/coolest.
And graph algorithms are hard to code. As you can see, I have written
lots of code to get this running. That is because I have to take care of the
partitioning, vertex messaging and IO stuff myself.
For that purpose we are going to release a Pregel API which makes the
development of graph algorithms a lot easier.
You can get a sneak peek here:
https://issues.apache.org/jira/browse/HAMA-409

That was a lot of text, but I hope it clarifies a lot.

Best Regards,
Thomas

2011/9/23 changguanghui <[email protected]>

> Hi Thomas,
>
> Could you provide a concrete example to illustrate the advantage of HAMA
> over MapReduce?
>
> For example, SSSP on HAMA vs. SSSP on MapReduce, so I can grasp the idea of
> HAMA quickly.
>
> Thank you very much!
>
> Changguanghui
>
