Re: communication protocols in hadoop mapreduce

Rekha Joshi Wed, 21 Apr 2010 04:53:07 -0700

The quick answer: it is a heartbeat mechanism, a poll-like flow between the 
JobTracker and the TaskTrackers (JT/TTs).
Underneath, the communication is RPC, and it uses not the default Java 
serialization but a Hadoop-specific serialization implementation (Writable), 
for some performance gain.
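
To make the two points above concrete, here is a minimal Python sketch. It is 
not Hadoop code: the class names, the heartbeat() method, and the 8-byte 
message layout are all made up for illustration. It shows the worker-initiated 
flow (the TaskTracker calls in, and new work rides back in the reply) and why 
a hand-written field-by-field encoding comes out smaller than a generic object 
serializer (pickle stands in for default Java serialization here).

```python
import pickle
import struct

# Hypothetical wire layout for a heartbeat (an assumption, not Hadoop's format):
# tracker id (uint32), running task count (uint16), free slots (uint16).
HEARTBEAT = struct.Struct("!IHH")


def encode_heartbeat(tracker_id, running, free_slots):
    """Pack the status field by field into a fixed 8-byte message."""
    return HEARTBEAT.pack(tracker_id, running, free_slots)


def decode_heartbeat(payload):
    """Unpack a heartbeat into (tracker_id, running, free_slots)."""
    return HEARTBEAT.unpack(payload)


class JobTracker:
    """Never calls the workers; it only answers their heartbeats."""

    def __init__(self, pending_tasks):
        self.pending = list(pending_tasks)

    def heartbeat(self, payload):
        tracker_id, running, free_slots = decode_heartbeat(payload)
        # Hand out at most as many tasks as the worker reports free slots.
        assigned = self.pending[:free_slots]
        self.pending = self.pending[free_slots:]
        return assigned


class TaskTracker:
    def __init__(self, tracker_id, slots):
        self.tracker_id = tracker_id
        self.slots = slots
        self.running = []

    def beat(self, job_tracker):
        """One heartbeat: report status, receive new work in the reply."""
        free = self.slots - len(self.running)
        payload = encode_heartbeat(self.tracker_id, len(self.running), free)
        self.running.extend(job_tracker.heartbeat(payload))


jt = JobTracker(["map_0", "map_1", "map_2"])
tt = TaskTracker(tracker_id=7, slots=2)
tt.beat(jt)  # first heartbeat: two free slots, so two tasks come back
tt.beat(jt)  # second heartbeat: no free slots, nothing is assigned

# Compare the hand-packed message with a generic serializer's output.
compact = encode_heartbeat(7, 2, 0)
generic = pickle.dumps({"tracker_id": 7, "running": 2, "free_slots": 0})
```

After the two beats, tt.running holds the first two tasks and one task is 
still pending on the JobTracker side. The packed heartbeat is 8 bytes, well 
under the size of the pickled dict carrying the same three numbers.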


Avro is a strong contender for serialization in Hadoop. You might also want 
to look into Thrift and Google Protocol Buffers.

Cheers,
/

On 4/21/10 4:37 PM, "Ahmad Shahzad" <ashahz...@gmail.com> wrote:

Hey everyone,
I wanted to know which communication protocols Hadoop MapReduce uses under
the hood, if any. For example, the shuffle process uses HTTP to shuffle the
values to the reducers.
The job tracker has to talk to the task trackers, and the task trackers have
to report back to the job tracker. And what if the data is not available on
the same node and the slave node has to fetch it from another node? In all
of these cases, which communication mechanisms are used? Is it HTTP only?

I would really appreciate it if someone could explain this or point me to a
link that covers it.

Regards,
Ahmad Shahzad
