Hello,

I am trying to understand how data locality works in Hadoop.

If you run a MapReduce job, do the mappers only read data from the host on 
which they are running?

Is there a communication protocol between the MapReduce layer and the HDFS 
layer so that each mapper is optimized to read data locally?

Any pointers on which layer of the stack handles this?

Cheers,
Ivan
