Re: How MapReduce selects data blocks for processing user request

Rishi Yadav Fri, 08 Feb 2013 21:00:46 -0800

Hi Mehal,

When a client makes a read request for a file, say foo.txt, the namenode
returns the ID of the first block and the datanodes that hold its replicas.


It is the client that decides which datanode to pull the data from. If the
first request fails, it can retry against another datanode that holds a
replica of the same block. This process repeats block by block until all the
data is read.
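
To make the read-and-retry flow concrete, here is a rough sketch in Python. It
is only a simulation of the idea, not the Hadoop client API: BlockLocation,
ReplicaReadError, read_block, read_file and the fetch callback are all
hypothetical names made up for illustration, and the block list is assumed to
have been obtained from the namenode already.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BlockLocation:
    """What the namenode hands back for one block (hypothetical shape)."""
    block_id: str
    datanodes: List[str]  # hosts that hold a replica of this block


class ReplicaReadError(Exception):
    """Raised by fetch() when a datanode cannot serve the block."""


def read_block(block: BlockLocation,
               fetch: Callable[[str, str], bytes]) -> bytes:
    """Try each replica in turn; the client, not the namenode, picks the host."""
    errors = []
    for host in block.datanodes:
        try:
            return fetch(host, block.block_id)
        except ReplicaReadError as exc:
            # This replica failed, so fall through to the next datanode.
            errors.append(f"{host}: {exc}")
    raise IOError(f"all replicas failed for {block.block_id}: {errors}")


def read_file(blocks: List[BlockLocation],
              fetch: Callable[[str, str], bytes]) -> bytes:
    """Read the blocks in order and concatenate their contents."""
    return b"".join(read_block(block, fetch) for block in blocks)
```

Here fetch(host, block_id) stands in for the actual network read from a
datanode. Each block is read once, from whichever replica answers first, so the
extra replicas only come into play when a read fails.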

Thanks and Regards,

Rishi Yadav


On Fri, Feb 8, 2013 at 4:40 PM, Mehal Patel <[email protected]> wrote:

> Hello All,
>
> I am confused about how MapReduce tasks select data blocks for processing
> user requests.
>
> Since block replication copies a single data block to multiple
> datanodes, how are data blocks uniquely selected during job processing?
> How does it guarantee that the same block is not chosen two or three
> times by different mapper tasks?
>
>
> Thank you
>
> -Mehal
>
