A task may read from more than one block. For example, in line-oriented
input, lines frequently cross block boundaries. And a block may be read
from more than one host. For example, if a datanode dies midway through
providing a block, the client will switch to using a different datanode.
So the mapping is not simple. This information is also not, as you
inferred, available to applications. Why do you need this? Do you have
a compelling reason?
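
The closest thing the public API does expose is FileSystem#getFileBlockLocations, which reports the candidate datanodes holding replicas of each block -- not which replica a task actually read. A minimal sketch (the input path is taken from the command line; running it requires a configured HDFS client):

```java
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockHosts {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path p = new Path(args[0]);               // an input file in HDFS
    FileStatus stat = fs.getFileStatus(p);
    // One BlockLocation per block, covering the whole file.
    BlockLocation[] blocks =
        fs.getFileBlockLocations(stat, 0, stat.getLen());
    for (int i = 0; i < blocks.length; i++) {
      // getHosts() lists every datanode holding a replica of this
      // block -- the candidates, not the one the reader chose.
      System.out.println("block " + i
          + " offset " + blocks[i].getOffset()
          + " hosts " + Arrays.toString(blocks[i].getHosts()));
    }
    fs.close();
  }
}
```

Even with this, the mapping from task to host stays fuzzy for the reasons above: a split can span blocks, and a read can fail over mid-block.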
Doug
James Cipar wrote:
Is there any way to determine which replica of each block is read by a
MapReduce program? I've been looking through the Hadoop code, and it
seems like it tries to hide those kinds of details from the higher-level
API. Ideally, I'd like the host the task was running on, the file name
and block number, and the host the block was read from.