Re: HDFS internal mechanism questions

Todd Lipcon Thu, 07 Oct 2010 22:32:46 -0700

On Thu, Oct 7, 2010 at 10:08 PM, Sean Bigdatafun
<sean.bigdata...@gmail.com> wrote:


> Is there a pointer where I can find details of the write path in HDFS? In
> particular, I'd like to get some technical figures describing the following
> puzzle in my mind:
>
>    * Is there a 64KB block-wise checksum within the 64MB blocks (as
> described in Section 5.2 in the GFS paper)? or HDFS keeps a whole-block
> (64 MB) wise single checksum?
>

Checksums are computed over 512-byte chunks. This is theoretically
configurable, but I doubt it would actually work with a different value,
since I've never heard of anyone changing it :)
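
To make the granularity concrete, here is a minimal sketch of per-chunk
checksumming in Python. It is an illustration only, not HDFS code: the use
of CRC32 and the names chunk_checksums and first_corrupt_chunk are
assumptions made for this example. The only figure taken from the reply
above is the 512-byte chunk size.

```python
import zlib
from typing import List, Optional

# Chunk size from the reply above; everything else here is hypothetical.
BYTES_PER_CHECKSUM = 512


def chunk_checksums(data: bytes, bytes_per_checksum: int = BYTES_PER_CHECKSUM) -> List[int]:
    """Return one CRC32 value per chunk; the last chunk may be shorter."""
    return [
        zlib.crc32(data[offset:offset + bytes_per_checksum])
        for offset in range(0, len(data), bytes_per_checksum)
    ]


def first_corrupt_chunk(data: bytes, expected: List[int],
                        bytes_per_checksum: int = BYTES_PER_CHECKSUM) -> Optional[int]:
    """Return the index of the first chunk whose checksum differs, else None."""
    actual = chunk_checksums(data, bytes_per_checksum)
    for index, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            return index
    return None
```

With small chunks, a reader that hits a mismatch knows which 512-byte
region is bad instead of having to distrust the entire block.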


>
>    *  It seems that HDFS' staging strategy, "In fact, initially the HDFS
> client caches the file data into a temporary local file. Application writes
> are transparently redirected to this temporary local file", is quite
> different from the original GFS paper (see Section 2.3 of the GFS paper:
> "neither client nor the chunkserver caches file data"). Can someone help me
> understand it?
>
>

People keep referencing this on the list, but it hasn't been that way in
about 3 years :) Where do you see this, so we can update the docs?



>    *  Both the HDFS documentation and the GFS paper mention that the
> Namenode polls Datanodes periodically (BlockReport) to get their most
> up-to-date information. Can someone tell me exactly what info a
> "BlockReport" contains, or tell me the class name that I can look up in
> the Javadoc?
>

Look at the NameNode.java class - the block reports arrive there via RPC.


>  Plus, is the block-id treated as file name in the datanode's local
> filesystem?
>

In the DN, each block is two files: blk_NNNNN and blk_NNNNN_GS.meta, where
GS is a generation stamp. The meta file contains checksums.
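
As a rough illustration of that naming scheme, the sketch below pairs data
files with their meta files from a directory listing. It is not DataNode
code: the function name pair_block_files and the dictionary layout are
made up for this example, and allowing a leading minus sign in the block
ID is an assumption.

```python
import re
from typing import Dict, Iterable

# Patterns follow the naming described above: blk_NNNNN and blk_NNNNN_GS.meta.
# Accepting a leading "-" in the block ID is an assumption of this sketch.
DATA_RE = re.compile(r"^blk_(-?\d+)$")
META_RE = re.compile(r"^blk_(-?\d+)_(\d+)\.meta$")


def pair_block_files(names: Iterable[str]) -> Dict[int, dict]:
    """Group file names by block ID, recording data file, meta file and GS."""
    blocks: Dict[int, dict] = {}
    for name in names:
        meta = META_RE.match(name)
        if meta:
            entry = blocks.setdefault(
                int(meta.group(1)), {"data": None, "meta": None, "gen_stamp": None})
            entry["meta"] = name
            entry["gen_stamp"] = int(meta.group(2))
            continue
        data = DATA_RE.match(name)
        if data:
            entry = blocks.setdefault(
                int(data.group(1)), {"data": None, "meta": None, "gen_stamp": None})
            entry["data"] = name
        # Anything else in the directory is ignored.
    return blocks
```

A block whose entry is missing either "data" or "meta" would be
incomplete under this scheme, since the checksums live in the meta file.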


> Here is my guess-standing:
>    --- 1)  I think the reason why losing Namenode metadata can cause total
> data loss for an HDFS cluster is that "BlockReport" does not contain the
> mapping between an HDFS filename and the block-ids (otherwise, the polled
> data might be sufficient to reconstruct the overall HDFS metadata view), so
> I'd like to understand more details.
>

Correct, the DNs have no concept of filename.


>    --- 2)  Namenode's metadata contains "{filename, n-th block} -->
> block-id", and serves as the final authority (from checkpoint and edit log).
> But the metadata does not contain "block-id --> {machineA, machineB,
> machineC}" -- instead, it waits for the BlockReport info from Datanodes.
>

Correct. You can move blocks between DNs while the NN is down and no one
will be the wiser.
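
A toy model of that split, as a sketch only: the persisted namespace maps
files to block IDs, while block locations are rebuilt purely from what
each DN reports. All paths, DN names, and function names below are
hypothetical and do not correspond to real NameNode data structures.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Stands in for what the NN persists (checkpoint + edit log):
# file name -> ordered list of block IDs. Hypothetical sample data.
NAMESPACE: Dict[str, List[int]] = {
    "/logs/a.txt": [101, 102],
    "/logs/b.txt": [103],
}


def rebuild_locations(block_reports: Dict[str, List[int]]) -> Dict[int, Set[str]]:
    """Build block ID -> set of DNs from per-DN block reports."""
    locations: Dict[int, Set[str]] = defaultdict(set)
    for datanode, block_ids in block_reports.items():
        for block_id in block_ids:
            locations[block_id].add(datanode)
    return dict(locations)


def locate_file(path: str, namespace: Dict[str, List[int]],
                locations: Dict[int, Set[str]]) -> List[Tuple[int, List[str]]]:
    """Resolve a file to (block ID, sorted DN list) pairs."""
    return [(block_id, sorted(locations.get(block_id, set())))
            for block_id in namespace[path]]
```

Moving block 101 from one DN to another changes only the next block
report, so rebuild_locations picks up the new location with no change to
NAMESPACE. The reverse does not hold: the reports carry block IDs only,
so NAMESPACE cannot be rebuilt from them.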

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera
