Hi all! We know that HDFS divides a large file into several blocks (64 MB each, with 3 replicas by default). Once the metadata on the NameNode has been modified, a DataStreamer thread transports the blocks to the DataNodes; for each block, the client sends it to the 3 DataNodes through a pipeline.
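To make the setup concrete, here is a minimal write sketch using the public FileSystem API (the path /user/xu/big.dat is made up, and fs.defaultFS must point at a running cluster):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WriteSketch {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // 64 MB blocks, replication 3 (the defaults described above),
            // set explicitly per file; /user/xu/big.dat is a made-up path
            FSDataOutputStream out = fs.create(new Path("/user/xu/big.dat"),
                    true, 4096, (short) 3, 64L * 1024 * 1024);
            byte[] buf = new byte[4096];
            for (int i = 0; i < 100000; i++) {   // ~400 MB -> several blocks
                out.write(buf);                  // DataStreamer pushes packets down
            }                                    // the pipeline behind the scenes
            out.close();
        }
    }

Writing about 400 MB this way produces six or seven 64 MB blocks, which is the situation my questions below are about.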
For context, this is the client-side path I am looking at in DFSClient:

    dfsClient.namenode.create(src, masked, dfsClient.clientName,
        new EnumSetWritable<CreateFlag>(flag), createParent, replication, blockSize);
    streamer = new DataStreamer();
    streamer.start();

I am just wondering how the cluster chooses which DataNodes store the blocks; what is the policy? Also, a file may consist of many blocks: in what sequence are these blocks transported? From what I read in the code, there is only one thread doing this from the client to the DataNodes.
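In the meantime I can at least observe where the replicas end up; a minimal sketch, again assuming the made-up file path from above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShowBlockLocations {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FileStatus st = fs.getFileStatus(new Path("/user/xu/big.dat"));
            BlockLocation[] blocks = fs.getFileBlockLocations(st, 0, st.getLen());
            for (int i = 0; i < blocks.length; i++) {
                // each entry lists the hosts holding the replicas of block i
                System.out.println("block " + i + " -> "
                        + java.util.Arrays.toString(blocks[i].getHosts()));
            }
        }
    }

This shows me the resulting placement, but not the policy behind it. Any answer or URL is appreciated. Thanks!

Best regards,
xu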