Thanks Dhruba, that makes sense. The data was already on the master node, and I did not consider that I could upload from other nodes too. The distribution across the slave nodes is uniform, and your response explains why the one other, bigger box did not get a larger share of the blocks. Noting your use of the word "attempts": can I conclude that at some point it might become impossible to store a local file's blocks on the same node, and that from then on the blocks would all be placed elsewhere?
Jeff

-----Original Message-----
From: dhruba Borthakur [mailto:[EMAIL PROTECTED]
Sent: Thursday, December 20, 2007 9:38 AM
To: hadoop-user@lucene.apache.org
Subject: RE: DFS Block Allocation

Hi Jeff,

Did you run the file-upload command on the master node itself? The DFS client attempts to store one replica of the data on the node on which the DFSClient is running. To get a uniform distribution, it would be good if you upload your data from multiple nodes in your cluster.

Thanks,
dhruba

-----Original Message-----
From: Jeff Eastman [mailto:[EMAIL PROTECTED]
Sent: Thursday, December 20, 2007 7:15 AM
To: hadoop-user@lucene.apache.org
Subject: DFS Block Allocation

I've brought up a small cluster and uploaded some large files. The master node is cu027, and it seems to be getting an unfair percentage of the blocks allocated to it, especially compared to cu171, which has the same size disk. Can somebody shed some light on the reasons for this?

Jeff
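For the archives, here is a minimal sketch of the kind of per-node upload Dhruba describes, written against Hadoop's FileSystem Java API; the class name and file paths are illustrative assumptions, not code from this thread. The point is simply that whichever datanode the client runs on receives the first replica of every block it writes, so running one such copy on each node that holds a slice of the input spreads the blocks across the cluster.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Run one instance on each node holding a slice of the input.
    // The local node gets the first replica of every block written
    // here; the remaining replicas go to other datanodes.
    public class UploadSlice {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up the cluster config
        FileSystem dfs = FileSystem.get(conf);    // connect to the DFS
        // Copy this node's local slice into the DFS
        // (args[0] = local path, args[1] = DFS path; both hypothetical).
        dfs.copyFromLocalFile(new Path(args[0]), new Path(args[1]));
        dfs.close();
      }
    }

Invoked on each node with that node's slice, e.g. (paths hypothetical):

    bin/hadoop jar upload.jar UploadSlice /local/slice-01 /user/jeff/data/slice-01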