Todd, Much better. I also had to adjust the number of map tasks for the gen step. Storage was spread across 3 nodes instead of 4, but a bit more playing with these parameters should do the trick.
Thanks for your help. Jeff From: Todd Lipcon [mailto:t...@cloudera.com] Sent: Friday, February 25, 2011 8:09 PM To: hdfs-user@hadoop.apache.org Cc: Jeffrey Buell Subject: Re: balancing and replication in HDFS When you run terasort, pass -Dmapred.reduce.tasks=4 and see how that goes for you. See this old thread for info: http://mail-archives.apache.org/mod_mbox/hadoop-common-user/200906.mbox/%3ccbbf4b570906300617ma4505f5o2aa1b9fb87b31...@mail.gmail.com%3E -Todd On Fri, Feb 25, 2011 at 4:45 PM, Jeffrey Buell <jbu...@vmware.com<mailto:jbu...@vmware.com>> wrote: Hi Todd, Thanks for the quick response. I added <final>true</final> to dfs.replication, but I still get just one output copy. Can hadoop apps overwrite the replication level even with this parameter? I tried increasing mapred.tasktracker.reduce.tasks.maximum from 1 to 4, but that didn't make any difference: output is still all on one node. I thought that parameter controls the number of reduce tasks per node so 1 should be sufficient, is that correct logic? The other reduce parameters are at their defaults. Which parameters should I be trying? There's no way the sort can run efficiently if the gen is unbalanced. At 5 GB, there're plenty of blocks to make the distribution even. How does HDFS decide to spread the storage across the nodes? Note the sizes of the 4 chunks (1.7,...,3.2 GB) seem to be repeatable, but they land on different nodes each time teragen is run. Jeff > -----Original Message----- > From: Todd Lipcon [mailto:t...@cloudera.com<mailto:t...@cloudera.com>] > Sent: Friday, February 25, 2011 3:13 PM > To: hdfs-user@hadoop.apache.org<mailto:hdfs-user@hadoop.apache.org> > Subject: Re: balancing and replication in HDFS > > Hi Jeff, > The output of terasort has replication level 1 by default. This is so > it goes faster with the default settings and makes for more impressive > benchmark results :) > The reason you see it all on one machine is probably that you're > running with one reducer. Try configuring your terasort to use more > reduce tasks and you should see the load (and space usage) even out. > -Todd > > On Fri, Feb 25, 2011 at 2:52 PM, Jeffrey Buell > <jbu...@vmware.com<mailto:jbu...@vmware.com>> > wrote: > > > > I'm a newbie to hadoop and HDFS. I'm seeing odd behavior in HDFS > that I hope somebody can clear up for me. I'm running hadoop version > 0.20.1+169.127 from the cloudera distro on 4 identical nodes, each with > 4 cpus and 100GB disk space. Replication is set to 2. > > > > I run: > > > > hadoop jar /usr/lib/hadoop/hadoop-*-examples.jar teragen 50000000 > tera_in5 > > > > This produces the expected 10GB of data on disk (5GB * 2 copies). > But the data is spread very unevenly across the nodes, ranging from > 1.7 to 3.2 GB on each node. Then I sort the data: > > > > hadoop jar /usr/lib/hadoop/hadoop-*-examples.jar terasort tera_in5 > tera_out5 > > > > It finishes successfully, and HDFS recognizes the right amount of > data: > > > > $ hadoop fs -du /user/hadoop/ > > Found 2 items > > 5000023410 hdfs://namd-1/user/hadoop/tera_in5 > > 5000170993 hdfs://namd-1/user/hadoop/tera_out5 > > > > However all the new data is on one node (apparently randomly chosen), > and the total disk usage is only 15GB, which means that the output data > is not replicated. For nearly all the elapsed time of the sort, the > other 3 nodes are idle. Some of the output data is in > dfs/data/current, but a lot is in one of 64 new subdirs > (dfs/data/current/subdir0 through subdir63). > > > > Why is all this happening? Am I missing some tunables that make HDFS > do the right balance and replication? > > > > Thanks, > > > > Jeff > > > > -- > Todd Lipcon > Software Engineer, Cloudera -- Todd Lipcon Software Engineer, Cloudera