I get this as well... It seems to be due to delayed block deletion in HDFS. Once the cluster settles down for a while, the blocks get deleted and we're back in balance.
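For what it's worth, the arithmetic below is consistent with that: 120 GB at replication 2 is only ~240 GB of raw blocks, so the ~400 GB beyond the ~300 GB you expected is plausibly store files that compactions have already rewritten but whose blocks HDFS hasn't deleted yet. If you want to watch the deletes go through, something like the sketch below prints the same per-node numbers as 'hadoop dfsadmin -report', so you can poll it and see DFS Used fall. (This is just an illustration, not a supported tool: it assumes a Hadoop client with the cluster config on its classpath, DfsUsageCheck is an invented name, and on 0.19 these classes live under org.apache.hadoop.dfs rather than org.apache.hadoop.hdfs.)

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.hdfs.DistributedFileSystem;
  import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

  public class DfsUsageCheck {
    public static void main(String[] args) throws Exception {
      // Picks up fs.default.name etc. from the config files on the classpath.
      FileSystem fs = FileSystem.get(new Configuration());
      if (!(fs instanceof DistributedFileSystem)) {
        System.err.println("fs.default.name does not point at HDFS");
        return;
      }
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      // Same per-node numbers as 'hadoop dfsadmin -report'.
      for (DatanodeInfo dn : dfs.getDataNodeStats()) {
        System.out.printf("%-25s DFS Used: %d bytes (%.2f%% of capacity)%n",
            dn.getName(), dn.getDfsUsed(),
            100.0 * dn.getDfsUsed() / dn.getCapacity());
      }
    }
  }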
On Wed, May 13, 2009 at 10:56 AM, Alexandra Alecu <[email protected]> wrote:

> I am an HBase/Hadoop beginner.
> As an initial test, I am trying to import about 120 GB of records into
> one big table in HBase (replication level 2). I have an HBase master and
> a Hadoop namenode running on two separate machines, and 4 other nodes
> running the datanodes and regionservers. Each datanode has approx. 400 GB
> of local storage.
>
> I have done a few tests previously with HBase 0.19.1 and kept running
> into problems related to slow compactions (HBASE-1058). I have now
> installed HBase 0.19.2, and one thing I noticed is that the disk usage
> during the import is much higher and the datanodes come out very
> unbalanced.
>
> Whereas with HBase 0.19.1 I used to fill about 300 GB, nicely balanced,
> now I have filled about 700 GB: 100 GB on each of 3 datanodes, while the
> fourth node gets completely full (400 GB), causing the import to slow
> down and eventually fail, unable to contact one of the .META. regions.
>
> I stopped HBase and tried to balance HDFS, which informed me:
>
> 09/05/13 17:34:38 INFO balancer.Balancer: Need to move 177.92 GB bytes to
> make the cluster balanced.
>
> After this, with Hadoop hard at work balancing, it seems to fail to move
> blocks 50% of the time. Should I worry about these errors/warnings?
>
> 09/05/13 17:34:38 WARN balancer.Balancer: Error moving block
> -6198018159178133648 from 131.111.70.215:50010 to 131.111.70.214:50010
> through 131.111.70.216:50010: block move is failed
>
> Watching the balancing process, the HDFS usage constantly decreases,
> ending up at a value closer to what I expected. Essentially, it looks
> like the balancing has wiped the data which was causing this one datanode
> to fill up to almost 100%. Maybe this data came from the delayed
> compactions, or from logs which still need to be replayed on the cluster.
>
> This is the situation towards the end of the balancing:
>
> Datanodes available: 4 (4 total, 0 dead)
>
> Name: 1
> Configured Capacity: 433309891584 (403.55 GB)
> DFS Used: 88593623040 (82.51 GB)
> DFS Used%: 20.45%
> DFS Remaining%: 78.51%
> Last contact: Wed May 13 18:48:09 BST 2009
>
> Name: 2
> Configured Capacity: 433309891584 (403.55 GB)
> DFS Used: 89317653511 (83.18 GB)
> DFS Used%: 20.61%
> DFS Remaining%: 78.34%
> Last contact: Wed May 13 18:48:10 BST 2009
>
> Name: 3
> Configured Capacity: 433309891584 (403.55 GB)
> DFS Used: 89644974080 (83.49 GB)
> DFS Used%: 20.69%
> DFS Remaining%: 78.27%
> Last contact: Wed May 13 18:48:10 BST 2009
>
> Name: 4
> Configured Capacity: 433309891584 (403.55 GB)
> DFS Used: 138044233537 (128.56 GB)
> DFS Used%: 31.86%
> DFS Remaining%: 67.07%
> Last contact: Wed May 13 18:48:10 BST 2009
>
> Before the balancing, datanode no. 4 was using approx. 400 GB.
>
> What are your comments on this behaviour? Is this something you expected?
>
> Let me know if you need me to provide more information.
>
> Many thanks,
> Alexandra Alecu.
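One more observation on the end state you pasted: the balancer only moves blocks until every datanode's utilization is within a threshold (10% by default) of the cluster-wide average. In that report the average works out to roughly 23.4%, so node 4 at 31.86% already counts as balanced under the default threshold; 'hadoop balancer -threshold 5' would squeeze it tighter. Here is a rough sketch of that check, under the same assumptions as the snippet above (BalanceCheck is likewise an invented name):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.hdfs.DistributedFileSystem;
  import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

  public class BalanceCheck {
    public static void main(String[] args) throws Exception {
      double threshold = 10.0; // same meaning as 'hadoop balancer -threshold 10'
      DistributedFileSystem dfs =
          (DistributedFileSystem) FileSystem.get(new Configuration());
      DatanodeInfo[] nodes = dfs.getDataNodeStats();

      long used = 0, capacity = 0;
      for (DatanodeInfo dn : nodes) {
        used += dn.getDfsUsed();
        capacity += dn.getCapacity();
      }
      double avg = 100.0 * used / capacity; // cluster-wide average utilization

      for (DatanodeInfo dn : nodes) {
        double u = 100.0 * dn.getDfsUsed() / dn.getCapacity();
        String state = u > avg + threshold
            ? "over-utilized (balancer moves blocks off it)"
            : u < avg - threshold
                ? "under-utilized (balancer moves blocks onto it)"
                : "within threshold, left alone";
        System.out.printf("%-25s %.2f%% (cluster avg %.2f%%): %s%n",
            dn.getName(), u, avg, state);
      }
    }
  }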
