Alexandra,

I have noticed the same behavior, both the uneven block placement and the 
errors when running the balancer. DFS defers block deletions until some time 
when the cluster is lightly loaded. I believe there is a config setting that 
can alter that behavior, but I do not recall it offhand. As for the balancer 
errors, when I see them I restart the balancer and that clears the problem. I 
suspect a bug, but have not had time to dig in.
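
Restarting the balancer is just a matter of the stock stop/start scripts, 
e.g. (assuming $HADOOP_HOME points at your Hadoop install):

    $HADOOP_HOME/bin/stop-balancer.sh
    $HADOOP_HOME/bin/start-balancer.sh

Raising the balancer's bandwidth cap in hadoop-site.xml may also cut down on 
failed moves, though I have not confirmed that against this particular error:

    <property>
      <name>dfs.balance.bandwidthPerSec</name>
      <!-- bytes/sec per datanode; default is 1048576 (1 MB/s) -->
      <value>10485760</value>
    </property>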

Also, please be advised that you should not run the HDFS balancer while HBase 
is running. Due to issues with how the DFS client library caches block 
locations, HBase can run into trouble if blocks are deleted from locations 
where they were previously known to exist. If you feel your cluster requires 
balancing, shut down HBase first, do the balancing, then restart HBase, along 
the lines of the sketch below.
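
A minimal sketch of that sequence, assuming the stock scripts and that 
$HBASE_HOME and $HADOOP_HOME point at your installs:

    $HBASE_HOME/bin/stop-hbase.sh
    $HADOOP_HOME/bin/start-balancer.sh -threshold 5
    # wait for the balancer daemon to exit before restarting HBase
    $HBASE_HOME/bin/start-hbase.sh

The -threshold argument is the allowed deviation, in percent, of each 
datanode's usage from the cluster average; the default is 10.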

    - Andy




________________________________
From: Alexandra Alecu <[email protected]>
To: [email protected]
Sent: Wednesday, May 13, 2009 10:56:04 AM
Subject: Hbase 0.19.2 - Large import results in heavily unbalanced hadoop DFS


I am an HBase/Hadoop beginner. 
As an initial test, I am trying to import about 120 GB of records into one
big table in HBase (replication level 2).  I have an HBase master and a
Hadoop namenode running on two separate machines, and four other nodes
running the datanodes and regionservers. Each datanode has approximately
400 GB of local storage.

I did a few tests previously with HBase 0.19.1 and kept running into
problems related to slow compactions (HBASE-1058).  I have now installed
HBase 0.19.2, and one thing I noticed is that disk usage during the import
is much higher and the datanodes end up very unbalanced. 

Whereas with HBase 0.19.1 I used to fill about 300 GB, nicely balanced, I
have now filled about 700 GB: roughly 100 GB on each of three of the
datanodes, while the fourth fills up completely (400 GB), causing the import
to slow down and eventually fail, unable to contact one of the .META.
regions.

I stopped HBase and tried to balance HDFS, which reported:

09/05/13 17:34:38 INFO balancer.Balancer: Need to move 177.92 GB bytes to
make the cluster balanced.

After this, with Hadoop hard at work balancing, it seems to fail to move
blocks about 50% of the time. Should I worry about errors/warnings like the
following?
09/05/13 17:34:38 WARN balancer.Balancer: Error moving block
-6198018159178133648 from 131.111.70.215:50010 to 131.111.70.214:50010
through 131.111.70.216:50010: block move is failed

Watching the balancing process, HDFS usage decreases steadily, ending up at
a value closer to what I expected. Essentially, it looks like the balancing
has wiped the data that was causing this one datanode to fill up to almost
100%. Maybe that data came from the delayed compactions, or from logs that
still need to be replayed on the cluster.

This is the situation towards the end of the balancing: 

Datanodes available: 4 (4 total, 0 dead)

Name: 1
Configured Capacity: 433309891584 (403.55 GB)
DFS Used: 88593623040 (82.51 GB)
DFS Used%: 20.45%
DFS Remaining%: 78.51%
Last contact: Wed May 13 18:48:09 BST 2009

Name: 2
Configured Capacity: 433309891584 (403.55 GB)
DFS Used: 89317653511 (83.18 GB)
DFS Used%: 20.61%
DFS Remaining%: 78.34%
Last contact: Wed May 13 18:48:10 BST 2009

Name: 3
Configured Capacity: 433309891584 (403.55 GB)
DFS Used: 89644974080 (83.49 GB)
DFS Used%: 20.69%
DFS Remaining%: 78.27%
Last contact: Wed May 13 18:48:10 BST 2009

Name: 4
Configured Capacity: 433309891584 (403.55 GB)
DFS Used: 138044233537 (128.56 GB)
DFS Used%: 31.86%
DFS Remaining%: 67.07%
Last contact: Wed May 13 18:48:10 BST 2009
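
For reference, the report above is the standard datanode report, i.e. the
output of:

    hadoop dfsadmin -report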

Before the balancing, datanode 4 was using approximately 400 GB.

What are your comments on this behaviour?  Is this something that you
expected?

Let me know if you need me to provide more information. 

Many thanks,
Alexandra Alecu.


