Alexandra,
I notice the same behavior, both the uneven block placement and the errors
running the balancer. DFS will wait to do the deletions until some time when
the cluster is lightly loaded. I believe there is a config setting which can
alter that behavior, but I do not recall it offhand. Regarding balancer errors,
when I see these I restart the balancer and that clears the problem. I wonder
if it is some bug, but I have not had time to dig in.
Also, please be advised you should not run the HDFS balancer while HBase is
running. Due to some issues with how the DFS client library caches block
locations, HBase can run into trouble if blocks are deleted from locations
where they were previously known to exist. If you feel your cluster requires
balancing, shut down HBase first, do the balancing, then restart.
- Andy
________________________________
From: Alexandra Alecu <[email protected]>
To: [email protected]
Sent: Wednesday, May 13, 2009 10:56:04 AM
Subject: Hbase 0.19.2 - Large import results in heavily unbalanced hadoop DFS
I am an HBase/Hadoop beginner.
As an initial test, I am trying to import about 120GB of records into one
big table in HBase (replication level 2). I have a HBase master and a
Hadoop namenode running on two separate machines and 4 other nodes running
the datanodes and regionservers. Each datanode has approx 400 GB local
storage.
I have done a few tests previously with HBase 0.19.1 and kept running into
problems related to slow compactions (HBASE-1058). I have now installed
HBase 0.19.2, and one thing I noticed is that disk usage during the import
is much higher and the datanodes come out very unbalanced.
Whereas with HBase 0.19.1 I used to fill about 300 GB, nicely balanced, now
I have filled about 700 GB: 100 GB on each of three datanodes, while the
fourth fills up completely (400 GB), causing the import to slow down and
eventually fail, unable to contact one of the .META. regions.
I stopped HBase and tried to balance HDFS, which reported:
09/05/13 17:34:38 INFO balancer.Balancer: Need to move 177.92 GB bytes to
make the cluster balanced.
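For what it's worth, that 177.92 GB figure is roughly consistent with the usage
described above. Here is a simplified sketch of the balancer's math (a model,
not the actual Balancer code), using the approximate 100/100/100/400 GB
per-node figures and the balancer's default 10% threshold:

```python
# Simplified model of how the HDFS balancer estimates bytes to move
# (a sketch, not the real Balancer code). Node usages are the rough
# pre-balancing figures from the description above.

CAPACITY_GB = 403.55                     # per-node capacity, from dfsadmin -report
used_gb = [100.0, 100.0, 100.0, 400.0]   # approximate DFS usage per node
THRESHOLD = 10.0                         # default balancer threshold (percentage points)

avg_util = 100.0 * sum(used_gb) / (CAPACITY_GB * len(used_gb))

over = under = 0.0
for u in used_gb:
    util = 100.0 * u / CAPACITY_GB
    if util > avg_util + THRESHOLD:      # over-utilized: excess above the band
        over += (util - avg_util - THRESHOLD) / 100.0 * CAPACITY_GB
    elif util < avg_util - THRESHOLD:    # under-utilized: deficit below the band
        under += (avg_util - THRESHOLD - util) / 100.0 * CAPACITY_GB

bytes_to_move_gb = max(over, under)
print(f"average utilization: {avg_util:.2f}%")
print(f"need to move about {bytes_to_move_gb:.1f} GB")
```

With these inputs the sketch lands near 185 GB, in the same ballpark as the
reported 177.92 GB; the gap comes from the input numbers being approximate.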
After this, with Hadoop hard at work balancing, it seems to fail to move
blocks about 50% of the time. Should I worry about these errors/warnings?
09/05/13 17:34:38 WARN balancer.Balancer: Error moving block
-6198018159178133648 from 131.111.70.215:50010 to 131.111.70.214:50010
through 131.111.70.216:50010: block move is failed
Watching the balancing process, the HDFS usage steadily decreases, ending at
a value closer to what I expected. Essentially, the balancing seems to have
wiped the data that was causing this one datanode to fill up to almost 100%.
Maybe this data came from the delayed compactions, or from logs that still
need to be replayed on the cluster.
This is the situation towards the end of the balancing:
Datanodes available: 4 (4 total, 0 dead)
Name: 1
Configured Capacity: 433309891584 (403.55 GB)
DFS Used: 88593623040 (82.51 GB)
DFS Used%: 20.45%
DFS Remaining%: 78.51%
Last contact: Wed May 13 18:48:09 BST 2009
Name: 2
Configured Capacity: 433309891584 (403.55 GB)
DFS Used: 89317653511 (83.18 GB)
DFS Used%: 20.61%
DFS Remaining%: 78.34%
Last contact: Wed May 13 18:48:10 BST 2009
Name: 3
Configured Capacity: 433309891584 (403.55 GB)
DFS Used: 89644974080 (83.49 GB)
DFS Used%: 20.69%
DFS Remaining%: 78.27%
Last contact: Wed May 13 18:48:10 BST 2009
Name: 4
Configured Capacity: 433309891584 (403.55 GB)
DFS Used: 138044233537 (128.56 GB)
DFS Used%: 31.86%
DFS Remaining%: 67.07%
Last contact: Wed May 13 18:48:10 BST 2009
Before the balancing, datanode no. 4 was using approx. 400 GB.
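Plugging these end-of-balancing numbers into the same average-plus-threshold
rule the balancer uses (again a simplified sketch, assuming the default 10%
threshold, not the actual Balancer code) suggests why it stops with node 4
still heavier than the rest:

```python
# End-of-balancing figures from the dfsadmin -report above, in bytes.
# Simplified model of the balancer's stopping rule, assuming the
# default 10% threshold.

CAPACITY = 433309891584  # identical configured capacity per datanode
used = {
    "node1": 88593623040,
    "node2": 89317653511,
    "node3": 89644974080,
    "node4": 138044233537,
}
THRESHOLD = 10.0  # percentage points

avg_util = 100.0 * sum(used.values()) / (CAPACITY * len(used))

# A node only counts as over-utilized once it exceeds avg + threshold;
# anything inside that band is left alone.
excess_gb = {}
for name, u in used.items():
    util = 100.0 * u / CAPACITY
    excess = max(0.0, util - (avg_util + THRESHOLD)) / 100.0 * CAPACITY
    excess_gb[name] = excess / 2**30

print(f"cluster average utilization: {avg_util:.2f}%")
print(f"node4 over-threshold GB: {excess_gb['node4']:.2f}")
```

Node 4 sits at 31.86% against a cluster average of about 23.40%, which is
inside the 10-point band, so a default-threshold balancer run would treat the
cluster as balanced and stop there; a smaller `-threshold` value would push
the balancing further.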
What are your comments on this behaviour? Is this something that you
expected?
Let me know if you need me to provide more information.
Many thanks,
Alexandra Alecu.
--
View this message in context:
http://www.nabble.com/Hbase-0.19.2---Large-import-results-in-heavily-unbalanced-hadoop-DFS-tp23526652p23526652.html
Sent from the HBase User mailing list archive at Nabble.com.