You're looking at sizes on disk? Then this has nothing to do with HBase load balancing.
HBase does not move blocks around at the HDFS layer or deal with which physical disks are used; that is completely the responsibility of HDFS.

Periodically HBase will perform major compactions on regions, which causes data to be rewritten. This creates new files, so it could change what is in HDFS.

JG

> -----Original Message-----
> From: Tao Xie [mailto:[email protected]]
> Sent: Monday, September 06, 2010 8:38 PM
> To: [email protected]
> Subject: Re: question about RegionManager
>
> Actually, I'm a newbie of HBase. I went to read the code of region
> assignment because I met a load imbalance problem in my HBase cluster.
> I run a 1+6 node HBase cluster: 1 node as master & client, the other
> nodes as region servers and data nodes. I run YCSB to insert records.
> During the inserts, I find the data written to the data nodes has
> different sizes on disk. I think HDFS does well at balancing writes,
> so is this problem due to HBase?
>
> Btw, minutes after the writes finished, the disks became balanced. I
> think maybe there is a LoadBalancer-like daemon thread working on
> this. Can anyone explain this? Many thanks.
>
> After inserting 160M 1k records, my six datanodes are greatly
> imbalanced.
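The skew described above is what HDFS's default placement policy tends to produce when only a few nodes are writing: the first replica of each block goes to the writer's local datanode, and the remaining replicas are spread across other nodes. Here is a minimal toy simulation of that policy; the class name, the uniform-random choice of remote nodes, and the 3x replication factor are my assumptions for illustration, not the actual HDFS code.

```java
import java.util.*;

// Toy model of HDFS default block placement: replica 1 goes to the
// writer's local datanode, the remaining replicas go to random other
// nodes. With only a couple of "hot" writers, their local disks fill
// noticeably faster than the rest of the cluster.
public class PlacementSketch {
    public static Map<String, Integer> simulate(List<String> nodes,
            List<String> writers, int blocksPerWriter, int replication,
            long seed) {
        Map<String, Integer> blocks = new HashMap<>();
        for (String n : nodes) blocks.put(n, 0);
        Random rnd = new Random(seed);
        for (String w : writers) {
            for (int b = 0; b < blocksPerWriter; b++) {
                blocks.merge(w, 1, Integer::sum);      // local first replica
                Set<String> chosen = new HashSet<>();
                chosen.add(w);
                while (chosen.size() < replication) {  // remote replicas
                    String n = nodes.get(rnd.nextInt(nodes.size()));
                    if (chosen.add(n)) blocks.merge(n, 1, Integer::sum);
                }
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("dn1","dn2","dn3","dn4","dn5","dn6");
        // Suppose the regions taking writes live on dn1 and dn2 only.
        Map<String, Integer> blocks =
            simulate(nodes, Arrays.asList("dn1","dn2"), 1000, 3, 42L);
        // dn1 and dn2 each hold 1000 local replicas plus a share of the
        // remote ones; dn3..dn6 receive only remote replicas.
        System.out.println(blocks);
    }
}
```

If only two or three region servers are hosting the regions being written during the YCSB run, this would match the pattern in the `df` output below, where a minority of datanodes fill up while the rest stay near empty.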
> > 10.1.0.125: /dev/sdb1  280G  89G  178G  34%  /mnt/DP_disk1
> > 10.1.0.125: /dev/sdc1  280G  91G  176G  35%  /mnt/DP_disk2
> > 10.1.0.125: /dev/sdd1  280G  91G  176G  34%  /mnt/DP_disk3
> > 10.1.0.121: /dev/sdb1  280G  15G  251G   6%  /mnt/DP_disk1
> > 10.1.0.121: /dev/sdc1  280G  16G  250G   6%  /mnt/DP_disk2
> > 10.1.0.121: /dev/sdd1  280G  15G  251G   6%  /mnt/DP_disk3
> > 10.1.0.122: /dev/sdb1  280G  15G  251G   6%  /mnt/DP_disk1
> > 10.1.0.122: /dev/sdc1  280G  15G  252G   6%  /mnt/DP_disk2
> > 10.1.0.122: /dev/sdd1  280G  13G  253G   5%  /mnt/DP_disk3
> > 10.1.0.124: /dev/sdb1  280G  14G  253G   5%  /mnt/DP_disk1
> > 10.1.0.124: /dev/sdc1  280G  15G  252G   6%  /mnt/DP_disk2
> > 10.1.0.124: /dev/sdd1  280G  14G  253G   6%  /mnt/DP_disk3
> > 10.1.0.123: /dev/sdb1  280G  66G  200G  25%  /mnt/DP_disk1
> > 10.1.0.123: /dev/sdc1  280G  65G  201G  25%  /mnt/DP_disk2
> > 10.1.0.123: /dev/sdd1  280G  65G  202G  25%  /mnt/DP_disk3
> > 10.1.0.126: /dev/sdb1  280G  14G  252G   6%  /mnt/DP_disk1
> > 10.1.0.126: /dev/sdc1  280G  14G  252G   6%  /mnt/DP_disk2
> > 10.1.0.126: /dev/sdd1  280G  13G  253G   5%  /mnt/DP_disk3
>
> 2010/9/7 Tao Xie <[email protected]>
>
> > I had a look at the following method in 0.89. Is the following line
> > correct?
> >
> >     nRegions *= e.getValue().size();
> >
> > private int regionsToGiveOtherServers(final int numUnassignedRegions,
> >     final HServerLoad thisServersLoad) {
> >   SortedMap<HServerLoad, Set<String>> lightServers =
> >     new TreeMap<HServerLoad, Set<String>>();
> >   this.master.getLightServers(thisServersLoad, lightServers);
> >   // Examine the list of servers that are more lightly loaded than this one.
> >   // Pretend that we will assign regions to these more lightly loaded servers
> >   // until they reach load equal with ours. Then, see how many regions are left
> >   // unassigned. That is how many regions we should assign to this server.
> >   int nRegions = 0;
> >   for (Map.Entry<HServerLoad, Set<String>> e: lightServers.entrySet()) {
> >     HServerLoad lightLoad = new HServerLoad(e.getKey());
> >     do {
> >       lightLoad.setNumberOfRegions(lightLoad.getNumberOfRegions() + 1);
> >       nRegions += 1;
> >     } while (lightLoad.compareTo(thisServersLoad) <= 0
> >         && nRegions < numUnassignedRegions);
> >     nRegions *= e.getValue().size();
> >     if (nRegions >= numUnassignedRegions) {
> >       break;
> >     }
> >   }
> >   return nRegions;
> > }
> >
> > 2010/9/7 Jonathan Gray <[email protected]>
> >
> >> That code does actually exist in the latest 0.89 release.
> >>
> >> It was a protection put in place to guard against a weird behavior
> >> that we had seen during load balancing.
> >>
> >> As Ryan suggests, this code was in need of a rewrite and a replacement
> >> was just committed last week to trunk/0.90. If you're interested in
> >> the new load balancing code, it's in o.a.h.h.regionserver.LoadBalancer.
> >>
> >> At the least, you should upgrade to 0.20.6, as it carries some
> >> important fixes since 0.20.4 (until 0.90 is released, at which point
> >> everyone should move to it).
> >>
> >> JG
> >>
> >> > -----Original Message-----
> >> > From: Ryan Rawson [mailto:[email protected]]
> >> > Sent: Monday, September 06, 2010 7:10 PM
> >> > To: [email protected]
> >> > Subject: Re: question about RegionManager
> >> >
> >> > That code was completely rewritten in 0.89/0.90... it's pretty
> >> > dodgy, so I'd seriously consider upgrading to 0.89 asap.
> >> >
> >> > > hi, all
> >> > >
> >> > > I'm reading the code of RegionManager. In the following method I
> >> > > find a situation where, when nRegionsToAssign <= nregions, the
> >> > > code only assigns 1 region. Is this correct? HBase version 0.20.4.
> >> > >
> >> > > private void assignRegionsToMultipleServers(final HServerLoad
> >> > >     thisServersLoad,
> >> > >     final Set<RegionState> regionsToAssign, final HServerInfo info,
> >> > >     final ArrayList<HMsg> returnMsgs) {
> >> > >   boolean isMetaAssign = false;
> >> > >   for (RegionState s : regionsToAssign) {
> >> > >     if (s.getRegionInfo().isMetaRegion())
> >> > >       isMetaAssign = true;
> >> > >   }
> >> > >   int nRegionsToAssign = regionsToAssign.size();
> >> > >   // Now many regions to assign this server.
> >> > >   int nregions = regionsPerServer(nRegionsToAssign, thisServersLoad);
> >> > >   LOG.debug("Assigning for " + info + ": total nregions to assign=" +
> >> > >       nRegionsToAssign + ", nregions to reach balance=" + nregions +
> >> > >       ", isMetaAssign=" + isMetaAssign);
> >> > >   if (nRegionsToAssign <= nregions) {
> >> > >     // I do not know whats supposed to happen in this case. Assign one.
> >> > >     LOG.debug("Assigning one region only (playing it safe..)");
> >> > >     assignRegions(regionsToAssign, 1, info, returnMsgs);
> >> > >   } else {
