Actually, I'm new to HBase. I went to read the region assignment code because I ran into a load imbalance problem in my HBase cluster. I run a 1+6 node cluster: one node as master & client, the other six as region servers and datanodes. I run YCSB to insert records. During the insert, I find that the data written to the datanodes takes up very different amounts of disk space. I believe HDFS on its own balances writes well, so is this imbalance caused by HBase?
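One plausible explanation (worth confirming against your cluster): HDFS places the first replica of every block on the node that writes it, and each region server writes its HFiles through its local datanode, so nodes hosting more regions (or hotter key ranges) fill up faster even when the remaining replicas are spread evenly. Below is a toy Java simulation of that effect; all names are hypothetical, and remote replica placement is modeled as uniformly random rather than HDFS's actual rack-aware policy.

```java
import java.util.Arrays;
import java.util.Random;

public class LocalitySkewDemo {
    // Toy model of HDFS replica placement as seen from HBase: the region
    // server writing an HFile gets replica 1 on its local datanode;
    // replicas 2 and 3 land on other nodes (modeled here as random).
    // Returns blocksOnNode[i] = block replicas stored on datanode i.
    static long[] simulate(int[] regionsPerServer, int blocksPerRegion, long seed) {
        int n = regionsPerServer.length;
        long[] blocksOnNode = new long[n];
        Random rnd = new Random(seed);
        for (int server = 0; server < n; server++) {
            int blocks = regionsPerServer[server] * blocksPerRegion;
            for (int b = 0; b < blocks; b++) {
                blocksOnNode[server]++;            // replica 1: always local
                for (int r = 0; r < 2; r++) {      // replicas 2 and 3: remote
                    int other;
                    do { other = rnd.nextInt(n); } while (other == server);
                    blocksOnNode[other]++;
                }
            }
        }
        return blocksOnNode;
    }

    public static void main(String[] args) {
        // 6 datanodes; the server on node 0 hosts 10 regions, the rest 2 each.
        // Node 0 ends up with far more local data than its peers.
        long[] usage = simulate(new int[]{10, 2, 2, 2, 2, 2}, 1000, 42L);
        System.out.println(Arrays.toString(usage));
    }
}
```

If region assignment is skewed during the load, this local-first placement alone reproduces the kind of per-node disk imbalance shown below.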
Btw, a few minutes after the writing finishes, the disks do end up balanced, so I suspect there is a LoadBalancer-like daemon thread working on this. Can anyone explain? Many thanks.

After inserting 160M 1k records, my six datanodes are greatly imbalanced:

10.1.0.125: /dev/sdb1 280G 89G 178G 34% /mnt/DP_disk1
10.1.0.125: /dev/sdc1 280G 91G 176G 35% /mnt/DP_disk2
10.1.0.125: /dev/sdd1 280G 91G 176G 34% /mnt/DP_disk3
10.1.0.121: /dev/sdb1 280G 15G 251G 6% /mnt/DP_disk1
10.1.0.121: /dev/sdc1 280G 16G 250G 6% /mnt/DP_disk2
10.1.0.121: /dev/sdd1 280G 15G 251G 6% /mnt/DP_disk3
10.1.0.122: /dev/sdb1 280G 15G 251G 6% /mnt/DP_disk1
10.1.0.122: /dev/sdc1 280G 15G 252G 6% /mnt/DP_disk2
10.1.0.122: /dev/sdd1 280G 13G 253G 5% /mnt/DP_disk3
10.1.0.124: /dev/sdb1 280G 14G 253G 5% /mnt/DP_disk1
10.1.0.124: /dev/sdc1 280G 15G 252G 6% /mnt/DP_disk2
10.1.0.124: /dev/sdd1 280G 14G 253G 6% /mnt/DP_disk3
10.1.0.123: /dev/sdb1 280G 66G 200G 25% /mnt/DP_disk1
10.1.0.123: /dev/sdc1 280G 65G 201G 25% /mnt/DP_disk2
10.1.0.123: /dev/sdd1 280G 65G 202G 25% /mnt/DP_disk3
10.1.0.126: /dev/sdb1 280G 14G 252G 6% /mnt/DP_disk1
10.1.0.126: /dev/sdc1 280G 14G 252G 6% /mnt/DP_disk2
10.1.0.126: /dev/sdd1 280G 13G 253G 5% /mnt/DP_disk3

2010/9/7 Tao Xie <[email protected]>

> I had a look at the following method in 0.89. Is the following line
> correct?
>
>   nRegions *= e.getValue().size();
>
> private int regionsToGiveOtherServers(final int numUnassignedRegions,
>     final HServerLoad thisServersLoad) {
>   SortedMap<HServerLoad, Set<String>> lightServers =
>     new TreeMap<HServerLoad, Set<String>>();
>   this.master.getLightServers(thisServersLoad, lightServers);
>   // Examine the list of servers that are more lightly loaded than this one.
>   // Pretend that we will assign regions to these more lightly loaded servers
>   // until they reach load equal with ours. Then, see how many regions are left
>   // unassigned. That is how many regions we should assign to this server.
>   int nRegions = 0;
>   for (Map.Entry<HServerLoad, Set<String>> e : lightServers.entrySet()) {
>     HServerLoad lightLoad = new HServerLoad(e.getKey());
>     do {
>       lightLoad.setNumberOfRegions(lightLoad.getNumberOfRegions() + 1);
>       nRegions += 1;
>     } while (lightLoad.compareTo(thisServersLoad) <= 0
>         && nRegions < numUnassignedRegions);
>     nRegions *= e.getValue().size();
>     if (nRegions >= numUnassignedRegions) {
>       break;
>     }
>   }
>   return nRegions;
> }
>
> 2010/9/7 Jonathan Gray <[email protected]>
>
>> That code does actually exist in the latest 0.89 release.
>>
>> It was a protection put in place to guard against a weird behavior that we
>> had seen during load balancing.
>>
>> As Ryan suggests, this code was in need of a rewrite and was just committed
>> last week to trunk/0.90. If you're interested in the new load balancing
>> code, it's in o.a.h.h.regionserver.LoadBalancer.
>>
>> At the least, you should upgrade to 0.20.6, as there are some important
>> fixes from 0.20.4 (until 0.90 is released, at which point everyone should
>> move to it).
>>
>> JG
>>
>> > -----Original Message-----
>> > From: Ryan Rawson [mailto:[email protected]]
>> > Sent: Monday, September 06, 2010 7:10 PM
>> > To: [email protected]
>> > Subject: Re: question about RegionManager
>> >
>> > That code was completely rewritten in 0.89/0.90... it's pretty dodgy, so
>> > I'd highly consider upgrading to 0.89 asap.
>> >
>> > > hi, all
>> > >
>> > > I'm reading the code of RegionManager. In the following method there is
>> > > a situation where nRegionsToAssign <= nregions, and the code only
>> > > assigns 1 region. Is this correct? HBase version 0.20.4.
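To make the arithmetic of the questioned line concrete, here is a standalone simplification of that loop (hypothetical names, not the real HBase types): `gaps[i]` is how many regions it takes to lift load level `i` up to this server's load, and `counts[i]` is how many servers sit at that level. Because `nRegions *= counts[i]` multiplies the *cumulative* count, regions already counted for earlier groups get multiplied again once more than one group is visited; a per-group `+=` accumulation gives a different answer. Whether that is a bug or intended overshoot is exactly the question posed upthread.

```java
public class RegionsMathDemo {
    // Simplified stand-in for the loop in regionsToGiveOtherServers.
    static int withMultiply(int[] gaps, int[] counts, int unassigned) {
        int nRegions = 0;
        for (int i = 0; i < gaps.length; i++) {
            for (int j = 0; j < gaps[i] && nRegions < unassigned; j++) {
                nRegions += 1;          // mirrors the do/while increments
            }
            nRegions *= counts[i];      // the questioned line
            if (nRegions >= unassigned) break;
        }
        return nRegions;
    }

    // The intuitive accumulation: each group contributes gap * groupSize.
    static int withAdd(int[] gaps, int[] counts, int unassigned) {
        int nRegions = 0;
        for (int i = 0; i < gaps.length; i++) {
            nRegions += gaps[i] * counts[i];
            if (nRegions >= unassigned) break;
        }
        return nRegions;
    }

    public static void main(String[] args) {
        // Two load levels below ours: gap 3 with 2 servers, gap 1 with 2 servers.
        int[] gaps = {3, 1}, counts = {2, 2};
        System.out.println(withMultiply(gaps, counts, 100)); // 14
        System.out.println(withAdd(gaps, counts, 100));      // 8
    }
}
```

With a single light group the two agree (3 * 2 either way); the divergence appears only once a second group is entered, which may explain why the behavior looked "weird" only under certain load distributions.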
>> > >
>> > > private void assignRegionsToMultipleServers(final HServerLoad thisServersLoad,
>> > >     final Set<RegionState> regionsToAssign, final HServerInfo info,
>> > >     final ArrayList<HMsg> returnMsgs) {
>> > >   boolean isMetaAssign = false;
>> > >   for (RegionState s : regionsToAssign) {
>> > >     if (s.getRegionInfo().isMetaRegion())
>> > >       isMetaAssign = true;
>> > >   }
>> > >   int nRegionsToAssign = regionsToAssign.size();
>> > >   // Now many regions to assign this server.
>> > >   int nregions = regionsPerServer(nRegionsToAssign, thisServersLoad);
>> > >   LOG.debug("Assigning for " + info + ": total nregions to assign=" +
>> > >       nRegionsToAssign + ", nregions to reach balance=" + nregions +
>> > >       ", isMetaAssign=" + isMetaAssign);
>> > >   if (nRegionsToAssign <= nregions) {
>> > >     // I do not know whats supposed to happen in this case. Assign one.
>> > >     LOG.debug("Assigning one region only (playing it safe..)");
>> > >     assignRegions(regionsToAssign, 1, info, returnMsgs);
>> > >   } else {
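The guard quoted above may also bear on the slow convergence observed at the top of the thread: when the pending regions would not even push this server past the balance point, 0.20.x "plays it safe" and hands out a single region per call. A toy rendering of just that decision (hypothetical simplification; the real else-branch is truncated above and does more work than modeled here):

```java
public class SafeAssignDemo {
    // Toy model of the guard in assignRegionsToMultipleServers (0.20.x).
    // nRegionsToAssign: regions waiting for a home.
    // nregions: regions this server could take before reaching balance.
    static int regionsAssignedThisRound(int nRegionsToAssign, int nregions) {
        if (nRegionsToAssign <= nregions) {
            return 1;       // the "playing it safe" branch: one region only
        }
        return nregions;    // assumed stand-in for the truncated else-branch
    }

    public static void main(String[] args) {
        // 5 regions waiting, room for 10 before balance: only 1 is assigned,
        // so draining the backlog takes one heartbeat per region.
        System.out.println(regionsAssignedThisRound(5, 10)); // 1
    }
}
```

Under steady load this one-at-a-time trickle would make the cluster approach balance only gradually, consistent with disks evening out minutes after the writes stop.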
