Re: Need help trying to balance HBase RegionServer load

Daniel Einspanjer Thu, 17 Jun 2010 08:59:10 -0700

Here is an example of a region split with both daughters being assigned to the same region server. Is this expected?

2010-06-17 08:34:53,060 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_SPLIT_INCLUDES_DAUGHTERS: crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276776160508: Daughters; crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647, crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 from cm-hadoop14.mozilla.org,60020,1276560962019; 1 of 1
2010-06-17 08:34:54,316 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 to cm-hadoop15.mozilla.org,60020,1276778868841
2010-06-17 08:34:54,316 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 to cm-hadoop15.mozilla.org,60020,1276778868841
2010-06-17 08:34:55,432 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_OPEN: crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 from cm-hadoop15.mozilla.org,60020,1276778868841; 1 of 1
2010-06-17 08:34:55,432 INFO org.apache.hadoop.hbase.master.RegionServerOperation: crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 open on 10.2.72.74:60020
2010-06-17 08:34:55,436 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Updated row crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 in region .META.,,1 with startcode=1276778868841, server=10.2.72.74:60020
2010-06-17 08:34:56,044 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_OPEN: crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 from cm-hadoop15.mozilla.org,60020,1276778868841; 1 of 1
2010-06-17 08:34:56,044 INFO org.apache.hadoop.hbase.master.RegionServerOperation: crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 open on 10.2.72.74:60020
2010-06-17 08:34:56,048 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Updated row crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 in region .META.,,1 with startcode=1276778868841, server=10.2.72.74:60020
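To check how often this happens across a longer master log, the pattern above can be scanned for mechanically. The following is a rough Python sketch, not an HBase tool: the function name `colocated_daughters` and both regular expressions are my own, and they assume the message layout and spacing seen in this excerpt (one log entry per line, `Daughters; A, B from ...` and `Assigning region R to S`). Other HBase versions may log differently.

```python
import re

# Both patterns are assumptions derived from the log excerpt in this thread.
SPLIT_RE = re.compile(r"Daughters; (\S+), (\S+) from ")
ASSIGN_RE = re.compile(r"Assigning region (\S+) to (\S+)")


def colocated_daughters(lines):
    """Return (daughter_a, daughter_b, server) for every reported split
    whose two daughters were assigned to the same region server."""
    daughters = []   # one (daughter_a, daughter_b) tuple per reported split
    server_of = {}   # region name -> server it was most recently assigned to
    for line in lines:
        split = SPLIT_RE.search(line)
        if split:
            daughters.append(split.groups())
            continue
        assign = ASSIGN_RE.search(line)
        if assign:
            region, server = assign.groups()
            server_of[region] = server
    return [
        (a, b, server_of[a])
        for a, b in daughters
        if a in server_of and server_of.get(a) == server_of.get(b)
    ]


# The three relevant entries from the excerpt above, one per line.
SAMPLE = [
    "2010-06-17 08:34:53,060 INFO org.apache.hadoop.hbase.master.ServerManager: "
    "Processing MSG_REPORT_SPLIT_INCLUDES_DAUGHTERS: "
    "crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276776160508: "
    "Daughters; "
    "crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647, "
    "crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 "
    "from cm-hadoop14.mozilla.org,60020,1276560962019; 1 of 1",
    "2010-06-17 08:34:54,316 INFO org.apache.hadoop.hbase.master.RegionManager: "
    "Assigning region "
    "crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 "
    "to cm-hadoop15.mozilla.org,60020,1276778868841",
    "2010-06-17 08:34:54,316 INFO org.apache.hadoop.hbase.master.RegionManager: "
    "Assigning region "
    "crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 "
    "to cm-hadoop15.mozilla.org,60020,1276778868841",
]

if __name__ == "__main__":
    for a, b, server in colocated_daughters(SAMPLE):
        print(f"both daughters on {server}:\n  {a}\n  {b}")
```

Run against the entries above, it reports the single split from cm-hadoop14 whose daughters both landed on cm-hadoop15. It only looks at the most recent assignment per region, so later reassignments in a longer log would override earlier ones.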



On 6/17/10 11:42 AM, Daniel Einspanjer wrote:

Currently, in our production cluster, almost all of the traffic for a day ends up assigned to a single RS, and that causes the load on that machine to be too high.
With our last release, we salted our rowkeys so that rather than starting with the date:
100617<guid>
 they now start with the first letter of the guid followed by the date:
 e100617<guid_that_starts_with_e>
When I look at the region assignments though, I see a single server assigned the following regions:
 0100617...
 1100617...
 2100617...
 3100617...
 4100617...
 ...
 d100617...
 e100617...
 f100617...
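For readers following along, the salting scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes the date part is formatted as yyMMdd (so 17 June 2010 becomes 100617) and that GUIDs begin with a lowercase hex digit. The sample GUID is taken from the region names in the log excerpt earlier in this thread.

```python
from datetime import date

HEX_DIGITS = "0123456789abcdef"


def unsalted_rowkey(guid: str, day: date) -> str:
    """Old layout: <yyMMdd><guid>. Every key for a day shares one prefix."""
    return day.strftime("%y%m%d") + guid


def salted_rowkey(guid: str, day: date) -> str:
    """New layout: <first char of guid><yyMMdd><guid>."""
    if not guid:
        raise ValueError("guid must be non-empty")
    return guid[0] + unsalted_rowkey(guid, day)


def salted_prefixes(day: date) -> list:
    """All key prefixes one day can produce, assuming hex GUIDs (an assumption)."""
    return [salt + day.strftime("%y%m%d") for salt in HEX_DIGITS]


if __name__ == "__main__":
    day = date(2010, 6, 17)
    guid = "2700f355-1d02-485a-90d9-0e8182100617"
    print(unsalted_rowkey(guid, day))  # 1006172700f355-1d02-485a-90d9-0e8182100617
    print(salted_rowkey(guid, day))    # 21006172700f355-1d02-485a-90d9-0e8182100617
    print(salted_prefixes(day))
```

The salt splits one day's writes across 16 separate key ranges, which matches the 0100617 through f100617 regions listed above. It only changes where rows fall in the key space, though; it does not decide which RegionServer hosts each range, which is why all 16 can still end up on one machine.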
Is there anything we can do to try to get the cluster to shuffle this up some more? We are getting compaction times in the minutes (one I saw was over 12 minutes), and this causes our clients to time out and shut down, which causes production outages.
-Daniel
