Here is an example of a region split where both daughters were assigned to the same region server. Is this expected?

2010-06-17 08:34:53,060 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_SPLIT_INCLUDES_DAUGHTERS: crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276776160508: Daughters; crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647, crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 from cm-hadoop14.mozilla.org,60020,1276560962019; 1 of 1
2010-06-17 08:34:54,316 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 to cm-hadoop15.mozilla.org,60020,1276778868841
2010-06-17 08:34:54,316 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 to cm-hadoop15.mozilla.org,60020,1276778868841
2010-06-17 08:34:55,432 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_OPEN: crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 from cm-hadoop15.mozilla.org,60020,1276778868841; 1 of 1
2010-06-17 08:34:55,432 INFO org.apache.hadoop.hbase.master.RegionServerOperation: crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 open on 10.2.72.74:60020
2010-06-17 08:34:55,436 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Updated row crash_reports,21006172700f355-1d02-485a-90d9-0e8182100617,1276788891647 in region .META.,,1 with startcode=1276778868841, server=10.2.72.74:60020
2010-06-17 08:34:56,044 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_OPEN: crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 from cm-hadoop15.mozilla.org,60020,1276778868841; 1 of 1
2010-06-17 08:34:56,044 INFO org.apache.hadoop.hbase.master.RegionServerOperation: crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 open on 10.2.72.74:60020
2010-06-17 08:34:56,048 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Updated row crash_reports,21006172b7ec9f5-dcad-4c98-9dc5-969532100617,1276788891647 in region .META.,,1 with startcode=1276778868841, server=10.2.72.74:60020


On 6/17/10 11:42 AM, Daniel Einspanjer wrote:
Currently, in our production cluster, almost all of the traffic for a day ends up assigned to a single RS and that causes the load on that machine to be too high. 


With our last release, we salted our rowkeys so that rather than starting with the date: 

100617<guid>

they now start with the first letter of the guid followed by the date:

e100617<guid_that_starts_with_e>
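
For illustration, the salting scheme described above can be sketched in a few lines of Python. This is only a sketch of the key layout, not the production code: the helper name salted_rowkey is hypothetical, and the sample GUID is inferred from the region start key in the log at the top of this thread, which appears to follow the same pattern.

```python
def salted_rowkey(date_yymmdd: str, guid: str) -> str:
    """Build a salted row key: first character of the GUID, then the date, then the full GUID.

    Hypothetical helper for illustration; only the key layout comes from the message.
    """
    return guid[0] + date_yymmdd + guid


# Old layout: every key for a given day shares the same date prefix.
unsalted = "100617" + "2700f355-1d02-485a-90d9-0e8182100617"

# New layout: the leading character of the GUID is placed in front of the date.
salted = salted_rowkey("100617", "2700f355-1d02-485a-90d9-0e8182100617")
print(salted)
```

With hexadecimal GUIDs this gives 16 possible leading characters (0-9, a-f), so one day's keys fall into 16 separate key ranges instead of one contiguous range. Which server hosts each range is still decided by region assignment, not by the key layout.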

When I look at the region assignments though, I see a single server assigned the following regions:

0100617...
1100617...
2100617...
3100617...
4100617...
...
d100617...
e100617...
f100617...

Is there anything we can do to get the cluster to shuffle this up some more? We are seeing compaction times in the minutes (one I saw was over 12 minutes), which causes our clients to time out and shut down, which in turn causes production outages.

-Daniel
