[ 
https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584274#action_12584274
 ] 

stack commented on HBASE-555:
-----------------------------

When the .19 server shows up, it gets 100+ regions.  This patch should include 
an upper bound on how many regions we assign at a time.
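
A minimal sketch of what such an upper bound could look like, assuming the master keeps a queue of unassigned regions and hands some out each time a regionserver reports in. The class name BoundedAssigner, the method nextBatch, and the maxPerReport cap are all hypothetical; none of them come from the attached patch or from HMaster.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch only; not the HMaster code from the attached patch. */
public class BoundedAssigner {
    // Regions still waiting for a server (region names as plain strings here).
    private final Deque<String> unassigned = new ArrayDeque<>();
    // Assumed cap on regions handed out per report; not an existing HBase setting.
    private final int maxPerReport;

    public BoundedAssigner(List<String> regions, int maxPerReport) {
        if (maxPerReport < 1) {
            throw new IllegalArgumentException("maxPerReport must be >= 1");
        }
        this.unassigned.addAll(regions);
        this.maxPerReport = maxPerReport;
    }

    /** Returns at most maxPerReport regions for the server that just reported in. */
    public List<String> nextBatch() {
        List<String> batch = new ArrayList<>();
        while (batch.size() < maxPerReport && !unassigned.isEmpty()) {
            batch.add(unassigned.poll());
        }
        return batch;
    }

    /** Number of regions still unassigned. */
    public int remaining() {
        return unassigned.size();
    }
}
```

With a cap of 10 and 25 unassigned regions, successive calls to nextBatch() return 10, 10 and 5 regions, so a server that checks in first no longer takes the whole backlog in one go.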

Looking at the Lars cluster, this patch seems to be doing what it's supposed 
to.  It's ugly in that there is currently a log line for every 
MSG_REPORT_PROCESS_OPEN message -- one per region every time it reports in -- 
but that's just during startup (previously our startup logs were clogged with 
reassignments of regions that were already assigned).

Here is an illustration that the patch is basically working.  After assignment, 
7 seconds after the last region-assignment message is logged, we see this:

{code}
2008-04-01 00:07:05,460 DEBUG org.apache.hadoop.hbase.HMaster: Received 
MSG_REPORT_PROCESS_OPEN : pdc-docs,US20070223009_20070927,1205860531876 from 
192.168.105.19:60020
{code}

A few regions open, then we get this:

{code}
2008-04-01 00:07:11,528 DEBUG org.apache.hadoop.hbase.HMaster: Received 
MSG_REPORT_PROCESS_OPEN : pdc-docs,US4881767_19891121,1205704528908 from 
192.168.105.19:60020
{code}

... about 6 seconds after the one from the previous batch, then...

{code}
2008-04-01 00:07:14,534 DEBUG org.apache.hadoop.hbase.HMaster: Received 
MSG_REPORT_PROCESS_OPEN : pdc-docs,US5399923_19950321,1205793104251 from 
192.168.105.19:60020
{code}

later....

{code}
2008-04-01 00:07:23,614 DEBUG org.apache.hadoop.hbase.HMaster: Received 
MSG_REPORT_PROCESS_OPEN : pdc-docs,EP04011653NWA1,1205771873299 from 
192.168.105.19:60020
{code}

etc.


> Only one Worker in HRS; on startup, if assigned tens of regions, havoc of 
> reassignments because open processing is done in series
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-555
>                 URL: https://issues.apache.org/jira/browse/HBASE-555
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.16.0, 0.2.0, 0.1.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>         Attachments: 555-0.1.patch
>
>
> On the Lars clusters, he's up into the thousands of regions.  Starting this 
> cluster, there is a load of churn in the master log as we assign regions, 
> they report their opening and then after the hbase.hbasemaster.maxregionopen 
> of one minute elapses, we assign the region elsewhere.
> The problem seems to be that we only run a single Worker thread in our 
> regionserver, which means that region opens are processed in series.
> For example, the below shows when a master assigned a region and then the 
> regionserver side log when it got around to opening it:
> {code}
> 2008-03-29 04:48:51,638 INFO org.apache.hadoop.hbase.HMaster: assigning 
> region pdc-docs,US20060158177_20060720,1205765009844 to server 
> 192.168.105.19:60020
> ..
> 2008-03-29 04:50:58,124 INFO org.apache.hadoop.hbase.HRegionServer: 
> MSG_REGION_OPEN : pdc-docs,US20060158177_20060720,1205765009844
> {code}
> There are > 2 minutes between the two log lines (I checked clocks on this 
> cluster and they are synced).
> Looking in the regionserver log, it's just filled with logging on the opening 
> of regions.  The region opens are running pretty fast at about a second each, 
> but there are hundreds of regions to open in this case, so it's easy to go 
> over our default of 60 seconds.
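
The arithmetic in the description quoted above can be sketched as a simplified model: one worker, every open taking the same time, and all regions assigned at the same instant. The class SerialOpenModel and its methods are hypothetical helpers for illustration, not HBase code, and the model ignores everything else the regionserver does.

```java
/** Hypothetical back-of-the-envelope model of serial region opens. */
public class SerialOpenModel {

    /** Time at which the k-th region (1-based) finishes opening with one worker. */
    public static long finishMillis(int k, long openMillis) {
        return (long) k * openMillis;
    }

    /**
     * Counts regions whose open finishes after the master's timeout,
     * assuming all regions were assigned at time zero.
     */
    public static int countLate(int regions, long openMillis, long timeoutMillis) {
        if (openMillis <= 0) {
            throw new IllegalArgumentException("openMillis must be > 0");
        }
        // Regions 1..onTime finish at or before the timeout.
        long onTime = Math.min(regions, timeoutMillis / openMillis);
        return (int) (regions - onTime);
    }
}
```

For 100 regions at one second each against the 60 second default, countLate(100, 1000, 60000) gives 40: under this model the last 40 opens finish after the timeout and would be candidates for reassignment elsewhere.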

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
