[
https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584274#action_12584274
]
stack commented on HBASE-555:
-----------------------------
When the .19 server shows up, it gets 100+ regions. This patch should include
an upper bound on how many regions we assign at a time.
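A minimal sketch of the kind of upper bound meant here, assuming the master
keeps a plain list of unassigned region names. AssignmentCap, nextBatch and
maxPerReport are hypothetical names for illustration only; none of them come
from the patch or from HBase itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not HBase code: caps regions handed out per report. */
final class AssignmentCap {
    /**
     * Removes and returns at most maxPerReport regions from the head of the
     * unassigned list. Whatever is left stays queued for a later report.
     * The list passed in must be mutable.
     */
    static List<String> nextBatch(List<String> unassigned, int maxPerReport) {
        int n = Math.min(Math.max(maxPerReport, 0), unassigned.size());
        List<String> head = unassigned.subList(0, n);
        List<String> batch = new ArrayList<>(head);
        head.clear(); // clearing the view drops those entries from the backing list
        return batch;
    }
}
```

With a cap like this, a server that has just checked in would get a small
batch per report rather than 100+ regions in one go.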
Looking at the Lars cluster, this patch seems to be doing what it's supposed
to.... It's ugly in that currently there is a log for every
MSG_REPORT_PROCESS_OPEN message -- one per region every time it reports in --
but it's just during startup (Previously our startup logs were clogged with
reassigning regions already assigned).
Here is an illustration that the patch is basically working... After
assignment, 7 seconds after the last assigned-region message is logged, we
see this:
{code}
2008-04-01 00:07:05,460 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_PROCESS_OPEN : pdc-docs,US20070223009_20070927,1205860531876 from
192.168.105.19:60020
{code}
A few regions open, then we get this:
{code}
2008-04-01 00:07:11,528 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_PROCESS_OPEN : pdc-docs,US4881767_19891121,1205704528908 from
192.168.105.19:60020
{code}
... about 6 seconds after one from previous batch... then...
{code}
2008-04-01 00:07:14,534 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_PROCESS_OPEN : pdc-docs,US5399923_19950321,1205793104251 from
192.168.105.19:60020
{code}
later....
{code}
2008-04-01 00:07:23,614 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_PROCESS_OPEN : pdc-docs,EP04011653NWA1,1205771873299 from
192.168.105.19:60020
{code}
etc.
> Only one Worker in HRS; on startup, if assigned tens of regions, havoc of
> reassignments because open processing is done in series
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-555
> URL: https://issues.apache.org/jira/browse/HBASE-555
> Project: Hadoop HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.16.0, 0.2.0, 0.1.0
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Attachments: 555-0.1.patch
>
>
> On the Lars clusters, he's up into the thousands of regions. Starting this
> cluster, there is a load of churn in the master log as we assign regions,
> they report their opening and then after the hbase.hbasemaster.maxregionopen
> of one minute elapses, we assign the region elsewhere.
> Problem seems to be the fact that we only run a single Worker thread in our
> regionserver; means that region opens are processed in series.
> For example, the below shows when a master assigned a region and then the
> regionserver side log when it got around to opening it:
> {code}
> 2008-03-29 04:48:51,638 INFO org.apache.hadoop.hbase.HMaster: assigning
> region pdc-docs,US20060158177_20060720,1205765009844 to server
> 192.168.105.19:60020
> ..
> 2008-03-29 04:50:58,124 INFO org.apache.hadoop.hbase.HRegionServer:
> MSG_REGION_OPEN : pdc-docs,US20060158177_20060720,1205765009844
> {code}
> There are more than 2 minutes between the two log entries (I checked clocks
> on this cluster and they are synced).
> Looking in the regionserver log, it's just filled with logging on the opening
> of regions. The region opens are running pretty fast at about a second each
> but there are hundreds of regions to open in this case so it's easy to go over
> our default of 60 seconds.
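
The arithmetic in the description above can be sketched as follows. This is
a rough estimate only: it assumes all regions are assigned at the same
instant, that the timeout clock starts at assignment, and that a single
worker opens them strictly one after another. SerialOpenEstimate and its
method are hypothetical names, not HBase code.

```java
/** Hypothetical estimate, not HBase code: serial region opens vs. a timeout. */
final class SerialOpenEstimate {
    /**
     * Counts the queued regions whose open would finish after the timeout
     * when one worker opens them back to back. The k-th region in the queue
     * (1-based) finishes at roughly k * secondsPerOpen.
     */
    static int regionsPastTimeout(int queued, double secondsPerOpen,
                                  double timeoutSeconds) {
        int late = 0;
        for (int k = 1; k <= queued; k++) {
            if (k * secondsPerOpen > timeoutSeconds) {
                late++;
            }
        }
        return late;
    }
}
```

Under those assumptions, 100 regions at about a second each against the
60 second default leaves 40 regions still unopened when the timeout elapses,
and each of those is a candidate for reassignment.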