[ https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614383#comment-16614383 ]
Mingliang Liu commented on HBASE-21164:
---------------------------------------
The v7 patch addresses Allan's concern. Refactoring Sleeper does not seem
necessary. We can also remove the unit test if it is not needed, as the change
is not as major as in the previous version.
> reportForDuty should do (exponential) backoff rather than retry every 3
> seconds (default).
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-21164
> URL: https://issues.apache.org/jira/browse/HBASE-21164
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: stack
> Assignee: Mingliang Liu
> Priority: Minor
> Attachments: HBASE-21164.005.patch, HBASE-21164.006.patch,
> HBASE-21164.007.patch, HBASE-21164.branch-2.1.001.patch,
> HBASE-21164.branch-2.1.002.patch, HBASE-21164.branch-2.1.003.patch,
> HBASE-21164.branch-2.1.004.patch
>
>
> RegionServers call reportForDuty on startup to tell the Master they are
> available. If the Master is still initializing, which can take a while on a
> big cluster, particularly if something is amiss, the message logged every
> three seconds is annoying and serves no purpose. Back off on failure, up to a
> reasonable maximum period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to
> master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001,
> startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed;
> sleeping and then retrying.
> ....
> {code}
> For example, I am looking at a large cluster now that had a backlog of
> procedure WALs. It is taking a couple of hours to recreate the procedure
> state because there are millions of procedures outstanding. In the meantime,
> the Master log is just full of the above message -- every three seconds...
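
For illustration only, here is a minimal sketch of the capped exponential
backoff the description asks for. It is not the code from any of the attached
patches. The class name, method name, and the five-minute cap are assumptions
made up for this example; only the three-second base interval comes from the
issue text.

```java
// Hypothetical helper, not part of HBase: computes how long to wait before
// the next reportForDuty attempt.
final class ReportForDutyBackoff {

    // Returns baseMillis * 2^attempt, capped at maxMillis.
    // attempt is 0 for the first retry, 1 for the second, and so on.
    static long backoffMillis(long baseMillis, long maxMillis, int attempt) {
        long delay = baseMillis;
        // Stop doubling once the cap is reached, which also avoids overflow
        // for large attempt counts.
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        long base = 3_000L;   // the 3 second default mentioned in the issue
        long max = 300_000L;  // assumed cap of 5 minutes, for illustration
        for (int attempt = 0; attempt < 10; attempt++) {
            System.out.println("attempt " + attempt + ": sleep "
                + backoffMillis(base, max, attempt) + " ms");
        }
    }
}
```

With these assumed values the wait doubles from 3 seconds on each failed
attempt and then stays at the cap, so a Master that takes hours to initialize
gets a log line every few minutes rather than every three seconds.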