[
https://issues.apache.org/jira/browse/HBASE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410777#comment-15410777
]
Ted Yu commented on HBASE-16367:
--------------------------------
I haven't found any other race condition.
The master started normally with patch v2.
bq. ensure daemon thread in the constructor finished before we go to
HRegionServer.run
Doesn't seem to be necessary - there can be some parallelism between the two
threads after the cluster Id registration.
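To illustrate the kind of ordering I mean, here is a minimal standalone sketch.
It is not the code in patch v2; the class, method names and the cluster Id
string are hypothetical. The region server side waits only until the cluster
Id has been published, after which both threads are free to proceed in parallel.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only: none of these names exist in HBase.
public class ClusterIdHandoffSketch {
    private final AtomicReference<String> clusterId = new AtomicReference<>();
    private final CountDownLatch clusterIdPublished = new CountDownLatch(1);

    // Called by the (simulated) master initialization thread.
    void publishClusterId(String id) {
        clusterId.set(id);
        clusterIdPublished.countDown();
    }

    // Called by the (simulated) region server side.
    // Returns null if the Id was not published within the timeout.
    String awaitClusterId(long timeoutMs) {
        try {
            if (!clusterIdPublished.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return null;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        return clusterId.get();
    }

    // Runs one handoff: the master thread publishes late, the caller waits for it.
    static String runOnce() {
        ClusterIdHandoffSketch sketch = new ClusterIdHandoffSketch();
        Thread masterInit = new Thread(() -> {
            try {
                Thread.sleep(50); // simulate slow file system layout setup
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            sketch.publishClusterId("hypothetical-cluster-id");
            // The rest of master initialization could continue here, in
            // parallel with the waiting thread that has now been released.
        }, "master-init-sketch");
        masterInit.setDaemon(true);
        masterInit.start();

        String seen = sketch.awaitClusterId(5000);
        try {
            masterInit.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("region server side saw cluster Id: " + runOnce());
    }
}
```

Without the wait, the reader could observe a null Id whenever it runs ahead of
the publishing thread, which is the shape of the failure described in this issue.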
Thanks
> Race between master and region server initialization may lead to premature
> server abort
> ---------------------------------------------------------------------------------------
>
> Key: HBASE-16367
> URL: https://issues.apache.org/jira/browse/HBASE-16367
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.1.2
> Reporter: Ted Yu
> Assignee: Ted Yu
> Attachments: 16367.v1.txt, 16367.v2.txt, 16367.v2.txt,
> 63908-master.log
>
>
> I was troubleshooting a case where hbase (1.1.2) master always dies shortly
> after start - see attached master log snippet.
> It turned out that the master initialization thread was racing with
> HRegionServer#preRegistrationInitialization() (initializeZooKeeper, actually)
> since HMaster extends HRegionServer.
> Through additional logging in master:
> {code}
> this.oldLogDir = createInitialFileSystemLayout();
> HFileSystem.addLocationsOrderInterceptor(conf);
> LOG.info("creating splitLogManager");
> {code}
> I found that execution didn't reach the last log line before the region server
> declared the cluster Id to be null.