[ https://issues.apache.org/jira/browse/HBASE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410580#comment-15410580 ]
Ted Yu commented on HBASE-16367:
--------------------------------
Another approach is for the master to pass an instance of CountDownLatch to
the region server.
After the master sets the cluster Id, it counts down the latch, which lets the
region server continue with its initialization.
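
A minimal, self-contained sketch of the latch idea follows. It is not HBase
code: the class ClusterIdLatchSketch, its methods, the timeout, and the
"example-cluster-id" value are all hypothetical stand-ins, and two plain
threads play the roles of the master and the region server initialization
paths.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; none of these names exist in HBase.
public class ClusterIdLatchSketch {

    // Region server side: block until the master signals that the cluster Id
    // has been set, then read it. A timeout avoids waiting forever.
    static String regionServerInit(CountDownLatch clusterIdSet,
                                   AtomicReference<String> clusterId,
                                   long timeoutMs) throws InterruptedException {
        if (!clusterIdSet.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("timed out waiting for cluster Id");
        }
        return clusterId.get();
    }

    // Runs both sides and returns the cluster Id observed by the region server.
    static String run() {
        final CountDownLatch clusterIdSet = new CountDownLatch(1);
        final AtomicReference<String> clusterId = new AtomicReference<>();
        final AtomicReference<String> seen = new AtomicReference<>();

        Thread regionServer = new Thread(() -> {
            try {
                seen.set(regionServerInit(clusterIdSet, clusterId, 5000));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        regionServer.start();

        try {
            // Master side: simulate slow file system setup, then publish the
            // cluster Id and release the waiting region server thread.
            Thread.sleep(100);
            clusterId.set("example-cluster-id");
            clusterIdSet.countDown();
            regionServer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("region server saw cluster Id: " + run());
    }
}
```

With the latch in place, the region server thread cannot read the cluster Id
until the master thread has set it, no matter which thread is scheduled first.
What the real code should do when the wait times out is left open here.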
> Race between master and region server initialization may lead to premature
> server abort
> ---------------------------------------------------------------------------------------
>
> Key: HBASE-16367
> URL: https://issues.apache.org/jira/browse/HBASE-16367
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.1.2
> Reporter: Ted Yu
> Assignee: Ted Yu
> Attachments: 16367.v1.txt, 63908-master.log
>
>
> I was troubleshooting a case where the HBase (1.1.2) master always dies
> shortly after start - see attached master log snippet.
> It turned out that the master initialization thread was racing with
> HRegionServer#preRegistrationInitialization() (initializeZooKeeper, actually)
> since HMaster extends HRegionServer.
> Through additional logging in master:
> {code}
> this.oldLogDir = createInitialFileSystemLayout();
> HFileSystem.addLocationsOrderInterceptor(conf);
> LOG.info("creating splitLogManager");
> {code}
> I found that execution did not reach the last log line before the region
> server reported the cluster Id as null.