[jira] [Commented] (HBASE-16367) Race between master and region server initialization may lead to premature server abort

stack (JIRA) Tue, 09 Jan 2018 11:25:27 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16318998#comment-16318998
 ]


stack commented on HBASE-16367:
-------------------------------

This patch doesn't work. See HBASE-19694. There is no doc on what the latch is 
for or what it is supposed to be holding up. After some digging, the order of 
events seems the same, but the latch looks super fragile and liable to break if 
anything is reordered. There is no test to guard against such a change. Let me 
try to revert this thing over in HBASE-19694.

> Race between master and region server initialization may lead to premature 
> server abort
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-16367
>                 URL: https://issues.apache.org/jira/browse/HBASE-16367
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.1.2
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: 16367.addendum, 16367.v1.txt, 16367.v2.txt, 
> 16367.v3.txt, 63908-master.log
>
>
> I was troubleshooting a case where the HBase (1.1.2) master always dies 
> shortly after start - see the attached master log snippet.
> It turned out that the master initialization thread was racing with 
> HRegionServer#preRegistrationInitialization() (initializeZooKeeper, actually) 
> since HMaster extends HRegionServer.
> Through additional logging in the master:
> {code}
>     this.oldLogDir = createInitialFileSystemLayout();
>     HFileSystem.addLocationsOrderInterceptor(conf);
>     LOG.info("creating splitLogManager");
> {code}
> I found that execution didn't reach the last log line before the region 
> server reported the cluster ID as null.
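
The race described above, and the kind of latch the comment objects to, can be
illustrated with a minimal, self-contained sketch. This is not the HBASE-16367
patch and uses no HBase classes: InitLatchSketch, masterInit, regionServerInit
and the "cluster-1234" value are hypothetical names chosen for illustration.
It assumes one thread publishes a cluster ID after some slow setup and another
thread must not read the ID before it is published.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class InitLatchSketch {
    // Hypothetical stand-in for the cluster ID published during master startup.
    private final AtomicReference<String> clusterId = new AtomicReference<>();
    // Opened exactly once, after the cluster ID has been published.
    private final CountDownLatch clusterIdPublished = new CountDownLatch(1);

    /** "Master" side: simulate slow setup, publish the ID, then open the latch. */
    void masterInit(long setupMillis) throws InterruptedException {
        Thread.sleep(setupMillis);      // stands in for file system layout work
        clusterId.set("cluster-1234");  // hypothetical value
        clusterIdPublished.countDown(); // must come after the set() above
    }

    /** "Region server" side: wait (bounded) for the ID instead of reading null. */
    String regionServerInit(long timeoutMillis) throws InterruptedException {
        if (!clusterIdPublished.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("cluster ID not published in time");
        }
        return clusterId.get();
    }

    /** Runs both sides once and returns what the "region server" side observed. */
    static String runOnce() {
        InitLatchSketch sketch = new InitLatchSketch();
        Thread master = new Thread(() -> {
            try {
                sketch.masterInit(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        master.start();
        try {
            String id = sketch.regionServerInit(5000);
            master.join();
            return id;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("region server saw cluster ID: " + runOnce());
    }
}
```

The sketch also shows why such a latch is sensitive to ordering: if countDown()
were moved ahead of the set() call, or the publishing step were reordered or
skipped, the waiting side would either read null again or block until the
timeout. A comment on the latch and a test that pins the ordering are the usual
guards against that.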



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
