[ 
https://issues.apache.org/jira/browse/HBASE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412933#comment-15412933
 ] 

stack commented on HBASE-16367:
-------------------------------

What is this? It adds:

821         if (this.initLatch != null) {
822           this.initLatch.await(50, TimeUnit.SECONDS);
823         }

...which causes a new findbugs warning, reported above but ignored:

Return value of java.util.concurrent.CountDownLatch.await(long, TimeUnit) 
ignored in 
org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper() At 
HRegionServer.java:[line 822]

So we wait on the latch for up to 50 seconds and then just proceed, whether or 
not it was counted down? What is supposed to be the startup scenario here? How 
does the latch ensure a particular path?
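
For illustration only, here is a minimal sketch of what acting on the return 
value could look like. CountDownLatch.await(long, TimeUnit) returns false when 
the timeout elapses before the count reaches zero, so the caller can tell a 
released latch from a timed-out one. The class and method names below are 
hypothetical and are not taken from the patch or from HRegionServer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of the HBASE-16367 patch.
final class LatchWaitSketch {
  private LatchWaitSketch() {}

  /**
   * Waits on the latch and reports whether it was actually released.
   *
   * @return true if the latch is null (nothing to wait for) or its count
   *         reached zero within the timeout; false if the wait timed out
   *         or the waiting thread was interrupted.
   */
  static boolean awaitInit(CountDownLatch latch, long timeout, TimeUnit unit) {
    if (latch == null) {
      return true;
    }
    try {
      // await(long, TimeUnit) returns false on timeout; do not discard it.
      return latch.await(timeout, unit);
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can still observe it.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

A caller could then log or abort when awaitInit(...) returns false instead of 
silently continuing; which of those is right depends on the intended startup 
sequence, which is the open question here.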

> Race between master and region server initialization may lead to premature 
> server abort
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-16367
>                 URL: https://issues.apache.org/jira/browse/HBASE-16367
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.1.2
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: 16367.addendum, 16367.v1.txt, 16367.v2.txt, 
> 16367.v3.txt, 63908-master.log
>
>
> I was troubleshooting a case where hbase (1.1.2) master always dies shortly 
> after start - see attached master log snippet.
> It turned out that master initialization thread was racing with 
> HRegionServer#preRegistrationInitialization() (initializeZooKeeper, actually) 
> since HMaster extends HRegionServer.
> Through additional logging in master:
> {code}
>     this.oldLogDir = createInitialFileSystemLayout();
>     HFileSystem.addLocationsOrderInterceptor(conf);
>     LOG.info("creating splitLogManager");
> {code}
> I found that execution didn't reach the last log line before region server 
> declared cluster Id being null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
