[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592721#comment-13592721
 ] 

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

I think I found the root cause, and I addressed it in the trunk 
patch (https://reviews.apache.org/r/9419/diff/#index_header), which contains the 
following lines:
{code}
    // wait until all dead servers have been processed
    ServerManager serverManager = master.getServerManager();
    while (serverManager.areDeadServersInProgress()) {
      Thread.sleep(100);
    }
{code}

My change lets the master start up quickly while some SSH 
(ServerShutdownHandler) handling is still pending, which slightly changes an 
assumption of the existing test case. So I added the above lines to match the 
existing test case's expectation that all log splitting work is done and 
previously dead servers have been handled.
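
As an illustration only, the same polling idea can be written with an upper bound on the wait, so that a test fails instead of hanging if the dead servers are never processed. This is a minimal sketch and not part of the patch: the class BoundedWait, the method waitUntil, and the timeout parameter are hypothetical names introduced here.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase or the patch under review:
// polls a condition until it holds or a deadline passes.
final class BoundedWait {
  private BoundedWait() {}

  /**
   * Returns true if the condition became true before the timeout,
   * false if the timeout elapsed or the thread was interrupted.
   */
  static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long intervalMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // gave up: the condition never held within the timeout
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

In a test, this could be called as BoundedWait.waitUntil(() -> !serverManager.areDeadServersInProgress(), 60000, 100), where the 60 second limit is an arbitrary assumption, and the test would assert on the returned boolean.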

I've run the test case 20 times in a loop without any failure. 

The reason the test case passed after removing 
"this.deadservers.add(serverName);" is that it basically assigns regions 
before master initialization, due to waitForActiveAndReadyMaster in the test 
code. Since that matches the old behavior, the test case passed, even though 
the log splitting work might not have been done before those regions were 
assigned.


 


                
> Improve master start up time when there is log splitting work
> -------------------------------------------------------------
>
>                 Key: HBASE-7824
>                 URL: https://issues.apache.org/jira/browse/HBASE-7824
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>             Fix For: 0.94.7
>
>         Attachments: hbase-7824.patch, hbase-7824_v2.patch
>
>
> When there is log split work going on, master start up waits until all log 
> split work completes, even though the log split has nothing to do with meta 
> region servers.
> This is bad behavior, considering that a master node can run while log split 
> is happening, yet its start up is blocked by log split work. 
> Since the master is something of a single point of failure, we should start 
> it ASAP.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
