[
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592721#comment-13592721
]
Jeffrey Zhong commented on HBASE-7824:
--------------------------------------
I think I found the root cause, and I addressed it in the trunk patch
(https://reviews.apache.org/r/9419/diff/#index_header), where I added the
following lines:
{code}
// wait till all dead server are processed
ServerManager serverManager = master.getServerManager();
while (serverManager.areDeadServersInProgress()) {
  Thread.sleep(100);
}
{code}
My change lets the master start up quickly with some SSH (ServerShutdownHandler)
handling still pending, which slightly changes an assumption of the existing
test case. So I added the above lines to match the existing test case's
expectation that all log splitting work is done and previously dead servers
are handled.
I've run the test case 20 times in a loop without any failure.
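One caveat with the loop above is that it never gives up if dead server
processing does not finish. A minimal sketch of a bounded variant follows,
assuming a hypothetical helper class DeadServerWait with a method waitUntilIdle
(neither is part of HBase or of the patch). It polls a boolean condition and
returns false on timeout instead of hanging:

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of HBase or the patch.
final class DeadServerWait {
    private DeadServerWait() {}

    /**
     * Polls inProgress until it returns false or timeoutMillis elapses.
     * Returns true if the work finished, false if the wait timed out.
     */
    static boolean waitUntilIdle(BooleanSupplier inProgress, long pollMillis, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (inProgress.getAsBoolean()) {
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }
}
```

In the test it could presumably be called as
DeadServerWait.waitUntilIdle(serverManager::areDeadServersInProgress, 100, 60000),
with the test failing if the result is false. The 60 second limit is an
arbitrary value chosen for illustration.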
The test case passed with "this.deadservers.add(serverName);" removed because
that basically assigns regions before master initialization, due to
waitForActiveAndReadyMaster in the test code. Since that matches the old
behavior, the test case passed, although the log splitting work might not have
been done before those regions were assigned.
> Improve master start up time when there is log splitting work
> -------------------------------------------------------------
>
> Key: HBASE-7824
> URL: https://issues.apache.org/jira/browse/HBASE-7824
> Project: HBase
> Issue Type: Bug
> Components: master
> Reporter: Jeffrey Zhong
> Assignee: Jeffrey Zhong
> Fix For: 0.94.7
>
> Attachments: hbase-7824.patch, hbase-7824_v2.patch
>
>
> When there is log split work going on, master start up waits till all log
> split work completes even though the log split has nothing to do with meta
> region servers.
> This is bad behavior, considering that a master node can run while log
> splitting is happening, yet its start up is blocked by log split work.
> Since master is kind of single point of failure, we should start it ASAP.