[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626131#comment-13626131 ]
chunhui shen commented on HBASE-7824:
-------------------------------------
I think I have found a possible bug.
Suppose the cluster has Master, RS1 and RS2:
1. Kill the master and RS1.
2. Restart the master and RS1.
3. During initialization, the master starts an SSH (ServerShutdownHandler) to
process the dead server RS1.
4. RS1 is not in the dead servers list, because a new RS1 is already online.
5. AssignmentManager#joinCluster rebuilds the user regions and returns the dead
server RS1 together with its regions.
6. AssignmentManager#processDeadServersAndRecoverLostRegions assigns the
regions that were carried by RS1.
7. However, the hlogs of RS1 are still being split by the SSH. This means data
loss, because the regions were assigned in step 6 before log splitting completed.
[~jeffreyz]
Please take a look and correct me if I am wrong.
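
To make the ordering problem in steps 6 and 7 concrete, here is a minimal toy sketch in Java. It is not HBase code: LogSplitGate, startLogSplit, finishLogSplit and assignable are hypothetical names invented for illustration, and server identities are plain strings. It only shows the invariant the scenario violates, namely that regions of a dead server should not be handed out while that server's logs are still being split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the ordering hazard. None of these names are HBase APIs. */
final class LogSplitGate {
    // Hypothetical bookkeeping: dead servers whose logs are still being split.
    private final Set<String> splitting = ConcurrentHashMap.newKeySet();

    /** Called when a shutdown handler begins splitting the logs of a dead server. */
    void startLogSplit(String deadServer) {
        splitting.add(deadServer);
    }

    /** Called once every log of the dead server has been split. */
    void finishLogSplit(String deadServer) {
        splitting.remove(deadServer);
    }

    /**
     * Returns the regions that may be assigned right now. While the log split
     * for the dead server is still running, nothing is returned, so the caller
     * has to defer assignment instead of racing the split.
     */
    List<String> assignable(String deadServer, List<String> regions) {
        if (splitting.contains(deadServer)) {
            return new ArrayList<>(); // defer: edits in the logs are not yet replayable
        }
        return new ArrayList<>(regions);
    }
}
```

In terms of the scenario above, step 6 would correspond to calling assignable for the old RS1 while the handler from step 3 has called startLogSplit but not yet finishLogSplit. In this sketch the call returns an empty list until the split finishes.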
> Improve master start up time when there is log splitting work
> -------------------------------------------------------------
>
> Key: HBASE-7824
> URL: https://issues.apache.org/jira/browse/HBASE-7824
> Project: HBase
> Issue Type: Bug
> Components: master
> Reporter: Jeffrey Zhong
> Assignee: Jeffrey Zhong
> Fix For: 0.94.8
>
> Attachments: hbase-7824.patch, hbase-7824-v10.patch,
> hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v7.patch,
> hbase-7824-v8.patch, hbase-7824-v9.patch
>
>
> When log split work is in progress, master startup waits until all of it
> completes, even if the log split has nothing to do with the meta region
> servers.
> This is bad behavior: a running master keeps working while log splitting is
> in progress, yet a starting master is blocked by the same log split work.
> Since the master is something of a single point of failure, we should start
> it as soon as possible.