[
https://issues.apache.org/jira/browse/HBASE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146908#comment-13146908
]
ramkrishna.s.vasudevan commented on HBASE-4748:
-----------------------------------------------
Two things I observed after trying out the suggested approach:
Case 1: There are 2 RS, but the RS carrying META does not come up while the
master is restarting.
-> Here, as part of the HMaster startup, it will try to split the logs:
{code}
this.fileSystemManager.splitLogAfterStartup(this.serverManager.getOnlineServers().keySet());
{code}
So we can check that the splitlog znode has been created and has some
children, and then wait until there are no more children before proceeding
with META assignment.
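A rough sketch of that wait, for illustration only. This is not HBase code: the class and method names are made up, and the lookup of the splitlog znode's children is abstracted behind a Supplier instead of a real ZooKeeper call. The timeout and poll interval are also assumptions.

```java
import java.util.List;
import java.util.function.Supplier;

public class SplitLogWaitSketch {
    /**
     * Polls the supplied child list (standing in for the children of the
     * splitlog znode) until it is empty or the timeout elapses.
     * Returns true if the children drained in time, false otherwise.
     */
    public static boolean waitUntilNoChildren(Supplier<List<String>> children,
                                              long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (children.get().isEmpty()) {
                return true;   // no pending split tasks, safe to assign META
            }
            if (System.nanoTime() >= deadline) {
                return false;  // still splitting, caller decides what to do
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

In the master this would sit between the log split and the META assignment. A watcher on the znode would avoid polling, but polling keeps the sketch short.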
Case 2: There are 2 RS, and the RS carrying META comes up immediately while the
master is restarting and also gets registered with the master.
In this case
{code}
this.fileSystemManager.splitLogAfterStartup(this.serverManager.getOnlineServers().keySet());
{code}
will not start the splitLog process, because it sees that the server owning the
logs is online. So the master proceeds with assigning the root and meta
regions. If the ServerShutdownHandler kicks in and splits the logs just before
the master assigns root and meta, this merges with case #1. But if in case #2
the ServerShutdownHandler does not start its action in time, we are in trouble
again, as the recovered edits may not have been created yet.
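To make the skip concrete, here is a simplified model of the filtering described above. It is a sketch under assumptions, not the actual splitLogAfterStartup implementation: the class name, method name, and the use of plain strings for server names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

public class SplitLogFilterSketch {
    /**
     * Returns the servers whose logs would be split at startup: those that
     * have log directories but are not in the online servers set.
     */
    public static Set<String> serversToSplit(Set<String> serversWithLogs,
                                             Set<String> onlineServers) {
        Set<String> result = new TreeSet<>(serversWithLogs);
        result.removeAll(onlineServers); // anything still "online" is skipped
        return result;
    }
}
```

If the killed RS is still listed in onlineServers, the result is empty for it and its logs are never split before root and meta are assigned.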
I think, overall, when the master restarts, if expireServer has not removed the
RS that got killed while the master was coming up from the onlineServers list
before the master tries to split the logs, then the master will fail to split
that server's logs and will carry on with root and meta assignment.
> Race between creating recovered edits for META and master assigning ROOT and
> META.
> ----------------------------------------------------------------------------------
>
> Key: HBASE-4748
> URL: https://issues.apache.org/jira/browse/HBASE-4748
> Project: HBase
> Issue Type: Bug
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
>
> 1. Start a cluster.
> 2. Alter a table
> 3. Restart the master using ./hbase-daemon.sh restart master
> 4. Kill the RS after master restarts.
> 5. Start RS again.
> 6. No table operations can be performed on the table that was altered but
> admin.listTables() is able to list the altered table.