[ 
https://issues.apache.org/jira/browse/HBASE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146908#comment-13146908
 ] 

ramkrishna.s.vasudevan commented on HBASE-4748:
-----------------------------------------------

Two things I observed after trying out the suggested approach:

Case 1: There are 2 RS, but the RS carrying META does not come up while the
master is restarting.
 -> Here, as part of the HMaster startup, the master will try to split the logs:
{code}
this.fileSystemManager.splitLogAfterStartup(this.serverManager.getOnlineServers().keySet());
{code}
So we can check that the splitlog znode has been created and has some
children, and then wait until there are no more children before proceeding
with META assignment.
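
The wait described above could look roughly like the following. This is only
a minimal sketch, not the actual HMaster code: the class SplitLogWait, the
method waitForNoSplitTasks and the Supplier abstraction are hypothetical
names used for illustration. In a real master the supplier would presumably
be backed by a ZooKeeper getChildren() call on the splitlog znode, and a
watch would be preferable to polling.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper: blocks until no split tasks remain or a timeout expires.
final class SplitLogWait {

    /**
     * Polls the supplier of pending split tasks (for example, the children
     * of the splitlog znode) until it returns an empty list.
     *
     * @return true if the task list drained, false on timeout or interrupt
     */
    public static boolean waitForNoSplitTasks(Supplier<List<String>> pendingTasks,
                                              long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (pendingTasks.get().isEmpty()) {
                return true;   // safe to go on with META assignment
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;  // caller decides how to handle the timeout
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag
                return false;
            }
        }
    }
}
```

The master would call this between splitLogAfterStartup() and the META
assignment, and treat a false return as "recovered edits may be missing".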

Case 2: There are 2 RS, but the RS carrying META comes up immediately while
the master is restarting and also gets registered with the master.
In this case
{code}
this.fileSystemManager.splitLogAfterStartup(this.serverManager.getOnlineServers().keySet());
{code}
will not start the splitLog process, because it sees that the server owning
the logs is online. So the master will proceed with assigning the root and
meta regions. If the ServerShutdownHandler kicks in and splits the logs just
before the master assigns root and meta, this merges with case #1. But if in
case #2 the ServerShutdownHandler does not start its work in time, we are in
trouble again, as the recovered edits may not have been created yet.

I think that overall, if the master restarts and, before it tries to split
the logs, expireServer has not removed the RS that got killed while the master
was coming up from the onlineServers list, then the master will fail to split
that server's logs and will carry on with root and meta assignment.
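
To make the problem concrete, here is a simplified model of the decision
described above. It is a sketch under assumptions, not the real
splitLogAfterStartup() implementation: the class SplitCandidates, the method
serversNeedingSplit and the server names are all hypothetical. It only models
the rule that logs belonging to a server in onlineServers are skipped.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of which servers' logs get split at master startup.
final class SplitCandidates {

    /** Returns the servers that have a log directory but are not seen as online. */
    public static Set<String> serversNeedingSplit(Set<String> serversWithLogs,
                                                  Set<String> onlineServers) {
        Set<String> result = new TreeSet<>(serversWithLogs);
        result.removeAll(onlineServers); // logs of "online" servers are skipped
        return result;
    }
}
```

If rs1 (carrying META) was killed but is still in onlineServers, then
serversNeedingSplit(Set.of("rs1", "rs2"), Set.of("rs1", "rs2")) is empty and
nothing is split. Once expireServer has removed rs1, the same call with
Set.of("rs2") as the online set returns rs1.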
                
> Race between creating recovered edits for META and master assigning ROOT and 
> META.
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-4748
>                 URL: https://issues.apache.org/jira/browse/HBASE-4748
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>
> 1. Start a cluster.
> 2. Alter a table
> 3. Restart the master using ./hbase-daemon.sh restart master
> 4. Kill the RS after master restarts.
> 5. Start RS again.
> 6. No table operations can be performed on the table that was altered but 
> admin.listTables() is able to list the altered table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
