[ 
https://issues.apache.org/jira/browse/HBASE-21154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691300#comment-16691300
 ] 

Duo Zhang commented on HBASE-21154:
-----------------------------------

OK, while testing sync replication I found that this problem led me into a 
deadlock.

For sync replication, if the cluster is in the ACTIVE state, then when 
splitting a WAL we copy the WAL file to the remote cluster to keep the two 
clusters consistent. If the copy fails, for example because the remote cluster 
has already been transitioned to DOWNGRADE_ACTIVE, the split fails, and we keep 
retrying until the cluster is transitioned from ACTIVE to another state, 
usually STANDBY.

The problem is that SCP (ServerCrashProcedure) only treats the meta WAL 
specially; for the namespace region, the recovery process is the same as for 
any user region. So in the scenario above, if the crashed region server was 
carrying the namespace region, that region cannot come online, because log 
splitting cannot finish until we transition the cluster from ACTIVE to another 
state. Now suppose we are restarting the HMaster: it needs the namespace region 
to be online to finish initialization, but we need the HMaster to finish 
initialization before we can transition the sync replication state out of 
ACTIVE, which is what would let the namespace region come online. OK, 
deadlock...
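
To make the cycle explicit, here is a minimal sketch that models the waits 
described above as a wait-for graph and searches it for a cycle. The node 
names are labels I made up for illustration; they are not HBase classes, 
procedures, or APIs, and the graph only encodes the dependencies stated in 
this comment.

```python
# Hypothetical wait-for graph: "X waits for Y" means X cannot complete
# until Y has happened. All names are illustrative labels, not HBase APIs.
WAITS_FOR = {
    "master_initialized": ["namespace_region_online"],
    "namespace_region_online": ["wal_split_finished"],
    "wal_split_finished": ["cluster_left_active_state"],
    "cluster_left_active_state": ["master_initialized"],
}


def find_cycle(graph, start):
    """Return a wait-for cycle reachable from start as a list, or None."""
    path = []        # nodes on the current depth-first path, in order
    on_path = set()  # same nodes, for O(1) membership checks
    done = set()     # nodes fully explored without finding a cycle

    def visit(node):
        if node in on_path:
            # Back edge: the cycle is the path suffix starting at node.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.remove(node)
        done.add(node)
        return None

    return visit(start)


if __name__ == "__main__":
    cycle = find_cycle(WAITS_FOR, "master_initialized")
    print(" -> ".join(cycle) if cycle else "no cycle")
```

If the master no longer had to wait for a separate namespace region (the 
first edge in the graph), the same search would find no cycle, which is why 
this issue matters for sync replication.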

So at least this issue should be done for 3.0. Will prepare a patch soon.

Thanks.

> Remove hbase:namespace table; fold it into hbase:meta
> -----------------------------------------------------
>
>                 Key: HBASE-21154
>                 URL: https://issues.apache.org/jira/browse/HBASE-21154
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta
>            Reporter: stack
>            Priority: Major
>
> Namespace table is a small system table. Usually it has two rows. It must be 
> assigned before user tables but after hbase:meta goes out. Its presence 
> complicates our startup and is a constant source of grief when for whatever 
> reason, it is not up and available. In fact, master startup is predicated on 
> hbase:namespace being assigned and will not make progress unless it is up.
> Let's just add a new 'ns' column family to hbase:meta for namespace.
> Here is a default ns table content:
> {code}
> hbase(main):023:0* scan 'hbase:namespace'
> ROW        COLUMN+CELL
>  default   column=info:d, timestamp=1526694059106, value=\x0A\x07default
>  hbase     column=info:d, timestamp=1526694059461, value=\x0A\x05hbase
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
