[
https://issues.apache.org/jira/browse/HBASE-21154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691300#comment-16691300
]
Duo Zhang commented on HBASE-21154:
-----------------------------------
OK, when testing sync replication, I found that this problem led me into a
deadlock.
For sync replication, if the cluster is in the ACTIVE state, then when splitting
a WAL we copy the WAL file to the remote cluster to keep the two consistent. If
the copy fails, for example because the remote cluster has already been
transitioned to DOWNGRADE_ACTIVE, the split fails, and we keep retrying until we
transition the cluster from ACTIVE to another state, usually STANDBY.
And the problem here is that in SCP (ServerCrashProcedure) only the meta WAL is
treated specially; for the namespace region, the recovery process is the same as
for ordinary user regions. So in the above scenario, if the crashed region
server carried the namespace region, the namespace region cannot come online,
because log splitting cannot finish until we transition the cluster from ACTIVE
to another state. Now consider restarting the HMaster: the HMaster needs the
namespace region to be online to finish initialization, but the HMaster must
finish initialization before we can transition the sync replication state out
of ACTIVE and so bring the namespace region online. OK, deadlock...
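To make the cycle explicit, here is a minimal sketch that models the scenario
above as a "waits for" graph and walks it until a step repeats. The class name,
method names and step labels are hypothetical and exist only for illustration;
none of this is HBase code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: each step waits for exactly one other step.
class StartupDeadlockSketch {

    /** Follows "waits for" edges from start; returns the cycle reached, or an empty list. */
    static List<String> findCycle(Map<String, String> waitsFor, String start) {
        List<String> path = new ArrayList<>();
        String current = start;
        while (current != null) {
            int seen = path.indexOf(current);
            if (seen >= 0) {
                // We came back to a step already on the path: that suffix is the cycle.
                return new ArrayList<>(path.subList(seen, path.size()));
            }
            path.add(current);
            current = waitsFor.get(current);
        }
        return new ArrayList<>(); // the chain ended without repeating a step
    }

    /** The dependencies described in the comment, with hypothetical labels. */
    static Map<String, String> scenario() {
        Map<String, String> waitsFor = new LinkedHashMap<>();
        waitsFor.put("master initialized", "namespace region online");
        waitsFor.put("namespace region online", "WAL split finished");
        waitsFor.put("WAL split finished", "cluster leaves ACTIVE");
        waitsFor.put("cluster leaves ACTIVE", "master initialized");
        return waitsFor;
    }

    public static void main(String[] args) {
        System.out.println(findCycle(scenario(), "master initialized"));
    }
}
```

In this model, removing the edge from master initialization to the namespace
region (which is what folding the table into hbase:meta would amount to) leaves
master initialization with nothing to wait on, and findCycle returns an empty
list.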
So at least this issue should be done for 3.0. Will prepare a patch soon.
Thanks.
> Remove hbase:namespace table; fold it into hbase:meta
> -----------------------------------------------------
>
> Key: HBASE-21154
> URL: https://issues.apache.org/jira/browse/HBASE-21154
> Project: HBase
> Issue Type: Improvement
> Components: meta
> Reporter: stack
> Priority: Major
>
> Namespace table is a small system table. Usually it has two rows. It must be
> assigned before user tables but after hbase:meta goes out. Its presence
> complicates our startup and is a constant source of grief when for whatever
> reason, it is not up and available. In fact, master startup is predicated on
> hbase:namespace being assigned and will not make progress unless it is up.
> Let's just add a new 'ns' column family to hbase:meta for namespace.
> Here is a default ns table content:
> {code}
> hbase(main):023:0* scan 'hbase:namespace'
> ROW       COLUMN+CELL
>  default  column=info:d, timestamp=1526694059106, value=\x0A\x07default
>  hbase    column=info:d, timestamp=1526694059461, value=\x0A\x05hbase
> {code}
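
The info:d values in the scan above look like a protobuf-encoded message with a
single length-delimited field: 0x0A is the tag for field 1 with wire type 2, the
next byte is the length, and the rest is the namespace name. The sketch below
decodes just that shape, assuming a one-byte length and no further fields; the
class and method names are hypothetical, and real code would use the generated
protobuf classes instead.

```java
import java.nio.charset.StandardCharsets;

// Illustrative decoder for values shaped like \x0A\x07default (assumption: one field, short name).
class NamespaceValueSketch {

    /** Expects tag 0x0A, a single length byte below 0x80, then that many UTF-8 bytes. */
    static String decodeName(byte[] value) {
        if (value.length < 2 || value[0] != 0x0A) {
            throw new IllegalArgumentException("expected tag 0x0A (field 1, length-delimited)");
        }
        int length = value[1] & 0xFF;
        if (length > 0x7F || value.length < 2 + length) {
            throw new IllegalArgumentException("unsupported or truncated length");
        }
        return new String(value, 2, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] cell = {0x0A, 0x07, 'd', 'e', 'f', 'a', 'u', 'l', 't'};
        System.out.println(decodeName(cell));
    }
}
```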