[ 
https://issues.apache.org/jira/browse/HBASE-15867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17546010#comment-17546010
 ] 

Duo Zhang commented on HBASE-15867:
-----------------------------------

There are two other things which are handled by replication queue storage.

The first is lastSequenceIds, which is used by serial replication. It needs to 
be updated together with the replication offset, atomically, so we need to 
store it with the replication offset in the same table. The key is basically a 
(peerId, encodedRegionName) pair, and the value is a sequence id.

The second is hfile refs, for replicating bulk load hfiles. In fact, it is only 
used to prevent the hfiles from being deleted by the HFileCleaner before they 
are replicated, and the update to hfile refs does not need to be atomic with 
the replication offset. But anyway, storing it in the same table, in a separate 
family, seems to do no harm.

Reviewing the code, for both lastSequenceIds and hfile refs, one of the 
requirements is to delete them all when deleting the peer, and for hfile 
refs, we also need to list all the refs atomically. So the idea is to just 
store them in one row, with different qualifiers. To be more specific, 
introduce two new families, maybe called replicated_seq_id and hfile_ref, and 
for each peer there is only one row, where the row key is the peer id. In 
replicated_seq_id, the qualifier is the encodedRegionName and the value is the 
sequence id; in hfile_ref, the qualifier is the hfile name and the value is 
just empty. In this way, a single delete families call can remove them all at 
once, and a single get can fetch all the hfile refs for a replication peer at 
once.
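
To make the proposed layout concrete, here is a minimal in-memory sketch in 
Java. It is not HBase client code and not the actual implementation: the class 
name PeerRowSketch and all of its methods are hypothetical, and the table is 
modeled with plain collections. It only illustrates why one row per peer makes 
"delete everything for a peer" and "list all hfile refs for a peer" 
single-row operations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical in-memory model of the one-row-per-peer layout. */
class PeerRowSketch {

    /** One row; the row key is the peer id. */
    private static final class Row {
        // Family "replicated_seq_id": qualifier = encodedRegionName, value = sequence id.
        final Map<String, Long> replicatedSeqId = new TreeMap<>();
        // Family "hfile_ref": qualifier = hfile name, value is empty, so a set is enough.
        final Set<String> hfileRef = new TreeSet<>();
    }

    private final Map<String, Row> rows = new HashMap<>();

    private Row row(String peerId) {
        return rows.computeIfAbsent(peerId, k -> new Row());
    }

    void setLastSequenceId(String peerId, String encodedRegionName, long seqId) {
        row(peerId).replicatedSeqId.put(encodedRegionName, seqId);
    }

    /** Returns null when no sequence id is stored for the region. */
    Long getLastSequenceId(String peerId, String encodedRegionName) {
        Row r = rows.get(peerId);
        return r == null ? null : r.replicatedSeqId.get(encodedRegionName);
    }

    void addHFileRef(String peerId, String hfileName) {
        row(peerId).hfileRef.add(hfileName);
    }

    void removeHFileRef(String peerId, String hfileName) {
        Row r = rows.get(peerId);
        if (r != null) {
            r.hfileRef.remove(hfileName);
        }
    }

    /** Reads one family of one row: all hfile refs for the peer, sorted by name. */
    List<String> listHFileRefs(String peerId) {
        Row r = rows.get(peerId);
        return r == null ? new ArrayList<>() : new ArrayList<>(r.hfileRef);
    }

    /** Dropping the single row removes both families for the peer at once. */
    void removePeer(String peerId) {
        rows.remove(peerId);
    }
}
```

Against a real table, removePeer would presumably map to one Delete on the 
peer's row covering both families, and listHFileRefs to one Get restricted to 
the hfile_ref family. That mapping is an assumption here, not something 
checked against the patch.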

> Move HBase replication tracking from ZooKeeper to HBase
> -------------------------------------------------------
>
>                 Key: HBASE-15867
>                 URL: https://issues.apache.org/jira/browse/HBASE-15867
>             Project: HBase
>          Issue Type: New Feature
>          Components: Replication
>    Affects Versions: 2.1.0
>            Reporter: Joseph
>            Assignee: Zheng Hu
>            Priority: Major
>
> Move the WAL file and offset tracking out of ZooKeeper and into an HBase 
> table called hbase:replication. 
> The three largest new changes will be the classes ReplicationTableBase, 
> TableBasedReplicationQueues, and TableBasedReplicationQueuesClient. As of now 
> ReplicationPeers and HFileRef's tracking will not be implemented. Subtasks 
> have been filed for these two jobs.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
