Sujit P created HBASE-21201:
-------------------------------

             Summary: Support to run VerifyReplication MR tool without peerid
                 Key: HBASE-21201
                 URL: https://issues.apache.org/jira/browse/HBASE-21201
             Project: HBase
          Issue Type: Brainstorming
          Components: hbase-operator-tools
    Affects Versions: 3.0.0, 2.2.0
            Reporter: Sujit P


In some use cases, HBase clients write to tables on separate clusters (probably 
in different datacenters) for redundancy. As an administrator/application 
architect, I would like to find out whether the tables on both clusters are in 
the same state (cell by cell). One readily available tool for this is 
VerifyReplication, which is part of replication.

However, it requires a peerId to be set up on at least one of the involved 
clusters. A peerId is unnecessary in this scenario and could cause unintended 
consequences, as the clusters aren't really replication peers, nor do we 
want them to be.

Looking at the code:

The tool uses the peer only to obtain the clusterKey, which is essentially the 
ZooKeeper quorum URL:

 
{code:java}
// VerifyReplication.java
private static Pair<ReplicationPeerConfig, Configuration> getPeerQuorumConfig(
    final Configuration conf, String peerId)
  // ...
  return Pair.newPair(peerConfig,
      ReplicationUtils.getPeerClusterConfiguration(peerConfig, conf));

// ReplicationUtils.java
public static Configuration getPeerClusterConfiguration(
    ReplicationPeerConfig peerConfig, Configuration baseConf) throws ReplicationException {
  Configuration otherConf;
  try {
    otherConf = HBaseConfiguration.createClusterConf(baseConf, peerConfig.getClusterKey());
  // ...
{code}
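
To make the point concrete, here is a minimal sketch in plain Java, with no HBase dependency, of the information a cluster key carries. It assumes the conventional three-part key form quorum:clientPort:znodeParent. The class and method names are hypothetical, and this is a simplified stand-in for what createClusterConf does with the key, not the actual HBase implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of HBase.
final class ClusterKeySketch {

  /**
   * Splits a cluster key of the assumed form "host1,host2:port:/znode" into
   * the three ZooKeeper-related configuration properties it stands for.
   */
  static Map<String, String> toProperties(String clusterKey) {
    String[] parts = clusterKey.split(":");
    if (parts.length != 3) {
      throw new IllegalArgumentException(
          "Expected quorum:clientPort:znodeParent but got: " + clusterKey);
    }
    // Fail fast on a non-numeric port (NumberFormatException is an
    // IllegalArgumentException).
    Integer.parseInt(parts[1]);

    Map<String, String> props = new LinkedHashMap<>();
    props.put("hbase.zookeeper.quorum", parts[0]);
    props.put("hbase.zookeeper.property.clientPort", parts[1]);
    props.put("zookeeper.znode.parent", parts[2]);
    return props;
  }
}
```

Nothing in these three values is specific to a replication peer, which is why a directly supplied address could in principle serve the same purpose.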
 

 

So I would like to propose updating the tool to accept the remote cluster's 
ZooKeeper quorum as an argument (e.g. --peerQuorumAddress 
clusterBzk1,clusterBzk2,clusterBzk3:2181/hbase-secure ) and use it directly, 
without depending on a replication peerId, similar to peerFSAddress. There are 
certain advantages in doing so:
 * Avoid developing and maintaining a separate tool for the above scenario
 * Allow the tool to be more useful for other scenarios as well, such as 
 ** validating backups in a remote cluster (HBASE-19106)
 ** comparing a cloned tableA with the original tableA in the same or a remote 
cluster after a user error, before restoring a snapshot to the original table, 
to find the records that need to be added or are invalid, missing, etc.
 ** allowing backup operators who are not HBase admins (and who shouldn't be 
adding a peerId) to run the tool, since currently only an HBase superuser can 
add a peerId, for reasons discussed in HBASE-21163.
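
As a rough illustration of the proposed argument handling, the sketch below (plain Java, hypothetical class and field names, not the actual VerifyReplication code) accepts either a positional peerId or a --peerQuorumAddress option. It assumes the tool's positional arguments are peerId followed by the table name, and it accepts both the space-separated form shown above and a --peerQuorumAddress=value form. Both conventions are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed argument handling; not HBase code.
final class VerifyArgsSketch {
  static final String QUORUM_OPT = "--peerQuorumAddress";

  final String peerId;            // null when a quorum address is given
  final String peerQuorumAddress; // null when a peerId is given
  final String tableName;

  private VerifyArgsSketch(String peerId, String peerQuorumAddress, String tableName) {
    this.peerId = peerId;
    this.peerQuorumAddress = peerQuorumAddress;
    this.tableName = tableName;
  }

  static VerifyArgsSketch parse(String... args) {
    String quorum = null;
    List<String> positional = new ArrayList<>();
    for (int i = 0; i < args.length; i++) {
      String arg = args[i];
      if (arg.startsWith(QUORUM_OPT + "=")) {
        quorum = arg.substring(QUORUM_OPT.length() + 1);
      } else if (arg.equals(QUORUM_OPT)) {
        if (i + 1 >= args.length) {
          throw new IllegalArgumentException(QUORUM_OPT + " requires a value");
        }
        quorum = args[++i]; // consume the following argument as the value
      } else if (!arg.startsWith("--")) {
        positional.add(arg);
      }
      // Any other "--" option is ignored in this sketch.
    }
    if (quorum != null) {
      // With an explicit quorum address, no replication peerId is needed.
      if (quorum.isEmpty() || positional.size() != 1) {
        throw new IllegalArgumentException("Expected: " + QUORUM_OPT + " <address> <tablename>");
      }
      return new VerifyArgsSketch(null, quorum, positional.get(0));
    }
    if (positional.size() != 2) {
      throw new IllegalArgumentException("Expected: <peerid> <tablename>");
    }
    return new VerifyArgsSketch(positional.get(0), null, positional.get(1));
  }
}
```

With this shape, the existing peerId path stays unchanged, and the peer lookup is skipped only when the quorum address is supplied explicitly.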

Please post your comments

Thanks

cc: [~clayb], [~brfrn169] , [~vrodionov] , [~rashidaligee]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
