Sujit P created HBASE-21201:
-------------------------------
Summary: Support to run VerifyReplication MR tool without peerid
Key: HBASE-21201
URL: https://issues.apache.org/jira/browse/HBASE-21201
Project: HBase
Issue Type: Brainstorming
Components: hbase-operator-tools
Affects Versions: 3.0.0, 2.2.0
Reporter: Sujit P
In some use cases, HBase clients write to tables on separate clusters (probably
in different datacenters) for redundancy. As an administrator/application
architect, I would like to find out whether the tables on both clusters are in
the same state (cell by cell). One tool that is readily available is
VerifyReplication, which is part of replication.
However, it requires a peerId to be set up on at least one of the involved
clusters. A peerId is unnecessary in this use case and could cause unintended
consequences, as the clusters aren't really replication peers, nor do we
want them to be.
Looking at the code:
The tool only attempts to get the clusterKey, which is essentially the ZooKeeper
quorum URL:
{code:java}
//VerifyReplication.java
private static Pair<ReplicationPeerConfig, Configuration>
    getPeerQuorumConfig(final Configuration conf, String peerId)
  .
  .
  return Pair.newPair(peerConfig,
      ReplicationUtils.getPeerClusterConfiguration(peerConfig, conf));

//ReplicationUtils.java
public static Configuration getPeerClusterConfiguration(ReplicationPeerConfig
    peerConfig, Configuration baseConf) throws ReplicationException {
  Configuration otherConf;
  try {
    otherConf = HBaseConfiguration.createClusterConf(baseConf,
        peerConfig.getClusterKey());{code}
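To illustrate what the clusterKey carries, here is a minimal sketch that splits a key into its parts. It assumes the usual colon-separated layout (quorum:clientPort:znodeParent); ClusterKeySketch is a hypothetical helper written for this discussion and is not part of HBase, which has its own parsing and validation.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not an HBase class): splits a cluster key into its parts. */
final class ClusterKeySketch {
  final List<String> quorumHosts;
  final int clientPort;
  final String znodeParent;

  private ClusterKeySketch(List<String> quorumHosts, int clientPort, String znodeParent) {
    this.quorumHosts = quorumHosts;
    this.clientPort = clientPort;
    this.znodeParent = znodeParent;
  }

  /** Parses a key assumed to look like "host1,host2:2181:/hbase". */
  static ClusterKeySketch parse(String clusterKey) {
    String[] parts = clusterKey.split(":");
    if (parts.length != 3) {
      throw new IllegalArgumentException(
          "Expected quorum:clientPort:znodeParent but got: " + clusterKey);
    }
    // The quorum is a comma-separated list of ZooKeeper hosts.
    List<String> hosts = Arrays.asList(parts[0].split(","));
    // A non-numeric port raises NumberFormatException, an IllegalArgumentException.
    int port = Integer.parseInt(parts[1]);
    return new ClusterKeySketch(hosts, port, parts[2]);
  }

  public static void main(String[] args) {
    ClusterKeySketch key = parse("clusterBzk1,clusterBzk2,clusterBzk3:2181:/hbase-secure");
    System.out.println(key.quorumHosts + " " + key.clientPort + " " + key.znodeParent);
  }
}
```

Nothing in these three fields depends on a replication peer being registered, which is the point of the proposal: the same information could be supplied directly by the operator.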
So I would like to propose updating the tool to accept the remote cluster's
ZooKeeper quorum as an argument (e.g. --peerQuorumAddress
clusterBzk1,clusterBzk2,clusterBzk3:2181/hbase-secure) and use it directly,
without depending on a replication peerId, similar to peerFSAddress. There are
certain advantages in doing so:
* Reduce the development/maintenance of a separate tool for the above scenario
* Allow the tool to be useful in other scenarios as well, such as
** validating backups in a remote cluster (HBASE-19106)
** comparing a cloned tableA with the original tableA in the same or a remote
cluster, in case of user error, before restoring a snapshot to the original
table, to find the records that need to be added, are invalid, are missing, etc.
** allowing backup operators who are not HBase admins (and who shouldn't be
adding a peerId) to run the tool, since currently only an HBase superuser can
add a peerId for reasons discussed in HBASE-21163.
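As a rough sketch of how the proposed option might be picked out of the tool's arguments: the class PeerArgsSketch and the --peerQuorumAddress=value form are assumptions made for illustration (the example in the proposal separates the value with a space), and this is not the existing VerifyReplication argument parser.

```java
import java.util.Optional;

/** Hypothetical sketch (not HBase code): finds a proposed --peerQuorumAddress option. */
final class PeerArgsSketch {
  // Assumed option spelling, taken from the proposal; the "=" form is an assumption.
  static final String QUORUM_PREFIX = "--peerQuorumAddress=";

  /** Returns the quorum address if the option is present, otherwise empty. */
  static Optional<String> peerQuorumAddress(String[] args) {
    for (String arg : args) {
      if (arg.startsWith(QUORUM_PREFIX)) {
        String value = arg.substring(QUORUM_PREFIX.length());
        if (value.isEmpty()) {
          throw new IllegalArgumentException("Empty value for " + QUORUM_PREFIX);
        }
        return Optional.of(value);
      }
    }
    // No quorum address given: the caller would fall back to the peerId lookup.
    return Optional.empty();
  }

  public static void main(String[] args) {
    String[] withQuorum = {"--peerQuorumAddress=zk1,zk2,zk3:2181:/hbase", "tableA"};
    String[] withPeerId = {"1", "tableA"};
    System.out.println(peerQuorumAddress(withQuorum).isPresent());
    System.out.println(peerQuorumAddress(withPeerId).isPresent());
  }
}
```

With something along these lines, the tool would only need to consult the replication peer configuration when no quorum address is supplied, so operators without permission to add a peerId could still run the comparison.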
Please post your comments
Thanks
cc: [~clayb], [~brfrn169] , [~vrodionov] , [~rashidaligee]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)