[
https://issues.apache.org/jira/browse/HBASE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Kyle Purtell resolved HBASE-9220.
----------------------------------------
Resolution: Incomplete
> An API(and shell command) to list tables replicated TO the current cluster
> ---------------------------------------------------------------------------
>
> Key: HBASE-9220
> URL: https://issues.apache.org/jira/browse/HBASE-9220
> Project: HBase
> Issue Type: New Feature
> Components: Replication, shell
> Environment: clusters set up as Master and Slave for replication of
> tables
> Reporter: Demai Ni
> Priority: Major
>
> This JIRA tracks the discussion continuing from HBASE-8663, in the hope of
> surfacing a better way to handle this use case:
> an administrator or developer who has 'list table' access to a cluster
> would like to know which tables/families are replicated to that cluster (i.e.
> the slave), so that he/she won't mess things up.
> While HBASE-8663 covered the API to list the tables and families replicated
> from the current cluster (i.e. the Master), there is no conclusion on how to do
> the same for tables replicated TO the current cluster (i.e. the slave). Several
> ideas were entertained during HBASE-8663's discussion and are summarized here:
> * *Idea 1*: on the Slave cluster, add a new String attribute REPLICATION_MASTER
> to HColumnDescriptor to indicate which master this column family is replicated
> from. A check can be added to ensure the value of REPLICATION_MASTER is valid
> at the time it is set.
> ** problem 1) a slave can have more than one master (a minor one);
> ** problem 2) consistency is broken if the Master cluster runs 'remove_peer' (a
> major problem, which requires a synchronous call to the remote master/peer
> cluster)
> * *Idea 2*: reuse REPLICATION_SCOPE, and give a new meaning to the value '-1'.
> If a table is replicated to this cluster, its REPLICATION_SCOPE must be set
> to -1 before replication can occur
> ** problem 1) incompatible change. Currently a slave-side table looks
> just like a normal table; the change would require the user to explicitly set
> REPLICATION_SCOPE = -1
> ** problem 2) incompatible change. Currently any non-zero value of
> REPLICATION_SCOPE is treated as if it were 1 (global replication).
> The change would impact existing tables
> ** problem 3) the value '-1' only tells the user that the table is replicated to
> the current cluster; it cannot indicate the source/Master cluster
> * *Idea 3*: invent a new HColumnDescriptor attribute 'replication_peers', an
> array of IDs. We can use a positive ID for a target cluster and a negative ID
> for a source cluster, for example:
> {code}
> hbase(main):004:0> list_peers
> PEER_ID CLUSTER_KEY STATE
> 1 Slave_A.hbase.com:2181:/hbase ENABLED
> 2 Slave_B.hbase.com:2181:/hbase ENABLED
> 3 Slave_Master_C.hbase.com:2181:/hbase ENABLED
> -1 Master_A.hbase.com:2181:/hbase ENABLED
> -2 Master_B.hbase.com:2181:/hbase ENABLED
> -3 Slave_Master_C.hbase.com:2181:/hbase ENABLED
> >describe table
> 't1_dn', {NAME => 'cf1', REPLICATION_PEERS => '1,2,3', ..}
> 't2_dn', {NAME => 'cf1', REPLICATION_PEERS => '-1,-2',..}
> 't3_dn', {NAME => 'cf1', REPLICATION_PEERS => '3,-3',..}
> t1_dn#cf1 is replicated from this cluster, and its slave clusters are
> Slave_A,Slave_B and Slave_Master_C
> t2_dn#cf1 is replicated to this cluster, and its master clusters are Master_A
> and Master_B
> t3_dn#cf1 is set up as Master_Slave replication with
> Slave_Master_C.hbase.com (though it doesn't have to be the same cluster)
> {code}
> ** problem: similar to idea 1, though an improved version of it. A synchronous
> call can be implemented through the peer ID
> * *Idea 4*: a central replication controller that resides outside of all the
> clusters. The controller would communicate with all clusters and keep their
> info consistent. It could be a very good operational manager for users who have
> 10+ clusters to oversee, and other features (such as backup/restore) could
> leverage the framework
> ** problem: well, not really a problem per se, except that the effort for the
> whole solution is pretty large and needs some cleanup work first. For example,
> 'add_peer' currently doesn't check its value, and we need to fix that first;
> and replication setup relies on manually creating the table on the peer slave,
> whereas we may want to ensure the same schema and do it automatically from the
> Master cluster.
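For illustration only, the sign convention proposed in Idea 3 could be read back as in the sketch below. This is a minimal sketch, not HBase code: the REPLICATION_PEERS attribute exists only as a proposal in this issue, and the names PeerRoles and classify_peers are hypothetical. It assumes the attribute value is a comma-separated string of non-zero integers, as in the 'describe' output quoted above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PeerRoles:
    """Hypothetical container: peer IDs split by direction of replication."""
    targets: List[int]  # positive IDs: clusters this column family replicates TO
    sources: List[int]  # negative IDs: clusters this column family replicates FROM


def classify_peers(replication_peers: str) -> PeerRoles:
    """Split a proposed REPLICATION_PEERS value such as '3,-3' by sign.

    Assumption (from Idea 3 only): positive ID = target cluster,
    negative ID = source cluster. Zero has no meaning and is rejected.
    """
    ids = [int(token) for token in replication_peers.split(",") if token.strip()]
    if 0 in ids:
        raise ValueError("peer ID 0 is neither a source nor a target")
    return PeerRoles(
        targets=[i for i in ids if i > 0],
        sources=[i for i in ids if i < 0],
    )
```

With the three example tables above, classify_peers('1,2,3') yields only targets, classify_peers('-1,-2') yields only sources, and classify_peers('3,-3') yields one of each, which matches the Master_Slave case for t3_dn. A "tables replicated TO this cluster" listing would then be the families whose sources list is non-empty.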
--
This message was sent by Atlassian Jira
(v8.20.7#820007)