[jira] [Resolved] (HBASE-9220) An API(and shell command) to list tables replicated TO the current cluster

Andrew Kyle Purtell (Jira) Thu, 16 Jun 2022 10:56:04 -0700


     [ 
https://issues.apache.org/jira/browse/HBASE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Andrew Kyle Purtell resolved HBASE-9220.
----------------------------------------
    Resolution: Incomplete

> An API(and shell command) to list tables replicated TO the current cluster 
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-9220
>                 URL: https://issues.apache.org/jira/browse/HBASE-9220
>             Project: HBase
>          Issue Type: New Feature
>          Components: Replication, shell
>         Environment: clusters set up as Master and Slave for replication of 
> tables
>            Reporter: Demai Ni
>            Priority: Major
>
> This JIRA tracks the continuing discussion that followed HBASE-8663, and 
> will hopefully surface a better way to handle this use case: 
> an administrator or developer who has 'list table' access to a cluster 
> would like to know which tables/families are replicated to that cluster (i.e. 
> the slave), so that he/she won't mess things up.
> While HBASE-8663 covered the API to get the list of tables and families 
> replicated from the current cluster (i.e. the master), there is no conclusion 
> on how to do the same for tables replicated TO the current cluster (i.e. the 
> slave). Several ideas were entertained during HBASE-8663's discussion and are 
> summarized here: 
> * *Idea 1*: on the slave cluster, add a new String attribute REPLICATION_MASTER 
> to HColumnDescriptor to indicate which master this column is replicated from. 
> A check can be added to ensure the value of REPLICATION_MASTER is valid at the 
> time it is set. 
> ** problem 1) a slave can have more than one master (a minor problem); 
> ** problem 2) consistency is broken if the master cluster runs 'remove_peer' (a 
> major problem, which would require a synchronous call to the remote master/peer 
> cluster)
> * *Idea 2*: reuse REPLICATION_SCOPE, and give a new meaning to the value '-1'. 
> If a table is replicated to this cluster, its REPLICATION_SCOPE must be set 
> to -1 before replication can occur
> ** problem 1) incompatible change. Currently a slave-side table looks 
> just like a normal table; the change would require users to explicitly set 
> REPLICATION_SCOPE = -1
> ** problem 2) incompatible change. Currently any non-zero value of 
> REPLICATION_SCOPE is treated as if it were 1 (global replication), 
> so the change would affect existing tables
> ** problem 3) the value '-1' only tells the user that the table is replicated 
> to the current cluster; it cannot indicate the source/master cluster
> * *Idea 3*: invent a new HColumnDescriptor attribute 'replication_peers', an 
> array of IDs. We can use positive IDs for target clusters and negative IDs for 
> source clusters, for example: 
> {code}
> hbase(main):004:0> list_peers
>  PEER_ID CLUSTER_KEY STATE
>  1 Slave_A.hbase.com:2181:/hbase ENABLED
>  2 Slave_B.hbase.com:2181:/hbase ENABLED
>  3 Slave_Master_C.hbase.com:2181:/hbase ENABLED
> -1 Master_A.hbase.com:2181:/hbase ENABLED
> -2 Master_B.hbase.com:2181:/hbase ENABLED
> -3 Slave_Master_C.hbase.com:2181:/hbase ENABLED
> >describe table
> 't1_dn', {NAME => 'cf1', REPLICATION_PEERS => '1,2,3', ..}
> 't2_dn', {NAME => 'cf1', REPLICATION_PEERS => '-1,-2',..}
> 't3_dn', {NAME => 'cf1', REPLICATION_PEERS => '3,-3',..}
> t1_dn#cf1 is replicated from this cluster, and its slave clusters are 
> Slave_A, Slave_B and Slave_Master_C
> t2_dn#cf1 is replicated to this cluster, and its master clusters are Master_A 
> and Master_B
> t3_dn#cf1 is set up as Master_Slave replication with 
> Slave_Master_C.hbase.com (though source and target don't have to be the same 
> cluster) 
> {code}
> ** problem: similar to idea 1, though an improved version of it. A synchronous 
> call can be implemented through the peer ID
> * *Idea 4*: a central replication controller that resides outside of all the 
> clusters. The controller would communicate with all clusters and keep the info 
> consistent. It could be a very good operational manager for users who have 
> 10+ clusters to oversee, and other features (such as backup/restore) could 
> leverage the framework
> ** problem: well, not really a problem per se, except that the effort for the 
> whole solution is pretty large and needs some cleanup work first. For example, 
> 'add_peer' currently doesn't check its value, and we need to fix that first; 
> and replication setup relies on manually creating the table on the slave peer, 
> whereas we may want to ensure the same schema and do it automatically from the 
> master cluster. 
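
To illustrate the sign convention proposed in Idea 3, here is a minimal sketch in Python. It is not HBase code: REPLICATION_PEERS was only proposed in this issue, and the names `parse_replication_peers`, `PeerRoles` and `PEERS` are hypothetical, chosen for this example. The peer table simply mirrors the `list_peers` output quoted in the description.

```python
from dataclasses import dataclass

# Hypothetical peer table, copied from the list_peers example in the description.
# Positive IDs are target (slave) clusters, negative IDs are source (master) clusters.
PEERS = {
    1: "Slave_A.hbase.com:2181:/hbase",
    2: "Slave_B.hbase.com:2181:/hbase",
    3: "Slave_Master_C.hbase.com:2181:/hbase",
    -1: "Master_A.hbase.com:2181:/hbase",
    -2: "Master_B.hbase.com:2181:/hbase",
    -3: "Slave_Master_C.hbase.com:2181:/hbase",
}


@dataclass(frozen=True)
class PeerRoles:
    targets: list  # cluster keys this column family is replicated TO
    sources: list  # cluster keys this column family is replicated FROM


def parse_replication_peers(value, peers):
    """Split a REPLICATION_PEERS string such as '3,-3' into targets and sources."""
    targets, sources = [], []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        peer_id = int(token)
        if peer_id == 0:
            raise ValueError("peer ID 0 carries no direction")
        if peer_id not in peers:
            raise KeyError(f"unknown peer ID {peer_id}")
        # The sign of the ID decides the direction of replication.
        (targets if peer_id > 0 else sources).append(peers[peer_id])
    return PeerRoles(targets, sources)


if __name__ == "__main__":
    for table, value in [("t1_dn", "1,2,3"), ("t2_dn", "-1,-2"), ("t3_dn", "3,-3")]:
        roles = parse_replication_peers(value, PEERS)
        print(table, "targets:", roles.targets, "sources:", roles.sources)
```

Under these assumptions, 't3_dn' ('3,-3') comes back with Slave_Master_C as both a target and a source, which corresponds to the Master_Slave case in the description. The sketch only decodes the attribute; it does nothing about the consistency problem noted for Ideas 1 and 3 when a remote cluster removes a peer.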



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
