Demai Ni created HBASE-9220:
-------------------------------
Summary: An API(and shell command) to list tables replicated TO
the current cluster
Key: HBASE-9220
URL: https://issues.apache.org/jira/browse/HBASE-9220
Project: HBase
Issue Type: New Feature
Components: Replication, shell
Environment: clusters set up as Master and Slave for replication of
tables
Reporter: Demai Ni
This JIRA tracks the continuing discussion from HBASE-8663 and will
hopefully surface a better way to handle this use case:
an administrator or developer who has 'list table' access to a cluster would
like to know which tables/families are replicated to that cluster (i.e. the
slave), so that he/she won't mess things up.
While HBASE-8663 covered the API to get the list of tables and families
replicated from the current cluster (i.e. the Master), no conclusion was reached
on how to do the same for tables replicated TO the current cluster (i.e. the
slave). Several ideas were entertained during the HBASE-8663 discussion and are
summarized here:
* *Idea 1*: on the Slave cluster, add a new String attribute REPLICATION_MASTER
to HColumnDescriptor to indicate which master this column family is replicated
from. A check can be added to ensure the value of REPLICATION_MASTER is valid at
the time it is set.
** problem 1) a slave can have more than one master (a minor one);
** problem 2) consistency is broken if the Master cluster runs 'remove_peer' (a
major problem, which would require a synchronous call to the remote master/peer
cluster)
* *Idea 2*: reuse REPLICATION_SCOPE and give a new meaning to the value '-1'. If
a table is replicated to this cluster, its REPLICATION_SCOPE must be set to -1
before replication can occur
** problem 1) incompatible change. Currently a slave-side table looks just like
a normal table; the new change would require users to explicitly set
REPLICATION_SCOPE = -1
** problem 2) incompatible change. Currently any non-zero value of
REPLICATION_SCOPE is treated as if it were 1 (global replication), so the
change would impact existing tables
** problem 3) the value '-1' only tells the user that the table is replicated to
the current cluster; it cannot indicate the source/Master cluster
* *Idea 3*: invent a new HColumnDescriptor attribute 'replication_peers', an
array of IDs. We can use a positive ID for a target cluster and a negative ID
for a source cluster, for example
{code}
hbase(main):004:0> list_peers
PEER_ID CLUSTER_KEY STATE
1 Slave_A.hbase.com:2181:/hbase ENABLED
2 Slave_B.hbase.com:2181:/hbase ENABLED
3 Slave_Master_C.hbase.com:2181:/hbase ENABLED
-1 Master_A.hbase.com:2181:/hbase ENABLED
-2 Master_B.hbase.com:2181:/hbase ENABLED
-3 Slave_Master_C.hbase.com:2181:/hbase ENABLED
>describe table
't1_dn', {NAME => 'cf1', REPLICATION_PEERS => '1,2,3', ..}
't2_dn', {NAME => 'cf1', REPLICATION_PEERS => '-1,-2',..}
't3_dn', {NAME => 'cf1', REPLICATION_PEERS => '3,-3',..}
t1_dn#cf1 is replicated from this cluster, and its slave clusters are
Slave_A, Slave_B and Slave_Master_C
t2_dn#cf1 is replicated to this cluster, and its master clusters are Master_A
and Master_B
t3_dn#cf1 is set up as Master_Slave replication with
Slave_Master_C.hbase.com (although the two peers don't have to be the same
cluster)
{code}
** problem: similar to Idea 1, though this is an improved version, since a
synchronous call can be implemented through the peer ID
* *Idea 4*: a central replication controller that resides outside all the
clusters. The controller would communicate with all clusters and keep the info
consistent. It could be a very good operational manager for users who have 10+
clusters to oversee, and other features (such as backup/restore) could leverage
the framework
** problem: well, not really a problem per se, except that the effort for the
whole solution is pretty large and needs some cleanup work first. For example,
'add_peer' currently doesn't validate its value, and we need to fix that first;
and replication setup relies on manually creating the table on the peer slave,
whereas we may want to ensure the same schema and create it automatically from
the Master cluster.
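
To make the sign convention in Idea 3 concrete, here is a minimal sketch of how a 'replication_peers' value could be split into target and source peers. It is only an illustration under the assumptions in this proposal: the class ReplicationPeers and its methods are hypothetical, nothing here is an existing HBase API, and rejecting ID 0 is my own assumption since the proposal does not say what 0 would mean.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not an HBase class): splits an Idea 3 style value by sign. */
final class ReplicationPeers {
    // Positive IDs: peers this cluster replicates TO (this cluster is the master).
    final List<Integer> targets = new ArrayList<>();
    // Negative IDs: peers this cluster is replicated FROM (this cluster is the slave).
    final List<Integer> sources = new ArrayList<>();

    /** Parses a comma-separated ID list such as "3,-3". */
    static ReplicationPeers parse(String value) {
        ReplicationPeers peers = new ReplicationPeers();
        if (value == null || value.trim().isEmpty()) {
            return peers; // attribute not set: the family is not replicated at all
        }
        for (String token : value.split(",")) {
            int id = Integer.parseInt(token.trim());
            if (id > 0) {
                peers.targets.add(id);
            } else if (id < 0) {
                peers.sources.add(id);
            } else {
                // Assumption of this sketch: 0 is not a valid peer ID.
                throw new IllegalArgumentException("peer ID 0 is not valid");
            }
        }
        return peers;
    }

    /** True if at least one source peer replicates this family to this cluster. */
    boolean isReplicatedToThisCluster() {
        return !sources.isEmpty();
    }

    /** True if this cluster replicates this family to at least one target peer. */
    boolean isReplicatedFromThisCluster() {
        return !targets.isEmpty();
    }

    public static void main(String[] args) {
        // The three column families from the describe output above.
        for (String value : new String[] {"1,2,3", "-1,-2", "3,-3"}) {
            ReplicationPeers p = parse(value);
            System.out.println(value + " -> targets=" + p.targets + ", sources=" + p.sources);
        }
    }
}
```

With something like this, the "list tables replicated TO the current cluster" call would reduce to filtering column families whose value contains at least one negative ID. It does not address the consistency problem noted for Idea 1 and Idea 3: the slave-side value still has to be kept in sync when the master runs 'remove_peer'.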