[
https://issues.apache.org/jira/browse/HBASE-8751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860416#comment-13860416
]
Andrew Purtell commented on HBASE-8751:
---------------------------------------
We can't introduce new functionality on the 0.94 branch without first
introducing it to trunk and later releases. Any chance of a patch for trunk?

While it's less likely to be encountered in newer releases of HBase, in my
experience operating HBase clusters (and ending up with bad state on many a
testing cluster), it can be useful to clear all HBase state from ZooKeeper
before a cold restart to clear up issues. It is easier and safer for an
operator to clear out all HBase state in ZooKeeper than to have to pick out
specific znodes because some of them cannot be lost. I believe we still do not
keep the primary or only copy of any HBase state in ZooKeeper. This patch
would change that, so it deserves discussion. In my opinion, we should avoid
that if possible.
> Enable peer cluster to choose/change the ColumnFamilies/Tables it really want
> to replicate from a source cluster
> ----------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-8751
> URL: https://issues.apache.org/jira/browse/HBASE-8751
> Project: HBase
> Issue Type: New Feature
> Components: Replication
> Reporter: Feng Honghua
> Assignee: Feng Honghua
> Attachments: HBASE-8751-0.94-V0.patch, HBASE-8751-0.94-v1.patch
>
>
> Consider the following scenario (all cfs have replication-scope=1):
> 1) cluster S has 3 tables, table A has cfA,cfB, table B has cfX,cfY, table C
> has cf1,cf2.
> 2) cluster X wants to replicate table A : cfA, table B : cfX and table C from
> cluster S.
> 3) cluster Y wants to replicate table B : cfY, table C : cf2 from cluster S.
> The current replication implementation can't achieve this, since it pushes
> the data of all replicable column families from cluster S to all of its
> peers (X and Y in this scenario).
> This improvement provides a fine-grained replication scheme that enables a
> peer cluster to choose the column families/tables it really wants from the
> source cluster:
> A). Set the table:cf-list for a peer when addPeer:
> hbase-shell> add_peer '3', "zk:1100:/hbase", "table1; table2:cf1,cf2; table3:cf2"
> B). View the table:cf-list config for a peer using show_peer_tableCFs:
> hbase-shell> show_peer_tableCFs "1"
> C). Change/set the table:cf-list for a peer using set_peer_tableCFs:
> hbase-shell> set_peer_tableCFs '2', "table1:cfX; table2:cf1; table3:cf1,cf2"
> In this scheme, replication-scope=1 only means a column family CAN be
> replicated to other clusters; the per-peer table:cf-list determines
> WHICH cfs/tables will actually be replicated to a specific peer.
> For backward compatibility, an empty table:cf-list replicates all
> replicable cfs/tables. (This means we don't allow a peer that replicates
> nothing from a source cluster, which we think is reasonable: if it
> replicates nothing, why bother adding a peer?)
> This improvement addresses the exact problem raised by the first FAQ in
> "http://hbase.apache.org/replication.html":
> "GLOBAL means replicate? Any provision to replicate only to cluster X and
> not to cluster Y? or is that for later?
> Yes, this is for much later."
> I also noticed somebody mentioned that "replication-scope" is an integer
> rather than a boolean to allow for such fine-grained replication, but I
> think extending "replication-scope" can't achieve the same granularity and
> flexibility as the per-peer replication configuration above.
> This improvement has been running smoothly in our production clusters
> (Xiaomi) for several months.
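
The per-peer table:cf-list semantics in the quoted description can be
illustrated with a small sketch. This is not the code from the attached
patches; the function names parse_table_cfs and should_replicate are
hypothetical, and the parsing rules are only inferred from the shell examples
above (entries separated by ";", an optional ":cf1,cf2" suffix, and an empty
list meaning "replicate everything replicable").

```python
from typing import Dict, List, Optional

# Hypothetical helper: parse "table1; table2:cf1,cf2" into a mapping.
# A value of None means "all replicable column families of that table".
def parse_table_cfs(spec: str) -> Dict[str, Optional[List[str]]]:
    result: Dict[str, Optional[List[str]]] = {}
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        table, _, cfs = entry.partition(":")
        cf_list = [cf.strip() for cf in cfs.split(",") if cf.strip()]
        result[table.strip()] = cf_list or None
    return result

# Hypothetical helper: decide whether an edit for (table, cf) goes to a peer.
def should_replicate(table_cfs: Dict[str, Optional[List[str]]],
                     table: str, cf: str, replication_scope: int = 1) -> bool:
    if replication_scope != 1:
        return False  # the cf is not replicable at all
    if not table_cfs:
        return True   # empty config: backward compatible, replicate everything
    if table not in table_cfs:
        return False  # the peer did not ask for this table
    cfs = table_cfs[table]
    return cfs is None or cf in cfs

# Scenario from the description: peers X and Y of source cluster S.
peer_x = parse_table_cfs("A:cfA; B:cfX; C")
peer_y = parse_table_cfs("B:cfY; C:cf2")
```

With these definitions, should_replicate(peer_x, "C", "cf1") is true because
peer X asked for the whole of table C, while should_replicate(peer_y, "A",
"cfA") is false because peer Y did not list table A.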