[ https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618791#comment-16618791 ]
Nihal Jain commented on HBASE-18822:
------------------------------------

I tried this patch on my cluster and have the following observations/comments on the proposed patch:

* Table creation/alter ops are replicated to all peers irrespective of each peer's replication state. Behavior-wise this is good, as it keeps things completely in sync. But what if the admin disables a peer: do we still want to sync create table to that peer? Or should we sync to a peer only if the peer is enabled?
* Similarly, table creation/alter ops are replicated irrespective of the replication scope. Again, this is good behavior-wise, as it keeps things completely in sync. But should we still discuss whether to sync to a peer only if the replication scope is 1?
* If the table already exists in the peer cluster but with a different descriptor, the table is still created successfully in the active cluster. Should we log a warning if the table descriptor is different on the peer?
* If we perform the following ops in order:
** (a) Create table in active cluster
** (b) Add peer cluster
** (c) Modify the table created in step (a)
** *RESULT*: The procedure gets stuck in a retry loop (in RUNNABLE state) and hits a TableNotFoundException on each retry. It will never complete, as it thinks this is a retriable error.
** *StackTrace:*
{noformat}
2018-09-17 22:37:46,840 WARN [PEWorker-6] procedure.ModifyTableProcedure: Retriable error trying to modify table=t4 (in state=MODIFY_TABLE_SYNC_SCHEMA_TO_PEER)
org.apache.hadoop.hbase.TableNotFoundException: t4
	at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:548)
	at org.apache.hadoop.hbase.client.HBaseAdmin.getDescriptor(HBaseAdmin.java:338)
	at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.syncSchemaModificationToPeer(ModifyTableProcedure.java:432)
	at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.executeFromState(ModifyTableProcedure.java:132)
	at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.executeFromState(ModifyTableProcedure.java:58)
	at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:189)
	at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:873)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1510)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1298)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1797)
{noformat}
* A table with the same name exists in both the active and peer clusters, both having family c1. If we alter table t1 and set version = 5 for family c1 in the active cluster, it will skip modifying the table in the peer, because the patch only handles adding/deleting columns, not altering existing column families. But by sync, don't we mean we want to sync everything? What if the active goes down with t1:c1:version=5, while the peer, which is used as the disaster recovery solution, just supports t1:c1:version=1 (i.e.
whatever the column family's descriptor property was at creation)?
* A table with the same name exists in both the active and peer clusters, but with families f1, f2 in active and families c1, c2 in peer. If we alter table t1 to drop family f1 in the active cluster, since the count logic is used (not descriptor comparison. WHY?), it will replace the existing descriptor of the peer, dropping families c1 and c2 (along with data, if any) from the peer and adding family f1 to it (as this is the new descriptor in active).
* A table with the same name exists in both the active and peer clusters, but with families f1, f2 in active and family c1 in peer. If we alter table t1 to drop family f1 in the active cluster, it will skip modifying the table in the peer, as the count is the same. Again, should we log a warning if the table descriptor is different on the peer?

> Create table for peer cluster automatically when creating table in source
> cluster of using namespace replication.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-18822
>                 URL: https://issues.apache.org/jira/browse/HBASE-18822
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 2.0.0-alpha-2
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-18822.v1.patch, HBASE-18822.v1.patch
>
>
> In our cluster of using namespace replication, we always forget to create
> table in peer cluster, which lead to replication get stuck.
> We have implemented the feature in our cluster: create table for peer
> cluster automatically when creating table in source cluster of using
> namespace replication.
>
> I'm not sure if someone else needs this feature, so create an issue here for
> discussing

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)