[ 
https://issues.apache.org/jira/browse/HBASE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618791#comment-16618791
 ] 

Nihal Jain commented on HBASE-18822:
------------------------------------

I tried this patch on my cluster and have the following observations/comments on 
the proposed patch:
 * Table creation/alter ops are replicated to all peers irrespective of the 
replication state of each peer. This is good behavior in that it keeps 
things completely in sync. But if the admin disables a peer, do we 
still want to sync create table to that peer? Or should we sync to a peer only 
if the peer is enabled?

 * Similarly, table creation/alter ops are replicated irrespective of the 
replication scope. Again, this is good behavior in that it keeps things 
completely in sync. But should we discuss whether to sync to a peer only if 
the replication scope is 1?
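
To make the two questions above concrete, here is a minimal sketch of the kind of filter I have in mind. It is not taken from the patch: the {{Peer}} class, {{peersToSync}} and {{SCOPE_REPLICATED}} are hypothetical stand-ins written in plain Java so the logic can be read in isolation, and the real check would use the actual peer config and column family descriptors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not HBase API: decides which peers a schema change
// should be synced to, based on peer state and replication scope.
final class SchemaSyncFilter {

    /** Hypothetical stand-in for a replication peer and its enabled/disabled state. */
    static final class Peer {
        final String id;
        final boolean enabled;

        Peer(String id, boolean enabled) {
            this.id = id;
            this.enabled = enabled;
        }
    }

    // Assumption: mirrors the "replication scope is 1" condition discussed above.
    static final int SCOPE_REPLICATED = 1;

    /**
     * Returns the peers to sync to: none if no column family has scope 1,
     * otherwise only the peers that are currently enabled.
     */
    static List<Peer> peersToSync(List<Peer> peers, List<Integer> familyScopes) {
        List<Peer> result = new ArrayList<>();
        boolean anyReplicated = false;
        for (int scope : familyScopes) {
            if (scope == SCOPE_REPLICATED) {
                anyReplicated = true;
                break;
            }
        }
        if (!anyReplicated) {
            return result; // nothing in this table is replicated, so skip every peer
        }
        for (Peer peer : peers) {
            if (peer.enabled) {
                result.add(peer);
            }
        }
        return result;
    }
}
```

Whether a disabled peer should be skipped entirely or caught up later when it is re-enabled is exactly the open question; the sketch only shows the "skip" option.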

 * If the table already exists in the peer cluster but with a different 
descriptor, table creation still succeeds in the active cluster. Should we log 
a warning if the table descriptor is different on the peer?
 * If we perform the following ops in order:
 ** (a) Create table in active cluster
 ** (b) Add peer cluster
 ** (c) Modify the table created in step a
 ** *RESULT*: The procedure gets stuck in a retry loop (in RUNNABLE state) and 
gets a TableNotFoundException on each retry. The procedure never 
completes because it treats this as a retriable error.
 ** *StackTrace:*
{noformat}
2018-09-17 22:37:46,840 WARN  [PEWorker-6] procedure.ModifyTableProcedure: Retriable error trying to modify table=t4 (in state=MODIFY_TABLE_SYNC_SCHEMA_TO_PEER)
org.apache.hadoop.hbase.TableNotFoundException: t4
        at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:548)
        at org.apache.hadoop.hbase.client.HBaseAdmin.getDescriptor(HBaseAdmin.java:338)
        at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.syncSchemaModificationToPeer(ModifyTableProcedure.java:432)
        at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.executeFromState(ModifyTableProcedure.java:132)
        at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.executeFromState(ModifyTableProcedure.java:58)
        at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:189)
        at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:873)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1510)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1298)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1797)
{noformat}
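
One possible way out of the loop, sketched below, is to treat "table missing on peer" as non-retriable and to bound the retries for everything else. This is only an illustration under my own assumptions: {{TableMissingOnPeerException}}, {{PeerOp}} and {{syncWithBoundedRetry}} are hypothetical names, not what ModifyTableProcedure does today, and creating the table on the peer instead of failing would be another valid choice.

```java
import java.io.IOException;

// Hypothetical sketch, not HBase code: stop retrying when the error cannot
// be fixed by retrying, and cap the number of attempts otherwise.
final class PeerSyncRetry {

    /** Hypothetical stand-in for the TableNotFoundException seen in the trace above. */
    static class TableMissingOnPeerException extends IOException {
        TableMissingOnPeerException(String table) {
            super(table);
        }
    }

    /** Hypothetical unit of work against the peer cluster. */
    interface PeerOp {
        void run() throws IOException;
    }

    enum Outcome { SUCCESS, FAILED_NON_RETRIABLE, FAILED_RETRIES_EXHAUSTED }

    static Outcome syncWithBoundedRetry(PeerOp op, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                op.run();
                return Outcome.SUCCESS;
            } catch (TableMissingOnPeerException e) {
                // Retrying cannot make the table appear on the peer, so give up now.
                return Outcome.FAILED_NON_RETRIABLE;
            } catch (IOException e) {
                // Assumed transient (e.g. a network problem): fall through and retry.
            }
        }
        return Outcome.FAILED_RETRIES_EXHAUSTED;
    }
}
```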

  
 * A table with the same name exists in both the active and peer clusters, and 
both have family c1. If we alter table t1 and set version = 5 for family c1 in 
the active cluster, the patch skips modifying the table in the peer, because it 
only handles add column / delete column, not altering existing column 
families. But by sync, don't we mean we want to sync everything? What if the 
active goes down with t1:c1:version=5, while the peer, which is used for 
disaster recovery, still has t1:c1:version=1 (i.e. whatever the column 
family's descriptor property was at creation)?
 * A table with the same name exists in both the active and peer clusters, but 
with families f1, f2 in the active cluster and families c1, c2 in the peer 
cluster. If we alter table t1 to drop family f1 in the active cluster, then 
since the count logic is used (not descriptor comparison. WHY?), it will 
replace the existing descriptor on the peer, drop families c1 and c2 (along 
with data, if any) from the peer, and add family f1 to it (as this is the new 
descriptor in active).
 * A table with the same name exists in both the active and peer clusters, but 
with families f1, f2 in the active cluster and family c1 in the peer cluster. 
If we alter table t1 to drop family f1 in the active cluster, it will skip 
modifying the table in the peer because the family count is the same. Again, 
should we log a warning if the table descriptor is different on the peer?
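
To illustrate why I am asking for descriptor comparison rather than the count logic, here is a rough sketch that compares column family names instead of counts. It is plain Java with hypothetical helper names ({{FamilyDiff}}, {{missingOnPeer}}, {{extraOnPeer}}), not code from the patch, and a real comparison would also need to cover per-family properties such as versions.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not HBase API: compares column family names between
// the active and peer descriptors instead of comparing family counts.
final class FamilyDiff {

    /** Families present in the active cluster but absent on the peer. */
    static Set<String> missingOnPeer(Set<String> active, Set<String> peer) {
        Set<String> diff = new TreeSet<>(active);
        diff.removeAll(peer);
        return diff;
    }

    /** Families present on the peer but absent in the active cluster. */
    static Set<String> extraOnPeer(Set<String> active, Set<String> peer) {
        Set<String> diff = new TreeSet<>(peer);
        diff.removeAll(active);
        return diff;
    }

    /** True only when both sides have exactly the same family names. */
    static boolean sameFamilies(Set<String> active, Set<String> peer) {
        return active.equals(peer);
    }
}
```

For the last case above (active left with f2 after dropping f1, peer with c1), the counts match but {{sameFamilies}} returns false, which would be a natural place to log the warning.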

> Create table for peer cluster automatically when creating table in source 
> cluster of using namespace replication.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-18822
>                 URL: https://issues.apache.org/jira/browse/HBASE-18822
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 2.0.0-alpha-2
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-18822.v1.patch, HBASE-18822.v1.patch
>
>
> In our cluster of using namespace replication,   we always forget to create 
> table in peer cluster, which lead to replication get stuck. 
> We have implemented the feature in our cluster:  create table for peer 
> cluster automatically when creating table in source cluster of using 
> namespace replication.
>  
> I'm not sure if someone else needs this feature, so create an issue here for 
> discussing   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
