[jira] [Commented] (HBASE-20456) Support removing a ReplicationSourceShipper for a special wal group

Wellington Chevreuil (JIRA) Fri, 16 Nov 2018 09:34:38 -0800


    [ 
https://issues.apache.org/jira/browse/HBASE-20456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689696#comment-16689696
 ]


Wellington Chevreuil commented on HBASE-20456:
----------------------------------------------

Hi [~Apache9], 

I was running some replication tests on the master branch, and I believe 
there's an issue caused by the following change introduced by this patch:

{noformat}
-  private void handleEmptyWALEntryBatch(Path currentPath) throws InterruptedException {
+  private void handleEmptyWALEntryBatch() throws InterruptedException {
     LOG.trace("Didn't read any new entries from WAL");
-    if (source.isRecovered()) {
-      // we're done with queue recovery, shut ourself down
+    if (logQueue.isEmpty()) {
+      // we're done with current queue, either this is a recovered queue, or it is the special group
+      // for a sync replication peer and the peer has been transited to DA or S state.
{noformat}

The problem is that a normal (non-recovered) source can start while the 
current WAL is still empty (for instance, just after an RS restart). In that 
case the reader reaches the end of the WAL and then shuts down the source. If 
new edits are then added to this same WAL, they won't get replicated until 
this WAL file gets into a recovery queue.
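
To make the difference between the two conditions concrete, here is a minimal 
standalone sketch. It is not HBase code: the class and method names 
(EmptyBatchSketch, shutsDownBefore, shutsDownAfter) are hypothetical, and it 
assumes the log queue is observed as empty at the moment the reader gets an 
empty batch. It only compares the old and new checks from the diff above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Simplified model of the shutdown decision on an empty WAL entry batch.
// Hypothetical names; the real logic lives in the HBase replication reader.
public class EmptyBatchSketch {

    // Old check: only a recovered source shuts itself down.
    static boolean shutsDownBefore(boolean recovered, Queue<String> logQueue) {
        return recovered;
    }

    // New check: any source whose log queue is empty shuts itself down,
    // regardless of whether it is a recovered source.
    static boolean shutsDownAfter(boolean recovered, Queue<String> logQueue) {
        return logQueue.isEmpty();
    }

    public static void main(String[] args) {
        // Assumption: a normal source right after an RS restart, with
        // nothing queued yet when the empty batch is handled.
        boolean recovered = false;
        Queue<String> logQueue = new ArrayDeque<>();

        System.out.println("before patch, shuts down: "
            + shutsDownBefore(recovered, logQueue)); // false
        System.out.println("after patch, shuts down: "
            + shutsDownAfter(recovered, logQueue));  // true
    }
}
```

Under that assumption, the old check keeps the normal source alive, while the 
new one terminates it, which matches the behaviour I'm seeing.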

This is easy to reproduce: set up replication, restart the source RS, and do 
a put on the source. The related edit does not reach the target until the WAL 
goes to a recovery queue (restarting the RS, for instance, causes the WAL 
containing that put to be recovered).

> Support removing a ReplicationSourceShipper for a special wal group
> -------------------------------------------------------------------
>
>                 Key: HBASE-20456
>                 URL: https://issues.apache.org/jira/browse/HBASE-20456
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Replication, wal
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HBASE-20456-HBASE-19064-v1.patch, 
> HBASE-20456-HBASE-19064-v2.patch, HBASE-20456-HBASE-19064-v3.patch, 
> HBASE-20456-HBASE-19064-v3.patch, HBASE-20456-HBASE-19064.patch
>
>
> For the multi wal implementation, if a new group is created, then we will 
> always open a wal writer for it, even if no one uses it later. But for sync 
> replication, if the peer is transited to DA or S from A, the special wal will 
> be closed and there will be no more wal files for it so we need to close the 
> shipper



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
