[jira] [Comment Edited] (HBASE-9465) Push entries to peer clusters serially

2016-12-28 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15530835#comment-15530835
 ] 

Vincent Poon edited comment on HBASE-9465 at 12/28/16 7:12 PM:
---

I don't think you need the code addition above 'continue' here - if 
readAllEntriesToReplicateOrNextFile() returns true, then 
lastPositionsForSerialScope should always be empty?

{code}
if (readAllEntriesToReplicateOrNextFile(currentWALisBeingWrittenTo, entries,
    lastPositionsForSerialScope)) {
  for (Map.Entry entry : lastPositionsForSerialScope.entrySet()) {
    waitingUntilCanPush(entry);
  }
  try {
    MetaTableAccessor.updateReplicationPositions(manager.getConnection(), actualPeerId,
        lastPositionsForSerialScope);
  } catch (IOException e) {
    LOG.error("updateReplicationPositions fail", e);
    stopper.stop("updateReplicationPositions fail");
  }

  continue;
}
{code}


was (Author: vincentpoon):
I think you need the code addition above 'continue" here - if readAllEntries... 
returns true, then lastPositionsForSerialScope should always be empty?

{code}
if (readAllEntriesToReplicateOrNextFile(currentWALisBeingWrittenTo, entries,
    lastPositionsForSerialScope)) {
  for (Map.Entry entry : lastPositionsForSerialScope.entrySet()) {
    waitingUntilCanPush(entry);
  }
  try {
    MetaTableAccessor.updateReplicationPositions(manager.getConnection(), actualPeerId,
        lastPositionsForSerialScope);
  } catch (IOException e) {
    LOG.error("updateReplicationPositions fail", e);
    stopper.stop("updateReplicationPositions fail");
  }

  continue;
}
{code}

> Push entries to peer clusters serially
> --
>
> Key: HBASE-9465
> URL: https://issues.apache.org/jira/browse/HBASE-9465
> Project: HBase
> Issue Type: New Feature
> Components: regionserver, Replication
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Honghua Feng
> Assignee: Phil Yang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-9465-branch-1-v1.patch, 
> HBASE-9465-branch-1-v1.patch, HBASE-9465-branch-1-v2.patch, 
> HBASE-9465-branch-1-v3.patch, HBASE-9465-branch-1-v4.patch, 
> HBASE-9465-branch-1-v4.patch, HBASE-9465-v1.patch, HBASE-9465-v2.patch, 
> HBASE-9465-v2.patch, HBASE-9465-v3.patch, HBASE-9465-v4.patch, 
> HBASE-9465-v5.patch, HBASE-9465-v6.patch, HBASE-9465-v6.patch, 
> HBASE-9465-v7.patch, HBASE-9465-v7.patch, HBASE-9465.pdf
>
>
> When a region move or RS failure occurs in the master cluster, the hlog 
> entries that were not pushed before the region move or RS failure will be 
> pushed by the original RS (for a region move) or by another RS that takes 
> over the remaining hlog of the dead RS (for an RS failure). New entries for 
> the same region(s) will be pushed by the RS that now serves the region(s), so 
> the hlog entries of the same region are pushed concurrently and without 
> coordination.
> This can lead to data inconsistency between the master and peer clusters:
> 1. A put and then a delete are written to the master cluster.
> 2. Due to the region move or RS failure, they are pushed to the peer cluster 
> by different replication-source threads.
> 3. If the delete is pushed to the peer cluster before the put, and a flush 
> and major compaction occur in the peer cluster before the put arrives, the 
> delete is collected and the put remains in the peer cluster.
> In this scenario the put remains in the peer cluster, but in the master 
> cluster the put is masked by the delete, hence the data inconsistency between 
> the master and peer clusters.
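
The ordering problem in the description above can be sketched with a toy in-memory model. This is only an illustration, not HBase code: ToyStore, Cell, replayInOrder and replayOutOfOrder are hypothetical names, and "major compaction" here is simplified to dropping delete markers together with the puts they mask.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Toy model of one row in a peer cluster (hypothetical, not HBase classes).
public class SerialReplicationSketch {
    enum Type { PUT, DELETE }

    static final class Cell {
        final Type type;
        final long timestamp;
        final String value;

        Cell(Type type, long timestamp, String value) {
            this.type = type;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    static final class ToyStore {
        private final List<Cell> cells = new ArrayList<>();

        void apply(Cell cell) {
            cells.add(cell);
        }

        // Timestamp of the newest delete marker, or Long.MIN_VALUE if none.
        private long newestDelete() {
            return cells.stream()
                .filter(c -> c.type == Type.DELETE)
                .mapToLong(c -> c.timestamp)
                .max()
                .orElse(Long.MIN_VALUE);
        }

        // Simplified major compaction: drop delete markers and masked puts.
        void majorCompact() {
            long deleteTs = newestDelete();
            cells.removeIf(c -> c.type == Type.DELETE || c.timestamp <= deleteTs);
        }

        // A put is visible only if it is newer than every delete marker.
        Optional<String> read() {
            long deleteTs = newestDelete();
            return cells.stream()
                .filter(c -> c.type == Type.PUT && c.timestamp > deleteTs)
                .max(Comparator.comparingLong((Cell c) -> c.timestamp))
                .map(c -> c.value);
        }
    }

    // Edits written to the master cluster: put at ts=1, then delete at ts=2.
    static final Cell PUT = new Cell(Type.PUT, 1L, "v1");
    static final Cell DELETE = new Cell(Type.DELETE, 2L, null);

    // Serial push: put, then delete, then compaction. The put stays masked.
    static Optional<String> replayInOrder() {
        ToyStore peer = new ToyStore();
        peer.apply(PUT);
        peer.apply(DELETE);
        peer.majorCompact();
        return peer.read();
    }

    // Uncoordinated push: delete arrives first and is compacted away
    // before the put arrives, so nothing masks the put any more.
    static Optional<String> replayOutOfOrder() {
        ToyStore peer = new ToyStore();
        peer.apply(DELETE);
        peer.majorCompact();
        peer.apply(PUT);
        return peer.read();
    }

    public static void main(String[] args) {
        System.out.println("in order:     " + replayInOrder());
        System.out.println("out of order: " + replayOutOfOrder());
    }
}
```

In this model the in-order replay leaves the row empty, matching the master cluster, while the out-of-order replay leaves "v1" readable in the peer.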



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-9465) Push entries to peer clusters serially

2016-08-08 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412858#comment-15412858
 ] 

Phil Yang edited comment on HBASE-9465 at 8/9/16 3:20 AM:
--

I cannot see the failure message in Test Results, and the test passes locally. 
Let's resubmit the patch to retry.


was (Author: yangzhe1991):
Can not seen the failure message in Test Results, and it can pass locally, 
let's resubmit the patch to retry

> Push entries to peer clusters serially
> --
>
> Key: HBASE-9465
> URL: https://issues.apache.org/jira/browse/HBASE-9465
> Project: HBase
> Issue Type: New Feature
> Components: regionserver, Replication
> Reporter: Honghua Feng
> Assignee: Phil Yang
> Attachments: HBASE-9465-branch-1-v1.patch, 
> HBASE-9465-branch-1-v1.patch, HBASE-9465-branch-1-v2.patch, 
> HBASE-9465-branch-1-v3.patch, HBASE-9465-branch-1-v4.patch, 
> HBASE-9465-v1.patch, HBASE-9465-v2.patch, HBASE-9465-v2.patch, 
> HBASE-9465-v3.patch, HBASE-9465-v4.patch, HBASE-9465-v5.patch, 
> HBASE-9465-v6.patch, HBASE-9465-v6.patch, HBASE-9465-v7.patch, 
> HBASE-9465-v7.patch, HBASE-9465.pdf





[jira] [Comment Edited] (HBASE-9465) Push entries to peer clusters serially

2016-07-29 Thread Phil Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398898#comment-15398898
 ] 

Phil Yang edited comment on HBASE-9465 at 7/29/16 7:41 AM:
---

I read HBASE-15156. I have no idea when it will be committed to master, and it 
is configurable and disabled by default. Maybe we can keep the logic simpler 
for now, and after HBASE-15156 is pushed to master we can open a follow-up 
issue to handle this case if needed?


was (Author: yangzhe1991):
Read HBASE-15156 , I have no idea when it will be committed to master and it is 
configurable and default is disable. Maybe we can keep the logic simpler now 
and after HBASE-15156 pushed to master, we can open a follow-up issue to handle 
this case?

> Push entries to peer clusters serially
> --
>
> Key: HBASE-9465
> URL: https://issues.apache.org/jira/browse/HBASE-9465
> Project: HBase
> Issue Type: New Feature
> Components: regionserver, Replication
> Reporter: Honghua Feng
> Assignee: Phil Yang
> Attachments: HBASE-9465-v1.patch, HBASE-9465-v2.patch, 
> HBASE-9465-v2.patch, HBASE-9465-v3.patch, HBASE-9465.pdf


