[jira] [Commented] (HBASE-21856) Consider Causal Replication Ordering

2019-12-30 Thread Bharath Vissapragada (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17005913#comment-17005913
 ] 

Bharath Vissapragada commented on HBASE-21856:
--

Yikes, don't know how I missed that. Thanks for correcting me [~larsh].

I've spent some time on the current replication internals and design, and what 
[~apurtell] suggested (in his first comment) here makes total sense to me. I 
think that's doable without invasive code changes. The suggested ordering (the 
order seen by the sink for all mutations to a given row must be causal) is 
critical to a project that I'm working on, so I'd like to assign this to myself 
and give it a shot. Hope that's OK.

> Consider Causal Replication Ordering
> 
>
> Key: HBASE-21856
> URL: https://issues.apache.org/jira/browse/HBASE-21856
> Project: HBase
>  Issue Type: Brainstorming
>  Components: Replication
>Reporter: Lars Hofhansl
>Priority: Major
>  Labels: Replication
>
> We've had various efforts to improve the ordering guarantees for HBase 
> replication, most notably Serial Replication.
> I think in many cases guaranteeing a Total Replication Order is not required, 
> but a simpler Causal Replication Order is sufficient.
> Specifically we would guarantee causal ordering for a single Rowkey. Any 
> changes to a Row - Puts, Deletes, etc - would be replicated in the exact 
> order in which they occurred in the source system.
> Unlike total ordering this can be accomplished with only local region server 
> control.
> I don't have a full design in mind, let's discuss here. It should be 
> sufficient to do the following:
> # RegionServers only adopt the replication queues from other RegionServers 
> for regions they (now) own. This requires log splitting for replication.
> # RegionServers ship all edits for queues adopted from other servers before 
> any of their "own" edits are shipped.
> It's probably a bit more involved, but should be much cheaper than the total 
> ordering provided by serial replication.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-21856) Consider Causal Replication Ordering

2019-10-10 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948976#comment-16948976
 ] 

Lars Hofhansl commented on HBASE-21856:
---

[~bharathv] See the description (I mention "Serial Replication" there) :) ... The 
thought is that serial replication (i.e. a global ordering) is very expensive 
and not always needed.
I'd argue that for most use cases the ordering I propose here is sufficient. In 
the end we should have a discussion about what specific problem we want to 
solve.



[jira] [Commented] (HBASE-21856) Consider Causal Replication Ordering

2019-10-07 Thread Bharath Vissapragada (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946309#comment-16946309
 ] 

Bharath Vissapragada commented on HBASE-21856:
--

I'm kinda curious how this relates to HBASE-20046. It looks to me like the 
intent here is the same. Thoughts? (Or am I missing something?)



[jira] [Commented] (HBASE-21856) Consider Causal Replication Ordering

2019-09-19 Thread Andrew Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933196#comment-16933196
 ] 

Andrew Purtell commented on HBASE-21856:


That's correct. But if we are having issues as described, we will be fine if 
we introduce a back pressure signal that stops the sender, or causes it to 
(exponentially) back off.
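
A minimal sketch of the exponential back-off idea, for illustration only. The `send` callable and its boolean "accepted" return value are hypothetical stand-ins for a back pressure signal from the sink; nothing in this thread says the replication protocol exposes one today, and the parameter names and defaults are assumptions.

```python
import time


def ship_with_backoff(send, batch, base_delay=0.1, max_delay=10.0,
                      max_attempts=8, sleep=time.sleep):
    """Retry `send(batch)` until the sink accepts it, doubling the wait each time.

    `send` is a hypothetical callable that returns True when the sink accepted
    the batch and False when it signalled back pressure.
    Returns the number of attempts that were needed.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        if send(batch):
            return attempt
        if attempt < max_attempts:
            sleep(delay)                        # stop sending for a while
            delay = min(max_delay, delay * 2)   # exponential growth, capped
    raise RuntimeError("sink still signalling back pressure; giving up")
```

Passing `sleep` in as a parameter keeps the sketch testable without real waiting; a real sender would also need to decide what to do with the batch once the attempts are exhausted.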



[jira] [Commented] (HBASE-21856) Consider Causal Replication Ordering

2019-09-16 Thread Bin Shi (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930701#comment-16930701
 ] 

Bin Shi commented on HBASE-21856:
-

[~apurtell], I want to make sure we're on the same page: if we don't make the 
source send batches sequentially, they could arrive at the same destination in 
the sink out of order. The sink side would then need to buffer the arrived 
batches within a time window and sort them by sequence ID. Because the sequence 
IDs aren't consecutive, the sink cannot tell whether a batch is still missing, 
so edits with smaller sequence IDs might arrive after the window has closed if 
the window isn't big enough, while a bigger window results in longer latency 
and extra complexity. Is my understanding correct?
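
The buffering scheme described above can be sketched roughly as follows. This is an illustration under assumptions, not existing HBase code: the `ReorderBuffer` class, its method names, and the caller-supplied `now` timestamps are all hypothetical. It shows both halves of the trade-off: batches are held for `window` time units and released in sequence ID order, and a batch whose sequence ID is not greater than one already released is rejected because the order can no longer be restored.

```python
import heapq


class ReorderBuffer:
    """Hold batches for `window` time units, then release them by sequence ID."""

    def __init__(self, window):
        self.window = window
        self.heap = []            # entries are (seq_id, arrival_time, batch)
        self.last_released = -1   # highest sequence ID already handed out

    def offer(self, seq_id, batch, now):
        """Buffer a batch; return False if it arrived after the window closed."""
        if seq_id <= self.last_released:
            return False          # a later batch was already applied
        heapq.heappush(self.heap, (seq_id, now, batch))
        return True

    def release(self, now):
        """Return, in sequence ID order, batches whose wait time has elapsed.

        Release stops at the first (smallest ID) batch that is still inside
        the window, so a larger ID never overtakes a buffered smaller one.
        """
        released = []
        while self.heap and self.heap[0][1] + self.window <= now:
            seq_id, _, batch = heapq.heappop(self.heap)
            self.last_released = seq_id
            released.append(batch)
        return released
```

For example, with `window=5`, batches 20 and 10 arriving at times 0 and 1 are released as 10 then 20 at time 6, and a batch 15 arriving at time 7 is rejected. A larger window would have caught it, at the cost of holding every batch longer.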



[jira] [Commented] (HBASE-21856) Consider Causal Replication Ordering

2019-09-10 Thread Andrew Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926815#comment-16926815
 ] 

Andrew Purtell commented on HBASE-21856:


The *_source_* must ship all edits for a region to the same sink regionserver. 
That is the narrowest requirement on the source side: all edits for a region 
must flow to the same sink, presumably selected with a consistent hash. The 
*_sink_* regionserver must then apply the batches in order. It is not strictly 
necessary for the source to send batches sequentially or in a blocking manner. 
We might opt to implement it that way, but it is not necessary. IMHO, it is 
better to do the mastering and ordering on the sink side, but I agree it may 
make the implementation more challenging.



[jira] [Commented] (HBASE-21856) Consider Causal Replication Ordering

2019-09-06 Thread Bin Shi (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924560#comment-16924560
 ] 

Bin Shi commented on HBASE-21856:
-

[~apurtell], what you said about "source side ordering" makes sense to me. I 
want to know more about why the source needs to ship the edits of a region to 
the same destination in the sink, as described under "Sink side ordering".

Currently, the source ships the batches (a batch contains edits of multiple 
regions) with multiple replication RPCs in parallel, and each RPC is a 
synchronous call (it waits for the replication response before sending another 
batch). On the sink side, when a region server receives a replication RPC, it 
splits the batch into smaller batches per region, calls table.batch() 
synchronously to apply each batch to its region, then sends the replication 
response to the source. If my understanding is correct, what we are really 
missing here is that the source should ship batches from the same region 
sequentially, rather than making sure these batches land on the same 
destination. With the former, since we only have a single source now, we can 
preserve the order for a region in the presence of cross-cluster replication, 
no matter whether the batches from the same region land on the same destination 
or on different destinations. Without the former, even if the batches from the 
same region land on the same destination, we still can't preserve the order.
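
To make "ship batches from the same region sequentially" concrete, here is a small sketch that allows at most one in-flight batch per region while different regions still ship in parallel. The `PerRegionSerializer` class and its method names are hypothetical and only illustrate the idea; they do not correspond to existing replication source code.

```python
from collections import defaultdict, deque


class PerRegionSerializer:
    """Allow at most one unacknowledged batch per region at any time."""

    def __init__(self):
        self.pending = defaultdict(deque)  # region -> batches in source order
        self.in_flight = set()             # regions awaiting a response

    def submit(self, region, batch):
        self.pending[region].append(batch)

    def dispatchable(self):
        """Return (region, batch) pairs that may be sent right now."""
        ready = []
        for region, queue in self.pending.items():
            if queue and region not in self.in_flight:
                self.in_flight.add(region)
                ready.append((region, queue.popleft()))
        return ready

    def ack(self, region):
        """Call when the sink has confirmed the region's in-flight batch."""
        self.in_flight.discard(region)
```

With batches `a` and `b` queued for region `r1` and `c` for `r2`, the first call to `dispatchable()` yields `a` and `c`; `b` only becomes dispatchable after `ack("r1")`.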

Please let me know if I missed something.



[jira] [Commented] (HBASE-21856) Consider Causal Replication Ordering

2019-03-08 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788236#comment-16788236
 ] 

Andrew Purtell commented on HBASE-21856:


There are two aspects of this:

1. Source side ordering. Sources need to ensure that all/any logs which contain 
the edits for a given region and row are processed in order. Today we can have 
the region open on one server generating edits and a recovery queue from a 
crashed regionserver somewhere else, leading to interleaving. [~lhofhansl]'s 
suggestions in the issue description are a solution for this. A strawman ideal 
solution: Replication queues are split into per region queues, just like WALs 
are split into per region recovered edits files. The new owner of the primary 
region takes ownership of the split-replication-queue and drains the queue 
before shipping newer edits. 
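
The strawman above ("drain the adopted queue before shipping newer edits") can be sketched as follows. All names here (`RegionShipper`, `adopt`, `append`, `next_batch`) are hypothetical and chosen for illustration; the sketch models one region's edits as plain list items and ignores WAL files, persistence, and failure handling.

```python
from collections import deque


class RegionShipper:
    """Ship a region's recovered edits before any edits made by the new owner."""

    def __init__(self):
        self.adopted = deque()  # recovered per-region queues, oldest first
        self.own = deque()      # edits generated since this server took over

    def adopt(self, recovered_edits):
        """Take ownership of a split replication queue for this region."""
        self.adopted.append(deque(recovered_edits))

    def append(self, edit):
        """Record a new edit written on the current owner."""
        self.own.append(edit)

    def next_batch(self, max_edits):
        """Build the next batch, never placing a new edit before a recovered one."""
        batch = []
        while self.adopted and len(batch) < max_edits:
            queue = self.adopted[0]
            while queue and len(batch) < max_edits:
                batch.append(queue.popleft())
            if not queue:
                self.adopted.popleft()   # this recovered queue is drained
        if self.adopted:
            return batch                 # recovered edits remain; hold back own edits
        while self.own and len(batch) < max_edits:
            batch.append(self.own.popleft())
        return batch
```

Because own edits are only appended once every adopted queue is empty, a batch may end with new edits but can never interleave them ahead of recovered ones.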

2. Sink side ordering. All of the work above is useless if sources then send 
batches of work containing the edits for a given region and row to more than 
one destination over at the sink. Who can say in what order the sinks will 
process the replication RPCs? They could be processed out of order or in 
parallel. The sources should build replication batch RPCs by region (the code 
today does this in part) and should choose a constant destination for a region's 
batched edits with a consistent hash. (Multiple pending batches for the same 
remote regionserver can be merged as an optimization.) Given a set of sink 
endpoints and an encoded region name, always choose the same sink. This 
guarantees at most one live endpoint in the sink will be processing a region's 
edits, and so can apply the edits in a total order. 
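
One way to realize "given a set of sink endpoints and an encoded region name, always choose the same sink" is rendezvous (highest-random-weight) hashing, sketched below. The comment only asks for "a consistent hash", so the choice of rendezvous hashing, the `choose_sink` function name, and the endpoint strings are assumptions made for this example.

```python
import hashlib
from typing import Sequence


def choose_sink(encoded_region_name: str, sinks: Sequence[str]) -> str:
    """Deterministically map a region to one sink endpoint.

    Each (sink, region) pair gets a pseudo-random weight from a hash; the sink
    with the highest weight wins. The result does not depend on the order of
    `sinks`, and removing a sink only remaps the regions that were assigned
    to that sink.
    """
    if not sinks:
        raise ValueError("no sink endpoints available")

    def weight(sink: str) -> int:
        key = f"{sink}|{encoded_region_name}".encode("utf-8")
        return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")

    return max(sinks, key=weight)
```

Every source that sees the same set of sink endpoints picks the same sink for a region without any coordination. While the endpoint set is changing, sources with different views may briefly disagree, which a real design would have to account for.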
