HBase replication: "in order semantics"

Jan Van Besien Fri, 09 Nov 2012 05:25:50 -0800

Hi,

I am trying to understand in detail how HBase replication works.

First of all, I assume that for replication to work correctly, all edits must be replayed on the replica HBase cluster in the same order as they were executed on the source HBase cluster. Correct?


If so, I am trying to understand how that is guaranteed.

I can see that this holds trivially when the edits are read from the HLog in order and used as the source for replication.

However, what if a region is moved to another region server? Can we not end up in the following situation?


1) region A is originally hosted by region server X.

2) replication in region server X is replicating edits of region A. Say that it is lagging behind a bit, so it has a number of edits still to do.

3) region A is moved to region server Y.

4) edits for region A arrive on region server Y, and replication on region server Y starts replicating them.

5) replication in region server X is still busy with some leftover edits from region A, so these are replicated out of order.
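
To make the scenario concrete, here is a toy sketch in Python. It is not HBase code: the two deques stand in for the HLogs of region servers X and Y, `ship` is a hypothetical stand-in for a replication source sending one edit, and the replica is assumed to simply apply edits in arrival order. That last assumption belongs to the sketch only and says nothing about how HBase actually resolves versions.

```python
from collections import deque


def replicate_scenario():
    """Toy model of steps 1-5: two per-server logs shipped independently."""
    # Each edit is (sequence number on the source cluster, row, value).
    # Step 1: region A lives on X, so its first edits land in X's log.
    log_x = deque([(1, "row1", "a"), (2, "row1", "b")])
    log_y = deque()

    sink = {}      # state of the replica cluster
    applied = []   # order in which the replica received the edits

    def ship(log):
        # Hypothetical replication step: send the oldest edit of one log.
        seq, row, value = log.popleft()
        sink[row] = value      # assumption: replica applies in arrival order
        applied.append(seq)

    # Step 2: X ships edit 1 but lags behind, so edit 2 is still pending.
    ship(log_x)
    # Steps 3 and 4: region A moves to Y, a new edit arrives there and
    # Y's replication ships it right away.
    log_y.append((3, "row1", "c"))
    ship(log_y)
    # Step 5: X finally ships its leftover edit for region A.
    ship(log_x)
    return applied, sink


if __name__ == "__main__":
    applied, sink = replicate_scenario()
    print(applied)  # [1, 3, 2]
    print(sink)     # {'row1': 'b'}
```

In this model the source cluster ends with value "c" for row1, while the replica ends with "b", because edit 2 arrived after edit 3.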

So the question really is whether there is a mechanism to prevent the replication source from reading edits in an HLog for a region that was meanwhile already moved to another region server.

It could be that it has something to do with log splitting and recovery, but I was under the assumption that HBase only splits logs in case of recovery and/or master restart, and not in case of region moves.


I hope somebody can shed some light on this topic.

Thanks,
Jan
