Re: Considering a completely different administrative approach for Replication v2

Vladimir Rodionov Tue, 08 Sep 2015 09:57:49 -0700

I have always wanted to ask this: how does timeline consistency differ
from causal consistency?
There are approaches to asynchronous replication that preserve the
causality of events in a system,
which is certainly better than eventual consistency.


-Vlad

On Tue, Sep 8, 2015 at 8:59 AM, Andrew Purtell <[email protected]> wrote:

> That's not my motivation for bringing this up. I suppose if you favor a
> similarity then it's a happy accident.
>
>
> On Mon, Sep 7, 2015 at 11:48 PM, Jerry He <[email protected]> wrote:
>
> > Interesting idea.
> > Is this meant to provide an alternative 'native' cross-DC replication
> > support that is close to Cassandra's?
> >
> > Jerry
> > On Sep 7, 2015 10:44 PM, "Andrew Purtell" <[email protected]> wrote:
> >
> > > I opened an umbrella for Replication v2 as HBASE-14379. At the moment it
> > > envisions the administration of cross-DC replication relationships and data
> > > access as the same as today. However, we do have an opportunity to reboot
> > > with a completely different approach. I thought it worth bringing up for
> > > discussion.
> > >
> > > We could in theory reboot around timeline consistent region replicas. If
> > > you squint, region replicas have a similar theory of operation to cross-DC
> > > replication. What if we redefine administration and data access for
> > > Replication v2 as sets of region replica placements that can cross data
> > > center boundaries, with the client able to distinguish local locations from
> > > remote locations, and then choose based on policy? So if, for example, you
> > > have three data centers, then instead of setting up three
> > > point-to-point replication peering relationships like today, you'd simply
> > > create a table that has a region replica placement policy in its schema
> > > with (logical) locations spanning all three data centers. Behind the
> > > scenes, each data center would have HBASE-10070 style primary-secondary
> > > relationships, and additionally:
> > > 1. the primary region will run something like today's replication source
> > > for each secondary location that is in a remote DC;
> > > 2. a primary region anywhere may receive change streams from remote DCs
> > > like today's replication sinks.
> > >
> > > On the client side we have some prior work in this regard: CSBT, and Ted
> > > Malaska's HBase.MCC. I mention CSBT but I don't think we want its
> > > partitioning or reliance on ZooKeeper. HBase.MCC is more of a starting
> > > point.
> > >
> > > I'm not saying we should do this, only that we could do this. There are
> > > pros and cons. In some ways defining point-to-point replication
> > > relationships is easier for admins and users, e.g. the topology is built
> > > and managed explicitly. In some ways merging replicas and cross-DC
> > > replication is easier, e.g. it removes APIs, necessary tooling, cognitive
> > > load (cross-DC replication is no longer 'special').
> >
>
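
To make the quoted idea of a client that can "distinguish local locations from remote locations, and then choose based on policy" a little more concrete, here is a rough Python sketch. None of this is an existing HBase API: `ReadPolicy`, `ReplicaLocation`, `choose_replica`, and the host names are all hypothetical, and the sketch assumes each replica location carries a logical data center name and a primary/secondary flag.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ReadPolicy(Enum):
    """Hypothetical client-side read policies (not part of HBase)."""
    PRIMARY_ONLY = "primary_only"  # only read from a primary region
    PREFER_LOCAL = "prefer_local"  # local replica if one exists, else a primary
    LOCAL_ONLY = "local_only"      # never leave the client's data center


@dataclass(frozen=True)
class ReplicaLocation:
    """One placement of a region replica (hypothetical model)."""
    datacenter: str   # logical location name, e.g. "dc1"
    server: str       # illustrative host name
    is_primary: bool


def choose_replica(
    locations: List[ReplicaLocation],
    client_dc: str,
    policy: ReadPolicy,
) -> Optional[ReplicaLocation]:
    """Pick a replica location for a read, or None if the policy forbids all."""
    primaries = [loc for loc in locations if loc.is_primary]
    local = [loc for loc in locations if loc.datacenter == client_dc]

    if policy is ReadPolicy.PRIMARY_ONLY:
        # Prefer a primary in the client's own data center when there is one.
        ordered = sorted(primaries, key=lambda loc: loc.datacenter != client_dc)
        return ordered[0] if ordered else None

    if local:
        # Among local replicas, prefer the primary over secondaries.
        return sorted(local, key=lambda loc: not loc.is_primary)[0]

    if policy is ReadPolicy.LOCAL_ONLY:
        return None

    # PREFER_LOCAL with no local replica: fall back to any primary.
    return primaries[0] if primaries else None
```

With three placements (a primary in "dc1" and secondaries in "dc2" and "dc3"), a client in "dc2" using `PREFER_LOCAL` would read from the "dc2" secondary, while the same client using `PRIMARY_ONLY` would cross to "dc1". Whether a real design would expose policies like these is an open question in the thread above.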
