That's not my motivation for bringing this up. I suppose if you favor a similarity then it's a happy accident.
On Mon, Sep 7, 2015 at 11:48 PM, Jerry He <[email protected]> wrote:

> Interesting idea.
> Is this meant to provide an alternative 'native' cross DC replication
> support that is close to Cassandra ?
>
> Jerry
> On Sep 7, 2015 10:44 PM, "Andrew Purtell" <[email protected]> wrote:
>
> > I opened an umbrella for Replication v2 as HBASE-14379. At the moment it
> > envisions the administration of cross-DC replication relationships and
> > data access as the same as today. However, we do have an opportunity to
> > reboot with a completely different approach. I thought it worth bringing
> > up for discussion.
> >
> > We could in theory reboot around timeline consistent region replicas. If
> > you squint, region replicas have a similar theory of operation as
> > cross-DC replication. What if we redefine administration and data access
> > for Replication v2 as sets of region replica placements that can cross
> > data center boundaries, with the client able to distinguish local
> > locations from remote locations, and then choose based on policy? So if,
> > for example, you may have three data centers, then instead of setting up
> > three point-to-point replication peering relationships like today, you'd
> > simply create a table that has a region replica placement policy in its
> > schema with (logical) locations spanning all three data centers. Behind
> > the scenes, each data center would have HBASE-10070 style
> > primary-secondary relationships, and additionally:
> > 1. the primary region will run something like today's replication source
> > for each secondary location that is in a remote DC;
> > 2. a primary region anywhere may receive change streams from remote DCs
> > like today's replication sinks.
> >
> > On the client side we have some prior work in this regard: CSBT, and Ted
> > Malaska's HBase.MCC. I mention CSBT but I don't think we want its
> > partitioning or reliance on Zookeeper. HBase.MCC is more of a starting
> > point.
> >
> > I'm not saying we should do this, only that we could do this. There are
> > pros and cons. In some ways defining point-to-point replication
> > relationships is easier for admins and users, e.g. the topology is built
> > and managed explicitly. In some ways merging replicas and cross-DC
> > replication is easier, e.g. it removes APIs, necessary tooling, cognitive
> > load (cross-DC replication is no longer 'special').
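
To make the client-side part of the quoted idea concrete (the client distinguishing local from remote replica locations and choosing by policy), here is a rough sketch. It is only an illustration under stated assumptions: ReplicaLocation, ReadPolicy, and choose_replica are hypothetical names invented for this example. They are not part of HBase, CSBT, or HBase.MCC, and the two policies shown are just one plausible pair.

```python
from dataclasses import dataclass
from enum import Enum


class ReadPolicy(Enum):
    """Hypothetical client read policies (not an HBase API)."""
    PRIMARY_ONLY = "primary_only"   # only read from a primary replica
    PREFER_LOCAL = "prefer_local"   # read from any replica, local DC first


@dataclass(frozen=True)
class ReplicaLocation:
    """Hypothetical description of where one region replica lives."""
    datacenter: str
    server: str
    is_primary: bool


def choose_replica(locations, client_dc, policy):
    """Pick a replica for a read, given the client's own data center.

    Ranking is: local before remote, then primary before secondary.
    Ties keep the order in which locations were listed.
    """
    if policy is ReadPolicy.PRIMARY_ONLY:
        candidates = [loc for loc in locations if loc.is_primary]
    else:
        candidates = list(locations)
    if not candidates:
        raise ValueError("no replica location satisfies the policy")
    # False sorts before True, so a local primary gets the smallest key.
    return min(
        candidates,
        key=lambda loc: (loc.datacenter != client_dc, not loc.is_primary),
    )


# Example placement: one primary and two secondaries across three DCs.
locations = [
    ReplicaLocation("dc1", "rs1", True),
    ReplicaLocation("dc2", "rs2", False),
    ReplicaLocation("dc3", "rs3", False),
]
```

With this placement, a client in dc2 using PREFER_LOCAL would read from the secondary on rs2 and accept timeline-consistent (possibly stale) results, while the same client using PRIMARY_ONLY would go cross-DC to rs1.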
