Re: [GENERAL] BDR Selective Replication

Jim Nasby Tue, 28 Apr 2015 18:16:15 -0700

On 4/27/15 7:54 PM, Craig Ringer wrote:

    If 'default replication set' is the idea of "here's what tables
    *should* be getting replicated regardless of whether that's
    happening or not", it'd be great if that was done so it could be
    split out on its own at some point. It's a problem that affects all
    replication systems.



It wasn't, but that's an interesting idea.

You need a way to identify peer nodes in an abstract way before you can
really define sets of which nodes should get which tables. So I think
replication identifiers ( https://commitfest.postgresql.org/4/161/ ) are
a pre-requisite for that though, and one that's proving difficult to get
in.

Perhaps... different replication systems probably use different methods
to identify, so presumably there'd need to be some way to map a generic
identifier into an appropriate identifier for whatever replication
system you're using.

I think any sort of replication sets is likely to have similar problems,
especially the "no in-core user" problem. There's nothing fundamentally
impossible about filtering WAL sent to physical downstreams over
streaming replication to include only replicated tables and the
catalogs, though, so perhaps there could be an in-core user for it.

Oh, I wasn't thinking this needed to be in-core. I think it'd be a lot
easier to develop it as an extension to start with... certainly a lot
less headache ;) If it becomes popular then it'll be a lot easier to get
it added.

In BDR we're currently (ab)using security labels to tag tables with
their replication sets, but I'd love to have a proper way to do that. As
I recall the prior approach, of allowing custom relation options, was
rejected on -hackers.

How would you want to go about storing and tracking the information? A
new catalog? The other issue for in-core replication sets would probably
be making it foreign-key aware, so replication of a table transitively
requires replication of its references.
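
To make the foreign-key point concrete, here is a minimal sketch of what "transitively requires replication of its references" could mean. It assumes the FK graph has already been read out of the catalogs into a plain dict; the table names and the `fk_closure` helper are hypothetical, not anything BDR provides.

```python
from collections import deque


def fk_closure(selected, references):
    """Return the selected tables plus every table they reference, transitively.

    `references` maps a table name to the set of tables its foreign keys
    point at (an assumed, pre-extracted view of the FK graph).
    """
    result = set(selected)
    queue = deque(selected)
    while queue:
        table = queue.popleft()
        for target in references.get(table, ()):
            if target not in result:  # also guards against FK cycles
                result.add(target)
                queue.append(target)
    return result


# Hypothetical schema, for illustration only.
references = {
    "order_items": {"orders", "products"},
    "orders": {"customers"},
    "customers": set(),
    "products": set(),
    "audit_log": set(),
}

replicated = fk_closure({"order_items"}, references)
```

Asking for only `order_items` pulls in `orders`, `products` and `customers`, while `audit_log` stays out because nothing selected references it.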

As you said, we'd need a way to identify replication nodes. We might
also need/want a way to specify topology. I don't think topology would
be too hard (presumably it's either a single 'parent' node, or a list of
peers). What might be more interesting is dealing with different systems'
methods of identifying nodes.

You'd want a way to define different sets and associate them with nodes.
A node could be a provider, subscriber, or both. I think some
replication systems support 'pass through' as well, where the node
passes data downstream but doesn't apply it itself. Or it could be
multi-master and possibly a provider to read-only subscribers.
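
As a rough sketch of the node roles described above, something like the following could model provider, subscriber and pass-through nodes and the sets attached to them. The `Role` and `Node` names are made up for illustration and do not correspond to any existing replication system's API.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Role(Flag):
    PROVIDER = auto()      # sends its own changes downstream
    SUBSCRIBER = auto()    # receives changes and applies them locally
    PASS_THROUGH = auto()  # relays changes downstream without applying them


@dataclass
class Node:
    name: str
    roles: Role
    sets: set = field(default_factory=set)  # names of associated replication sets

    def applies_changes(self):
        return Role.SUBSCRIBER in self.roles

    def sends_changes(self):
        return Role.PROVIDER in self.roles or Role.PASS_THROUGH in self.roles


# Hypothetical topology: one multi-master node, a relay, a read-only replica.
master = Node("node_a", Role.PROVIDER | Role.SUBSCRIBER, {"default"})
relay = Node("relay", Role.PASS_THROUGH, {"default"})
replica = Node("readonly", Role.SUBSCRIBER, {"default"})
```

Using a flag enum lets "provider, subscriber, or both" fall out of combining roles rather than needing a separate value for each combination.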

Finally you'd need to associate tables and sequences with a set. I agree
you'd want to look at FKs. I'd also like to be able to define rules for
a set, like "include everything in this schema, unless the first
character is _".
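
A minimal sketch of that kind of rule, assuming set membership is decided from the schema and table name alone. The `in_set` helper, its parameters and the table names are hypothetical.

```python
def in_set(schema, table, rule_schema="public", excluded_prefix="_"):
    """Include every table in `rule_schema` unless its name starts with `excluded_prefix`."""
    return schema == rule_schema and not table.startswith(excluded_prefix)


# Hypothetical (schema, table) pairs, for illustration only.
tables = [
    ("public", "users"),
    ("public", "_staging"),
    ("reporting", "users"),
]

members = [t for t in tables if in_set(*t)]
```

Because the rule is evaluated against names rather than a stored list of tables, a table created later in the schema would match without any extra bookkeeping.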

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

