On 4/27/15 7:54 PM, Craig Ringer wrote:
If 'default replication set' is the idea of "here's what tables
*should* be getting replicated regardless of whether that's
happening or not", it'd be great if that was done so it could be
split out on its own at some point. It's a problem that affects all
replication systems.
It wasn't, but that's an interesting idea.
You need a way to identify peer nodes in an abstract way before you can
really define sets of which nodes should get which tables. So I think
replication identifiers ( https://commitfest.postgresql.org/4/161/ ) are
a prerequisite for that, and one that's proving difficult to get in.
Perhaps... different replication systems probably use different methods
to identify nodes, so presumably there'd need to be some way to map a
generic identifier into an appropriate identifier for whatever
replication system you're using.
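
As a rough illustration of that mapping idea, here is a minimal Python
sketch. Everything in it is hypothetical: the NodeIdentifierMap class,
the system names ("system_a", "system_b") and the identifier values are
made up for the example and don't correspond to any real replication
system's API.

```python
class NodeIdentifierMap:
    """Maps a generic node name to the identifier a given system uses."""

    def __init__(self):
        # (generic_name, system) -> system-specific identifier
        self._ids = {}

    def register(self, generic_name, system, native_id):
        self._ids[(generic_name, system)] = native_id

    def resolve(self, generic_name, system):
        try:
            return self._ids[(generic_name, system)]
        except KeyError:
            raise LookupError(
                f"no {system} identifier registered for node {generic_name!r}"
            ) from None


# Hypothetical example: one logical node known under two different schemes.
node_ids = NodeIdentifierMap()
node_ids.register("reporting", "system_a", 3)               # numeric node id
node_ids.register("reporting", "system_b", "reporting-01")  # string name

print(node_ids.resolve("reporting", "system_a"))
print(node_ids.resolve("reporting", "system_b"))
```

The point is only that the set definitions would refer to the generic
name, and each replication system would look up its own identifier.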
I think any sort of replication set is likely to have similar problems,
especially the "no in-core user" problem. There's nothing fundamentally
impossible about filtering WAL sent to physical downstreams over
streaming replication to include only replicated tables and the
catalogs, though, so perhaps there could be an in-core user for it.
Oh, I wasn't thinking this needed to be in-core. I think it'd be a lot
easier to develop it as an extension to start with... certainly a lot
less headache ;) If it becomes popular then it'll be a lot easier to get
it added.
In BDR we're currently (ab)using security labels to tag tables with
their replication sets, but I'd love to have a proper way to do that. As
I recall the prior approach, of allowing custom relation options, was
rejected on -hackers.
How would you want to go about storing and tracking the information? A
new catalog? The other issue for in-core replication sets would probably
be making it foreign-key aware, so replication of a table transitively
requires replication of its references.
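
To make the foreign-key point concrete, here is a small Python sketch of
the transitive expansion. It is only an assumption about how this could
work: the expand_with_references function and the table names are
hypothetical, and the FK map is hand-written here (in practice it would
presumably be read from the catalogs).

```python
from collections import deque


def expand_with_references(seed_tables, fk_references):
    """Return seed_tables plus every table reachable by following FKs.

    fk_references maps a table name to the tables it references.
    """
    selected = set(seed_tables)
    queue = deque(seed_tables)
    while queue:
        table = queue.popleft()
        for referenced in fk_references.get(table, ()):
            # The visited check also makes FK cycles terminate.
            if referenced not in selected:
                selected.add(referenced)
                queue.append(referenced)
    return selected


# Hypothetical schema: order_items -> orders -> customers -> regions
fk_references = {
    "order_items": ["orders", "products"],
    "orders": ["customers"],
    "customers": ["regions"],
}

print(sorted(expand_with_references(["orders"], fk_references)))
# prints ['customers', 'orders', 'regions']
```

Note that the expansion only follows references outward: adding "orders"
pulls in "customers" and "regions", but not "order_items", which merely
points at it.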
As you said, we'd need a way to identify replication nodes. We might
also need/want a way to specify topology. I don't think topology would
be too hard (presumably it's either a single 'parent' node, or a list of
peers). What might be more interesting is dealing with different systems'
methods of identifying nodes.
You'd want a way to define different sets and associate them with nodes.
A node could be a provider, subscriber, or both. I think some
replication systems support 'pass through' as well, where the node
passes data downstream but doesn't apply it itself. Or it could be
multi-master and possibly a provider to read-only subscribers.
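
A rough Python sketch of what that node/set association might look like,
purely as an illustration. The Role and Node names, the set names and
the way pass-through is modeled are all my assumptions, not anything an
existing replication system defines.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Role(Flag):
    PROVIDER = auto()
    SUBSCRIBER = auto()
    PASS_THROUGH = auto()  # forwards changes downstream without applying them


@dataclass
class Node:
    name: str
    roles: Role
    sets: set = field(default_factory=set)  # names of associated sets

    def applies_changes(self):
        return Role.SUBSCRIBER in self.roles

    def sends_changes(self):
        return bool(self.roles & (Role.PROVIDER | Role.PASS_THROUGH))


def appliers(nodes, set_name):
    """Names of nodes that apply changes for the given set."""
    return [n.name for n in nodes if set_name in n.sets and n.applies_changes()]


def senders(nodes, set_name):
    """Names of nodes that send changes for the given set downstream."""
    return [n.name for n in nodes if set_name in n.sets and n.sends_changes()]


# Hypothetical topology: a multi-master node, a relay and a read-only subscriber.
nodes = [
    Node("master_a", Role.PROVIDER | Role.SUBSCRIBER, {"default"}),
    Node("relay", Role.PASS_THROUGH, {"default"}),
    Node("reporting", Role.SUBSCRIBER, {"default", "reports"}),
]

print(appliers(nodes, "default"))  # prints ['master_a', 'reporting']
print(senders(nodes, "default"))   # prints ['master_a', 'relay']
```

Using a flag type means "provider, subscriber, or both" falls out of
combining roles rather than needing a separate value for each
combination.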
Finally you'd need to associate tables and sequences with a set. I agree
you'd want to look at FKs. I'd also like to be able to define rules for
a set, like "include everything in this schema, unless the first
character is _".
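
For what it's worth, the kind of rule I have in mind could be sketched
like this in Python. SchemaRule and tables_in_set are hypothetical names,
and the schema/table list is made up; a real implementation would
evaluate the rules against the catalogs.

```python
from dataclasses import dataclass


@dataclass
class SchemaRule:
    """Include every table in a schema unless its name starts with a prefix."""

    schema: str
    exclude_prefix: str = "_"

    def matches(self, schema, table):
        return schema == self.schema and not table.startswith(self.exclude_prefix)


def tables_in_set(rules, tables):
    """Return the (schema, table) pairs matched by at least one rule."""
    return [(s, t) for (s, t) in tables if any(r.matches(s, t) for r in rules)]


# Hypothetical tables: "_tmp_import" is excluded by the leading underscore,
# and "audit.log" is excluded because no rule covers the audit schema.
tables = [("app", "users"), ("app", "_tmp_import"), ("audit", "log")]
rules = [SchemaRule("app")]

print(tables_in_set(rules, tables))  # prints [('app', 'users')]
```

The nice property of rules over explicit lists is that a newly created
table in the schema would join the set without anyone having to remember
to add it.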
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)