[Firebird-devel] Replication: declarative control

Dmitry Yemanov Thu, 21 Feb 2019 00:15:36 -0800

All,

In v4 Beta, replication is fully driven by the configuration file. In particular, tables to be replicated are (optionally) defined using two regexp-based options: include_filter and exclude_filter. This was the easiest solution that doesn't require ODS changes, and it matches the trace/audit configuration, thus being familiar to Firebird DBAs.

However, I don't think this is flexible enough. IMO, there's much sense in separating "what is replicated" from "how it's replicated". The former is a part of the CDC (change data capture) interface and defines what changes we need to collect. The latter is a set of implementation details belonging to either the built-in replicator engine or a 3rd party CDC plugin - caching rules, transport options, etc. Such a separation would allow us to build a really flexible architecture.

Moving this idea further, it's worth making the "what is replicated" part controlled declaratively, using SQL. I.e. the DBA defines some "replication set" by including or excluding tables; this set is stored inside the database and used by the CDC publisher to filter out unnecessary changes before passing them to the CDC handler. These include/exclude rules are also replicated.
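To make the data flow concrete, here is a minimal Python sketch of a publisher that consults a stored replication set before handing changes to a handler. All names here (ReplicationSet, Publisher, the handler callable) are hypothetical and chosen for illustration only; this is not the engine's actual interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

Change = Tuple[str, str]  # (table name, operation), e.g. ("ORDERS", "INSERT")


@dataclass
class ReplicationSet:
    """Hypothetical stand-in for the include/exclude rules stored in the database."""
    include_all: bool = False
    included: Set[str] = field(default_factory=set)
    excluded: Set[str] = field(default_factory=set)

    def contains(self, table: str) -> bool:
        # An explicit exclusion wins over any inclusion (an assumption of this sketch).
        if table in self.excluded:
            return False
        return self.include_all or table in self.included


@dataclass
class Publisher:
    """Drops changes outside the replication set before calling the handler."""
    replication_set: ReplicationSet
    handler: Callable[[Change], None]

    def publish(self, change: Change) -> None:
        table, _operation = change
        if self.replication_set.contains(table):
            self.handler(change)


received: List[Change] = []
publisher = Publisher(
    ReplicationSet(include_all=True, excluded={"AUDIT_LOG"}),
    received.append,
)
publisher.publish(("ORDERS", "INSERT"))
publisher.publish(("AUDIT_LOG", "INSERT"))  # filtered out, never reaches the handler
```

The handler only ever sees changes that pass the set, so it needs no include/exclude settings of its own.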


I see the following (at least) benefits in this approach:

1) Allow custom CDC solutions without interacting with the built-in replication configuration

2) Automatic setup for cascaded replication - every replica knows what tables are allowed for replication and reuses these rules without any explicit include/exclude settings

3) We may allow modification of non-replicated tables in read-only replicas - this will not cause replication conflicts

4) With some additional effort, we could allow the replication set to be changed at runtime, without restarting the server (it could be bad practice in general, but perhaps useful to quickly fix some configuration mistake)

Dimitry Sibiryakov has kindly provided a pull request implementing this feature using the CREATE|ALTER TABLE extensions. It uses RDB$FLAGS for storage and thus doesn't require ODS changes. I suppose this can be accepted as a straightforward solution for v4. However, it may be somewhat limited in the long term. So I'd like to have it discussed before accepting the PR.

One thing I'm worried about is whether it's enough to have a single global replication set or whether it would be useful to have many independent replication sets. Some examples of how they could be used:

1) Two slightly different global replication sets are defined, only one of them is active at a time, but we can switch between them (e.g. via enable/disable commands)

2) Different tables (separated by some rule) are included into different replication sets which are all active together, their intersection is used by the CDC publisher. This may be useful if these replication sets have some declarative customizations (see below).

3) Different replication sets are declared as intended for different CDC plugins. This implies that multiple CDC plugins may be configured independently. In this case the CDC publisher checks the source (table) against the target (plugin) before sending the changes.
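Cases 1 and 3 above could be sketched roughly as follows in Python. The NamedSet structure, the set names and the plugin names are all made up for illustration; case 2 is deliberately left out because its semantics depend on the customizations discussed further down.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass
class NamedSet:
    """Hypothetical named replication set."""
    tables: FrozenSet[str]
    enabled: bool = True          # case 1: sets can be switched on and off
    plugin: Optional[str] = None  # case 3: None means "not bound to one plugin"


def plugins_for(table: str, sets: Dict[str, NamedSet], plugins: List[str]) -> List[str]:
    """Return the plugins that should receive a change to `table`."""
    targets = []
    for plugin in plugins:
        for rs in sets.values():
            if not rs.enabled or table not in rs.tables:
                continue
            if rs.plugin is None or rs.plugin == plugin:
                targets.append(plugin)
                break  # one matching set is enough for this plugin
    return targets


sets = {
    "main": NamedSet(frozenset({"ORDERS", "CUSTOMERS"}), plugin="builtin"),
    "main_alt": NamedSet(frozenset({"ORDERS"}), enabled=False, plugin="builtin"),
    "export": NamedSet(frozenset({"ORDERS", "INVOICES"}), plugin="custom_cdc"),
}
plugins = ["builtin", "custom_cdc"]
```

Switching from "main" to "main_alt" is then just a matter of flipping the two enabled flags, which is what an enable/disable command would do in this model.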

These cases are purely theoretical, but I believe we should consider them and decide whether it's worth being prepared for them or not.

Second, IMHO declaring tables as "publishable" via CREATE|ALTER TABLE is too restrictive. I'd rather manage the replication set using some global commands, be it ALTER DATABASE or something different, allowing one to include/exclude all tables at once, or a comma-separated list of tables, or maybe tables by mask (regexp?). Of course, both SQL solutions (database level and table level) may co-exist.

Finally, if we consider the replication set to be a filter, it may also be useful to limit the published change set to some particular operations (INSERT|UPDATE|DELETE) or even some particular rows (WHERE filter). I doubt this is useful for replication per se, but this may allow something similar to "change views" in InterBase, currently with a CDC plugin acting as a client, but perhaps it could be extended later to interact with the real client application.
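As a rough illustration of such a filter, the Python sketch below attaches an operation list and an optional row predicate to each table in the set. TableRule, the rules dictionary and the AMOUNT column are hypothetical; a real implementation would evaluate a stored SQL condition rather than a Python callable.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, Optional

Row = Dict[str, Any]
ALL_OPERATIONS = frozenset({"INSERT", "UPDATE", "DELETE"})


@dataclass
class TableRule:
    """Hypothetical per-table rule: which operations and which rows are published."""
    operations: FrozenSet[str] = ALL_OPERATIONS
    where: Optional[Callable[[Row], bool]] = None  # stands in for a WHERE filter

    def accepts(self, operation: str, row: Row) -> bool:
        if operation not in self.operations:
            return False
        return self.where is None or self.where(row)


# Publish only inserts and updates of ORDERS rows with AMOUNT >= 100.
rules: Dict[str, TableRule] = {
    "ORDERS": TableRule(
        operations=frozenset({"INSERT", "UPDATE"}),
        where=lambda row: row["AMOUNT"] >= 100,
    ),
}


def is_published(table: str, operation: str, row: Row) -> bool:
    """A change is published only if its table has a rule and the rule accepts it."""
    rule = rules.get(table)
    return rule is not None and rule.accepts(operation, row)
```

A table without a rule is simply outside the set, so the plain include/exclude case is the special one where every rule accepts all operations and has no row predicate.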

And one partially related question from another angle: does it make sense to also implement replica-side declarative filtering? I mean the case where changes for all tables are journaled but for some reason only some tables should be applied to a replica - e.g. two independent replicas with different filters but replicated from the same master journal (to avoid double journaling). If this feature is desirable, then how should the master-side filter (replication set) co-exist with the replica-side filter?
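The two-replica scenario can be pictured with a small Python sketch. It assumes, purely for illustration, that the replica-side filter can only narrow what the master already journaled; whether that is the right way for the two filters to co-exist is exactly the open question. The function and table names are hypothetical.

```python
from typing import Callable, List, Tuple

JournalEntry = Tuple[str, str]  # (table name, operation)


def apply_journal(
    journal: List[JournalEntry],
    replica_filter: Callable[[str], bool],
) -> List[JournalEntry]:
    """Return the journal entries a replica would actually apply."""
    return [entry for entry in journal if replica_filter(entry[0])]


# One master journal, written once, containing changes for all tables.
journal = [
    ("ORDERS", "INSERT"),
    ("CUSTOMERS", "UPDATE"),
    ("ORDERS", "DELETE"),
]

# Two independent replicas reading the same journal with different filters.
replica_a = apply_journal(journal, lambda table: table == "ORDERS")
replica_b = apply_journal(journal, lambda table: table == "CUSTOMERS")
```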

Please provide your feedback on these questions. I'm not talking about implementing everything in FB4, I just need to understand how to build the foundation that could be extended later with minimal effort.



Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
