Hi,
Generally, I like the DDL approach more than using the configuration
file. An additional benefit to the ones listed is the possibility of
creating various front-ends for replication configuration, or of
integrating such functionality into existing Firebird management tools.
Comments on specific questions inline...
On 21. 02. 19 at 9:14, Dmitry Yemanov wrote:
One thing I'm worried about is whether it's enough to have a single
global replication set or maybe it's useful to have many independent
replication sets. How they can be used, for example:
It would be nice to have such a feature in the future.
1) Two slightly different global replication sets are defined, only one
of them is active at a time, but we can switch between them (e.g. via
enable/disable commands)
Hmm, can see a use case for it that would solve some problem and not
cause trouble at the same time. I sense great potential for users to
shoot themselves in the foot.
2) Different tables (separated by some rule) are included into different
replication sets which are all active together, their intersection is
used by the CDC publisher. This may be useful if these replication sets
have some declarative customizations (see below).
This would certainly be valuable.
3) Different replication sets are declared as intended for different CDC
plugins. This implies that multiple CDC plugins may be configured
independently. In this case the CDC publisher checks the source (table)
against the target (plugin) before sending the changes.
This is certainly interesting. I can imagine a business case for a CDC
plugin that does data stream processing instead of persisting to some
kind of replica.
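To make point 3 a bit more concrete, here is a minimal Python sketch of a publisher checking the source table against the target plugin. It is only an illustration under my own assumptions: the ReplicationSet class, the plugin names and the table names are all hypothetical, and nothing here is existing Firebird API.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicationSet:
    """Hypothetical replication set: a named group of tables bound to one plugin."""
    name: str
    plugin: str
    tables: set = field(default_factory=set)
    enabled: bool = True


def plugins_for_table(sets, table):
    """Return the plugins that should receive changes made to `table`."""
    return sorted({s.plugin for s in sets if s.enabled and table in s.tables})


# Illustrative definitions only; names are made up.
sets = [
    ReplicationSet("full_copy", "replica_writer", {"CUSTOMERS", "ORDERS"}),
    ReplicationSet("analytics", "stream_processor", {"ORDERS"}),
    ReplicationSet("archive", "replica_writer", {"AUDIT_LOG"}, enabled=False),
]

print(plugins_for_table(sets, "ORDERS"))
print(plugins_for_table(sets, "AUDIT_LOG"))
```

With these definitions a change to ORDERS goes to both plugins, while AUDIT_LOG goes nowhere because its set is disabled, which also covers the enable/disable switching from point 1.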
Second, IMHO declaring tables as "publishable" via CREATE|ALTER TABLE is
too restrictive. I'd rather manage the replication set using some global
commands, be it ALTER DATABASE or something different, allowing to
include/exclude all tables at once, or a comma-separated list of tables,
or maybe tables by mask (regexp?). Of course, both SQL solutions
(database level and table level) may co-exist.
Agreed. However, would the database-level and table-level definitions be
independent, or would we translate a database-level definition into a
batch of table-level ones? The first option would allow independent
revocation but would complicate processing.
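The difference between the two options can be sketched in a few lines of Python. This is a rough illustration, not a proposal for the implementation: the function names are hypothetical, and shell-style masks via fnmatch merely stand in for whatever mask syntax (regexp or otherwise) would be chosen.

```python
from fnmatch import fnmatchcase


def is_published(table, db_masks, table_flags):
    """Option A: both levels stay independent and are evaluated at check time."""
    return table in table_flags or any(fnmatchcase(table, m) for m in db_masks)


def expand_to_tables(mask, existing_tables):
    """Option B: a database-level rule is translated once into per-table flags."""
    return {t for t in existing_tables if fnmatchcase(t, mask)}


existing = ["ORDERS", "ORDER_ITEMS", "CUSTOMERS"]

# Option A: revoking the database-level rule leaves table-level flags intact.
db_masks = ["ORDER*"]
table_flags = {"CUSTOMERS"}
before = is_published("ORDER_ITEMS", db_masks, table_flags)
db_masks.clear()
after = is_published("ORDER_ITEMS", db_masks, table_flags)
still_flagged = is_published("CUSTOMERS", db_masks, table_flags)

# Option B: after expansion the origin of each flag is lost, and a table
# created later (ORDER_NOTES here) is not covered by the earlier rule.
flags = {"CUSTOMERS"} | expand_to_tables("ORDER*", existing)
covers_new_table = "ORDER_NOTES" in flags

print(before, after, still_flagged, sorted(flags), covers_new_table)
```

Option A needs two lookups per check, which is the extra processing cost. Option B is a single lookup, but the database-level rule can no longer be revoked on its own.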
Finally, if we consider the replication set being a filter, it may be
also useful to limit the published change set to some particular
operations (INSERT|UPDATE|DELETE) or even some particular rows (WHERE
filter). I doubt this is useful for replication per se, but this may
allow something similar to "change views" in InterBase, currently with a
CDC plugin acting as a client, but perhaps it could be extended later to
interact with the real client application.
This would certainly be interesting. It would allow data sharding and
all sorts of interesting "utilizations" of the replication plumbing
for things other than replication itself.
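As a rough sketch of a replication set acting as a filter on operations and rows, consider the Python below. Everything in it is assumed for illustration: the Change and TableFilter classes, the REGION column and the EU shard are hypothetical, and a real WHERE filter would be declared in SQL rather than as a Python callable.

```python
from dataclasses import dataclass
from typing import Callable, Optional

ALL_OPERATIONS = frozenset({"INSERT", "UPDATE", "DELETE"})


@dataclass
class Change:
    """One journaled change; the field names are illustrative."""
    table: str
    operation: str  # "INSERT", "UPDATE" or "DELETE"
    row: dict


@dataclass
class TableFilter:
    """Hypothetical per-table filter: allowed operations plus an optional row predicate."""
    operations: frozenset = ALL_OPERATIONS
    where: Optional[Callable[[dict], bool]] = None

    def accepts(self, change):
        if change.operation not in self.operations:
            return False
        return self.where is None or self.where(change.row)


def publish(changes, filters):
    """Keep only changes whose table is in the set and whose filter accepts them."""
    return [c for c in changes if c.table in filters and filters[c.table].accepts(c)]


# A shard that only receives inserted EU orders, plus all customer changes.
filters = {
    "ORDERS": TableFilter(frozenset({"INSERT"}), lambda row: row["REGION"] == "EU"),
    "CUSTOMERS": TableFilter(),
}
changes = [
    Change("ORDERS", "INSERT", {"ID": 1, "REGION": "EU"}),
    Change("ORDERS", "INSERT", {"ID": 2, "REGION": "US"}),
    Change("ORDERS", "DELETE", {"ID": 1, "REGION": "EU"}),
    Change("CUSTOMERS", "UPDATE", {"ID": 7}),
    Change("AUDIT_LOG", "INSERT", {"ID": 9}),
]
published = publish(changes, filters)
print([(c.table, c.row["ID"]) for c in published])
```

Only the EU insert into ORDERS and the CUSTOMERS update pass. The other ORDERS changes fail the row or operation filter, and AUDIT_LOG is not in the set at all.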
And one partially related question from another angle: does it make
sense to implement also replica-side declarative filtering? I mean the
case where changes for all tables are journaled but for some reason only
some tables should be applied to replica - e.g. two independent replicas
with different filters but replicated from the same master journal (to
avoid double journaling). If this feature is desirable, then how should
the master-side filter (replication set) co-exist with the replica-side
filter?
It's tempting, but I see potential for problems. If we allowed multiple
sets & filters on the master node, there would be no need to have them
on the replica. And if the replica had different definitions than the
master, it would not be possible to replace the master with the replica
in a recovery scenario. Replica-side filters could be an interim
solution while multiple sets & filters are absent on the master, but it
would be hard to deprecate them once we implement such a master-side
solution.
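The recovery concern can be shown with plain set arithmetic. The sketch below assumes the simplest possible model, in which a replica applies the intersection of what the master journals and what its own filter accepts. The table and replica names are made up.

```python
# Tables journaled by the master (master-side replication set).
master_set = {"CUSTOMERS", "ORDERS", "INVOICES"}

# Hypothetical replica-side filters for two independent replicas.
replica_filters = {
    "replica_a": {"CUSTOMERS", "ORDERS"},
    "replica_b": {"INVOICES", "AUDIT_LOG"},
}


def applied_tables(master, replica_filter):
    """A replica can only apply what is both journaled and accepted locally."""
    return master & replica_filter


def missing_on_failover(master, replica_filter):
    """Replicated tables the replica lacks if it had to replace the master."""
    return master - replica_filter


for name, replica_filter in sorted(replica_filters.items()):
    print(name,
          sorted(applied_tables(master_set, replica_filter)),
          sorted(missing_on_failover(master_set, replica_filter)))
```

Neither replica holds the full master set here, so neither could take over from the master without losing tables. AUDIT_LOG in the second filter has no effect because the master never journals it.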
regards
Pavel
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel