On 2019-12-12 11:26, David Christensen wrote:
On Dec 12, 2019, at 11:17 AM, Jeff Ross <[email protected]> wrote:
Hi all,
I've had bucardo up and running as a master-master now for a couple of months.
Recently I noticed that one of the tables on the initial target database has
only about 110,000 rows compared to the initial source database where that
table has almost 2 million. In looking over my setup script I discovered that
in the initial pg_dump of data-only from the initial source to the target, that
table was accidentally not included.
I currently have 2 syncs running and bucardo status shows them both to be Good.
Additionally, bucardo validate sync shows both syncs to be valid.
What would be the best way to get the rows from the initial source table to the
target table? I don't see a way to force the re-sync using bucardo. I can
disable the triggers on the target table, dump the data and insert and then
re-enable triggers, as the initial data load did using the session replication
role. Or maybe the thing to do is to split that table out into a new sync and
then do the initial seeding?
Well, there are a couple of options to refresh a table in Bucardo; as with many
things, there is an engineering tradeoff to be considered:
1) Straightforward; something like this will repopulate the table “foo”:

$ pg_dump -h master --data-only -t foo | \
    psql -1 -c 'set session_replication_role = replica' -c 'truncate table foo' -f -
In particular, this sets session_replication_role to replica to prevent trigger
firing, truncates said table, then loads everything from the current state of
the table w/pg_dump, all within a single transaction, so it’s safe to run
without ending up in an indeterminate state.
Upsides: easy; no need to modify the triggers or any table config, etc. Any
changes made to the table while this process is ongoing will be replicated
once the sync resumes.
Downsides: this will likely block replication activity for the duration of the
COPY; once the truncate takes its lock on the table, no additional writes will
be allowed, so any further changes to the table will block until the operation
completes. It will likely also block replication of any other objects in this
sync.
2) (As you suggested) drop the table from the sync, then re-add in a new sync.
(It’s probably possible to add it to an existing sync and just set
onetimecopy=2, so maybe look into this too.)
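The onetimecopy route might look something like this (a rough sketch; “mysync”
is a placeholder sync name, and note that onetimecopy=2 only copies tables
whose target is empty, so the partially-populated target table may need
truncating first):

    $ bucardo update sync mysync onetimecopy=2
    $ bucardo reload sync mysync

Bucardo should set onetimecopy back to 0 once the copy has run.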
Upside: avoids blocking the rest of the cluster while this single table is
copied.
Downsides: if the table has FKs that are also handled by Bucardo, etc., you’re
not guaranteed to end up in a state where everything is valid; i.e., if one of
the syncs stops but the one carrying the referenced data keeps going, you get
orphaned rows.
3) Update the table with an effective no-op; something like: UPDATE foo SET
col = col. This adds delta rows for the table, and Bucardo then syncs them as
if a bunch of real changes had been made.
Upsides: all contained in SQL, will have the desired effect eventually.
Downsides: generates a bunch of db writes, network traffic, etc, as 2 million
rows are updated, delta rows are written, and bucardo copies all of them.
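To keep those writes manageable, the no-op can be batched; a sketch, assuming
foo has an integer primary key “id” (a hypothetical column name) and col is
any replicated column:

    -- Hypothetical batching: each UPDATE is its own transaction, so
    -- Bucardo replicates the deltas in chunks rather than in one
    -- 2-million-row burst.
    UPDATE foo SET col = col WHERE id BETWEEN 1      AND 100000;
    UPDATE foo SET col = col WHERE id BETWEEN 100001 AND 200000;
    -- ...continue until the full key range has been touched.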
HTH,
David
--
David Christensen
Senior Software and Database Engineer
End Point Corporation
[email protected]
785-727-1171
Thank you David!
I ended up going with the third option and ran the no-op update 100,000 rows
at a time. It was also a good demonstration of how fast updates get applied
across the network: in our case, each 100,000-row batch finished replicating
to the remote server on EC2 in less than 10 seconds.
Jeff
_______________________________________________
Bucardo-general mailing list
[email protected]
https://bucardo.org/mailman/listinfo/bucardo-general