On 2019-12-12 11:26, David Christensen wrote:
On Dec 12, 2019, at 11:17 AM, Jeff Ross <[email protected]> wrote:

Hi all,

I've had bucardo up and running as a master-master now for a couple of months.  
Recently I noticed that one of the tables on the initial target database has 
only about 110,000 rows compared to the initial source database where that 
table has almost 2 million.  In looking over my setup script I discovered that 
in the initial pg_dump of data only from the initial source to the target, that 
table was accidentally not included.

I currently have 2 syncs running and bucardo status shows them both to be Good. 
 Additionally, bucardo validate sync shows both syncs to be valid.

What would be the best way to get the rows from the initial source table to the 
target table?  I don't see a way to force the re-sync using bucardo.  I can 
disable the triggers on the target table, dump the data and insert and then 
re-enable triggers, as the initial data load did using the session replication 
role.  Or maybe the thing to do is to split that table out into a new sync and 
then do the initial seeding?
Well, there are a couple of options to refresh a table in Bucardo; as with many 
things, there is an engineering tradeoff to be considered:


1) Straightforward; something like this will repopulate the table “foo” (note 
that combining multiple -c options with -f in a single psql invocation requires 
psql 9.6 or later):

$ pg_dump -h master --data-only -t foo | psql -1 -c 'set 
session_replication_role = replica' -c 'truncate table foo' -f -

In particular, this sets session_replication_role to replica to prevent trigger 
firing, truncates said table, then loads everything from the current state of 
the table w/pg_dump, all within a single transaction, so it’s safe to run 
without ending up in an indeterminate state.

Upsides: easy; no need to modify the triggers or any table config, etc.  Any 
changes made to the table while this process is ongoing will be replicated once 
the sync picks back up.

Downsides: will likely block replication activity for the duration of the COPY; 
once the truncate takes its lock on the table, no additional writes are 
allowed, so any further changes made to the table will be blocked until this 
operation completes.  This will likely also block the replication of any other 
objects in this sync.


2) (As you suggested) drop the table from the sync, then re-add it in a new 
sync.  (It’s probably possible to add it to an existing sync and just set 
onetimecopy=2, so maybe look into this too.)

Upside: prevents blocking the rest of the cluster while this single table is 
repopulated.

Downsides: if the table has FKs to other tables replicated by bucardo, etc., 
you’re not guaranteed to be in a state where everything is valid; i.e., if one 
of the syncs stops but the one carrying the referenced data keeps going, you 
end up with orphaned rows.


3) Update the table with an effective no-op; something like: UPDATE foo SET 
col = col.  This will add delta rows for the table, and bucardo will then sync 
them as if a bunch of changes had been made.

Upsides: all contained in SQL, will have the desired effect eventually.

Downsides: generates a bunch of db writes, network traffic, etc, as 2 million 
rows are updated, delta rows are written, and bucardo copies all of them.
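To keep those write bursts bounded, the no-op update can be run in batches 
keyed on the primary key.  A minimal sketch, assuming “foo” has an integer 
primary key “id” and some column “col” (both placeholder names):

```sql
-- Placeholder names: foo, id, col.  Each statement touches at most
-- 100,000 rows, so bucardo replicates a bounded batch of delta rows
-- per round; wait for each round to replicate before running the next.
UPDATE foo SET col = col WHERE id >= 1      AND id <= 100000;
UPDATE foo SET col = col WHERE id >  100000 AND id <= 200000;
-- ...continue until max(id) is covered.
```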


HTH,

David
--
David Christensen
Senior Software and Database Engineer
End Point Corporation
[email protected]
785-727-1171



Thank you David!

I ended up going with the third option and did the no-op update 100,000 rows at a time.  It was also a good demonstration of how fast updates get applied across the network: in our case, each 100,000-row round finished replicating to the remote server on EC2 in less than 10 seconds.

Jeff

--
_______________________________________________
Bucardo-general mailing list
[email protected]
https://bucardo.org/mailman/listinfo/bucardo-general
