> On Dec 12, 2019, at 11:17 AM, Jeff Ross <[email protected]> wrote:
> 
> Hi all,
> 
> I've had bucardo up and running as a master-master now for a couple of 
> months.  Recently I noticed that one of the tables on the initial target 
> database has only about 110,000 rows compared to the initial source database 
> where that table has almost 2 million.  In looking over my setup script I 
> discovered that in the the initial pg_dump of data only from initial source 
> to target that table was accidentally not included.
> 
> I currently have 2 syncs running and bucardo status shows them both to be 
> Good.  Additionally, bucardo validate sync shows both syncs to be valid.
> 
> What would be the best way to get the rows from the initial source table to 
> the target table?  I don't see a way to force the re-sync using bucardo.  I 
> can disable the triggers on the target table, dump the data and insert and 
> then re-enable triggers, as the initial data load did using the session 
> replication role.  Or maybe the thing to do it to split that table out into a 
> new sync and then do the initial seeding?

Well, there are a few options for refreshing a table in Bucardo; as with many
things, there are engineering tradeoffs to consider:


1) The straightforward approach; something like this will repopulate the table “foo”:

$ pg_dump -h master --data-only -t foo \
    | psql -1 \
        -c 'SET session_replication_role = replica' \
        -c 'TRUNCATE TABLE foo' \
        -f -

In particular, this sets session_replication_role to replica to prevent triggers
from firing, truncates the table, then loads everything from the current state of
the source table via pg_dump, all within a single transaction (the -1 flag), so
it’s safe to run without ending up in an indeterminate state.  (Note that mixing
multiple -c options with -f in psql requires version 9.6 or later.)

Upsides: easy; no need to modify triggers or any table config.  Any changes made
to the table while this process is ongoing will be replicated once the sync
picks back up.

Downsides: this will likely block replication activity for the duration of the
COPY; once the TRUNCATE takes its lock on the table, no additional writes are
allowed, so any further changes to the table will block until this operation
completes.  It will likely also block replication of any other objects in this
sync.
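As a quick sanity check after the reload, you can compare row counts on both
sides (a sketch; the host names “master” and “target” and database name “mydb”
here are placeholders for your own setup):

    # -Atc gives unaligned, tuples-only output, easy to eyeball or diff
    psql -h master -d mydb -Atc 'SELECT count(*) FROM foo'
    psql -h target -d mydb -Atc 'SELECT count(*) FROM foo'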


2) (As you suggested) drop the table from the sync, then re-add it in a new sync.  
(It’s probably also possible to add it to an existing sync and just set 
onetimecopy=2, so maybe look into that too.)
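A rough sketch of that route with the bucardo CLI — the sync, relgroup, and
database names below are made up, and you should check `bucardo help add sync`
on your version for exact syntax and the precise onetimecopy mode you want:

    # Remove the table from its current sync/relgroup (names hypothetical)
    bucardo remove table public.foo

    # Re-add it in its own relgroup and sync; onetimecopy requests a
    # one-time full copy at sync startup (see the Bucardo docs for the
    # difference between modes 1 and 2)
    bucardo add table public.foo relgroup=foo_group
    bucardo add sync foo_resync relgroup=foo_group dbs=master,target onetimecopy=2
    bucardo restart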

Upside: avoids blocking the rest of the cluster while this single table is 
repopulated.

Downsides: if the table has foreign-key relationships with other tables handled
by Bucardo, you’re not guaranteed to be in a state where everything is valid;
i.e., if one of the syncs stops but the one carrying the referenced data keeps
going, you end up with orphaned rows.


3) Update the table with an effective no-op; something like: UPDATE foo SET col 
= col.  This writes delta rows for every row in the table, and Bucardo then 
syncs them as if a bunch of changes had been made.

Upsides: all contained in SQL, will have the desired effect eventually.

Downsides: generates a bunch of database writes, network traffic, etc., as 2
million rows are updated, delta rows are written, and Bucardo copies all of
them.
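If the load from a single 2-million-row UPDATE is a concern, the no-op can be
batched so each transaction (and each Bucardo delta batch) stays small.  A
sketch, assuming an integer primary key named “id” and placeholder host/database
names:

    # Hypothetical: walk id ranges in chunks of 100k
    for start in $(seq 0 100000 2000000); do
        psql -h master -d mydb -c \
            "UPDATE foo SET col = col WHERE id > $start AND id <= $(($start + 100000))"
    done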


HTH,

David
--
David Christensen
Senior Software and Database Engineer
End Point Corporation
[email protected]
785-727-1171



_______________________________________________
Bucardo-general mailing list
[email protected]
https://bucardo.org/mailman/listinfo/bucardo-general
