On Nov 15, 2012, at 5:34 PM, Alexey Klyukin <[email protected]> wrote:

> 
> 
> Hi,
> 
> I've been testing swap sync and conflict resolution for bucardo 4.8

Sorry, I mixed up the version numbers: it is 4.5.

> and found that kids often die with the following error message during 
> nearly concurrent updates (i.e. when I manually update the same row on both 
> source and target in order to simulate a conflict):
> 
> [Thu Nov 15 15:26:05 2012]  KID No conflict, target only for 
> public.products.prod_id: 10006
> [Thu Nov 15 15:26:05 2012]  KID Action summary: 2:1
> [Thu Nov 15 15:26:05 2012]  KID [1/1] public.products UPDATE target to source 
> pk 10006
> 'Warning! Aborting due to exception for public.products.prod_id: 10006 Error 
> was DBD::Pg::st execute failed: ERROR:  could not serialize access due to 
> concurrent update at /usr/local/share/perl/5.10.1/Bucardo.pm line 5776.'
> [Thu Nov 15 15:26:05 2012]  KID Final database backend PID is 27203
> [Thu Nov 15 15:26:05 2012]  KID Kid exiting at cleanup_kid. Reason: Died at 
> /usr/local/share/perl/5.10.1/Bucardo.pm line 5835.
> [Thu Nov 15 15:26:05 2012]  KID Removed pid file 
> "/var/run/bucardo/bucardo.kid.sync.dellstore2_swap.zen_dellstore2.pid"
> [Thu Nov 15 15:26:14 2012]  CTL Rows updated child 27199 to aborted in q: 1
> [Thu Nov 15 15:26:14 2012]  CTL Warning! Kid 27199 seems to have died. Sync 
> "dellstore2_swap"
> [Thu Nov 15 15:26:24 2012]  CTL Cleaning up aborted sync from q table for 
> "zen_dellstore2". PID was 27199
> [Thu Nov 15 15:26:24 2012]  CTL Already an empty slot, so not re-adding
> 
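
The error in the log above is PostgreSQL's serialization failure (SQLSTATE 40001), which the PostgreSQL documentation says should be handled by retrying the whole transaction. As a rough illustration of that general technique, here is a minimal sketch. It is not Bucardo's code: the SerializationFailure class, run_with_retry and make_flaky_txn are hypothetical names, and the "transaction" is simulated so that the example runs without a database.

```python
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""
    sqlstate = "40001"


def run_with_retry(txn, max_attempts=3, base_delay=0.0, sleep=time.sleep):
    """Call txn() again after each serialization failure.

    Gives up and re-raises once max_attempts calls have failed.
    The delay doubles after every failed attempt (exponential backoff).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_txn(failures):
    """Build a fake transaction that fails `failures` times, then succeeds.

    On success it returns the number of calls it took.
    """
    state = {"calls": 0}

    def txn():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise SerializationFailure(
                "could not serialize access due to concurrent update"
            )
        return state["calls"]

    return txn
```

With a real driver, txn would roll back and re-run the whole unit of work, not just the failed statement.
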
> 
> After the sync is kicked, Bucardo finds delta rows, detects a conflict due to 
> updates to the same row and successfully resolves it:
> 
> [Thu Nov 15 15:31:21 2012]  KID Total delta count: 2
> [Thu Nov 15 15:31:21 2012]  KID Logged details of conflict to 
> bucardo_conflict.log
> [Thu Nov 15 15:31:21 2012]  KID Conflict detected for public.products:10006. 
> Using standard conflict "target"
> [Thu Nov 15 15:31:21 2012]  KID Action summary: 2:1
> [Thu Nov 15 15:31:21 2012]  KID [1/1] public.products UPDATE target to source 
> pk 10006
> [Thu Nov 15 15:31:21 2012]  KID Updating bucardo_track for public.products on 
> blade_dellstore2
> [Thu Nov 15 15:31:21 2012]  KID Updating bucardo_track for public.products on 
> zen_dellstore2
> [Thu Nov 15 15:31:21 2012]  KID Issuing final commit for source and target
> 
> The problem is that the kid is not restarted automatically. I'm not sure 
> whether this is related to the 'Already an empty slot...' log message 
> above. One workaround I found is to set the sync's checktime to a non-zero 
> value, so that pending delta rows are detected and replicated. Still, 
> shouldn't Bucardo restart the kid automatically after such a failure, given 
> that the keepalive flag is set for the sync?
> 
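
To make the checktime workaround concrete, here is a small model of how I understand it to behave: the sync runs when it is kicked, and a non-zero checktime additionally forces a run once that many seconds have passed since the last one. This is only a sketch under that assumption, not Bucardo's implementation; should_run and simulate are hypothetical names, and time is modelled as whole seconds.

```python
def should_run(now, last_run, checktime, kicked):
    """Decide whether the sync should run at time `now` (seconds).

    A kick always triggers a run. A checktime of 0 disables the timer,
    so without kicks nothing ever runs.
    """
    if kicked:
        return True
    return checktime > 0 and now - last_run >= checktime


def simulate(checktime, kicks, horizon):
    """Return the times at which the sync would run up to `horizon`.

    `kicks` is the set of seconds at which a kick arrives.
    """
    runs = []
    last_run = 0
    for now in range(1, horizon + 1):
        if should_run(now, last_run, checktime, now in kicks):
            runs.append(now)
            last_run = now
    return runs
```

In this model simulate(0, set(), 30) never runs, which matches what I see when a kick is lost after the kid dies, while simulate(10, set(), 30) still picks up pending rows every 10 seconds.
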
> Thank you,
> --
> Alexey Klyukin        http://www.commandprompt.com
> The PostgreSQL Company – Command Prompt, Inc.
> 
> 
> 
> 
> _______________________________________________
> Bucardo-general mailing list
> [email protected]
> https://mail.endcrypt.com/mailman/listinfo/bucardo-general
> 

--
Alexey Klyukin        http://www.commandprompt.com
The PostgreSQL Company – Command Prompt, Inc.



