Hi David,

Thanks for your response. It turns out that the Bucardo server did not have 
enough memory, so when the memory filled up the kid died. I have upgraded 
that server from 2 GB to 12 GB of RAM, and Bucardo now appears to use 
almost 5 GB.
 Regards, Adrian Videanu

      From: David Christensen <[email protected]>
 To: Videanu Adrian <[email protected]> 
Cc: "[email protected]" <[email protected]>
 Sent: Wednesday, October 5, 2016 6:13 PM
 Subject: Re: [Bucardo-general] Kid is not responding,
   
Hi Videanu,

> Hi all, 
> I have a 4 Master bucardo 5.4.1 setup.
> The replication was down for a few days and now I have almost 8 million rows 
> to be moved between servers.
> Because of that, the operation takes more than 1 hour. Until now I had a 
> firewall problem: after about 1 to 1.5 hours the connection was cut and the 
> transaction was restarted.

So did you fix the timeout issue by adjusting the tcp_keepalives_* settings in 
your postgresql.conf file?  I’ve had to do that before with some long-running Slony 
operations where there were long periods of time where no data was being 
transferred over the connections.  That should keep the connection going even 
if there were high waits in the transfer.  (Though I’d be a little surprised if 
there were pauses of that length without *any* data transfer.)
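
For reference, the server-side parameters in postgresql.conf are 
tcp_keepalives_idle, tcp_keepalives_interval and tcp_keepalives_count.  The 
sketch below only illustrates the same mechanism at the socket level, so you 
can see what those settings control; the function name and the numeric values 
are made up for illustration and are not recommendations.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=6):
    """Turn on TCP keepalive probes for a socket (illustrative values only)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform-specific (available on
    # Linux), so only set the ones this platform actually exposes.
    for name, value in (("TCP_KEEPIDLE", idle),       # idle seconds before first probe
                        ("TCP_KEEPINTVL", interval),  # seconds between probes
                        ("TCP_KEEPCNT", count)):      # failed probes before drop
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    # Report whether keepalive is now enabled on the socket.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

The idle time needs to be shorter than whatever idle timeout the firewall 
applies, otherwise the probes start too late to keep the connection open.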

> Now I have fixed that but I got this error:
> (2498) [Wed Oct  5 12:35:20 2016] CTL Warning: Kid 2525 is not responding, 
> will respawn
> (2498) [Wed Oct  5 12:35:20 2016] CTL Old syncrun entry removed during 
> resurrection, start time was 2016-10-05 11:12:45.165723+03
> (6411) [Wed Oct  5 12:35:20 2016] KID (ccAclSync) New kid, sync "ccAclSync" 
> alive=1 Parent=2498 PID=6411 kicked=1
> (6411) [Wed Oct  5 12:35:20 2016] KID (ccAclSync) Overwriting 
> /var/run/bucardo/bucardo.kid.sync.ccAclSync.pid: old process was ?

The messages you point out appear to be more informational than indicative of 
ongoing error issues; this is the message you get if the Kid process no longer 
exists.  Now, if you are getting this message repeatedly and the Kid process is 
never able to run, that’s a different story.  That would indicate that 
the Kid process is dying while trying to do the actual replication.  My guess 
right now is that it is a residue of the earlier issue you had.
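
If you want to check whether the warning is a one-off or keeps repeating, 
something like the following rough sketch could count the occurrences in the 
log.  It assumes the line format shown in your paste; the function name is 
hypothetical, and the log location is whatever your installation uses.

```python
import re

# Matches the controller warning in the format quoted above, e.g.
# (2498) [Wed Oct  5 12:35:20 2016] CTL Warning: Kid 2525 is not responding, ...
WARNING = re.compile(
    r"^\((?P<ctl_pid>\d+)\) \[(?P<when>[^\]]+)\] CTL Warning: "
    r"Kid (?P<kid_pid>\d+) is not responding"
)

def respawn_warnings(lines):
    """Return a (timestamp, kid_pid) pair for each 'not responding' warning."""
    hits = []
    for line in lines:
        m = WARNING.match(line.strip())
        if m:
            hits.append((m.group("when"), int(m.group("kid_pid"))))
    return hits
```

Pass it the open log file (or any iterable of lines).  A single hit right 
after the firewall fix would fit the "residue" theory; a new hit for every 
sync attempt would suggest the Kid is dying during the replication itself.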

> Is there any way that I could increase kid/sync timeout ? Maybe kick the sync 
> manually with the timeout parameter ? 

BTW, there is no timeout setting in Bucardo for the Kid sync.  The answer here 
is to figure out why the Kid is dying if it’s other than the timeout issue, and 
fix that.

HTH,

David
--
David Christensen
End Point Corporation
[email protected]
785-727-1171




   
_______________________________________________
Bucardo-general mailing list
[email protected]
https://mail.endcrypt.com/mailman/listinfo/bucardo-general
