Please let me restate that this problem is due to a remote client having very bad connectivity. Nothing I can do to change that. Internet sucks where they are, and it doesn't seem to be getting fixed. Sometimes it's offline for several hours, or even a full day.
But Bucardo doesn't like that. It eventually times out or breaks, and it does not restart itself when connectivity is finally restored. I had to build a script, run from cron, that checks remote connectivity and restarts Bucardo if log.bucardo shows no recent activity. I tweaked the script to forcibly kill all instances of "Bucardo Master Control", which fixed the problem mentioned previously, but it's not a happy solution. I'm concerned we're creating memory leaks or other nasties by killing these processes in this blunt manner.

I'll attach the script here in case it is helpful to someone.

Will Bucardo 5 be more tolerant of spotty connectivity?

Happy New Year everyone!
Jonathan

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
Sent: Thursday, December 27, 2012 12:07 PM
To: [email protected]
Subject: Bucardo-general Digest, Vol 63, Issue 9

Today's Topics:
1. Postgres error: too many clients (Jonathan Brinkman)

----------------------------------------------------------------------

Message: 1
Date: Fri, 30 Nov 2012 20:22:38 -0500
From: "Jonathan Brinkman" <[email protected]>
To: <[email protected]>
Subject: [Bucardo-general] Postgres error: too many clients
Message-ID: <007301cdcf62$59f86c90$0de945b0$@com>
Content-Type: text/plain; charset="us-ascii"

Greetings

The postgres service on one of my production servers is breaking with the error "psql: FATAL: sorry, too many clients already".
When I use select * from pg_stat_activity; on postgres db I see that bucardo has over 90 connections to the database. Here is the result. Why so many connections?? Note that some of the xact_start dates are from a week ago, while others are from right now and are getting refreshed constantly. Using bucardo_ctl restart does not fix the problem.

When I use ps -Afww | grep -i bucardo I see a LOT of lines of bucardo-related items like these:

postgres 30048 2380 0 18:47 ? 00:00:33 postgres: bucardo bucardo [local] idle
postgres 30049 2380 0 18:47 ? 00:00:46 postgres: bucardo vog_cms_main 127.0.0.1(44237) idle
root 30126 1 0 18:49 ? 00:00:50 Bucardo Kid. Sync "cmsvog_pushdelta_main_to_gate": (pushdelta) "cmsvog_main" -> "cmsvog_gate"

Notice this:

cloud-db:~$ ps -Afww | grep -i Master
root 1464 1 0 20:15 ? 00:00:00 Bucardo Master Control Program v4.4.8.
postgres 1522 888 0 20:17 pts/1 00:00:00 grep -i Master
root 2827 1 0 Nov24 ? 00:00:35 Bucardo Master Control Program v4.4.8. Active syncs: cmsvog_pushdelta_main_to_gate,cmsvog_swap_main_and_gate,vog_pushdelta_cms_replication
root 6567 1 0 Nov25 ? 00:00:00 Bucardo Master Control Program v4.4.8.
root 6630 1 0 Nov25 ? 00:00:00 Bucardo Master Control Program v4.4.8.
root 7266 1 0 Nov25 ? 00:00:00 Bucardo Master Control Program v4.4.8.
root 8111 1 0 Nov25 ? 00:00:00 Bucardo Master Control Program v4.4.8.
root 8779 1 0 09:47 ? 00:01:17 Bucardo Master Control Program v4.4.8. Active syncs: cmsvog_pushdelta_main_to_gate,cmsvog_swap_main_and_gate,vog_pushdelta_cms_replication
root 8923 1 0 Nov25 ? 00:00:00 Bucardo Master Control Program v4.4.8.
root 9324 1 0 10:00 ? 00:05:10 Bucardo Master Control Program v4.4.8. Active syncs: cmsvog_pushdelta_main_to_gate,cmsvog_swap_main_and_gate,vog_pushdelta_cms_replication
root 14049 1 0 Nov25 ? 00:00:00 Bucardo Master Control Program v4.4.8.
root 16514 1 0 Nov25 ? 00:00:00 Bucardo Master Control Program v4.4.8.
root 17289 1 0 Nov25 ? 00:00:00 Bucardo Master Control Program v4.4.8.
root 27544 1 0 18:01 ? 00:00:28 Bucardo Master Control Program v4.4.8. Active syncs: cmsvog_pushdelta_main_to_gate,cmsvog_swap_main_and_gate,vog_pushdelta_cms_replication

Why are there 13 copies of Master Control Program running??? How does that even happen? I had to use "sudo kill -9 PID#" to stop them (short of rebooting the server).

Thank you!
Jonathan
Attachment: BST_Bucardo_keepitup.sh
Description: Binary data
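
The attached script is not readable in this archive, so the following is only a rough sketch of the watchdog idea described at the top of this message, not a reconstruction of BST_Bucardo_keepitup.sh. It is written in Python rather than shell, and every function name in it is hypothetical. It assumes that a stale modification time on log.bucardo is a good enough signal that Bucardo has stalled, and that `ps -Afww` output has the column layout shown above. It also tries SIGTERM before SIGKILL as a gentler alternative to `kill -9`; whether the Master Control Program shuts down cleanly on SIGTERM is an assumption that would need checking against your Bucardo version.

```python
import os
import signal
import time

# Process title as it appears in the ps output quoted in this message.
MCP_MARKER = "Bucardo Master Control Program"


def log_is_stale(log_path, max_idle_seconds, now=None):
    """Return True if log_path was last modified more than max_idle_seconds ago."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(log_path)) > max_idle_seconds


def mcp_pids(ps_output):
    """Extract PIDs of Master Control Program processes from `ps -Afww` text.

    Assumed columns: UID PID PPID C STIME TTY TIME CMD...
    Requiring CMD to start with "Bucardo" skips the `grep` line itself.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) > 7 and fields[7] == "Bucardo" and MCP_MARKER in line:
            if fields[1].isdigit():
                pids.append(int(fields[1]))
    return pids


def _alive(pid):
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by another user
    return True


def stop_processes(pids, grace_seconds=10.0):
    """Send SIGTERM, wait up to grace_seconds, then SIGKILL any survivors.

    Returns the list of PIDs that had to be killed with SIGKILL.
    """
    for pid in pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # already gone
    deadline = time.time() + grace_seconds
    remaining = list(pids)
    while remaining and time.time() < deadline:
        time.sleep(0.5)
        remaining = [p for p in remaining if _alive(p)]
    for pid in remaining:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass
    return remaining
```

A cron-driven wrapper could call `log_is_stale()` on the Bucardo log, feed the output of `ps -Afww` to `mcp_pids()`, pass the result to `stop_processes()`, and only then start Bucardo again. The log path, the idle threshold, and the restart command are site-specific and are left out on purpose.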
_______________________________________________ Bucardo-general mailing list [email protected] https://mail.endcrypt.com/mailman/listinfo/bucardo-general
