http://bugzilla.spamassassin.org/show_bug.cgi?id=4316
Summary: spamd loops spawning children that die immediately.
Product: Spamassassin
Version: 3.0.2
Platform: PC
OS/Version: Linux
Status: NEW
Severity: critical
Priority: P1
Component: spamc/spamd
AssignedTo: dev@spamassassin.apache.org
ReportedBy: [EMAIL PROTECTED]

I'm running a Sendmail/procmail/spamassassin combo on Debian Linux using spamc/spamd and recently noticed the following errors in my sendmail log:

    sm-mta[30568]: j412QfpS030544: timeout waiting for input from local during Draining Input

Checking the running processes showed things like:

    17759 ?  S   0:00 spamc
    17812 ?  S   0:00 spamc
    17831 ?  S   0:00 spamc
    17980 ?  S   0:00 spamc
    18614 ?  S   0:00 spamc
    19870 ?  S   0:00 spamc
    20297 ?  S   0:00 spamc
    20517 ?  S   0:00 spamc
    21465 ?  R  13:23 /usr/sbin/spamd --create-prefs --max-children 5 --helper-home-dir --max-conn-per-child 5 -d --pidfile=/var/run/spamd.pid
    22280 ?  Z   0:00 [spamd <defunct>]

Telnetting to spamd brought no response. With logging turned on, everything looked normal up until the problem occurred, at which point there were NO log entries from SA at all; the last entry showed the completion of a perfectly normal message.

strace turned out to be more informative and pointed me to the real culprit (I'll attach a trace in a minute): failed communication between SA and syslog. A few days earlier I had replaced sysklogd with syslog-ng (1.5.15-1.1). It appears that after several hours (12-24 hours on average with --max-conn-per-child set to 200) the communication between SA and syslog-ng breaks, at which point spamd starts madly spawning children that die immediately. It appears that spamd is completely dependent upon syslog and cannot function correctly without it.

To test the theory, I stopped and started spamd, ran strace on it, and then stopped the syslog service. Once the syslog service was dead the process began.
The children died slowly at first while the parent tried to keep up by spawning replacements, but eventually more than one would die at a time and the parent couldn't keep up. I was down to one parent madly spawning a single child that would die before the parent had a chance to spawn a second. Naturally, no spam processing could be accomplished during this time.

I've replaced syslog-ng with my old sysklogd in the hope that this will serve as a fix for the time being.

I should note that decreasing the value of --max-conn-per-child also appears to decrease the time between failures of spamd (I have not done enough testing to be certain). In other words, setting it to one may bring about the failure in 2-4 hours instead of 12-24 hours. Stopping syslog seems to bring about the failure immediately (it might not be immediately obvious from a ps, but a strace will tell the story in a hurry).
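
For illustration only, here is a minimal Python sketch of how a worker process might tolerate an unavailable log daemon instead of dying. It is not spamd's code (spamd is written in Perl), and it assumes the children die because a write to the syslog socket fails, which the report suggests but does not confirm. The names log_with_fallback and broken_syslog are hypothetical.

```python
import io
import sys


def log_with_fallback(send, message, fallback=sys.stderr):
    """Try the primary log transport; on OSError write to fallback instead.

    `send` is any callable that delivers one message (hypothetical interface).
    Returns True if the primary transport accepted the message.
    """
    try:
        send(message)
        return True
    except OSError as exc:
        # The primary logger (for example a syslog socket) is unavailable.
        # Record the message locally rather than letting the worker die.
        fallback.write(f"[syslog unavailable: {exc}] {message}\n")
        return False


def broken_syslog(message):
    """Hypothetical stand-in for a syslog socket whose daemon was stopped."""
    raise ConnectionRefusedError("syslog socket not accepting messages")


if __name__ == "__main__":
    buffer = io.StringIO()
    delivered = log_with_fallback(broken_syslog, "processed message", buffer)
    print(delivered)
    print(buffer.getvalue(), end="")
```

The point of the design is that a logging failure is treated as a degraded condition rather than a fatal one, so message processing could continue while syslog is down.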