Ashish,

> I have a postfix mail receiving server. On this I have setup amavisd-new
> 2.6.4 (with Spamassassin 3.3.1 and ClamAV).
>
> Earlier I have 2 (default) amavisd-new processes running.
>
> Recently when more of my mails started to go into Postfix deferred queue,
> I increased the number of amavisd-new processes to 4 and did the
> corresponding changes in /etc/amavisd.conf and master.cf for number
> of processes (changed 'maxproc') .
> My machine config is as 4GB RAM, AMD Opteron (tm) Processor 885 @ 2.6 GHz

Btw (unrelated to the problem at hand), your host is probably capable
of running even more than 4 amavisd/SpamAssassin processes. But first
the spinning problem should be resolved.

> Since then my CPU utilization shows 100% and one of the amavisd processes
> seems to be stuck and I had to restart amavisd to resolve this issue. But
> this reoccurs again after a short span of time.
>
> Here is the output from amavisd-nanny , You can see one of the process
> that is always stuck in "content checking just started".
>
> States legend:
>   A  accepted a connection
>   b  begin with a protocol for accepting a request
>   m  'MAIL FROM' smtp command started a new transaction in the same session
>   d  transferring data from MTA to amavisd
>   =  content checking just started
>   G  generating and verifying unique mail_id
>   D  decoding of mail parts
>   V  virus scanning
>   S  spam scanning
>   P  pen pals database lookup and updates
>   r  preparing results
>   Q  quarantining and preparing/sending notifications
>   F  forwarding mail to MTA
>   .  content checking just finished
>   sp space indicates idle (elapsed bar is showing dots)
>
> PID 00613: 00613-10 0:00:01 =S
> PID 00865: 00865-04 0:01:37 =========:=========:=========:===S>
> PID 01198:          0:00:24 .........:.........:....
> PID 01288: .        0:00:22 .........:.........:..
>
> PID 00613: 00613-10 0:00:03 =SSS
> PID 00865: 00865-04 0:01:39 =========:=========:=========:===S>
> PID 01198:          0:00:26 .........:.........:......
> PID 01288:          0:00:24 .........:.........:....
>
> PID 00613: 00613-10 0:00:05 =SSSSS
> PID 00865: 00865-04 0:01:41 =========:=========:=========:===S>
> PID 01198:          0:00:28 .........:.........:........
> PID 01288:          0:00:26 .........:.........:......
>
> PID 00613: 00613-10 0:00:08 =SSSSSSF
> PID 00865: 00865-04 0:01:43 =========:=========:=========:===S>
> PID 01198:          0:00:30 .........:.........:.........:
> PID 01288:          0:00:28 .........:.........:........
>
> PID 00613: .        0:00:10 .........:
> PID 00865: 00865-04 0:01:45 =========:=========:=========:===S>
> PID 01198:          0:00:32 .........:.........:.........:..
> PID 01288:          0:00:30 .........:.........:.........:
>
> PID 00613:          0:00:12 .........:..
> PID 00865: 00865-04 0:01:47 =========:=========:=========:===S>
> PID 01198:          0:00:35 .........:.........:.........:.....
> PID 01288:          0:00:32 .........:.........:.........:..
>
> PID 00613:          0:00:14 .........:....
> PID 00865: 00865-04 0:01:49 =========:=========:=========:===S>
> PID 01198:          0:00:37 .........:.........:.........:.....
> PID 01288:          0:00:34 .........:.........:.........:....
>
> PID 00613:          0:00:16 .........:......
> PID 00865: 00865-04 0:01:51 =========:=========:=========:===S>
> PID 01198:          0:00:39 .........:.........:.........:.....
> PID 01288:          0:00:36 .........:.........:.........:.....
>
> PID 00613:          0:00:18 .........:........
> PID 00865: 00865-04 0:01:53 =========:=========:=========:===S>
> PID 01198:          0:00:41 .........:.........:.........:.....
> PID 01288:          0:00:38 .........:.........:.........:.....
>
> PID 00613:          0:00:20 .........:.........:
> PID 00865: 00865-04 0:01:55 =========:=========:=========:===S>
> PID 01198:          0:00:43 .........:.........:.........:.....
> PID 01288:          0:00:40 .........:.........:.........:.....
>
> Is it a problem?
>
> Also if it is so , why is this happening and what changes/fixes I need to
> do to circumvent this problem?

No, your interpretation is incorrect. You should be looking at the last
character (the 'S' in this case), not the leading '=' (which just means
'no state info is available for the past'). If you had started
amavisd-nanny earlier (before the problem occurred), you would have seen
the long bar slowly filling up, first with perhaps a D or V, followed by
a long stream of S.

So your process is stuck in SpamAssassin. The most likely reason is
some runaway regular expression, probably in rules, but possibly
directly in the SpamAssassin code.
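
The reading rule above (judge a process by the last character of its
bar, not by the leading '=') can be sketched as a small filter over
amavisd-nanny output. This is only an illustration under assumptions:
the function name stuck_procs and the default 60-second threshold are
hypothetical and not part of amavisd-new, and the sketch assumes the
line layout shown in the quoted output (PID, optional id, elapsed time,
bar).

```shell
#!/bin/sh
# Hypothetical helper (not shipped with amavisd-new): read amavisd-nanny
# output on stdin and print "pid elapsed state" for every busy process
# whose elapsed time is at least $1 seconds (default 60, an arbitrary choice).
stuck_procs() {
  awk -v min="${1:-60}" '
    $1 == "PID" {
      pid = $2
      sub(/:$/, "", pid)
      # Locate the elapsed-time field (H:MM:SS); the bar is the next field.
      for (i = 3; i < NF; i++) {
        if ($i ~ /^[0-9]+:[0-9][0-9]:[0-9][0-9]$/) {
          split($i, t, ":")
          secs = t[1] * 3600 + t[2] * 60 + t[3]
          bar = $(i + 1)
          sub(/>$/, "", bar)    # drop the trailing ">" of an overflowed bar
          gsub(/:/, "", bar)    # drop the ":" markers placed every 10th column
          # The LAST remaining character is the current state.
          state = substr(bar, length(bar))
          # "." means idle or just finished, so only other states are reported.
          if (secs >= min && state != ".")
            print pid, $i, state
          break
        }
      }
    }'
}
```

Fed the unquoted nanny lines from the report above, stuck_procs 60 would
list only PID 00865, with state S (spam scanning). In practice it could
be used as "amavisd-nanny | stuck_procs 60", assuming nanny is writing
to a pipe rather than a terminal.
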

Probably the same message repeatedly falls into the same trap, as
Postfix retries an unsuccessful delivery periodically.

To diagnose the problem, first obtain the problematic message: either
from the file 'email' in the temporary directory used by the amavisd
process that is currently looping, or alternatively, pick the offending
message from the postfix queue using a 'postcat -q <queue-id>' command.

Then you can experiment with this message and feed it to a command-line
spamassassin (running under the same UID as amavisd), e.g.:

  # su vscan -c 'spamassassin -t -D <0.msg'

Do you have any nonstandard rules, perhaps some old SARE rules?
Try disabling/removing some of them (perhaps by using bisection
if your heuristics fail) to narrow down to an offending rule.

Alternatively (if the loop is not infinite but eventually finishes),
you can enable the plugin HitFreqsRuleTiming.pm, which produces a list
of elapsed times by rule, sorted by elapsed time, writing it to a file
timing.log in the current directory. The plugin comes with the
SpamAssassin distribution in the subdirectory ./masses/plugins/ .

  Mark

_______________________________________________
AMaViS-user mailing list
https://lists.sourceforge.net/lists/listinfo/amavis-user
Please visit http://www.ijs.si/software/amavisd/ regularly
For administrativa requests please send email to rainer at openantivirus dot org
