Andrés,

> I am using amavisd-new ( amavisd-new-2.6.4 (20090625) ) with postfix (
> latest ) over FreeBSD 8.0 Also Bind DNS and Squid.
> The hardware is pretty new, 2 Gb RAM, Quad Core.
> When the system is less loaded, about 10 msg/minute, no problem like
> secondary MX.
> But as soon as getting heavy loaded, say 100/150 msg/minute - moving it
> to 1st MX- it sometimes freezes, and after a few minutes come back
> alone, and some ohter times, needs a reset.
> Tried to shut down Squid and Bind, no help.
> CPU is no more than 30%  ( usually 75% Idle ):
> 
> # top
> [...]

This looks quite normal.

> I used amavisd-nanny, and got this:
> 
> fobos[/usr/home/soporte]# amavisd-nanny
> process-id task-id     elapsed in    elapsed-bar (dots indicate idle)
>            or state   idle or busy
> 
> PID 28932: .             0:27:57 .........:.........:.........:.....
> PID 28943: 28943-12-12 terminated   0:13:33
> =========:=========:=========:===d>
> PID 29026: 29026-10    terminated   0:27:51
> =========:=========:=========:===d>
> PID 29528: 29528-01-24 terminated   0:12:46
> =========:=========:=========:===F>
> PID 30121:               0:00:22 .........:.........:..
> PID 30348:               0:00:21 .........:.........:.
> PID 28943: sending SIGKILL in 29 s
> PID 29026: sending SIGKILL in 29 s
> PID 29528: sending SIGKILL in 29 s
> Waiting for the process to terminate: 28943, 29026, 29528
> PID 28943: sending SIGKILL in 28 s
> PID 29026: sending SIGKILL in 28 s
> PID 29528: sending SIGKILL in 28 s
> Waiting for the process to terminate: 28943, 29026, 29528
> ^C
> exited

This is interesting. According to the legend (amavisd-nanny -h)
the 'd' status on 28943 and 29026 means receiving mail:
  d  transferring data from MTA to amavisd
and the status 'F' on 29528 means sending mail:
  F  forwarding mail to MTA

So it looks like the communication between amavisd and postfix
is stuck.
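
To pull the same information out of a longer nanny listing, something
like the following could be used. It is only a rough sketch: the regular
expression is guessed from the output quoted above (it is not taken from
amavisd-nanny itself), the names LEGEND, BUSY and stuck_processes are
made up for illustration, and the legend holds just the two entries
mentioned in this thread.

```python
import re

# Only the two legend entries quoted in this thread (assumption: other
# state letters exist, see amavisd-nanny -h, and are not covered here).
LEGEND = {
    "d": "transferring data from MTA to amavisd",
    "F": "forwarding mail to MTA",
}

# PID, task id, the word "terminated", elapsed time, then a busy bar made
# of '=' and ':' ending in a state letter and '>'. \s+ also spans a line
# break, so a bar wrapped onto the next line (as in the mail) still matches.
BUSY = re.compile(
    r"PID (\d+): (\S+)\s+terminated\s+(\d+:\d\d:\d\d)\s+[=:]+([A-Za-z])>"
)


def stuck_processes(nanny_output):
    """Return (pid, elapsed, letter, meaning) for each 'terminated' entry."""
    rows = []
    for pid, _task, elapsed, letter in BUSY.findall(nanny_output):
        meaning = LEGEND.get(letter, "not in the quoted legend")
        rows.append((int(pid), elapsed, letter, meaning))
    return rows


SAMPLE = """\
PID 28932: .             0:27:57 .........:.........:.........:.....
PID 28943: 28943-12-12 terminated   0:13:33
=========:=========:=========:===d>
PID 29026: 29026-10    terminated   0:27:51
=========:=========:=========:===d>
PID 29528: 29528-01-24 terminated   0:12:46
=========:=========:=========:===F>
PID 30121:               0:00:22 .........:.........:..
"""

if __name__ == "__main__":
    for row in stuck_processes(SAMPLE):
        print(row)
```

Idle processes (dotted bars) are skipped on purpose, since only the
entries that nanny is waiting to kill are of interest here.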

> fobos[/usr/home/soporte]# truss  -p 29026
> ^C   <<<<<<<<<< After 30 seconds, CTRL-C

For all three to-be-terminated processes, 'top' showed their
state as runnable (RUN), yet their WCPU% is 0 and they accumulated
hardly any CPU time, even though they had been hanging in there
for 12..27 minutes dealing with a single message.


> At the freeze time, the login prompt is fast as usual, but the prompt
> for password takes more than 2 minutes, if cames!

This could be an authentication/authorization issue with PAM (LDAP?),
a DNS or network problem, or perhaps disk I/O trouble.
Check /var/log/messages.
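
One cheap way to test the DNS guess during a freeze is to time a few
name lookups. The sketch below is only an illustration: time_lookup is a
made-up helper, the host names are arbitrary examples, and a slow result
would merely hint at the resolver, not prove it is the cause.

```python
import socket
import time


def time_lookup(name):
    """Time one forward lookup; return (seconds, address or error text)."""
    start = time.monotonic()
    try:
        result = socket.gethostbyname(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result = "lookup failed: %s" % exc
    return time.monotonic() - start, result


if __name__ == "__main__":
    # Example names only; substitute hosts that matter on the affected box.
    for name in ("localhost", "lists.sourceforge.net"):
        seconds, result = time_lookup(name)
        print("%-25s %6.2fs  %s" % (name, seconds, result))
```

If lookups that are normally instant take many seconds while the system
is frozen, the resolver configuration would be the next thing to look at.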

> The logs shows nothing but this:
> Apr 15 10:34:55 fobos postfix/smtpd[25919]: timeout after END-OF-MESSAGE
> from localhost[127.0.0.1]
> Apr 15 10:34:57 fobos postfix/smtpd[25777]: timeout after END-OF-MESSAGE
> from localhost[127.0.0.1]
> Apr 15 10:34:58 fobos postfix/smtpd[26055]: timeout after END-OF-MESSAGE
> from localhost[127.0.0.1]
> Apr 15 10:39:00 fobos postfix/smtpd[25963]: timeout after RSET from
> mail.police.gov.bd[203.188.249.4]
> Apr 15 10:39:03 fobos postfix/smtpd[26126]: timeout after END-OF-MESSAGE
> from localhost[127.0.0.1]
> Apr 15 10:47:24 fobos postfix/smtpd[26236]: timeout after END-OF-MESSAGE
> from localhost[127.0.0.1]
> Apr 15 10:48:54 fobos postfix/smtpd[26238]: timeout after RSET from
> mail.police.gov.bd[203.188.249.4]
> Apr 15 10:49:52 fobos postfix/smtpd[26267]: timeout after END-OF-MESSAGE
> from localhost[127.0.0.1]
> Apr 15 11:08:51 fobos postfix/smtpd[26601]: timeout after END-OF-MESSAGE
> from localhost[127.0.0.1]
> Apr 15 11:11:15 fobos postfix/smtpd[26651]: timeout after RSET from
> mail.derbynet.net[216.24.112.20]
> Apr 15 11:21:15 fobos postfix/smtpd[26784]: timeout after RSET from
> mail.derbynet.net[216.24.112.20]

So postfix still seems to be alive.
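
To see at a glance how many of these timeouts come from local clients
and how many from remote ones, the log lines could be tallied roughly
like this. The pattern is derived only from the lines quoted above (with
each entry on one line, as in the real log file), and TIMEOUT and
tally_timeouts are illustrative names, not part of postfix or amavisd.

```python
import re
from collections import Counter

# Matches lines such as:
# ... postfix/smtpd[25919]: timeout after END-OF-MESSAGE from localhost[127.0.0.1]
TIMEOUT = re.compile(
    r"postfix/smtpd\[\d+\]: timeout after (\S+)\s+from\s+(\S+)\[([^\]]+)\]"
)


def tally_timeouts(log_text):
    """Count timeouts per (origin, SMTP stage); origin is 'local' or 'remote'."""
    counts = Counter()
    for stage, _host, addr in TIMEOUT.findall(log_text):
        origin = "local" if addr == "127.0.0.1" else "remote"
        counts[(origin, stage)] += 1
    return counts


SAMPLE = """\
Apr 15 10:34:55 fobos postfix/smtpd[25919]: timeout after END-OF-MESSAGE from localhost[127.0.0.1]
Apr 15 10:34:57 fobos postfix/smtpd[25777]: timeout after END-OF-MESSAGE from localhost[127.0.0.1]
Apr 15 10:39:00 fobos postfix/smtpd[25963]: timeout after RSET from mail.police.gov.bd[203.188.249.4]
"""

if __name__ == "__main__":
    for (origin, stage), n in sorted(tally_timeouts(SAMPLE).items()):
        print(origin, stage, n)
```

The split by client address is the only point of the sketch; what the
local timeouts mean still has to be judged from the rest of the log.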

> And regarding the 29026 process:
> # grep 29026 /var/log/maillog

This looks like normal operation up to the moment when it got stuck.
A higher log level would pinpoint the state in more detail, but we
already know it was somewhere in the data transfer stage (d, F).

> Still now, when I commented the line in Postifx to call the amavisd, the
> process 29026 is hang around ( as others )
> 
> Can someone point me a solution??


> At the freeze time, the login prompt is fast as usual, but the prompt
> for password takes more than 2 minutes, if cames!

The login problem indicates there is some underlying problem with
the basic health of the system. Solving the login problem may be the
right starting point.

  Mark
