Hi,

I'm running Postfix with SA invoked from amavis-new using both DCC and
Razor2.
All seemed to run smoothly for a couple of weeks. Then all of a sudden no
mail got through. A quick check showed that all instances of amavis-new
were stuck in a loop happily consuming 100% CPU.
I found that the problem was triggered by a mail that hung amavis-new.
When Postfix timed out and retried, the next amavis-new instance hung as
well, until all amavis-new instances were hung.
OK, so this is an amavis problem then? I'm not sure about that. I disabled
SA in amavis.conf and ran the same message through the system again, this
time with no problem. Running amavis in debug mode showed that SA tried to
run Razor, which resulted in the following error and a Perl core dump:

 Razor-Log: Computed razorhome from env: /var/amavis/.razor
 Razor-Log: Found razorhome: /var/amavis/.razor
 Razor-Log: Can't read file /var/amavis/.razor/razor-agent.conf: Too many open files
 Razor-Log: computed razorhome=/var/amavis/.razor, conf=/var/amavis/.razor/razor-agent.conf, ident=/var/amavis/.razor/identity
 Razor-Log: Client supported_engines:
 Razor-Log:  prep_mail done: mail 1 headers=144, mime0=210
 Razor-Log: Can't read file , looking relatve to /var/amavis/.razor
 Razor-Log: Can't read file /var/amavis/.razor/: Too many open files
 Razor-Log: Can't read file , looking relatve to /var/amavis/.razor
 Razor-Log: Can't read file /var/amavis/.razor/: Too many open files
 Razor-Log: Can't read file , looking relatve to /var/amavis/.razor
 Razor-Log: Can't read file /var/amavis/.razor/: Too many open files
 Razor-Log: entered nextserver
 Razor-Log: entered discover
 Razor-Log: no listfile:
 Razor-Log: entered bootstrap_discovery
 Razor-Log: no discovery listfile:
 Razor-Log: Finding Discovery Servers via DNS in the zone
 Razor-Log: Found and evaled Net::DNS::Resolver->new() ==> Net::DNS::Resolver=HASH(0x98bc1a4)
 Razor-Log: Found 0 Discovery Servers via DNS in the zone
razor2 check skipped: Too many open files IO::Socket::INET: Bad protocol 'udp' ...propagated at /usr/local/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Dns.pm line 409, <GEN6> line 207.


Next I disabled Razor2 and ran with DCC only. This produced the following
error (no core dump this time):

Mar 30 11:10:17 marvin amavisd[20976]: (20976-01) SA TIMED OUT,
backtrace: at
/usr/local/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/PerMsgStatus.pm
line 2584\n\teval {...} called at
/usr/local/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/PerMsgStatus.pm
line 2584\n\tMail::SpamAssassin::PerMsgStatus::secure_tmpfile() called at
/usr/local/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/PerMsgStatus.pm
line 
2549\n\tMail::SpamAssassin::PerMsgStatus::create_fulltext_tmpfile('Mail::SpamAssassin::PerMsgStatus=HASH(0x9b53204)','SCALAR(0x9b49574)')
called at /usr/local/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/Dns.pm
line 
718\n\tMail::SpamAssassin::PerMsgStatus::dcc_lookup('Mail::SpamAssassin::PerMsgStatus=HASH(0x9b53204)','SCALAR(0x9b49574)')
called at /usr/local/lib/perl5/site_perl/5.8.1/Mail/SpamAssassin/EvalTests.pm 
line 
2565\n\tMail::SpamAssassin::PerMsgStatus::check_dcc('Mail::SpamAssassin::PerMsgStatus=HASH(0x9b53204)','SCALAR(0x98dc3a8)')
 called ...

Funny thing is that all messages that caused this behaviour originated
from users at the same corporation. Disabling DCC and Razor2 got the
system back online and happily delivering the troublesome messages.
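
Since the Razor log above reports "Too many open files", descriptor usage
is one thing worth checking. What follows is only a minimal sketch, written
in Python rather than Perl, and it is not part of amavis-new, SA, Razor or
DCC. The function names descriptor_limits and count_open_fds are made up
for illustration. It inspects only the process it runs in, so it would have
to be adapted, or replaced by a system tool, to look at the hung amavis-new
processes themselves.

```python
import errno
import os
import resource


def descriptor_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(scan_cap=65536):
    """Count descriptors currently open in this process.

    Each descriptor number is probed with os.fstat(), so no /proc is needed.
    scan_cap (an arbitrary illustrative value) bounds the scan when the soft
    limit is very large or unlimited.
    """
    soft, _hard = descriptor_limits()
    upper = scan_cap if soft == resource.RLIM_INFINITY else min(soft, scan_cap)
    open_count = 0
    for fd in range(upper):
        try:
            os.fstat(fd)
        except OSError as exc:
            if exc.errno == errno.EBADF:
                continue  # this descriptor number is not in use
            # Any other error still means the descriptor exists.
        open_count += 1
    return open_count


if __name__ == "__main__":
    soft, hard = descriptor_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("descriptors open in this process:", count_open_fds())
```

If this is run under the same account and environment that starts
amavis-new, the printed soft limit would presumably match what the daemon
gets. That is an assumption and has not been verified here.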

I am running Postfix 2.0.16 with SpamAssassin 2.63, Razor 2.36, DCC 1.2.34
and amavis-new 20030616-p8.

Can anyone offer a clue as to what is wrong? As I stated before, I am not
even sure whether this is an SA issue or whether it lies with amavis-new,
DCC or Razor2.

//Erik
