On Jan 19, 2016, at 18.09, Quanah Gibson-Mount <qua...@zimbra.com> wrote:
>
> We recently updated to Amavisd 2.10.1 from 2.9.0 internally, and have found
> that amavisd constantly dies while processing messages after being put under
> a moderate load in our QA environment.
>
> For example, here is postfix passing off the email to amavisd:
>
> Jan 19 06:57:42 zqa-211 postfix/smtp[24983]: 84B33102BD3:
> to=<user...@zqa-211.eng.zimbra.com>, relay=127.0.0.1[127.0.0.1]:10024,
> delay=32179, delays=32178/0.01/0.01/0.22, dsn=4.4.2, status=
> deferred (lost connection with 127.0.0.1[127.0.0.1] while sending end of data
> -- message may be sent more than once)
>
> Here we can see amavis accept it, and then die:
> Jan 19 06:57:42 zqa-211 amavis[13724]: (13724-01) ESMTP [127.0.0.1]:10024
> /opt/zimbra/data/amavisd/tmp/amavis-20160119T065742-13724-gste5uOH:
> <ad...@zqa-211.eng.zimbra.com> -> <user543@zqa-
> 211.eng.zimbra.com> SIZE=23143 Received: from zqa-211.eng.zimbra.com
> ([127.0.0.1]) by localhost (zqa-211.eng.zimbra.com [127.0.0.1]) (amavisd-new,
> port 10024) with ESMTP for <user543@zqa-21
> 1.eng.zimbra.com>; Tue, 19 Jan 2016 06:57:42 -0800 (PST)
> Jan 19 06:57:42 zqa-211 amavis[13724]: (13724-01) Checking: iugKoVQZWTPd
> [10.15.32.142] <ad...@zqa-211.eng.zimbra.com> ->
> <user...@zqa-211.eng.zimbra.com>
> Jan 19 06:57:52 zqa-211 amavis-services[18544]: PID 13724 went away, 13724-01

does $log_level = 5 reveal any additional clues about what happened to the
process? perhaps an strace might as well?

it's anecdotal, but on a handful of occasions we have had our mail server use
up all of its memory, and iirc, amavis seemed to have trouble handling that
gracefully, and troubleshooting was a little obscure. most recently, the
culprit was something wrt razor servers changing [hostname, ip address, or
such], which caused amavis children to get stuck.

-ben
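
if the higher log level produces a lot of output, something like the sketch below might help pull out the last few things each dead child logged before amavis-services reported it gone. it's only a rough sketch: it assumes the syslog line layout shown in the quoted log above, and the function name `last_lines_before_death` and the `keep` parameter are made up for illustration, not part of amavis.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the quoted log lines, e.g.
# "Jan 19 06:57:42 zqa-211 amavis[13724]: (13724-01) Checking: ..."
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s[\d:]{8})\s(\S+)\s([\w-]+)\[(\d+)\]:\s(.*)$"
)
# Message written by amavis-services when a child disappears.
GONE_RE = re.compile(r"PID (\d+) went away")


def last_lines_before_death(lines, keep=5):
    """Map each PID reported as 'went away' to its last `keep` log messages.

    Hypothetical helper: `lines` is any iterable of syslog lines.
    """
    history = defaultdict(list)  # pid -> messages logged by that amavis child
    report = {}
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # skip lines that do not fit the assumed layout
        timestamp, _host, prog, pid, msg = match.groups()
        gone = GONE_RE.search(msg)
        if prog == "amavis-services" and gone:
            dead_pid = gone.group(1)
            report[dead_pid] = history[dead_pid][-keep:]
        elif prog == "amavis":
            history[pid].append(f"{timestamp} {msg}")
    return report
```

you could feed it an open log file, e.g. `last_lines_before_death(open(path))`, where `path` is wherever your syslog writes mail logs, and then look at what each child was doing right before it vanished.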