Hello, After using amavisd with relatively few problems for many years, something has recently come up which is proving extremely difficult to diagnose and correct.
After running fine for a few minutes, amavisd stalls out. About five minutes after that, this happens: Feb 28 00:20:20 b1 amavis[6879]: (06879-01) (!!)TROUBLE in check_mail: reg_proc FAILED: timed out Feb 28 00:20:20 b1 amavis[6845]: (06845-01) (!)Requesting process rundown, task exceeded allowed time Feb 28 00:20:20 b1 amavis[4184]: (04184-18) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[6845]: (06845-01) (!)TempDir removal: tempdir is to be PRESERVED: /data/amavis/tmp/amavis-20190228T001526-06845-J_pe8wnf Feb 28 00:20:20 b1 amavis[6836]: (06836-01) (!!)TROUBLE in check_mail: forwarding FAILED: timed out Feb 28 00:20:20 b1 amavis[6828]: (06828-01) (!!)TROUBLE in check_mail: forwarding FAILED: timed out Feb 28 00:20:20 b1 amavis[4105]: (04105-20) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[6660]: (06660-03) (!!)TROUBLE in check_mail: forwarding FAILED: timed out Feb 28 00:20:20 b1 amavis[4108]: (04108-20) (!!)TROUBLE in check_mail: forwarding FAILED: timed out Feb 28 00:20:20 b1 amavis[4183]: (04183-20) (!!)TROUBLE in check_mail: quar+notif FAILED: timed out Feb 28 00:20:20 b1 amavis[6842]: (06842-01) (!!)TROUBLE in check_mail: forwarding FAILED: timed out Feb 28 00:20:20 b1 amavis[4078]: (04078-20) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[4083]: (04083-20) (!)Requesting process rundown, task exceeded allowed time Feb 28 00:20:20 b1 amavis[4083]: (04083-20) (!)TempDir removal: tempdir is to be PRESERVED: /data/amavis/tmp/amavis-20190228T001105-04083-BIJKpd1g Feb 28 00:20:20 b1 amavis[6748]: (06748-02) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[4132]: (04132-13) (!)Requesting process rundown, task exceeded allowed time Feb 28 00:20:20 b1 amavis[4132]: (04132-13) (!)TempDir removal: tempdir is to be PRESERVED: /data/amavis/tmp/amavis-20190228T001115-04132-_pPVvTO6 Feb 28 00:20:20 b1 amavis[4172]: (04172-18) (!!)TROUBLE in check_mail: spam_scan FAILED: timed out Feb 28 00:20:20 b1 amavis[4114]: (04114-16) (!)Requesting process rundown, task exceeded allowed time Feb 28 00:20:20 b1 amavis[4114]: (04114-16) (!)TempDir removal: tempdir is to be PRESERVED: /data/amavis/tmp/amavis-20190228T001113-04114-XIBpubu9 Feb 28 00:20:20 b1 amavis[4071]: (04071-17) (!)Requesting process rundown, task exceeded allowed time Feb 28 00:20:20 b1 amavis[4095]: (04095-18) (!)Requesting process rundown, task exceeded allowed time Feb 28 00:20:20 b1 amavis[4071]: (04071-17) (!)TempDir removal: tempdir is to be PRESERVED: /data/amavis/tmp/amavis-20190228T001103-04071-uWuQ_FjN Feb 28 00:20:20 b1 amavis[4095]: (04095-18) (!)TempDir removal: tempdir is to be PRESERVED: /data/amavis/tmp/amavis-20190228T001107-04095-evFSHnz_ Feb 28 00:20:20 b1 amavis[6852]: (06852-01) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[4098]: (04098-19) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[6846]: (06846-01) (!!)TROUBLE in check_mail: quar+notif FAILED: timed out Feb 28 00:20:20 b1 amavis[6841]: (06841-01) (!)Requesting process rundown, task exceeded allowed time Feb 28 00:20:20 b1 amavis[6841]: (06841-01) (!)TempDir removal: tempdir is to be PRESERVED: /data/amavis/tmp/amavis-20190228T001525-06841-LjhaSm_A Feb 28 00:20:20 b1 amavis[4169]: (04169-19) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[4079]: (04079-20) (!)Requesting process rundown, task exceeded allowed time Feb 28 00:20:20 b1 amavis[4079]: (04079-20) (!)TempDir removal: tempdir is to be PRESERVED: /data/amavis/tmp/amavis-20190228T001103-04079-QxWEpEkc Feb 28 00:20:20 b1 amavis[6808]: (06808-01) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[4089]: (04089-20) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[6743]: (06743-02) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[4128]: (04128-19) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[6859]: (06859-01) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[6865]: (06865-01) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[4151]: (04151-20) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[6727]: (06727-02) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[6770]: (06770-01) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[4073]: (04073-20) (!!)TROUBLE in check_mail: decide_mail_destiny FAILED: timed out Feb 28 00:20:20 b1 amavis[4157]: (04157-11) (!)Requesting process rundown, task exceeded allowed time Feb 28 00:20:20 b1 amavis[4157]: (04157-11) (!)TempDir removal: tempdir is to be PRESERVED: /data/amavis/tmp/amavis-20190228T001116-04157-JivXLHdS Feb 28 00:20:20 b1 amavis[6776]: (06776-02) (!!)TROUBLE in check_mail: mime_decode-1 FAILED: timed out Feb 28 00:20:20 b1 amavis[6879]: (06879-01) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001530-06879-GFZd6Pxo Feb 28 00:20:20 b1 amavis[6828]: (06828-01) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001524-06828-DS1VbeD9 Feb 28 00:20:20 b1 amavis[4184]: (04184-18) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001117-04184-lixrt0pw Feb 28 00:20:20 b1 amavis[6836]: (06836-01) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001526-06836-k3iHZPIY Feb 28 00:20:20 b1 amavis[4105]: (04105-20) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001108-04105-1hQIqEGe Feb 28 00:20:20 b1 amavis[6660]: (06660-03) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001506-06660-RKk3QK6F Feb 28 00:20:20 b1 amavis[4108]: (04108-20) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001111-04108-ecbB3Q_C Feb 28 00:20:20 b1 amavis[4183]: (04183-20) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001117-04183-ENU2SsX3 Feb 28 00:20:20 b1 amavis[6842]: (06842-01) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001526-06842-yabNzMXc Feb 28 00:20:20 b1 amavis[4078]: (04078-20) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001103-04078-YBmu2Xal Feb 28 00:20:20 b1 amavis[6748]: (06748-02) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001518-06748-coWC10OH Feb 28 00:20:20 b1 amavis[4172]: (04172-18) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001117-04172-WWseqk4c Feb 28 00:20:20 b1 amavis[6852]: (06852-01) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001527-06852-MJAWa9Wt Feb 28 00:20:20 b1 amavis[4098]: (04098-19) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001107-04098-99luc2mN Feb 28 00:20:20 b1 amavis[6846]: (06846-01) (!)PRESERVING EVIDENCE in /data/amavis/tmp/amavis-20190228T001526-06846-SNCNRlZL At this point, amavis is dead until restarted. The version is amavisd-new-2.11.1 on FreeBSD 11. A "ps" of the running processes shows they are all stuck: 110 4063 716 0 20 0 155208 147576 select S - 0:03.03 /usr/local/sbin/amavisd (master) (perl) 110 4068 4063 0 20 0 169304 158632 ucond I - 0:12.24 /usr/local/sbin/amavisd (ch20-avail) (perl) 110 4069 4063 0 20 0 165248 155816 ucond I - 0:10.28 /usr/local/sbin/amavisd (ch19-avail) (perl) 110 4071 4063 0 20 0 169304 159192 ucond I - 0:09.78 /usr/local/sbin/amavisd (ch17-finish) (perl) [...] 110 6938 4063 0 20 0 156672 147984 ucond I - 0:00.03 /usr/local/sbin/amavisd (ch1-avail) (perl) 110 6942 4063 0 20 0 156672 147984 ucond I - 0:00.02 /usr/local/sbin/amavisd (ch1-avail) (perl) 110 6944 4063 0 20 0 156672 147984 ucond I - 0:00.03 /usr/local/sbin/amavisd (ch1-avail) (perl) 110 6951 4063 0 20 0 156664 147976 ucond I - 0:00.02 /usr/local/sbin/amavisd (perl) 110 6952 4063 0 20 0 156664 147976 ucond I - 0:00.02 /usr/local/sbin/amavisd (perl) The ps above shows all the child processes are in wait state "ucond" which is short for "user-mode condition." This means they are all waiting for something outside the kernel, like a POSIX mutex or condition variable. (This specifically excludes file locks, which wait inside the kernel.) So, it appears to an outsider that all the child processes are deadlocked. To verify this, I tried to trace the processes. It seems like the last traced operation in each process is this: 9660 perl CALL _umtx_op(0x80089e004,UMTX_OP_CV_WAIT,0x6<CVWAIT_ABSTIME|CVWAIT_CLOCKID>,0x80089d000,0) So that does appear to bear out the idea that the processes are all waiting for the same condition variable that never gets signaled. Through tweaking amavisd.conf, I found that adding: @bypass_spam_checks_maps = (1); Appears to prevent the issue, which suggests that it's somewhere in the spam-handling code. Does this make sense to anyone, or is there any way for me to troubleshoot this further? Thanks for any advice!
