On 27/2/2014 8:45 PM, Noel Jones wrote:
> Sounds as if the real problem is you're sending amavisd more mail at
> a time than your system can handle.
Thank you Noel,
I just found the cause: one peculiar message (long, without
attachments, containing multiple languages and HTML character encoding)
which was sent to one particular user by one particular user group 750 times!
Can I somehow isolate these messages in the deferred or active queue
and delete them all at once? Is there a way to tell Postfix:
remove from the queue all messages whose sender is x...@example.com?
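One way this might be done, as a minimal sketch: `postsuper -d -` reads
queue IDs from standard input, so a small filter over `postqueue -p`
output could pick out the IDs by envelope sender. The function name
`queue_ids_from_sender` and the address `sender@example.com` are
hypothetical placeholders, and the filter assumes the traditional
seven-column `postqueue -p` listing (queue ID, size, four date fields,
sender).

```shell
#!/bin/sh
# Hypothetical helper: read `postqueue -p` output on stdin and print the
# queue ID of every message whose envelope sender equals the first argument.
queue_ids_from_sender() {
    awk -v sender="$1" '
        # Entry header lines have 7 fields: ID, size, 4 date fields, sender.
        NF == 7 && $1 ~ /^[0-9A-Za-z]+[*!]?$/ && $7 == sender {
            id = $1
            sub(/[*!]$/, "", id)   # drop the active (*) or hold (!) marker
            print id
        }
    '
}

# Assumed usage (run as root; review the ID list before deleting anything):
#   postqueue -p | queue_ids_from_sender "sender@example.com"
#   postqueue -p | queue_ids_from_sender "sender@example.com" | postsuper -d -
```

Listing the IDs first and deleting in a second step keeps the destructive
part separate, which seems safer on a production queue.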
For some reason, this particular message takes far too long to be scanned by
amavisd (or SpamAssassin, which runs under it): about 3.5 minutes each
(i.e. 3.5 min x 750)! During this time the CPU sits at 100%, making the server
suffer and causing the active queue to grow longer and longer.
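To put that figure in perspective, here is the arithmetic as a small
sketch. It assumes the copies are scanned strictly one after another on
the single CPU; the variable names are only illustrative.

```shell
#!/bin/sh
# 3.5 minutes per message, kept in tenths of a minute for integer math.
scan_tenths_min=35
copies=750

# Total scan time in whole minutes, assuming strictly serial scanning.
total_min=$(( scan_tenths_min * copies / 10 ))

echo "$total_min minutes = $(( total_min / 60 )) h $(( total_min % 60 )) min"
```

That is 2625 minutes, or close to 44 hours of CPU-bound scanning for
this one message alone.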
The server is an enterprise-class VM (under KVM) on clustered hardware,
with one virtual CPU (it never had a problem until today; it only has to
deal with a relatively low mail volume):
===============================================================
# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 2
model name : QEMU Virtual CPU version 0.12.5
stepping : 3
cpu MHz : 2000.412
cache size : 4096 KB
fpu : yes
fpu_exception : yes
cpuid level : 4
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm up rep_good unfair_spinlock pni cx16 popcnt hypervisor lahf_lm
bogomips : 4000.82
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
===============================================================
Server info during this peak load:
===================================================================================================
[root@mailgw1 log]# free -m
             total       used       free     shared    buffers     cached
Mem:          4840       3573       1267          0        141       2554
-/+ buffers/cache:        877       3963
Swap:         3023          4       3019
[root@mailgw1 log]# iostat
Linux 2.6.32-431.3.1.el6.x86_64 (mailgw1.noa.gr) 02/27/2014 _x86_64_ (1 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.09    0.02    0.54    0.40    0.00   90.95
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
vda               6.28        20.82       127.33   86776058  530599580
dm-0             16.35        20.58       127.31   85755530  530501728
dm-1              0.01         0.02         0.02      99048      87304
[root@mailgw1 log]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 2  0   4296 1299284 145172 2615764    0    0    10    64    4    4  8  1 91  0  0
[root@mailgw1 log]# mpstat 3
Linux 2.6.32-431.3.1.el6.x86_64 (mailgw1.noa.gr) 02/27/2014 _x86_64_ (1 CPU)
09:27:00 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
09:27:03 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
09:27:06 PM  all   99.67    0.00    0.33    0.00    0.00    0.00    0.00    0.00    0.00
09:27:09 PM  all  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
...
===================================================================================================
Thanks,
Nick