Hi Dr Erwin,
At 07/09/04 13:23 (), you wrote:
At 11:15 07.09.04 +0530, you wrote:
>At 06/09/04 22:15 (), Erwin Hoffmann wrote:
>>At 20:11 06.09.04 +0530, you wrote:
>> >Dear Erwin,
>> >Sorry for a question not really related to Vpopmail.
>> >It seems that I am hit by "Silly Qmail (Queue) Syndrome".
>> >I am using the Spamcontrol Patch v2.2.12 along with vpopmail-5.4.6, but
>> >have not used the experimental "bigtodo".
>> >I wish to apply bigtodo, and would like to clarify whether bigtodo is
>> >based on the ext_todo patch, the big-todo patch, or both. I did not
>> >compile bigtodo initially because I thought it was experimental.
>> >What do you suggest?
>>Well. First, you have to tell us why you think you are hit by the "Silly
>>Qmail Syndrome". Any hints?
>>Second. Apart from the big-todo enhancement, my implementation of Andre
>>Oppermann's performance enhancements doesn't work well. After investing a
>>lot of time and testing, I didn't find any significant performance
>>improvement.
>>Note: The code in SPAMCONTROL is not the ext-big-todo; however, it is based
>>on Andre's first suggestion to influence qmail's scheduler for mail
>>processing, which was buggy by itself.
>>Third. The best thing is to avoid bounces to non-existing accounts.
>>Use my RECIPIENTS extension as part of Qmail or perhaps the "real-rcptto"
>>The forthcoming SPAMCONTROL version will include version 0.42 of the
>>RECIPIENTS extension; check my Qmail page (http://www.fehcom.de/qmail.html).
>Thanks for the nice reply.
>I am attaching "Queue Size" graph (5 Minute Average) updated Tuesday, 7
>September 2004 at 0:50 (EDT).
>You can see a quite high mail queue between 0400 and 1000 hrs (EDT).
>During that period smtpd is running at 100/100, but send is running at
>only local 3/15, remote 5/40. The "messages in queue but not yet
>preprocessed" count keeps growing wildly. When smtpd runs at 85/100
>it's all okay. This has started happening at almost every start of the
>week, when a huge volume of genuine and virus-infected customer mail
>starts pouring in.
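The backlog described above can be tracked from the two counters that qmail-qstat prints. The following is only a minimal sketch: the sample text and its numbers are made up for illustration, and the helper names (parse_qstat, todo_backlog_growing) are hypothetical, not part of qmail.

```python
import re

# Illustrative sample in the two-line format printed by qmail-qstat.
# The numbers are invented for this sketch, not taken from the server.
SAMPLE = """messages in queue: 400
messages in queue but not yet preprocessed: 120
"""

def parse_qstat(text):
    """Return (total, unprocessed) parsed from qmail-qstat style output."""
    total = re.search(r"^messages in queue:\s*(\d+)", text, re.M)
    todo = re.search(
        r"^messages in queue but not yet preprocessed:\s*(\d+)", text, re.M
    )
    if not total or not todo:
        raise ValueError("unexpected qmail-qstat output")
    return int(total.group(1)), int(todo.group(1))

def todo_backlog_growing(todo_counts, min_rise=3):
    """True if the unprocessed count rose in each of the last min_rise intervals."""
    recent = todo_counts[-(min_rise + 1):]
    if len(recent) < min_rise + 1:
        return False  # not enough samples to judge a trend
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))

total, todo = parse_qstat(SAMPLE)
print(total, todo)
print(todo_backlog_growing([10, 40, 90, todo]))
```

A steadily rising "not yet preprocessed" count while the total stays moderate would point at todo processing rather than at delivery itself.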
Ok. Until now, you did not tell us what hardware and network connection you have. Anyway. My experience using a 2*1G PIII and fast SCSI disks on FreeBSD shows some similar behavior.
It's Linux slsp-da4p21 2.4.18-18.7.x #1 [Red Hat Linux release 7.3 (Valhalla)]
Intel(R) Pentium(R) CPU 2.40GHz, cache size: 512 KB
RAM: 1GB, SWAP: 2GB
HDD: Barracuda 7200.7 (It's an IDE Drive)
Model Number: ST380011A
Capacity: 80 GB
Speed: 7200 rpm
Seek time: 8.5 ms avg
Interface: Ultra ATA/100

df -m
Filesystem           1M-blocks      Used Available Use% Mounted on
/dev/hda3                73990     16422     53810  24% /
/dev/hda1                  114         9        99   9% /boot
none                       441         0       440   0% /dev/shm

fdisk -l
Disk /dev/hda: 255 heads, 63 sectors, 9729 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1        15    120456   83  Linux
/dev/hda2            16       146   1052257+  82  Linux swap
/dev/hda3           147      9729  76975447+  83  Linux
The "/home/vpopmail/domains" and "/var/qmail/queue" both are on "/dev/hda3"
Network Card: Realtek|RTL-8139/8139C
The server is connected to a 100 Mbps network port limited to 10 Mbps (10 Mbps is equal to over 3 terabytes of traffic per month).
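The monthly figure can be checked with a short calculation. This is a sketch under stated assumptions: a fully saturated link, a 30-day month, and decimal units (1 TB = 10^12 bytes); the function name is hypothetical.

```python
def monthly_traffic_tb(link_mbps, days=30):
    """Upper bound on traffic, in decimal terabytes, over a saturated link."""
    bytes_per_second = link_mbps * 1_000_000 / 8   # megabits/s -> bytes/s
    seconds = days * 24 * 60 * 60
    return bytes_per_second * seconds / 1e12       # bytes -> TB

print(round(monthly_traffic_tb(10), 2))  # 3.24
```

So a 10 Mbps cap allows at most about 3.24 TB per month in one direction, which matches the "over 3 terabytes" figure.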
mii-tool -v
eth0: negotiated 10baseT-FD, link ok
  product info: vendor 00:00:00, model 0 rev 0
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  link partner: 10baseT-FD 10baseT-HD
I have not yet noticed any sign of a network bottleneck.
>I am not using the "RECIPIENTS extension", but am using "badrcptto" as a
>whitelisting mechanism, which works very well (it might be a bit slow
>because the lookup is done against a text database).
Ok. Good choice.
>I am also using the
>http://linux.voyager.hr/ucspi-tcp/tcpserver-limits-2004-07-25.diff patch to
>limit concurrent connections from a single IP. This helps identify
>virus-infected computers and deny them connections (it's a boon).
>I also have Caching-DNS on this Server (djbdns).
>About the todo patches, the comments of Dave Sill (of Qmail Handbook fame)
>are interesting to note in the thread:
>
>"Outbound email rate slows when inbound rate is high"
>http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&c2coff=1&threadm=e6c47de7.0310091325.147cade4%40posting.google.com&rnum=2&prev=/groups%3Fq%3Dext-todo%26hl%3Den%26lr%3D%26ie%3DUTF-8%26c2coff%3D1%26selm%3De6c47de7.0310091325.147cade4%2540posting.google.com%26rnum%3D2
Dave is right. No doubt.
>Also one can have a look at the thread
>"ext-todo and big-todo patches"
>http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&c2coff=1&threadm=wx0lm56pfo0.fsf%40sws5.ctd.ornl.gov&rnum=1&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26c2coff%3D1%26q%3Ddave%2Bsill%2Bext-todo%2Band%2Bbig-todo
>I tried to apply the patch
>http://www.nrg4u.com/qmail/ext_todo-20030105.patch over and above
>spamcontrol but it failed at:
>
>patch p1 < ext_todo-20030105.patch
>patching file p1
>patching file EXTTODO-INFO
>patching file FILES
>patching file Makefile
>Hunk #2 succeeded at 713 (offset 8 lines).
>Hunk #4 succeeded at 818 (offset 8 lines).
>Hunk #5 succeeded at 1598 (offset 87 lines).
>Hunk #6 succeeded at 1585 (offset 9 lines).
>Hunk #7 succeeded at 1694 (offset 87 lines).
>patching file TARGETS
>Hunk #1 succeeded at 405 (offset 20 lines).
>patching file hier.c
>Hunk #1 succeeded at 115 with fuzz 1 (offset 7 lines).
>patching file install-big.c
>patching file qmail-send.c
>Hunk #1 FAILED at 1215.
>Hunk #2 succeeded at 1527 (offset 88 lines).
>Hunk #3 succeeded at 1662 (offset 20 lines).
>Hunk #4 succeeded at 1797 (offset 88 lines).
>1 out of 4 hunks FAILED -- saving rejects to file qmail-send.c.rej
>patching file qmail-start.c
>patching file qmail-todo.c
Yes. I did not dare to include the ext-big-todo into SPAMCONTROL (yet).
Okay, that's your choice. But many of the SpamControl features are violations of RFCs. It is the individual user's choice whether to enable them or not.
It breaks the philosophy of Qmail.
But many people of repute have recommended separating the processing of "todo" from "qmail-send". You may also want to think about it. It's my polite recommendation too.
>It is also possible that my disk speed is contributing a bit to the
>bottleneck. Current iostat figures follow (not taken during the silly
>qmail syndrome; I will take those figures when I am hit again).
>iostat -d -x /dev/hda2 2
>Linux 2.4.18-18.7.x (slsp-da4p21)      09/07/2004
>
>Device:    rrqm/s wrqm/s   r/s   w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz  await  svctm %util
>/dev/hda2    0.03   2.27  1.85  0.37  14.98  21.73  7.49 10.86    16.57     0.13 208.66 171.08  3.79
>
>Device:    rrqm/s wrqm/s   r/s   w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz  await  svctm %util
>/dev/hda2    0.00   0.00  2.00  0.00  16.00   0.00  8.00  0.00     8.00     0.15  75.00  50.00  1.00
>
>Device:    rrqm/s wrqm/s   r/s   w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz  await  svctm %util
>/dev/hda2    0.00   0.00  0.00  0.00   0.00   0.00  0.00  0.00     0.00     0.00   0.00   0.00  0.00
>
>Device:    rrqm/s wrqm/s   r/s   w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz  await  svctm %util
>/dev/hda2    0.00   0.00  0.00  0.00   0.00   0.00  0.00  0.00     0.00     0.00   0.00   0.00  0.00
>
>I am getting more and more convinced that ext-todo might be a possible
>solution in this situation.
>What do you feel?
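When the iostat run is repeated during a backlog, the extended rows can be mapped onto their column names so that await and %util are easy to compare between samples. The sketch below assumes the 13-column layout quoted above; the helper name is hypothetical. One thing worth double-checking: the sample was taken on /dev/hda2, which the fdisk listing earlier in this mail shows as the swap partition, whereas the queue is stated to be on /dev/hda3.

```python
# Column names as they appear in the quoted "iostat -d -x" output.
HEADER = (
    "rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s "
    "avgrq-sz avgqu-sz await svctm %util"
).split()

def parse_iostat_row(row):
    """Split one extended iostat device row into (device, {column: value})."""
    device, *values = row.split()
    if len(values) != len(HEADER):
        raise ValueError("row does not match the expected 13 columns")
    return device, dict(zip(HEADER, map(float, values)))

# First device row from the output quoted above.
device, stats = parse_iostat_row(
    "/dev/hda2 0.03 2.27 1.85 0.37 14.98 21.73 7.49 10.86 "
    "16.57 0.13 208.66 171.08 3.79"
)
print(device, stats["await"], stats["%util"])
```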
Well. It's hard to tell, and certainly a "feeling" is not enough to solve your situation.
I know, "feeling" is not enough. But we do at times start the diagnosis with a hunch.
You have already done a lot of diagnosis and most of your attempts are ok (as far as I can tell).
Thank you (I am glad too that an expert says I am heading in the right direction).
As mentioned before, you have to explain a little
a) what hardware you are using
I have detailed the hardware at the beginning of this mail.
b) what is your average (and not mean) email volume
You can have some idea of email volume from the enclosed graph (msg-day.png).
c) what Anti-Virus and Anti-Spam tools are you using
AntiVirus is clamav-0.75.1 and AntiSpam is SpamAssassin-2.63, with a patched version of qmail-scanner, "Qmail-Scanner-1.23st (st patch)", from http://xoomer.virgilio.it/j.toribio/qmail-scanner/. This patched version of qmail-scanner is used to selectively enable AntiVirus/AntiSpam for only 20% of the domains. I am also using the "--sa-reject" option so that spam messages with a score higher than sa-delete (a score of 16 in my case) are rejected before the SMTP session is closed.
d) what is your general Qmail setup
What do you need to know in this case specifically?
e) what is the rate of inbound and outbound traffic
You may get some idea from the attached graphs Local-n-Remote-concurrency-day.png and Smtp-Concurrency-day.png.
From what I can see from your graph, you typically have around 400 emails in the queue.
Yes, that's the general trend.
Use my "newanalyse" package + qmailanalog and you get nice summary statistics.
I have thought several times about deploying these, but have not yet done so for two reasons: a) I am not sure how much extra load they would put on the server, and b) I did not get time to do it.
But now I will follow your advice. However, I am already using your modified "qmFind", a wonderful tool.
- It might be worthwhile to reduce the incoming concurrency. Drop it to 30.
Any figure less than 80 would cause a lot of servers to be refused an SMTP connection to our server during the peak time of 0100 to 0500 hrs EDT.
Actually, the result of this is the opposite of what it seems: it helps to better "arbitrate" qmail-smtpd and thus improves overall performance.
I do agree and understand this point, as I have tested it myself. But reducing "concurrencyincoming" too much would refuse SMTP connections to a lot of mail servers and even to clients using SMTP authentication.
- Most of the load presumably originates from I/O. You should try everything to reduce it. In particular, use SPAMCONTROL's badmimetype filter in the first place.
Thanks, I am already using the "badmimetype" filter; it does its job marvelously. BTW, I am also using LOCALMFCHECK, MFDNSCHECK, MAXRECIPIENTS="20", MAXCONNIP="5".
However, if you use the qmail-queue extension and, let's say, qmail-scanner, this won't help much (for reasons I will explain in SPAMCONTROL 2.3).
I will be eagerly waiting for your reasoning.
Use my QMVC instead.
I would study it. Thanks.
PS: Very interesting discussion.
I am very glad.
Once again thanks a lot.