[gentoo-user] Insane load on gentoo server - possibly clamassassin related?
Today my gentoo server, which has sat happily churning through my mundane (and lightweight) tasks, froze. I noticed when it stopped serving DNS queries... and the server was even unresponsive from the command prompt. I rebooted and was a bit taken aback at what I found.

The server currently runs, but has a load of over 60, where I'd expect a load of below 0.1. Investigations using top did not suggest that a single process was using vast amounts of processing time... but there were significantly more clamscan processes than I'd expect... and even more procmail processes.

--
$ ps auwx | grep clamscan | grep -v grep | wc -l
42
$ ps auwx | grep procmail | grep -v grep | wc -l
94
$ ps auwx | grep clamassassin | grep -v grep | wc -l
55
--

The first few lines from top say:

--
  PID USER PR  NI  VIRT  RES SHR S %CPU %MEM   TIME+ COMMAND
15451 usr  20   0 35944  33m 872 D  2.7  3.3 0:00.60 clamscan
  216 root 15  -5     0    0   0 S  0.7  0.0 0:03.80 kswapd0
15116 usr  20   0 76136  15m 668 D  0.7  1.6 0:03.30 clamscan
15299 usr  20   0  2584 1224 840 R  0.7  0.1 0:04.36 top
15428 usr  20   0 61288  57m 872 D  0.7  5.7 0:01.38 clamscan
    1 root 20   0  1648  196 172 S  0.0  0.0 0:00.64 init
    2 root 15  -5     0    0   0 S  0.0  0.0 0:00.00 kthreadd
--

The procmail configuration I've adopted hasn't changed in years...

--
DEFAULT=$HOME/.maildir/
SHELL=/bin/sh
MAILDIR=$HOME/.maildir

:0fw
* < 1024000
| /usr/bin/clamassassin | /usr/bin/spamc -f
--

I'm assuming that my sudden problems have something to do with an update to clamd/clamassassin... I've a vague recollection that one or the other of them might have been updated when I last synchronised and emerged updates... but I can't remember.

Any ideas? This isn't a heavily loaded server usually - I've more procmail processes than I usually receive emails in an hour. Something's wrong - can anyone offer any hints? Has anyone else run into this problem? Is there a known 'quick fix'?
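The counts in this message come from the familiar ps | grep | grep -v grep | wc -l idiom. As a rough sketch (not something from the original post), the same information can be gathered in one pass and grouped by process state, which makes it easier to see how many processes sit in uninterruptible sleep (state D, as in the top output). The function name summarise_procs is made up for illustration, and the sketch assumes a ps that accepts the standard -o stat= and -o comm= format options.

```shell
#!/bin/sh
# summarise_procs: hypothetical helper, not part of any package.
# Reads lines of the form "<STAT> <COMMAND>" on stdin and prints
# "<count> <first letter of STAT> <COMMAND>", highest count first.
# Only the first word of the command name is used.
summarise_procs() {
    awk '{ key = substr($1, 1, 1) " " $2; count[key]++ }
         END { for (k in count) print count[k], k }' | sort -k1,1nr -k2
}

# Live usage (separate -o options avoid ambiguity in how headers are parsed):
#   ps -e -o stat= -o comm= | summarise_procs
```

A large number of D-state entries for a single command, as opposed to R-state ones, would point at processes waiting on disk or swap rather than competing for CPU.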
Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?
On Monday 29 June 2009 19:04:44 Steve wrote:
> Today my gentoo server that has sat happily churning my mundane (and lightweight) tasks froze and I noticed when it stopped serving DNS queries... and the server was even unresponsive from the command prompt. I rebooted and was a bit taken aback at what I found.
>
> The server currently runs, but has a load of over 60, where I'd expect a load of below 0.1. Investigations using top did not suggest that a single process was using vast amounts of processing time... but there were significantly more clamscan processes than I'd expect... and even more procmail processes.
>
> --
> $ ps auwx | grep clamscan | grep -v grep | wc -l
> 42
> $ ps auwx | grep procmail | grep -v grep | wc -l
> 94
> $ ps auwx | grep clamassassin | grep -v grep | wc -l
> 55
> --
>
> The first few lines from top say:
>
> --
>   PID USER PR  NI  VIRT  RES SHR S %CPU %MEM   TIME+ COMMAND
> 15451 usr  20   0 35944  33m 872 D  2.7  3.3 0:00.60 clamscan
>   216 root 15  -5     0    0   0 S  0.7  0.0 0:03.80 kswapd0
> 15116 usr  20   0 76136  15m 668 D  0.7  1.6 0:03.30 clamscan
> 15299 usr  20   0  2584 1224 840 R  0.7  0.1 0:04.36 top
> 15428 usr  20   0 61288  57m 872 D  0.7  5.7 0:01.38 clamscan
>     1 root 20   0  1648  196 172 S  0.0  0.0 0:00.64 init
>     2 root 15  -5     0    0   0 S  0.0  0.0 0:00.00 kthreadd
> --
>
> The procmail configuration I've adopted hasn't changed in years...
>
> --
> DEFAULT=$HOME/.maildir/
> SHELL=/bin/sh
> MAILDIR=$HOME/.maildir
>
> :0fw
> * < 1024000
> | /usr/bin/clamassassin | /usr/bin/spamc -f
> --
>
> I'm assuming that my suddenly starting to have problems with this is something to do with an update to clamd/clamassassin... I've a vague recollection that one or the other of them might have been updated when I last synchronised and emerged updates... but I can't remember.
>
> Any ideas? This isn't a heavily loaded server usually - I've more procmail processes than I usually receive in emails in an hour. Something's wrong - can anyone offer any hints? Has anyone else run into this problem? Is there a known 'quick fix'?
Looks like you have 200 processes sitting there blocking I/O. Is there anything related in the logs?

Your best bet is to examine emerge.log (better still - genlop) and find all recent upgrades that might affect this. Then roll them back one by one till the problem goes away. Once you know the errant package, we can start to examine diffs and see why it might behave like that.

-- 
alan dot mckinnon at gmail dot com
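For readers who want to follow the emerge.log suggestion by hand, here is a minimal sketch. It assumes the usual Portage log layout, in which each finished merge is recorded as a line of the form "<epoch>:  ::: completed emerge (N of M) category/package-version to /", and that the log lives at /var/log/emerge.log. The function name recent_merges is invented for this example; genlop presents the same data more conveniently.

```shell
#!/bin/sh
# recent_merges SINCE_EPOCH [LOGFILE]
# Hypothetical helper: prints "<epoch> <category/package-version>" for every
# merge completed at or after SINCE_EPOCH. The log line format is an assumption.
recent_merges() {
    since=$1
    log=${2:-/var/log/emerge.log}
    awk -v since="$since" -F: '
        $1 >= since && /::: completed emerge/ {
            # Split the whole line on whitespace; the package is the
            # third word from the end ("... cat/pkg-ver to /").
            n = split($0, w, " ")
            print $1, w[n - 2]
        }' "$log"
}

# Example (needs GNU date for the -d option):
#   recent_merges "$(date -d '2 days ago' +%s)"
```

The resulting list is the set of candidates to roll back one at a time.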
Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?
Steve wrote:
> --
> $ ps auwx | grep clamscan | grep -v grep | wc -l
> 42
> $ ps auwx | grep procmail | grep -v grep | wc -l
> 94
> $ ps auwx | grep clamassassin | grep -v grep | wc -l
> 55
> --
>
> The first few lines from top say:
>
> --
>   PID USER PR  NI  VIRT  RES SHR S %CPU %MEM   TIME+ COMMAND
> 15451 usr  20   0 35944  33m 872 D  2.7  3.3 0:00.60 clamscan
>   216 root 15  -5     0    0   0 S  0.7  0.0 0:03.80 kswapd0
> 15116 usr  20   0 76136  15m 668 D  0.7  1.6 0:03.30 clamscan
> 15299 usr  20   0  2584 1224 840 R  0.7  0.1 0:04.36 top
> 15428 usr  20   0 61288  57m 872 D  0.7  5.7 0:01.38 clamscan
>     1 root 20   0  1648  196 172 S  0.0  0.0 0:00.64 init
>     2 root 15  -5     0    0   0 S  0.0  0.0 0:00.00 kthreadd
> --
>
> The procmail configuration I've adopted hasn't changed in years...
>
> --
> DEFAULT=$HOME/.maildir/
> SHELL=/bin/sh
> MAILDIR=$HOME/.maildir
>
> :0fw
> * < 1024000
> | /usr/bin/clamassassin | /usr/bin/spamc -f
> --

Might be a bug in clamd/spamassassin. But it could also be you are being mail-bombed (e.g. infinite depth of compressed-in-compressed attachments).

I recommend including some limit on the number of clamd/spamassassin instances. Don't know if procmail has such a capability, but it is easy to control it with wrappers like amavisd-new or MailScanner...

Jarry

-- 
___
This mailbox accepts e-mails only from selected mailing-lists! Everything else is considered to be spam and therefore deleted.
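One way to cap the number of scanner instances without switching to amavisd-new or MailScanner is to run the filter under a lock. The following is only a sketch under stated assumptions: it relies on flock(1) from util-linux being installed, the function name run_serialised and the lock path in the usage comment are hypothetical, and it limits concurrency to one instance at a time instead of a configurable number.

```shell
#!/bin/sh
# run_serialised LOCKFILE COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND while holding an exclusive lock on
# LOCKFILE, so only one instance runs at a time. stdin and stdout are
# passed straight through, which is what a procmail filter needs.
run_serialised() {
    lock=$1
    shift
    # -w 300: give up (non-zero exit) after waiting 300 seconds for the lock
    # instead of letting waiters pile up indefinitely.
    flock -w 300 "$lock" "$@"
}

# Possible use from a small script called by the procmail recipe
# (lock path is an assumption):
#   run_serialised /var/lock/clamassassin.lock /usr/bin/clamassassin
```

procmail can also serialise a recipe itself through a local lockfile, written as a trailing colon (optionally followed by a lockfile name) after the flags on the recipe line; whether that is preferable here depends on how mail should be handled while the lock is busy.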
Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?
Alan McKinnon wrote:
> Looks like you have 200 processes sitting there blocking I/O. Is there anything related in the logs?

Not sure - as I'm not sure where to look, or what to look for.

> Your best bet is to examine emerge.log (better still - genlop) and find all recent upgrades that might affect this. Then roll them back one by one till the problem goes away. Once you know the errant package, we can start to examine diffs and see why it might behave like that.

The only relevant package seems to be clamav... my emerge.log shows that I upgraded 8 packages yesterday just before 5pm - and the second of these was app-antivirus/clamav-0.95.2 - I think I simply chose to use the new configurations after issuing a dispatch-conf... I didn't do anything 'adventurous'.

Perhaps this might be something to do with a long-forgotten hack for clamassassin to work with clamd that might have been overwritten... (changing CLAMSCAN=/usr/bin/clamscan to CLAMSCAN=/usr/bin/clamdscan in /usr/bin/clamassassin) but this seems odd - since the date on clamassassin is 7 September 2008... and this problem with my server is very recent - it was working fine yesterday... and clamassassin hasn't been re-installed since everything worked fine - only clamav was emerged.

As an interim hack, I've removed /usr/bin/clamassassin from my global procmailrc; stopped spamd; killed all the procmail and clamscan processes - and restarted postfix. This has left me with an operational server with which I can interact.

It would seem very strange if I'm the only person having trouble with clamscan... in the context of what (I think) is a fairly standard postfix install.
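Whether the CLAMSCAN hack mentioned in this message is still in place can be checked directly instead of inferred from file dates. A small sketch, assuming the setting is a plain shell assignment at the start of a line in /usr/bin/clamassassin as the message describes; the function name configured_scanner is made up for illustration.

```shell
#!/bin/sh
# configured_scanner [SCRIPT]
# Hypothetical helper: prints the value of the first uncommented
# CLAMSCAN= assignment found in SCRIPT (default: /usr/bin/clamassassin).
# Any quotes around the value are printed as they appear in the file.
configured_scanner() {
    script=${1:-/usr/bin/clamassassin}
    sed -n 's/^[[:space:]]*CLAMSCAN=//p' "$script" | head -n 1
}

# Example:
#   configured_scanner    # prints the scanner path the script is set to use
```

If the value names clamscan, every message starts a standalone scanner that loads the signature database itself; if it names clamdscan, scanning is handed to the running clamd daemon.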
Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?
Jarry wrote:
> Might be a bug in clamd/spamassassin. But it could also be you are being mail-bombed (e.g. infinite depth of compressed-in-compressed attachments).

I thought about that - but I can't find an offending email with a bogus attachment if I am.

> I recommend including some limit on the number of clamd/spamassassin instances. Don't know if procmail has such a capability, but it is easy to control it with wrappers like amavisd-new or MailScanner...

I'd assumed that clamassassin would take care of this with some sensible defaults for me... My default clamd.conf says:

--
# Maximum depth directories are scanned at.
# Default: 15
#MaxDirectoryRecursion 20
--

So, I'd imagine that would take care of this... Conversely, it did seem a bit strange that clamassassin was configured to use clamscan, not clamdscan (which would have made more sense to me), but it had been configured that way for a very long time according to the file dates, and it's only recently that things went awry for me...

My procmailrc is simply how I wire in my mail delivery filters. I'd expect the filters themselves to behave sensibly... Though it came as a bit of a shock to see that my postfix user had as many processes spawned as it did... I'd always thought that the purpose of postfix was to queue mail so that it could be processed sequentially, to avoid this sort of problem...
Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?
On Monday 29 June 2009 19:44:49 Steve wrote:
> Alan McKinnon wrote:
> > Looks like you have 200 processes sitting there blocking I/O. Is there anything related in the logs?
>
> Not sure - as I'm not sure where to look, or what to look for.
>
> > Your best bet is to examine emerge.log (better still - genlop) and find all recent upgrades that might affect this. Then roll them back one by one till the problem goes away. Once you know the errant package, we can start to examine diffs and see why it might behave like that.
>
> The only relevant package seems to be clamav... my emerge.log shows that I upgraded 8 packages yesterday just before 5pm - and the second of these was app-antivirus/clamav-0.95.2 - I think I simply chose to use the new configurations after issuing a dispatch-conf... I didn't do anything 'adventurous'.
>
> Perhaps this might be something to do with a long-forgotten hack for clamassassin to work with clamd that might have been overwritten... (changing CLAMSCAN=/usr/bin/clamscan to CLAMSCAN=/usr/bin/clamdscan in /usr/bin/clamassassin) but this seems odd - since the date on clamassassin is 7 September 2008... and this problem with my server is very recent - it was working fine yesterday... and clamassassin hasn't been re-installed since everything worked fine - only clamav was emerged.
>
> As an interim hack, I've removed /usr/bin/clamassassin from my global procmailrc; stopped spamd; killed all the procmail and clamscan processes - and restarted postfix. This has left me with an operational server with which I can interact.
>
> It would seem very strange if I'm the only person having trouble with clamscan... in the context of what (I think) is a fairly standard postfix install.

That looks sane enough. I guess now you get to keep an eye on it for a few days.

-- 
alan dot mckinnon at gmail dot com