[gentoo-user] Insane load on gentoo server - possibly clamassassin related?

2009-06-29 Thread Steve
Today my gentoo server, which has sat happily churning through my
mundane (and lightweight) tasks, froze. I noticed when it stopped
serving DNS queries, and the server was unresponsive even at the command
prompt.  I rebooted and was a bit taken aback at what I found.


The server currently runs, but has a load of over 60, where I'd expect a 
load of below 0.1.  Investigations using top did not suggest that a 
single process was using vast amounts of processing time... but there 
were significantly more clamscan processes than I'd expect... and even 
more procmail processes.


--
$ ps auwx | grep clamscan | grep -v grep | wc -l
42
$ ps auwx | grep procmail | grep -v grep | wc -l
94
$ ps auwx | grep clamassassin | grep -v grep | wc -l
55
--
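
The same counts can be had without the "grep -v grep" step by matching
on the command name as ps reports it. A minimal sketch, assuming a
procps-style ps that accepts "-eo comm="; count_procs is only an
illustrative name, and it assumes each script shows up under its own
name in the comm column:

```shell
# Print "<count> <name>" for every command name given as an argument.
count_procs() {
    names=$(ps -eo comm=)              # one bare command name per line
    for name in "$@"; do
        # -x matches whole lines only, -c prints the number of matches
        n=$(printf '%s\n' "$names" | grep -cx -- "$name")
        printf '%s %s\n' "$n" "$name"
    done
}

# Example: count_procs clamscan procmail clamassassin
```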

The first few lines from top say:

--
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
15451 usr   20   0 35944  33m  872 D  2.7  3.3   0:00.60 clamscan
  216 root  15  -5     0    0    0 S  0.7  0.0   0:03.80 kswapd0
15116 usr   20   0 76136  15m  668 D  0.7  1.6   0:03.30 clamscan
15299 usr   20   0  2584 1224  840 R  0.7  0.1   0:04.36 top
15428 usr   20   0 61288  57m  872 D  0.7  5.7   0:01.38 clamscan
    1 root  20   0  1648  196  172 S  0.0  0.0   0:00.64 init
    2 root  15  -5     0    0    0 S  0.0  0.0   0:00.00 kthreadd
--

The procmail configuration I've adopted hasn't changed in years...
--
DEFAULT=$HOME/.maildir/
SHELL=/bin/sh
MAILDIR=$HOME/.maildir

:0fw
* < 1024000
| /usr/bin/clamassassin | /usr/bin/spamc -f
--

I'm assuming that my suddenly starting to have problems with this is 
something to do with an update to clamd/clamassassin...  I've a vague 
recollection that one or the other of them might have been updated when 
I last synchronised and emerged updates... but I can't remember.


Any ideas?  This isn't a heavily loaded server usually - I've more 
procmail processes than I usually receive in emails in an hour.  
Something's wrong - can anyone offer any hints?  Has anyone else run 
into this problem?  Is there a known 'quick fix'?





Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?

2009-06-29 Thread Alan McKinnon
On Monday 29 June 2009 19:04:44 Steve wrote:
> Today my gentoo server that has sat happily churning my mundane (and
> lightweight) tasks froze and I noticed when it stopped serving DNS
> queries... and the server was even unresponsive from the command
> prompt.  I rebooted and was a bit taken aback at what I found.
>
> The server currently runs, but has a load of over 60, where I'd expect a
> load of below 0.1.  Investigations using top did not suggest that a
> single process was using vast amounts of processing time... but there
> were significantly more clamscan processes than I'd expect... and even
> more procmail processes
>
> --
> $ ps auwx | grep clamscan | grep -v grep | wc -l
> 42
> $ ps auwx | grep procmail | grep -v grep | wc -l
> 94
> $ ps auwx | grep clamassassin | grep -v grep | wc -l
> 55
> --
>
> The first few lines from top say:
>
> --
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> 15451 usr   20   0 35944  33m  872 D  2.7  3.3   0:00.60 clamscan
>   216 root  15  -5     0    0    0 S  0.7  0.0   0:03.80 kswapd0
> 15116 usr   20   0 76136  15m  668 D  0.7  1.6   0:03.30 clamscan
> 15299 usr   20   0  2584 1224  840 R  0.7  0.1   0:04.36 top
> 15428 usr   20   0 61288  57m  872 D  0.7  5.7   0:01.38 clamscan
>     1 root  20   0  1648  196  172 S  0.0  0.0   0:00.64 init
>     2 root  15  -5     0    0    0 S  0.0  0.0   0:00.00 kthreadd
> --
>
> The procmail configuration I've adopted hasn't changed in years...
> --
> DEFAULT=$HOME/.maildir/
> SHELL=/bin/sh
> MAILDIR=$HOME/.maildir
>
> :0fw
> * < 1024000
> | /usr/bin/clamassassin | /usr/bin/spamc -f
> --
>
> I'm assuming that my suddenly starting to have problems with this is
> something to do with an update to clamd/clamassassin...  I've a vague
> recollection that one or the other of them might have been updated when
> I last synchronised and emerged updates... but I can't remember.
>
> Any ideas?  This isn't a heavily loaded server usually - I've more
> procmail processes than I usually receive in emails in an hour.
> Something's wrong - can anyone offer any hints?  Has anyone else run
> into this problem?  Is there a known 'quick fix'?

Looks like you have 200 processes sitting there blocking I/O. Is there 
anything related in the logs?
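
On Linux the load average counts tasks in uninterruptible sleep (state
D in top and ps) as well as runnable ones, which is how the load can
sit at 60 with almost no CPU in use. A rough sketch for picking those
processes out; filter_blocked and blocked_procs are only illustrative
names, and the column order is whatever the ps call below asks for:

```shell
# Read "state pid user command" lines on stdin and print the pid and
# command of every process whose state starts with D.
filter_blocked() {
    awk '$1 ~ /^D/ { print $2, $4 }'
}

# Feed the filter from the live process table.
blocked_procs() {
    ps -eo state=,pid=,user=,comm= | filter_blocked
}

# Example: blocked_procs | wc -l ; cat /proc/loadavg
```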

Your best bet is to examine emerge.log (better still - genlop) and find all 
recent upgrades that might affect this. Then roll them back one by one till 
the problem goes away. Once you know the errant package, we can start to 
examine diffs and see why it might behave like that.
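
If genlop isn't installed, the finished merges can be pulled straight
out of emerge.log. A small sketch, assuming the usual log lines of the
form "<unix-time>:  ::: completed emerge (N of M) <package> to /";
recent_merges is a made-up helper name:

```shell
# $1 = path to emerge.log, $2 = cutoff as a Unix timestamp.
# Prints "<timestamp> <category/package-version>" for merges at or
# after the cutoff.
recent_merges() {
    awk -v since="$2" '
        /::: completed emerge/ {
            ts = $1
            sub(/:$/, "", ts)            # strip the trailing colon
            if (ts + 0 >= since + 0)
                print ts, $(NF - 2)      # package is third from the end
        }' "$1"
}

# Example (GNU date):
#   recent_merges /var/log/emerge.log "$(date -d '2 days ago' +%s)"
```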

-- 
alan dot mckinnon at gmail dot com



Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?

2009-06-29 Thread Jarry

Steve wrote:


> $ ps auwx | grep clamscan | grep -v grep | wc -l
> 42
> $ ps auwx | grep procmail | grep -v grep | wc -l
> 94
> $ ps auwx | grep clamassassin | grep -v grep | wc -l
> 55
> --
>
> The first few lines from top say:
>
> --
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> 15451 usr   20   0 35944  33m  872 D  2.7  3.3   0:00.60 clamscan
>   216 root  15  -5     0    0    0 S  0.7  0.0   0:03.80 kswapd0
> 15116 usr   20   0 76136  15m  668 D  0.7  1.6   0:03.30 clamscan
> 15299 usr   20   0  2584 1224  840 R  0.7  0.1   0:04.36 top
> 15428 usr   20   0 61288  57m  872 D  0.7  5.7   0:01.38 clamscan
>     1 root  20   0  1648  196  172 S  0.0  0.0   0:00.64 init
>     2 root  15  -5     0    0    0 S  0.0  0.0   0:00.00 kthreadd
> --
>
> The procmail configuration I've adopted hasn't changed in years...
> --
> DEFAULT=$HOME/.maildir/
> SHELL=/bin/sh
> MAILDIR=$HOME/.maildir
>
> :0fw
> * < 1024000
> | /usr/bin/clamassassin | /usr/bin/spamc -f
> --


Might be a bug in clamd/spamassassin. But it could also be that you
are being mail-bombed (e.g. infinite depth of compressed-in-compressed
attachments).

I recommend putting some limit on the number of clamd/spamassassin
instances. I don't know if procmail has such a capability, but it is
easy to control with wrappers like amavisd-new or MailScanner...
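
Short of a full wrapper, one crude way to cap the scanners at a single
instance is to run the filter under a lock. A sketch only, assuming
flock(1) from util-linux is available; the lock path and the
run_serialized name are hypothetical. If the wait times out, flock
exits non-zero without running the command, so it is worth checking how
the recipe treats a failed filter before relying on this. procmail's
own local lockfiles (a trailing colon on the recipe's flag line) aim at
the same thing.

```shell
# Hypothetical lock location; any path writable by the delivering user
# would do.
LOCKFILE="${LOCKFILE:-/tmp/mailscan.lock}"

# Run the given command while holding an exclusive lock on LOCKFILE,
# waiting at most 300 seconds for it. stdin and stdout pass straight
# through, so this can sit in a filter pipeline.
run_serialized() {
    flock -w 300 "$LOCKFILE" "$@"
}

# Example: run_serialized /usr/bin/clamassassin < message > filtered
```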

Jarry




Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?

2009-06-29 Thread Steve

Alan McKinnon wrote:
> Looks like you have 200 processes sitting there blocking I/O. Is there
> anything related in the logs?

Not sure - as I'm not sure where to look, or what to look for.

> Your best bet is to examine emerge.log (better still - genlop) and find all
> recent upgrades that might affect this. Then roll them back one by one till
> the problem goes away. Once you know the errant package, we can start to
> examine diffs and see why it might behave like that.

The only relevant package seems to be clamav... my emerge.log shows that 
I upgraded 8 packages yesterday just before 5pm - and the second of 
these was app-antivirus/clamav-0.95.2 - I think I simply chose to use 
the new configurations after running dispatch-conf... I didn't do 
anything 'adventurous'.


Perhaps this might be something to do with a long-forgotten hack for 
clamassassin to work with clamd that might have been overwritten...  
(changing CLAMSCAN=/usr/bin/clamscan to CLAMSCAN=/usr/bin/clamdscan in 
/usr/bin/clamassassin) but this seems odd - since the date on 
clamassassin is 7 September 2008... and this problem with my server is 
very recent - it was working fine yesterday... and clamassassin hasn't 
been re-installed since everything worked fine - only clamav was emerged.
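
The difference matters because clamscan loads the whole signature
database on every run, whereas clamdscan hands the message to the
already-running clamd. A quick check of which one the script is
currently set to use; scanner_setting is only an illustrative name, and
the default path is the one mentioned above:

```shell
# Print the active (uncommented) CLAMSCAN= assignment from the script.
scanner_setting() {
    grep -E '^[[:space:]]*CLAMSCAN=' "${1:-/usr/bin/clamassassin}"
}

# Example: scanner_setting /usr/bin/clamassassin
```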


As an interim hack, I've removed /usr/bin/clamassassin from my global 
procmailrc; stopped spamd; killed all the procmail and clamscan 
processes - and restarted postfix.  This has left me with an operational 
server with which I can interact.  It would seem very strange if I'm the 
only person having trouble with clamscan... in the context of what (I 
think) is a fairly standard postfix install.






Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?

2009-06-29 Thread Steve

Jarry wrote:

> Might be a bug in clamd/spamassassin. But it could also be that you
> are being mail-bombed (e.g. infinite depth of compressed-in-compressed
> attachments).

I thought about that - but I can't find an offending email with a bogus 
attachment if I am.

> I recommend putting some limit on the number of clamd/spamassassin
> instances. I don't know if procmail has such a capability, but it is
> easy to control with wrappers like amavisd-new or MailScanner...

I'd assumed that clamassassin would take care of this with some sensible 
defaults for me...


My default clamd.conf says:

--
# Maximum depth directories are scanned at.
# Default: 15
#MaxDirectoryRecursion 20
--

So, I'd imagine that would take care of this... conversely - it did seem 
a bit strange that clamassassin was configured to use clamscan not 
clamdscan (which would have made more sense to me) but it had been 
configured that way for a very long time according to the file-dates and 
it's only recently that things went awry for me...
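
Two caveats on that excerpt, as far as I understand it: the line is
commented out, so it only documents a default, and
MaxDirectoryRecursion is about directory depth, while nesting inside
archives is governed by MaxRecursion. clamscan also takes its limits
from the command line rather than from clamd.conf. A small sketch for
listing what is actually set in the file; active_settings is a made-up
name and /etc/clamd.conf is an assumed location:

```shell
# Print only the lines that are neither comments nor blank.
active_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "${1:-/etc/clamd.conf}"
}

# Example: active_settings /etc/clamd.conf | grep -i recursion
```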


My procmailrc is simply how I wire in my mail delivery filters.  I'd 
expect the filters themselves to behave sensibly...  Though it came as a 
bit of a shock to see that my postfix user had as many processes spawned 
as it did... I'd always thought that the purpose of postfix was to queue 
mail in order that it could be processed sequentially in order to avoid 
this sort of problem...





Re: [gentoo-user] Insane load on gentoo server - possibly clamassassin related?

2009-06-29 Thread Alan McKinnon
On Monday 29 June 2009 19:44:49 Steve wrote:
> Alan McKinnon wrote:
> > Looks like you have 200 processes sitting there blocking I/O. Is there
> > anything related in the logs?
>
> Not sure - as I'm not sure where to look, or what to look for.
>
> > Your best bet is to examine emerge.log (better still - genlop) and find
> > all recent upgrades that might affect this. Then roll them back one by
> > one till the problem goes away. Once you know the errant package, we can
> > start to examine diffs and see why it might behave like that.
>
> The only relevant package seems to be clamav... my emerge.log shows that
> I upgraded 8 packages yesterday just before 5pm - and the second of
> these was app-antivirus/clamav-0.95.2 - I think I simply chose to use
> the new configurations after running dispatch-conf... I didn't do
> anything 'adventurous'.
>
> Perhaps this might be something to do with a long-forgotten hack for
> clamassassin to work with clamd that might have been overwritten...
> (changing CLAMSCAN=/usr/bin/clamscan to CLAMSCAN=/usr/bin/clamdscan in
> /usr/bin/clamassassin) but this seems odd - since the date on
> clamassassin is 7 September 2008... and this problem with my server is
> very recent - it was working fine yesterday... and clamassassin hasn't
> been re-installed since everything worked fine - only clamav was emerged.
>
> As an interim hack, I've removed /usr/bin/clamassassin from my global
> procmailrc; stopped spamd; killed all the procmail and clamscan
> processes - and restarted postfix.  This has left me with an operational
> server with which I can interact.  It would seem very strange if I'm the
> only person having trouble with clamscan... in the context of what (I
> think) is a fairly standard postfix install.

That looks sane enough. I guess now you get to keep an eye on it for a few 
days.



-- 
alan dot mckinnon at gmail dot com