system: 4-CPU Sun E450, solaris 9 (SunOS 5.9), gcc 3.4.3

before i start: i don't have a copy of gdb on this system, so i'm
unable to provide a debug log.

this system is fairly low load (after blocklists, something less than
50k messages per day) and has been running 0.88.7 since mid-january with
no problems. i tried an upgrade to 0.90 when it came out, hit many of
the same issues that other people were seeing, and backed out to 0.88.7.

recently, i tried installing 0.90.1 and ran into slightly different
issues. here's my initial clamd.conf:

LogTime yes
LogSyslog yes
LogFacility LOG_MAIL
TemporaryDirectory /export/home/clamav/tmp
LocalSocket /var/clamav/clamd.sock
FixStaleSocket yes
MaxConnectionQueueLength 32
StreamMaxLength 64M
MaxThreads 64
SelfCheck 3600
User clamav
ScanMail yes
ScanArchive yes

clam is started with the following code fragment:

   if [ -f /export/home/clamav/sbin/clamd -a -f /export/home/clamav/etc/clamd.conf ] ; then
           echo "clamd starting."
           /export/home/clamav/sbin/clamd >/dev/console 2>&1
   fi
   sleep 30
   if [ -f /export/home/clamav/sbin/clamav-milter ] ; then
           echo "clamav-milter starting."
           /export/home/clamav/sbin/clamav-milter -PHl --postmaster=root -m 64 \
                   --external /var/clamav/clmilter.sock >/dev/console 2>&1
   fi

the only change from the previous (0.88.7) startup is the addition of
the 'sleep' between clamd and clamav-milter.
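
the fixed 'sleep 30' only guesses at how long clamd needs to load its
database. a minimal sketch of an alternative is to poll for the
LocalSocket from clamd.conf before starting the milter. the function
name wait_for_socket and the 120-second timeout are hypothetical, and
the $(( )) arithmetic assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on
solaris rather than the old /bin/sh); this is untested on the system
described here.

```shell
#!/bin/sh
# wait_for_socket PATH SECONDS
# hypothetical helper: poll once a second until PATH exists as a unix
# socket, giving up after SECONDS. returns 0 if the socket appeared,
# 1 if it did not.
wait_for_socket() {
    sock=$1
    remaining=$2
    while [ "$remaining" -gt 0 ]; do
        if [ -S "$sock" ]; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    # one last check, so a timeout of 0 still reports the current state
    [ -S "$sock" ]
}

# possible use in place of 'sleep 30' (timeout value is an assumption):
#   if wait_for_socket /var/clamav/clamd.sock 120 ; then
#           echo "clamav-milter starting."
#           ...
#   else
#           echo "clamd socket never appeared; not starting clamav-milter."
#   fi
```

note that with FixStaleSocket a leftover socket file from an earlier run
could make the check succeed early, so this is only a rough readiness
test.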

with this setup, clam starts up, scans messages and finds Bad Things.
just as with 0.90, however, the system load grows over time, but seems
to grow more slowly. after about 15 minutes, CPU on a four-processor
system is pegged and the system load is about 9 and slowly
growing. (by this time with 0.90 i would have had a load of 40+ with
a probable clamd crash.)

as a second try, i've changed
  ScanArchive yes
to
  ScanArchive no
and restarted.

the cpu usage for clamd *seems* to bounce around more than it did, but
it seems to recover, at least in the short term. i let the system run
for about a day with no visible problems. at around the 2-2.5 day
mark, however, we started seeing errors in the logs:

Mar 30 20:07:23 sennit sendmail[11833]: [ID 801593 mail.error] l2V01r0J011833: Milter (clamav): timeout before data read
Mar 30 20:07:23 sennit sendmail[11833]: [ID 801593 mail.info] l2V01r0J011833: Milter (clamav): to error state

which repeated for a while, to be followed by:

Mar 30 20:18:59 sennit sendmail[12799]: [ID 801593 mail.error] l2V0Ixa9012799: Milter (clamav): error connecting to filter: Connection refused by /var/clamav/clmilter.sock
Mar 30 20:18:59 sennit sendmail[12799]: [ID 801593 mail.info] l2V0Ixa9012799: Milter (clamav): to error state

by this point, clmilter was using 100% of one CPU (as far as i could tell).


the next thing was to change clamd.conf in two ways:

ScanArchive yes
ArchiveMaxRecursion 1

with these settings, the system was stable for about five-to-six hours (albeit
with significantly higher load than with ScanArchive no), at which
point the load started to grow and clamd started consuming more
CPU. at the 7 hour point it was using about 100% of available
CPU and the load was approaching 20.
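
since there is no gdb on this box, one way to put numbers behind the
load growth described above is to log it at intervals. the following is
only a sketch: the function names load_1min and log_sample are
hypothetical, and it assumes uptime prints a "load average:" field and
that ps accepts -e and -o pcpu,vsz,comm (true of the POSIX-style ps, but
not checked on this system).

```shell
#!/bin/sh
# load_1min: read `uptime` output on stdin, print the 1-minute load average.
load_1min() {
    sed 's/.*load average[s]*: *\([0-9.]*\).*/\1/'
}

# log_sample: print one timestamped line with the current 1-minute load,
# followed by %cpu, memory size and command name for any clam* process.
log_sample() {
    echo "`date` load=`uptime | load_1min`"
    ps -e -o pcpu,vsz,comm | grep clam
}

# possible use, one sample a minute appended to a log (path is an assumption):
#   while : ; do log_sample >> /var/tmp/clam-load.log ; sleep 60 ; done
```

a log like this would at least show whether clamd or the milter is the
process whose CPU use climbs first.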

at the moment, i'm probably going to back out to 0.88.7. before i do,
are there any other suggestions folks might have as to things to try?

rp

rick pim                                           [EMAIL PROTECTED]
information technology services                          (613) 533-2242
queen's university, kingston   
-----------------------------------------------------------------------
"Leaving a trail of slime wherev-"
           >CLICK!<