Bug#939850: clamav-daemon: timeout on /run/clamav/clamd.ctl, too slow to reload daily.ldb rules

2019-11-09 Thread Baptiste BEAUPLAT
Hello,

For anyone looking for a workaround with Exim, I have been using the
`tmo` option of the `malware` ACL condition. This allows customizing
the timeout for the AV scanner.

diff --git a/exim4/check_data_local_acl b/exim4/check_data_local_acl
index 538f3a9..4013b35 100644
--- a/exim4/check_data_local_acl
+++ b/exim4/check_data_local_acl
@@ -13,7 +13,7 @@ deny

 # Antivirus scan
 deny
-malware = *
+malware = * / tmo=5m
 message = This message was detected as possible malware
($malware_name).

This option is documented here:
https://www.exim.org/exim-html-current/doc/html/spec_html/ch-content_scanning_at_acl_time.html

-- 
Baptiste BEAUPLAT - lyknode





Bug#939850: clamav-daemon: timeout on /run/clamav/clamd.ctl, too slow to reload daily.ldb rules

2019-09-09 Thread Baptiste BEAUPLAT
Package: clamav-daemon
Severity: important

Dear maintainer,

Since a couple of weeks ago, I have been getting the following errors
from my MTA, Exim:

From /var/log/exim4/paniclog: 

malware acl condition: clamd /var/run/clamav/clamd.ctl : unable to read from socket (Connection timed out)

I'm on buster with an up-to-date box, using the following clamav
versions:

clamav   0.101.4+dfsg-0+deb10u1   amd64
clamav-base  0.101.4+dfsg-0+deb10u1   all 
clamav-daemon0.101.4+dfsg-0+deb10u1   amd64
clamav-freshclam 0.101.4+dfsg-0+deb10u1   amd64
libclamav9:amd64 0.101.4+dfsg-0+deb10u1   amd64

Going through clamav's logs, I found out that the socket becomes
unavailable while clamav is reloading its database. See bug #432334[1].

Usually this is not a problem, since the reload is quite quick.
However, on my server it currently takes up to two and a half minutes.

See logs:

Aug 25 09:02:44 lyra clamd[892]: Sun Aug 25 09:02:44 2019 -> Reading databases from /var/lib/clamav
Aug 25 09:05:29 lyra clamd[892]: Sun Aug 25 09:05:29 2019 -> Database correctly reloaded (6291419 signatures)
Aug 26 09:05:21 lyra clamd[892]: Mon Aug 26 09:05:21 2019 -> Reading databases from /var/lib/clamav
Aug 26 09:08:09 lyra clamd[892]: Mon Aug 26 09:08:09 2019 -> Database correctly reloaded (6292270 signatures)
Aug 27 09:07:57 lyra clamd[892]: Tue Aug 27 09:07:57 2019 -> Reading databases from /var/lib/clamav
Aug 27 09:10:47 lyra clamd[892]: Tue Aug 27 09:10:47 2019 -> Database correctly reloaded (6293807 signatures)
Aug 28 09:10:36 lyra clamd[892]: Wed Aug 28 09:10:36 2019 -> Reading databases from /var/lib/clamav
Aug 28 09:13:26 lyra clamd[892]: Wed Aug 28 09:13:26 2019 -> Database correctly reloaded (6295218 signatures)
Aug 29 09:13:16 lyra clamd[892]: Thu Aug 29 09:13:16 2019 -> Reading databases from /var/lib/clamav
Aug 29 09:16:07 lyra clamd[892]: Thu Aug 29 09:16:07 2019 -> Database correctly reloaded (6296694 signatures)
Aug 30 09:15:55 lyra clamd[892]: Fri Aug 30 09:15:55 2019 -> Reading databases from /var/lib/clamav
Aug 30 09:18:48 lyra clamd[892]: Fri Aug 30 09:18:48 2019 -> Database correctly reloaded (629 signatures)
Aug 31 09:18:38 lyra clamd[892]: Sat Aug 31 09:18:38 2019 -> Reading databases from /var/lib/clamav
Aug 31 09:21:32 lyra clamd[892]: Sat Aug 31 09:21:32 2019 -> Database correctly reloaded (6300225 signatures)

(today's logs are the same)
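
One way to observe the outage window directly is to poll the socket
with the clamd PING command, which a responsive daemon answers with
PONG (see clamd(8)). This is only a sketch: it assumes socat is
installed, the `probe_clamd` helper name is mine, and the socket path
is the Debian default.

```shell
# Probe the clamd control socket once and report whether it answers.
# A daemon that is accepting connections replies "PONG" to "PING";
# during the database reload the connection times out instead.
probe_clamd() {
    sock="${1:-/run/clamav/clamd.ctl}"
    reply=$(echo PING | socat -t 5 - "UNIX-CONNECT:${sock}" 2>/dev/null)
    printf '%s %s\n' "$(date '+%H:%M:%S')" "${reply:-no response}"
}

# Run it in a loop around the daily reload to see the outage:
#   while true; do probe_clamd; sleep 1; done
```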

During that reload, clamav becomes unreachable and triggers errors
in exim. I had a look at which files were taking that much time to
load, and the issue seems to come from daily.ldb.

# for file in *.cvd; do echo -n "${file} "; clamscan -d ${file} /etc/modules | grep Time:; done
bytecode.cvd Time: 0.033 sec (0 m 0 s)
daily.cvd Time: 73.116 sec (1 m 13 s)
main.cvd Time: 8.738 sec (0 m 8 s)

# sigtool -u daily.cvd

# for file in daily.*; do echo -n "${file} "; clamscan -d ${file} /etc/modules | grep Time:; done
daily.cdb Time: 0.008 sec (0 m 0 s)
daily.cfg Time: 0.009 sec (0 m 0 s)
daily.crb Time: 0.008 sec (0 m 0 s)
daily.fp Time: 0.009 sec (0 m 0 s)
daily.ftm Time: 0.008 sec (0 m 0 s)
daily.hdb Time: 0.447 sec (0 m 0 s)
daily.hdu Time: 0.008 sec (0 m 0 s)
daily.hsb Time: 1.609 sec (0 m 1 s)
daily.hsu Time: 0.008 sec (0 m 0 s)
daily.idb Time: 0.009 sec (0 m 0 s)
daily.ign Time: 0.010 sec (0 m 0 s)
daily.ign2 Time: 0.010 sec (0 m 0 s)
daily.ldb Time: 62.581 sec (1 m 2 s)
daily.ldu Time: 0.008 sec (0 m 0 s)
daily.mdb Time: 0.084 sec (0 m 0 s)
daily.mdu Time: 0.008 sec (0 m 0 s)
daily.msb Time: 0.008 sec (0 m 0 s)
daily.msu Time: 0.009 sec (0 m 0 s)
daily.ndb Time: 0.029 sec (0 m 0 s)
daily.ndu Time: 0.009 sec (0 m 0 s)
daily.pdb Time: 0.020 sec (0 m 0 s)
daily.sfp Time: 0.008 sec (0 m 0 s)
daily.wdb Time: 0.024 sec (0 m 0 s)

I tried to split daily.ldb to isolate the problematic rules, but
failed, as they seem to be spread throughout the file:

# split -l 1 --additional-suffix=.ldb daily.ldb
# for file in *.ldb; do echo -n "${file} "; clamscan -d ${file} /etc/modules | grep Time:; done
xaa.ldb Time: 0.470 sec (0 m 0 s)
xab.ldb Time: 0.629 sec (0 m 0 s)
xac.ldb Time: 0.512 sec (0 m 0 s)
xad.ldb Time: 0.634 sec (0 m 0 s)
xae.ldb Time: 0.463 sec (0 m 0 s)
xaf.ldb Time: 2.419 sec (0 m 2 s)
xag.ldb Time: 1.651 sec (0 m 1 s)
xah.ldb Time: 7.637 sec (0 m 7 s)

(Not sure why the times don't add up here.)

I did the same test on an up-to-date sid box and got the same result.

Fixing the performance of the database reload would be nice, but I
think keeping the socket available during the reload would be even
better. There is a patch on the upstream Bugzilla that implements
this[2].
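
(Editor's note: that approach later shipped upstream as a non-blocking
reload, controlled by a clamd.conf option that is enabled by default in
ClamAV 0.103 and later; a fragment for reference, not applicable to the
0.101 packages above:)

```
# /etc/clamav/clamd.conf fragment (ClamAV >= 0.103 only)
# Reload the database in a separate thread so the socket stays
# responsive; uses roughly twice the memory while the reload runs.
ConcurrentDatabaseReload yes
```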

Best,