Re: [clamav-users] Occasional sendmail queue delay when using clamav-milter

2018-05-03 Thread G.W. Haywood

Hello again,

On Wed, 2 May 2018, Aaron Paetznick wrote:


... both very helpful, thanks! ... sorry for the late reply ...


It's why we're here.  Don't be. :)


... small course corrections and long periods of observation.


The correct approach, IMHO.


Here are my log levels:

define(`confMILTER_LOG_LEVEL', `8')
define(`confLOG_LEVEL', `10')


For investigation I'd suggest 22 and 15 respectively, but keep an eye
on the logs as that's rather verbose.  And don't tell Claus. :)


For the record, I'm running a distinct copy of clamd using a local unix
domain socket on each server. I'd be happy to consider switching to TCP,


I wouldn't suggest that's necessary, nor even desirable.


...
Here's some threaded top output for my clamd process:
...
   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM   TIME+ COMMAND
  1324 clamav    20   0  975436 709588   3264 S  0.0  5.9 3:17.37 clamd
  1678 clamav    20   0  975436 709588   3264 S  0.0  5.9 0:08.38 clamd
 57290 clamav    20   0  975436 709588   3264 S  0.0  5.9 1:17.77 clamd
 57438 clamav    20   0  975436 709588   3264 S  0.0  5.9 0:56.31 clamd


Interesting.  This is the sort of thing I see here:

670 clamav    20   0 1035580 673556  12344 S  0.0  4.1 199:44.38 clamd
671 clamav    20   0 1035580 673556  12344 S  0.0  4.1   0:00.22 clamd

This clamd instance has been running for nearly six months.  As you
can see there are only two threads and only one of them has done any
real work.  (The reason that it hardly ever does any work is that my
homebrew milter does all the heavy lifting before clamd gets a chance
to see the mail data.)  I typically see more clamav-milter threads.
Maybe this is just characteristic of your heavier loads.  Maybe this
is a threads issue - it's the sort of thing they do - but I'd have
expected to hear more about it on the list if it were something in
need of fixing and I don't want to send you off chasing wild geese.


Reloading the clamd databases typically takes under 10 seconds.


That's a lot quicker than is typical here.  Which databases are you using?

I am running two total milters, the other being opendkim. ... 
INPUT_MAIL_FILTER(`clamav',
`S=local:/var/run/clamav-milter/clamav-milter.sock, F=T, T=S:4m;R:4m')
INPUT_MAIL_FILTER(`opendkim', `S=local:/var/run/opendkim/opendkim.sock')


You're using the default (10 second) timeouts for 'S' and 'R' for the
opendkim milter, you might want to consider increasing them, although
I've used similar setups in the past without issues (I've never used
the opendkim milter, I rolled my own:).
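
To make the timeout comparison concrete, here is a rough sketch (not anything from Sendmail itself) that reads the equates string of an INPUT_MAIL_FILTER line and reports the effective 'S' and 'R' timeouts in seconds.  The function name milter_timeouts is made up for illustration, the 10 second defaults are simply the figure mentioned above, and values without an explicit s/m/h unit are ignored rather than guessed at.

```python
# Defaults for the 'S' (send) and 'R' (read) timeouts, as stated in the thread.
DEFAULT_TIMEOUTS = {"S": 10, "R": 10}
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def milter_timeouts(equates):
    """Return the effective S and R timeouts (seconds) for a milter equates string.

    Hypothetical helper: it only understands values such as '4m' or '30s'.
    """
    timeouts = dict(DEFAULT_TIMEOUTS)
    for field in equates.split(","):
        field = field.strip()
        if not field.startswith("T="):
            continue  # skip S=, F= and any other equates
        for item in field[2:].split(";"):
            key, _, value = item.strip().partition(":")
            value = value.strip()
            unit = value[-1:]
            number = value[:-1]
            if key in timeouts and unit in UNIT_SECONDS and number.isdigit():
                timeouts[key] = int(number) * UNIT_SECONDS[unit]
    return timeouts


clamav = "S=local:/var/run/clamav-milter/clamav-milter.sock, F=T, T=S:4m;R:4m"
opendkim = "S=local:/var/run/opendkim/opendkim.sock"
print("clamav  ", milter_timeouts(clamav))
print("opendkim", milter_timeouts(opendkim))
```

With the two lines quoted above this reports 240 seconds for both clamav timeouts and leaves opendkim on the defaults, which is the gap being pointed at.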


All things considered, I may just decide to temporarily set the OnFail
option to Accept in my clamav-milter.conf file. It's certainly not an
ideal solution though.


It certainly isn't.  As it happens a colleague in the USA mentioned
that he occasionally has to restart clamd too.  Except when I've done
something daft I don't recall ever having to do that, so I'm starting
to wonder if there's something not going on here that _is_ going on
elsewhere.  If you'd like to let me have a copy of your clamd.conf
privately I'll compare it with my own and my colleague's to see if
anything obvious meets the eye.  I've tweaked our filter rules so mail
from your address to my list address won't (shouldn't:) be rejected.

--

73,
Ged.
___
clamav-users mailing list
clamav-users@lists.clamav.net
http://lists.clamav.net/cgi-bin/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml


Re: [clamav-users] Occasional sendmail queue delay when using clamav-milter

2018-05-01 Thread Aaron Paetznick
This and Ted's comment about falling back to a permissive OnFail were 
both very helpful, thanks! I'm sorry for the late reply, but problems 
like this require small course corrections and long periods of observation.


Here are my log levels:

define(`confMILTER_LOG_LEVEL', `8')
define(`confLOG_LEVEL', `10')


For the record, I'm running a distinct copy of clamd using a local unix 
domain socket on each server. I'd be happy to consider switching to TCP, 
but because I'm running a distinct clamd per server I think it's 
unlikely the overhead would be worth it. I could be wrong though.


The VM I'm using for testing has 12GB of RAM allocated right now, and 
averages about 70% memory utilization and under 1% CPU utilization. The 
pool is set up in a DNS round-robin, so the rest should be similar.


Here's some threaded top output for my clamd process:

[root@smtp ~]# top -b -n 1 -H -p 1324
top - 13:50:11 up 6 days,  1:25,  1 user,  load average: 0.14, 0.15, 0.14
Threads:   4 total,   0 running,   4 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

KiB Mem : 12121752 total,  3800616 free,   966264 used, 7354872 buff/cache
KiB Swap:  1679356 total,  1649968 free,    29388 used. 10741208 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM   TIME+ COMMAND
  1324 clamav    20   0  975436 709588   3264 S  0.0  5.9 3:17.37 clamd
  1678 clamav    20   0  975436 709588   3264 S  0.0  5.9 0:08.38 clamd
 57290 clamav    20   0  975436 709588   3264 S  0.0  5.9 1:17.77 clamd
 57438 clamav    20   0  975436 709588   3264 S  0.0  5.9 0:56.31 clamd
[root@smtp ~]#


Reloading the clamd databases typically takes under 10 seconds.

One more thing. I am running two total milters, the other being 
opendkim. Your comment got me thinking that what I'm seeing may not be 
clamd itself. Maybe clamd is aggravating a problem with opendkim, or 
maybe something is going wrong with the interaction between the two? 
Here's how I'm running them side-by-side:


INPUT_MAIL_FILTER(`clamav', 
`S=local:/var/run/clamav-milter/clamav-milter.sock, F=T, T=S:4m;R:4m')

INPUT_MAIL_FILTER(`opendkim', `S=local:/var/run/opendkim/opendkim.sock')


All things considered, I may just decide to temporarily set the OnFail 
option to Accept in my clamav-milter.conf file. It's certainly not an 
ideal solution though.



On 5/1/2018 11:34 AM, G.W. Haywood wrote:

Hi there,

On Tue, 1 May 2018, Aaron Paetznick wrote:


Occasionally a small percentage of email will seemingly unnecessarily
get held in the queue when using clamav-milter, although it will get
delivered successfully on the first attempt with the next queue run. The
size, time, sender, and recipient all seem to be irrelevant. Our
work-around is to simply process the queue every 5 minutes, but this is
not sustainable. We've conclusively narrowed it down to ClamAV, as the
problem vanishes when we comment out the INPUT_MAIL_FILTER line in our
sendmail.cf file. Here's that milter line:

INPUT_MAIL_FILTER(`clamav',
`S=local:/var/run/clamav-milter/clamav-milter.sock, F=T, T=S:4m;R:4m')


The intermittent ones are the trickiest. :(

I haven't seen anything like this issue, although our queues won't be
nearly as busy as yours.

Typically database reloads here (modest hardware) take a couple of
minutes, and your T=S and T=R timeouts for clamav-milter are, like
ours, much longer than that.  But if you have other milters for which
the timeouts are not so long, perhaps it could be an issue.  Even so,
one might then expect that you'd have noticed a correlation with the
reloads, and apparently you haven't.  A puzzle. :)

What log levels are you using for Sendmail and the milters?

Are you using a single clamd instance for all the servers, or one per
server, or ...?

Do you know how long your database reloads take?  Are they reliably
taking that long or do they sometimes stall?

What connection type(s) are you using for clamd?

Have you checked that clamd always responds quickly/reliably to PINGs
on the socket?

Can we take it that you do have enough memory?

--

73,
Ged.


Re: [clamav-users] Occasional sendmail queue delay when using clamav-milter

2018-05-01 Thread G.W. Haywood

Hi there,

On Tue, 1 May 2018, Aaron Paetznick wrote:


Occasionally a small percentage of email will seemingly unnecessarily
get held in the queue when using clamav-milter, although it will get
delivered successfully on the first attempt with the next queue run. The
size, time, sender, and recipient all seem to be irrelevant. Our
work-around is to simply process the queue every 5 minutes, but this is
not sustainable. We've conclusively narrowed it down to ClamAV, as the
problem vanishes when we comment out the INPUT_MAIL_FILTER line in our
sendmail.cf file. Here's that milter line:

INPUT_MAIL_FILTER(`clamav',
`S=local:/var/run/clamav-milter/clamav-milter.sock, F=T, T=S:4m;R:4m')


The intermittent ones are the trickiest. :(

I haven't seen anything like this issue, although our queues won't be
nearly as busy as yours.

Typically database reloads here (modest hardware) take a couple of
minutes, and your T=S and T=R timeouts for clamav-milter are, like
ours, much longer than that.  But if you have other milters for which
the timeouts are not so long, perhaps it could be an issue.  Even so,
one might then expect that you'd have noticed a correlation with the
reloads, and apparently you haven't.  A puzzle. :)

What log levels are you using for Sendmail and the milters?

Are you using a single clamd instance for all the servers, or one per
server, or ...?

Do you know how long your database reloads take?  Are they reliably
taking that long or do they sometimes stall?

What connection type(s) are you using for clamd?

Have you checked that clamd always responds quickly/reliably to PINGs
on the socket?

Can we take it that you do have enough memory?
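
In case it helps with the PING question above, here is a minimal sketch of one way to time clamd's reply on a unix domain socket.  It sends the null-terminated form of the command (zPING) and expects PONG back.  The function name ping_clamd and the socket path in the usage note are placeholders, not anything from your setup, so substitute the LocalSocket value from your own clamd.conf.

```python
import socket
import time


def ping_clamd(socket_path, timeout=5.0):
    """Send PING to clamd on a unix socket; return (ok, elapsed_seconds)."""
    start = time.monotonic()
    reply = b""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as conn:
            conn.settimeout(timeout)
            conn.connect(socket_path)
            conn.sendall(b"zPING\0")  # null-terminated command form
            # Read until the null terminator or until clamd closes the socket.
            while not reply.endswith(b"\0"):
                chunk = conn.recv(64)
                if not chunk:
                    break
                reply += chunk
    except OSError:
        # Covers a refused connection, a missing socket and a timeout.
        return False, time.monotonic() - start
    return reply.rstrip(b"\0") == b"PONG", time.monotonic() - start
```

Called as ping_clamd("/path/to/clamd.sock") every minute or so from cron, with the result logged, it should show whether slow or failed replies line up with the mail that gets queued.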

--

73,
Ged.


[clamav-users] Occasional sendmail queue delay when using clamav-milter

2018-04-30 Thread Aaron Paetznick
We use clamav-milter with sendmail on a pool of SMTP servers to handle 
email delivery for some 10k mailboxes. This has been working well for a 
long time, but we've noticed a subtle problem creep up on us.


Occasionally a small percentage of email will seemingly unnecessarily 
get held in the queue when using clamav-milter, although it will get 
delivered successfully on the first attempt with the next queue run. The 
size, time, sender, and recipient all seem to be irrelevant. Our 
work-around is to simply process the queue every 5 minutes, but this is 
not sustainable. We've conclusively narrowed it down to ClamAV, as the 
problem vanishes when we comment out the INPUT_MAIL_FILTER line in our 
sendmail.cf file. Here's that milter line:


INPUT_MAIL_FILTER(`clamav', 
`S=local:/var/run/clamav-milter/clamav-milter.sock, F=T, T=S:4m;R:4m')



My first thought was some sort of resource contention, but honestly the 
servers are individually not very busy. I had briefly thought the 
problem might align with database reloads in the clamd.log file, but 
that just didn't seem to be the case either. We're currently using 
ClamAV 0.100.0 with sendmail 8.15.2 on CentOS 7.4, although the problem 
has been with us through most of the 0.9x series as well. I can send a 
lot more details if it will help.


I guess I'm just wondering if there are any "gotchas" with using ClamAV 
in this way, and if there are any best-practices we may be missing. Thanks!
