Re: Warnings when enabling URILocalBL plugin

2018-11-08 Thread Kevin A. McGrail
What are your config entries for the Geo Modules? What version of SA? -- Kevin A. McGrail VP Fundraising, Apache Software Foundation Chair Emeritus Apache SpamAssassin Project https://www.linkedin.com/in/kmcgrail - 703.798.0171 On Thu, Nov 8, 2018 at 2:05 AM Quinn Comendant wrote: > I'm

Re: ClamAV - low detection rates on malware attachments lately

2018-11-08 Thread Brent Clark
On 2018/11/08 00:14, Kenneth Porter wrote: On 11/7/2018 1:24 PM, Kris Deugau wrote: I call ClamAV from MIMEDefang before invoking SA. I use the "unofficial sigs" package (available as an RPM via yum for Red Hat systems) for much better detection.

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Tobi
Hi I checked the first message on my SA and found multiple hits on __SCC_SHORT_WORDS rule which resulted in hits on the metas * 1.0 SCC_10_SHORT_WORD_LINES 10 lines with many short words * 1.0 SCC_5_SHORT_WORD_LINES 5 lines with many short words * 1.0

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Amir Caspi
On Nov 8, 2018, at 2:30 AM, Matus UHLAR - fantomas wrote: > > Do you use autolearn? There are a few rules to detect ham (score > negatively), many of them based on default whitelists and DNS whitelists, > where many mails come from grey area companies, not necessarily spam, but > training their

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Amir Caspi
On Nov 8, 2018, at 12:20 PM, RW wrote: > > these emails don't contain a valid HTML mime section. They contain a bogus > html section that doesn't > start with the separator defined in the top-level Content-Type header. Sorry, that is totally my fault. In the spample, I was trying to sanitize
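For readers following along: the point RW raises (a body part that does not start with the separator declared in the top-level Content-Type header) can be checked mechanically. The following is only a rough sketch in Python, not anything SpamAssassin itself does; the function name `boundary_is_used` and the two sample messages are made up for illustration.

```python
from email import message_from_string


def boundary_is_used(raw_message: str) -> bool:
    """Return True if the boundary declared in the top-level Content-Type
    header appears as an opening delimiter line ("--" + boundary)."""
    msg = message_from_string(raw_message)
    boundary = msg.get_boundary()  # None when no boundary parameter exists
    if boundary is None:
        return False
    delimiter = "--" + boundary
    # A header line never equals the delimiter, so scanning every line is safe.
    return any(line.rstrip() == delimiter for line in raw_message.splitlines())


# Hypothetical samples: GOOD is well-formed; BAD mimics a message whose
# boundary was altered in the body but not in the header.
GOOD = (
    'Content-Type: multipart/alternative; boundary="abc123"\n'
    "MIME-Version: 1.0\n"
    "\n"
    "--abc123\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hi</p>\n"
    "--abc123--\n"
)
BAD = GOOD.replace("--abc123\n", "--sanitized\n")

print(boundary_is_used(GOOD), boundary_is_used(BAD))
```

A check like this is handy when sanitizing a spample by hand: if the header and body boundaries no longer agree, the parts will not be parsed as MIME sections at all.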

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Amir Caspi
> do you regularly perform sa-update on that box? Yes, it is run every night. However, I am still running 3.4.1, so if the sha1 access has already been disabled, my updates are likely failing as of recently. I'm working on updating to 3.4.2 but this is an ancient box and I haven't yet had the

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Amir Caspi
On Nov 8, 2018, at 12:20 PM, RW wrote: > > I've already explained this. Sorry, I don't recall this discussion, my apologies. > Do these actually display on any email client? Yes. For example, for the first spample (https://pastebin.com/peiXZivJ), Apple Mail (OS X) displays the decoded HTML

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Matus UHLAR - fantomas
On 07.11.18 12:33, Amir Caspi wrote: In the past couple of weeks I've gotten a number of clearly-spam messages that slipped past SA, and the only reason was because they were getting low Bayes scores (BAYES_50 or even down to BAYES_00 or BAYES_05). I do my Bayes training manually on both ham

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread RW
On Thu, 8 Nov 2018 10:09:21 -0700 Amir Caspi wrote: > (2) Does normalize_charset decode HTML entities? If not, is this > something that can be included? Do I need to file a bugzilla? I've already explained this. Ordinarily html is decoded (whether normalize_charset is set or not), but these

Re: Warnings when enabling URILocalBL plugin

2018-11-08 Thread quinn
So, these warnings may be unrelated to URILocalBL: I disabled that plugin and the errors are still appearing. I forgot to mention that I also upgraded from SA v3.3.2 to v3.4.2 at the same time. So, there's that. The only change to my /etc/mail/spamassassin/ files was the addition of the

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread RW
On Wed, 7 Nov 2018 12:33:35 -0700 Amir Caspi wrote: > In many cases, it would appear that these spams have either very > little (real) text (besides the usual attempt at Bayes poisoning) > and/or are using HTML-entity encoding to try to bypass Bayes. Here > are a couple of spamples: > >

Re: Warnings when enabling URILocalBL plugin

2018-11-08 Thread Giovanni Bechis
On 11/8/18 1:18 AM, Quinn Comendant wrote: > I'm getting warnings when enabling Mail::SpamAssassin::Plugin::URILocalBL: > > warn: Use of uninitialized value in subroutine entry at > /usr/share/perl5/vendor_perl/Mail/SpamAssassin/Plugin/RelayCountry.pm line > 219. > warn: plugin: eval failed:

Re: ClamAV - low detection rates on malware attachments lately

2018-11-08 Thread Henrik K
On Wed, Nov 07, 2018 at 04:24:12PM -0500, Kris Deugau wrote: > > I should probably get around to publishing the rewritten version of the > ClamAV plugin I'm using; it's grown quite a bit from the example on the > wiki... Consider submitting to SA core, make a bug... :-) Also if it's not

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Bill Cole
On 7 Nov 2018, at 14:33, Amir Caspi wrote: Hi all, In the past couple of weeks I've gotten a number of clearly-spam messages that slipped past SA, and the only reason was because they were getting low Bayes scores (BAYES_50 or even down to BAYES_00 or BAYES_05). I do my Bayes training

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Bill Cole
[Resending because it looks like my first send went into a black hole...] On 7 Nov 2018, at 14:33, Amir Caspi wrote: Hi all, In the past couple of weeks I've gotten a number of clearly-spam messages that slipped past SA, and the only reason was because they were getting low Bayes scores

Re: Warnings when enabling URILocalBL plugin

2018-11-08 Thread Bill Cole
On 8 Nov 2018, at 17:26, Kevin A. McGrail wrote: > There are a lot of changes to GeoIP having to do with the database behind > it being deprecated. I think you might have to look at all the GeoIP stuff > and would appreciate your feedback. Bill, do you remember who was working > on all the

Re: ClamAV - low detection rates on malware attachments lately

2018-11-08 Thread Kenneth Porter
--On Thursday, November 08, 2018 10:59 AM -0500 Kris Deugau wrote: https://sourceforge.net/projects/unofficial-sigs/ It's been in Debian for a while too. That upstream link is an old version; it was forked or taken over (not sure which) by extremeshok.com at

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Amir Caspi
On Nov 8, 2018, at 2:19 PM, Bill Cole wrote: > > [Resending because it looks like my first send went into a black hole...] All SA messages appear to be coming with significant delays today... not sure why. I got RW's first message, sent at 8am today, only about an hour ago, AFTER the

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread RW
On Thu, 8 Nov 2018 13:14:13 -0700 Amir Caspi wrote: > If the HTML section is valid, as it appears to be ... then the HTML > should be decoded. And yet, these emails are hitting BAYES_00 or > BAYES_05 despite the spammy HTML text. In the two examples there isn't really much in the html text. I

Re: ClamAV - low detection rates on malware attachments lately

2018-11-08 Thread Kris Deugau
Kenneth Porter wrote: On 11/7/2018 1:24 PM, Kris Deugau wrote: I use a combination of adding local signatures (mainly hashes for "random-executable-inna-archive") and selected signatures from a number of third parties to the stock set in a "primary" Clam instance that's an absolute yes/no

Re: Warnings when enabling URILocalBL plugin

2018-11-08 Thread Giovanni Bechis
On Thu, Nov 08, 2018 at 01:43:15PM -0600, qu...@strangecode.com wrote: > So, these warnings may be unrelated to URILocalBL: I disabled that plugin and > the errors are still appearing. > ... > Here is the output from `spamassassin -D --lint`: > https://pastebin.com/raw/Zr7umPQv > >

Re: Warnings when enabling URILocalBL plugin

2018-11-08 Thread Kevin A. McGrail
There are a lot of changes to GeoIP having to do with the database behind it being deprecated. I think you might have to look at all the GeoIP stuff and would appreciate your feedback. Bill, do you remember who was working on all the GeoIP stuff? -- Kevin A. McGrail VP Fundraising, Apache

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Bill Cole
On 8 Nov 2018, at 21:55, John Hardin wrote: On Thu, 8 Nov 2018, Amir Caspi wrote: On Nov 8, 2018, at 7:41 PM, John Hardin wrote: Sure, but I'd also prefer to have a sample to test on before committing. I'll see if I can get the pastebin to work (i.e. fix the boundary) I can send you

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread RW
On Thu, 8 Nov 2018 23:30:42 + RW wrote: > On Thu, 8 Nov 2018 13:14:13 -0700 > Amir Caspi wrote: > > > > If the HTML section is valid, as it appears to be ... then the HTML > > should be decoded. And yet, these emails are hitting BAYES_00 or > > BAYES_05 despite the spammy HTML text. >

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Amir Caspi
On Nov 8, 2018, at 4:51 PM, RW wrote: > > Unnecessary encoding is fairly common, but a long run of ASCII > characters encoded like this seems extreme. Right, that was a question I had asked in my email this morning... whether we have a rule to detect long sequences of HTML entities. It would

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread John Hardin
On Thu, 8 Nov 2018, Amir Caspi wrote: On Nov 8, 2018, at 4:51 PM, RW wrote: Unnecessary encoding is fairly common, but a long run of ASCII characters encoded like this seems extreme. Right, that was a question I had asked in my email this morning... whether we have a rule to detect long

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread John Hardin
On Thu, 8 Nov 2018, Amir Caspi wrote: On Nov 8, 2018, at 7:55 PM, John Hardin wrote: I left it case-sensitive; is there some reason the entities cannot be coded as (e.g.) ? I kinda doubt it, so it should *probably* be case-insensitive to avoid trivial bypass. I think it should be

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Amir Caspi
On Nov 8, 2018, at 7:55 PM, John Hardin wrote: > > I left it case-sensitive; is there some reason the entities cannot be coded > as (e.g.) ? I kinda doubt it, so it should *probably* be > case-insensitive to avoid trivial bypass. I think it should be insensitive, sorry for that oversight on
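To illustrate the idea under discussion (flagging long runs of HTML numeric entities, matched case-insensitively), here is a minimal Python sketch. It is not the rule John Hardin wrote, which is not shown in this thread; the threshold `MIN_RUN`, the pattern, and the sample strings are all assumptions. In particular, the entity example in the quoted message was lost in the archive, so the uppercase-hex form `&#X..;` used below is only a guess at the kind of variant meant.

```python
import html
import re

# Assumed threshold: how many consecutive entities count as "a long run".
MIN_RUN = 10

# One numeric character reference: decimal (&#65;) or hexadecimal (&#x41;).
# re.IGNORECASE lets the same pattern match &#X41; and upper-case hex digits.
ENTITY_RUN = re.compile(
    r"(?:&#(?:x[0-9a-f]{1,6}|[0-9]{1,7});){%d,}" % MIN_RUN,
    re.IGNORECASE,
)


def has_long_entity_run(markup: str) -> bool:
    """Return True if markup contains MIN_RUN or more back-to-back numeric entities."""
    return ENTITY_RUN.search(markup) is not None


# Hypothetical samples: the same plain ASCII text encoded two ways.
TEXT = "Cheap pills here"
DECIMAL = "".join(f"&#{ord(c)};" for c in TEXT)
UPPER_HEX = "".join(f"&#X{ord(c):X};" for c in TEXT)

for sample in (DECIMAL, UPPER_HEX, "Fish &amp; chips &#169; 2018"):
    print(has_long_entity_run(sample), html.unescape(sample))
```

A case-sensitive pattern would miss the `UPPER_HEX` sample even though it decodes to the same text, which is the "trivial bypass" concern raised above. Ordinary mail with an occasional entity, like the third sample, stays below the threshold.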

SpamAssassin 3.4.2 -- RPM for CentOS 5

2018-11-08 Thread Amir Caspi
Hi all, I finally had some bandwidth and was able to get an RPM built for CentOS 5. I used Kevin Fenzi's CentOS 6 source RPM from COPR rather than one from Fedora, though I imagine Fedora would probably work just fine. The only thing I had to do to get this to work was to install the

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread John Hardin
On Thu, 8 Nov 2018, Amir Caspi wrote: On Nov 8, 2018, at 7:41 PM, John Hardin wrote: Sure, but I'd also prefer to have a sample to test on before committing. I'll see if I can get the pastebin to work (i.e. fix the boundary) I can send you some new spamples via attachment, privately.

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Amir Caspi
On Nov 8, 2018, at 7:41 PM, John Hardin wrote: > > Sure, but I'd also prefer to have a sample to test on before committing. I'll > see if I can get the pastebin to work (i.e. fix the boundary) > I can send you some new spamples via attachment, privately. Unfortunately I lost those

Re: Bayes underperforming, HTML entities?

2018-11-08 Thread Amir Caspi
On Nov 7, 2018, at 12:33 PM, Amir Caspi wrote: > > In many cases, it would appear that these spams have either very little > (real) text (besides the usual attempt at Bayes poisoning) and/or are using > HTML-entity encoding to try to bypass Bayes. Here are a couple of spamples: > >

Re: Warnings when enabling URILocalBL plugin

2018-11-08 Thread Giovanni Bechis
On 11/8/18 11:57 PM, Giovanni Bechis wrote: > On Thu, Nov 08, 2018 at 01:43:15PM -0600, qu...@strangecode.com wrote: >> So, these warnings may be unrelated to URILocalBL: I disabled that plugin >> and the errors are still appearing. >> > ... >>