The rules look to be performing better in masscheck after the updates to
the corpus checking:

https://ruleqa.spamassassin.org/20190604-r1860591-n/__BOGUS_MIME_VER_01/detail
https://ruleqa.spamassassin.org/20190604-r1860591-n/__BOGUS_MIME_VER_02/detail

Certainly worth letting QA do its thing and autoscore?
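
For reference, the score arithmetic in Amir's message below can be sketched
roughly as follows. This is only an illustration: the 5.0 threshold is assumed
to be the stock required_score, the 0.6 "other rules" figure is inferred from
his 4.6 totals (4.0 of which is BOGUS_MIME_VERSION), and the function names
are hypothetical.

```python
# Assumption: the stock SpamAssassin required_score of 5.0 is in use.
THRESHOLD = 5.0

# From the quoted message: most samples total 4.6, with 4.0 of that coming
# from BOGUS_MIME_VERSION, so the remaining rules contribute about 0.6.
OTHER_RULES = round(4.6 - 4.0, 1)


def total_score(rule_score, other=OTHER_RULES):
    """Total message score for a given BOGUS_MIME_VERSION score (hypothetical helper)."""
    # Round to one decimal place to avoid floating-point noise.
    return round(other + rule_score, 1)


def is_tagged(rule_score, other=OTHER_RULES, threshold=THRESHOLD):
    """True when the total reaches the spam threshold (hypothetical helper)."""
    return total_score(rule_score, other) >= threshold


# Compare the current local score (4.0) with the ends of the proposed range.
for candidate in (4.0, 4.5, 4.9):
    print(candidate, total_score(candidate), is_tagged(candidate))
```

Under those assumptions, 4.0 leaves these samples under threshold while
anything from 4.5 up tags them, which matches the 4.5 to 4.9 range suggested
below.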

On Tue, 4 Jun 2019 at 02:10, Amir Caspi <ceph...@3phase.com> wrote:

> Hi Kevin,
>
> Here are some spamples -- I've specifically chosen the ones that did NOT
> score enough through other means to get tagged, i.e., these are false
> negatives.  Note that many of them have valid DKIM and hit no other
> markers.  (The spample will NOT pass DKIM because headers have been
> modified for anonymity.)  If you run them through NOW you'll probably find
> they hit Razor and Pyzor and various other things... but they clearly
> didn't at the time of receipt.  Most of them score 4.6 unless they manage
> to have enough Bayes "poison" to score lower.  (And I STILL don't know how
> they keep hitting only BAYES_50...)
>
> https://pastebin.com/BQH3JgWD
> https://pastebin.com/nXtZtUdm
> https://pastebin.com/tBQt1Raw
> https://pastebin.com/wEGvcs73
> https://pastebin.com/nuFJ48k0
> https://pastebin.com/ykCuEPNQ
> ** This last one I received from two different servers within a minute of
> each other.  The first one got nailed by SPFBL so it got marked as spam,
> but only because the combo of SPFBL (2.2) and local BOGUS_MIME_VERSION
> (4.0) pushed it over threshold.  This spample, the second of the two,
> didn't get nailed because the relay wasn't in SPFBL, so BOGUS_MIME_VERSION
> wasn't enough by itself at a score of 4.0, although it WOULD have been
> enough at a score of 4.5.
>
> I should also mention I've seen at least a few recent ones that hit
> MailScanner's "Eudora long-MIME-boundary attack" rule.  I'm not including
> those as spamples since they got sanitized by MailScanner so aren't useful,
> but I figured it was worth mentioning.
>
> My feeling is that BOGUS_MIME_VERSION is incredibly useful during the
> early hits of snowshoers, before the RBLs, URIBLs, and content hash DBs can
> catch up.  Since it would seem to be 100% spam and 0% ham, I think scoring
> it very highly (4+ points) would be both safe and useful -- it will help
> nix these early hits but won't hinder anything else.
>
> From my experience and these spamples, where most of them are scoring 4.6
> (with 4.0 of that from BOGUS_MIME_VERSION), an optimal score would be in
> the range of 4.5 to 4.9 ... that would push these 4.6s to 5.1 or higher.
>
> I've got MANY other examples in the Junk folders on my server, and I would
> be happy to send them to you privately if needed.
>
> Cheers.
>
> --- Amir
>
> On May 30, 2019, at 9:24 AM, Kevin A. McGrail <kmcgr...@apache.org> wrote:
>
>
> Fair enough.  Happy to look at spamples but I've seen virtually nothing in
> the wild for this.
>
>
>
