The rules look to be performing better in masscheck after the updates to the corpus checking:
https://ruleqa.spamassassin.org/20190604-r1860591-n/__BOGUS_MIME_VER_01/detail
https://ruleqa.spamassassin.org/20190604-r1860591-n/__BOGUS_MIME_VER_02/detail

Certainly worth letting QA do its thing and autoscore?

On Tue, 4 Jun 2019 at 02:10, Amir Caspi <ceph...@3phase.com> wrote:

> Hi Kevin,
>
> Here are some spamples -- I've specifically chosen the ones that did NOT
> score enough through other means to get tagged, i.e., these are false
> negatives. Note that many of them have valid DKIM and hit no other
> markers. (The spample will NOT pass DKIM because headers have been
> modified for anonymity.) If you run them through NOW you'll probably find
> they hit Razor and Pyzor and various other things... but they clearly
> didn't at the time of receipt. Most of them score 4.6 unless they manage
> to have enough Bayes "poison" to score lower. (And I STILL don't know how
> they keep hitting only BAYES_50...)
>
> https://pastebin.com/BQH3JgWD
> https://pastebin.com/nXtZtUdm
> https://pastebin.com/tBQt1Raw
> https://pastebin.com/wEGvcs73
> https://pastebin.com/nuFJ48k0
> https://pastebin.com/ykCuEPNQ
> ** This last one I received from two different servers within a minute of
> each other. The first one got nailed by SPFBL so it got marked as spam,
> but only because the combo of SPFBL (2.2) and local BOGUS_MIME_VERSION
> (4.0) pushed it over threshold. This spample, the second of the two,
> didn't get nailed because the relay wasn't in SPFBL, so BOGUS_MIME_VERSION
> wasn't enough by itself at a score of 4.0, although it WOULD have been
> enough at a score of 4.5.
>
> I should also mention I've seen at least a few recent ones that hit
> Mailscanner's "Eudora long-MIME-boundary attack" rule. I'm not including
> those as spamples since they got sanitized by MailScanner so aren't useful,
> but I figured it was worth mentioning.
>
> My feeling is that BOGUS_MIME_VERSION is incredibly useful during the
> early hits of snowshoers, before the RBLs, URIBLs, and content hash DBs can
> catch up. Since it would seem to be 100% spam and 0% ham, I think scoring
> it very highly (4+ points) would be both safe and useful -- it will help
> nix these early hits but won't hinder anything else.
>
> From my experience and these spamples, where most of them are scoring 4.6
> (with 4.0 of that from BOGUS_MIME_VERSION), an optimal score would be in
> the range of 4.5 to 4.9 ... that would push these 4.6s to 5.1 or higher.
>
> I've got MANY other examples in the Junk folders on my server, and I would
> be happy to send them to you privately if needed.
>
> Cheers.
>
> --- Amir
>
> On May 30, 2019, at 9:24 AM, Kevin A. McGrail <kmcgr...@apache.org> wrote:
>
> > Fair enough. Happy to look at spamples but I've seen virtually nothing in
> > the wild for this.