Re: googleapis hosted phish

2018-11-15 Thread Bill Cole
On 15 Nov 2018, at 7:52, RW wrote: On Thu, 15 Nov 2018 01:22:00 -0500 Bill Cole wrote: On 14 Nov 2018, at 20:11, Alex wrote: Where is it getting these long hostname strings from? There's a bunch of garbage HTML using invisible text (font-size: 0) between tiny bits of visible text to break

Bayes not learning, blacklist not filtering

2018-11-15 Thread MarkCS
So I've been tasked with researching an issue with the mail server at work. We use SpamAssassin and, at present, it's not blocking some pretty obvious spam, largely from the domain qq.com. Basically, email is slipping through, being bounced back by the receiving server, then our server tries to

unexpected FN, how to improve/tune to catch

2018-11-15 Thread Ian Zimmerman
This little pearl got through upstream filter on a mailing list. https://pastebin.com/JhDGvAAA I show the body only, but the MIME headers were: Content-Transfer-Encoding: base64 Content-Type: text/plain; charset="utf-8"; Format="flowed" Also: From: yourfrugalstore Message-ID:

Re: Hackernews post : SpamAssassin is back

2018-11-15 Thread Kevin A. McGrail
On 11/15/2018 7:54 AM, Brent Clark wrote: > Good day Guys > > Just came across and share > https://news.ycombinator.com/item?id=18458212 > > that leads to https://lwn.net/Articles/769917/ > > HTH > Brent > P.S. From my side, thanks to all involved and for your time. Much > appreciated. Way to

Re: Bayes underperforming, HTML entities?

2018-11-15 Thread Amir Caspi
On Nov 10, 2018, at 11:30 AM, John Hardin wrote: > > The rawbody rules perform much better (unsurprising), and the ASCII-only one > has a better raw S/O: It looks like HTML_ENTITY_ASCII has been rolled out -- did you decide against the more general unicode due to S/O score? I predict we will

Re: Bayes underperforming, HTML entities?

2018-11-15 Thread John Hardin
On Thu, 15 Nov 2018, Amir Caspi wrote: On Nov 10, 2018, at 11:30 AM, John Hardin wrote: The rawbody rules perform much better (unsurprising), and the ASCII-only one has a better raw S/O: It looks like HTML_ENTITY_ASCII has been rolled out -- did you decide against the more general

Re: Bayes underperforming, HTML entities?

2018-11-15 Thread John Hardin
On Thu, 15 Nov 2018, Amir Caspi wrote: On Nov 15, 2018, at 2:36 PM, John Hardin wrote: That and its resistance to FP avoidance. Despite the generality, I don't see a significant FP risk on the general unicode version. I don't see ANY legitimate reason why an email would hard-encode long

Re: Bayes not learning, blacklist not filtering

2018-11-15 Thread John Hardin
On Thu, 15 Nov 2018, MarkCS wrote: Even when the message is manually learned and the domain in question is blacklisted, these messages are getting through. If you're blacklisting the domain, do so at the MTA level. My question is basically, why would BAYES be failing to learn? The most

Re: Bayes underperforming, HTML entities?

2018-11-15 Thread Amir Caspi
On Nov 15, 2018, at 2:36 PM, John Hardin wrote: > > That and its resistance to FP avoidance. Despite the generality, I don't see a significant FP risk on the general unicode version. I don't see ANY legitimate reason why an email would hard-encode long sequences of human-readable text, in

Re: Bayes underperforming, HTML entities?

2018-11-15 Thread John Hardin
On Thu, 15 Nov 2018, Amir Caspi wrote: On Nov 15, 2018, at 2:36 PM, John Hardin wrote: It doesn't seem to have a very high score just yet... I'm still getting FNs with the rule hitting (due to those messages hitting BAYES_00/05). Manually train those messages as spam and that should

Re: Bayes underperforming, HTML entities?

2018-11-15 Thread Amir Caspi
On Nov 15, 2018, at 2:36 PM, John Hardin wrote: > >> It doesn't seem to have a very high score just yet... I'm still getting FNs >> with the rule hitting (due to those messages hitting BAYES_00/05). > > Manually train those messages as spam and that should repair itself... Actually... right

Re: googleapis hosted phish

2018-11-15 Thread RW
On Thu, 15 Nov 2018 01:22:00 -0500 Bill Cole wrote: > On 14 Nov 2018, at 20:11, Alex wrote: > > > Where is it getting these long hostname strings from? > > There's a bunch of garbage HTML using invisible text (font-size: 0) > between tiny bits of visible text to break Bayes and/or specific

Hackernews post : SpamAssassin is back

2018-11-15 Thread Brent Clark
Good day, guys. Just came across this and wanted to share: https://news.ycombinator.com/item?id=18458212 which leads to https://lwn.net/Articles/769917/ HTH, Brent. P.S. From my side, thanks to all involved and for your time. Much appreciated.