Re: CPAN Testers reports == spam?

2022-03-07 Thread Timothe Litt

On 06-Mar-22 20:25, Felipe Gasper wrote:

>> On Mar 6, 2022, at 20:11, Ricardo Signes wrote:
>>
>> On Thu, Mar 3, 2022, at 09:48, Felipe Gasper wrote:
>>>   1.0 BAYES_999  BODY: Bayes spam probability is 99.9 to 100%
>>>  [score: 1.]
>>>   5.0 BAYES_99   BODY: Bayes spam probability is 99 to 100%
>>>  [score: 1.]
>>
>> These relate to your training of SpamAssassin.  I don't know how you or your
>> provider is training your Bayes db, but you're getting six points from that.
>
> It’s a cPanel server.
>
> Maybe the `nbsp` in there--useless in a plain-text email?--is the culprit.
> (Bayes, from what I’ve read, takes into account misspelled words.)
>
>>
>>>   3.3 EXCUSE_REMOVE  BODY: Talks about how to be removed from mailings
>>
>> This is, I think, a bit higher than the default value for this, but not much
>> higher.  I think that's too high, but perhaps replacing this with "to
>> unsubscribe" would help.  (Again, though, your personal Bayes is the likely
>> problem here.)
>
> The list-help, list-unsubscribe, etc. headers would probably be a good addition
> here?
>
>>
>>>   0.9 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
>>>   but isn't
>>
>> The email does not have a Content-Type header.  It should have one, probably
>> "text/plain; charset=utf-8"
>
> Agreed.
>
> -F


The simplest solution would be to feed these reports into the Bayes 
filter as ham.  sa-learn is the utility for that.


E.g. file them in a folder, and do something like

sa-learn --ham --mbox ham_folder

I keep training folders for spam, ham, and forget, and have a daily cron 
job that runs sa-learn over them to keep the filter up-to-date.


sa-learn has a bunch of options for tuning its behavior; see "man sa-learn".
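
A daily job along those lines might look roughly like the sketch below. 
The folder paths are hypothetical; --spam, --ham, --forget and --mbox are 
sa-learn's own options. Treat it as an illustration under those 
assumptions, not a tested setup.

```python
import subprocess

# Hypothetical mbox files, one per training category (paths are assumptions).
TRAINING_FOLDERS = {
    "--spam": "Mail/train-spam",
    "--ham": "Mail/train-ham",
    "--forget": "Mail/train-forget",
}


def sa_learn_commands(folders=TRAINING_FOLDERS):
    """Build one sa-learn command line per training folder."""
    return [["sa-learn", mode, "--mbox", path] for mode, path in folders.items()]


def run_training(folders=TRAINING_FOLDERS):
    """Run each command in turn; intended to be called from a daily cron job."""
    for argv in sa_learn_commands(folders):
        # check=True raises CalledProcessError if sa-learn exits non-zero.
        subprocess.run(argv, check=True)
```

Nothing runs until run_training() is called, so the command lines can be 
inspected with sa_learn_commands() first.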







Re: CPAN Testers reports == spam?

2022-03-06 Thread Felipe Gasper



> On Mar 6, 2022, at 20:11, Ricardo Signes wrote:
> 
> On Thu, Mar 3, 2022, at 09:48, Felipe Gasper wrote:
>>   1.0 BAYES_999  BODY: Bayes spam probability is 99.9 to 100%
>>  [score: 1.]
>>   5.0 BAYES_99   BODY: Bayes spam probability is 99 to 100%
>>  [score: 1.]
> 
> These relate to your training of SpamAssassin.  I don't know how you or your 
> provider is training your Bayes db, but you're getting six points from that.

It’s a cPanel server.

Maybe the `nbsp` in there--useless in a plain-text email?--is the culprit. 
(Bayes, from what I’ve read, takes into account misspelled words.)

> 
>>   3.3 EXCUSE_REMOVE  BODY: Talks about how to be removed from 
>> mailings
> 
> This is, I think, a bit higher than the default value for this, but not much 
> higher.  I think that's too high, but perhaps replacing this with "to 
> unsubscribe" would help.  (Again, though, your personal Bayes is the likely 
> problem here.)

The list-help, list-unsubscribe, etc. headers would probably be a good addition 
here?

> 
>>   0.9 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
>>   but isn't
> 
> The email does not have a Content-Type header.  It should have one, probably 
> "text/plain; charset=utf-8"

Agreed.
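
Both header suggestions (the List-* headers and an explicit Content-Type) 
can be sketched with Python's standard email package. This is only an 
illustration, not how the CPAN Testers mailer is written: build_report is 
a made-up helper, and pointing List-Help and List-Unsubscribe at the 
preferences URL is an assumption.

```python
from email.message import EmailMessage


def build_report(body: str) -> EmailMessage:
    """Build a plain-text report with list headers and a declared charset."""
    msg = EmailMessage()
    msg["Subject"] = "CPAN Testers Daily Summary Report"
    # Assumption: the preferences page serves as both help and unsubscribe target.
    msg["List-Help"] = "<https://prefs.cpantesters.org>"
    msg["List-Unsubscribe"] = "<https://prefs.cpantesters.org>"
    # set_content() on a str declares text/plain with charset utf-8 and
    # picks a Content-Transfer-Encoding that fits the body.
    msg.set_content(body)
    return msg


report = build_report("Thanks for testing \u2665\n")
print(report["Content-Type"])
print(report["List-Unsubscribe"])
```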

-F


Re: CPAN Testers reports == spam?

2022-03-06 Thread Ricardo Signes
On Thu, Mar 3, 2022, at 09:48, Felipe Gasper wrote:
>   1.0 BAYES_999  BODY: Bayes spam probability is 99.9 to 100%
>  [score: 1.]
>   5.0 BAYES_99   BODY: Bayes spam probability is 99 to 100%
>  [score: 1.]

These relate to your training of SpamAssassin.  I don't know how you or your 
provider is training your Bayes db, but you're getting six points from that.

>   3.3 EXCUSE_REMOVE  BODY: Talks about how to be removed from mailings

This is, I think, a bit higher than the default value for this, but not much 
higher.  I think that's too high, but perhaps replacing this with "to 
unsubscribe" would help.  (Again, though, your personal Bayes is the likely 
problem here.)

>   0.9 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
>   but isn't

The email does not have a Content-Type header.  It should have one, probably 
"text/plain; charset=utf-8"

-- 
rjbs

CPAN Testers reports == spam?

2022-03-03 Thread Felipe Gasper
Hello,

I’ve noticed that SpamAssassin flags CPAN Testers emails as spam. The 
last one I got had these headers attached:

-
X-Spam-Status: Yes, score=10.4
X-Spam-Score: 104
X-Spam-Bar: ++
X-Spam-Report: Spam detection software, running on the system 
"web1.siteocity.com",
 has identified this incoming email as possible spam.  The original
 message has been attached to this so you can view it or label
 similar future email.  If you have any questions, see
 root\@localhost for details.
 Content preview:  Dear Felipe Gasper, Please find below the latest reports for
your distributions, generated by CPAN Testers, from the last 24 hours. To
change your preferences, or disable notifications, please visit the CPAN
   Testers Preferences system at https://prefs.cpantesters.org. 
 Content analysis details:   (10.4 points, 5.0 required)
  pts rule name                     description
 ---- ----------------------------- ---------------------------------------
  1.0 BAYES_999                     BODY: Bayes spam probability is 99.9 to 100%
                                    [score: 1.]
  5.0 BAYES_99                      BODY: Bayes spam probability is 99 to 100%
                                    [score: 1.]
  3.3 EXCUSE_REMOVE                 BODY: Talks about how to be removed from mailings
  0.0 URIBL_BLOCKED                 ADMINISTRATOR NOTICE: The query to URIBL was
                                    blocked.  See
                                    http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block
                                    for more information.
                                    [URIs: fastly.com]
 -0.0 SPF_PASS                      SPF: sender matches SPF record
  0.2 HEADER_FROM_DIFFERENT_DOMAINS From and EnvelopeFrom 2nd level
                                    mail domains are different
  0.9 PP_MIME_FAKE_ASCII_TEXT       BODY: MIME text/plain claims to be ASCII
                                    but isn't
 -0.0 T_SCC_BODY_TEXT_LINE          No description available.
X-Spam-Flag: YES
Subject:  ***SPAM***  CPAN Testers Daily Summary Report
-

It looks like there’s a UTF-8 “♥” in the email; that’s probably where the 
PP_MIME_FAKE_ASCII_TEXT comes from.
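
One way to confirm which characters are involved is to list everything 
outside ASCII in the message body. The helper below is a hypothetical 
sketch and not SpamAssassin's own check; the sample string is made up.

```python
def non_ascii_chars(text: str) -> list[tuple[int, str]]:
    """Return (offset, character) pairs for every non-ASCII character."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


# Made-up sample containing the heart character (U+2665).
sample = "I \u2665 CPAN"
for offset, ch in non_ascii_chars(sample):
    print(offset, f"U+{ord(ch):04X}")
```

A non-breaking space (U+00A0) would show up in the same listing.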

For EXCUSE_REMOVE, the body apparently needs to match this pattern: /to be 
removed from.{0,20}(?:mailings|offers)/i
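
Candidate wording can be checked against that pattern before changing the 
email. A minimal sketch, assuming Python's re module treats this simple 
pattern the same way Perl does; the sample sentences are invented and are 
not taken from the actual report body.

```python
import re

# The pattern quoted above, with re.IGNORECASE standing in for the /i flag.
EXCUSE_REMOVE = re.compile(
    r"to be removed from.{0,20}(?:mailings|offers)", re.IGNORECASE
)


def triggers_excuse_remove(text: str) -> bool:
    """True if the text contains a match for the quoted pattern."""
    return EXCUSE_REMOVE.search(text) is not None


# Invented sample sentences for illustration.
print(triggers_excuse_remove("To be removed from future mailings, visit the site."))
print(triggers_excuse_remove("To unsubscribe, visit the preferences site."))
```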

I’m not sure about the Bayes score, but the character encoding and 
HTML-in-plain-text (“&nbsp;”) might be related?

Is there something I could do to help with this?

Thanks!

cheers,
-Felipe Gasper