Re: CPAN Testers reports == spam?
On 06-Mar-22 20:25, Felipe Gasper wrote:
> On Mar 6, 2022, at 20:11, Ricardo Signes wrote:
>> On Thu, Mar 3, 2022, at 09:48, Felipe Gasper wrote:
>>> 1.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
>>> [score: 1.]
>>> 5.0 BAYES_99 BODY: Bayes spam probability is 99 to 100%
>>> [score: 1.]
>>
>> These relate to your training of SpamAssassin. I don't know how you or your
>> provider is training your Bayes db, but you're getting six points from that.
>
> It’s a cPanel server. Maybe the `nbsp` in there--useless in a plain-text
> email?--is the culprit. (Bayes, from what I’ve read, takes into account
> misspelled words.)
>
>>> 3.3 EXCUSE_REMOVE BODY: Talks about how to be removed from
>>> mailings
>>
>> This is, I think, a bit higher than the default value for this, but not much
>> higher. I think that's too high, but perhaps replacing this with "to
>> unsubscribe" would help. (Again, though, your personal Bayes is the likely
>> problem here.)
>
> The list-help, list-unsubscribe, etc. headers would probably be a good
> addition here?
>
>>> 0.9 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
>>> but isn't
>>
>> The email does not have a Content-Type header. It should have one, probably
>> "text/plain; charset=utf-8"
>
> Agreed.
>
> -F

The simplest solution would be to feed these reports into the bayes filter. sa-learn is the utility for that. E.g. file them in a folder, and do something like

    sa-learn --ham --mbox ham_folder

I keep training folders for spam, ham, and forget & have a daily cron job to keep the filter up-to-date. sa-learn has a bunch of options for tuning its behavior, which man will turn up.
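The daily training job described above could be sketched roughly as follows. This is only an illustration: the folder paths other than `ham_folder` and the helper names `build_commands` and `run_training` are hypothetical, and the mapping of the "forget" folder to `sa-learn --forget` is an assumption about how such a setup would work.

```python
import shutil
import subprocess

# Hypothetical mbox paths; the message above only names "ham_folder".
TRAINING_FOLDERS = {
    "ham": "ham_folder",
    "spam": "spam_folder",
    "forget": "forget_folder",
}


def build_commands(folders):
    """Return one sa-learn argument list per training folder."""
    return [
        ["sa-learn", f"--{kind}", "--mbox", path]
        for kind, path in folders.items()
    ]


def run_training(folders):
    """Run each sa-learn command and collect the exit codes.

    Returns an empty list when sa-learn is not installed.
    """
    if shutil.which("sa-learn") is None:
        return []
    return [
        subprocess.run(cmd, check=False).returncode
        for cmd in build_commands(folders)
    ]
```

A cron entry would then call `run_training(TRAINING_FOLDERS)` once a day; whether the folders are emptied afterwards is left to the mail setup.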
Re: CPAN Testers reports == spam?
> On Mar 6, 2022, at 20:11, Ricardo Signes wrote:
>
> On Thu, Mar 3, 2022, at 09:48, Felipe Gasper wrote:
>> 1.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
>> [score: 1.]
>> 5.0 BAYES_99 BODY: Bayes spam probability is 99 to 100%
>> [score: 1.]
>
> These relate to your training of SpamAssassin. I don't know how you or your
> provider is training your Bayes db, but you're getting six points from that.

It’s a cPanel server. Maybe the `nbsp` in there--useless in a plain-text email?--is the culprit. (Bayes, from what I’ve read, takes into account misspelled words.)

>
>> 3.3 EXCUSE_REMOVE BODY: Talks about how to be removed from
>> mailings
>
> This is, I think, a bit higher than the default value for this, but not much
> higher. I think that's too high, but perhaps replacing this with "to
> unsubscribe" would help. (Again, though, your personal Bayes is the likely
> problem here.)

The list-help, list-unsubscribe, etc. headers would probably be a good addition here?

>
>> 0.9 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
>> but isn't
>
> The email does not have a Content-Type header. It should have one, probably
> "text/plain; charset=utf-8"

Agreed.

-F
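As a rough illustration of the list-header suggestion, here is a minimal sketch using Python's standard `email` package. `List-Unsubscribe` and `List-Help` are the RFC 2369 header names; pointing both at the preferences URL quoted in the report body is an assumption, and `add_list_headers` is a hypothetical helper, not anything CPAN Testers actually runs.

```python
from email.message import EmailMessage

# URL taken from the report's own body text; using it for both headers
# is an assumption made for this sketch.
PREFS_URL = "https://prefs.cpantesters.org"


def add_list_headers(msg, url=PREFS_URL):
    """Attach RFC 2369 list-management headers to an outgoing message."""
    msg["List-Unsubscribe"] = f"<{url}>"
    msg["List-Help"] = f"<{url}>"
    return msg


msg = add_list_headers(EmailMessage())
```

With these headers in place, the body text no longer has to be the only place that tells recipients how to stop the mailings.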
Re: CPAN Testers reports == spam?
On Thu, Mar 3, 2022, at 09:48, Felipe Gasper wrote:
> 1.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
> [score: 1.]
> 5.0 BAYES_99 BODY: Bayes spam probability is 99 to 100%
> [score: 1.]

These relate to your training of SpamAssassin. I don't know how you or your provider is training your Bayes db, but you're getting six points from that.

> 3.3 EXCUSE_REMOVE BODY: Talks about how to be removed from mailings

This is, I think, a bit higher than the default value for this, but not much higher. I think that's too high, but perhaps replacing this with "to unsubscribe" would help. (Again, though, your personal Bayes is the likely problem here.)

> 0.9 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
> but isn't

The email does not have a Content-Type header. It should have one, probably "text/plain; charset=utf-8"

-- 
rjbs
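To make the Content-Type point concrete, here is a small sketch of a sender declaring UTF-8 explicitly, written with Python's standard `email` package purely for illustration (the real report generator is not shown in this thread). `build_report` and the sample body are hypothetical.

```python
from email.message import EmailMessage


def build_report(body):
    """Build a plain-text message whose Content-Type declares UTF-8."""
    msg = EmailMessage()
    msg["Subject"] = "CPAN Testers Daily Summary Report"
    # set_content adds Content-Type: text/plain with the given charset,
    # so non-ASCII characters in the body are declared rather than implied.
    msg.set_content(body, charset="utf-8")
    return msg


# Sample body containing a non-ASCII heart character.
report = build_report("Thanks for using CPAN Testers \u2665\n")
```

A message built this way carries a declared charset, which is the condition the `PP_MIME_FAKE_ASCII_TEXT` description complains about being absent.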
CPAN Testers reports == spam?
Hello,

I’ve noticed that SpamAssassin flags CPAN Testers emails as spam. The last one I got had these headers attached:

-----
X-Spam-Status: Yes, score=10.4
X-Spam-Score: 104
X-Spam-Bar: ++
X-Spam-Report: Spam detection software, running on the system "web1.siteocity.com",
 has identified this incoming email as possible spam. The original
 message has been attached to this so you can view it or label
 similar future email. If you have any questions, see
 root@localhost for details.

 Content preview: Dear Felipe Gasper, Please find below the latest reports
 for your distributions, generated by CPAN Testers, from the last 24 hours.
 To change your preferences, or disable notifications, please visit the
 CPAN Testers Preferences system at https://prefs.cpantesters.org.

 Content analysis details: (10.4 points, 5.0 required)

  pts rule name                   description
 ---- --------------------------- -----------------------------------------
  1.0 BAYES_999                   BODY: Bayes spam probability is 99.9 to 100%
                                  [score: 1.]
  5.0 BAYES_99                    BODY: Bayes spam probability is 99 to 100%
                                  [score: 1.]
  3.3 EXCUSE_REMOVE               BODY: Talks about how to be removed from
                                  mailings
  0.0 URIBL_BLOCKED               ADMINISTRATOR NOTICE: The query to URIBL was
                                  blocked. See
                                  http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block
                                  for more information.
                                  [URIs: fastly.com]
 -0.0 SPF_PASS                    SPF: sender matches SPF record
  0.2 HEADER_FROM_DIFFERENT_DOMAINS From and EnvelopeFrom 2nd level mail
                                  domains are different
  0.9 PP_MIME_FAKE_ASCII_TEXT     BODY: MIME text/plain claims to be ASCII
                                  but isn't
 -0.0 T_SCC_BODY_TEXT_LINE        No description available.
X-Spam-Flag: YES
Subject: ***SPAM*** CPAN Testers Daily Summary Report
-----

It looks like there’s a UTF-8 “♥” in the email; that’s probably where the PP_MIME_FAKE_ASCII_TEXT comes from.

For EXCUSE_REMOVE, the body apparently needs to match this pattern:

    /to be removed from.{0,20}(?:mailings|offers)/i

I’m not sure about the Bayes score, but the character encoding and HTML-in-plain-text (“&nbsp;”) might be related?

Is there something I could do to help with this?

Thanks!

cheers,
-Felipe Gasper
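The two body checks discussed in this message can be approximated outside SpamAssassin. The sketch below is not how SpamAssassin evaluates its rules (it normalizes the body before matching); it only re-applies the quoted pattern with Python's `re` and lists non-ASCII characters. The helper names and sample sentences are hypothetical.

```python
import re

# The EXCUSE_REMOVE pattern as quoted in the message, with /i mapped to
# re.IGNORECASE.
EXCUSE_REMOVE = re.compile(
    r"to be removed from.{0,20}(?:mailings|offers)", re.IGNORECASE
)


def matches_excuse_remove(text):
    """Return True if the text contains the quoted EXCUSE_REMOVE phrase."""
    return EXCUSE_REMOVE.search(text) is not None


def non_ascii_chars(text):
    """Return the distinct non-ASCII characters in text, sorted by code point."""
    return sorted({ch for ch in text if ord(ch) > 127})


# Hypothetical sample sentences, not taken from a real report.
flagged = matches_excuse_remove("To be removed from our mailings, click here.")
reworded = matches_excuse_remove("To unsubscribe, visit the preferences page.")
suspects = non_ascii_chars("I \u2665 CPAN\u00a0Testers")
```

Running `non_ascii_chars` over a saved report body is a quick way to see whether a heart or a no-break space is present in mail that does not declare a charset.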