Re: Bayes auto-learn - not happening

2017-08-08 Thread John Hardin

On Tue, 8 Aug 2017, Ian Zimmerman wrote:


> I stopped
> autolearning and hacked up some scripts that put a duplicate of each ham
> message into a folder which is then processed by sa-learn from a
> cronjob, with sufficient delay that I can review the contents and remove
> any false negatives; and similarly with spam, excluding the utterly
> horrible category which just goes to /dev/null.


This is generally a good idea, unless you have a really high-volume 
environment - are you an ISP?


Keeping your training corpora around lets you review them for 
misclassifications and retrain very easily if things go off the rails.


Autolearn may be useful once you are initially manually trained. Then you 
can focus on manually training the FPs and FNs.


It's also important to be careful what you train with. If you allow users 
to submit messages for training (particularly a global bayes) then you 
either need to have strong trust in those users' judgement, or review what 
they submit before training with it.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Joan Peterson is like that: you expect at least a pseudological
  argument, but instead you get the weird ramblings of a woman with
  the critical thinking abilities of an 18th century peasant.  -- Ken
---
 7 days until the 72nd anniversary of the end of World War II


Re: Bayes auto-learn - not happening

2017-08-08 Thread Ian Zimmerman
On 2017-08-08 15:20, Scott wrote:

> Another new one  big score, auto-learn disabled.  This one is fairly small.  
> 
> X-Spam-Status: Yes, score=29.428 tag=- tag2=5 kill=6.4
> tests=[DATE_IN_PAST_03_06=1.076, DCC_CHECK=3.2,
> DIGEST_MULTIPLE=0.001,
> FILL_THIS_FORM=0.001, FROM_MISSPACED=0.001, FROM_MISSP_SPF_FAIL=1,
> HEADER_FROM_DIFFERENT_DOMAINS=0.001, HEXHASH_WORD=1,
> HTML_EXTRA_CLOSE=0.001, HTML_MESSAGE=0.001,
> HTML_MIME_NO_HTML_TAG=0.635, MIME_HTML_ONLY=1.105, MISSING_MID=0.14,
> NORMAL_HTTP_TO_IP=0.001, RAZOR2_CF_RANGE_51_100=0.365,
> RAZOR2_CF_RANGE_E8_51_100=2.43, RAZOR2_CHECK=2.5,
> RCVD_IN_BRBL_LASTEXT=1.644, RDNS_NONE=1.274, SPF_FAIL=4,
> SPF_HELO_FAIL=4, STYLE_GIBBERISH=3.093,
> T_HTML_TAG_BALANCE_CENTER=0.01, URIBL_ABUSE_SURBL=1.948,
> WEIRD_QUOTING=0.001] autolearn=unavailable autolearn_force=no
> 
> Can you tell if this one has the 3 point match?

Scott,

when I tried to use the autolearn feature I was as confused as you are.
As far as I remember, the 3 point each from header and body is not the
only requirement; the full truth is that some rules are "privileged" and
can contribute to autolearning while others cannot.  I found it opaque
in the extreme and essentially unpredictable, and so I stopped
autolearning and hacked up some scripts that put a duplicate of each ham
message into a folder which is then processed by sa-learn from a
cronjob, with sufficient delay that I can review the contents and remove
any false negatives; and similarly with spam, excluding the utterly
horrible category which just goes to /dev/null.

It may not be possible for you to adopt such a process if your volume is
high, but OTOH in that case you probably have users to help you :)
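
A delayed learning step along these lines can be sketched as below. This is an
illustration, not the scripts described above: the function name
learn_reviewed, the SA_LEARN override (handy for a dry run with echo), the
folder you pass in and the one-day default delay are all assumptions. Only
sa-learn with --ham or --spam and file arguments comes from SpamAssassin.

```shell
# learn_reviewed KIND DIR [DELAY_MINUTES]
# KIND is "ham" or "spam". Only files older than DELAY_MINUTES are learned,
# which leaves time to review the folder and delete misfiled messages first.
# Assumes GNU find and xargs (for -print0, -0 and -r).
learn_reviewed() {
    kind=$1
    dir=$2
    delay=${3:-1440}   # assumed default: one day
    find "$dir" -type f -mmin "+$delay" -print0 |
        xargs -0 -r ${SA_LEARN:-sa-learn} "--$kind"
}
```

From a cronjob this might be called as "learn_reviewed ham /path/to/review/ham",
with the path being whatever folder your delivery scripts copy messages into.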

I think this is what RW is telling you, too.

FWIW, this is documented (sort of) by:

perldoc Mail::SpamAssassin::Plugin::AutoLearnThreshold

-- 
Please don't Cc: me privately on mailing lists and Usenet,
if you also post the followup to the list or newsgroup.
Do obvious transformation on domain to reply privately _only_ on Usenet.


Re: Bayes auto-learn - not happening

2017-08-08 Thread Scott
Another new one  big score, auto-learn disabled.  This one is fairly small.  

X-Spam-Status: Yes, score=29.428 tag=- tag2=5 kill=6.4
tests=[DATE_IN_PAST_03_06=1.076, DCC_CHECK=3.2,
DIGEST_MULTIPLE=0.001,
FILL_THIS_FORM=0.001, FROM_MISSPACED=0.001, FROM_MISSP_SPF_FAIL=1,
HEADER_FROM_DIFFERENT_DOMAINS=0.001, HEXHASH_WORD=1,
HTML_EXTRA_CLOSE=0.001, HTML_MESSAGE=0.001,
HTML_MIME_NO_HTML_TAG=0.635, MIME_HTML_ONLY=1.105, MISSING_MID=0.14,
NORMAL_HTTP_TO_IP=0.001, RAZOR2_CF_RANGE_51_100=0.365,
RAZOR2_CF_RANGE_E8_51_100=2.43, RAZOR2_CHECK=2.5,
RCVD_IN_BRBL_LASTEXT=1.644, RDNS_NONE=1.274, SPF_FAIL=4,
SPF_HELO_FAIL=4, STYLE_GIBBERISH=3.093,
T_HTML_TAG_BALANCE_CENTER=0.01, URIBL_ABUSE_SURBL=1.948,
WEIRD_QUOTING=0.001] autolearn=unavailable autolearn_force=no

Can you tell if this one has the 3 point match?





--
View this message in context: 
http://spamassassin.1065346.n5.nabble.com/Bayes-auto-learn-not-happening-tp138065p138085.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


Re: apache.org have URIBL_BLOCKED now :/

2017-08-08 Thread Michael Orlitzky
On 08/08/2017 02:32 PM, Benny Pedersen wrote:
> subj might concern infra staff
> 
> forward please to infra
> 

URIBL_BLOCKED means that the URIBL refused your DNS query:

  http://uribl.com/refused.shtml

The name "apache.org" isn't blacklisted, and there's nothing apache can
do to fix it. You need to make your DNS queries from somewhere else,
probably.
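
One way to check whether your resolver is the one being refused is to query
the URIBL test point yourself. The sketch below is hedged: it assumes the test
point name 2.0.0.127.multi.uribl.com and that a refused query is answered with
127.0.0.1 (the value the URIBL_BLOCKED rule keys on); confirm both against the
page linked above. uribl_answer_status is a made-up helper name.

```shell
# Classify the first A record returned for the URIBL test point.
# Assumption: 127.0.0.1 means "query refused"; any other 127.0.0.x answer is
# treated as a normal response.
uribl_answer_status() {
    case "$1" in
        127.0.0.1) echo "blocked" ;;
        127.0.0.*) echo "ok" ;;
        "")        echo "no answer" ;;
        *)         echo "unexpected: $1" ;;
    esac
}

# Usage, run on the host that does the lookups for SpamAssassin:
#   uribl_answer_status "$(dig +short 2.0.0.127.multi.uribl.com A | head -n 1)"
```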


RE: Bayes auto-learn - not happening

2017-08-08 Thread Scott Techlist
>you need to train your bayes *by hand* to start with - how do you expect
>bayes classification with no hints after purging the database - train 200
>ham and spam mails and *after that* look further

Reindl:

Thanks.  I want to use some auto-training with very conservative thresholds 
set.  All of the messages I've checked would have classified correctly via 
autolearn comfortably in those ranges.

The 200 threshold is for USING the bayes, but not an auto-learning requirement.  
Or that was my clear understanding from many posts.  I saw several old threads 
where others suggested similar but were corrected.  Maybe they changed it, 
dunno.

My concern is that auto-learn is not functioning properly.  I use Amavisd that 
calls spamassassin and has its own issues.  Trying to make sure my system is 
operating properly.  To me it appears it is not.

No hint should be necessary for it to learn a spam.  Only to use bayes to score 
anything.  I get that.  No?






Re: Bayes auto-learn - not happening

2017-08-08 Thread Scott
I was getting my commands mixed up, been looking at this too long.  When I
ran

su amavis -c 'spamassassin -D 2>&1 -t onespam'

That caused it to LEARN the spam.  Database went from not there to one
learned.  Auto-learn apparently.  That's what it should have done when it
arrived.

Brand new spam arrives.  It gets
autolearn=unavailable.

X-Spam-Status: Yes, score=20.704 tag=- tag2=5 kill=6.4
tests=[DATE_IN_PAST_06_12=1.103, DCC_CHECK=3.2,
DIGEST_MULTIPLE=0.001,
HTML_EXTRA_CLOSE=0.001, HTML_MESSAGE=0.001,
HTML_MIME_NO_HTML_TAG=0.635, MIME_HTML_ONLY=1.105, MISSING_MID=0.14,
NORMAL_HTTP_TO_IP=0.001, RAZOR2_CF_RANGE_51_100=0.365,
RAZOR2_CF_RANGE_E8_51_100=2.43, RAZOR2_CHECK=2.5, RDNS_NONE=1.274,
SPF_HELO_SOFTFAIL=3, SPF_SOFTFAIL=3, URIBL_ABUSE_SURBL=1.948]
autolearn=unavailable autolearn_force=no

That implies no auto-learn because the token exists (or there was something
else) as I understand it.  So I try to learn that one spam again...

I had to increase the size limit via:

su amavis -c 'sa-learn -D --spam --showdots --max-size=600 --mbox /home/mail/twospam'

Aug  8 16:35:23.567 [18045] dbg: bayes: learned
'419769464db0fabb0f1220f9ae0cf12931ad7076@sa_generated', atime: 1502226537
Learned tokens from 1 message(s) (1 message(s) examined)

And it learned it.  So autolearn=unavailable was NOT due to the token already
being there.

Is there a size limit built into autolearn?  










Re: Bayes auto-learn - not happening

2017-08-08 Thread Scott
Benny:
re tflags
> tflags foo-rule-name noautolearn
> and you can force autolearn based on rulename
> https://lists.gt.net/spamassassin/users/184996
> there is a long thread there that explain it more
>and all condition must be met for learning 

I read the thread.  Nothing there concrete enough for me to latch onto.  I
mean I get the gist of it, but no details on how to look at my tests and see
if I have the requisite 3 parts needed.








Re: Bayes auto-learn - not happening

2017-08-08 Thread Scott
Cleared the database, ran below on the same message:

su amavis -c 'spamassassin -D 2>&1 -t onespam' | less

I didn't see any errors obvious to me.  

It recreated the databases and added this message as expected.


I don't know how to tell why it would not have auto-learned.  

Can you tell/ teach me from this?


Content analysis details:   (17.7 points, 5.0 required)

 pts rule name                 description
---- ------------------------- --------------------------------------------------
 1.9 URIBL_ABUSE_SURBL         Contains an URL listed in the ABUSE SURBL blocklist
                               [URIs: 145.239.41.28]
 0.0 SUBJ_DOLLARS              Subject starts with dollar amount
 3.0 SPF_HELO_SOFTFAIL         SPF: HELO does not match SPF record (softfail)
 1.1 DATE_IN_PAST_03_06        Date: is 3 to 6 hours before Received: date
 0.0 NORMAL_HTTP_TO_IP         URI: URI host has a public dotted-decimal IPv4 address
 0.0 HTML_EXTRA_CLOSE          BODY: HTML contains far too many close tags
 0.0 HTML_MESSAGE              BODY: HTML included in message
 1.1 MIME_HTML_ONLY            BODY: Message only has text/html MIME parts
 3.2 DCC_CHECK                 Detected as bulk mail by DCC (dcc-servers.net)
 2.5 RAZOR2_CHECK              Listed in Razor2 (http://razor.sf.net/)
 2.4 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level above 50%
                               [cf: 100]
 0.4 RAZOR2_CF_RANGE_51_100    Razor2 gives confidence level above 50%
                               [cf: 100]
 0.0 DIGEST_MULTIPLE           Message hits more than one network digest check
 0.6 HTML_MIME_NO_HTML_TAG     HTML-only message, but there is no HTML tag
 0.1 MISSING_MID               Missing Message-Id: header
 1.3 RDNS_NONE                 Delivered to internal network by a host with no rDNS

Aug  8 15:47:11.098 [17077] dbg: check: tagrun - tag DKIMDOMAIN is still
blocking action 0
Aug  8 15:47:11.105 [17077] dbg: plugin:
Mail::SpamAssassin::Plugin::MIMEHeader=HASH(0x2ccc328) implements
'finish_tests', priority 0
Aug  8 15:47:11.105 [17077] dbg: plugin:
Mail::SpamAssassin::Plugin::Check=HASH(0x2e04e38) implements 'finish_tests',
priority 0
Aug  8 15:47:11.116 [17077] dbg: netset: cache trusted_networks
hits/attempts: 15/17, 88.2 %









Re: Bayes auto-learn - not happening

2017-08-08 Thread Benny Pedersen

Scott skrev den 2017-08-08 22:19:

> Does this one have the requisite 3-point match?  I don't understand how to
> tell yet.


spamassassin -D 2>&1 -t mail.msg | less

should show why
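
The full debug output is long, so it can help to keep only the learning
channels. A minimal sketch, assuming debug lines carry a channel prefix such
as "dbg: bayes:" (as in the logs quoted elsewhere in this thread) and that the
auto-learn decision is logged on the "learn" channel, which is worth confirming
on your version. learn_debug_lines is a made-up helper name.

```shell
# Keep only the bayes and learn channel lines from a SpamAssassin debug run.
# Reads the debug output on stdin.
learn_debug_lines() {
    grep -E '(dbg|info): (learn|bayes): '
}

# Usage, with mail.msg being the saved message:
#   su amavis -c 'spamassassin -D -t mail.msg 2>&1' | learn_debug_lines
```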


Re: Bayes auto-learn - not happening

2017-08-08 Thread RW
On Tue, 8 Aug 2017 13:04:16 -0700 (MST)
Scott wrote:

> The "3 points" criteria does not apply to manually learning 

No it's just a sanity check to reduce mistraining. If you can, don't
use autotraining at all.  


Re: Bayes auto-learn - not happening

2017-08-08 Thread Scott
Apologies, I meant sa-learn.  Brain fart.

Thanks for the clarification on the 3-point rule.

I've had a bunch of them come through.  They all get autolearn=no or I get a
few that say "unavailable" like the sample below.  I gather from trying to
figure out myself that unavailable may be things already learned.  Or
something else whatever that may be, per the wiki.  But if the database is
empty, it seems that "already learned" is not the reason for  "unavailable"
in this case anyway.

X-Spam-Status: Yes, score=20.678 tag=- tag2=5 kill=6.4
tests=[DATE_IN_PAST_03_06=1.076, DCC_CHECK=3.2,
DIGEST_MULTIPLE=0.001,
HTML_EXTRA_CLOSE=0.001, HTML_MESSAGE=0.001,
HTML_MIME_NO_HTML_TAG=0.635, MIME_HTML_ONLY=1.105, MISSING_MID=0.14,
NORMAL_HTTP_TO_IP=0.001, RAZOR2_CF_RANGE_51_100=0.365,
RAZOR2_CF_RANGE_E8_51_100=2.43, RAZOR2_CHECK=2.5, RDNS_NONE=1.274,
SPF_HELO_SOFTFAIL=3, SPF_SOFTFAIL=3, SUBJ_DOLLARS=0.001,
URIBL_ABUSE_SURBL=1.948] autolearn=unavailable autolearn_force=no

Does this one have the requisite 3-point match?  I don't understand how to
tell yet. 

I've cleared the db again.  Will let it run to see if it learns *anything*. 
So far I have not seen that happen.  Surely something will get a 3 way
match.







Re: Bayes auto-learn - not happening

2017-08-08 Thread Benny Pedersen

Scott skrev den 2017-08-08 22:06:


> Better, what test flags in general disable auto-learn?


tflags foo-rule-name noautolearn

and you can force autolearn based on rulename

https://lists.gt.net/spamassassin/users/184996

there is a long thread there that explains it in more detail

and all conditions must be met for learning
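
To see which flags a given rule carries, one option is to search the rule
files for its tflags line. The sketch below assumes the rules live in *.cf
files under a directory you pass in (common locations are
/usr/share/spamassassin, /var/lib/spamassassin and /etc/mail/spamassassin, but
check your install) and that GNU grep is available; rule_tflags is a
hypothetical helper name. Per the AutoLearnThreshold documentation, rules
flagged noautolearn, learn or userconf are ignored when deciding whether to
auto-learn.

```shell
# rule_tflags RULES_DIR RULE...
# Print the tflags lines for the named rules found in *.cf files under
# RULES_DIR. A rule with no tflags line prints nothing.
rule_tflags() {
    rules_dir=$1
    shift
    for rule in "$@"; do
        grep -rhE --include='*.cf' \
            "^[[:space:]]*tflags[[:space:]]+${rule}[[:space:]]" "$rules_dir"
    done
}

# Usage:
#   rule_tflags /usr/share/spamassassin DCC_CHECK RAZOR2_CHECK URIBL_ABUSE_SURBL
```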


Re: Bayes auto-learn - not happening

2017-08-08 Thread Benny Pedersen

Scott skrev den 2017-08-08 22:04:
> The "3 points" criteria does not apply to manually learning via sa-update
> then?


Typo? sa-update does not learn, it just updates rules. Did you mean
sa-learn?


When sa-learn is used, it's not autolearn, so the limits are not applied.


Re: Bayes auto-learn - not happening

2017-08-08 Thread Scott
> some of the listed tags have tflags that disable autolearn
>
> there is nothing to fix here

Benny:  Will you elaborate for me please?  So I can understand and
self-help.

Better, what test flags in general disable auto-learn?





Re: Bayes auto-learn - not happening

2017-08-08 Thread Scott
The "3 points" criteria does not apply to manually learning via sa-update
then?







Re: apache.org have URIBL_BLOCKED now :/

2017-08-08 Thread Kevin A. McGrail

On 8/8/2017 2:32 PM, Benny Pedersen wrote:

> subj might concern infra staff
>
> forward please to infra


Thanks.  Can you give more details?  I just sent a test message from my 
kmcgr...@apache.org address and don't see an issue.  Is there a specific RBL?



Return-Path: 
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
        by intel1.peregrinehw.com (8.14.9/8.14.9) with SMTP id v78Iaik9025060
        for ; Tue, 8 Aug 2017 14:36:45 -0400
Received: (qmail 79636 invoked by uid 99); 8 Aug 2017 18:36:44 -
Received: from mail-relay.apache.org (HELO mail-relay.apache.org) (140.211.11.15)
        by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 08 Aug 2017 18:36:44 +
Received: from [10.10.11.221] (pool-100-36-131-234.washdc.fios.verizon.net [100.36.131.234])
        by mail-relay.apache.org (ASF Mail Server at mail-relay.apache.org) with ESMTPSA id 7AAFD1A00A6
        for ; Tue,  8 Aug 2017 18:36:43 + (UTC)
To: kmcgr...@pccc.com
From: "Kevin A. McGrail" 
Subject: test
Message-ID: <8bd8f6f4-5f64-9ade-4c98-4b7f527de...@apache.org>
Date: Tue, 8 Aug 2017 14:36:55 -0400
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-PCCC-Virus-Scan: Enabled
X-KAM-Reverse: Passed - Reverse DNS of hermes.apache.org/140.211.11.3
X-Spam-Status: No, hits=-11.0 required=5.8  
tests=KAM_RPTR_PASSED,RCVD_IN_DNSWL_HI,RCVD_IN_HOSTKARMA_W,
  RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,RP_MATCHES_RCVD,
  SPF_PASS,TXREP



Re: Bayes auto-learn - not happening

2017-08-08 Thread RW
On Tue, 8 Aug 2017 13:06:26 -0500
Scott Techlist wrote:

> Centos7
> Postfix 3.2.2
> Amavisd-new 2.11.0
> Spamassassin 3.4.0
> Site-wide configuration
> 
> This is a new box and I've configured some conservative values for
> auto-learn.  I've enabled it properly AFAIK, but I can't see any sign
> of it working.  
> 
> I have these set in local.cf
> use_bayes                          1
> bayes_auto_learn                   1
> bayes_auto_learn_threshold_nonspam -1.7
> bayes_auto_learn_threshold_spam    10.0
> # this is a filename prefix, not a directory per se
> bayes_path  /etc/mail/bayes/bayes
> bayes_file_mode 0666
> 
> -bayes prep 
> Start fresh for troubleshooting:
> su amavis -c 'sa-learn --clear'
> 
> Add one spam manually and check tokens:
> 
> [root@tn2 mail]# su amavis -c 'sa-learn --dump magic'
> 0.000  0  3  0  non-token data: bayes db version
> 0.000  0  1  0  non-token data: nspam
> 0.000  0  0  0  non-token data: nham
> 0.000  0   2157  0  non-token data: ntokens
> 
> -amavisd prep
> 
> Restart amavisd/spamassassin just to be sure all configs read..
> 
> --- ready to process -
> 
> The next high scoring spam arrives, it was sent to my spam mailbox.
> It did NOT autolearn.  Nor did several others.  
> 
> To troubleshoot, I took one that did not autolearn, and learned it
> manually by:
> su amavis -c 'sa-learn -D --spam --showdots --mbox /home/mail/onespam'
> 
> even though this message was slightly over the threshold, the log
> says it learned anyway: -D log snippet:
> -
> Aug  8 12:37:27.216 [13198] info: archive-iterator: skipping large
> message: 858 lines, 262203 bytes, limit 262144 bytes
> 
> Learned tokens from 1 message(s) (1 message(s) examined)
> -
> 
> Verified it learned:
> 
> [root@tn2 mail]# su amavis -c 'sa-learn --dump magic'
> 0.000  0  3  0  non-token data: bayes db version
> 0.000  0  2  0  non-token data: nspam
> 
> 
> Partial header from that message:
> 
> X-Spam-Flag: YES
> X-Spam-Score: 17.374
> X-Spam-Level: *
> X-Spam-Status: Yes, score=17.374 tag=- tag2=5 kill=6.31
> tests=[RCVD_IN_BRBL_LASTEXT=1.644, RCVD_IN_DNSWL_NONE=-0.0001,
> RCVD_IN_RP_RNBL=1.284, RCVD_IN_SBL_CSS=3.558,
> RCVD_IN_SORBS_WEB=1.5, RP_MATCHES_RCVD=-0.001,
> SUSPICIOUS_RECIPS=2.497, URIBL_ABUSE_SURBL=1.948, URIBL_BLACK=1.7,
> URIBL_DBL_SPAM=2.5, URIBL_SBL=0.644, URIBL_SBL_A=0.1] autolearn=no
> autolearn_force=no
> 
> Why aren't my spams getting auto-learned?  If sa-learn "ate" it,
> shouldn't auto-learn too?

To autolearn spam you need 3 points from the body and 3 from headers.
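
The X-Spam-Status header does not say which points came from header rules and
which from body rules, but listing the hits by score is a useful first step
before looking each rule up. A small sketch, with spam_status_tests as a
made-up helper that reads a (possibly folded) X-Spam-Status header on stdin.
Note that, per the AutoLearnThreshold documentation, auto-learning rescores
the message with scoreset 0 or 1, so the points it sees can differ from the
ones shown in the header.

```shell
# Print one "score rule" pair per line, highest score first, from the
# tests=[...] part of an X-Spam-Status header read on stdin.
spam_status_tests() {
    tr '\n' ' ' |                                    # unfold the header
        sed -n 's/.*tests=\[\([^]]*\)].*/\1/p' |     # keep the tests list
        tr ',' '\n' |                                # one rule per line
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | # trim whitespace
        awk -F= 'NF == 2 { print $2, $1 }' |         # "score rule"
        sort -k1,1nr
}

# Usage:
#   grep -A 12 '^X-Spam-Status:' mail.msg | spam_status_tests
```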


apache.org have URIBL_BLOCKED now :/

2017-08-08 Thread Benny Pedersen

subj might concern infra staff

forward please to infra


Re: Bayes auto-learn - not happening

2017-08-08 Thread Benny Pedersen

Scott Techlist skrev den 2017-08-08 20:06:


> X-Spam-Flag: YES
> X-Spam-Score: 17.374
> X-Spam-Level: *
> X-Spam-Status: Yes, score=17.374 tag=- tag2=5 kill=6.31
> tests=[RCVD_IN_BRBL_LASTEXT=1.644, RCVD_IN_DNSWL_NONE=-0.0001,
> RCVD_IN_RP_RNBL=1.284, RCVD_IN_SBL_CSS=3.558, RCVD_IN_SORBS_WEB=1.5,
> RP_MATCHES_RCVD=-0.001, SUSPICIOUS_RECIPS=2.497,
> URIBL_ABUSE_SURBL=1.948, URIBL_BLACK=1.7, URIBL_DBL_SPAM=2.5,
> URIBL_SBL=0.644, URIBL_SBL_A=0.1] autolearn=no autolearn_force=no
>
> Can't figure out what's wrong...


some of the listed tags have tflags that disable autolearn

there is nothing to fix here


Bayes auto-learn - not happening

2017-08-08 Thread Scott Techlist
Centos7
Postfix 3.2.2
Amavisd-new 2.11.0
Spamassassin 3.4.0
Site-wide configuration

This is a new box and I've configured some conservative values for auto-learn.  
I've enabled it properly AFAIK, but I can't see any sign of it working.  

I have these set in local.cf
use_bayes                          1
bayes_auto_learn                   1
bayes_auto_learn_threshold_nonspam -1.7
bayes_auto_learn_threshold_spam    10.0
# this is a filename prefix, not a directory per se
bayes_path  /etc/mail/bayes/bayes
bayes_file_mode 0666

-bayes prep 
Start fresh for troubleshooting:
su amavis -c 'sa-learn --clear'

Add one spam manually and check tokens:

[root@tn2 mail]# su amavis -c 'sa-learn --dump magic'
0.000  0  3  0  non-token data: bayes db version
0.000  0  1  0  non-token data: nspam
0.000  0  0  0  non-token data: nham
0.000  0   2157  0  non-token data: ntokens

-amavisd prep

Restart amavisd/spamassassin just to be sure all configs read..

--- ready to process -

The next high scoring spam arrives, it was sent to my spam mailbox.  It did NOT 
autolearn.  Nor did several others.  

To troubleshoot, I took one that did not autolearn, and learned it manually by:
su amavis -c 'sa-learn -D --spam --showdots --mbox /home/mail/onespam'

even though this message was slightly over the size limit, the log says it 
learned anyway:
-D log snippet:
-
Aug  8 12:37:27.216 [13198] info: archive-iterator: skipping large message: 858 
lines, 262203 bytes, limit 262144 bytes

Learned tokens from 1 message(s) (1 message(s) examined)
-

Verified it learned:

[root@tn2 mail]# su amavis -c 'sa-learn --dump magic'
0.000  0  3  0  non-token data: bayes db version
0.000  0  2  0  non-token data: nspam


Partial header from that message:

X-Spam-Flag: YES
X-Spam-Score: 17.374
X-Spam-Level: *
X-Spam-Status: Yes, score=17.374 tag=- tag2=5 kill=6.31
tests=[RCVD_IN_BRBL_LASTEXT=1.644, RCVD_IN_DNSWL_NONE=-0.0001,
RCVD_IN_RP_RNBL=1.284, RCVD_IN_SBL_CSS=3.558, RCVD_IN_SORBS_WEB=1.5,
RP_MATCHES_RCVD=-0.001, SUSPICIOUS_RECIPS=2.497,
URIBL_ABUSE_SURBL=1.948, URIBL_BLACK=1.7, URIBL_DBL_SPAM=2.5,
URIBL_SBL=0.644, URIBL_SBL_A=0.1] autolearn=no autolearn_force=no

Why aren't my spams getting auto-learned?  If sa-learn "ate" it, shouldn't 
auto-learn too?

I know there is a default 200 threshold before Bayes starts tagging anything, 
but I understand it should learn without issue.
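
The counts in the "sa-learn --dump magic" output can be checked against that
minimum with a few lines of awk. This sketch assumes the column layout shown
above (count in the third field, label in the last) and a minimum of 200 for
both ham and spam; bayes_counts is a made-up helper name.

```shell
# bayes_counts [MIN]
# Read "sa-learn --dump magic" output on stdin, print the learned spam and
# ham counts, and return 0 only if both have reached MIN (default 200).
bayes_counts() {
    awk -v min="${1:-200}" '
        $NF == "nspam" { spam = $3 }
        $NF == "nham"  { ham = $3 }
        END {
            printf "nspam=%d nham=%d\n", spam, ham
            exit !(spam >= min && ham >= min)
        }'
}

# Usage:
#   su amavis -c 'sa-learn --dump magic' | bayes_counts
```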

Can't figure out what's wrong...















Re: HTML (was Re: Sender needs help with false positive)

2017-08-08 Thread Benny Pedersen

Dianne Skoll skrev den 2017-08-08 20:09:

> On Tue, 08 Aug 2017 20:01:52 +0200
> Benny Pedersen  wrote:
>
> > why does the OP need to tell sendgrid his users passwords ?
>
> That is indeed a very good question. :)


+1


> It's not as if this is some sort of mass-mailing or marketing-oriented
> email that needs to be tracked.


even if dkim was whitelisted for these mails, it's still sending passwords 
in the emails to sendgrid, stupid


back to learning android studio here


Re: HTML (was Re: Sender needs help with false positive)

2017-08-08 Thread Dianne Skoll
On Tue, 08 Aug 2017 20:01:52 +0200
Benny Pedersen  wrote:

> why does the OP need to tell sendgrid his users passwords ?

That is indeed a very good question. :)

It's not as if this is some sort of mass-mailing or marketing-oriented
email that needs to be tracked.

Regards,

Dianne.



Re: HTML (was Re: Sender needs help with false positive)

2017-08-08 Thread Benny Pedersen

Dianne Skoll skrev den 2017-08-08 15:05:

> On Tue, 8 Aug 2017 08:00:04 -0500
> David Jones  wrote:
>
> > I absolutely agree but it's possible that this part is out of his
> > control.  Sendgrid might be receiving a plain text email from the
> > normal source and adding HTML to get that image in there for
> > tracking.
>
> If you can't determine the content of your own messages, time to find
> another provider, I think.  Surely Sendgrid lets you control this sort
> of thing?


let me hold your pocket ?

why does the OP need to tell sendgrid his users passwords ?


RE: Sender needs help with false positive

2017-08-08 Thread Jacek Osuchowski
It did. At first I couldn't figure out why it was HTML because the software
was sending plain text messages. Then I realized it was SendGrid's tracing
method that was converting the messages to HTML in order to embed the img
tag, so I turned off the tracing.
 

-Original Message-
From: Dianne Skoll [mailto:d...@roaringpenguin.com] 
Sent: Tuesday, August 08, 2017 8:43 AM
To: users@spamassassin.apache.org
Subject: Re: Sender needs help with false positive

On Tue, 8 Aug 2017 07:36:01 -0500
David Jones  wrote:

> The origin of the email and the path it takes makes a big difference 
> in how it's filtered.

Sure, but doing a plain-text message with no HTML will immediately knock
2.2 points off the score.  That's a pretty cheap and easy win.

Regards,

Dianne.



HTML (was Re: Sender needs help with false positive)

2017-08-08 Thread Dianne Skoll
On Tue, 8 Aug 2017 08:00:04 -0500
David Jones  wrote:

> I absolutely agree but it's possible that this part is out of his 
> control.  Sendgrid might be receiving a plain text email from the
> normal source and adding HTML to get that image in there for
> tracking.

If you can't determine the content of your own messages, time to find
another provider, I think.  Surely Sendgrid lets you control this sort
of thing?

Regards,

Dianne.


Re: Sender needs help with false positive

2017-08-08 Thread David Jones

On 08/08/2017 07:43 AM, Dianne Skoll wrote:

> On Tue, 8 Aug 2017 07:36:01 -0500
> David Jones  wrote:
>
> > The origin of the email and the path it takes makes a big difference
> > in how it's filtered.
>
> Sure, but doing a plain-text message with no HTML will immediately knock
> 2.2 points off the score.  That's a pretty cheap and easy win.
>
> Regards,
>
> Dianne.



I absolutely agree but it's possible that this part is out of his 
control.  Sendgrid might be receiving a plain text email from the normal 
source and adding HTML to get that image in there for tracking.  We 
(this list) have no way to know for sure without seeing the original 
unaltered message from the normal source.


My point was copy/pasting the same email body and sending it from a 
different source like a desktop/laptop is not going to be valid for 
troubleshooting rule hits.  I know that you know this but I am just 
saying it "out loud" for the OP.


--
David Jones


Re: Sender needs help with false positive

2017-08-08 Thread Dianne Skoll
On Tue, 8 Aug 2017 07:36:01 -0500
David Jones  wrote:

> The origin of the email and the path it takes makes a big difference
> in how it's filtered.

Sure, but doing a plain-text message with no HTML will immediately knock
2.2 points off the score.  That's a pretty cheap and easy win.

Regards,

Dianne.



Re: Sender needs help with false positive

2017-08-08 Thread David Jones

On 08/07/2017 07:36 PM, Jacek Osuchowski wrote:

> David,
>
> Thanks a lot. I will try to modify the email text to have more 'meat on the
> bone'. I am just surprised email with no links, no ads, no attempts to sell
> anything can be interpreted as a spam.
> That img in the email is a tag from SendGrid email services used to trace
> the emails. I don't know if I can get rid of it.



The folks at Sendgrid know how to properly send out mass emails without 
getting blocked by spam filters.  They should have some resources to 
help with your email delivery.  Check with them since you are paying for 
that service.



> That's his PC which is the MSA. As it's the first hop, it's not surprising
> it hits Zen PBL (it should, given a host name like
> ool-44c047bf.dyn.optonline.net).



About those headers you put in pastebin, is that an actual mail from the 
same source that normally generates these password reset emails or was 
that a test of the same message body from your desktop?  We need to see 
the headers from an exact message sent from the same source as it 
normally would be.


The origin of the email and the path it takes makes a big difference in 
how it's filtered.


--
David Jones


Re: Sender needs help with false positive

2017-08-08 Thread Benny Pedersen

A required score of -20 on inbound scanning, to protect against outbound spam?

The OP's message was DKIM signed with a valid author signature (DKIM_VALID_AU), 
so why was it not added to whitelist_auth? Maybe I was sleeping :(


Re: Sender needs help with false positive

2017-08-08 Thread Rupert Gallagher
Avoid marketing mass-mailers when sending administrative messages.
Sent from ProtonMail Mobile

On Tue, Aug 8, 2017 at 12:56 AM, Jacek Osuchowski  wrote:

> We use emails to allow users to reset their passwords to our website. We send 
> very brief emails containing the reset password. Example between :
>
>>
>
> Your password to access your account is:
>
> S]U3bC7k
>
> Upon successful login you may change your password by going to Modify Account 
> / Change Your Password.
>
>>
>
> The emails are marked as spam. Sample report from IsnotSpam.com:
>
> SpamAssassin check details:
>
>  -- ---
>
> * 3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
>
> * [score: 0.9995]
>
> * -0.0 RCVD_IN_MSPIKE_H3 RBL: Good reputation (+3)
>
> * [50.31.63.50 listed in wl.mailspike.net]
>
> * -0.0 SPF_PASS SPF: sender matches SPF record
>
> * 0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%
>
> * [score: 0.9995]
>
> * 2.1 HTML_IMAGE_ONLY_12 BODY: HTML: images with 800-1200 bytes of words
>
> * 0.1 HTML_MESSAGE BODY: HTML included in message
>
> * -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's
>
> * domain
>
> * 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
>
> * valid
>
> * -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
>
> * -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders
>
> X-Spam-Status: Yes, hits=5.7 required=-20.0 tests=BAYES_99,BAYES_999,
>
> DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HTML_IMAGE_ONLY_12,HTML_MESSAGE,
>
> RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_PASS autolearn=no autolearn_force=no
>
> version=3.4.0
>
> X-Spam-Score: 5.7
>
> I understand you are trying to provide great software to fight email spam but you 
> are making my life miserable. I am having more problems with our emails 
> marked as spam than from the spam itself. Any help on how to avoid being marked 
> as spam would help. Is there a way to be whitelisted by SpamAssassin globally? 
> Most emails are blocked by internet providers like Cablevision or Comcast and 
> getting them to help is IMPOSSIBLE. They just install the software and let it 
> run as it is.
>
> Thank You