On Sat, 25 Sep 2004, Theodore Heise wrote:

> I've been pointing sa-learn at Pine mail folders now for over two
> years, and just ignoring the fact it's learning from the Pine folder
> header.  I don't expect to actually get any e-mail resembling it.
> During this time Bayes has always worked very effectively for me.

Well, it occurred to me I could investigate this situation a little
bit more objectively using "spamassassin -t" (test mode).

I typically keep all my Pine mail folders in /home/theo/mail/, with
tagged spam directed to ~/mail/spam.  To train my Bayes, I point
sa-learn at the spam folder, move all tagged spam to an archive
file, and then learn ~/mail/* as ham.  This means the Pine spam
folder header gets looked at first as spam, and then as ham.
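
For anyone wanting to script the routine above, here is a rough sketch
rather than my exact commands.  It assumes the folders are plain mbox
files (hence --mbox) and that the spam archive lives outside ~/mail so
it is not relearned as ham.  The run helper and the DRY_RUN variable
are made up for illustration; they just let you print the commands
before running them for real.

```shell
#!/bin/sh
# Sketch of the training routine; nothing runs until train_bayes is called.
MAILDIR="${MAILDIR:-$HOME/mail}"   # Pine folder directory from the post
SPAM_FOLDER="$MAILDIR/spam"        # folder that receives tagged spam

# run (hypothetical helper): execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

train_bayes() {
    # 1. Learn the spam folder as spam; its Pine header message is learned too.
    run sa-learn --spam --mbox "$SPAM_FOLDER" || return 1

    # 2. Move the tagged spam to an archive (done by hand in Pine; the archive
    #    is assumed to live outside $MAILDIR), leaving the header message behind.

    # 3. Learn every folder as ham; the spam folder's header message is now
    #    seen a second time, this time as ham.
    run sa-learn --ham --mbox "$MAILDIR"/* || return 1
}
```

Calling train_bayes with DRY_RUN=1 set only prints the two sa-learn
command lines, which is a cheap way to check the paths first.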

I tested the spam folder's Pine header message after learning it as
spam, and again after learning it as ham.  I also tested the Pine
header message at the top of the ~/mail/sent folder.  All three
tests hit the same rules and gave the same total score (quoted
below).  Interestingly, the Bayes probability reported under
BAYES_00 differed for the ~/mail/spam header learned as spam versus
learned as ham (0.0012 vs. 0.0000).

Both values fall in the 0 to 1% band, so BAYES_00 contributes -2.6
either way, and the difference doesn't seem worth bothering with.
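
To repeat the comparison without reading each report by eye, something
like the following might do.  It is only a sketch: bayes_prob is a
made-up function name, and it assumes the report layout quoted below,
where the "[score: N]" line sits directly under the BAYES_* rule line.

```shell
#!/bin/sh
# bayes_prob: read a "spamassassin -t" report on stdin and print the
# probability from the "[score: N]" line directly below the BAYES_* hit.
bayes_prob() {
    awk '
        /BAYES_[0-9]+/ { in_bayes = 1; next }   # remember we just saw the rule line
        in_bayes && /\[score:/ {
            gsub(/[^0-9.]/, "", $0)             # keep only the digits and the dot
            print
            exit
        }
        { in_bayes = 0 }                        # any other line resets the state
    '
}
```

With the Pine header message saved on its own in a file (say
header.msg, a hypothetical name), the check would then be
"spamassassin -t < header.msg | bayes_prob", run once after each
learning step.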

Ted

-- 
Theodore (Ted) Heise     <[EMAIL PROTECTED]>     Bloomington, IN, USA


[~/mail/spam] Pine header message learned as spam
Content analysis details:   (-5.1 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.7 SUBJ_ALL_CAPS          Subject is all capitals
-3.3 ALL_TRUSTED            Did not pass through any untrusted hosts
 0.1 MISSING_HEADERS        Missing To: header
-2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                            [score: 0.0012]


[~/mail/spam] Pine header message learned as ham
Content analysis details:   (-5.1 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.7 SUBJ_ALL_CAPS          Subject is all capitals
-3.3 ALL_TRUSTED            Did not pass through any untrusted hosts
 0.1 MISSING_HEADERS        Missing To: header
-2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                            [score: 0.0000]

[~/mail/sent] Pine header message learned as ham
Content analysis details:   (-5.1 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.7 SUBJ_ALL_CAPS          Subject is all capitals
-3.3 ALL_TRUSTED            Did not pass through any untrusted hosts
 0.1 MISSING_HEADERS        Missing To: header
-2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                            [score: 0.0000]
