Re: [SAtalk] Removing headers etc.. to feed Bayes correctly

Tony Earnshaw Sat, 14 Jun 2003 03:03:54 -0700

Robert Menschel wrote:

TE> To get a reasonable base, it's been my understanding that you teach
TE> Bayes what is spam and what isn't. ...

Agreed.

TE> You don't start contradicting what you've taught it by teaching it
TE> low scoring spam until after you've reached your minimum bias of 200.

How is teaching Bayes that this email with a low SA score is actually
spam "contradicting" what we've taught it? I see this as teaching Bayes
that there is spam which SA doesn't yet have adequate rules for, and IMO,
from my experience, Bayes is a lot more flexible in handling these than
SA is.

I'm simply saying "wait until you've got your 200-base bias to do so."

(I don't have the ability to create new SA rules because of my end-user status. I can change scores, but that has limited application. I can feed all of my spam into Bayes, and Bayes works wonders at recognizing spam that SA can't.)

The system you're using probably reached that minimum bias long ago.

TE> You'll confuse the whole Bayes database if you do anything different.
TE> Why in goodness name put a minimum score of 5 in the first place, if
TE> you're going to contradict yourself?

Bayes doesn't care about the score, and Bayes can't IMO be confused by
seeing ham or spam as long as it's properly identified.

Right. It's the *pattern of tokens* used for Bayes analysis I'm concerned about. After 1,000 spams that's o.k., but under 200 it's critical.

So really we're saying the same thing, only you didn't notice my proviso of a minimum number of spams/non-spams (the latter are far more numerous on my system) before treating the thing as an adult.
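To make the proviso concrete, here is a rough Python sketch of the idea, not SpamAssassin's actual Bayes code. The class name TinyBayes, its method names, and the per-token ratio are all invented for illustration; the 200 thresholds simply mirror the "200-base" figure discussed above. It shows two things: learning never looks at the SA score, only at the spam/ham label, and the classifier is not trusted until both minimum counts have been reached.

```python
from collections import Counter

# Assumed thresholds, mirroring the "200-base" figure from this thread.
MIN_SPAM = 200
MIN_HAM = 200


class TinyBayes:
    """Toy token counter; a hypothetical stand-in, not SpamAssassin's code."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.nspam = 0
        self.nham = 0

    def learn(self, text, is_spam):
        # Only the label matters here; no rule score is consulted.
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_tokens.update(tokens)
            self.nspam += 1
        else:
            self.ham_tokens.update(tokens)
            self.nham += 1

    def is_ready(self):
        # Do not trust the database until both minimum counts are met.
        return self.nspam >= MIN_SPAM and self.nham >= MIN_HAM

    def token_spamminess(self, token):
        # Fraction of spam messages containing the token, relative to the
        # same fraction for ham. Returns None when there is no evidence.
        if self.nspam == 0 or self.nham == 0:
            return None
        s = self.spam_tokens[token] / self.nspam
        h = self.ham_tokens[token] / self.nham
        if s + h == 0:
            return None
        return s / (s + h)
```

With only a handful of messages learned, a single mislabelled or unusual mail moves these per-token ratios a great deal, which is why the early token pattern matters so much; after many hundreds of messages the same mail barely shifts them.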

Best,

Tony

--
Tony Earnshaw

- Deyr fé, deyr frendr
deyr sjálfr 'it sama
- ek veit ein aldrigi deyr
- dómr um dauðan hvern.

From Hávamál - the voice of the gods.

http://j-walk.com/blog/docs/conference.htm
http://www.billy.demon.nl
Mail: [EMAIL PROTECTED]

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
