>> From: "Raquel Rice" <[EMAIL PROTECTED]>
>> > On Wed, 11 Feb 2004 18:54:13 -0800
>> > "jdow" <[EMAIL PROTECTED]> wrote:
>> >
>> > > From: "Raquel Rice" <[EMAIL PROTECTED]>
>> > > > On Wed, 11 Feb 2004 11:35:01 -0500
>> > > > Matt Kettler <[EMAIL PROTECTED]> wrote:
>> > > >
>> > > > > I feed bayes with some spamtraps and nonspamtraps each
>> > > > > day, giving it about 100 spams, and 25 nonspams in manual
>> > > > > training daily.
>> > > >
>> > > > How do you select, out of all your mail, 125 emails to train
>> > > > bayes with?
>> > >
>> > > Might it be because SA seems to need 200 spams before the
>> > > Bayes filter kicks in? (It performs remarkably well here with
>> > > a corpus of some 450 spams and 700 or so hams.)
>> > >
>> >
>> > That isn't what I asked. I get over a thousand emails per day,
>> > personally. Those are from all the lists I'm on, all the
>> > personal mail, and all the business mail. I assume that Matt's
>> > email is similar. What I'm asking is, how to select 125 per day
>> > out of 1000?
>> >
>> > (I've been going through all my messages each day, manually
>> > moving "ham" to a ham directory and moving "spam" to a spam
>> > directory ... a long and tedious job ... then using that to
>> > train bayes)
>>

Raquel:

I don't know if this is what you want either, but it sounds like it.

Right down at the very bottom of my global procmailrc, I place a recipe that sends a copy of the "HAM" to a special HAM collection folder. The other copy is delivered to the appropriate user mbox. The assumption is that if a message made it through all of the other recipes above, it's HAM.

Same with SPAM: whenever any of the recipes spots a SPAM, a copy goes to a SPAM collection folder.

Then, at midnight, a cron job feeds both HAM and SPAM to sa-learn.

Hope this helps.

Best regards,
Jack L. Stone, Administrator

Sage American
http://www.sage-american.com
[EMAIL PROTECTED]
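
For anyone wanting to try the midnight training step, here is a minimal sketch in Python of what such a cron job might run. It is not the script used at the site above. The folder paths and the function names are hypothetical, and it assumes the two collection folders are in mbox format (hence sa-learn's `--mbox` option).

```python
import subprocess
from pathlib import Path

# Hypothetical collection folders; the post does not name the real paths.
HAM_MBOX = Path("/var/mail/collect/ham")
SPAM_MBOX = Path("/var/mail/collect/spam")


def build_learn_command(kind, mbox):
    """Return the sa-learn command line for one mbox-format folder."""
    if kind not in ("ham", "spam"):
        raise ValueError(f"kind must be 'ham' or 'spam', got {kind!r}")
    # --ham / --spam select the class; --mbox says the input is an mbox file.
    return ["sa-learn", f"--{kind}", "--mbox", str(mbox)]


def train(kind, mbox):
    """Feed one folder to sa-learn; return False if it is missing or empty."""
    mbox = Path(mbox)
    if not mbox.is_file() or mbox.stat().st_size == 0:
        return False
    # check=True raises CalledProcessError if sa-learn exits non-zero.
    subprocess.run(build_learn_command(kind, mbox), check=True)
    return True


if __name__ == "__main__":
    train("ham", HAM_MBOX)
    train("spam", SPAM_MBOX)
```

Saved under a name of your choosing and run from a crontab entry scheduled for `0 0 * * *`, this would feed both folders once a night. The sketch does not empty the folders afterwards; whether and when to rotate them is left to the administrator.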
