Re: Feeding SA-learn

Diego Pomatta Wed, 23 Jan 2008 04:01:59 -0800

Anthony Peacock wrote:

Well, the short answer is yes, you can.
The slightly longer answer is that you won't get results as good doing this, because the Bayes system uses tokens found in the complete message. By learning only on the body you will not gain any advantage from tokens found in the headers.
Yep, I know. The problem is precisely that I don't have the original headers after the mail has been delivered. My intention was to manually feed in the few spam messages that slip through undetected. By the time I get hold of those, they are in the recipient's mail client inbox, not on the server. I was thinking: if I save the mail as EML files, would that preserve the headers in a way that sa-learn can parse correctly?
Depends on the client.
For instance, Thunderbird stores its folders in mbox format, so sa-learn can work against those files as-is. Other email clients can save emails in text format complete with headers.
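
As a rough sketch of the mbox approach described above: the Python snippet below first checks how many messages in an mbox file still carry Received headers, then builds an sa-learn call using its --spam (or --ham) and --mbox options. The function names and the idea of pre-checking for Received headers are assumptions made for illustration, not part of sa-learn itself; the file path is whatever your mail client uses for the folder.

```python
import mailbox
import shutil
import subprocess


def summarize_mbox(path):
    """Return (total_messages, messages_with_a_Received_header)."""
    # create=False: raise an error instead of creating an empty file
    # if the path is wrong.
    box = mailbox.mbox(str(path), create=False)
    try:
        total = 0
        with_received = 0
        for msg in box:
            total += 1
            if msg.get_all("Received"):
                with_received += 1
        return total, with_received
    finally:
        box.close()


def sa_learn_command(path, spam=True):
    """Build the sa-learn argument list for one mbox file."""
    kind = "--spam" if spam else "--ham"
    return ["sa-learn", kind, "--mbox", str(path)]


def learn(path, spam=True):
    """Run sa-learn on the mbox file if it is installed (hypothetical helper)."""
    cmd = sa_learn_command(path, spam)
    if shutil.which("sa-learn") is None:
        print("sa-learn not found; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)
```

Calling summarize_mbox on the folder file before learn gives a quick indication of whether the client preserved the headers; if few messages have a Received header, the Bayes database will mostly be learning from bodies only.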

I use Thunderbird. There are two files for that folder: Junk.msf (7k) and Junk (53.172k). The msf file must be some kind of index. Do I just feed the larger one to sa-learn?

/Regards
