Depends on the client.

For instance, Thunderbird stores its folders in mbox format, so sa-learn can work against those files as-is. Other email clients can save emails in text format, complete with headers.

I use Thunderbird. There are two files for that folder: Junk.msf (7k) and Junk (53.172k). The .msf file must be some kind of index. Do I just feed the larger one to sa-learn?

Yes, the .msf file is an index file. I just copy the mbox file (Junk in your case) to the server and run the following command, giving the filename as the last argument:

/usr/local/bin/spamassassin --report --mbox Junk
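
Since the question was about sa-learn, here is a minimal sketch of the training equivalent. It assumes sa-learn is on the PATH and that the copied file is named Junk as above. The count_messages helper is a hypothetical name used only for illustration; --spam and --mbox are standard sa-learn options.

```shell
#!/bin/sh
# Sketch only: train SpamAssassin's Bayes database from a Thunderbird mbox file.
# MBOX defaults to "Junk" (the file name used in this thread); pass another path as $1.
MBOX="${1:-Junk}"

# Hypothetical helper: count messages by counting mbox "From " separator lines.
count_messages() {
    grep -c '^From ' "$1"
}

if [ -f "$MBOX" ]; then
    echo "Messages in $MBOX: $(count_messages "$MBOX")"
    # Only call sa-learn if it is actually installed on this machine.
    if command -v sa-learn >/dev/null 2>&1; then
        sa-learn --spam --mbox "$MBOX"
    else
        echo "sa-learn not found on PATH" >&2
    fi
else
    echo "No such file: $MBOX" >&2
fi
```

Printing the message count first is a cheap way to confirm that the file really is the mbox and not the .msf index before training on it.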


I use Thunderbird as my mail client but have found that I needed to use Evolution to save the messages in mbox format, which was always a hassle.

My emails are stored on an IMAP server, and what you suggested wasn't working for me. I had the .msf file, but no corresponding mbox file. Because the emails are kept on the IMAP server and are not local, I had to enable the "Select this folder for offline use" option on the "Offline" tab of the folder properties. Once the folder had been downloaded, the mbox file was there for me to copy off.

--
Mark Johnson
http://www.astroshapes.com/information-technology/blog/
