Depends on the client.
For instance, Thunderbird stores its folders in mbox format, so
sa-learn can work against those files as-is. Other email clients can
save emails in text format complete with headers.
I use Thunderbird. There are two files for that folder: Junk.msf (7k)
and Junk (53.172k). The msf file must be some kind of index. Do I just
feed the larger one to sa-learn?
Yes, the .msf file is an index file. I just copy the mbox file (Junk in
your case) to the server and run the following command, specifying the
filename as shown:
/usr/local/bin/spamassassin --report --mbox Junk
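Since the original question was about sa-learn, here is a minimal sketch of
training from the same file rather than reporting it. It assumes sa-learn is
on the PATH and that the copied file is a plain mbox. The helper names
(looks_like_mbox, count_mbox_messages, learn_spam_mbox) are made up for
illustration. The message count is only an approximation based on "From "
separator lines.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking a Thunderbird folder file
# before handing it to sa-learn. Not part of SpamAssassin itself.

# Succeeds if the first line starts with "From ", as an mbox file should.
looks_like_mbox() {
    head -n 1 "$1" | grep -q '^From '
}

# Prints the number of "From " separator lines (roughly the message count).
count_mbox_messages() {
    grep -c '^From ' "$1"
}

# Checks the file, reports the approximate count, then trains it as spam.
# Assumes sa-learn is installed; --spam and --mbox are standard sa-learn options.
learn_spam_mbox() {
    if ! looks_like_mbox "$1"; then
        echo "$1 does not look like an mbox file" >&2
        return 1
    fi
    echo "About $(count_mbox_messages "$1") messages in $1"
    sa-learn --spam --mbox "$1"
}
```

After sourcing the file, something like `learn_spam_mbox Junk` would train on
the whole folder. Feeding it Junk.msf by mistake should be rejected by the
first check, since the index file is not in mbox format.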
I use Thunderbird as my mail client but have found that I needed to use
Evolution to save the messages in mbox format, which was always a hassle.
My emails are stored on an IMAP server and what you suggested wasn't
working for me. I had the .msf file, but no corresponding mbox file.
Because the emails are kept on the IMAP server and are not local, I had
to enable the "Select this folder for offline use" option on the "Offline"
tab of the folder properties. I then had an mbox file that I could copy off.
--
Mark Johnson
http://www.astroshapes.com/information-technology/blog/