Diego Pomatta wrote:
John Thompson wrote:
On 2008-01-23, Diego Pomatta <[EMAIL PROTECTED]> wrote:

I use Thunderbird. There are two files for that folder: Junk.msf (7k) and Junk (53.172k). The msf file must be some kind of index. Do I just feed the bigger one to sa-learn?

Yup. Use "sa-learn --spam --mbox Junk" to learn your spam. You'll want to use the "--mbox" switch so sa-learn will process it as an mbox format mailbox, since that's what Thunderbird uses to store mail.

~/sa-learn --spam --mbox Junk
Learned tokens from 7 message(s) (7 message(s) examined)

Looks like it worked feeding it the entire Thunderbird Junk folder file. :)
Thanks all.
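
For anyone repeating this, a quick sanity check is to compare the number of messages in the mbox file with the "message(s) examined" figure that sa-learn prints. The sketch below is only an illustration: count_mbox_messages and learn_spam_mbox are hypothetical helper names, not part of SpamAssassin, and the count assumes every message starts with a "From " separator line, so it can be off if a message body contains an unescaped line starting with "From ".

```shell
#!/bin/sh
# Hypothetical helper: count mbox separator lines ("From " at the start of a line).
# Approximate: an unescaped "From " line inside a message body inflates the count.
count_mbox_messages() {
    grep -c '^From ' "$1"
}

# Hypothetical helper: print the separator count, then learn the mbox as spam.
# Assumes sa-learn is installed and on PATH.
learn_spam_mbox() {
    mbox="$1"
    if [ ! -f "$mbox" ]; then
        echo "no such file: $mbox" >&2
        return 1
    fi
    echo "Separator lines found: $(count_mbox_messages "$mbox")"
    sa-learn --spam --mbox "$mbox"
}
```

Run it as "learn_spam_mbox Junk" from the directory holding the Thunderbird folder file. Afterwards "sa-learn --dump magic" should show the spam counter (nspam) in the Bayes database.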

Btw, what's the difference between using "sa-learn --spam..." and "spamassassin --report..." like Anthony said?

From:

http://spamassassin.apache.org/full/3.2.x/doc/spamassassin-run.html

"-r, --report
Report this message as manually-verified spam. This will submit the mail message read from STDIN to various spam-blocker databases. Currently, these are the Distributed Checksum Clearinghouse http://www.rhyolite.com/anti-spam/dcc/, Pyzor http://pyzor.sourceforge.net/, Vipul's Razor http://razor.sourceforge.net/, and SpamCop http://www.spamcop.net/.

If the message contains SpamAssassin markup, the markup will be stripped out automatically before submission. The support modules for DCC, Pyzor, and Razor must be installed for spam to be reported to each service. SpamCop reports will have greater effect if you register and set the spamcop_to_address option.

The message will also be submitted to SpamAssassin's learning systems; currently this is the internal Bayesian statistical-filtering system (the BAYES rules). (Note that if you only want to perform statistical learning, and do not want to report mail to third-parties, you should use the sa-learn command directly instead.)"

So --report teaches the Bayesian system too, but it also submits the message to third-party systems like DCC and SpamCop. sa-learn only does the local learning.
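
Side by side, the two choices might look like the sketch below. train_spam is a hypothetical wrapper name used only for illustration. It assumes the message is a single mail saved to a file, and that spamassassin and sa-learn are on PATH.

```shell
#!/bin/sh
# Hypothetical wrapper: choose between local-only learning and learning plus reporting.
#   local  -> sa-learn: trains the Bayes database only, nothing is sent out
#   report -> spamassassin --report: trains Bayes and also submits the message to
#             DCC, Pyzor, Razor and SpamCop (where their support modules are installed)
train_spam() {
    mode="$1"
    msg="$2"
    case "$mode" in
        local)  sa-learn --spam "$msg" ;;
        report) spamassassin --report < "$msg" ;;
        *)      echo "usage: train_spam local|report FILE" >&2
                return 2 ;;
    esac
}
```

For example, "train_spam local spam.eml" keeps everything on your own machine, while "train_spam report spam.eml" also shares the message with the third-party databases.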

--
Anthony Peacock
CHIME, Royal Free & University College Medical School
WWW:    http://www.chime.ucl.ac.uk/~rmhiajp/
"A CAT scan should take less time than a PET scan.  For a CAT scan,
 they're only looking for one thing, whereas a PET scan could result in
 a lot of things."    - Carl Princi, 2002/07/19
