From: Matt Kettler <[EMAIL PROTECTED]>
To: "Mooky Mooksgill" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: bayes_path, sa-learn, rules
Date: Tue, 11 May 2004 16:06:21 -0400


At 03:40 PM 5/11/2004, Mooky Mooksgill wrote:
i added a bayes_path to my local.cf, but i don't see any bayes_* files there.... will they only show up once i've run sa-learn?

does your bayes_path statement end in /bayes_ ?

thanks for the feedback and for antidrug!

btw - i still get this one:

Buy your drug of choice, NO prescription required
Today's special: Free overnight Fedex delivery

Vicodin.....................$2.53/dose
Hydrocodone............$2.11/dose
Xanax.......................$2.51/dose
Valium......................$2.61/dose
Phentermine..............$0.85/dose

Stock is limited and selling fast, so hurry
Buy them here


my bayes_path looks like this:

bayes_path /usr/local/etc/mail/spamassassin/bayes (from the apache sa wiki for sitewide bayes db)
but no bayes_ (or any bayes*) files are there - the directory /usr/local/etc/mail/spamassassin/ exists


here are my rules:
-rw-r--r--  1 root  wheel   31906 May 10 22:39 70_sare_adult.cf
-rw-r--r--  1 root  wheel    3927 May 10 22:38 70_sare_bayes_poison_nxm.cf
-rw-r--r--  1 root  wheel   92553 May 10 22:51 70_sare_genlsubj0.cf
-rw-r--r--  1 root  wheel   16840 May 10 22:52 70_sare_genlsubj1.cf
-rw-r--r--  1 root  wheel   10544 May 10 22:52 70_sare_genlsubj2.cf
-rw-r--r--  1 root  wheel   12475 May 10 22:52 70_sare_genlsubj3.cf
-rw-r--r--  1 root  wheel    9271 May 10 22:41 70_sare_oem.cf
-rw-r--r--  1 root  wheel   13937 May 10 22:40 70_sare_random.cf
-rw-r--r--  1 root  wheel    4945 May 10 22:40 70_sare_spoof.cf
-rw-r--r--  1 root  wheel   13193 May 10 22:39 72_sare_bml_post25x.cf
-rw-r--r--  1 root  wheel    9065 May 10 22:43 88_FVGT_body.cf
-rw-r--r--  1 root  wheel    4621 May 10 22:37 88_FVGT_headers.cf
-rw-r--r--  1 root  wheel    8342 May 10 22:42 88_FVGT_rawbody.cf
-rw-r--r--  1 root  wheel    6753 May 10 22:43 88_FVGT_subject.cf
-rw-r--r--  1 root  wheel   11300 May 10 22:44 88_FVGT_uri.cf
-rw-r--r--  1 root  wheel   57580 May 10 22:57 99_FVGT_Tripwire.cf
-rw-r--r--  1 root  wheel   11274 May 10 22:36 99_FVGT_meta.cf
-rw-r--r--  1 root  wheel   10147 May 10 22:40 99_sare_fraud_post25x.cf
-rw-r--r--  1 root  wheel   29003 May 10 22:57 airmax.cf
-rw-r--r--  1 root  wheel   14284 May 10 22:55 antidrug.cf
-rw-r--r--  1 root  wheel   22546 May 10 22:56 backhair.cf
-rw-r--r--  1 root  wheel   68435 May 10 22:56 bigevil.cf
-rw-r--r--  1 root  wheel   81029 May 10 22:56 bogus-virus-warnings.cf
-rw-r--r--  1 root  wheel   23422 May 10 22:56 chickenpox.cf
-rw-r--r--  1 root  wheel   19127 May 10 22:38 coding_html.cf
-rw-r--r--  1 root  wheel   16337 May 10 22:56 evilnumbers.cf
-rw-r--r--  1 root  wheel   10792 May 10 22:39 header_abuse.cf
-rw-r--r--  1 root  wheel    2088 May 11 13:22 local.cf
-rw-r--r--  1 root  wheel     302 May  4 23:05 local.cf.sample
-rw-r--r--  1 root  wheel    3569 May 10 22:57 random.current.cf
-rw-r--r--  1 root  wheel   13201 May 10 22:39 ratware.cf
-rw-r--r--  1 root  wheel  744607 May 10 20:01 sa-blacklist.current.cf
-rw-r--r--  1 root  wheel  213123 May 10 22:57 sa-blacklist.current.uri.cf
-rw-r--r--  1 root  wheel    4007 May 10 22:53 useless.cf
-rw-r--r--  1 root  wheel    3880 May 10 22:57 weeds.cf
-rw-r--r--  1 root  wheel    4629 May 10 22:57 weeds2.cf

would you not use any of these? i think i might adjust some of the scoring, but they have been quite effective.


Note that bayes_path is both a path and the start of a filename. Bayes will fail if the literal path is a directory name alone.
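
To illustrate the point about bayes_path being a filename prefix, here is a minimal Python sketch. It assumes the default DBM-backed Bayes store, which as far as I know appends the suffixes _toks, _seen and _journal to bayes_path. The helper names expected_bayes_files and check_bayes_path are invented for this example and are not part of SpamAssassin.

```python
import os

# Assumption: suffixes appended to bayes_path by the default DBM Bayes store.
BAYES_SUFFIXES = ("_toks", "_seen", "_journal")


def expected_bayes_files(bayes_path):
    """Return the database filenames that would be derived from bayes_path."""
    return [bayes_path + suffix for suffix in BAYES_SUFFIXES]


def check_bayes_path(bayes_path):
    """Return a list of likely problems with a bayes_path setting (hypothetical helper)."""
    problems = []
    # A bare directory leaves no filename prefix to attach the suffixes to.
    if bayes_path.endswith(os.sep) or os.path.isdir(bayes_path):
        problems.append("bayes_path is a directory; add a filename prefix such as 'bayes'")
        return problems
    # The directory part must already exist for the files to be created in it.
    parent = os.path.dirname(bayes_path) or "."
    if not os.path.isdir(parent):
        problems.append("parent directory does not exist: " + parent)
    return problems
```

With the setting quoted above, expected_bayes_files("/usr/local/etc/mail/spamassassin/bayes") lists names that all begin with /usr/local/etc/mail/spamassassin/bayes_, so those are the files to look for in that directory.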




also - is it sufficient to enclose all the escaped spam and ham in respective messages to train SA with, e.g. can sa-learn learn from one message with a bunch of enclosed messages? i will be re-sending the FPs and FNs via outlook to special accounts on the SA server.

sa-learn can only learn properly from ORIGINAL messages, or from SpamAssassin's own encapsulations in the complete original form generated by SA. Forwards, attachments generated by anything other than SpamAssassin itself, etc. are not appropriate training data.

i think i will only train sa on FNs/spam and FPs/ham, since it already knows how to catch what it's caught? yeah?
in this case, i'll need to figure out how to transfer my escaped spams... any ideas? i have them in a folder in outlook.



SA can recognize its own encapsulations, but if you feed it

is it necessary to train sa on mail it's already caught? this is the only case i know of where sa encapsulates msgs.



messages generated by your mail client, sa-learn cannot magically distinguish between a spam message with attachments and a non-spam message that has several spam messages attached to it.

If you want to do some kind of attachment scheme, you'll need an external tool to strip off the attachments and feed them to sa-learn.
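
As a rough illustration of such an external tool, here is a minimal Python sketch that pulls message/rfc822 attachments out of a wrapper message so each one can be saved and handed to sa-learn separately. It assumes the mail client forwarded the originals "as attachment" with their headers intact, which is not guaranteed. The function names and the msg-NNNN.eml file naming are made up for this example, and re-serialising with Python's email package may not reproduce the original bytes exactly.

```python
import email
import os


def extract_attached_messages(msg):
    """Collect attached messages (message/rfc822 parts) as raw bytes.

    Recursion stops at each attached message, so anything nested inside
    an extracted message stays part of that message.
    """
    found = []
    if msg.get_content_type() == "message/rfc822":
        payload = msg.get_payload()
        # The parser stores an attached message as a one-element list.
        if isinstance(payload, list) and payload:
            found.append(payload[0].as_bytes())
        return found
    if msg.is_multipart():
        for part in msg.get_payload():
            found.extend(extract_attached_messages(part))
    return found


def write_messages(messages, out_dir):
    """Write each extracted message to its own file and return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, data in enumerate(messages, start=1):
        path = os.path.join(out_dir, "msg-%04d.eml" % index)
        with open(path, "wb") as handle:
            handle.write(data)
        paths.append(path)
    return paths
```

The wrapper would be parsed with email.message_from_bytes() and the output directory then passed to sa-learn --spam (missed spam) or sa-learn --ham (false positives). Whether the result is good training data still depends on the mail client having preserved the original message.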

i have started using rules like backhair, antidrug, etc. i have about 20 rules in there, is it bad to use so many? i've definitely noticed reduced spam getting through, but had to whitelist_from on some FPs...

Not really, unless you run into load-average and/or false-positive problems. It's really a matter of configuring SA to meet _your_ needs.


Disclaimer: I wrote antidrug, thus have some natural bias.



