From: Matt Kettler <[EMAIL PROTECTED]>
To: "Mooky Mooksgill" <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: bayes_path, sa-learn, rules
Date: Tue, 11 May 2004 16:06:21 -0400
At 03:40 PM 5/11/2004, Mooky Mooksgill wrote:
i added a bayes_path to my local.cf, but i don't see any bayes_* files there.... will they only show up once i've run sa-learn?
does your bayes_path statement end in /bayes_ ?
thanks for the feedback and for antidrug!
btw - i still get this one:
Buy your drug of choice, NO prescription required
Today's special: Free overnight Fedex delivery
Vicodin.....................$2.53/dose
Hydrocodone............$2.11/dose
Xanax.......................$2.51/dose
Valium......................$2.61/dose
Phentermine..............$0.85/dose
Stock is limited and selling fast, so hurry
Buy them here
my bayes_path looks like this:
bayes_path /usr/local/etc/mail/spamassassin/bayes (from the apache sa wiki for sitewide bayes db)
but no bayes_ (or any bayes*) files are there - the directory /usr/local/etc/mail/spamassassin/ exists
here are my rules:
-rw-r--r-- 1 root wheel 31906 May 10 22:39 70_sare_adult.cf
-rw-r--r-- 1 root wheel 3927 May 10 22:38 70_sare_bayes_poison_nxm.cf
-rw-r--r-- 1 root wheel 92553 May 10 22:51 70_sare_genlsubj0.cf
-rw-r--r-- 1 root wheel 16840 May 10 22:52 70_sare_genlsubj1.cf
-rw-r--r-- 1 root wheel 10544 May 10 22:52 70_sare_genlsubj2.cf
-rw-r--r-- 1 root wheel 12475 May 10 22:52 70_sare_genlsubj3.cf
-rw-r--r-- 1 root wheel 9271 May 10 22:41 70_sare_oem.cf
-rw-r--r-- 1 root wheel 13937 May 10 22:40 70_sare_random.cf
-rw-r--r-- 1 root wheel 4945 May 10 22:40 70_sare_spoof.cf
-rw-r--r-- 1 root wheel 13193 May 10 22:39 72_sare_bml_post25x.cf
-rw-r--r-- 1 root wheel 9065 May 10 22:43 88_FVGT_body.cf
-rw-r--r-- 1 root wheel 4621 May 10 22:37 88_FVGT_headers.cf
-rw-r--r-- 1 root wheel 8342 May 10 22:42 88_FVGT_rawbody.cf
-rw-r--r-- 1 root wheel 6753 May 10 22:43 88_FVGT_subject.cf
-rw-r--r-- 1 root wheel 11300 May 10 22:44 88_FVGT_uri.cf
-rw-r--r-- 1 root wheel 57580 May 10 22:57 99_FVGT_Tripwire.cf
-rw-r--r-- 1 root wheel 11274 May 10 22:36 99_FVGT_meta.cf
-rw-r--r-- 1 root wheel 10147 May 10 22:40 99_sare_fraud_post25x.cf
-rw-r--r-- 1 root wheel 29003 May 10 22:57 airmax.cf
-rw-r--r-- 1 root wheel 14284 May 10 22:55 antidrug.cf
-rw-r--r-- 1 root wheel 22546 May 10 22:56 backhair.cf
-rw-r--r-- 1 root wheel 68435 May 10 22:56 bigevil.cf
-rw-r--r-- 1 root wheel 81029 May 10 22:56 bogus-virus-warnings.cf
-rw-r--r-- 1 root wheel 23422 May 10 22:56 chickenpox.cf
-rw-r--r-- 1 root wheel 19127 May 10 22:38 coding_html.cf
-rw-r--r-- 1 root wheel 16337 May 10 22:56 evilnumbers.cf
-rw-r--r-- 1 root wheel 10792 May 10 22:39 header_abuse.cf
-rw-r--r-- 1 root wheel 2088 May 11 13:22 local.cf
-rw-r--r-- 1 root wheel 302 May 4 23:05 local.cf.sample
-rw-r--r-- 1 root wheel 3569 May 10 22:57 random.current.cf
-rw-r--r-- 1 root wheel 13201 May 10 22:39 ratware.cf
-rw-r--r-- 1 root wheel 744607 May 10 20:01 sa-blacklist.current.cf
-rw-r--r-- 1 root wheel 213123 May 10 22:57 sa-blacklist.current.uri.cf
-rw-r--r-- 1 root wheel 4007 May 10 22:53 useless.cf
-rw-r--r-- 1 root wheel 3880 May 10 22:57 weeds.cf
-rw-r--r-- 1 root wheel 4629 May 10 22:57 weeds2.cf
would you not use any of these? i think i might adjust some of the scoring, but they have been quite effective.
Note that bayes_path is both a path and the start of a filename. Bayes will fail if the literal path is a directory name alone.
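
To illustrate the path-plus-prefix behaviour described above, here is a minimal Python sketch. It assumes the default DB-backed Bayes store, which appends suffixes such as _toks and _seen to bayes_path; the helper names expected_bayes_files and check_bayes_path are hypothetical and not part of SpamAssassin.

```python
import os

# Suffixes appended to bayes_path by the default DB-backed Bayes store.
# Assumption: other files (e.g. a journal) may also appear; only these two are listed.
BAYES_SUFFIXES = ("_toks", "_seen")


def expected_bayes_files(bayes_path):
    """Return the database filenames that share the bayes_path prefix."""
    return [bayes_path + suffix for suffix in BAYES_SUFFIXES]


def check_bayes_path(bayes_path):
    """Return a list of human-readable problems with a bayes_path value."""
    problems = []
    # A bare directory name leaves no filename prefix for the databases.
    if os.path.isdir(bayes_path):
        problems.append("bayes_path names a directory; it needs a filename prefix too")
    # The directory part must already exist (a relative prefix uses the current directory).
    parent = os.path.dirname(bayes_path) or "."
    if not os.path.isdir(parent):
        problems.append("parent directory does not exist: " + parent)
    return problems


if __name__ == "__main__":
    path = "/usr/local/etc/mail/spamassassin/bayes"
    for name in expected_bayes_files(path):
        print(name)
    for problem in check_bayes_path(path):
        print("warning:", problem)
```

This only checks the shape of the setting. It does not check whether the user running SpamAssassin can write to that directory, which is a separate thing to verify.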
also - is it sufficient to enclose all the escaped spam and ham into respective messages to train SA with, eg. can sa-learn learn from one message with a bunch of enclosed messages? i will be re-sending the FPs and FNs via outlook to special accounts on the SA server.
sa-learn can only learn properly from ORIGINAL messages, or from spamassassin's own encapsulations in their complete original form as generated by SA. Forwards, attachments generated by anything but spamassassin itself, etc. are not appropriate training data.
i think i will only train sa on FNs/spam and FPs/ham, since it already knows how to catch what it's caught? yeah?
in this case, i'll need to figure out how to transfer my escaped spams... any ideas? i have them in a folder in outlook.
is it necessary to train sa on mail its already caught? this is the only case i know where sa encapsulates msgs.
SA can recognize its own encapsulations, but if you feed it messages generated by your mail client, sa-learn cannot magically distinguish between a spam message with attachments and a non-spam message that has several spam messages attached to it.
If you want to do some kind of attachment scheme, you'll need an external tool to strip off the attachments and feed them to sa-learn.
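
As a rough illustration of such an external tool, the Python sketch below pulls every message/rfc822 attachment out of a wrapper message so that each one can be saved to its own file. The function name extract_attached_messages is hypothetical. It assumes the mail client attached the complete original message, headers included; if the client rewrote or dropped headers when forwarding, the result is still not proper training data.

```python
import email
from typing import List


def extract_attached_messages(raw: bytes) -> List[bytes]:
    """Return each message/rfc822 attachment found in `raw`, re-serialized as bytes."""
    outer = email.message_from_bytes(raw)
    extracted = []
    # walk() also descends into attached messages, so a message attached
    # inside another attachment is returned as well.
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            payload = part.get_payload()
            # For message/rfc822 the payload is a one-element list holding the inner message.
            if isinstance(payload, list) and payload:
                extracted.append(payload[0].as_bytes())
    return extracted
```

Each returned item can be written to a separate file in a directory, and that directory handed to sa-learn with --spam or --ham as appropriate. The re-serialized bytes may differ slightly from what originally arrived, so treat this as a starting point, not a guarantee.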
i have started using rules like backhair, antidrug, etc. i have about 20 rules in there, is it bad to use so many? i've definitely noticed reduced spam getting through, but had to whitelist_from on some FPs...
Not really, unless you get load average and/or false positive problems. It's really a matter of configuring SA to meet _your_ needs.
Disclaimer: I wrote antidrug, thus have some natural bias.
