Hi,

>>>> http://pastebin.com/raw.php?i=1Y5QCkfh
>>>> http://pastebin.com/raw.php?i=KdmZXM0d
...
>> What I haven't been able to figure out is a more generalized pattern
>> from these, such as something in the header that is inconsistent with
>> non-spam or contains some type of invalid header data, such as the
>> mismatch between having originated at yahoo but being sent as
>> sbcglobal?
>>
>> Shouldn't have bayes picked this up after learning a dozen or more of
>> these?
>>
>
> IMHO, yes. Are you sure you are training bayes correctly. Are you using the
> same user to train bayes as the user that is running SA? Work through some
> of the advice already given regarding bayes.

Yes, I'm pretty sure bayes is solid. I'm autolearning, but with thresholds
of -1 (ham) and 13 (spam) instead of the defaults, and the database holds
about 9.7M tokens. I could probably return the thresholds to the defaults
now, since it has been running for a while.

Bayes is in mysql and I have bayes_sql_username set, so it always uses that
database, and there aren't any other databases.

I am wondering why, even after a sync, the sync isn't reflected in the dump
below (last journal sync atime is still 0). Perhaps that's due to mysql?

$ sa-learn --sync
$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0     383489          0  non-token data: nspam
0.000          0     484418          0  non-token data: nham
0.000          0    9768178          0  non-token data: ntokens
0.000          0 1316487858          0  non-token data: oldest atime
0.000          0 1325550621          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0 1325550621          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0     982370          0  non-token data: last expire reduction count

Does the mysql bayes even have a journal, or is it managed by mysql?

> sanjit.in is now listed in a couple URIBLs (URIBL_PH_SURBL &
> URIBL_HOSTKARMA_BL) - don't know if it was listed at the time you received
> them.

Yes, it is listed for me too.

> They hit some local meta rules I have combining FREEMAIL_FROM with
> __HAS_ANY_URI, __MANY_RECIPS, and various missing/blank subject rules. For
> me these are relatively good indicators of FREEMAIL spam.

Yes, the missing/blank subject rules are good triggers for metas.

RW rwmailli...@googlemail.com wrote:
> RP_MATCHES_RCVD=-1.613,
> IIWY I'd take a look at how RP_MATCHES_RCVD is working for you. A lot
> of us find it does more harm than good. In particular it's adding a
> negative score to AOL, Yahoo, etc.

That's a good idea. I noticed that one of the messages hit this rule and the
other didn't. I'm not sure I can remove it altogether, because so many ham
messages hit it on my system. Maybe a meta can be built from it instead,
such as one that combines it with FREEMAIL_FROM and adds points when the
message does not hit RP_MATCHES_RCVD (i.e. from doesn't match recvd)?
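
A rough, untested sketch of what such a meta might look like in local.cf.
The rule name LOCAL_FREEMAIL_NO_RP_MATCH and the 0.5 score are placeholders
made up for illustration, not stock rules or tuned values; FREEMAIL_FROM and
RP_MATCHES_RCVD are the existing rules discussed above.

```
# Hypothetical local rule: the name and score are placeholders, not tuned values.
# Fires when the sender is at a freemail domain and RP_MATCHES_RCVD did not hit.
meta     LOCAL_FREEMAIL_NO_RP_MATCH  (FREEMAIL_FROM && !RP_MATCHES_RCVD)
describe LOCAL_FREEMAIL_NO_RP_MATCH  Freemail sender without RP_MATCHES_RCVD
score    LOCAL_FREEMAIL_NO_RP_MATCH  0.5
```

Something like this would probably want to start at a low score, be checked
with "spamassassin --lint", and be watched against ham for a while before
the score is raised.
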

Thanks again,
Alex