Adam, I'm sure everyone else who replies will say basically the same thing as me, but here's my input on SA and your management's questions.
> Hash Busting - slightly modify each copy of message to foil
> 'fingerprinting' techniques

AFAIK, the fingerprinting techniques are "fuzzy" and can withstand a little bit of abuse.

> Bayes Poisoning - addition of random dictionary words

My only experience with Bayes poisoning has been from this list :) By that, I mean that mail on this list talks about spam so much that my Bayes db got almost reversed. I'll talk more about this later (there's also a config sketch for it in the P.S.).

> Hidden Text - using invisible text in html messages

SA has specific rules for this.

> Keyword Corruption - using obfuscated text to hide keywords

SA has specific rules for this.

> Tiny Messages - messages with only URL or image

SA has specific rules for this.

What I like about SA is that no single rule, or small subset of rules, will by itself trigger a mail being labeled as spam; it's the combined score that matters. That held true even when my Bayes db got poisoned by this list.

I was experimenting with tagging low spam scores with '*** LIKELY SPAM ***' in the subject (sketch in the P.S.) because my anal-retentive users would complain very loudly if anything was marked as a false positive. What irritated me about these complaints is that every mail labeled this way scored just barely as spam, was _solicited_ bulk email, looked like spam to me, and used many of the same tools and tricks that real spammers use.

What I do now is set my threshold score high (10), and I use custom spam and ham rules as well as a 3rd-party plugin to raise scores (again, sketch in the P.S.). My average spam score is 20 or above. I don't have hard data, but the number of missed spams is very low: fewer than 10 since SA 3.0 came out. I have had zero false positives for my own mailbox, and the first false positive for one of my users came today, from a mail that was very borderline and would not have been missed if it had not been delivered.

My only real issue with SA is that it does not appear to scale very well. I have not hit this personally, because the domain I run SA on does not have very high mail traffic, but it does appear to be an issue, and the workarounds (skipping some tests) come at the expense of filtering accuracy.

OK, one more issue, but a small one (I'm pretty picky): the scores for some of the rules do not always seem right. There are high scores for things that seem pretty benign, and low scores for things that look almost exclusively like spam (such as forged headers or mismatched IPs). I know these scores are derived objectively, from a corpus of ham and spam and an algorithm that assigns scores accordingly, but some of them still seem wrong to me. Maybe scores and rules could be autolearned, like Bayes. Not sure.

That's my input for your managers.

Mike

-- 
/-----------------------------------------\
| Michael Barnes <[EMAIL PROTECTED]>       |
| UNIX Systems Administrator               |
| College of William and Mary              |
| Phone: (757) 879-3930                    |
\-----------------------------------------/
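P.S. A few local.cf sketches to put specifics behind the stories above. They are illustrative, not copies of my actual config, so treat the exact values, patterns, and rule names as made up.

On the Bayes reversal: one way to keep list traffic like this from mis-training the db (assuming SA 3.x, where these options live in local.cf) is to turn off auto-learning and feed the db only by hand with sa-learn:

    # keep Bayes on, but stop SA from training itself on
    # borderline list traffic; train manually with sa-learn instead
    use_bayes        1
    bayes_auto_learn 0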
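The tagging experiment was roughly this: leave the threshold at the stock 5.0 and rewrite the Subject of anything that crosses it (rewrite_header is the SA 3.x directive for this; the tag string is the one I used):

    # tag borderline spam in the subject instead of filing it away
    required_score 5.0
    rewrite_header Subject *** LIKELY SPAM ***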
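And my current setup, in sketch form: raise the bar to 10, then add local rules that push real spam well past it. The rule name and pattern below are invented for illustration only; my real rules match phrases that only ever show up in spam here:

    # high threshold: nothing gets tagged below a score of 10
    required_score 10

    # example of a local score-raising rule (hypothetical name/pattern)
    body     LOCAL_SPAMMY_PHRASE /as seen on tv/i
    describe LOCAL_SPAMMY_PHRASE Phrase that only shows up in spam here
    score    LOCAL_SPAMMY_PHRASE 3.0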