Lars Troen wrote:
> I have one external email account that receives some spam on a host that
> is not using ASSP. I'm not an admin on this network and using Spam
> Assassin on this box. SA has also been a very successful solution until
> recently. Lately there have been emails coming in that are not typical
> spam, but have quite legitimate text, looking much similar to a normal
> email conversation, but what I still consider spam. I also received one
> such email where the majority of the text was pasted from the GNU Public
> License.
>
> I'm wondering if this is a new initiative by the spammers trying to
> corrupt the bayesian databases. I know also many commercial grade anti
> spam solutions have had some troubles lately with certain kinds of spam.
>
> ASSP seems to do its job excellently and classifies these emails as
> spam (as it normally should: it *is* after all spam). But I'm fearing
> that over time and given enough of these spam emails the bayesian
> spam-database might be weighted in a way that can block normal legal
> email.
>
> ASSP still seems to be the ultimate antispam product! :)
>
> Lars

I believe that's called Bayesian poisoning (search Wikipedia for "Bayesian poisoning"). It's a touchy subject for many, and it's not very new; it's just becoming more widespread. Spammers send a lot of email with "legitimate" text (I've seen excerpts from books, websites copied verbatim, and some of it is handwritten with very bad grammar) in the hopes of changing the way your server classifies spam, and thus letting their messages through.
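To make the mechanism a bit more concrete, here is a rough Python sketch of how the spam estimate for an ordinary word can drift once spam stuffed with ordinary text lands on the spam side of a corpus. The scoring formula is a deliberately simplified per-word estimate, not the one ASSP or SpamAssassin actually uses, and the sample messages and function names are made up for illustration.

```python
from collections import Counter


def train(messages):
    """Count, per word, how many messages contain that word."""
    counts = Counter()
    for msg in messages:
        counts.update(set(msg.lower().split()))
    return counts


def spam_probability(word, spam_counts, ham_counts, n_spam, n_ham):
    """Simplified estimate: share of the word's frequency that comes from spam."""
    spam_freq = spam_counts[word] / n_spam
    ham_freq = ham_counts[word] / n_ham
    if spam_freq + ham_freq == 0:
        return 0.5  # word never seen: treat as neutral
    return spam_freq / (spam_freq + ham_freq)


# Hypothetical toy corpus (assumption, not real mail).
ham = ["meeting moved to friday", "please review the license terms", "lunch on friday"]
spam = ["cheap pills online", "win cheap prizes now", "cheap watches"]

before = spam_probability("license", train(spam), train(ham), len(spam), len(ham))

# Spam padded with legitimate-looking text gets filed as spam.
poisoned_spam = spam + ["cheap pills license terms friday meeting"] * 3
after = spam_probability(
    "license", train(poisoned_spam), train(ham), len(poisoned_spam), len(ham)
)

print(f"'license' before poisoning: {before:.2f}")
print(f"'license' after poisoning:  {after:.2f}")
```

In this toy setup the word "license" starts out looking entirely harmless and ends up leaning towards spam, which is the kind of skew that can eventually make normal mail score badly.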
It can sometimes be tricky to handle. I use the CCspam option to send a copy of all the spam to a mailbox, which I review when I have an hour or two free. If I see a lot of it in a short period of time, I usually open the headers of the messages and see if they come from the same IP range. If they do, they get added to the denysmtpconnection list. You can also redlist certain addresses that get the email so they do not contribute to the corpus.

Micheal Espinola had some thoughts on a related subject last week that turned into a very long thread. It was originally titled "[Assp-user] PHP tool to view and Sort your Spam/Notspam" and was somewhat hijacked from the original topic :) .

One of the nice things about ASSP is that it refreshes the corpus constantly. Once it reaches maxFiles, the corpus messages are overwritten randomly, so the "poisoning" never lasts longer than it takes you to get new messages for the corpus. As long as you slow or stop the flow, the corpus will recover.

Most corporate solutions (personal experience speaking here) will keep the same corpus forever and just keep adding to it. You can see how email like you described would be an issue with that approach: with enough fake messages they can change what it calls spam, with the end result being that the corpus is falsely weighted and starts categorizing email wrong (aka letting their email through). In my experience, unless measures are taken to stop this, it will spiral out of control and render the corpus useless. This was the main reason I started using ASSP.

Phew, that got long fast. Hope it answers your question(s).

Kevin
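P.S. For anyone who wants to see why a capped, randomly overwritten corpus recovers while an append-only one does not, here is a small Python sketch. It only models the behaviour described above (random overwrite once a cap is reached); it is not ASSP's code, and `add_to_corpus`, the cap of 100, and the message counts are all assumptions picked for the example.

```python
import random


def add_to_corpus(corpus, message, max_files, rng):
    """Add a message; once the cap is reached, overwrite a random existing slot."""
    if len(corpus) < max_files:
        corpus.append(message)
    else:
        corpus[rng.randrange(max_files)] = message


rng = random.Random(0)  # fixed seed so runs are repeatable
max_files = 100         # hypothetical cap, stands in for a maxFiles-style limit

# Worst case: the capped corpus starts out completely poisoned.
corpus = []
for _ in range(max_files):
    add_to_corpus(corpus, "poisoned", max_files, rng)

# The flow of poisoned mail stops and 500 clean messages arrive.
for _ in range(500):
    add_to_corpus(corpus, "clean", max_files, rng)

# An append-only corpus seeing the same traffic keeps every poisoned message.
append_only = ["poisoned"] * max_files + ["clean"] * 500

remaining = corpus.count("poisoned")
print(f"capped corpus:      {remaining} poisoned of {len(corpus)}")
print(f"append-only corpus: {append_only.count('poisoned')} poisoned of {len(append_only)}")
```

Each slot in the capped corpus has only a small chance of surviving hundreds of random overwrites, so the poisoned share shrinks towards zero, whereas the append-only corpus can only dilute the bad messages and never drops them.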
_______________________________________________
Assp-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-user
