On this whole discussion: I read an article that pointed me to an interesting fact, one that has held up extremely well under my own (albeit limited) investigations. Here is the article:
http://www.colinfahey.com/spam_topics/spam_the_phenomenon.htm

The author proposes a solution that I do not see as very workable, for a number of reasons. However, he touches on a fact that I find useful: a lot of spam nowadays advertises links to 'throwaway' domains. I did some quick research and found that the emails spamassassin wasn't catching had a very high percentage of links to domains that were under a month old, most under a week old.

This is in line with what makes spam so hard to filter. Spammers constantly change their tactics so that there is no consistent pattern to filter on. In this case, however, because they keep changing URLs, the domain age acts as a large red flag. I have started thinking about writing a spamassassin plugin that checks this.

I think there are some basic problems with this design. The first one that probably pops into most people's minds is that people will be penalized for having a new domain name, i.e. they will have problems sending mail to people using this plugin on their mail servers. Also, if this becomes commonplace, spammers can compensate by just waiting a month or two or more before using a throwaway domain.

Both of the above are true and will become more of an issue as such a solution becomes more widely used. The first can be handled by whitelisting individual domains on a per-site basis. I handle email for a medium-sized corporate environment with some email contact with customers. This may have some impact there, but it can be minimized by not enabling spamassassin on certain critical service emails and by strategic whitelisting. I think the potential benefits outweigh the impact.

On the second concern, I think this is kind of the nature of the game.
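As a rough illustration of the domain-age check and per-site whitelist described above, here is a minimal sketch in Python. It is not the plugin itself (a real SpamAssassin plugin would be written in Perl), and everything named here is hypothetical: the function name, the 30-day threshold (taken loosely from the "under a month old" observation), and the choice not to penalize a message when no creation date is available. The whois lookup is left out; the creation date is simply passed in.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Set

# Assumed threshold, based on the "under a month old" observation.
MIN_DOMAIN_AGE = timedelta(days=30)


def is_suspiciously_new(
    domain: str,
    created: Optional[datetime],
    now: datetime,
    whitelist: Set[str],
    min_age: timedelta = MIN_DOMAIN_AGE,
) -> bool:
    """Return True if a linked domain is younger than min_age and not whitelisted.

    `created` is the registration date obtained elsewhere (e.g. from whois).
    None means the date is unknown; in this sketch that is not penalized.
    """
    # Normalize case and a trailing dot so whitelist matching is consistent.
    domain = domain.lower().rstrip(".")
    if domain in whitelist:
        return False  # per-site whitelist overrides the age check
    if created is None:
        return False  # assumption: an unknown age is not treated as a red flag
    return (now - created) < min_age
```

A site would call this once per distinct domain linked in a message and add a score when it returns True; how much weight that score deserves is a tuning question this sketch does not address.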
Until social and economic pressures are brought to bear so that spamming is no longer a profitable activity, spammers will continue to evolve their tactics to counter those of spam filterers. This throws a wrench in their plans for a bit and will likely catch at least somewhat more spam than not using the method.

The largest technical concern I have is the availability of whois servers. I have already decided to include a caching database in the design of this tool, holding entries so that whois servers are not queried for every piece of email processed by the mail server. This is very similar in concept to DNS caching. If this evolves into a multiple-site deployment, such a caching database could be centralized, keeping the load on whois servers low. However, I am curious whether whois servers have blocking mechanisms that engage if they get too many requests from a single IP in a certain amount of time. If anyone has data on this, let me know. Otherwise, I will use trial and error to determine whether such mechanisms exist and have an effect.

Input on this idea is welcome. Feel free to email me directly about it. Individuals who want to assist are welcome as well.

Joe Gilbert

-----Original Message-----
From: John Andersen [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 14, 2004 7:08 PM
To: [EMAIL PROTECTED]
Subject: Re: [Razor-users] Poor detection ratio

On Wednesday 14 April 2004 15:31, [EMAIL PROTECTED] wrote:
> I believe that if the users of SpamAssassin would take the 2 minutes to
> configure razor-report you might be able to achieve >>80% detection ratio
> in a matter of days.
>
> However, since most people who use SpamAssassin don't take the time to
> RTFM they just assume that someone else is running razor-report correctly
> and don't pay it any attention.
> The point is if everyone who used
> razor-check also used razor-report then it would be _much_ better in
> performance than what you might see today.
>
> For me, since it's such low overhead, I report everything as spam and only
> check the ones that I am uncertain of. However, I do not use
> SpamAssassin at all. Using a different local spam detection engine and
> reporting back to razor should give SpamAssassin some extra benefit.

Except that SA already does WAY better than Razor at detecting spam, and turning off the Razor check in SA does not hurt its detection rate at all but does improve performance.

Let's face it, Razor NEEDS something like SpamAssassin and a few others to tell it what is spam. Given that, why bother with Razor? Once I've detected something as spam with SA (or whatever), why should I feed Razor?

Razor has a fundamentally flawed design, which is parasitic on the rest of the anti-spam industry. Even though spam is evolving very fast, SA, Bayes, and other tools catch it first, and only after some days does it ever show up in Razor.

The best advice is to move off Razor to a "primary detector" spam engine, and not wait until the secondary detectors catch up. And since most spam contains web addresses, something like SURBL makes more sense than Razor. See http://surbl.org/

I hate being so negative about Razor, but using it for 3 years and watching its hit rate compared to the rest of SpamAssassin has convinced me Razor is built on false assumptions.

--
_____________________________________
John Andersen

-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies.
Learn everything from fundamentals to system administration.
http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
Razor-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/razor-users