Hi Marco, I'm glad you are thinking about anti-spam strategies. While this approach might be helpful, I'm afraid it has at least the following problems:

1) It is indeed too expensive in terms of data and time. SpamAssassin normally takes a fraction of a second to scan each message. Your mail will back up if even a fraction of your incoming mail has to wait on HTTP servers, which are usually slow, and their usual maze of redirects.

2) Blindly following URLs can have nasty side effects, such as confirming that your address is alive and thus attracting more spam. Sometimes those links "confirm" a subscription to a spammer's list. The spammers then send more spam and claim that you opted in to it.

Warren Togami
[email protected]

On Mon, Dec 27, 2010 at 4:22 PM, Marco Ribeiro <[email protected]> wrote:
> I am aware of the Web Redirect plugin [4], but it was last updated in
> 2006. Is it too expensive to query for webpages? Does the cost make
> this approach useless? I was initially thinking of trying to implement
> this on Spam Assassin as a Google Summer of Code project, but it is
> such a basic task that (if it's usable) I could probably do it in no
> time. The classifier I used outputs readable rules, so it would be a
> piece of cake to translate them into regular expressions. And it seems
> spammers don't even bother trying to obfuscate the web pages (or maybe
> they don't even have control over them). For example, 36.7% of the
> webpages I downloaded contained the word viagra in them, and 99.84% of
> them were spam (the 0.16% probably was as well, it was probably due to
> some minor error). What do you guys think? Is it worth trying? Any
> ideas?
>
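
To put rough numbers on point 1 above, here is a back-of-the-envelope sketch. All figures in it (0.5 s per scan, 10% of messages needing a fetch, 10 s per fetch including redirects, 4 scanner processes) are assumptions chosen for illustration, not measurements, and the function names are hypothetical.

```python
def mean_scan_seconds(base_scan_s, fetch_fraction, fetch_s):
    """Expected per-message scan time when a fraction of messages
    must also wait on an HTTP fetch (redirects included)."""
    return base_scan_s + fetch_fraction * fetch_s


def max_messages_per_hour(workers, mean_scan_s):
    """Upper bound on throughput for a fixed pool of scanner processes."""
    return workers * 3600 / mean_scan_s


# Assumed, illustrative figures only.
BASE_SCAN_S = 0.5      # scan time without any web lookups
FETCH_FRACTION = 0.10  # share of messages whose links get fetched
FETCH_S = 10.0         # time spent waiting on a slow server and redirects
WORKERS = 4            # parallel scanner processes

baseline = mean_scan_seconds(BASE_SCAN_S, 0.0, 0.0)
with_fetch = mean_scan_seconds(BASE_SCAN_S, FETCH_FRACTION, FETCH_S)

print("slowdown factor:", with_fetch / baseline)
print("messages/hour without fetching:", max_messages_per_hour(WORKERS, baseline))
print("messages/hour with fetching:", max_messages_per_hour(WORKERS, with_fetch))
```

Under these assumed figures, fetching pages for only one message in ten triples the average scan time and cuts the hourly capacity to a third. The real cost depends entirely on your own mail volume and on how slow the remote servers turn out to be.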
