fwiw, the easy way to extract URLs is to write a very short SpamAssassin
plugin that calls the get_uri_list() API on the
Mail::SpamAssassin::PerMsgStatus object and prints them to stdout. Let
SpamAssassin do all the hard work. ;)

--j.

Chris Santerre writes:
> I think I know a bit about extracting URLs from spam ;)
>
> It is pretty damn complicated. They play a lot of tricks, like
> www.amazon.com.buy-my-drugs-com.optelnd.net
>
> Then you have hex and decimal links to deal with. And yeah, they do pepper
> the spam with legit URLs. What about Akamai image links? It was common to
> see 20 links in a spam, and only one was the evil one you wanted.
>
> Automation without a LOT of checks and balances = FPs.
>
> You have to have a LOT more autoresearched evidence than just that they
> are contained in a spam. But hey! A+ for effort! It's a start, and it will
> always get better.
>
> Chris Santerre
> SysAdmin and SARE/URIBL ninja
> http://www.uribl.com
> http://www.rulesemporium.com
>
> > -----Original Message-----
> > From: Kristopher Austin [mailto:[EMAIL PROTECTED]
> > Sent: Friday, February 10, 2006 11:04 AM
> > To: [EMAIL PROTECTED]; spamassassin-users@incubator.apache.org
> > Subject: RE: Xtracting urls from saved spams & making SA rules - xurl001.pl
> >
> > I would recommend caution when using such a program. I see lots of spam
> > that has legitimate URLs sprayed in as well.
> >
> > I do think this would be very useful though. Just need to make sure you
> > look through the rules and remove the good guys.
> >
> > Kris
> >
> > > -----Original Message-----
> > > From: Michael W Cocke [mailto:[EMAIL PROTECTED]
> > > Sent: Friday, February 10, 2006 8:57 AM
> > > To: spamassassin-users@incubator.apache.org
> > > Subject: Xtracting urls from saved spams & making SA rules - xurl001.pl
> > >
> > > It's absolutely not finished, but attached is a quick perl hack I'm
> > > using to read thru a directory of saved spam (text files), extract
> > > urls and automatically build SA rules for them. It's not debugged
> > > thoroughly and I have a few more things to add, but I know I'm not the
> > > only person who can use this.
> > >
> > > Mike-
> > > --
> > > If you're not confused, you're not trying hard enough.
> > > --
> > > Please note - Due to the intense volume of spam, we have installed
> > > site-wide spam filters at catherders.com. If email from you bounces,
> > > try non-HTML, non-encoded, non-attachments,
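
To make two of the tricks mentioned above concrete (a trusted-looking name used as a subdomain prefix, and hosts written as a single decimal or hex number), here is a rough Python sketch. It is only an illustration, not what SpamAssassin's get_uri_list() does internally, and every function name in it is made up for this example. It assumes http/https URLs in plain text and uses a naive "last two labels" rule for the registered domain, which is wrong for suffixes like co.uk.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Assumption: only plain http/https URLs in text; real spam needs far more.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def decode_numeric_host(host):
    """Return dotted-quad text if host is one decimal or 0x-hex integer.

    Dotted-hex and octal forms are deliberately not handled here.
    """
    try:
        if host.lower().startswith("0x"):
            value = int(host, 16)
        elif host.isdigit():
            value = int(host, 10)
        else:
            return None
        return str(ipaddress.IPv4Address(value))
    except ValueError:
        # Not a number after all, or outside the IPv4 range.
        return None


def naive_registered_domain(host):
    """Keep the last two labels (naive: ignores multi-part public suffixes)."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def summarize_hosts(text):
    """Return (url, decoded IP or naive registered domain) for each URL found."""
    results = []
    for raw in URL_RE.findall(text):
        url = raw.rstrip(".,;)")  # drop trailing sentence punctuation
        try:
            host = urlsplit(url).hostname or ""
        except ValueError:
            continue  # malformed netloc, skip it
        ip = decode_numeric_host(host)
        results.append((url, ip if ip else naive_registered_domain(host)))
    return results


if __name__ == "__main__":
    sample = (
        "Buy now http://www.amazon.com.buy-my-drugs-com.optelnd.net/x "
        "or http://2130706433/login or http://0x7f000001/ today"
    )
    for url, target in summarize_hosts(sample):
        print(target, url)
```

Even with this, the caution in the thread still applies: a spam can carry many legitimate URLs next to the one that matters, so the output is a list of candidates to review, not a ready-made rule set.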