-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

fwiw, the easy way to extract URLs is to write a very short SpamAssassin
plugin that calls the get_uri_list() API on the
Mail::SpamAssassin::PerMsgStatus object and prints them to stdout.

Let SpamAssassin do all the hard work. ;)

- --j.

Chris Santerre writes:
> I think I know a bit about extracting URLs from spam ;)  
> 
> It is pretty damn complicated. A lot of tricks they play, like
> www.amazon.com.buy-my-drugs-com.optelnd.net
> 
> Then you have hex and decimal links to deal with. And yeah, they do pepper
> the spam with legit URLs. What about Akamai image links? It was common to
> see 20 links in a spam, and only one was the evil one you wanted. 
> 
> Automation without a LOT of checks and balances = FPs. 
> 
> You have to have a LOT more autoresearched evidence than just that they are
> contained in a spam. But hey! A+ for effort! It's a start, and it will always
> get better. 
> 
> Chris Santerre
> SysAdmin and SARE/URIBL ninja
> http://www.uribl.com
> http://www.rulesemporium.com
> 
> 
> 
> > -----Original Message-----
> > From: Kristopher Austin [mailto:[EMAIL PROTECTED]
> > Sent: Friday, February 10, 2006 11:04 AM
> > To: [EMAIL PROTECTED]; spamassassin-users@incubator.apache.org
> > Subject: RE: Xtracting urls from saved spams & making SA rules -
> > xurl001.pl
> > 
> > 
> > I would recommend caution when using such a program.  I see lots of spam
> > that have legitimate URLs sprayed in them as well.
> > 
> > I do think this would be very useful though.  Just need to make sure you
> > look through the rules and remove the good guys.
> > 
> > Kris
> > 
> > > -----Original Message-----
> > > From: Michael W Cocke [mailto:[EMAIL PROTECTED]
> > > Sent: Friday, February 10, 2006 8:57 AM
> > > To: spamassassin-users@incubator.apache.org
> > > Subject: Xtracting urls from saved spams & making SA rules - xurl001.pl
> > > 
> > > It's absolutely not finished, but attached is a quick Perl hack I'm
> > > using to read through a directory of saved spam (text files), extract
> > > URLs and automatically build SA rules for them.  It's not thoroughly
> > > debugged and I have a few more things to add, but I know I'm not the
> > > only person who can use this.
> > > 
> > > Mike-
> > > --
> > > If you're not confused, you're not trying hard enough.
> > > --
> > > Please note - Due to the intense volume of spam, we have installed
> > > site-wide spam filters at catherders.com.  If email from you bounces,
> > > try non-HTML, non-encoded, non-attachments,
> >
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFD7NooMJF5cimLx9ARAsj+AJ9riDyoEIoG1tqWEYk1m+t3PwjqJgCdEv1U
0G7r/Jq7FpYje8XAP6cTYrM=
=n/07
-----END PGP SIGNATURE-----
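
For the hex and decimal links and the decoy-hostname trick mentioned in the thread, here is a minimal Python sketch of the kind of normalization a checker might do before judging a URL. It is not taken from xurl001.pl or from SpamAssassin. The names normalize_host and naive_registered_domain are hypothetical, and the "last two labels" rule is a deliberate simplification: a real tool would need something like the Public Suffix List to handle domains such as example.co.uk.

```python
import ipaddress
from urllib.parse import urlsplit


def normalize_host(url):
    """Return the host of `url`, undoing decimal/hex IP obfuscation.

    Hypothetical helper for illustration only.
    """
    host = urlsplit(url).hostname or ""
    try:
        # int(x, 0) accepts plain decimal ("3232235777") and
        # 0x-prefixed hex ("0xc0a80101"); anything else raises ValueError.
        return str(ipaddress.IPv4Address(int(host, 0)))
    except ValueError:
        # Not a single-integer host (or out of IPv4 range): leave it alone.
        return host


def naive_registered_domain(host):
    """Return the last two labels of `host`.

    ASSUMPTION: this ignores multi-label public suffixes (co.uk etc.),
    so it is only good enough to show why the leading labels are decoys.
    """
    labels = host.rstrip(".").split(".")
    return ".".join(labels[-2:])


if __name__ == "__main__":
    for url in ("http://3232235777/x", "http://0xC0A80101/"):
        print(url, "->", normalize_host(url))
    print(naive_registered_domain(
        "www.amazon.com.buy-my-drugs-com.optelnd.net"))
```

Even with this kind of cleanup, the warning in the thread still applies: a URL appearing in a spam is not by itself evidence that the URL is bad, so anything extracted this way needs further checks before it becomes a rule.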
