On Mon, Jun 16, 2003 at 12:27:01PM +1200, Andrew McNaughton wrote:
>
> Talking of which, does anyone know any good or interesting approaches to
> identifying these junk strings?
>
> A checksum algorithm based spam system (eg vipul's razor) could be
> modified to work with checksums of only the recognized words in an email.
> All unrecognized stuff (based on a standard wordlist) would get stripped
> before the checksum was generated.  This would help for a while, and I'd
> be interested to hear about anything out there,

Like the Distributed Checksum Clearinghouse perhaps :-)

http://www.rhyolite.com/anti-spam/dcc/

> but the spammers could
> deal with it easily enough by modifying their approach to just tack on
> half a dozen common words selected at random.
...

I think this entry in the DCC FAQ addresses this:

    Do the fuzzy checksums ignore "personalizations"?

    Yes, they ignore many so called "personalizations".

DCC looks good, but I don't know how well it works.

Regards,
Matt
--
SLUG - Sydney Linux User's Group - http://slug.org.au/
More Info: http://lists.slug.org.au/listinfo/slug
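
For illustration, the wordlist-filtered checksum idea quoted above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not how Vipul's Razor or DCC actually compute their checksums: the tiny WORDLIST, the tokenizing regex, the choice of SHA-1, and the function names recognized_words and wordlist_checksum are all hypothetical stand-ins. It shows that random junk strings get stripped before hashing, and also the weakness Andrew mentions: appending real dictionary words changes the checksum.

```python
import hashlib
import re

# Hypothetical stand-in for a "standard wordlist"; a real system
# would load a full dictionary instead of this handful of words.
WORDLIST = {"buy", "cheap", "pills", "now", "the", "and", "house", "green"}


def recognized_words(text, wordlist=WORDLIST):
    """Return the lowercased tokens of text that appear in wordlist, in order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t in wordlist]


def wordlist_checksum(text, wordlist=WORDLIST):
    """SHA-1 over the recognized words only; unrecognized tokens are dropped."""
    kept = " ".join(recognized_words(text, wordlist))
    return hashlib.sha1(kept.encode("utf-8")).hexdigest()


base = "Buy cheap pills now"
junk = "Buy cheap pills now xqzvbn kjhwer"   # random junk string appended
padded = "Buy cheap pills now green house"   # common words appended

# Junk strings are not in the wordlist, so the checksum is unchanged.
print(wordlist_checksum(base) == wordlist_checksum(junk))    # True
# Real words survive the filter, so the checksum no longer matches.
print(wordlist_checksum(base) == wordlist_checksum(padded))  # False
```

An exact hash like this breaks as soon as any recognized word is added or changed, which is presumably why DCC describes its checksums as "fuzzy".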
