Therefore it is not necessary to postulate a method to avoid Razor's algorithms. It is merely necessary to assume (in order of probability):
1. A flood of new spam not previously seen
2. Slowness or technical problems with the Razor database
3. Spam reporting to Razor falling off (people on vacation, etc.)
Or spam that generates so many variations of the message body that it's very unlikely enough people will get the same hashes you do.
There was a time when every message in a spam run was identical, so a very simple hashing algorithm was enough. Since then spammers have added small variations ("Dear <your email address here>"), medium variations (random characters at the end of the message), large variations (a paragraph from a novel), and random invisible variations (tiny or white fonts, invalid HTML or comments), and now they're randomizing the messages themselves. I've seen some spam runs where I get the exact same message twice, except that every word is misspelled differently each time.
Some of these have been primarily hash-busters, some have been primarily Bayes poison, and some have been aimed at breaking up keywords or phrases, but all end up increasing the number of hash signatures that result from messages in a given spam run.
End result: old hashing schemes cease to be effective, and new ones must be developed.
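To make the effect concrete, here is a rough Python sketch. It is not Razor's actual signature scheme; the two signature functions, the template text, and the recipient addresses are all made up for illustration. It only shows how per-recipient greetings and hidden junk defeat a byte-for-byte hash, how a crude normalizing hash can absorb those, and how per-word misspellings defeat the normalizing hash as well.

```python
import hashlib
import random
import re

# Hypothetical spam template and recipients, for illustration only.
TEMPLATE = "Buy cheap watches today"
RECIPIENTS = ["alice@example.com", "bob@example.com", "carol@example.com"]


def exact_signature(body):
    """Hash the body byte-for-byte (a naive whole-message scheme)."""
    return hashlib.sha1(body.encode("utf-8")).hexdigest()


def normalized_signature(body):
    """Hash after a crude normalization (an assumption, not Razor's method)."""
    text = body.lower()
    text = re.sub(r"<!--.*?-->", "", text, flags=re.S)  # drop HTML comments
    text = re.sub(r"\S+@\S+", "", text)                 # drop email addresses
    text = re.sub(r"[^a-z]+", " ", text).strip()        # keep letters only
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def misspell(text, k):
    """Delete one letter from every word; k selects which letter."""
    return " ".join(w[:k % len(w)] + w[k % len(w) + 1:] for w in text.split())


def make_variants(template, recipients, misspelled=False, seed=0):
    """Build one message per recipient, with a greeting and hidden junk."""
    rng = random.Random(seed)
    variants = []
    for i, rcpt in enumerate(recipients):
        junk = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8))
        body = misspell(template, i) if misspelled else template
        variants.append(f"Dear {rcpt},\n{body}\n<!-- {junk} -->")
    return variants


def count_signatures(messages, signature):
    """Number of distinct signatures produced by one spam run."""
    return len({signature(m) for m in messages})


if __name__ == "__main__":
    plain = make_variants(TEMPLATE, RECIPIENTS)
    garbled = make_variants(TEMPLATE, RECIPIENTS, misspelled=True)
    print("exact hash, plain run:        ", count_signatures(plain, exact_signature))
    print("normalized hash, plain run:   ", count_signatures(plain, normalized_signature))
    print("normalized hash, garbled run: ", count_signatures(garbled, normalized_signature))
```

With three recipients, the exact hash yields three signatures for what is really one message, the normalized hash collapses them to one, and the misspelled run pushes the normalized hash back up to three. Each new trick forces a new normalization step.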
If you don't like the word algorithms, that's fine, but even Vipul admits that this has happened.
Kelson Vibber
SpeedGate Communications <www.speed.net>