On Sun, May 02, 2004 at 05:39:14PM -0500, Michael Parker wrote:
> I'm contemplating limiting bayes tokens to 128 chars, in the tokenize
> method. Anyone see a problem with that?
Am I missing something?

  use constant MAX_TOKEN_LENGTH => 15;

... although, I don't see a substr() that actually limits it ... :(

> Maybe 128 is too large in a theoretical worst-case attack (of someone
> turning on storage of original tokens). 32 or 64 might be better.

What's the issue exactly? If we're hashing down to 5 bytes anyway, who
cares what size the input is? The large tokens aren't a big deal unless
huge mails start going around (who cares if we have a handful of large
tokens?) Limiting the size would also cause us issues if we wanted to
do multi-word tokens.

-- 
Randomly Generated Tagline:
"I couldn't NT my way out of a wet paper bag."
         - Unknown at LISA '99
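For reference, a minimal sketch of the cap under discussion. This is not
SpamAssassin's actual tokenize() code; the hash_token() helper, the
Digest::SHA1 import, and the 128-char value are illustrative assumptions
drawn from the thread (a substr() cap applied before hashing the token
down to 5 bytes):

  use Digest::SHA1 qw(sha1);

  # Illustrative cap from the thread; not the shipped constant.
  use constant MAX_TOKEN_LENGTH => 128;

  # Hypothetical helper: cap the raw token's length, then hash it
  # down to a 5-byte digest as described above.
  sub hash_token {
    my ($token) = @_;
    # the substr() limit the post notes is missing from tokenize()
    $token = substr($token, 0, MAX_TOKEN_LENGTH)
      if length($token) > MAX_TOKEN_LENGTH;
    return substr(sha1($token), 0, 5);   # 40-bit truncated SHA-1
  }

Since the digest is 5 bytes either way, a cap like this only bounds the
hashing cost on pathological inputs; it doesn't change the token space.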
