Theo Van Dinter <[EMAIL PROTECTED]> writes:

> What's the issue exactly?  If we're hashing down to 5 bytes anyway,
> who cares what size the input is?  The large length tokens aren't a
> big deal unless huge mails start going around (who cares if we have a
> handful of large tokens?)

1. We should probably not truncate tokens (at least not so much) since
   we're hashing now.  Some amount of truncation may still be helpful,
   though, so a 10fcv would be a good idea.  Um, I don't recall anyone
   posting a 10fcv for the hashing.  Someone did do that, right?

2. The thing you may be missing is Herk's idea to optionally include
   the original token as part of the value -- not the key.  In SQL, it
   would be a separate column.  In DBM, it would optionally appear at
   the end of the hashed token's value.

Daniel

-- 
Daniel Quinlan                     anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/    and open source consulting
