On Thu, Oct 09, 2003 at 10:41:34PM -0400, Chip Salzenberg wrote: > 3. Add (or xor or whatever) PL_new_hash_seed with the user-provided > hash value *inside* the functions that use hashes, *not* outside > them.
I think that it has to be "whatever" It can't be xor. The attack relies on pre-computing a set of values which are a pathological case for the hashing algorithm, values which all end up in one bucket. IIRC the mapping to buckets is done using the low N bits of the 32 bit hash value (where N starts at 3 and increases as the hash grows). Given a set of N bit hash values which map to the same bucket, xoring them all with a constant will still put them in the same bucket. All this would do is change the order for keys on non-pathological hashes, without actually splitting the set. Likewise add would just rotate the buckets round, as effectively it would be modulo arithmetic in base N I think that we have to do "whatever" where whatever is hashing the input 32 bit PERL_HASH value inside the hv.c functions, seeded by the per run random value. I'm just not sure what the "whatever" is. We could use 5 bits of it and rotate the 32 bit hash value round. This makes the attack harder, because then the attacker has to find a set of strings that hash to the same 32 bit value, rather than just the same bucket. Then again, if the attacker can do that, anything we do inside hv.c is fruitless, as we're trying to reliably distinguish between identical values :-( At which point we may as well calculate a new hash value inside hv.c each time and kill performance in the general case. If we don't do that, I believe that information we have to play with is 1: 32 bit PERL_HASH value 2: key length 3: per run 32 bit random seed How can we generate a new value out of that at little cost? Nicholas Clark -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]

