Carsten - This test is great -- let me make sure I understand your results though:
> > The longer I think about the key generation, I feel that we cannot use > hashed values at least as long as we use "long" for the hash value. > > Why? The answer is simple, a long has 2^64 possibilities. Often the > information that is hashed is a file name or a URI. Now, if we > only count lower case characters for the file name, we have 26 > possibilities for each position, so there are 26^[length of > information] > possible values and this is very quick much more possibilities > than the hashed value offers. So there must be collisions! > And a cache based on possible collisions is not usable. > > To come to a conlusion I did a simple performance test: > - Creating 10000 random keys (strings) where each character is a value > between 0 and 9 > - Hashing the 10000 values > - Putting the strings into the cache (a HashMap) > - Putting the Hashed values into another cache (a HashMap) > - Searching for all values in the string cache I guess you mean for each value (in a loop)? > - Searching for all values in the hashed cache (this requires > of course > rehashing the values) > > And this is the result: > Start key generation > Created 10000 strings with average length 169 OK, > Start hashing > Hashed: 120 ms 120 for all 10,000 -> The hashing function takes .012 ms per call on average > Creating cache with Strings > Created: 60 ms 60 ms for all 10,000 -> .006 ms on average > Creating cache with longs > Created: 30 ms same, .003 ms for longs (faster) > Testing strings > Strings: 10 ms Object = mapWithStrings.get(KeyString) -> 10 ms for all 10000? -> .001 ms per lookup average? (wow) > Testing hashed strings > Hashed strings: 131 ms Object = mapWithHashes.get(hashCreate(KeyString)) -> 131 ms for all -> .0131 ms per lookup average. > > So, looking up the hashed information is slower as the values > have to be > rehashed. But > putting strings into the cache is slower. But putting strings in only takes twice the time -- taking them out is ten times faster and happens more often (right?) > From the test above it seems that using strings is still a > little bit faster > than > using hashed values. > Or did I oversee something? I had been assuming that if the keys were not objects (strings) but integers that there could be a better datastructure than HashMap. I haven't had much time to think about it today but will tonight. Any ideas? Geoff --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]