Hi,
This is a little off topic but this group seems pretty swift so I
thought I would ask. I am aggregating a day's worth of log data which means I
have a Map of over 24 million elements. What would be a good algorithm to use
for generating Hash Codes for these elements that cuts down on collisions? My
application starts out reading in a log (144 logs in all) in about 20 seconds,
and by the time I reach the last log it is taking around 120 seconds. The extra
100 seconds have to do with hash table collisions. I've played around with
different hashing algorithms and cut the original time from over 300 seconds to
120, but I know I can do better.
The key I am using for the Map is an alphanumeric string that is approximately
16 characters long, with the last 4 or 5 characters varying the most.
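To make the question concrete, here is a minimal sketch of how collisions could be counted for a candidate hash. It uses 32-bit FNV-1a purely as an example and is not the algorithm I am actually using. The function names, the sample keys, and the bucket count are all made up for illustration.

```python
from collections import Counter

FNV_OFFSET_BASIS = 0x811C9DC5  # 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # 32-bit FNV prime


def fnv1a_32(key: str) -> int:
    """Return the 32-bit FNV-1a hash of key (encoded as UTF-8)."""
    h = FNV_OFFSET_BASIS
    for byte in key.encode("utf-8"):
        h ^= byte                         # mix in the next byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # multiply, keep the low 32 bits
    return h


def count_collisions(keys, num_buckets, hash_fn=fnv1a_32):
    """Count keys that land in a bucket already holding another key."""
    buckets = Counter(hash_fn(k) % num_buckets for k in keys)
    return sum(n - 1 for n in buckets.values())


# Hypothetical keys: a fixed 11-character prefix plus 5 varying digits,
# roughly matching the key shape described above.
sample_keys = [f"LOGENTRY00A{i:05d}" for i in range(10000)]
print(count_collisions(sample_keys, num_buckets=16384))
```

Swapping a different function in for hash_fn and running it over a sample of real keys would show which candidate spreads them most evenly for a given bucket count.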
Any ideas?
Thanks
-Pete