Double hashing is a find thing.  To actually answer the question, though, I
would recommend Murmurhash or JOAAT (
http://en.wikipedia.org/wiki/Jenkins_hash_function)

On Wed, Mar 16, 2011 at 3:48 PM, Andrey Stepachev <oct...@gmail.com> wrote:

> Try hash table with double hashing.
> Something like this
>
> http://www.java2s.com/Code/Java/Collections-Data-Structure/Hashtablewithdoublehashing.htm
>
> 2011/3/17 Peter Haidinyak <phaidin...@local.com>
>
> > Hi,
> >        This is a little off topic but this group seems pretty swift so I
> > thought I would ask. I am aggregating a day's worth of log data which
> means
> > I have a Map of over 24 million elements. What would be a good algorithm
> to
> > use for generating Hash Codes for these elements that cut down on
> > collisions? I application starts out reading in a log (144 logs in all)
> in
> > about 20 seconds and by the time I reach the last log it is taking around
> > 120 seconds. The extra 100 seconds have to do with Hash Table Collisions.
> > I've played around with different Hashing algorithms and cut the original
> > time from over 300 seconds to 120 but I know I can do better.
> > The key I am using for the Map is an alpha-numeric string that is
> > approximately 16 character long with the last 4 or 5 character being the
> > most unique.
> >
> > Any ideas?
> >
> > Thanks
> >
> > -Pete
> >
>

Reply via email to