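No hash can *guarantee* zero collisions, but yes, it is a deterministic
mapping: the same string always hashes to the same bytes (given the same
implementation and seed). With around 2^10 records you are very unlikely
to see a collision, provided you keep enough bits of the hash. At 32 bits
the chance of any collision among 2^10 keys is roughly 1 in 8,000.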

See MurmurHash here: http://en.wikipedia.org/wiki/MurmurHash
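
To make the "deterministic mapping" point concrete, here is a minimal
sketch of a hash-prefixed key. It assumes the 32-bit MurmurHash3 variant
written out in plain Python; the helper name prefixed_key and the 4-byte
prefix length are hypothetical choices for illustration. HBase ships its
own Java implementation, and this sketch is not that code, so its output
may not match it byte-for-byte.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash3 (x86 variant) of data, as an unsigned int."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    nblocks = len(data) // 4

    # Body: mix in one little-endian 4-byte block at a time.
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[4 * nblocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k

    # Finalization: fold in the length and avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def prefixed_key(key: str, prefix_bytes: int = 4) -> bytes:
    """Hypothetical helper: prepend the leading hash bytes to the key."""
    if not 1 <= prefix_bytes <= 4:
        raise ValueError("prefix_bytes must be between 1 and 4")
    raw = key.encode("utf-8")
    digest = murmur3_32(raw).to_bytes(4, "big")
    return digest[:prefix_bytes] + raw


# The same input always yields the same prefix, so the value stored in
# the other table can be recomputed later from the original string.
print(prefixed_key("row-42").hex())
```

Keeping the original key after the prefix means two keys that happen to
share a prefix still produce distinct row keys.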

and to understand collision probabilities, read this:
http://en.wikipedia.org/wiki/Birthday_problem
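
As a rough worked example of the birthday bound, the sketch below
estimates the chance of at least one collision among 2^10 keys for a few
hash widths. It assumes the hash output is uniformly distributed and uses
the standard approximation p ~= 1 - exp(-n(n-1)/(2N)); the function name
collision_probability is made up for this example.

```python
import math


def collision_probability(n_keys: int, bits: int) -> float:
    """Approximate P(at least one collision) for n_keys uniformly
    distributed hash values of the given width in bits."""
    buckets = 2.0 ** bits
    exponent = n_keys * (n_keys - 1) / (2.0 * buckets)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)


for bits in (16, 32, 64):
    p = collision_probability(2 ** 10, bits)
    print(f"{bits}-bit hash, 2^10 keys: p = {p:.3g}")
```

With only 16 bits a collision is almost certain at this size, while 32
bits puts it on the order of 1e-4, which is what "enough bits" means here.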


On Mon, Jul 21, 2014 at 7:55 PM, Mingtao Zhang <[email protected]>
wrote:

> Thank you all!
>
> Sorry, I think 'consistent hashing' is the wrong term.
>
> For my use case, I need to store this 'prefix' (hashed or not) in
> another table.
>
> Will MurmurHash guarantee that the same string maps to the same bytes
> next time? And that there are no collisions for around 2^10 records?
>
> Mingtao Sent from iPhone
>
> > On Jul 21, 2014, at 10:28 PM, Ishan Chhabra <[email protected]> wrote:
> >
> > Mingtao,
> > If I understand correctly, you want to prefix the key with a hash (as
> > mentioned in the book) to get a good distribution. Use MurmurHash (there
> > is an implementation in HBase code itself) as it is fast and gives a
> > uniform distribution.
> >
> > "Consistent Hashing" is not the correct term to use here if I understand
> > your intent correctly.
> >
> >
> >> On Mon, Jul 21, 2014 at 2:44 PM, Liam Slusser <[email protected]> wrote:
> >>
> >> MD5 isn't a consistent hashing algorithm.  Consistent hashing is a
> >> scheme that provides hash table functionality in such a way that adding
> >> or removing one slot does not significantly change the mapping of keys
> >> to slots.  That said, a lot of consistent hashing algorithms USE
> >> MD5, but MD5 alone won't get you all the way there.
> >>
> >> Some light bedtime reading:
> >> http://en.wikipedia.org/wiki/Consistent_hashing
> >>
> >> liam
> >>
> >>
> >> On Mon, Jul 21, 2014 at 7:18 AM, Mingtao Zhang <[email protected]>
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> I am trying to find a consistent hashing algorithm for the first
> >>> portion of the row key.
> >>>
> >>> I saw in the documentation/book that MD5 is mentioned everywhere.
> >>>
> >>> But I have trouble persuading myself that MD5 (
> >>> http://en.wikipedia.org/wiki/MD5) is considered consistent hashing.
> >>>
> >>> Could any of you point me to the library that contains the hashing you
> >>> are using?
> >>>
> >>> Thanks in advance!
> >>>
> >>> Best Regards,
> >>> Mingtao
> >
> >
> >
> > --
> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
>



-- 
*Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
