Thank you all. Moved to Murmur hash.

Best Regards,
Mingtao

On Mon, Jul 21, 2014 at 10:58 PM, Ishan Chhabra <[email protected]> wrote:

> No *guarantees* on collision, but yes, it is a deterministic mapping and
> you won't see collisions in that range (provided you choose enough bits).
>
> See MurmurHash here: http://en.wikipedia.org/wiki/MurmurHash
>
> and to understand collision probabilities, read this:
> http://en.wikipedia.org/wiki/Birthday_problem
>
>
> On Mon, Jul 21, 2014 at 7:55 PM, Mingtao Zhang <[email protected]> wrote:
>
> > Thank you all!
> >
> > Sorry, I think 'consistent hashing' is the wrong term.
> >
> > For my use case, I need to store this 'prefix' (either hashed or not) in
> > another table.
> >
> > Will this murmur hashing guarantee that the same string maps to the same
> > bytes next time? And no collisions for around 2^10 records?
> >
> > Mingtao
> > Sent from iPhone
> >
> > > On Jul 21, 2014, at 10:28 PM, Ishan Chhabra <[email protected]> wrote:
> > >
> > > Mingtao,
> > > If I understand correctly, you want to prefix the key with a hash (as
> > > mentioned in the book) to get a good distribution. Use MurmurHash (there
> > > is an implementation in HBase code itself) as it is fast and gives a
> > > uniform distribution.
> > >
> > > "Consistent Hashing" is not the correct term to use here if I understand
> > > your intent correctly.
> > >
> > >
> > >> On Mon, Jul 21, 2014 at 2:44 PM, Liam Slusser <[email protected]> wrote:
> > >>
> > >> MD5 isn't a consistent hashing algorithm. Consistent hashing is a scheme
> > >> that provides hash table functionality in such a way that adding or
> > >> removing one slot does not significantly change the mapping of keys to
> > >> slots. With that said, a lot of consistent hashing algorithms USE
> > >> md5...but it alone won't get you all the way there.
> > >>
> > >> Some light bedtime reading:
> > >> http://en.wikipedia.org/wiki/Consistent_hashing
> > >>
> > >> liam
> > >>
> > >>
> > >> On Mon, Jul 21, 2014 at 7:18 AM, Mingtao Zhang <[email protected]> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I am trying to find a consistent hashing algorithm for the first
> > >>> portion of the row key.
> > >>>
> > >>> I saw in the document/book that MD5 is mentioned everywhere.
> > >>>
> > >>> But I have trouble persuading myself that MD5
> > >>> (http://en.wikipedia.org/wiki/MD5) is considered consistent hashing.
> > >>>
> > >>> Could any of you point me to the library that contains the hashing you
> > >>> are using?
> > >>>
> > >>> Thanks in advance!
> > >>>
> > >>> Best Regards,
> > >>> Mingtao
> > >
> > >
> > >
> > > --
> > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> >
>
>
> --
> *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
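
For anyone reading this thread in the archive, here is a minimal sketch of the two points discussed above: a deterministic hash prefix for a row key, and a birthday-problem estimate of the collision risk for about 2^10 records. It is an illustration only, written under some assumptions: the hash is the 32-bit MurmurHash3 variant described on the Wikipedia page linked above, spelled out in plain Python so it runs on its own; it is not the implementation shipped in the HBase code, and the two are not guaranteed to produce the same bytes. The names `murmur3_32`, `prefixed_row_key` and `collision_probability` are made up for this example, and the estimate assumes the hash behaves like a uniform random function.

```python
import math


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash3 in pure Python (illustrative, not the HBase class)."""
    mask = 0xFFFFFFFF
    c1, c2 = 0xCC9E2D51, 0x1B873593

    def rotl(x: int, r: int) -> int:
        # Rotate a 32-bit value left by r bits.
        return ((x << r) | (x >> (32 - r))) & mask

    h = seed & mask
    full = len(data) & ~3  # bytes covered by whole 4-byte blocks
    for i in range(0, full, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (rotl((k * c1) & mask, 15) * c2) & mask
        h = (rotl(h ^ k, 13) * 5 + 0xE6546B64) & mask

    tail = data[full:]  # 0 to 3 leftover bytes
    if tail:
        k = int.from_bytes(tail, "little")
        k = (rotl((k * c1) & mask, 15) * c2) & mask
        h ^= k

    # Finalization: mix in the length, then avalanche the bits.
    h = (h ^ len(data)) & mask
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def prefixed_row_key(key: str, prefix_bytes: int = 4) -> bytes:
    """Prepend up to 4 hash bytes to the key; same key always gives same bytes."""
    raw = key.encode("utf-8")
    digest = murmur3_32(raw).to_bytes(4, "big")
    return digest[:prefix_bytes] + raw


def collision_probability(n_keys: int, bits: int) -> float:
    """Birthday-problem approximation of P(at least one collision)."""
    x = n_keys * (n_keys - 1) / 2.0 / 2.0 ** bits
    return -math.expm1(-x)  # equals 1 - exp(-x), stable for tiny x


if __name__ == "__main__":
    print(prefixed_row_key("user123").hex())
    for bits in (16, 32, 64):
        print(bits, collision_probability(2 ** 10, bits))
```

Under the uniformity assumption, 2^10 keys hashed to 32 bits collide with probability of roughly 0.012%, while 16 bits would make a collision almost certain. That matches the advice above: the mapping is deterministic, but "no collisions" is a matter of choosing enough bits, not a guarantee.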
