Re: random access and hotspots

Alex Baranov Thu, 11 Mar 2010 03:49:39 -0800

Hello Tux,

Accessing a table in a "random access" manner is not in itself a reason to
randomize keys. You will likely need to randomize your keys only to get
better performance when importing a large existing dataset into HBase.
Otherwise, unless your insertion rate is higher than 20K records/sec, I
wouldn't suggest worrying about this issue. It would be great if you could
tell us more about your use case.

MD5, SHA-1 or Jenkins Hash (in org.apache.hadoop.hbase.util.JenkinsHash) are
all mechanisms you might consider.
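
On the [0-9A-F] concern: HBase row keys are byte arrays, so one option is to
use the raw 16-byte MD5 digest instead of its 32-character hex string. Below
is a minimal sketch using only the JDK. The class name HashedRowKey and the
key layout (digest followed by the original ID) are assumptions made for
illustration, not something HBase prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not part of HBase.
final class HashedRowKey {
    private HashedRowKey() {}

    /** Returns the 16 raw MD5 bytes of the record ID (not the hex string). */
    static byte[] md5(String recordId) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return md.digest(recordId.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Assumed layout: [16-byte MD5 digest][UTF-8 bytes of the ID]. */
    static byte[] rowKey(String recordId) {
        byte[] digest = md5(recordId);
        byte[] id = recordId.getBytes(StandardCharsets.UTF_8);
        // Copy the digest into a larger array, then append the ID bytes.
        byte[] key = Arrays.copyOf(digest, digest.length + id.length);
        System.arraycopy(id, 0, key, digest.length, id.length);
        return key;
    }

    public static void main(String[] args) {
        byte[] key = rowKey("12345");
        System.out.println("row key length: " + key.length);
    }
}
```

Keeping the original ID after the digest means it can be read back from the
row key, and a random read by ID stays a single lookup because the key can
be recomputed from the ID. The trade-off is that range scans in ID order are
no longer possible.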

Alex Baranau

sematext.com
http://en.wordpress.com/tag/hadoop-ecosystem-digest/

On Thu, Mar 11, 2010 at 12:07 PM, TuX RaceR <tuxrace...@gmail.com> wrote:

> Hello List,
>
> I'll be accessing a table mainly in random access and I am looking for an
> efficient way of randomizing the keys.
> I thought about a MD5 hash of the ID of the record, but as MD5 returns a
> string of chars [0-9A-F] I was wondering if there was a better method to
> use.
>
> Thanks
> TuX
>
