According to the HBase book, pre-splitting tables and doing manual
splits is a better long-term strategy than letting HBase handle it.

I have done a lot of offline testing with HBase and I am at a stage
now where I would like to hook my cluster into the production queue
feeding data into our systems.

Since I do not know what the keys from the prod system are going to
look like, I am adding a machine-number prefix to the row keys
and pre-splitting the tables based on the prefix (prefix 0 goes to
machine A, prefix 1 goes to machine B, etc.).

Once I decide to add more machines, I can always do a rolling split
and add more prefixes.

Is this a good strategy for pre-splitting the tables?
