It depends on whether you are also inserting in an ordered fashion, right? Even if you have regions a through z, if you start by inserting only keys starting with "a", then you'll only hit the first region until you move on to the next prefix.
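
To make that concrete, here is a rough sketch in plain Python (no HBase involved). It assumes a hypothetical table pre-split into 26 regions on the first letter, random 8-letter keys, and writes sent in batches of 1000; region_for and regions_hit are illustrative helpers, not part of any HBase API. It compares how many regions the first batch touches when the keys arrive sorted versus shuffled.

```python
import bisect
import random
import string

# Hypothetical split points: 26 regions with boundaries at "b", "c", ..., "z".
SPLIT_KEYS = list(string.ascii_lowercase[1:])


def region_for(key, split_keys=SPLIT_KEYS):
    """Return the index of the region whose key range contains `key`.

    Region 0 covers everything below split_keys[0]; region i covers
    [split_keys[i - 1], split_keys[i]); the last region is open-ended.
    """
    return bisect.bisect_right(split_keys, key)


def regions_hit(keys, batch_size):
    """For each consecutive batch of writes, the set of regions it touches."""
    return [
        {region_for(k) for k in keys[i:i + batch_size]}
        for i in range(0, len(keys), batch_size)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
keys = [
    "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    for _ in range(26000)
]

# Same data, two insertion orders.
sorted_batches = regions_hit(sorted(keys), 1000)
shuffled = keys[:]
rng.shuffle(shuffled)
shuffled_batches = regions_hit(shuffled, 1000)

print("sorted input, regions touched by first batch:", len(sorted_batches[0]))
print("shuffled input, regions touched by first batch:", len(shuffled_batches[0]))
```

With sorted input each batch lands on one or two neighbouring regions, so the load walks across the cluster one server at a time even though the splits are fine. With shuffled input the same batch spreads over nearly all of them.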

J-D

On Tue, Mar 22, 2011 at 11:46 AM, Vivek Krishna <[email protected]> wrote:
> I have GBs of data to be dumped to HBase. After lots of trials and reading
> through the mailing list, I figured out creating regions manually is a good
> option because all data was hitting one node initially...
>
> My approach to creating regions is as follows.
> - I sampled about 1% of the actual data and created, say, 'n' regions
> based on this sample.
>
> Now while doing the insertions, it still hits one node first and then
> spreads out.
>
> Our theory is that the key it encounters while inserting doesn't fall in
> the regions that we created (using the sample) and hence it inserts as it
> would do normally.
>
> So, has anyone approached this problem in a smarter way?
>
> Viv
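
For reference, the sampling approach described in the quoted message could look roughly like the following. This is only a sketch under assumptions: the "user%07d" key format, the 1% sample rate, and the helper name split_keys_from_sample are all made up for illustration, and the original poster's actual method is not shown in the thread. It picks n - 1 boundaries at evenly spaced positions in the sorted sample, then checks how the full data set would be distributed across the resulting n regions.

```python
import bisect
import random
from collections import Counter


def split_keys_from_sample(sample_keys, n_regions):
    """Pick n_regions - 1 split keys at evenly spaced ranks of the sorted sample."""
    if n_regions < 2:
        return []
    ordered = sorted(set(sample_keys))
    if len(ordered) < n_regions:
        raise ValueError("sample has fewer distinct keys than requested regions")
    # Rank i * len / n for i = 1..n-1 gives strictly increasing boundaries.
    return [ordered[(i * len(ordered)) // n_regions] for i in range(1, n_regions)]


# Hypothetical data set: 100,000 row keys of the form "user0000042".
rng = random.Random(1)
all_keys = ["user%07d" % rng.randrange(10_000_000) for _ in range(100_000)]

# Roughly a 1% sample, as in the quoted message.
sample = rng.sample(all_keys, len(all_keys) // 100)
splits = split_keys_from_sample(sample, 10)

# Count how many rows of the full data set each region would receive.
rows_per_region = Counter(bisect.bisect_right(splits, k) for k in all_keys)
for region in range(len(splits) + 1):
    print(region, rows_per_region[region])
```

If the sample is representative, the per-region counts come out roughly even, which suggests the splits themselves are not the problem. In that case the order in which the rows are written is the more likely cause of the single hot node.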
