bb will fall into the first region, since the next region's start key is ca and bb sorts before it.
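
To make that concrete, here is a minimal sketch of how a row key maps to a region by start key. It is plain Python, not HBase code: the `START_KEYS` list and the `region_index` helper are hypothetical names made up for this illustration, and the split point is simply the "ca" from your example. It assumes ASCII keys, for which Python's string ordering matches byte-wise lexicographic ordering.

```python
from bisect import bisect_right

# Hypothetical start keys for the two regions discussed in this thread.
# The first region of a table has an empty start key, so it also covers
# every key that sorts before the first split point ("ca" here).
START_KEYS = ["", "ca"]


def region_index(row_key, start_keys=START_KEYS):
    """Return the 0-based index of the region that would hold row_key.

    A key belongs to the region with the greatest start key that is
    less than or equal to the key, so there are no gaps between regions.
    """
    return bisect_right(start_keys, row_key) - 1


if __name__ == "__main__":
    for key in ("aa", "bb", "ca", "zz"):
        print(key, "-> region", region_index(key) + 1)
```

Under these assumptions aa and bb both land in region 1, while ca and zz land in region 2. Every key belongs to some region, so a key missing from the sample cannot fall "outside" the regions you created.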

J-D

On Tue, Mar 22, 2011 at 12:07 PM, Vivek Krishna <[email protected]> wrote:
> For example, let's assume I have keys in the range aa, ab, ac .. zz.
>
> Using the sample data I create regions like this:
>
> aa-ba region 1
> ca-da region 2 etc.
>
> The reason I did not create a region bb-bz is that I did not encounter
> those keys in the sample.
> But when I encounter a key like bb, it does not fall in a region I created
> and hence follows the normal procedure, I guess?
>
> Also, I have incoming data distributed almost evenly.
> Viv
>
> On Tue, Mar 22, 2011 at 2:55 PM, Jean-Daniel Cryans
> <[email protected]> wrote:
>
>> It depends if you are also inserting in an ordered fashion, right? Even
>> if you have regions a through z, but you start inserting only keys
>> starting with "a", then you'll only hit the first regions.
>>
>> J-D
>>
>> On Tue, Mar 22, 2011 at 11:46 AM, Vivek Krishna <[email protected]>
>> wrote:
>> > I have GBs of data to be dumped to HBase. After lots of trials and
>> > reading through the mailing list, I figured out that creating regions
>> > manually is a good option, because all data was hitting one node
>> > initially...
>> >
>> > My approach to creating regions is as follows.
>> > - I sampled about 1% of the actual data and created, say, 'n' regions
>> >   based on this sample.
>> >
>> > Now while doing the insertions, it still hits one node first and then
>> > spreads out.
>> >
>> > Our theory is that the key it encounters while inserting doesn't fall in
>> > a region that we created (using the sample) and hence it inserts as it
>> > would do normally.
>> >
>> > So, has anyone approached this problem in a smarter way?
>> >
>> > Viv
