I had a customer with a sequence-based key (yes, he knew all the downsides of that). Being able to split manually meant he could split a region that had grown too big near its end rather than right down the middle. With a sequentially increasing key, splitting the region in half leaves one region at half the desired size that will likely never be written to again.
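To make the arithmetic concrete, here is a minimal Python sketch. It is a toy model, not HBase code: `MAX_ROWS`, `simulate`, and `keep_rows` are names made up for illustration. It assumes every new row gets the next sequence number, so writes only ever land in the last region, and that a region is split as soon as it holds `MAX_ROWS` rows.

```python
MAX_ROWS = 100  # hypothetical row count at which a region gets split


def simulate(total_rows, keep_rows):
    """Insert total_rows sequential keys and return the final region sizes.

    keep_rows is how many rows stay in the lower region at each split:
    MAX_ROWS // 2 models a split down the middle, MAX_ROWS - 1 models a
    manual split close to the end of the key range.
    """
    closed = []  # regions that will never receive another write
    last = 0     # rows in the last region, the only one taking writes
    for _ in range(total_rows):
        last += 1
        if last >= MAX_ROWS:
            closed.append(keep_rows)  # lower region is now frozen at this size
            last -= keep_rows         # upper region keeps taking new keys
    return closed + [last]


if __name__ == "__main__":
    print("midpoint:", simulate(1000, MAX_ROWS // 2))
    print("near end:", simulate(1000, MAX_ROWS - 1))
```

Under these assumptions, the midpoint policy leaves every closed region stuck at half the threshold, while splitting near the end leaves closed regions nearly full and needs roughly half as many regions for the same data.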
On Wed, Aug 6, 2014 at 2:44 AM, Arun Allamsetty <[email protected]> wrote:

> Hi Ming,
>
> The reason why we have it is because the user can decide where each key
> goes. I can think multiple scenarios off the top of my head where it would
> be useful and others can correct me if I am wrong.
>
> 1. Cases where you cannot have row keys which are equally lexically
> distributed, leading in unequal loads on the regions. In such cases, we can
> set key ranges to be assigned to different regions so that we can have a
> more equal distribution.
>
> 2. The second scenario I am thinking of may be wrong and if it is, it'll
> clear my misconceptions. In case you cannot denormalize your data and you
> have to perform joins on certain range of row keys which are lexically
> similar. So we split them and they would be assigned to the same region
> server (right?) and the join would be performed locally.
>
> Cheers,
> Arun
>
> Sent from a mobile device. Please don't mind the typos.
> On Aug 6, 2014 12:30 AM, "Liu, Ming (HPIT-GADSC)" <[email protected]>
> wrote:
>
> > Hi, all,
> >
> > As I understand, HBase will automatically split a region when the region
> > is too big.
> > So in what scenario, user needs to do a manual split? Could someone kindly
> > give me some examples that user need to do the region split explicitly via
> > HBase Shell or Java API?
> >
> > Thanks very much.
> >
> > Regards,
> > Ming
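On Arun's first scenario quoted above (row keys that are not evenly distributed), one way to pick the key ranges is to take a sample of real row keys and cut it into equal-sized slices. The sketch below is only an illustration of that idea: `choose_split_points` and the sample keys are hypothetical, and it assumes the sample is representative of the real key distribution. The resulting keys could then be handed to HBase as pre-split points or as explicit split keys.

```python
def choose_split_points(sample_keys, num_regions):
    """Return num_regions - 1 keys that cut the sorted sample into equal slices."""
    keys = sorted(sample_keys)
    # Integer arithmetic picks the first key of each slice after the first one.
    return [keys[i * len(keys) // num_regions] for i in range(1, num_regions)]


if __name__ == "__main__":
    # Skewed sample: most keys start with "a", so evenly spaced lexical
    # boundaries would put nearly all rows into a single region.
    sample = ["a1", "a2", "a3", "a4", "a5", "a6", "b1", "z1"]
    print(choose_split_points(sample, 4))
```

With the skewed sample, most boundaries fall inside the crowded "a" range, so each region receives about the same number of rows.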
