I had a customer with a sequence-based key (yes, he knew all the downsides
of that). Being able to split manually meant he could split a region that
got too big near the end of its key range rather than right down the middle.
With a sequentially increasing key, splitting the region in half leaves the
lower region at half the desired size and unlikely ever to be written to
again.
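
To make the arithmetic concrete, here is a rough sketch in plain Java. It is a toy model, not HBase code: the class and method names are made up for illustration, and it assumes one row per sequential long key with all new writes landing past the current last key. In a real cluster the chosen key would be handed to the shell's split command or to Admin.split(TableName, byte[]) in current releases.

```java
/** Toy model of one region holding sequential long keys. Hypothetical names, not HBase API. */
final class SequentialKeySplit {

    /** Split key at the middle of the key range, roughly what a size-based automatic split picks. */
    static long midpointSplitKey(long firstKey, long lastKey) {
        return firstKey + (lastKey - firstKey) / 2;
    }

    /** Manually chosen split key near the tail, leaving only tailRows rows in the upper daughter. */
    static long tailSplitKey(long lastKey, long tailRows) {
        return lastKey - tailRows + 1;
    }

    /**
     * Row counts {lower, upper} after splitting keys firstKey..lastKey (inclusive) at splitKey.
     * The split key becomes the first key of the upper daughter.
     */
    static long[] daughterRowCounts(long firstKey, long lastKey, long splitKey) {
        if (splitKey <= firstKey || splitKey > lastKey) {
            throw new IllegalArgumentException("split key must fall inside the region");
        }
        long lower = splitKey - firstKey;     // never written again with increasing keys
        long upper = lastKey - splitKey + 1;  // keeps receiving all new writes
        return new long[] {lower, upper};
    }
}
```

With keys 1 through 1000, the midpoint split lands on key 500 and strands 499 rows in a lower region that never grows. Splitting at tailSplitKey(1000, 10), which is key 991, leaves a lower region of 990 rows at close to full size, and the upper region starts nearly empty.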


On Wed, Aug 6, 2014 at 2:44 AM, Arun Allamsetty <[email protected]>
wrote:

> Hi Ming,
>
> The reason why we have it is because the user can decide where each key
> goes. I can think of multiple scenarios off the top of my head where it would
> be useful and others can correct me if I am wrong.
>
> 1. Cases where you cannot have row keys which are evenly distributed
> lexically, leading to unequal loads on the regions. In such cases, we can
> set key ranges to be assigned to different regions so that we can have a
> more equal distribution.
>
> 2. The second scenario I am thinking of may be wrong and if it is, it'll
> clear my misconceptions. Suppose you cannot denormalize your data and you
> have to perform joins on a certain range of row keys which are lexically
> similar. We could split so that they are assigned to the same region
> server (right?) and the join would be performed locally.
>
> Cheers,
> Arun
>
> Sent from a mobile device. Please don't mind the typos.
> On Aug 6, 2014 12:30 AM, "Liu, Ming (HPIT-GADSC)" <[email protected]>
> wrote:
>
> > Hi, all,
> >
> > As I understand, HBase will automatically split a region when the region
> > is too big.
> > So in what scenario does a user need to do a manual split? Could someone
> > kindly give me some examples where a user needs to do the region split
> > explicitly via HBase Shell or Java API?
> >
> > Thanks very much.
> >
> > Regards,
> > Ming
> >
>
