My feeling is that the lower bound on the number of regions per table should be:
my_table_region_count > REGION_SERVER_count * 3

Each region server should get at least one region of the table, so that your
read/write load is distributed evenly across all region servers in any case.

*The assumption is that your data is not skewed and you never have hot spots.
If it is skewed, you would be solving a completely different problem :)
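
To put rough numbers on this for the cluster described in the quoted message,
here is a small back-of-the-envelope sketch. The function names are made up for
illustration, the factor of 3 is just my rule of thumb from above, and I am
assuming the ~50GB figure is the total across all 5 nodes.

```python
def min_region_count(region_servers: int, factor: int = 3) -> int:
    """Smallest region count satisfying regions > region_servers * factor.

    The factor of 3 is a rule of thumb, not an HBase setting.
    """
    return region_servers * factor + 1


def avg_region_size_mb(total_size_mb: float, regions: int) -> float:
    """Average region size in MB, assuming data is spread evenly (no skew)."""
    return total_size_mb / regions


servers = 5

# Lower bound on regions per table for a 5-node cluster.
print(min_region_count(servers))                      # 16

# The ~30MB table with 96 regions.
print(avg_region_size_mb(30, 96))                     # 0.3125

# Whole cluster: ~50GB over ~80 regions per node (assumed cluster-wide total).
print(avg_region_size_mb(50 * 1024, 80 * servers))    # 128.0
```

So by this rule of thumb the small table would need only about 16 regions, and
its 96 regions average well under 1MB each.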



2015-09-07 14:59 GMT+02:00 Ted Yu <yuzhih...@gmail.com>:

> For the 96-region table, the region size is too small.
>
> In production, I have seen region sizes as high as 50GB.
>
> FYI
>
>
>
> > On Sep 7, 2015, at 2:55 AM, Akmal Abbasov <akmal.abba...@icloud.com> wrote:
> >
> > Hi,
> > I would like to know the pros and cons of small region sizes.
> > Currently I have a cluster with 5 nodes, which serves 5 tables, but there
> > are ~80 regions per node, while the actual data (total size of all hstores)
> > is ~50GB.
> > Isn’t that overhead, given that one table of ~30MB has 96 regions?
> > I was thinking about merging regions, because of the overhead of managing
> > them (metadata, memstore per region, more flushes, more compactions).
> > Any suggestions? What is the avg region size in your case?
> >
> > Thanks.
> >
> >
>
