Best split policy for wide distribution of table sizes

Michael Young Thu, 03 Aug 2017 11:10:10 -0700

We have several Phoenix tables that vary quite a bit in size: around 10-15
tables contain perhaps 6-10x more data than the other 50 tables.


We currently use the default split policy, and the total region count per
node is uniform across the cluster.  However, we noticed that some tables
have more of their regions concentrated on a few nodes, presumably so that
the total region count per node stays constant.  This seems to hurt query
performance for our largest tables.

We tested the ConstantSizeRegionSplitPolicy so that region data sizes would
be better balanced, and the queries seem to behave somewhat better.
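
To illustrate why a constant-size split policy might balance data better, here
is a rough Python sketch of how region counts would scale with table size when
regions split only on size. The 10 GiB threshold is assumed to match the
default hbase.hregion.max.filesize, and the table names and sizes are
hypothetical, chosen only to mirror the roughly 8x spread described above. It
is an estimate, not a model of what HBase actually does at runtime.

```python
import math

# Assumption: split threshold equal to the default hbase.hregion.max.filesize (10 GiB).
MAX_REGION_BYTES = 10 * 1024**3


def estimated_regions(table_bytes, max_region_bytes=MAX_REGION_BYTES):
    """Rough lower bound on the region count when regions split only on size."""
    return max(1, math.ceil(table_bytes / max_region_bytes))


# Hypothetical table sizes (not measured values from our cluster).
tables = {
    "small_table": 20 * 1024**3,   # 20 GiB
    "large_table": 160 * 1024**3,  # 160 GiB, about 8x the small table
}

region_counts = {name: estimated_regions(size) for name, size in tables.items()}
for name, count in region_counts.items():
    print(name, count)
```

Under these assumptions the larger table ends up with proportionally more
regions, each holding a similar amount of data, which is the behavior we were
hoping for.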

Is this a good approach, or does anyone have a more appropriate solution?
We would rather not implement a custom split policy, but we are more than
willing to try other available split policies or other config tuning.

Thanks,
Michael Young
