On Fri, Apr 15, 2011 at 12:48 PM, Joe Pallas <[email protected]> wrote:

> It seems that, until you have enough data relative to your cluster size, you
> must choose between locality and distribution. (When you have enough data,
> you get a better balance between the two.)
>
Yes.

> The HBase rebalancer, as I understand it, adjusts region assignments, but
> doesn't adjust split points (hence, the number of regions). Maybe that would
> be a useful feature for some cases.
>

What would you suggest, Joe? It currently splits regions down the middle. You'd instead have a split point that splits the requests happening on a region over, say, the last five or ten minutes?

St.Ack
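To make the question concrete, here is a rough sketch of the two choices: a split at the middle key versus a split that tries to halve the requests seen over a recent window. This is not HBase code. The function names, the per-key request counts, and the idea that such counts are available at all are assumptions for illustration only.

```python
from itertools import accumulate


def midpoint_split(keys):
    """Split by position: return the middle key of the sorted key list."""
    if len(keys) < 2:
        raise ValueError("need at least two keys to split")
    return keys[len(keys) // 2]


def load_weighted_split(keys, recent_requests):
    """Return the key that best halves the recent request load.

    keys            -- sorted row keys in the region (hypothetical sample)
    recent_requests -- requests per key over the last few minutes (hypothetical)

    The returned key starts the right-hand daughter: keys below it go left,
    the key itself and everything above it go right.
    """
    if len(keys) != len(recent_requests):
        raise ValueError("keys and recent_requests must have the same length")
    if len(keys) < 2:
        raise ValueError("need at least two keys to split")

    total = sum(recent_requests)
    if total == 0:
        # No traffic in the window: fall back to the positional midpoint.
        return midpoint_split(keys)

    # prefix[i] is the number of requests that hit keys[0..i].
    prefix = list(accumulate(recent_requests))

    # Try every split index s (left side is keys[:s]) and keep the one whose
    # left load is closest to half of the total.
    best = min(range(1, len(keys)),
               key=lambda s: abs(2 * prefix[s - 1] - total))
    return keys[best]


if __name__ == "__main__":
    keys = ["a", "b", "c", "d", "e", "f", "g", "h"]
    requests = [40, 10, 5, 2, 1, 1, 1, 0]  # traffic concentrated on low keys
    print("midpoint split:     ", midpoint_split(keys))
    print("load-weighted split:", load_weighted_split(keys, requests))
```

With traffic concentrated on the low keys, the positional split leaves nearly all requests in one daughter, while the load-weighted split isolates the hot key. How long the window should be, and whether recent traffic predicts future traffic, is left open here.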
