Logged HBASE-17462 for #2. FYI
On Thu, Jan 12, 2017 at 8:49 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> For #2, I think MemstoreSizeCostFunction belongs to the same category if
> we are to adopt a moving average.
>
> Some factors to consider:
>
> The data structure used by StochasticLoadBalancer should be concise. The
> number of regions in a cluster can be expected to approach 1 million. We
> cannot afford to store a long history of read / write requests in the
> master.
>
> Cost calculation should be efficient: the balancer goes through many cost
> functions, and each one is expected to return quickly. Otherwise we would
> not come up with proper region movement plan(s) in time.
>
> Cheers
>
> On Wed, Jan 11, 2017 at 5:51 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> For #2, I think it makes sense to try out using request rates for cost
>> calculation.
>>
>> If the experiment result turns out to be better, we can consider using
>> such a measure.
>>
>> Thanks
>>
>> On Wed, Jan 11, 2017 at 5:34 PM, Timothy Brown <t...@siftscience.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have a couple of questions about the StochasticLoadBalancer.
>>>
>>> 1) In CostFromRegionLoadFunction.getRegionLoadCost, the cost weights
>>> later samples of the RegionLoad more than previous ones. For example,
>>> with a queue size of 4 it would be
>>> (.5 * load1 + .25 * load2 + .125 * load3 + .125 * load4).
>>> Is this the intended behavior?
>>>
>>> 2) Would it make more sense to calculate the ReadRequestCost and
>>> WriteRequestCost as rates? Right now it looks like the cost is just based
>>> off the total number of read/write requests a region has gotten over its
>>> lifetime.
>>>
>>> -Tim
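
To make the two questions in the thread concrete, here is a minimal Python sketch. It is not the HBase code. The function names `weighted_region_load_cost` and `request_rate` are hypothetical, and it assumes that samples arrive oldest first and that load1 in Tim's example is the newest sample. The first function reproduces the .5 / .25 / .125 / .125 weighting from question 1 by averaging each new sample 50/50 with the running value. The second turns two readings of a lifetime request counter into a rate, as suggested in question 2. How a counter reset is handled here is an assumption as well.

```python
from typing import Sequence


def weighted_region_load_cost(loads_oldest_first: Sequence[float]) -> float:
    """Fold samples so that newer samples carry more weight.

    Hypothetical helper for illustration, not HBase's implementation.
    """
    cost = 0.0
    for i, load in enumerate(loads_oldest_first):
        # The first sample seeds the cost; every later sample is averaged
        # 50/50 with the running value, halving the weight of older samples.
        cost = float(load) if i == 0 else 0.5 * cost + 0.5 * load
    return cost


def request_rate(prev_count: int, curr_count: int,
                 interval_seconds: float) -> float:
    """Requests per second between two readings of a lifetime counter.

    Assumption: a counter that goes backwards (for example after a region
    is reopened) is treated as a reset, so only the new count is used.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_count - prev_count
    if delta < 0:
        delta = curr_count
    return delta / interval_seconds


if __name__ == "__main__":
    # Samples ordered oldest to newest: load4, load3, load2, load1.
    print(weighted_region_load_cost([0, 0, 0, 8]))  # newest sample weighs .5
    print(weighted_region_load_cost([8, 0, 0, 0]))  # oldest sample weighs .125
    print(request_rate(1000, 1600, 60))             # 600 requests in 60 s
```

In this sketch the rate needs only the previous counter value and the sampling interval per region, so it does not require keeping a long request history.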