For #2, you're more than welcome to attach a patch to the JIRA.

For #1, the last time I tried to trace which JIRA introduced the formula, I ended up at one by Elliott that just moved that line of code. I can spend more time on this in the future.
What downside have you observed for #1?

Cheers

On Fri, Jan 13, 2017 at 2:07 PM, Timothy Brown <t...@siftscience.com> wrote:

> I tried it out on our staging cluster and saw that the total number of
> requests per region server was a bit more balanced with our current
> weights for the read and write costs. I did not attempt to calculate the
> exact requests per second but rather looked at a relative rate by
> averaging the increase in reads and writes over the interval at which the
> RegionLoad is currently polled. This should have the same desired effect
> of balancing the number of requests across the cluster. If you don't
> mind, I would like to take a stab at the JIRA you've created.
>
> For #1, any idea if this is the desired behavior?
>
> Thanks,
> Tim
>
> On Fri, Jan 13, 2017 at 10:27 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> > Logged HBASE-17462 for #2.
> >
> > FYI
> >
> > On Thu, Jan 12, 2017 at 8:49 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > For #2, I think MemstoreSizeCostFunction belongs to the same
> > > category if we are to adopt a moving average.
> > >
> > > Some factors to consider:
> > >
> > > The data structure used by StochasticLoadBalancer should be concise.
> > > The number of regions in a cluster can be expected to approach
> > > 1 million. We cannot afford to store a long history of read / write
> > > requests in master.
> > >
> > > Efficiency of cost calculation should be high. The balancer goes
> > > through many cost functions, so each cost function is expected to
> > > return quickly. Otherwise we would not come up with proper region
> > > movement plan(s) in time.
> > >
> > > Cheers
> > >
> > > On Wed, Jan 11, 2017 at 5:51 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > >> For #2, I think it makes sense to try out using request rates for
> > >> cost calculation.
> > >>
> > >> If the experiment result turns out to be better, we can consider
> > >> using such a measure.
> > >>
> > >> Thanks
> > >>
> > >> On Wed, Jan 11, 2017 at 5:34 PM, Timothy Brown
> > >> <t...@siftscience.com> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I have a couple of questions about the StochasticLoadBalancer.
> > >>>
> > >>> 1) In CostFromRegionLoadFunction.getRegionLoadCost the cost weights
> > >>> later samples of the RegionLoad more than previous ones. For
> > >>> example, with a queue size of 4 it would be
> > >>> (.5 * load1 + .25 * load2 + .125 * load3 + .125 * load4).
> > >>> Is this the intended behavior?
> > >>>
> > >>> 2) Would it make more sense to calculate the ReadRequestCost and
> > >>> WriteRequestCost as rates? Right now it looks like the cost is just
> > >>> based off the total number of read/write requests a region has
> > >>> gotten over its lifetime.
> > >>>
> > >>> -Tim
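
For anyone following the thread, here is a minimal sketch of the two ideas under discussion. It is not the HBase code: the function names are hypothetical, and it assumes the weights in #1 come from repeatedly averaging each new sample with the running cost, and that the "relative rate" in #2 is the mean increase of a lifetime counter between consecutive polls. Clamping negative deltas to zero (for example after a counter reset) is also my own assumption.

```python
from typing import Sequence


def halving_average_cost(samples: Sequence[float]) -> float:
    """Fold samples (oldest first) by averaging each new one with the running cost.

    Assumed recurrence: cost = 0.5 * cost + 0.5 * sample.
    For four samples this gives weights .125, .125, .25, .5 (oldest to newest).
    """
    if not samples:
        return 0.0
    cost = float(samples[0])
    for sample in samples[1:]:
        cost = 0.5 * cost + 0.5 * sample
    return cost


def mean_increase_per_poll(cumulative_counts: Sequence[float]) -> float:
    """Average increase between consecutive polls of a lifetime request counter.

    Negative deltas are clamped to zero (an assumption, not HBase behavior).
    """
    if len(cumulative_counts) < 2:
        return 0.0
    deltas = [
        max(0.0, later - earlier)
        for earlier, later in zip(cumulative_counts, cumulative_counts[1:])
    ]
    return sum(deltas) / len(deltas)


if __name__ == "__main__":
    # Oldest first: load4=40, load3=30, load2=20, load1=10 (load1 is the latest).
    # Matches .5*10 + .25*20 + .125*30 + .125*40.
    print(halving_average_cost([40, 30, 20, 10]))  # 18.75

    # A long-lived but idle region vs. a young but busy one.
    idle_region = [1_000_000, 1_000_000, 1_000_000]
    busy_region = [1_000, 4_000, 7_000]
    print(mean_increase_per_poll(idle_region))  # 0.0
    print(mean_increase_per_poll(busy_region))  # 3000.0
```

Both functions need only the short queue of recent samples per region, in line with the concern above about not keeping a long request history in master.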