Re: mod_proxy/mod_proxy_balancer bug

Jess Holle Tue, 14 Apr 2009 14:24:21 -0700

Jess Holle wrote:

proxy_handler() calls ap_proxy_pre_request() inside a do loop over balanced workers.
This in turn calls proxy_balancer_pre_request() which does

    (*worker)->s->busy++.

Correspondingly proxy_balancer_post_request() does:

        if (worker && worker->s->busy)
            worker->s->busy--;
Unfortunately, proxy_handler only calls proxy_run_post_request() and thus proxy_balancer_post_request() outside the do loop. Thus the "busy" count of workers which currently cannot take requests (e.g. that are currently dead) increases without bound due to retries -- and is never reset.
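
To make the imbalance concrete, here is a minimal stand-alone sketch of the call pattern described above. It is not httpd code: the toy_* names, the struct, and the balance_each_attempt switch are all hypothetical, and the "balanced" mode is only one possible shape for a fix, not a tested patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the shared worker state; only "busy" is modeled. */
struct toy_worker {
    int busy;   /* in-flight count, playing the role of worker->s->busy */
    int alive;  /* nonzero if the worker can currently take requests */
};

/* Mirrors the increment done in proxy_balancer_pre_request(). */
static void toy_pre_request(struct toy_worker *w)
{
    w->busy++;
}

/* Mirrors the guarded decrement done in proxy_balancer_post_request(). */
static void toy_post_request(struct toy_worker *w)
{
    if (w && w->busy)
        w->busy--;
}

/*
 * Simulate one request that walks the workers in order until one is alive.
 * The pre-request hook runs on every attempt; the post-request hook runs
 * once after the loop, for the last worker tried. When balance_each_attempt
 * is nonzero (an assumed fix), each failed attempt undoes its own increment.
 * Returns 1 if some worker served the request, 0 otherwise.
 */
static int toy_handle(struct toy_worker *ws, size_t n, int balance_each_attempt)
{
    struct toy_worker *final = NULL;
    int served = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        final = &ws[i];
        toy_pre_request(final);
        if (final->alive) {
            served = 1;
            break;
        }
        if (balance_each_attempt) {
            toy_post_request(final);
            final = NULL;   /* already balanced, nothing left to undo */
        }
    }
    toy_post_request(final);    /* the single call outside the loop */
    return served;
}
```

With two dead workers ahead of one live worker, calling toy_handle(ws, 3, 0) five times leaves busy at 5 on each dead worker and 0 on the live one; with balance_each_attempt set, all three stay at 0.
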
Does anyone (i.e. who is more familiar with this code) have suggestions for how this should be fixed? If not, I can take a swing at it.

Similarly, when retrying workers in various routines in mod_proxy_balancer.c those workers' lbstatus is incremented. If the retry fails, however, the lbstatus is never reset. This issue also leads to an lbstatus that increases without bound. Just because a worker was dead for 8 hours does not mean it can handle all the workload now. It needs to start fresh -- not 8 hours in the hole. This issue also creates an unduly huge impact when doing
    mycandidate->s->lbstatus -= total_factor;

Actually, I'm off base here. total_factor places undue emphasis on any worker that satisfies a request when multiple dead workers are retried. For instance, if there are 7 dead workers, all being retried, 2 healthy workers, and all with an lbfactor of 1, the worker that gets the request gets its lbstatus decremented by 9, whereas it really should only be decremented by 2 -- else the weighting gets thrown way off. However, it is /not/ thrown off more due to the huge lbstatus values that build up in dead workers. That only becomes an issue when dead workers come to life.
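
The 9-versus-2 arithmetic can be checked with a small stand-alone sketch. This is a simplified model rather than the balancer's actual code: the toy_member struct, toy_pick(), and the count_dead switch are hypothetical, and for brevity the candidate is chosen only among healthy members.

```c
#include <stddef.h>

/* Hypothetical balancer member; field names follow the ones used above. */
struct toy_member {
    int lbfactor;
    int lbstatus;
    int healthy;    /* 0 models a dead worker that is being retried */
};

/*
 * One simplified selection pass: every counted member has its lbstatus
 * raised by its lbfactor, the factors are summed into total_factor, and
 * the healthy member with the highest lbstatus is charged total_factor.
 * With count_dead nonzero, retried dead members are counted as well,
 * which is the behaviour described above. Returns the amount charged.
 */
static int toy_pick(struct toy_member *m, size_t n, int count_dead)
{
    struct toy_member *candidate = NULL;
    int total_factor = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!m[i].healthy && !count_dead)
            continue;
        m[i].lbstatus += m[i].lbfactor;
        total_factor += m[i].lbfactor;
        if (m[i].healthy &&
            (candidate == NULL || m[i].lbstatus > candidate->lbstatus))
            candidate = &m[i];
    }
    if (candidate)
        candidate->lbstatus -= total_factor;
    return total_factor;
}
```

For 7 dead and 2 healthy members, all with lbfactor 1, toy_pick() charges the winner 9 when dead members are counted and 2 when they are not.
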

We're seeing the load balancing be thrown dramatically off in this case.
Does anyone have suggestions for how this should be fixed? If not, again I can take a swing at this, e.g. resetting lbstatus to 0 in ap_proxy_retry_worker().
It *seems* like both of the issues center on handling of dead workers, especially having multiple dead workers and/or workers that are dead for long periods of time.
I've not yet checked whether mod_jk (where I believe these basic algorithms came from) has similar issues.
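
As a rough illustration of why a reset on retry might help, here is a toy model of a worker that comes back to life carrying an accumulated lbstatus. Everything in it is an assumption for illustration: the toy_node struct, the two-node setup, the backlog value, and the reset_on_retry switch. It does not call or reproduce ap_proxy_retry_worker().

```c
#include <stddef.h>

/* Hypothetical balancer member, reduced to the two fields that matter here. */
struct toy_node {
    int lbfactor;
    int lbstatus;
};

/*
 * One by-requests style round over live nodes: raise each lbstatus by its
 * lbfactor, pick the highest (first one wins ties), and charge the winner
 * the sum of all factors. Returns the index of the winner.
 */
static size_t toy_round(struct toy_node *n, size_t count)
{
    size_t i, best = 0;
    int total_factor = 0;

    for (i = 0; i < count; i++) {
        n[i].lbstatus += n[i].lbfactor;
        total_factor += n[i].lbfactor;
        if (n[i].lbstatus > n[best].lbstatus)
            best = i;
    }
    n[best].lbstatus -= total_factor;
    return best;
}

/*
 * Node 0 has just revived with `backlog` built up while it was dead;
 * node 1 has been healthy throughout. Returns how many of the next
 * `rounds` requests go to node 0. With reset_on_retry nonzero, node 0
 * starts from lbstatus 0 instead (the reset suggested above).
 */
static int toy_revived_share(int backlog, int rounds, int reset_on_retry)
{
    struct toy_node n[2] = { {1, 0}, {1, 0} };
    int wins = 0, r;

    n[0].lbstatus = reset_on_retry ? 0 : backlog;
    for (r = 0; r < rounds; r++)
        if (toy_round(n, 2) == 0)
            wins++;
    return wins;
}
```

With a backlog of 100, the revived node takes all of the next 10 requests; with the reset it takes 5 of 10, alternating with the healthy node.
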
--
Jess Holle
