On 8/31/23 18:46, Rainer Jung wrote:
Hi there,

mod_jk, for example, uses such aging, but only for the non-busyness case. Busyness is meant to show the number of currently in-flight requests, so aging isn't a good fit there: old load numbers are never part of busyness. But busyness is the mode that is most sensitive to the number skew effects that JFC observed, hence the attempt to have more precise counting there.

Based on the mod_jk code, I have a PR:
https://github.com/apache/httpd/pull/383


It makes sense for byrequests and bytraffic, though. But in mod_jk we use a different byrequests algorithm: not the original count-and-decrement system that Mladen introduced, but a count-and-age system.
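
As a rough illustration of the count-and-age idea, here is a minimal sketch. The struct, the field names and the halving step are assumptions made for this example; they are not taken from the mod_jk or httpd sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-worker state, for illustration only. */
typedef struct {
    uint64_t lb_value;   /* accumulated load, grows with each request */
    uint64_t lb_factor;  /* cost charged per request */
} worker_t;

/* Elect the worker with the smallest accumulated load and charge it. */
static size_t elect_worker(worker_t *w, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (w[i].lb_value < w[best].lb_value)
            best = i;
    }
    w[best].lb_value += w[best].lb_factor;
    return best;
}

/* Periodic aging: halve every counter so that old load fades out. */
static void age_workers(worker_t *w, size_t n)
{
    for (size_t i = 0; i < n; i++)
        w[i].lb_value >>= 1;
}
```

Calling age_workers() from a periodic task rather than from the request path keeps old counts from dominating the election without adding work to each request.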

The aging for byrequests and bytraffic could be hooked on mod_watchdog which is nice, because we would not need to run it as part of normal request handling.

I will look at age() and the others to see how to use it with byrequests and bytraffic.


Another thing that comes to my mind is (graceful) restart handling and bybusyness. It might make sense to clear the numbers in case of such an event.

Best regards,

Rainer

On 31.08.23 at 18:23, Jim Jagielski wrote:
IIRC, the goal of having an "aging" function was to handle this exact kind of thing, where values could be normalized over a long period of time so that old entries that may skew results are not weighted as heavily as new ones.

On Aug 30, 2023, at 11:19 AM, jean-frederic clere <jfcl...@gmail.com> wrote:

Hi,

All the balancers have thread/process safety issues, but with bybusyness the effect is worse: basically, a worker may stay with a busy count greater than zero even when no request is being processed.
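
To make the failure mode concrete: a plain busy++ / busy-- is a read-modify-write, so two threads or processes can overwrite each other's update and leave the counter permanently off. A minimal sketch of a race-free counter follows, assuming C11 atomics and a hypothetical worker_t; this is not the httpd code, and for counters shared between processes it further assumes the atomic type is lock-free on the platform.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical worker with an atomically updated in-flight counter. */
typedef struct {
    atomic_size_t busy;
} worker_t;

/* Called when a request is handed to the worker. */
static void busy_inc(worker_t *w)
{
    assert(w != NULL);
    atomic_fetch_add(&w->busy, 1);
}

/* Called when the request finishes; never goes below zero. */
static void busy_dec(worker_t *w)
{
    assert(w != NULL);
    size_t cur = atomic_load(&w->busy);
    while (cur > 0 &&
           !atomic_compare_exchange_weak(&w->busy, &cur, cur - 1)) {
        /* A failed exchange reloads cur; loop and try again. */
    }
}
```

The compare-and-swap loop in busy_dec() is a design choice for this sketch: it keeps an unmatched decrement from wrapping the unsigned counter around to a huge value.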

busy is displayed in the balancer_handler() so users/customers will notice the value doesn't return to zero...

If you run a load test, the value of busy will increase over time, and in all the workers.

When using bybusyness, peaks in the load followed by little load cause the workers with the lowest busy values to be used, while those with a wrongly high value are not used.

In a test with 3 workers, I end up with busy:
worker1: 3
worker2: 0
worker3: 2
Running the load test several times, the busy values increase in all workers.

I am wondering if we could end up with something like:
worker1: 1000
worker2: 0
worker3: 1000

In this case bybusyness will send all the load to worker2 until we reach 1000 simultaneous requests on worker2... Obviously that looks bad.
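
The effect can be reproduced with a simplified least-busy selection. The function below is a sketch written for this example, assuming ties go to the first worker; the real bybusyness code also takes other factors into account.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the worker with the lowest busy count.
 * Ties are resolved in favour of the lowest index. */
static size_t pick_least_busy(const size_t *busy, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (busy[i] < busy[best])
            best = i;
    }
    return best;
}
```

With busy = {1000, 0, 1000} and no request ever finishing, every call returns index 1 (worker2) until its count reaches 1000. Only then do the other two workers become candidates again.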

How to fix that?
1 - reset busy using a watchdog when elected (or transferred+read) has been unchanged for some time (using one of the timeouts we have on workers).
2 - warn in the docs that bybusyness is not the best choice for load balancing.
3 - create another balancer that just chooses a worker at random.
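
Option 1 could look roughly like the sketch below. The struct, the tick-based idle limit and the function name are all hypothetical; in httpd the periodic call would presumably come from a mod_watchdog callback, and the limit would be derived from one of the worker timeouts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical worker state, for illustration only. */
typedef struct {
    size_t busy;          /* in-flight counter that may have drifted */
    size_t elected;       /* total number of times the worker was chosen */
    size_t last_elected;  /* value of elected seen at the previous change */
    unsigned idle_ticks;  /* consecutive ticks without a new election */
} worker_t;

/* Run once per watchdog tick. Returns 1 if busy was reset, 0 otherwise. */
static int watchdog_tick(worker_t *w, unsigned idle_limit)
{
    assert(w != NULL);
    if (w->elected != w->last_elected) {
        /* The worker was used since the last tick: restart the idle count. */
        w->last_elected = w->elected;
        w->idle_ticks = 0;
        return 0;
    }
    if (w->idle_ticks < idle_limit)
        w->idle_ticks++;
    if (w->idle_ticks >= idle_limit && w->busy != 0) {
        /* Nothing elected for idle_limit ticks: treat busy as stale. */
        w->busy = 0;
        return 1;
    }
    return 0;
}
```

The idle limit has to be longer than the longest request the worker can legitimately hold, otherwise a slow in-flight request would be wiped from the count. That is the reason for tying it to a worker timeout.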

--
Cheers

Jean-Frederic

--
Cheers

Jean-Frederic
