Hi JFC,
I have not checked your current code, but the topic reminds me of our
history in mod_jk land. There we switched the counters to atomics where
available. The other problematic part could be how to handle process
local counters versus global counters.
Busyness was especially problematic for mod_jk as well, because we never
decremented below zero if we lost increments, but if we lost decrements
the counters stayed elevated. I think we no longer have such problems
there.
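To illustrate the idea of atomic counters with a decrement that never goes below zero, here is a minimal sketch using C11 atomics. It is not the mod_jk or mod_proxy code; the `worker_stat` type and the `busy_inc`/`busy_dec` helpers are hypothetical names, and it assumes the counter lives in memory visible to all threads that update it.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-worker state, for illustration only. */
typedef struct {
    atomic_size_t busy;
} worker_stat;

/* Atomically count one more request in flight. */
static void busy_inc(worker_stat *w)
{
    atomic_fetch_add(&w->busy, 1);
}

/* Atomically count one request less, but never go below zero. */
static void busy_dec(worker_stat *w)
{
    size_t cur = atomic_load(&w->busy);
    /* A failed compare-exchange reloads cur, so the loop retries
     * with the fresh value and stops as soon as it sees zero. */
    while (cur > 0 &&
           !atomic_compare_exchange_weak(&w->busy, &cur, cur - 1)) {
        /* retry */
    }
}
```

With atomic updates no increment or decrement is lost between threads, so the clamp at zero only acts as a safety net.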
Best regards,
Rainer
On 30.08.23 at 17:19, jean-frederic clere wrote:
Hi,
All the balancers have thread/process safety issues, but with bybusyness
the effect is worse: a worker may stay with a busy count
greater than zero even when no request is being processed.
busy is displayed in the balancer_handler() so users/customers will
notice the value doesn't return to zero...
If you run a load test, the value of busy will increase over time in
all the workers.
When using bybusyness, having peaks in the load and later not much load
means the workers with the lowest busy values get used and the ones with
a wrongly elevated value are not used.
In a test with 3 workers, I end with busy:
worker1: 3
worker2: 0
worker3: 2
Running the load test several times, the busy values keep increasing in
all workers.
I am wondering if we could end up with something like:
worker1: 1000
worker2: 0
worker3: 1000
In this case bybusyness will send all the load to worker2 until we reach
1000 simultaneous requests on worker2... Obviously that looks bad.
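The starvation effect can be sketched with a simplified "pick the least busy worker" selection. This is an assumption-laden illustration, not the mod_lbmethod_bybusyness code: `pick_least_busy` is a hypothetical helper, and it ignores the tie-breaking and load factor handling of the real module.

```c
#include <stddef.h>

/* Return the index of the worker with the lowest busy count.
 * On ties the first such worker wins (a simplification).
 * n must be at least 1. */
static size_t pick_least_busy(const size_t *busy, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (busy[i] < busy[best]) {
            best = i;
        }
    }
    return best;
}
```

With stale counts of 1000, 0 and 1000, this selection keeps returning index 1 (worker2) for every new request as long as worker2 has fewer than 1000 requests in flight.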
How to fix that?
1 - reset the busy count using a watchdog when elected (or transferred+read)
has been unchanged for some time (using one of the timeouts we have on workers).
2 - warn in the docs that bybusyness is not the best choice for
load balancing.
3 - create another balancer that just chooses a worker at random.
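Option 1 could look roughly like the following sketch. Everything here is hypothetical: the `worker_watch` struct, its field names and `watchdog_check` are made up for illustration and are not the mod_proxy data structures. It assumes the check is called periodically from a single watchdog and that an unchanged elected counter over the whole timeout means no request is really in flight, which would not hold for a single very long-running request (hence the alternative of watching transferred+read instead).

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical view of one worker as seen by the watchdog. */
typedef struct {
    size_t busy;          /* current busy count, possibly stale   */
    size_t elected;       /* number of times the worker was chosen */
    size_t seen_elected;  /* elected value at the last change      */
    time_t seen_at;       /* when that change was observed         */
} worker_watch;

/* Reset busy if elected has not moved for idle_timeout seconds.
 * Returns 1 if busy was reset, 0 otherwise. */
static int watchdog_check(worker_watch *w, time_t now, time_t idle_timeout)
{
    if (w->elected != w->seen_elected) {
        /* The worker is still being used: remember the new value. */
        w->seen_elected = w->elected;
        w->seen_at = now;
        return 0;
    }
    if (w->busy > 0 && now - w->seen_at >= idle_timeout) {
        /* No election for the whole timeout: treat busy as leaked. */
        w->busy = 0;
        return 1;
    }
    return 0;
}
```

In a real module the reset would also have to be done atomically or under the same lock as the regular increments and decrements.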