Hi Godbach,
On Sat, Jul 20, 2013 at 06:45:14PM +0800, Godbach wrote:
> Hi Willy,
>
> I tested haproxy on the latest snapshot with two servers and the
> following key configuration:
> lb algo is roundrobin
> server A: weight 128
> server B: weight 256
>
> I sent 9 requests and expected the responses to come from the servers
> in this order:
> BBABBABBA
> But I got this unexpected order instead:
> ABBAAAAAA
>
> Indeed, I noticed a problem in the implementation of roundrobin
> yesterday, before running this test, and the test confirmed it. The
> description is as below:
>
> srv->eweight can exceed SRV_EWGHT_MAX according to the formula:
> srv->eweight = srv->uweight * BE_WEIGHT_SCALE;
>
> Since uweight can reach 256 and BE_WEIGHT_SCALE equals 16, the max
> value of eweight is 256*16. But there is a macro named SRV_EWGHT_MAX,
> which equals 255*16 (SRV_UWGHT_MAX*BE_WEIGHT_SCALE). And when a server
> is inserted into the round robin tree during initialization,
> fwrr_queue_by_weight() is called:
>
> static inline void fwrr_queue_by_weight(struct eb_root *root, struct server *s)
> {
>         s->lb_node.key = SRV_EWGHT_MAX - s->eweight;
>         eb32_insert(root, &s->lb_node);
>         s->lb_tree = root;
> }
>
> When the server's weight is 256, eweight is 256*16, which is larger
> than SRV_EWGHT_MAX. As a result, the unsigned key value in the lb tree
> wraps around to a value close to UINT_MAX, and the server is not
> elected first to process requests. Consequently, roundrobin does not
> work well in my test.
>
> There is a macro SRV_UWGHT_MAX equal to 255, as below:
> #define SRV_UWGHT_RANGE 256
> #define SRV_UWGHT_MAX (SRV_UWGHT_RANGE - 1)
>
> I want to know whether the server's weight should be no larger than
> SRV_UWGHT_MAX (255), because the max value is 256 now in cfgparse.c, as below:
> else if (!strcmp(args[cur_arg], "weight")) {
>         int w;
>         w = atol(args[cur_arg + 1]);
>         if (w < 0 || w > 256) {
>
> If so, I will send a patch to fix it; otherwise it should be fixed another way.
That's quite a good analysis. I'm realizing that during the development
of the algorithm, the maximum weight was 255, and it was later changed to
256. The test code that was used for this is still in tests/filltab25.c if
you're curious. It displays the order in which servers are picked, and
applies random weight changes on the fly.
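For illustration, the wraparound described in the report can be reproduced
outside of haproxy. The sketch below is not haproxy code: fwrr_key() is a
hypothetical helper that only mirrors the subtraction done in
fwrr_queue_by_weight(), and it assumes the tree key is a 32-bit unsigned
value. The maximum user weight is passed as a parameter so that 255 and 256
can be compared.

```c
#include <assert.h>
#include <stdint.h>

#define BE_WEIGHT_SCALE 16
#define SRV_UWGHT_RANGE 256

/* Hypothetical helper: computes the key the way fwrr_queue_by_weight()
 * does (ewght_max - eweight), in 32-bit unsigned arithmetic.
 * uwght_max stands in for SRV_UWGHT_MAX (255 today, 256 after the fix).
 */
static uint32_t fwrr_key(uint32_t uwght_max, uint32_t uweight)
{
    uint32_t ewght_max = uwght_max * BE_WEIGHT_SCALE;
    uint32_t eweight   = uweight * BE_WEIGHT_SCALE;

    /* cfgparse.c accepts weights from 0 to 256 inclusive */
    assert(uweight <= SRV_UWGHT_RANGE);

    /* wraps around to a huge value when eweight > ewght_max */
    return ewght_max - eweight;
}
```

With uwght_max = 255, weight 128 gives key 2032 while weight 256 gives
4294967280, so the heaviest server sorts last instead of first. With
uwght_max = 256, the keys become 2048 and 0, and the order is as expected.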
I carefully checked, and as far as I can tell we can set SRV_UWGHT_MAX to
SRV_UWGHT_RANGE. It will have the effect of limiting the maximum number of
servers at full weight to 4095 instead of 4128 (so please update the doc for
this). The macro is also used in the leastconn algorithm, and similarly, the
effective max number of connections per server without overflowing will be
limited to 1048575 instead of 1052688 (this is not a big issue).
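One way to reproduce these four figures is to divide the 32-bit unsigned
range by the maximum effective weight. This is only a reconstruction of the
arithmetic that happens to match the numbers above; the two function names
are hypothetical, and the formulas are an assumption rather than something
taken from the haproxy sources.

```c
#include <stdint.h>

#define BE_WEIGHT_SCALE 16

/* Assumed formula: largest connection count per server such that
 * count * max_eweight still fits in a 32-bit unsigned integer.
 */
static uint32_t max_conns_per_server(uint32_t uwght_max)
{
    return UINT32_MAX / (uwght_max * BE_WEIGHT_SCALE);
}

/* Assumed formula: largest number of full-weight servers such that
 * servers * max_uweight * max_eweight still fits in 32 bits.
 */
static uint32_t max_full_weight_servers(uint32_t uwght_max)
{
    return UINT32_MAX / (uwght_max * BE_WEIGHT_SCALE * uwght_max);
}
```

Under these assumptions, a maximum weight of 255 yields 1052688 connections
and 4128 servers, and a maximum weight of 256 yields 1048575 and 4095.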
So yeah, please go ahead and send a fix for this. Please also replace
4128 with 4095 in the documentation!

Best regards,
Willy