Hi Kay,

On Wed, Jul 20, 2016 at 11:17:57AM +0200, Kay Fuchs wrote:
> Hi!
>
> 2016-07-19 11:26 GMT+02:00 Kay Fuchs <[email protected]>:
> > i'm using a stick-table with HAProxy 1.6.7 on an active/standby
> > configuration like this:
> >
> > stick-table type ipv6 size 500k expire 60s peers hacluster store
> > gpc0,conn_cur,http_req_rate(10s),http_err_rate(10s)
> > http-request track-sc0
> >
> > On the standby peer the table obviously shows wrong http_err_rates:
> >
> > 0xe6ce10: key=xxx use=0 exp=59598 gpc0=0 conn_cur=1
> > http_req_rate(10000)=1 http_err_rate(10000)=346
> > 0xe3ed80: key=xxx use=0 exp=58440 gpc0=0 conn_cur=1
> > http_req_rate(10000)=27 http_err_rate(10000)=38841809
> >
> > The active peer seems to behave as expected and shows very low error rates.
> >
> > I'm no programmer, but i think it has to do with "frqp->curr_tick" in
> > "peers.c" which seems to have the value "0" if the very first error
> > appears. This leads to sending "now_ms" to the peer. If i check
> > "frqp->curr_tick" before the encoding like
> >
> > if (frqp->curr_tick == 0)
> >     frqp->curr_tick = now_ms;
> >
> > the error rates seems reasonable on the standby peer.
>
> I think either the function "intencode" or "intdecode" in "peers.c"
> seems not to return the expected values. I've made a simple loop to
> compare the input for "intencode" with the outputs of "intdecode" for
> the encoded message. The first wrong encoded/decoded range of integers
> are 4336-4351.
>
> That might explain
> http://thread.gmane.org/gmane.comp.web.haproxy/27168 in combination
> with the sending of large integer "now_ms" reported above.

Very interesting. Yesterday Fred (in CC) found a bug there and we concluded
that it could explain such random issues that we were not able to reproduce
(possibly because we didn't face the exact faulty value). I think we'll have
a patch shortly for this.

Thanks for your feedback,
Willy
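
For anyone who wants to run the same kind of check as the loop described in
the quoted mail, here is a minimal sketch of an encode/decode round-trip
test. It is an illustration only: varint_encode(), varint_decode() and
first_roundtrip_mismatch() are hypothetical names, and the encoding is a
generic LEB128-style varint (7 payload bits per byte), assumed here for the
sake of a self-contained example. It is not the wire format used by
"intencode"/"intdecode" in "peers.c".

```c
#include <stddef.h>
#include <stdint.h>

/* Generic LEB128-style varint: 7 payload bits per byte, high bit set
 * means "more bytes follow". Assumed for illustration; this is NOT the
 * encoding implemented in HAProxy's peers.c.
 * Writes at most 10 bytes to out and returns the number written. */
static size_t varint_encode(uint64_t value, unsigned char *out)
{
    size_t n = 0;

    while (value >= 0x80) {
        out[n++] = (unsigned char)((value & 0x7f) | 0x80);
        value >>= 7;
    }
    out[n++] = (unsigned char)value;
    return n;
}

/* Decodes one varint from in[0..len). Returns the number of bytes
 * consumed, or 0 if the input is truncated or longer than 10 bytes. */
static size_t varint_decode(const unsigned char *in, size_t len,
                            uint64_t *value)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t n = 0;

    while (n < len && shift < 64) {
        unsigned char byte = in[n++];

        result |= (uint64_t)(byte & 0x7f) << shift;
        if (!(byte & 0x80)) {
            *value = result;
            return n;
        }
        shift += 7;
    }
    return 0;
}

/* Encodes then decodes every value in [lo, hi] and returns the first one
 * that does not survive the round trip, or UINT64_MAX if all of them do.
 * hi must be strictly below UINT64_MAX. */
static uint64_t first_roundtrip_mismatch(uint64_t lo, uint64_t hi)
{
    unsigned char buf[10];

    for (uint64_t v = lo; v <= hi; v++) {
        uint64_t decoded;
        size_t written = varint_encode(v, buf);
        size_t consumed = varint_decode(buf, written, &decoded);

        if (consumed != written || decoded != v)
            return v;
    }
    return UINT64_MAX;
}
```

Calling first_roundtrip_mismatch(0, 100000) and comparing the result against
UINT64_MAX is enough to sweep a range such as the 4336-4351 one mentioned
above. To test a real implementation, the two helper calls inside the loop
would be swapped for the functions under test.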

