Re: Replicated stick tables have absurd values for conn_cur

2019-01-15 Thread Willy Tarreau
On Tue, Jan 15, 2019 at 09:51:12PM +0100, Tim Düsterhus wrote: > Willy, > > On 15.01.19 at 21:41, Willy Tarreau wrote: > >> Ideally the peers would exchange their local values, only. > > > > Yes and that's how this currently works. > > I believe they exchange what they believe to be the current

Re: Replicated stick tables have absurd values for conn_cur

2019-01-15 Thread Tim Düsterhus
Willy, On 15.01.19 at 21:41, Willy Tarreau wrote: >> Ideally the peers would exchange their local values, only. > > Yes and that's how this currently works. I believe they exchange what they believe to be the current global connection count, instead of their local connection count, no? >> Thes

Re: Replicated stick tables have absurd values for conn_cur

2019-01-15 Thread Willy Tarreau
Hi Tim, On Tue, Jan 15, 2019 at 09:32:42PM +0100, Tim Düsterhus wrote: > Willy, > > On 15.01.19 at 15:32, Willy Tarreau wrote: > > Got it! I thought the problem was local to a process and that we > > replicated bad data, but in fact not, it's a distributed race. In > > this case there is no othe

Re: Replicated stick tables have absurd values for conn_cur

2019-01-15 Thread Tim Düsterhus
Willy, On 15.01.19 at 15:32, Willy Tarreau wrote: > Got it! I thought the problem was local to a process and that we > replicated bad data, but in fact not, it's a distributed race. In > this case there is no other short-term solution, and the drift has > no reason to significantly accumulate ove

Re: Replicated stick tables have absurd values for conn_cur

2019-01-15 Thread Willy Tarreau
Hi Emerson, On Tue, Jan 15, 2019 at 12:21:07PM +0100, Emerson Gomes wrote: > Hi Willy, Tim, > > I am providing some more details about my setup if you wish to try to > reproduce the issue. > As I mentioned before, I have 5 HAProxy nodes, all of them listening on > public IPs. > My DNS is set up wi

Re: Replicated stick tables have absurd values for conn_cur

2019-01-15 Thread Emerson Gomes
Hi Willy, Tim, I am providing some more details about my setup if you wish to try to reproduce the issue. As I mentioned before, I have 5 HAProxy nodes, all of them listening on public IPs. My DNS is set up with round-robin mode on AWS R53, resolving to one of the HAProxy nodes' individual IPs for e

Re: Replicated stick tables have absurd values for conn_cur

2019-01-15 Thread Willy Tarreau
Hello Emerson, On Mon, Jan 14, 2019 at 10:26:40PM +0100, Emerson Gomes wrote: > Hello Tim, > > Sorry for the delayed answer. > The segfaults I had experienced apparently were related to something else - > maybe some issue in my env. > At first I tried to apply the patch to 1.9.0, but after applyin

Re: Replicated stick tables have absurd values for conn_cur

2019-01-14 Thread Emerson Gomes
Hello Tim, Sorry for the delayed answer. The segfaults I had experienced apparently were related to something else - maybe some issue in my env. At first I tried to apply the patch to 1.9.0, but after applying it to 1.8.7, I no longer had the segfaults. So far I haven't yet experienced the underfl

Re: Replicated stick tables have absurd values for conn_cur

2019-01-12 Thread Tim Düsterhus
Emerson, On 07.01.19 at 13:40, Emerson Gomes wrote: > Just to update you, I have tried the patch, and while I didn't see any new > occurrences of the underflow, HAProxy started to crash constantly... > > Jan 07 10:32:37 afrodite haproxy[14364]: [ALERT] 006/103237 (14364) : > Current worker #1 (143

Re: Replicated stick tables have absurd values for conn_cur

2019-01-07 Thread Emerson Gomes
Hello Tim, Just to update you, I have tried the patch, and while I didn't see any new occurrences of the underflow, HAProxy started to crash constantly... Jan 07 10:32:37 afrodite haproxy[14364]: [ALERT] 006/103237 (14364) : Current worker #1 (14366) exited with code 139 (Segmentation fault) Jan 07

Re: Replicated stick tables have absurd values for conn_cur

2019-01-03 Thread Emerson Gomes
Hello Tim, Thanks a lot for the patch. I will try it out and let you know the results. BR., Emerson On Thu, Jan 3, 2019 at 21:18, Tim Düsterhus wrote: > Emerson, > > On 03.01.19 at 21:58, Emerson Gomes wrote: > > However, the underflow scenario only seems to be possible if the peers ar

Re: Replicated stick tables have absurd values for conn_cur

2019-01-03 Thread Tim Düsterhus
Emerson, On 03.01.19 at 21:58, Emerson Gomes wrote: > However, the underflow scenario only seems to be possible if the peers are > sending relative values, rather than absolute ones. I don't believe so. My hypothetical timeline was created with absolute values in mind. > Apparently both cases (a
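
To illustrate how an underflow can happen even when only absolute values are exchanged, here is a minimal Python sketch of one possible interleaving. It is an invented example, not the timeline from the original message and not HAProxy's actual peers protocol: the `Node` class, its methods, and the "last received value overwrites the local one" rule are all assumptions made for illustration.

```python
U32_MASK = 0xFFFFFFFF  # emulate a 32-bit unsigned counter


class Node:
    """Hypothetical peer holding one conn_cur counter for a single key."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.conn_cur = 0

    def open_conn(self) -> int:
        self.conn_cur = (self.conn_cur + 1) & U32_MASK
        return self.conn_cur  # absolute value that would be pushed to peers

    def close_conn(self) -> int:
        self.conn_cur = (self.conn_cur - 1) & U32_MASK
        return self.conn_cur  # absolute value that would be pushed to peers

    def receive(self, value: int) -> None:
        # Assumption: a received absolute value simply replaces the local one.
        self.conn_cur = value


a, b = Node("a"), Node("b")

push_a = a.open_conn()      # a sees 1 connection, update in flight
push_b = b.open_conn()      # b sees 1 connection, update in flight
b.receive(push_a)           # b stays at 1, although 2 connections exist
a.receive(push_b)           # a stays at 1 as well
b.receive(a.close_conn())   # a drops to 0 and pushes 0; b is overwritten with 0
b.close_conn()              # b decrements 0 and wraps around

print(a.conn_cur, b.conn_cur)
```

In this interleaving both nodes lose track of one connection because the two pushes cross, so the second close decrements a counter that is already zero.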

Re: Replicated stick tables have absurd values for conn_cur

2019-01-03 Thread Emerson Gomes
Hello Tim, Thanks for your answer. Indeed it's a very plausible explanation. And in my case I do have some clients very frequently establishing/aborting connections to all of the 5 nodes, which is increasing the odds of running into the race condition and underflow issues. However, the underflow s

Re: Replicated stick tables have absurd values for conn_cur

2019-01-03 Thread Tim Düsterhus
Emerson, On 03.01.19 at 16:19, Emerson Gomes wrote: > This works fine most of the time, but every now and then, when I check the > stick table contents, one or more IPs show up with an absurd number of > conn_cur - often around 4 billion entries - a number very close to > the 32-bit unsigned int
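
For reference, a value of around 4 billion is what an unsigned 32-bit counter shows after being decremented below zero. A minimal Python sketch, emulating 32-bit unsigned arithmetic with a mask (the helper name `dec_u32` is made up for illustration and is not an HAProxy function):

```python
U32_MASK = 0xFFFFFFFF  # largest value of a 32-bit unsigned integer


def dec_u32(value: int) -> int:
    """Decrement like a C `unsigned int`: wrap around instead of going negative."""
    return (value - 1) & U32_MASK


conn_cur = 0
conn_cur = dec_u32(conn_cur)  # one decrement too many
print(conn_cur)  # 4294967295
```

A single extra decrement on a counter that is already zero is therefore enough to produce the values reported above.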