Re: Table sticky counters decrementation problem

2021-03-30 Thread Vincent Bernat
 ❦ 30 March 2021 11:21 +02, Thomas SIMON:

> And I can confirm that after rolling back to 2.3.7, compiled from
> source (I can't do this from the repository, as only the latest version
> is available), the counters decrement correctly.

The old debs are still here, so you can still download them manually if
you need to. I need to switch to aptly at some point, since reprepro is
unlikely to ever support several versions of the same source package. I
would also have to host and build packages for Ubuntu, as it is a
common request.
-- 
Choose variable names that won't be confused.
- The Elements of Programming Style (Kernighan & Plauger)



Re: Table sticky counters decrementation problem

2021-03-30 Thread Willy Tarreau
On Tue, Mar 30, 2021 at 07:31:34PM +0200, Sander Klein wrote:
> Yes! It works. Sometimes you just need to go home, eat something and look
> again.

Oh I know how it feels, it works the same when emitting releases sometimes...

> It did need a full restart to get it going again though.

You mean "as opposed to a simple reload"? This ought to be decorrelated,
since the master always performs an exec() on reload to make sure the new
binary is properly taken into account. Or maybe it failed to restart for
whatever reason and remained on the old one.

Anyway, thanks for your tests, that makes me more confident in 2.2.12.
It's probably not for this evening though (too high risk of mistakes
caused by high context-switch rate, I need to cool down first) but likely
tomorrow morning.

Cheers,
Willy



Re: Table sticky counters decrementation problem

2021-03-30 Thread Sander Klein

On 2021-03-30 19:15, Willy Tarreau wrote:
> On Tue, Mar 30, 2021 at 07:07:41PM +0200, Sander Klein wrote:
> > On 2021-03-30 18:14, Willy Tarreau wrote:
> > > No, my chance is already gone :-)
> > >
> > > OK, I'm pushing this one into 2.3, re-running the tests a last time,
> > > and issuing 2.3.9. We'll be able to issue 2.2.12 soon finally, as users
> > > of 2.2 are still in trouble between 2.2.9 and 2.2.11 depending on the
> > > bug they try to avoid :-/
> >
> > Somehow either my patching skillz have gone down the drain or this fix
> > doesn't work for me on 2.2.11. I still see the same behavior.
>
> No worries, I'll backport whatever is needed so that you can test the
> latest maintenance version, it will make you more confident in your
> tests.

Yes! It works. Sometimes you just need to go home, eat something and
look again. It did need a full restart to get it going again though.


Sander



Re: Table sticky counters decrementation problem

2021-03-30 Thread Willy Tarreau
On Tue, Mar 30, 2021 at 07:07:41PM +0200, Sander Klein wrote:
> On 2021-03-30 18:14, Willy Tarreau wrote:
> 
> > No, my chance is already gone :-)
> > 
> > OK, I'm pushing this one into 2.3, re-running the tests a last time,
> > and issuing 2.3.9. We'll be able to issue 2.2.12 soon finally, as users
> > of 2.2 are still in trouble between 2.2.9 and 2.2.11 depending on the
> > bug they try to avoid :-/
> 
> Somehow either my patching skillz have gone down the drain or this fix
> doesn't work for me on 2.2.11. I still see the same behavior.

No worries, I'll backport whatever is needed so that you can test the
latest maintenance version, it will make you more confident in your
tests.

Thanks!
Willy



Re: Table sticky counters decrementation problem

2021-03-30 Thread Sander Klein

On 2021-03-30 18:14, Willy Tarreau wrote:
> No, my chance is already gone :-)
>
> OK, I'm pushing this one into 2.3, re-running the tests a last time,
> and issuing 2.3.9. We'll be able to issue 2.2.12 soon finally, as users
> of 2.2 are still in trouble between 2.2.9 and 2.2.11 depending on the
> bug they try to avoid :-/

Somehow either my patching skillz have gone down the drain or this fix
doesn't work for me on 2.2.11. I still see the same behavior.


Sander



Re: Table sticky counters decrementation problem

2021-03-30 Thread Willy Tarreau
On Tue, Mar 30, 2021 at 06:33:12PM +0200, Thomas SIMON wrote:
> Hi Willy,
> 
> just to confirm that the sticky counter decrement is okay with your patch
> on 2.3.8, so no objection to the 2.3.9 patch either :)

Great, thanks for the test. I've just committed the fix and am preparing
2.3.9 now.

Willy



Re: Table sticky counters decrementation problem

2021-03-30 Thread Thomas SIMON

Hi Willy,

just to confirm that the sticky counter decrement is okay with your patch
on 2.3.8, so no objection to the 2.3.9 patch either :)


thomas

On 30/03/2021 at 15:47, Willy Tarreau wrote:
> On Tue, Mar 30, 2021 at 03:17:34PM +0200, Sander Klein wrote:
> > On 2021-03-30 15:13, Willy Tarreau wrote:
> > > diff --git a/src/time.c b/src/time.c
> > > index 0cfc9bf3c..fafe3720e 100644
> > > --- a/src/time.c
> > > +++ b/src/time.c
> > > @@ -268,7 +268,7 @@ void tv_update_date(int max_wait, int interrupted)
> > >  	old_now_ms = global_now_ms;
> > >  	do {
> > >  		new_now_ms = old_now_ms;
> > > -		if (tick_is_lt(new_now_ms, now_ms))
> > > +		if (tick_is_lt(new_now_ms, now_ms) || !new_now_ms)
> > >  			new_now_ms = now_ms;
> > >  	}  while (!_HA_ATOMIC_CAS(&global_now_ms, &old_now_ms, new_now_ms));
> >
> > Do I need to apply this on top of the other fixes? Or should this be done on
> > the vanilla 2.2.11?
>
> It's indeed on top of other fixes like those present in 2.3.8 or queued
> in 2.2-maint.
>
> Just let me know if you need some help with the patch or if you need another
> one. I've mostly focused on 2.3 for now since 2.3.8 was expected to be
> definitely fixed and I wanted to do 2.2.12 today based on it.
>
> Thanks!
> Willy


--
Thomas SIMON
Infrastructure Manager
Neteven




Re: Table sticky counters decrementation problem

2021-03-30 Thread Willy Tarreau
On Tue, Mar 30, 2021 at 06:15:28PM +0200, William Dauchy wrote:
> On Tue, Mar 30, 2021 at 5:57 PM Willy Tarreau wrote:
> > out of curiosity I wanted to check when the overflow happened:
> >
> > $ date --date=@$(((($(date +%s) * 1000) & -0x8000000) / 1000))
> > Mon Mar 29 23:59:46 CEST 2021
> >
> > So it only affects processes started since today. I'm quite tempted not
> > to wait further and to emit 2.3.9 urgently to fix this before other
> > people get trapped after reloading their process. Any objection ?
> 
> I do confirm the timestamp on our side but do not have the necessary
> tooling to test the fix.

Many thanks William!

willy



Re: Table sticky counters decrementation problem

2021-03-30 Thread William Dauchy
On Tue, Mar 30, 2021 at 5:57 PM Willy Tarreau wrote:
> out of curiosity I wanted to check when the overflow happened:
>
> $ date --date=@$(((($(date +%s) * 1000) & -0x8000000) / 1000))
> Mon Mar 29 23:59:46 CEST 2021
>
> So it only affects processes started since today. I'm quite tempted not
> to wait further and to emit 2.3.9 urgently to fix this before other
> people get trapped after reloading their process. Any objection ?

I do confirm the timestamp on our side but do not have the necessary
tooling to test the fix.

Thanks,
-- 
William



Re: Table sticky counters decrementation problem

2021-03-30 Thread Willy Tarreau
On Tue, Mar 30, 2021 at 06:09:09PM +0200, Lukas Tribus wrote:
> Hi Willy,
> 
> On Tue, 30 Mar 2021 at 17:56, Willy Tarreau wrote:
> >
> > Guys,
> >
> > out of curiosity I wanted to check when the overflow happened:
> >
> > $ date --date=@$(((($(date +%s) * 1000) & -0x8000000) / 1000))
> > Mon Mar 29 23:59:46 CEST 2021
> >
> > So it only affects processes started since today. I'm quite tempted not
> > to wait further and to emit 2.3.9 urgently to fix this before other
> > people get trapped after reloading their process. Any objection ?
> 
> No objection, but also: what a coincidence. I suggest you get a
> lottery ticket today.

No, my chance is already gone :-)

OK, I'm pushing this one into 2.3, re-running the tests a last time,
and issuing 2.3.9. We'll be able to issue 2.2.12 soon finally, as users
of 2.2 are still in trouble between 2.2.9 and 2.2.11 depending on the
bug they try to avoid :-/

Willy



Re: Table sticky counters decrementation problem

2021-03-30 Thread Lukas Tribus
Hi Willy,

On Tue, 30 Mar 2021 at 17:56, Willy Tarreau wrote:
>
> Guys,
>
> out of curiosity I wanted to check when the overflow happened:
>
> $ date --date=@$(((($(date +%s) * 1000) & -0x8000000) / 1000))
> Mon Mar 29 23:59:46 CEST 2021
>
> So it only affects processes started since today. I'm quite tempted not
> to wait further and to emit 2.3.9 urgently to fix this before other
> people get trapped after reloading their process. Any objection ?

No objection, but also: what a coincidence. I suggest you get a
lottery ticket today.


cheers,
lukas



Re: Table sticky counters decrementation problem

2021-03-30 Thread Willy Tarreau
Guys,

out of curiosity I wanted to check when the overflow happened:

$ date --date=@$(((($(date +%s) * 1000) & -0x8000000) / 1000))
Mon Mar 29 23:59:46 CEST 2021

So it only affects processes started since today. I'm quite tempted not
to wait further and to emit 2.3.9 urgently to fix this before other
people get trapped after reloading their process. Any objection ?
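
The same rounding can be reproduced offline. The following is only a sketch in Python: the mask value `-0x8000000` (2**27), the sending time of 17:56 CEST, and the helper name `last_boundary_ms` are assumptions made for illustration, and the last line assumes that now_ms is the low 32 bits of the epoch time in milliseconds, which matches the figures quoted in this thread.

```python
from datetime import datetime, timedelta, timezone

MASK = -0x8000000  # assumed mask: clears the low 27 bits of the millisecond count
CEST = timezone(timedelta(hours=2))


def last_boundary_ms(epoch_s):
    """Round a Unix time in seconds down to a multiple of 2**27 milliseconds."""
    return (epoch_s * 1000) & MASK


# Assumed sending time of the message: 30 March 2021, 17:56 CEST.
sent = datetime(2021, 3, 30, 17, 56, tzinfo=CEST)
boundary_ms = last_boundary_ms(int(sent.timestamp()))

# Same instant as the one printed by the date command.
print(datetime.fromtimestamp(boundary_ms // 1000, tz=CEST))  # 2021-03-29 23:59:46+02:00

# At that instant the low 32 bits of the millisecond clock equal 2**31.
print(boundary_ms % 2**32 == 2**31)  # True
```

With these assumptions, the boundary found by the command is the moment the 32-bit millisecond value crossed 2**31.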

Willy



Re: Table sticky counters decrementation problem

2021-03-30 Thread Willy Tarreau
On Tue, Mar 30, 2021 at 03:17:34PM +0200, Sander Klein wrote:
> On 2021-03-30 15:13, Willy Tarreau wrote:
> 
> > diff --git a/src/time.c b/src/time.c
> > index 0cfc9bf3c..fafe3720e 100644
> > --- a/src/time.c
> > +++ b/src/time.c
> > @@ -268,7 +268,7 @@ void tv_update_date(int max_wait, int interrupted)
> >  	old_now_ms = global_now_ms;
> >  	do {
> >  		new_now_ms = old_now_ms;
> > -		if (tick_is_lt(new_now_ms, now_ms))
> > +		if (tick_is_lt(new_now_ms, now_ms) || !new_now_ms)
> >  			new_now_ms = now_ms;
> >  	}  while (!_HA_ATOMIC_CAS(&global_now_ms, &old_now_ms, new_now_ms));
> 
> Do I need to apply this on top of the other fixes? Or should this be done on
> the vanilla 2.2.11?

It's indeed on top of other fixes like those present in 2.3.8 or queued
in 2.2-maint.

Just let me know if you need some help with the patch or if you need another
one. I've mostly focused on 2.3 for now since 2.3.8 was expected to be
definitely fixed and I wanted to do 2.2.12 today based on it.

Thanks!
Willy



Re: Table sticky counters decrementation problem

2021-03-30 Thread Sander Klein

On 2021-03-30 15:13, Willy Tarreau wrote:
> diff --git a/src/time.c b/src/time.c
> index 0cfc9bf3c..fafe3720e 100644
> --- a/src/time.c
> +++ b/src/time.c
> @@ -268,7 +268,7 @@ void tv_update_date(int max_wait, int interrupted)
>  	old_now_ms = global_now_ms;
>  	do {
>  		new_now_ms = old_now_ms;
> -		if (tick_is_lt(new_now_ms, now_ms))
> +		if (tick_is_lt(new_now_ms, now_ms) || !new_now_ms)
>  			new_now_ms = now_ms;
>  	}  while (!_HA_ATOMIC_CAS(&global_now_ms, &old_now_ms, new_now_ms));

Do I need to apply this on top of the other fixes? Or should this be
done on the vanilla 2.2.11?


Sander



Re: Table sticky counters decrementation problem

2021-03-30 Thread Willy Tarreau
On Tue, Mar 30, 2021 at 02:54:55PM +0200, Willy Tarreau wrote:
> And I've tested using the same method (http_req_rate(2s) and 500ms this
> time to cover both >1s and <1s). So I don't know what to say. I'm now
> extremely tempted to revert all these fixes because in the end the
> original problem was much less visible for most users :-(
> 
> I'm now trying with this *exact* config in case I missed something else.

So I was not crazy. The reason is that... the date changed since my
tests one week ago :-(

The new date is updated if it's in the past compared to the newest one,
except that it starts at zero. Last week during my tests, the now_ms
date was ~1.6 billion. 0-1.6 billion is negative so the new_now_ms date
was updated to reflect it.

But today the date is ~2.2 billion, which is higher than 2^31, thus
0-2.2 billion is positive and the new date is not after the old one so
the old one is not updated.
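
The arithmetic in the two paragraphs above can be checked with a rough Python model. This is not HAProxy code: `tick_is_lt` below is a stand-in that assumes the comparison is "the signed 32-bit difference is negative", as described in this message, and `to_int32` is a hypothetical helper that mimics a C int wrapping around.

```python
def to_int32(value):
    """Wrap an integer to a signed 32-bit value, like a C int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def tick_is_lt(a, b):
    """Assumed model: a is before b when the signed 32-bit difference a - b is negative."""
    return to_int32(a - b) < 0


last_week_ms = 1_600_000_000  # ~1.6 billion, below 2**31
today_ms = 2_200_000_000      # ~2.2 billion, above 2**31

# 0 - 1.6 billion stays negative, so zero counts as "in the past".
print(tick_is_lt(0, last_week_ms))  # True

# 0 - 2.2 billion wraps to a positive value, so zero no longer counts as older.
print(tick_is_lt(0, today_ms))      # False
```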

Thus we need to special case the update to reflect this. What saddens
me is that I hesitated to completely rewrite this part to simplify it
and concluded that I'd rather stay on the safe side. Someone getting
rid of legacy would be better, really!

This patch fixes it.

Sander, Thomas, please check again with it, it MUST work this time!

Thanks,
Willy

diff --git a/src/time.c b/src/time.c
index 0cfc9bf3c..fafe3720e 100644
--- a/src/time.c
+++ b/src/time.c
@@ -268,7 +268,7 @@ void tv_update_date(int max_wait, int interrupted)
 	old_now_ms = global_now_ms;
 	do {
 		new_now_ms = old_now_ms;
-		if (tick_is_lt(new_now_ms, now_ms))
+		if (tick_is_lt(new_now_ms, now_ms) || !new_now_ms)
 			new_now_ms = now_ms;
 	}  while (!_HA_ATOMIC_CAS(&global_now_ms, &old_now_ms, new_now_ms));
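
To see what the extra `|| !new_now_ms` test changes, here is a simplified single-threaded model in Python. It is a sketch under stated assumptions, not the real function: the `_HA_ATOMIC_CAS` retry loop is left out, `next_global_now_ms` and its `patched` flag are hypothetical names, and `tick_is_lt` is assumed to compare the signed 32-bit difference.

```python
def tick_is_lt(a, b):
    """Assumed model: true when the 32-bit difference a - b has its sign bit set."""
    return ((a - b) & 0xFFFFFFFF) >= 0x80000000


def next_global_now_ms(global_now_ms, now_ms, patched):
    """One pass of the update: return the value the shared date would take."""
    new_now_ms = global_now_ms
    # The patch adds the zero check so that the initial value is always replaced.
    if tick_is_lt(new_now_ms, now_ms) or (patched and not new_now_ms):
        new_now_ms = now_ms
    return new_now_ms


# Shared date still at its initial zero, local date ~1.6 billion: updated either way.
print(next_global_now_ms(0, 1_600_000_000, patched=False))  # 1600000000

# Local date ~2.2 billion: without the patch the shared date stays at zero.
print(next_global_now_ms(0, 2_200_000_000, patched=False))  # 0
print(next_global_now_ms(0, 2_200_000_000, patched=True))   # 2200000000
```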
 




Re: Table sticky counters decrementation problem

2021-03-30 Thread Willy Tarreau
On Tue, Mar 30, 2021 at 10:17:01AM +0200, Lukas Tribus wrote:
> Hello Thomas,
> 
> 
> this is a known issue in any release train other than 2.3 ...
> 
> https://github.com/haproxy/haproxy/issues/1196
> 
> However neither 2.3.7 (does not contain the offending commits), nor
> 2.3.8 (contains all the fixes) should be affected by this.
> 
> 
> Are you absolutely positive that you are running 2.3.8 and not
> something like 2.2 or 2.0 ? Can you provide the full output of haproxy
> -vv?

I must say that I'm completely puzzled because that's exactly the issue
that affected 2.3.7, and which was fixed in 2.3.8. Here it seems to do
exactly the opposite!

And I've tested using the same method (http_req_rate(2s) and 500ms this
time to cover both >1s and <1s). So I don't know what to say. I'm now
extremely tempted to revert all these fixes because in the end the
original problem was much less visible for most users :-(

I'm now trying with this *exact* config in case I missed something else.

Thanks,
Willy



Re: Table sticky counters decrementation problem

2021-03-30 Thread Thomas SIMON

Hi Lukas,

I'm on 2.3.8 yes

root@web12:~# haproxy -vv
HA-Proxy version 2.3.8-1~bpo10+1 2021/03/25 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2022.
Known bugs: http://www.haproxy.org/bugs/bugs-2.3.8.html
Running on: Linux 5.4.78-2-pve #1 SMP PVE 5.4.78-2 (Thu, 03 Dec 2020 
14:26:17 +0100) x86_64

Build options :
  TARGET  = linux-glibc
  CPU = generic
  CC  = cc
  CFLAGS  = -O2 -g -O2 -fdebug-prefix-map=/build/haproxy-2.3.8=. 
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time 
-D_FORTIFY_SOURCE=2 -Wall -Wextra -Wdeclaration-after-statement -fwrapv 
-Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered 
-Wno-missing-field-initializers -Wno-cast-function-type -Wtype-limits 
-Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond 
-Wnull-dereference
  OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_LUA=1 
USE_ZLIB=1 USE_SYSTEMD=1

  DEBUG   =

Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT +PCRE2 
+PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +BACKTRACE 
-STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT 
+CRYPT_H +GETADDRINFO +OPENSSL +LUA +FUTEX +ACCEPT4 -CLOSEFROM +ZLIB 
-SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL 
+SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS


Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=8).
Built with OpenSSL version : OpenSSL 1.1.1d  10 Sep 2019
Running on OpenSSL version : OpenSSL 1.1.1d  10 Sep 2019
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.3
Built with network namespace support.
Built with the Prometheus exporter as a service
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), 
deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT 
IPV6_TRANSPARENT IP_FREEBIND

Built with PCRE2 version : 10.32 2018-09-10
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 8.3.0

Available polling systems :
  epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTTP   side=FE|BE mux=H2
            fcgi : mode=HTTP   side=BE    mux=FCGI
       <default> : mode=HTTP   side=FE|BE mux=H1
       <default> : mode=TCP    side=FE|BE mux=PASS

Available services : prometheus-exporter
Available filters :
    [SPOE] spoe
    [CACHE] cache
    [FCGI] fcgi-app
    [COMP] compression
    [TRACE] trace

I'm using the buster-backports repository, and I updated the package
yesterday morning.


[11:15:44]root@web12:~# cat /etc/apt/sources.list.d/haproxy.list
deb [signed-by=/usr/share/keyrings/haproxy.debian.net.gpg] http://haproxy.debian.net buster-backports-2.3 main


root@web12:~# aptcp haproxy
haproxy:
  Installed: 2.3.8-1~bpo10+1
  Candidate: 2.3.8-1~bpo10+1
  Version table:
 *** 2.3.8-1~bpo10+1 100
    100 http://haproxy.debian.net buster-backports-2.3/main amd64 
Packages

    100 /var/lib/dpkg/status

root@web12:~# grep haproxy /var/log/dpkg.log
2021-03-29 09:31:56 upgrade haproxy:amd64 2.3.7-1~bpo10+1 2.3.8-1~bpo10+1
2021-03-29 09:31:56 status half-configured haproxy:amd64 2.3.7-1~bpo10+1
2021-03-29 09:31:56 status unpacked haproxy:amd64 2.3.7-1~bpo10+1
2021-03-29 09:31:56 status half-installed haproxy:amd64 2.3.7-1~bpo10+1
2021-03-29 09:31:56 status unpacked haproxy:amd64 2.3.8-1~bpo10+1
2021-03-29 09:31:57 configure haproxy:amd64 2.3.8-1~bpo10+1 
2021-03-29 09:31:57 status unpacked haproxy:amd64 2.3.8-1~bpo10+1
2021-03-29 09:32:06 conffile /etc/logrotate.d/haproxy keep
2021-03-29 09:32:06 status half-configured haproxy:amd64 2.3.8-1~bpo10+1
2021-03-29 09:32:08 status installed haproxy:amd64 2.3.8-1~bpo10+1

And I can confirm that after rolling back to 2.3.7, compiled from
source (I can't do this from the repository, as only the latest version
is available), the counters decrement correctly.


thanks

thomas

On 30/03/2021 at 10:17, Lukas Tribus wrote:
> Hello Thomas,
>
>
> this is a known issue in any release train other than 2.3 ...
>
> https://github.com/haproxy/haproxy/issues/1196
>
> However neither 2.3.7 (does not contain the offending commits), nor
> 2.3.8 (contains all the fixes) should be affected by this.
>
>
> Are you absolutely positive that you are running 2.3.8 and not
> something like 2.2 or 2.0 ? Can you provide the full output of haproxy
> -vv?
>
> Thanks,
>
> Lukas


--
Thomas SIMON
Infrastructure Manager
Neteven




Re: Table sticky counters decrementation problem

2021-03-30 Thread Sander Klein

On 2021-03-30 10:17, Lukas Tribus wrote:
> Hello Thomas,
>
>
> this is a known issue in any release train other than 2.3 ...
>
> https://github.com/haproxy/haproxy/issues/1196
>
> However neither 2.3.7 (does not contain the offending commits), nor
> 2.3.8 (contains all the fixes) should be affected by this.
>
>
> Are you absolutely positive that you are running 2.3.8 and not
> something like 2.2 or 2.0 ? Can you provide the full output of haproxy
> -vv?



I can confirm I'm seeing this on 2.3.8 as well. But moreover, I also see 
this happening on 2.2.11 with Willy's patches in it as well.


I am very confused because I am pretty sure this problem was gone last 
week when I tested the patches and took that version in production.


Sander



Re: Table sticky counters decrementation problem

2021-03-30 Thread Lukas Tribus
Hello Thomas,


this is a known issue in any release train other than 2.3 ...

https://github.com/haproxy/haproxy/issues/1196

However neither 2.3.7 (does not contain the offending commits), nor
2.3.8 (contains all the fixes) should be affected by this.


Are you absolutely positive that you are running 2.3.8 and not
something like 2.2 or 2.0 ? Can you provide the full output of haproxy
-vv?



Thanks,

Lukas



Table sticky counters decrementation problem

2021-03-30 Thread Thomas SIMON

Hi all,

Since version 2.3.8, I've noticed a problem with some sticky counters,
which only increment and never decrement. The behavior was OK in 2.3.7.



frontend web
    bind *:443 ssl crt /etc/ssl/certs/...
    http-request track-sc0 src table global_limits

backend global_limits
    stick-table type ip size 1m expire 1h store conn_cur,http_req_rate(20s),http_err_rate(1h)



Stick table

echo "show table global_limits" | socat stdio unix-connect:/run/haproxy/admin.sock
0x7ff0f4027d40: key=195.219.xxx.xxx use=2 exp=3599384 conn_cur=2 http_req_rate(20000)=607 http_err_rate(3600000)=0


One minute after :

0x7ff0f4027d40: key=195.219.250.105 use=2 exp=3599923 conn_cur=2 http_req_rate(20000)=689 http_err_rate(3600000)=0


Conn_cur increments and decrements well, but http_req_rate and
http_err_rate don't.
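
To compare two such samples without reading them by eye, a small Python sketch along these lines could be used. The helper names `parse_table_entry` and `rate_deltas` are hypothetical. The sample lines are modelled on the output above, with the key masked in both and the counter periods written in milliseconds (20000 for 20s, 3600000 for 1h), which is an assumption about how `show table` prints them.

```python
def parse_table_entry(line):
    """Split one stick-table entry line into a dict of name -> value strings."""
    _, _, fields = line.partition(": ")
    return dict(field.split("=", 1) for field in fields.split())


def rate_deltas(before, after):
    """Return the change of every *_rate counter between two parsed entries."""
    return {
        name: int(after[name]) - int(before[name])
        for name in after
        if "_rate(" in name
    }


first = parse_table_entry(
    "0x7ff0f4027d40: key=195.219.xxx.xxx use=2 exp=3599384 conn_cur=2 "
    "http_req_rate(20000)=607 http_err_rate(3600000)=0"
)
one_minute_later = parse_table_entry(
    "0x7ff0f4027d40: key=195.219.xxx.xxx use=2 exp=3599923 conn_cur=2 "
    "http_req_rate(20000)=689 http_err_rate(3600000)=0"
)

# The 20-second request rate was still higher a minute later.
print(rate_deltas(first, one_minute_later))
# {'http_req_rate(20000)': 82, 'http_err_rate(3600000)': 0}
```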



regards
thomas