Re: Server timeouts since HAProxy 2.2

2022-08-11 Thread William Edwards

William Edwards wrote on 2022-08-08 16:24:

Hi,

William Edwards wrote on 2022-08-06 17:41:

Igor Cicimov wrote on 2022-08-04 01:46:

Because of keep-alive?


Disabling keepalive on the server side using `option
http-server-close` fixes the issue. I've yet to figure out why.


The connections with which this issue occurs are always in FIN_WAIT2
("Connection is closed, and the socket is waiting for a shutdown from
the remote end.") on the backend server, and in CLOSE_WAIT ("The
remote end has shut down, waiting for the socket to close.") on the
HAProxy side. The connections on both sides are shut down within 25 to
30 seconds (before /proc/sys/net/ipv4/tcp_fin_timeout). HAProxy does
wait until `timeout server` before terminating the request. Could it
be that the active close is delayed due to a bug?


I upgraded one system from HAProxy 2.2.9 to 2.4.18. The issue doesn't 
occur anymore. I'll bite the bullet and upgrade to 2.4.x everywhere.




Re: Server timeouts since HAProxy 2.2

2022-08-08 Thread William Edwards

Hi,

William Edwards wrote on 2022-08-06 17:41:

Igor Cicimov wrote on 2022-08-04 01:46:

Because of keep-alive?


Disabling keepalive on the server side using `option
http-server-close` fixes the issue. I've yet to figure out why.


The connections with which this issue occurs are always in FIN_WAIT2 
("Connection is closed, and the socket is waiting for a shutdown from 
the remote end.") on the backend server, and in CLOSE_WAIT ("The remote 
end has shut down, waiting for the socket to close.") on the HAProxy 
side. The connections on both sides are shut down within 25 to 30 
seconds (before /proc/sys/net/ipv4/tcp_fin_timeout). HAProxy does wait 
until `timeout server` before terminating the request. Could it be that 
the active close is delayed due to a bug?
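
One way to check which side holds the half-closed connections is to tally socket states on each host. The following is a minimal sketch and not something taken from this thread: it assumes a Linux host, where `/proc/net/tcp` lists one socket per line with the state as a hex code in the fourth column. The function name `count_tcp_states` and the sample rows are made up for illustration.

```python
from collections import Counter

# Linux TCP state codes as shown in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts


# Hypothetical sample rows (not captured from the hosts in this thread).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:E592 0100007F:0050 08 00000000:00000000
   1: 0100007F:E593 0100007F:0050 01 00000000:00000000
   2: 0100007F:E594 0100007F:0050 08 00000000:00000000
"""

if __name__ == "__main__":
    for state, number in count_tcp_states(SAMPLE).most_common():
        print(state, number)
```

On a real host, the text would come from reading `/proc/net/tcp` (and `/proc/net/tcp6` for IPv6 sockets); a steadily growing CLOSE_WAIT count on the HAProxy side would match the symptom described above.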


Here is a dump of a session with this issue:

```
root@lb0-1:~# haproxyctl show sess
0x7f13c4030020: proto=tcpv4 src=185.233.175.144:54656 fe=fr_other be=bk_http.kls srv=http-kls03.vk.ha.cyberfusion.cloud ts=00 age=44s calls=6 rate=0 cpu=0 lat=0 rq[f=498400a0h,i=0,an=8000h,rx=,wx=,ax=] rp[f=8004h,i=0,an=400h,rx=4m16s,wx=,ax=] s0=[8,290008h,fd=35,ex=] s1=[8,200118h,fd=36,ex=] exp=4m16s

root@lb0-1:~# haproxyctl show sess 0x7f13c4030020
0x7f13c4030020: [08/Aug/2022:15:41:12.450130] id=14094 proto=tcpv4 source=185.233.175.144:54656
  flags=0xc4e, conn_retries=3, srv_conn=0x560d53757c60, pend_pos=(nil) waiting=0
  frontend=fr_other (id=5 mode=http), listener=? (id=2) addr=185.233.175.132:443
  backend=bk_http.kls (id=15 mode=http) addr=10.40.7.10:58770
  server=http-kls03.vk.ha.cyberfusion.cloud (id=1) addr=10.40.7.13:80
  task=0x7f13c4037eb0 (state=0x00 nice=0 calls=6 rate=0 exp=4m8s tmask=0x2 age=52s)
  txn=0x7f13c4036ba0 flags=0x3000 meth=1 status=200 req.st=MSG_DONE rsp.st=MSG_DATA req.f=0x4e rsp.f=0x0e
  si[0]=0x7f13c40302c8 (state=EST flags=0x290008 endp0=CS:0x7f13c40285b0 exp= et=0x000 sub=0)
  si[1]=0x7f13c4030320 (state=EST flags=0x200118 endp1=CS:0x7f13c401f9e0 exp= et=0x000 sub=1)
  co0=0x7f13c4030890 ctrl=tcpv4 xprt=SSL mux=H1 data=STRM target=LISTENER:0x560d52929730
  flags=0x80043300 fd=35 fd.state=22 updt=0 fd.tmask=0x2
  cs=0x7f13c40285b0 csf=0xd000 ctx=0x7f13c401fa30
  co1=0x7f13c401fd60 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x560d53757c60
  flags=0x3300 fd=36 fd.state=22 updt=0 fd.tmask=0x2
  cs=0x7f13c401f9e0 csf=0x ctx=0x7f13c401f760
  req=0x7f13c4030030 (f=0x498400a0 an=0x8000 pipe=0 tofwd=0 total=124)
  an_exp= rex= wex=
  buf=0x7f13c4030038 data=(nil) o=0 p=0 i=0 size=0
  htx=0x560d50a5e7c0 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
  res=0x7f13c4030090 (f=0x8004 an=0x400 pipe=0 tofwd=-1 total=5199)
  an_exp= rex=4m8s wex=
  buf=0x7f13c4030098 data=(nil) o=0 p=0 i=0 size=0
  htx=0x560d50a5e7c0 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
```
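
To read the timers in such a dump more easily, the summary line can be split into its key=value tokens and the durations converted to seconds. This is only a sketch under assumptions: the helper names `parse_duration` and `session_fields` are hypothetical, and the duration parser only handles the day/hour/minute/second units that appear above (no sub-second values).

```python
import re

_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert a compact duration such as '4m16s' or '44s' to seconds."""
    parts = re.findall(r"(\d+)([dhms])", text)
    # Reject anything that is not made up entirely of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def session_fields(summary_line):
    """Return the top-level key=value tokens of a 'show sess' summary line."""
    fields = {}
    for token in summary_line.split():
        key, sep, value = token.partition("=")
        # Skip tokens without '=' and bracketed groups such as rq[...] / rp[...].
        if sep and "[" not in key:
            fields[key] = value
    return fields


# Summary line copied from the dump above.
LINE = (
    "0x7f13c4030020: proto=tcpv4 src=185.233.175.144:54656 fe=fr_other "
    "be=bk_http.kls srv=http-kls03.vk.ha.cyberfusion.cloud ts=00 age=44s "
    "calls=6 rate=0 cpu=0 lat=0 rq[f=498400a0h,i=0,an=8000h,rx=,wx=,ax=] "
    "rp[f=8004h,i=0,an=400h,rx=4m16s,wx=,ax=] "
    "s0=[8,290008h,fd=35,ex=] s1=[8,200118h,fd=36,ex=] exp=4m16s"
)

if __name__ == "__main__":
    fields = session_fields(LINE)
    age = parse_duration(fields["age"])
    remaining = parse_duration(fields["exp"])
    print(f"age={age}s exp={remaining}s total={age + remaining}s")
```

For this session, age (44s) plus exp (4m16s) adds up to 300 seconds, and the detailed dump gives the same total (52s plus 4m8s). That is consistent with the stream waiting out the 5-minute server timeout mentioned in the original report.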

P.S. doc/management.txt says:

show sess 
  [...]
  You may find a description of all fields
  returned in src/dumpstats.c

... but these fields were moved away from src/dumpstats.c in commit 
74c24fb071ed9b76e10e12c237d2e7d3ff4025d8. I'm not sure where to report 
this issue.







Re: Server timeouts since HAProxy 2.2

2022-08-06 Thread William Edwards

Vincent Bernat wrote on 2022-08-04 12:14:

On 2022-08-04 10:35, William Edwards wrote:

However, https://haproxy.debian.net/#distribution=Debian=buster=2.2 says:


"The Debian HAProxy packaging team provides various versions of 
HAProxy packages for use on different Debian or Ubuntu systems. The 
following wizard helps you to find the package suitable for your 
system. [...] You will get a stable release of HAProxy 2.2: you may 
not get the latest version but important fixes from later versions are 
included. Moreover, regressions are unlikely."


The bugs page tries to get users to ALWAYS use the latest version. But 
the haproxy.debian.net page says that it's okay not to use the latest 
version.


Those are two different points of view, one from Debian, one from
upstream. They are difficult to reconcile. That's why you (as a user)
have to choose: an old version with only "important" fixes (security
fixes mostly) and with known bugs but unlikely regressions on upgrade,
or a recent version of a stable branch with fixes and sometimes
regressions.

Upstream is unlikely to help debug old versions. The Debian solution
is to report the issue on bugs.debian.org, but this does not scale
well and I am likely to just ignore the bug because I am too short on
time.


The statement on the HAProxy bugs page implies that there is only one 
right way. That same website refers to haproxy.debian.net, which 
contradicts the former. I understand that the points of view are 
difficult to reconcile. However, when the user is actively pointed 
towards both sources, I do not think they should contradict each other.



If 2.2.9 as in official Debian repository does not work for you,
the easiest path is to upgrade to 2.2.25 using the second set of
instructions.


I found this bug[1] on the bugs page which looks promising. I'll do
some more investigation today. Perhaps someone could corroborate that
that bug's symptoms match what I'm seeing.


Note that if this patch fixes this bug, this is a lot of work to
integrate it into the current release of Debian. This will have to
wait for the next point release (not a security issue), I would need
to ask people to authorize the patch, explain, ask again, prepare,
upload, then upload the backports until you get the resulting package
available as 2.2.9-2+deb11u4~bpo10+1. Backporting a random patch may
trigger regressions as it may need other patches to be backported.
This is a nest of problems. So, if this patch solves your issue, you
are on your own maintaining a fork of the package.

The commit mentioned in the patch (eddcfbc1911c when backported) is
introduced in 2.2.23, so it's likely not the patch you need or you
need other patches as well.


According to http://www.haproxy.org/bugs, 2.2.9 is affected by the 
bug[1]. However, the changelog[2] only shows the causing commit 
("BUG/MEDIUM: mux-h2: make use of http-request and keep-alive timeouts") 
to be included in 2.2.23. How could 2.2.9 be affected by a bug which was 
introduced by a commit that is included in 2.2.23?


[1]: http://git.haproxy.org/?p=haproxy-2.2.git;a=commitdiff;h=3e2434e
[2]: http://git.haproxy.org/?p=haproxy-2.2.git;a=blob_plain;f=CHANGELOG

--
With kind regards,

William Edwards




Re: Server timeouts since HAProxy 2.2

2022-08-06 Thread Willy Tarreau
On Thu, Aug 04, 2022 at 12:14:04PM +0200, Vincent Bernat wrote:
> On 2022-08-04 10:35, William Edwards wrote:
> 
> > However,
> > https://haproxy.debian.net/#distribution=Debian=buster=2.2
> > says:
> > 
> > "The Debian HAProxy packaging team provides various versions of HAProxy
> > packages for use on different Debian or Ubuntu systems. The following
> > wizard helps you to find the package suitable for your system. [...] You
> > will get a stable release of HAProxy 2.2: you may not get the latest
> > version but important fixes from later versions are included. Moreover,
> > regressions are unlikely."
> > 
> > The bugs page tries to get users to ALWAYS use the latest version. But
> > the haproxy.debian.org page says that it's okay not to use the latest
> > version.
> 
> That's two different point of views, one from Debian, one from upstream.
> They are difficult to reconcile.

They will not be reconciled, for the simple reason that both have different
goals and difficulties:
  - a project's goal is to reduce the number of bugs, because bugs have a
    direct impact on the user experience, cause dissatisfaction, and waste
    development time spent analysing already fixed issues.

  - a distro's goal is to limit the risk of *regressions*, because a
    distro has neither the manpower nor the skills to fix issues in every
    project, and it is on the front line of bug reports. As such, distros
    prefer keeping known bugs over risking breaking something for
    existing users. Users continually experiencing issues will naturally
    try another project / distro / version.

The only way for distros to limit the number of bugs without risking
regressions currently is to ship proven stable versions. But in the
perpetually evolving world of the WWW, standards are dictated by users
(browsers, application components, etc.) and it's not always simple for
users to accept staying on an older but much more stable version.

The best solution to address the needs of users that are in between is what
you're doing, Vincent, with your packages on https://haproxy.debian.net/.
This is by far the best offer one can think of, and I confess that we're
extremely lucky as a project to benefit from this. I understand that not
all projects in a distro could have this, as it's significant extra work.
But it perfectly plugs the hole, and that's why I strongly encourage users
to switch to these packages. They remove some hassle from the distro since
upstream can handle bugs, and improve the users' experience by delivering
fixes for all known bugs.

If that would help, we could even add links to alternate repositories in
the output of "haproxy -v" so that users are more naturally invited to
switch if they feel like it better matches their needs.

Regards,
Willy



Re: Server timeouts since HAProxy 2.2

2022-08-06 Thread William Edwards

Igor Cicimov wrote on 2022-08-04 01:46:

Because of keep-alive?


Disabling keepalive on the server side using `option http-server-close` 
fixes the issue. I've yet to figure out why.




-

From: William Edwards 
Sent: Thursday, 4 August 2022, 00:26
To: haproxy@formilux.org 
Subject: Server timeouts since HAProxy 2.2


Hi,

Two days ago, I upgraded my first production system from HAProxy 1.8.19
to 2.2.9. Since then, many HTTP requests are hitting the server timeout.

Before upgrade:

 root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.5.gz | wc -l
 0
 root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.4.gz | wc -l
 0
 root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.3.gz | wc -l
 0

After upgrade:

 # Day of upgrade
 root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.2.gz | wc -l
 3798
 # Yesterday
 root@lb0-0:~# grep 'sD--' /var/log/haproxy.log.1 | wc -l
 127176
 # Today, so far
 root@lb0-0:~# grep 'sD--' /var/log/haproxy.log | wc -l
 85063
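
The same count can be taken across plain and gzip-compressed log files in one pass. The following is a rough sketch rather than part of the original report; the helper names `open_log` and `count_matches` are hypothetical, and it matches the substring `sD--` anywhere in a line, just like the `grep` calls above.

```python
import gzip


def open_log(path):
    """Open a plain or gzip-compressed log file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def count_matches(path, needle="sD--"):
    """Count the lines in a log file that contain the given substring."""
    with open_log(path) as handle:
        return sum(1 for line in handle if needle in line)


if __name__ == "__main__":
    import sys

    # Usage: python count_sd.py /var/log/haproxy.log /var/log/haproxy.log.2.gz
    for log_path in sys.argv[1:]:
        print(log_path, count_matches(log_path))
```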

For this specific request, Ta ("total active time for the HTTP request")
is 3, and Tt ("total TCP session duration time, between the moment the
proxy accepted it and the moment both ends were closed") is 34 (5
minutes, the server timeout):

 Aug  3 00:31:05 lb0-0 haproxy[16884]: $ip:62223
[03/Aug/2022:00:26:05.337] fr_other~
bk_http.lyr_http-lyr02.cf.ha.cyberfusion.cloud/http-lyr02.cf.ha.cyberfusion.cloud
0/0/0/3/34 200 27992 - - sD-- 616/602/226/226/0 0/0 "GET
https://$domain/wp-content/uploads/2022/07/20220712_155022-300x300.jpg
HTTP/2.0"
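
To cross-check such a line, the termination state can be decoded and the time between accept and logging computed from the two timestamps. The following is a sketch under assumptions, not an official parser: the regular expressions are tailored to the line above, the termination-state maps are deliberately partial, the syslog timestamp is assumed to fall in the same year as the accept date, and month names are assumed to be English. The name `parse_http_log` is hypothetical.

```python
import re
from datetime import datetime

# Partial maps of termination-state characters, limited to what this thread needs.
FIRST_CHAR = {
    "s": "server-side timeout expired",
    "c": "client-side timeout expired",
    "-": "normal completion",
}
SECOND_CHAR = {
    "D": "session was in the DATA phase",
    "H": "waiting for complete response headers",
    "-": "normal completion",
}

SYSLOG_RE = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2})")
ACCEPT_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}\.\d{3})\]")
TIMERS_RE = re.compile(r" (-?\d+(?:/-?\d+){4}) (\d{3}) \S+ \S+ \S+ (\S{4}) ")


def parse_http_log(line):
    """Extract timers, termination state and accept-to-log delay from one line."""
    syslog = " ".join(SYSLOG_RE.search(line).group(1).split())
    accepted = datetime.strptime(
        ACCEPT_RE.search(line).group(1), "%d/%b/%Y:%H:%M:%S.%f"
    )
    timers, status, term = TIMERS_RE.search(line).groups()
    # The syslog prefix carries no year; assume the year of the accept date.
    stamp = str(accepted.year) + " " + syslog
    logged = datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
    return {
        "timers": [int(t) for t in timers.split("/")],
        "status": int(status),
        "termination": term,
        "cause": FIRST_CHAR.get(term[0], "not in this partial map"),
        "phase": SECOND_CHAR.get(term[1], "not in this partial map"),
        "elapsed_seconds": (logged - accepted).total_seconds(),
    }


# The log line above, joined back into a single line.
LINE = (
    'Aug  3 00:31:05 lb0-0 haproxy[16884]: $ip:62223 '
    '[03/Aug/2022:00:26:05.337] fr_other~ '
    'bk_http.lyr_http-lyr02.cf.ha.cyberfusion.cloud/http-lyr02.cf.ha.cyberfusion.cloud '
    '0/0/0/3/34 200 27992 - - sD-- 616/602/226/226/0 0/0 '
    '"GET https://$domain/wp-content/uploads/2022/07/20220712_155022-300x300.jpg HTTP/2.0"'
)

if __name__ == "__main__":
    for key, value in parse_http_log(LINE).items():
        print(key, value)
```

For the line above, the two timestamps are about 300 seconds apart (accepted at 00:26:05.337, logged at 00:31:05), which lines up with the 5-minute server timeout, and `sD` decodes to a server-side timeout during the data phase.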

The backend server indeed served the request within Ta:

 $domain $ip - - [03/Aug/2022:00:26:05 +0200] "GET
/wp-content/uploads/2022/07/20220712_155022-300x300.jpg HTTP/1.1" 200
28008 "https://$domain/stoffen/" "Mozilla/5.0 (Windows NT 10.0; Win64;
x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0
Safari/537.36"

The timeouts only occur with 5 out of 13 backends. There is no clear
pattern, i.e. the timeouts don't come in bursts, and they aren't caused
by fixed clients.

Does anyone know why the TCP session is kept open, and the HTTP request
is not responded to by HAProxy after the backend server responded to the
HTTP request, but only after the server timeout is reached?

--
With kind regards,

William Edwards



--
With kind regards,

William Edwards




Re: Server timeouts since HAProxy 2.2

2022-08-04 Thread Vincent Bernat

On 2022-08-04 10:35, William Edwards wrote:

However, https://haproxy.debian.net/#distribution=Debian=buster=2.2 says:


"The Debian HAProxy packaging team provides various versions of HAProxy 
packages for use on different Debian or Ubuntu systems. The following 
wizard helps you to find the package suitable for your system. [...] You 
will get a stable release of HAProxy 2.2: you may not get the latest 
version but important fixes from later versions are included. Moreover, 
regressions are unlikely."


The bugs page tries to get users to ALWAYS use the latest version. But 
the haproxy.debian.net page says that it's okay not to use the latest 
version.


Those are two different points of view, one from Debian, one from upstream. 
They are difficult to reconcile. That's why you (as a user) have to 
choose: an old version with only "important" fixes (security fixes 
mostly) and with known bugs but unlikely regressions on upgrade, or a 
recent version of a stable branch with fixes and sometimes regressions.


Upstream is unlikely to help debug old versions. The Debian solution is 
to report the issue on bugs.debian.org, but this does not scale well and 
I am likely to just ignore the bug because I am too short on time. If 
2.2.9 as in official Debian repository does not work for you, the 
easiest path is to upgrade to 2.2.25 using the second set of instructions.


> I found this bug[1] on the bugs page which looks promising. I'll do
> some more investigation today. Perhaps someone could corroborate that
> that bug's symptoms match what I'm seeing.

Note that if this patch fixes this bug, this is a lot of work to 
integrate it into the current release of Debian. This will have to wait 
for the next point release (not a security issue), I would need to ask 
people to authorize the patch, explain, ask again, prepare, upload, then 
upload the backports until you get the resulting package available as 
2.2.9-2+deb11u4~bpo10+1. Backporting a random patch may trigger 
regressions as it may need other patches to be backported. This is a 
nest of problems. So, if this patch solves your issue, you are on your 
own maintaining a fork of the package.


The commit mentioned in the patch (eddcfbc1911c when backported) is 
introduced in 2.2.23, so it's likely not the patch you need or you need 
other patches as well.




Re: Server timeouts since HAProxy 2.2

2022-08-04 Thread William Edwards

Hi Christopher,

Thanks for your reply.

Christopher Faulet wrote on 2022-08-04 08:56:

On 8/3/22 at 16:23, William Edwards wrote:

Hi,

Two days ago, I upgraded my first production system from HAProxy 1.8.19
to 2.2.9. Since then, many HTTP requests are hitting the server timeout.


Before upgrade:

  root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.5.gz | wc -l
  0
  root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.4.gz | wc -l
  0
  root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.3.gz | wc -l
  0

After upgrade:

  # Day of upgrade
  root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.2.gz | wc -l
  3798
  # Yesterday
  root@lb0-0:~# grep 'sD--' /var/log/haproxy.log.1 | wc -l
  127176
  # Today, so far
  root@lb0-0:~# grep 'sD--' /var/log/haproxy.log | wc -l
  85063

For this specific request, Ta ("total active time for the HTTP request")
is 3, and Tt ("total TCP session duration time, between the moment the
proxy accepted it and the moment both ends were closed") is 34 (5
minutes, the server timeout):

  Aug  3 00:31:05 lb0-0 haproxy[16884]: $ip:62223
[03/Aug/2022:00:26:05.337] fr_other~
bk_http.lyr_http-lyr02.cf.ha.cyberfusion.cloud/http-lyr02.cf.ha.cyberfusion.cloud
0/0/0/3/34 200 27992 - - sD-- 616/602/226/226/0 0/0 "GET
https://$domain/wp-content/uploads/2022/07/20220712_155022-300x300.jpg
HTTP/2.0"

The backend server indeed served the request within Ta:

  $domain $ip - - [03/Aug/2022:00:26:05 +0200] "GET
/wp-content/uploads/2022/07/20220712_155022-300x300.jpg HTTP/1.1" 200
28008 "https://$domain/stoffen/" "Mozilla/5.0 (Windows NT 10.0; Win64;
x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0
Safari/537.36"

The timeouts only occur with 5 out of 13 backends. There is no clear
pattern, i.e. the timeouts don't come in bursts, and they aren't caused
by fixed clients.

Does anyone know why the TCP session is kept open, and the HTTP request
is not responded to by HAProxy after the backend server responded to the
HTTP request, but only after the server timeout is reached?



Hi,

Version 2.2.9 is pretty old. [...] You must update to
2.2.25 first.


The public statements regarding versioning contradict each other.

The bugs page says:

"If your version is not the last one in the maintenance branch, you are 
missing fixes for known bugs, and by not updating you are needlessly 
taking the responsibility for the risk of unexpected service outages and 
exposing your web site to possible security issues."


However, https://haproxy.debian.net/#distribution=Debian=buster=2.2 says:


"The Debian HAProxy packaging team provides various versions of HAProxy 
packages for use on different Debian or Ubuntu systems. The following 
wizard helps you to find the package suitable for your system. [...] You 
will get a stable release of HAProxy 2.2: you may not get the latest 
version but important fixes from later versions are included. Moreover, 
regressions are unlikely."


The bugs page tries to get users to ALWAYS use the latest version. But 
the haproxy.debian.net page says that it's okay not to use the latest 
version.


It is affected by 369 known bugs 
(http://www.haproxy.org/bugs/bugs-2.2.9.html).


I found this bug[1] on the bugs page which looks promising. I'll do some 
more investigation today. Perhaps someone could corroborate that that 
bug's symptoms match what I'm seeing.




Regards,


[1]: http://git.haproxy.org/?p=haproxy-2.2.git;a=commitdiff;h=3e2434e

--
With kind regards,

William Edwards




Re: Server timeouts since HAProxy 2.2

2022-08-04 Thread Christopher Faulet

On 8/3/22 at 16:23, William Edwards wrote:

Hi,

Two days ago, I upgraded my first production system from HAProxy 1.8.19
to 2.2.9. Since then, many HTTP requests are hitting the server timeout.

Before upgrade:

  root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.5.gz | wc -l
  0
  root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.4.gz | wc -l
  0
  root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.3.gz | wc -l
  0

After upgrade:

  # Day of upgrade
  root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.2.gz | wc -l
  3798
  # Yesterday
  root@lb0-0:~# grep 'sD--' /var/log/haproxy.log.1 | wc -l
  127176
  # Today, so far
  root@lb0-0:~# grep 'sD--' /var/log/haproxy.log | wc -l
  85063

For this specific request, Ta ("total active time for the HTTP request")
is 3, and Tt ("total TCP session duration time, between the moment the
proxy accepted it and the moment both ends were closed") is 34 (5
minutes, the server timeout):

  Aug  3 00:31:05 lb0-0 haproxy[16884]: $ip:62223
[03/Aug/2022:00:26:05.337] fr_other~
bk_http.lyr_http-lyr02.cf.ha.cyberfusion.cloud/http-lyr02.cf.ha.cyberfusion.cloud
0/0/0/3/34 200 27992 - - sD-- 616/602/226/226/0 0/0 "GET
https://$domain/wp-content/uploads/2022/07/20220712_155022-300x300.jpg
HTTP/2.0"

The backend server indeed served the request within Ta:

  $domain $ip - - [03/Aug/2022:00:26:05 +0200] "GET
/wp-content/uploads/2022/07/20220712_155022-300x300.jpg HTTP/1.1" 200
28008 "https://$domain/stoffen/" "Mozilla/5.0 (Windows NT 10.0; Win64;
x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0
Safari/537.36"

The timeouts only occur with 5 out of 13 backends. There is no clear
pattern, i.e. the timeouts don't come in bursts, and they aren't caused
by fixed clients.

Does anyone know why the TCP session is kept open, and the HTTP request
is not responded to by HAProxy after the backend server responded to the
HTTP request, but only after the server timeout is reached?



Hi,

Version 2.2.9 is pretty old. It is affected by 369 known bugs 
(http://www.haproxy.org/bugs/bugs-2.2.9.html). You must update to 2.2.25 first.


Regards,
--
Christopher Faulet



Re: Server timeouts since HAProxy 2.2

2022-08-03 Thread Igor Cicimov
Because of keep-alive?


From: William Edwards 
Sent: Thursday, 4 August 2022, 00:26
To: haproxy@formilux.org 
Subject: Server timeouts since HAProxy 2.2


Hi,

Two days ago, I upgraded my first production system from HAProxy 1.8.19
to 2.2.9. Since then, many HTTP requests are hitting the server timeout.

Before upgrade:

 root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.5.gz | wc -l
 0
 root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.4.gz | wc -l
 0
 root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.3.gz | wc -l
 0

After upgrade:

 # Day of upgrade
 root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.2.gz | wc -l
 3798
 # Yesterday
 root@lb0-0:~# grep 'sD--' /var/log/haproxy.log.1 | wc -l
 127176
 # Today, so far
 root@lb0-0:~# grep 'sD--' /var/log/haproxy.log | wc -l
 85063

For this specific request, Ta ("total active time for the HTTP request")
is 3, and Tt ("total TCP session duration time, between the moment the
proxy accepted it and the moment both ends were closed") is 34 (5
minutes, the server timeout):

 Aug  3 00:31:05 lb0-0 haproxy[16884]: $ip:62223
[03/Aug/2022:00:26:05.337] fr_other~
bk_http.lyr_http-lyr02.cf.ha.cyberfusion.cloud/http-lyr02.cf.ha.cyberfusion.cloud
0/0/0/3/34 200 27992 - - sD-- 616/602/226/226/0 0/0 "GET
https://$domain/wp-content/uploads/2022/07/20220712_155022-300x300.jpg
HTTP/2.0"

The backend server indeed served the request within Ta:

 $domain $ip - - [03/Aug/2022:00:26:05 +0200] "GET
/wp-content/uploads/2022/07/20220712_155022-300x300.jpg HTTP/1.1" 200
28008 "https://$domain/stoffen/" "Mozilla/5.0 (Windows NT 10.0; Win64;
x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0
Safari/537.36"

The timeouts only occur with 5 out of 13 backends. There is no clear
pattern, i.e. the timeouts don't come in bursts, and they aren't caused
by fixed clients.

Does anyone know why the TCP session is kept open, and the HTTP request
is not responded to by HAProxy after the backend server responded to the
HTTP request, but only after the server timeout is reached?

--
With kind regards,

William Edwards



