Keepalive to backend is always 10 seconds

2022-01-07 Thread Rui Santos
Hello,

This is my 1st cry for help on HAProxy here. If this is not the
correct place, please be so kind as to redirect me to the proper one.

I'm new to HAProxy, and I'm trying to set up HAProxy 2.5.0 to act as
an SSL terminator for a single backend, for my initial testing.

The connection itself, as well as the communication from the client
all the way to the backend, is operational. The issue I'm having is
with http keepalive. The description is as follows:
- http keepalive is operating properly between client and HAProxy,
respecting all parameters provided/defined on both client and HAProxy
- http keepalive is also operating as expected between HAProxy and the
backend server, as long as there's less than 10 seconds between
transfers from the client

If more than 10 seconds pass without any communication being received
from the client side, this is what happens at second T+10:
- There is no extra data produced by the client, nothing on tcpdump
- Between HAProxy and the backend, this happens:
17:22:54.058275 IP 10.116.0.96.443 > 10.116.0.65.36096: Flags [P.], seq 3889:3920, ack 832, win 505, options [nop,nop,TS val 2833024558 ecr 864209762], length 31
17:22:54.058330 IP 10.116.0.65.36096 > 10.116.0.96.443: Flags [.], ack 3920, win 501, options [nop,nop,TS val 864219772 ecr 2833024558], length 0
17:22:54.058366 IP 10.116.0.96.443 > 10.116.0.65.36096: Flags [F.], seq 3920, ack 832, win 505, options [nop,nop,TS val 2833024558 ecr 864219772], length 0
17:22:54.058500 IP 10.116.0.65.36096 > 10.116.0.96.443: Flags [P.], seq 832:863, ack 3921, win 501, options [nop,nop,TS val 864219772 ecr 2833024558], length 31
17:22:54.058516 IP 10.116.0.96.443 > 10.116.0.65.36096: Flags [R], seq 443460723, win 0, length 0
which culminates in the session being closed. This capture was taken
on the backend server.
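
To make the close sequence in the capture above easier to follow, here
is a minimal Python sketch that pulls the source, destination and TCP
flags out of tcpdump lines. It assumes tcpdump's default text output
with each packet on a single line; the regular expression and the
function names are illustrative only, and the sample lines are
abbreviated (the TCP options are left out).

```python
import re

# Illustrative pattern for tcpdump's default one-line TCP output. This is an
# assumption about the format, not an official grammar:
#   "<time> IP <src> > <dst>: Flags [<flags>], ..."
PACKET_RE = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]+)\]"
)


def summarize(lines):
    """Return a (src, dst, flags) tuple for every line that looks like a TCP packet."""
    packets = []
    for line in lines:
        match = PACKET_RE.match(line.strip())
        if match:
            packets.append((match["src"], match["dst"], match["flags"]))
    return packets


def first_closer(packets):
    """Return the source endpoint of the first packet carrying FIN (F) or RST (R)."""
    for src, _dst, flags in packets:
        if "F" in flags or "R" in flags:
            return src
    return None


# The five packets from the capture, with the TCP options removed for brevity.
capture = [
    "17:22:54.058275 IP 10.116.0.96.443 > 10.116.0.65.36096: Flags [P.], seq 3889:3920, ack 832, win 505, length 31",
    "17:22:54.058330 IP 10.116.0.65.36096 > 10.116.0.96.443: Flags [.], ack 3920, win 501, length 0",
    "17:22:54.058366 IP 10.116.0.96.443 > 10.116.0.65.36096: Flags [F.], seq 3920, ack 832, win 505, length 0",
    "17:22:54.058500 IP 10.116.0.65.36096 > 10.116.0.96.443: Flags [P.], seq 832:863, ack 3921, win 501, length 31",
    "17:22:54.058516 IP 10.116.0.96.443 > 10.116.0.65.36096: Flags [R], seq 443460723, win 0, length 0",
]

packets = summarize(capture)
for src, dst, flags in packets:
    print(f"{src} -> {dst} [{flags}]")
print("first to close:", first_closer(packets))
```

On these lines the first FIN comes from 10.116.0.96.443, the port-443
side of the connection, and 10.116.0.65 only answers. If that side is
the backend container, the 10-second limit may be worth looking for in
the backend's own keep-alive settings; that reading is an inference
from the capture, not something confirmed in this thread.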

This always happens after 10 seconds, which led me to believe it's a
timeout on the HAProxy side, but I was unable to find any parameter to
adjust it in the documentation.

As an extra note, if I communicate from the client directly to the
backend, http keepalive also works as it should.

Here is the HAProxy configuration file:
global
  log /dev/log local0 info

defaults
  mode tcp
  timeout connect 500s
  timeout client 500s
  timeout server 500s
  maxconn 30
  timeout http-request 500s
  timeout http-keep-alive 500s

frontend all-in
  mode http
  bind 1.1.1.1:443 ssl crt /etc/haproxy/ssl_certs/somedomain.pem
  tcp-request inspect-delay 5s
  use_backend somedomain if { ssl_fc_sni_end .somedomain.com }
  option forwardfor
  timeout client 2147483647

backend somedomain
  mode http
  balance source
  hash-type consistent
  http-reuse always
  server somedomain 1.1.1.1:8443 ssl verify none alpn http/1.1
  no option http-server-close
  no option httpclose
  option forwardfor
  timeout http-request 500s
  timeout http-keep-alive 500s
  timeout server 2147483647

The public IPs were hidden for privacy.
The setup is one physical host that holds HAProxy, and the backend is
on a docker container, on the same host.
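
A side note on units in the configuration above: HAProxy reads a
timeout written without a unit suffix as milliseconds, so the two
"2147483647" values are not seconds. The short Python sketch below
converts them; the helper name is made up for illustration.

```python
MS_PER_SECOND = 1000
SECONDS_PER_DAY = 86_400


def timeout_ms_to_days(value_ms):
    """Convert a timeout expressed in milliseconds to days."""
    return value_ms / MS_PER_SECOND / SECONDS_PER_DAY


# "timeout client 2147483647" and "timeout server 2147483647" carry no suffix,
# so they are taken as milliseconds.
unitless = 2_147_483_647
print(f"{timeout_ms_to_days(unitless):.2f} days")

# "500s" is an explicit 500 seconds, i.e. 500000 ms.
explicit = 500 * MS_PER_SECOND
print(f"{explicit / MS_PER_SECOND:.0f} seconds")
```

Both values (about 24.9 days and 500 seconds) are far above the
10 seconds observed, which suggests that none of the timeouts set in
this file is the one that fires, though that is only an inference.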

Also some HAProxy information:
# haproxy -vv
HAProxy version 2.5.0 2021/11/23 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2023.
Known bugs: http://www.haproxy.org/bugs/bugs-2.5.0.html
Running on: Linux 5.9.11-1.el7.elrepo.x86_64 #1 SMP Tue Nov 24
09:45:34 EST 2020 x86_64
Build options :
  TARGET  = linux-glibc
  CPU = generic
  CC  = cc
  CFLAGS  = -O2 -g -Wall -Wextra -Wundef -Wdeclaration-after-statement
-fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter
-Wno-clobbered -Wno-missing-field-initializers -Wtype-limits
-Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond
-Wnull-dereference
  OPTIONS = USE_PCRE2_JIT=1 USE_THREAD=1 USE_LIBCRYPT=1 USE_OPENSSL=1
USE_LUA=1 USE_SLZ=1 USE_SYSTEMD=1
  DEBUG   =

Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT -PCRE2
+PCRE2_JIT +POLL +THREAD +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY
+LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H +GETADDRINFO +OPENSSL
+LUA +ACCEPT4 -CLOSEFROM -ZLIB +SLZ +CPU_AFFINITY +TFO +NS +DL +RT
-DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL
-PROCCTL +THREAD_DUMP -EVPORTS -OT -QUIC -PROMEX -MEMORY_PROFILING

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=48).
Built with OpenSSL version : OpenSSL 1.1.1k  FIPS 25 Mar 2021
Running on OpenSSL version : OpenSSL 1.1.1k  FIPS 25 Mar 2021
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.4.3
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"),
deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT
IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 

[ANNOUNCE] haproxy-2.4.11

2022-01-07 Thread Willy Tarreau
Hi,

HAProxy 2.4.11 was released on 2022/01/07. It added 20 new commits
after version 2.4.10.

This version addresses a small number of issues that were not merged into
2.4.10, either because they were not strictly required and still uncertain
at the time, or because they were fixed afterwards.

 - there was a possible slow memory leak of struct sockaddr during
   layer-7 retries that would end up with a redispatch. We're speaking
   about ~200 bytes per retried request, which normally doesn't harm,
   but can at least fool some monitoring and cause some concerns.

 - there was a risk of frozen stream or spinning loop when combining
   layer-7 retries with some filters because an analyser responsible
   for releasing the filter was dropped. This was fixed.

 - there was an allocation problem when SSL was configured using a
   "default-server" directive. Some SSL settings like "crt" or
   possibly "ca" as well were causing an SSL_CTX to be allocated too
   early (at the moment the directive was parsed) and replicated for
   each server inheriting it. But that led to problems when these
   fields were updated at runtime for a given server as that could
   affect other servers as well. And during soft-stop it would cause
   double-free issues as reported in github issue 1488.

 - William found that a number of free() were missing for server SSL
   settings when deleting a server. That's not dramatic but it could
   definitely be noticeable by those adding/removing servers often.

 - splicing of HTTP/1.1 responses would always incorrectly end up in
   closing the client connection at the end of the transfer, and was simply
   disabled for messages of unknown lengths (neither content-length nor
   transfer-encoding). This was fixed.

 - since 2.4 during a soft-stop we're closing all idle frontend connections
   so that we don't have to wait for clients to time out nor for them to
   send a new request. But it turns out that doing it as any server would
   do it disturbs AWS' ALB, which immediately emits a 502 to their clients
   after failing to upload a new request on such a closed connection. It's
   well known (and documented) that reused connections have a window of
   uncertainty and that an agent must retry on them (which is why haproxy
   usually silently closes with the client when it experiences this so
   that the client can decide to retry). Thus ALB's behavior is incorrect
   and prevents keep-alive from being used normally with the next hop. What
   was done here was to add an option "idle-close-on-response" to reintroduce
   the old behavior and wait for clients to speak first before closing.
   Credits go to William Dauchy for the report and the workaround.

 - eliminate a rare risk of deadlock when built with DEBUG_UAF. It
   would only affect developers chasing some use-after-free bugs,
   but better fix it anyway.

 - on reload we used to transfer listening sockets by packs of 253 between
   the old and the new process but it looks like for whatever reason on
   musl 253 doesn't work and the limit is 252. It might be caused by a
   slightly different layout for the message. So the limit was lowered by
   one as this will definitely not affect reload time!

 - Daniel Jakots fixed the build with LibreSSL 3.5 and newer (some macros
   didn't work anymore).

 - David Carlier fixed the build with FreeBSD 14, which changes the cpuset
   API to better match Linux's.

 - another build issue, this time with clang on i386. It tries to make
   use of the CMPXCHG8B instruction to perform 64-bit atomics but
   incorrectly expects the operands to be 64-bit aligned while neither
   the ABI nor the instruction has this requirement. So basically it
   complains about the code it produces itself. The analysis showed that
   working around this would require tens to hundreds of isolated hacks
   and that the least dirty solution is to disable the warning. Firefox
   faced the same issue 3 years ago and adopted the same workaround. I
   guess nobody is interested enough in i386 anymore for anyone to expect
   a fix there anyway.

 - fixed some usual "maybe unused" warnings on old compilers for
   unusual platforms (gcc-4.7 on MIPS with threads disabled).

 - a small improvement: in order to help users provide exploitable cores,
   there's now a new command-line option "-dL" which dumps the dynamic
   libraries that were detected at run time just before forking. This
   possibly includes dependencies from Lua or various other libs that
   do not always appear in "ldd". Typically libgcc_s is listed. The
   output format allows piping it to tar to produce an archive of
   all executable code, which apparently tends to open well with a core
   regardless of the distro in use. Since it eases bug reports, we've
   decided to backport it.
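
As a rough illustration of the "-dL" item above, the Python sketch below
packs a list of library paths into an archive that could accompany a
core. It assumes the listing is one path per line (which is what piping
to tar suggests, but is not spelled out here); the function name, the
archive name and the placeholder file are all made up for the example.

```python
import os
import tarfile
import tempfile


def archive_libraries(listing, archive_path):
    """Pack every existing file named in `listing` (one path per line) into a .tar.gz.

    Symlinks are followed (dereference=True) so that the archive holds the
    real library files rather than links to them.
    """
    paths = [line.strip() for line in listing.splitlines() if line.strip()]
    added = []
    with tarfile.open(archive_path, "w:gz", dereference=True) as tar:
        for path in paths:
            if os.path.isfile(path):  # silently skip entries that are not files
                tar.add(path)
                added.append(path)
    return added


# Demo with a placeholder file standing in for a real shared library.
with tempfile.TemporaryDirectory() as tmp:
    fake_lib = os.path.join(tmp, "libexample.so.1")
    with open(fake_lib, "wb") as handle:
        handle.write(b"placeholder, not a real library")

    # In real use this text would come from the output of "haproxy -dL ...".
    listing = fake_lib + "\n" + os.path.join(tmp, "missing.so") + "\n"
    archive = os.path.join(tmp, "libs.tar.gz")
    print(archive_libraries(listing, archive))
```

Keeping the libraries next to the core is what lets a debugger resolve
symbols on a machine with a different set of installed packages.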

There's still one thing currently being discussed in issue 1498: there is
an incompatibility between the nghttp client and a few HTTP/2 servers
among which haproxy when the HPACK headers table is set by the 

[PATCH] CI: cleanup default step condition

2022-01-07 Thread Илья Шипицин
Hello,

This is a cleanup patch that removes the default (unneeded) step condition.
Behavior is not changed.

thanks,
Ilya
From edbcc5312efa468f028ea8d97cbe1393aafdfcd7 Mon Sep 17 00:00:00 2001
From: Ilya Shipitsin 
Date: Fri, 7 Jan 2022 20:09:35 +0500
Subject: [PATCH] CI: github actions: clean default step conditions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

step condition "if: ${{ !failure() }}" was added in 2ef4c7c84363f5a9b80a2093df1370514319db28
during my experiments. As Tim Düsterhus mentioned, that condition is default and may be omitted.
---
 .github/workflows/vtest.yml | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/.github/workflows/vtest.yml b/.github/workflows/vtest.yml
index 121c37d4e..421384b46 100644
--- a/.github/workflows/vtest.yml
+++ b/.github/workflows/vtest.yml
@@ -87,7 +87,6 @@ jobs:
   ADDLIB="-Wl,-rpath,/usr/local/lib/ -Wl,-rpath,$HOME/opt/lib/"
 sudo make install
 - name: Show HAProxy version
-  if: ${{ !failure() }}
   id: show-version
   run: |
 echo "::group::Show dynamic libraries."
@@ -102,11 +101,9 @@ jobs:
 haproxy -vv
 echo "::set-output name=version::$(haproxy -v |awk 'NR==1{print $3}')"
 - name: Install problem matcher for VTest
-  if: ${{ !failure() }}
   # This allows one to more easily see which tests fail.
   run: echo "::add-matcher::.github/vtest.json"
 - name: Run VTest for HAProxy ${{ steps.show-version.outputs.version }}
-  if: ${{ !failure() }}
   id: vtest
   run: |
 # This is required for macOS which does not actually allow to increase
-- 
2.33.1



Re: [PATCH] CI: github actions: do not try to show vtest results if vtest was not run

2022-01-07 Thread Willy Tarreau
On Fri, Jan 07, 2022 at 02:45:17PM +0300, Илья Шипицин wrote:
> I've been stuck for a week on "github caching". I created a bad cache and
> have to wait a week until it expires (there is no purge option).

Ah the fun of caches :-)

> So I'll send the "cleanup" part first. I tested Tim's suggestion, and it
> works as designed.

OK! I merged your 30th round of typo fixes BTW.

Thanks,
Willy



Re: [PATCH] CI: github actions: do not try to show vtest results if vtest was not run

2022-01-07 Thread Илья Шипицин
I've been stuck for a week on "github caching". I created a bad cache and
have to wait a week until it expires (there is no purge option).

So I'll send the "cleanup" part first. I tested Tim's suggestion, and it
works as designed.

On Sat, Dec 25, 2021, 6:32 PM Илья Шипицин  wrote:

>
>
> On Sat, Dec 25, 2021, 5:09 PM Willy Tarreau  wrote:
>
>> On Sat, Dec 25, 2021 at 06:40:57PM +0500, Илья Шипицин wrote:
>> > Let's merge as is.
>> >
>> > I'll test changes later. Anyway, I've figured out how to enable cache
>> and
>> > there will be patches later
>>
>> OK that works, now merged.
>>
>> Have a nice week-end!
>>
>
> Merry Christmas!
>
>> Willy
>>
>


[PATCH] refactor CI spell check, fix 2 spelling typos

2022-01-07 Thread Илья Шипицин
Hello,

Another spelling check cleanup.

Ilya
From 7f8ecfac2319fa8ebf0518796b5c7493a681fd6e Mon Sep 17 00:00:00 2001
From: Ilya Shipitsin 
Date: Fri, 7 Jan 2022 14:46:15 +0500
Subject: [PATCH 2/2] CLEANUP: assorted typo fixes in the code and comments

This is 30th iteration of typo fixes
---
 reg-tests/ssl/dynamic_server_ssl.vtc | 2 +-
 src/xprt_quic.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/reg-tests/ssl/dynamic_server_ssl.vtc b/reg-tests/ssl/dynamic_server_ssl.vtc
index 84854a38b..0e6ecb5ab 100644
--- a/reg-tests/ssl/dynamic_server_ssl.vtc
+++ b/reg-tests/ssl/dynamic_server_ssl.vtc
@@ -1,5 +1,5 @@
 #REGTEST_TYPE=bug
-# Test if a certicate can be dynamically updated once a server which used it
+# Test if a certificate can be dynamically updated once a server which used it
 # was removed.
 #
 varnishtest "Delete server via cli and update certificates"
diff --git a/src/xprt_quic.c b/src/xprt_quic.c
index 93a24f089..4a0085f3b 100644
--- a/src/xprt_quic.c
+++ b/src/xprt_quic.c
@@ -3276,7 +3276,7 @@ static inline void quic_conn_take(struct quic_conn *qc)
 }
 
 /* Decrement the  refcount. If the refcount is zero *BEFORE* the
- * substraction, the quic_conn is freed.
+ * subtraction, the quic_conn is freed.
  */
 static void quic_conn_drop(struct quic_conn *qc)
 {
-- 
2.33.1

From 36180474c340682bfdce414757a2184902dd039b Mon Sep 17 00:00:00 2001
From: Ilya Shipitsin 
Date: Fri, 7 Jan 2022 14:42:54 +0500
Subject: [PATCH 1/2] CI: refactor spelling check

let us switch to codespell github actions instead of invocation from cmdline.
also, "ifset,thrid,strack,ba,chck,hel,unx,mor" added to whitelist, those are
variable names and special terms widely used in HAProxy
---
 .github/workflows/codespell.yml | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/.github/workflows/codespell.yml b/.github/workflows/codespell.yml
index 955560a0a..3b3114135 100644
--- a/.github/workflows/codespell.yml
+++ b/.github/workflows/codespell.yml
@@ -12,12 +12,8 @@ jobs:
 runs-on: ubuntu-latest
 steps:
 - uses: actions/checkout@v2
-- name: Install codespell
-  run: sudo pip install codespell
-- name: Run codespell
-  run: |
-codespell \
-  -c \
-  -q 2 \
-  --ignore-words-list ist,ists,hist,wan,ca,cas,que,ans,te,nd,referer,ot,uint,iif,fo,keep-alives,dosen \
-  --skip="CHANGELOG,Makefile,*.fig,*.pem"
+- uses: codespell-project/codespell-problem-matcher@v1
+- uses: codespell-project/actions-codespell@master
+  with:
+skip: CHANGELOG,Makefile,*.fig,*.pem
+ignore_words_list: ist,ists,hist,wan,ca,cas,que,ans,te,nd,referer,ot,uint,iif,fo,keep-alives,dosen,ifset,thrid,strack,ba,chck,hel,unx,mor
-- 
2.33.1
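
The whitelist change described in patch 1/2 above can be checked
mechanically. The Python sketch below compares the removed
--ignore-words-list value with the new ignore_words_list value from
the diff; the helper names are illustrative only.

```python
# Values copied from the removed and added lines of the codespell.yml diff.
OLD_LIST = "ist,ists,hist,wan,ca,cas,que,ans,te,nd,referer,ot,uint,iif,fo,keep-alives,dosen"
NEW_LIST = (
    "ist,ists,hist,wan,ca,cas,que,ans,te,nd,referer,ot,uint,iif,fo,keep-alives,dosen,"
    "ifset,thrid,strack,ba,chck,hel,unx,mor"
)


def added_words(old, new):
    """Words present in `new` but not in `old`, in their original order."""
    old_set = set(old.split(","))
    return [word for word in new.split(",") if word not in old_set]


def removed_words(old, new):
    """Words present in `old` but not in `new`, in their original order."""
    new_set = set(new.split(","))
    return [word for word in old.split(",") if word not in new_set]


print("added:", ",".join(added_words(OLD_LIST, NEW_LIST)))
print("removed:", ",".join(removed_words(OLD_LIST, NEW_LIST)))
```

The added words match the eight listed in the commit message, and
nothing from the previous list is dropped.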