Hello,

I'm a happy user of HAProxy and so far have been able to resolve any issues I've had by reading the docs and following the answers given on this mailing list or online tutorials. But now I've run into a problem I cannot resolve myself and hope someone could help me figure out what might be wrong.

The setup I have has been running fine for many months now. HAProxy terminates TLS for HTTPS requests and forwards these requests to a couple of backends. The backend servers are HTTP/1.1 only. As long as the frontend is limited to HTTP/1.1, the special backend I use for file uploads is operating exactly as intended. After enabling HTTP/2 on the frontend, however, the file upload backends are not working as before. The requests will not be processed properly and run into timeouts.

These file upload backends in my HAProxy configuration are somewhat special. They try to "serialize" certain HTTP file upload requests made by browser clients via AJAX calls (i.e. drag-and-drop of several files at once into the browser window). These files need to be processed one after the other per client (not in parallel). So the HAProxy backend in question uses a number of servers with a "maxconn 1" setting each, which will process the first request immediately but queue subsequent HTTP requests coming in at the same time until the previous request is finished. This approach certainly is not perfect in design, but has been working for me when using a somewhat high arbitrary number of pseudo-servers to realize it, so that each client making these file upload requests will be served by one "server" exclusively. This is what the backend definition looks like:

backend upload_ananas
    option forwardfor if-none header X-Client-IP #    except 127.0.0.0/8
    stick-table type string len 32 size 10k expire 2m
    stick match req.cook(mycookie),url_dec
    stick store-request req.cook(mycookie),url_dec
    timeout server 10m
    timeout queue  20s
    balance hdr(Cookie)
default-server no-check maxconn 1 maxqueue 20 send-proxy-v2 track xy/ananas source "ipv6@[${BACKENDUSEIPV6}]"
    server-template a 32 "${ANANAS}":9876


Once I switched the HTTP frontend to use HTTP/2 (using "alpn h2,http/1.1"), this special backend is not working as expected anymore. All is fine as long as there is only one request present for a certain server at any given time. However, when there are two or more requests at the same time, i.e. as soon as the queueing mechanism is supposed to kick in, the setup is not working the way it does with a HTTP/1.1 frontend. The parallel requests aren't properly put into the queue (or taken out of the queue) in this case. From what I can see in the log file, the requests seem to be blocking one another, and nothing is happening until the timeout set by "timeout queue" is reached. At that point, 1 or 2 out of 4 requests in an example call may succeed, but the others will fail.

Clearly, this kind of setup is quite the opposite of what most people will be using. In my case, I'm deliberately trying to stuff requests into a queue, whereas normally, one would try to move requests to a server that has got slots open for processing. So I think that my use case is hitting different code paths than most other setups.

I've read in previous emails on the mailing list that the "maxconn" setting nowadays does not limit the number of TCP sessions to the backend server, but the number of parallel HTTP requests. This made we wonder if the trouble I'm seeing might have to do with the way multiplexed HTTP/2 requests are mapped to HTTP/1.1 backends. Could it be that when the backend server finishes processing the first request, this isn't generating a proper event in HAProxy's backend logic, so that the next request is not being processed when it could? Or maybe there is something special about the number of "0" free slots in the server definition in this case, once the first slot has been taken?

Trying to work around the problem, I've switched on and off quite a few settings that may influence the way processing takes place, but still haven't been able to come up with a working configuration in the HTTP/2 case. Some of the settings I tried were:
  * default-server max-reuse 0
  * http-reuse never
  * option http-server-close
  * option httpclose
  * option http-buffer-request
  * retry-on conn-failure 408 503
  * http-request wait-for-body time 15s at-least 16k if METH_POST

With HTTP/2 active, there will be log entries like this (I re-ordered them to be in order of begin of processing).

Mar 18 17:30:31 localhost haproxy[478250]: fd90:1234::21a:52738 [18/Mar/2022:17:29:51.411] Loadbalancer~ zyx_ananas/a24 12/0/1/40112/40125 200 1217 mycookie=3BRoK6tyijmqndBJRzLyT9Lq7dsiPmeT - ---- 356/356/1/0/0 0/0 {serialize.example.com} "POST https://serialize.example.com/services/ajax.php/file/upload HTTP/2.0" Mar 18 17:30:11 localhost haproxy[478250]: fd90:1234::21a:52738 [18/Mar/2022:17:29:51.436] Loadbalancer~ zyx_ananas/a24 26/20015/-1/-1/20041 503 9228 mycookie=3BRoK6tyijmqndBJRzLyT9Lq7dsiPmeT - sQ-- 326/326/1/1/0 1/0 {serialize.example.com} "POST https://serialize.example.com/services/ajax.php/file/upload HTTP/2.0" Mar 18 17:30:31 localhost haproxy[478250]: fd90:1234::21a:52738 [18/Mar/2022:17:30:11.476] Loadbalancer~ zyx_ananas/a24 0/20002/-1/-1/20002 503 9228 mycookie=3BRoK6tyijmqndBJRzLyT9Lq7dsiPmeT - sQ-- 355/355/1/1/0 1/0 {serialize.example.com} "POST https://serialize.example.com/services/ajax.php/file/upload HTTP/2.0" Mar 18 17:30:26 localhost haproxy[478250]: fd90:1234::21a:52738 [18/Mar/2022:17:30:11.477] Loadbalancer~ zyx_ananas/<NOSRV> -1/-1/-1/-1/15001 408 0 - - cD-- 356/356/2/0/3 0/0 {serialize.example.com} "POST https://serialize.example.com/services/ajax.php/file/upload HTTP/2.0" Mar 18 17:30:31 localhost haproxy[478250]: fd90:1234::21a:52738 [18/Mar/2022:17:30:31.478] Loadbalancer~ zyx_ananas/a24 0/58/1/44/103 200 1218 mycookie=3BRoK6tyijmqndBJRzLyT9Lq7dsiPmeT - ---- 356/356/0/0/0 1/0 {serialize.example.com} "POST https://serialize.example.com/services/ajax.php/file/upload HTTP/2.0"

Note how the log line with code 408 does not contain the captured session cookie, like the other lines do. That request is most likely hitting the "wait-for-body" 15 second limit. The 0 byte answer on that line is due to a configuration line reading "errorfile 408 /dev/null" (as suggested by <https://www.haproxy.com/blog/haproxy-and-http-errors-408-in-chrome/>).

Also note how most of the request durations is aligned to certain round numbers (15 seconds, 20 seconds, 40 seconds) that represent configuration timeout settings.

On a successful upload, that is when the server's frontend is limited to HTTP/1.1, the log entries look like this, showing how queueing takes place correctly:

Mar 18 20:41:12 localhost haproxy[488918]: fd90:1234::21a:56502 [18/Mar/2022:20:41:09.253] Loadbalancer~ zyx_ananas/a24 0/0/1/2868/2869 200 1214 mycookie=39Lq7zLRmBoK6tyTjyidsnmedBiPqTJR - ---- 81/81/3/0/0 0/0 {deppmail.uni-koeln.de} "POST /services/ajax.php/file/upload HTTP/1.1" Mar 18 20:41:14 localhost haproxy[488918]: fd90:1234::21a:56503 [18/Mar/2022:20:41:09.291] Loadbalancer~ zyx_ananas/a24 0/2831/0/2119/4950 200 1215 mycookie=39Lq7zLRmBoK6tyTjyidsnmedBiPqTJR - ---- 85/85/2/0/0 1/0 {deppmail.uni-koeln.de} "POST /services/ajax.php/file/upload HTTP/1.1" Mar 18 20:41:16 localhost haproxy[488918]: fd90:1234::21a:56498 [18/Mar/2022:20:41:09.805] Loadbalancer~ zyx_ananas/a24 0/4435/1/2475/6911 200 1216 mycookie=39Lq7zLRmBoK6tyTjyidsnmedBiPqTJR - ---- 79/79/1/1/0 2/0 {deppmail.uni-koeln.de} "POST /services/ajax.php/file/upload HTTP/1.1" Mar 18 20:41:19 localhost haproxy[488918]: fd90:1234::21a:56501 [18/Mar/2022:20:41:09.807] Loadbalancer~ zyx_ananas/a24 0/6910/0/2581/9491 200 1216 mycookie=39Lq7zLRmBoK6tyTjyidsnmedBiPqTJR - ---- 76/76/0/0/0 3/0 {deppmail.uni-koeln.de} "POST /services/ajax.php/file/upload HTTP/1.1"

Note how the client's port numbers are different on each request in the HTTP/1.1 example, showing that a TLS handshake took place for each upload request, resulting in a few milliseconds delay. This is in contrast to the HTTP/2 example, where only one HTTP client connection is used (as should be).

So, does anyone have an idea what could be wrong in the HTTP/2 case and how I could fix it?

I'm running the latest 2.4 version of HAProxy, 2.4.15, but would be willing to try a newer version if that helps to rule out certain bugs fixed recently. Of course I can also provide further configuration information if needed. I didn't want to include all of it right here, as this email has grown rather lengthy already, as I tried to explain all the subtleties of the HAProxy setup in use.

Thanks,
Jens

Reply via email to