On Tue, 21 Mar 2023 at 11:39, Willy Tarreau <[email protected]> wrote:
> Just to be clear on these last few points, when you say you cannot
> connect, you mean in fact that the connection establishes to haproxy
> but your request cannot reach the server, right ? 503 will indeed
> indicate a failure to find a server or a connection that died in the
> queue.
Correct. The TCP connection from the client to haproxy is established, and
haproxy returns 503 (termination state sQ--).
> Does your stats page indicate that for the servers or the
> backend there are still connections in the queue ?
When the traffic was directed to another data center, the stats page on the
affected server showed no connections or traffic, yet HAProxy still could
not connect to the backend server (sQ--).
> A test could be useful, to force the LB algorithm to something
> deterministic (e.g. "balance source").
>
I will check whether balancing by source is possible. This frontend serves
around 8000 rps, and I'm not sure whether changing the "ratio" algorithm to
"balance source" would cause trouble (with customers behind huge NATs).
Sample log of a request that was made AFTER the traffic switch to another
data center:
{
  "_source": {
    "status_code": "503",
    "time_queue": "1001",                  <- timeout queue 1s
    "time_sess_tot": "1012",
    "ssl_cipher": "TLS_AES_256_GCM_SHA384",
    "memb_conc_conn": "0",                 <- %sc (server concurrent connections)
    "client_port": "49681",
    "time_req_headers": "11",
    "member": "-:-",                       <- member IP was not selected
    "time_data_transmission": "-1",
    "time_req_active": "1001",
    "method": "HEAD",
    "termination_state": "sQ--",           <- looks like queue is full
    "time_tcp_memb": "-1",
    "time_ssl": "11",
    "time_resp": "-1",
    "time_idle": "0",
    "member_name": "<NOSRV>",
    "bytes_read_h": "206",
    "bytes_uploaded_h": "98",
    "time_req_from_firstbyte": "0",
    "ssl_ver": "TLSv1.3",
    "http_ver": "HTTP/2.0"
  }
}
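
In case it helps to find more requests like the one above, here is a rough Python sketch of how such entries could be picked out of the logs. The field names (termination_state, member_name, memb_conc_conn) come from our custom log format shown above; the function name is made up for illustration, and the check assumes that "sQ" means the request expired in the queue, as the HAProxy documentation describes for that state.

```python
import json


def queue_timeout_with_idle_server(record: dict) -> bool:
    """Hypothetical helper (not part of HAProxy): return True when a request
    expired in the queue although no server was selected and the logged
    server connection count was zero."""
    src = record["_source"]
    state = src.get("termination_state", "")
    return (
        state.startswith("sQ")                      # expired while queued
        and src.get("member_name") == "<NOSRV>"     # no server was picked
        and int(src.get("memb_conc_conn", "-1")) == 0
    )


# Subset of the fields from the sample log entry above.
sample = json.loads(
    '{"_source": {"status_code": "503", "time_queue": "1001", '
    '"memb_conc_conn": "0", "member_name": "<NOSRV>", '
    '"termination_state": "sQ--"}}'
)

print(queue_timeout_with_idle_server(sample))
```

Counting how many records match per minute, before and after the traffic switch, should show whether the queue timeouts really continue with no load on the servers.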