Hi Maksim,

On Thu, May 23, 2019 at 10:00:19AM +0300, ?????? ????????? wrote:
> 2nd session (from haproxy to ssl-enabled backend A, dumped with tshark for
> better readability):
> 1 09:10:48.222518 HAPROXY -> BACKEND_A TCP 94 36568 -> 9790 [SYN] Seq=0
> Win=26520 Len=0 MSS=8840 SACK_PERM=1 TSval=3064071282 TSecr=0 WS=2048
> 2 09:10:48.222624 BACKEND_A -> HAPROXY TCP 94 9790 -> 36568 [SYN, ACK] Seq=0
> Ack=1 Win=26784 Len=0 MSS=8940 SACK_PERM=1 TSval=3366865490
> TSecr=3064071282 WS=256
> 3 09:10:48.222639 HAPROXY -> BACKEND_A TCP 86 36568 -> 9790 [ACK] Seq=1 Ack=1
> Win=26624 Len=0 TSval=3064071283 TSecr=3366865490
> 4 09:10:48.222658 HAPROXY -> BACKEND_A TLSv1 603 Client Hello
> 5 09:10:48.222741 BACKEND_A -> HAPROXY TCP 86 9790 -> 36568 [ACK] Seq=1
> Ack=518 Win=27904 Len=0 TSval=3366865490 TSecr=3064071283
> 6 09:10:48.272165 HAPROXY -> BACKEND_A TCP 86 36568 -> 9790 [RST, ACK]
> Seq=518 Ack=1 Win=26624 Len=0 TSval=3064071332 TSecr=3366865490
> 
> Backend didn't answer with Server Hello in 49.5ms after tcp-handshed has
> finished for some reason. That is the root case of the error!!!

Indeed, this is an interesting case. I'm not sure why it's reported like
this but it definitely is a corner case as the L4 connection is established
and the handshake was aborted. It should have been reported as "sC" (timeout
during connect). But I can easily understand how we can make some wrong
assumptions based on the available elements when reporting an error (e.g.
the connection is valid, only the handshake is incomplete). We definitely
need to figure what's happening before releasing 2.0, as it could indicate
a bigger issue in the connection setup error path. By the way I'm thinking
that for 2.1 we should probably think about reporting a separate step for
the server-side handshake, but that's another story.

> The last session (from haproxy to plain-http backend B):
> 1 09:10:48.272235 HAPROXY -> BACKEND_B TCP 94 33532 -> 9791 [SYN] Seq=0
> Win=26520 Len=0 MSS=8840 SACK_PERM=1 TSval=561683483 TSecr=0 WS=2048
> 2 09:10:48.272358 BACKEND_B -> HAPROXY TCP 94 9791 -> 33532 [SYN, ACK] Seq=0
> Ack=1 Win=26784 Len=0 MSS=8940 SACK_PERM=1 TSval=874005989 TSecr=561683483
> WS=256
> 3 09:10:48.272369 HAPROXY -> BACKEND_B TCP 86 33532 -> 9791 [ACK] Seq=1 Ack=1
> Win=26624 Len=0 TSval=561683483 TSecr=874005989
> 4 09:10:48.272396 HAPROXY -> BACKEND_B HTTP 3590 GET /xxxxxx/xxxxxx/xxx
> HTTP/1.1
> 5 09:10:48.272448 HAPROXY -> BACKEND_B TCP 86 33532 -> 9791 [FIN, ACK]
> Seq=3505 Ack=1 Win=26624 Len=0 TSval=561683483 TSecr=874005989
> 6 09:10:48.272529 BACKEND_B -> HAPROXY TCP 86 9791 -> 33532 [ACK] Seq=1
> Ack=3505 Win=33792 Len=0 TSval=874005989 TSecr=561683483
> 7 09:10:48.272729 BACKEND_B -> HAPROXY TCP 86 9791 -> 33532 [FIN, ACK] Seq=1
> Ack=3506 Win=33792 Len=0 TSval=874005989 TSecr=561683483
> 8 09:10:48.272736 HAPROXY -> BACKEND_B TCP 86 33532 -> 9791 [ACK] Seq=3506
> Ack=2 Win=26624 Len=0 TSval=561683484 TSecr=874005989
> 
> As you can see, haproxy instance made another try to establish connection
> and it did succeed but 50ms are over, and FIN was send right after
> GET-request.

This should never happen either, or you may quickly run out of source ports
by having your ports in TIME_WAIT state :-(

> Conclusion:
> * Haproxy does not respond with 502 in case of timing out on ssl-connection
> establishing to backends

So for this case since it's a timeout, it should be a 504.

> * Seems strange to me that connection timer was not reset after the first
> unsuccessfull connection ("retries 1" was set)

Indeed you're right, that might be the reason for the FIN just after
the GET.

> * SD-status of error is confusing :)

I suspect there are in fact 2 or 3 issues in the outgoing connection
code that result in all of this. This code is very complex since it
has to deal with reuse, server pools and redispatch at the same time.
We need to have a look into this. I'll wait for Olivier's availability
since he knows this area better (especially the reuse stuff that I
would break just by approaching it).

Many thanks for your detailed traces and analysis, this is very informative!

Willy

Reply via email to