Hi Willy!

Sorry for bothering you, but do you have any news about this case?

чт, 23 мая 2019 г. в 10:35, Willy Tarreau <w...@1wt.eu>:

> Hi Maksim,
>
> On Thu, May 23, 2019 at 10:00:19AM +0300, ?????? ????????? wrote:
> > 2nd session (from haproxy to ssl-enabled backend A, dumped with tshark
> for
> > better readability):
> > 1 09:10:48.222518 HAPROXY -> BACKEND_A TCP 94 36568 -> 9790 [SYN] Seq=0
> > Win=26520 Len=0 MSS=8840 SACK_PERM=1 TSval=3064071282 TSecr=0 WS=2048
> > 2 09:10:48.222624 BACKEND_A -> HAPROXY TCP 94 9790 -> 36568 [SYN, ACK]
> Seq=0
> > Ack=1 Win=26784 Len=0 MSS=8940 SACK_PERM=1 TSval=3366865490
> > TSecr=3064071282 WS=256
> > 3 09:10:48.222639 HAPROXY -> BACKEND_A TCP 86 36568 -> 9790 [ACK] Seq=1
> Ack=1
> > Win=26624 Len=0 TSval=3064071283 TSecr=3366865490
> > 4 09:10:48.222658 HAPROXY -> BACKEND_A TLSv1 603 Client Hello
> > 5 09:10:48.222741 BACKEND_A -> HAPROXY TCP 86 9790 -> 36568 [ACK] Seq=1
> > Ack=518 Win=27904 Len=0 TSval=3366865490 TSecr=3064071283
> > 6 09:10:48.272165 HAPROXY -> BACKEND_A TCP 86 36568 -> 9790 [RST, ACK]
> > Seq=518 Ack=1 Win=26624 Len=0 TSval=3064071332 TSecr=3366865490
> >
> > Backend didn't answer with Server Hello in 49.5ms after tcp-handshed has
> > finished for some reason. That is the root case of the error!!!
>
> Indeed, this is an interesting case. I'm not sure why it's reported like
> this but it definitely is a corner case as the L4 connection is established
> and the handshake was aborted. It should have been reported as "sC"
> (timeout
> during connect). But I can easily understand how we can make some wrong
> assumptions based on the available elements when reporting an error (e.g.
> the connection is valid, only the handshake is incomplete). We definitely
> need to figure what's happening before releasing 2.0, as it could indicate
> a bigger issue in the connection setup error path. By the way I'm thinking
> that for 2.1 we should probably think about reporting a separate step for
> the server-side handshake, but that's another story.
>
> > The last session (from haproxy to plain-http backend B):
> > 1 09:10:48.272235 HAPROXY -> BACKEND_B TCP 94 33532 -> 9791 [SYN] Seq=0
> > Win=26520 Len=0 MSS=8840 SACK_PERM=1 TSval=561683483 TSecr=0 WS=2048
> > 2 09:10:48.272358 BACKEND_B -> HAPROXY TCP 94 9791 -> 33532 [SYN, ACK]
> Seq=0
> > Ack=1 Win=26784 Len=0 MSS=8940 SACK_PERM=1 TSval=874005989
> TSecr=561683483
> > WS=256
> > 3 09:10:48.272369 HAPROXY -> BACKEND_B TCP 86 33532 -> 9791 [ACK] Seq=1
> Ack=1
> > Win=26624 Len=0 TSval=561683483 TSecr=874005989
> > 4 09:10:48.272396 HAPROXY -> BACKEND_B HTTP 3590 GET /xxxxxx/xxxxxx/xxx
> > HTTP/1.1
> > 5 09:10:48.272448 HAPROXY -> BACKEND_B TCP 86 33532 -> 9791 [FIN, ACK]
> > Seq=3505 Ack=1 Win=26624 Len=0 TSval=561683483 TSecr=874005989
> > 6 09:10:48.272529 BACKEND_B -> HAPROXY TCP 86 9791 -> 33532 [ACK] Seq=1
> > Ack=3505 Win=33792 Len=0 TSval=874005989 TSecr=561683483
> > 7 09:10:48.272729 BACKEND_B -> HAPROXY TCP 86 9791 -> 33532 [FIN, ACK]
> Seq=1
> > Ack=3506 Win=33792 Len=0 TSval=874005989 TSecr=561683483
> > 8 09:10:48.272736 HAPROXY -> BACKEND_B TCP 86 33532 -> 9791 [ACK]
> Seq=3506
> > Ack=2 Win=26624 Len=0 TSval=561683484 TSecr=874005989
> >
> > As you can see, haproxy instance made another try to establish connection
> > and it did succeed but 50ms are over, and FIN was send right after
> > GET-request.
>
> This should never happen either, or you may quickly run out of source ports
> by having your ports in TIME_WAIT state :-(
>
> > Conclusion:
> > * Haproxy does not respond with 502 in case of timing out on
> ssl-connection
> > establishing to backends
>
> So for this case since it's a timeout, it should be a 504.
>
> > * Seems strange to me that connection timer was not reset after the first
> > unsuccessfull connection ("retries 1" was set)
>
> Indeed you're right, that might be the reason for the FIN just after
> the GET.
>
> > * SD-status of error is confusing :)
>
> I suspect there are in fact 2 or 3 issues in the outgoing connection
> code that result in all of this. This code is very complex since it
> has to deal with reuse, server pools and redispatch at the same time.
> We need to have a look into this. I'll wait for Olivier's availability
> since he knows this area better (especially the reuse stuff that I
> would break just by approaching it).
>
> Many thanks for your detailed traces and analysis, this is very
> informative!
>
> Willy
>

Reply via email to