Re: TCP mode and ultra short lived connection

Максим Куприянов Thu, 11 Feb 2021 02:20:49 -0800

Thank you very much, Willy!

Turning off abortonclose (it was enabled globally) for this particular
section really helped :)
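
A minimal sketch of such a change, assuming the option was enabled in the
defaults section. The section name, port, timeouts and server address are
placeholders, not taken from the actual config:

```
defaults
    mode tcp
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    option abortonclose            # enabled globally (assumed to live in defaults)

listen graphite_in                 # hypothetical section name
    bind :2003                     # assumed Graphite plaintext port
    no option abortonclose         # keep forwarding even if the client already sent its FIN
    server carbon1 192.0.2.10:2003 # placeholder backend address
```

With the option negated only in this section, other proxies still inherit
abortonclose from defaults.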


--
Best regards,
Maksim

On Tue, 9 Feb 2021 at 17:46, Willy Tarreau <w...@1wt.eu> wrote:

> Hi guys,
>
> > > I ran into a problem with an L4 (TCP mode) haproxy-based proxy in front
> > > of a Graphite component that receives metrics from clients. Some clients
> > > connect just to send one or two Graphite metrics and disconnect right
> > > after.
> > >
> > > It looks like this
> > > 1. Client connects to haproxy (SYN/SYN-ACK/ACK)
> > > 2. Client sends one line of metric
> > > 3. Haproxy acknowledges receiving this line (ACK to client)
> > > 4. Client disconnects (FIN, FIN-ACK)
> > > 5. Haproxy writes 1/-1/0/0 CC-termination state to log without even
> > >    trying to connect to a backend and send client's data to it.
> > > 6. Metric is lost :(
> > >
> > > If the client is slow enough between steps 1 and 2, or it sends a bunch
> > > of metrics so that haproxy has time to connect to a backend, everything
> > > works like a charm.
> >
> > The issue though is the client disconnect. If we delay the client
> > disconnect, it could work. Try playing with tc by delaying the
> > incoming FIN packets for a few hundred milliseconds (make sure you
> > only apply this to this particular traffic, for example this
> > particular destination port).
> >
>
> In fact it's not that black-or-white. A client disconnecting first
> in TCP is *always* a protocol design issue, because it leaves the
> source port in TIME_WAIT on the client side for 1 minute (even 4 on
> certain legacy stacks), and once all source ports are blocked like
> this, the client cannot establish new connections anymore.
>
> However, this is a situation we *normally* deal with in haproxy:
>
>   - in TCP, we're *supposed* to respect exactly this sequence, and
>     do the same on the other side since it might be the only way to
>     pass the protocol from end-to-end ; there's even a series of
>     tests for this one in the old test-fsm.cfg ;
>
>   - in HTTP, we normally pass the request as-is, and prepare for
>     closing after delivering the response (since some clients are
>     just netcat scripts).
>
> But it's well known that in HTTP, a FIN from a client after the request
> and before the response usually corresponds to the user clicking "stop"
> in a browser or closing a tab. For this reason there's an
> option "abortonclose" which is used to abort the request before passing
> it to the other side, or while it's still waiting for a connection to
> establish.
>
> It turns out that this "abortonclose" option also works for TCP and
> totally makes sense there for a number of protocols. Thus, one
> possible explanation is that this option is present in the original
> config (maybe even inherited from the defaults section), in which case
> this is the desired behavior. It would also correspond to the CC log
> output (client closed during connect).
>
> But it's also possible that we broke something again. This half-closed
> client situation was broken a few times in the past because it doesn't
> get enough love. It essentially corresponds to a denial-of-service
> attempt and rarely to a normal behavior, and is rarely tested from this
> last perspective. In addition, the idea of leaving blocked source ports
> behind doesn't sound appealing to anyone for a reg-test :-/
>
> > In TCP mode, we need to propagate the close from one side to the
> > other, as we are not aware of the protocol. Not sure if it is possible
> > (or a good idea) to keep sending buffer contents to the backend server
> > when the client is already gone.
>
> It's expected to work and is indeed not a good idea at the same time,
> because this forces haproxy to consume all of its source ports very
> quickly and makes it trivial for a client to block all of its outgoing
> communications by maintaining a load of only ~500 connections per second.
> Once this is assumed however, it must be possible (barring any bug, again).
>
> > "[no] option abortonclose" only affects HTTP, according to the docs.
>
> I'm pretty sure it's not limited to HTTP because I've met PR_O_ABRT_CLOSE
> or something like this quite a few times in the connection setup code.
> However it's very possible that the doc isn't clear about this or only
> focuses on HTTP since it's where this usually matters.
>
> Hoping this helps,
> Willy
>
