Re: TCP_NODELAY in tcp mode

Willy Tarreau Sun, 30 Aug 2015 12:38:50 -0700

On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
> >> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
> >>
> >
> >
> > What do you mean? I get the following warning when trying to use this
> > option in tcp backend/frontend:
> 
> Yes I know (I didn't realize you are using tcp mode). I don't mean the
> warning is the bug, I mean the tcp mode is supposed to not cause any
> delays by default, if I'm not mistaken.


You're not mistaken, TCP_NODELAY is set unconditionally in TCP mode and
MSG_MORE is not used there since we never know if more data follows. In fact
there's only one case where it can happen: when data wraps at the end of the
buffer and we want to send both parts together.

> You are running freebsd, so splicing (Linux) can't be an issue either.
> Is strace available on your OS (afaik 64bit freebsd doesn't have strace)?
> 
> Can you try disabling kqueue [1], to see if the behavior changes? If
> not, try disabling poll as well [2]. That way haproxy falls back to
> select().
> 
> Having all syscalls (strace) and tcpdumps from the front and backend
> traffic would be helpful. Especially interesting would be if haproxy sets
> TCP_NODELAY and MSG_MORE. It should set the former, but not the
> latter.

In the past, on old Linux versions, I found that forcing TCP_NODELAY could
actually result in higher latency than desired. This is because you're not
supposed to send anything after an incomplete TCP PUSH until it's been ACKed.
I even saw this cause slowdowns on some proxies. But 1 or 2 years ago, while
I was discussing this on the HTTP WG with the Chromium developers, I couldn't
reproduce it anymore, which means that the behaviour has changed, at least on
Linux. I would not be surprised if it still exists on other OSes.

A tcpdump will definitely tell us if that's the case because we'll see
that a new segment is emitted immediately once the previous one gets ACKed.

There's nothing that can be done about this (except switching to another
stack or changing the application of course), because:
  - without TCP_NODELAY, you face Nagle and your data may wait up to 40ms
  - with TCP_NODELAY, you can be blocked as described above, waiting for the
    previous PUSH to be ACKed.
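
For reference, the two cases above come down to a single socket option. Here
is a minimal Python sketch (not haproxy code; the helper name set_nodelay is
made up for illustration) that toggles it on a plain TCP socket and reads
back what the kernel reports:

```python
import socket


def set_nodelay(sock: socket.socket, enabled: bool) -> bool:
    """Enable or disable TCP_NODELAY and return the state the kernel reports.

    Hypothetical helper for illustration only; haproxy does this in C.
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if enabled else 0)
    # Read the option back so the caller sees what was actually applied.
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("Nagle disabled:", set_nodelay(s, True))
```

Reading the option back is only a sanity check on the socket state. It says
nothing about how the stack schedules segments, which is what the tcpdump
will show.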

In practice, any application should only send a push when it has nothing
more to send and is waiting for the other side to respond, so if the
application sends many small messages, only the last one of each batch
should have the PUSH flag set. I know it's not always easy to do, especially
when you forward data that comes from an uncontrolled source :-)
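
On the application side, the simplest way to get close to this is to
coalesce a batch of small messages into one write, so the stack sees a single
send. A rough Python sketch, assuming the whole batch is available up front
(send_batch is a hypothetical name; how the data is finally cut into segments
and where PUSH is set still depends on the OS, the MSS and the window):

```python
import socket
from typing import Iterable


def send_batch(sock: socket.socket, messages: Iterable[bytes]) -> int:
    """Join a batch of small messages and hand them to the kernel in one call.

    Hypothetical helper: one sendall() instead of one send() per message, so
    the stack is not asked to push each small message separately.
    Returns the number of bytes written.
    """
    payload = b"".join(messages)
    sock.sendall(payload)
    return len(payload)
```

A caller would use it as send_batch(conn, [header, body, trailer]) instead of
three separate send() calls.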

Regards,
Willy
