Re: TCP_NODELAY in tcp mode
Hi Dmitry,

On Fri, Sep 11, 2015 at 01:58:42PM +0300, Dmitry Sivachenko wrote:
> For reference: I tracked this down to be a FreeBSD-specific problem:
> https://lists.freebsd.org/pipermail/freebsd-net/2015-September/043314.html
>
> Thanks all for your help.

Thanks for the update. What I'm seeing in your description looks very much
like the equivalent issues we used to face with softirq on Linux, so it's
possible that you're in the worst case, where work cannot be aggregated but
comes with a huge overhead. Also, maybe you have pf or something like it
eating some extra CPU. I can't be specific since I don't use FreeBSD myself,
but like Linux it's a modern and performant OS, so I think you'll come to a
solution.

I don't know if you can pin processes to CPUs, but there could be
interesting tests to run regarding how processes and interrupts are pinned.
Also, if your NIC supports multiple queues, you'll need to check how
interrupts are delivered. It's possible that you're facing a scalability
issue in the network driver or stack, maybe only in the case where too few
sockets are used or when packets get highly reordered.

Cheers,
Willy
Re: TCP_NODELAY in tcp mode
> On 8 Sep 2015, at 18:33, Willy Tarreau wrote:
>
> Hi Dmitry,
>
> On Tue, Sep 08, 2015 at 05:25:33PM +0300, Dmitry Sivachenko wrote:
>>
>>> On 30 Aug 2015, at 22:29, Willy Tarreau wrote:
>>>
>>> On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
>>>>>> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
>>>>>
>>>>> What do you mean? I get the following warning when trying to use this
>>>>> option in tcp backend/frontend:
>>>>
>>>> Yes I know (I didn't realize you are using tcp mode). I don't mean the
>>>> warning is the bug, I mean the tcp mode is supposed to not cause any
>>>> delays by default, if I'm not mistaken.
>>>
>>> You're not mistaken, tcp_nodelay is unconditional in TCP mode and MSG_MORE
>>> is not used there since we never know if more data follows. In fact there's
>>> only one case where it can happen, it's when data wrap at the end of the
>>> buffer and we want to send them together.
>>
>> Hello,
>>
>> yes, you are right, the problem is not TCP_NODELAY. I performed some
>> testing:
>>
>> Under low network load, passing TCP connection through haproxy involves
>> almost zero overhead.
>> When load grows, at some point haproxy starts to slow things down.
>>
>> In our testing scenario the application establishes long-lived TCP
>> connection to server and sends many small requests.
>> Typical traffic at which adding haproxy in the middle causes measurable
>> slowdown is ~30MB/sec, ~100kpps.
>
> This is not huge, it's smaller than what can be achieved in pure HTTP mode,
> where I could achieve about 180k req/s end-to-end, which means at least
> 180kpps in both directions on both sides, so 360kpps in each direction.

For reference: I tracked this down to be a FreeBSD-specific problem:
https://lists.freebsd.org/pipermail/freebsd-net/2015-September/043314.html

Thanks all for your help.
Re: TCP_NODELAY in tcp mode
Hi Dmitry,

On Tue, Sep 08, 2015 at 05:25:33PM +0300, Dmitry Sivachenko wrote:
> > On 30 Aug 2015, at 22:29, Willy Tarreau wrote:
> >
> > On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
> >>>> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
> >>>
> >>> What do you mean? I get the following warning when trying to use this
> >>> option in tcp backend/frontend:
> >>
> >> Yes I know (I didn't realize you are using tcp mode). I don't mean the
> >> warning is the bug, I mean the tcp mode is supposed to not cause any
> >> delays by default, if I'm not mistaken.
> >
> > You're not mistaken, tcp_nodelay is unconditional in TCP mode and MSG_MORE
> > is not used there since we never know if more data follows. In fact there's
> > only one case where it can happen, it's when data wrap at the end of the
> > buffer and we want to send them together.
>
> Hello,
>
> yes, you are right, the problem is not TCP_NODELAY. I performed some testing:
>
> Under low network load, passing TCP connection through haproxy involves
> almost zero overhead.
> When load grows, at some point haproxy starts to slow things down.
>
> In our testing scenario the application establishes long-lived TCP connection
> to server and sends many small requests.
> Typical traffic at which adding haproxy in the middle causes measurable
> slowdown is ~30MB/sec, ~100kpps.

This is not huge, it's smaller than what can be achieved in pure HTTP mode,
where I could achieve about 180k req/s end-to-end, which means at least
180kpps in both directions on both sides, so 360kpps in each direction.

> haproxy process CPU usage is about 15-20%.

And the rest is for the system?

Willy
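
The figures in this exchange can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes "MB" means 10^6 bytes, that the ~30MB/sec and ~100kpps figures describe the same traffic, and that each request and each response fits in a single packet. The function names are made up for illustration.

```python
def avg_packet_bytes(bytes_per_sec: float, packets_per_sec: float) -> float:
    """Average bytes carried per packet at the given rates."""
    return bytes_per_sec / packets_per_sec


def proxy_pps_per_direction(requests_per_sec: int, packets_per_request: int = 1) -> int:
    """Packets per second the proxy handles in one direction.

    Each request is seen twice by the proxy (client side and server side),
    and the same holds for each response in the opposite direction.
    """
    sides = 2  # client<->proxy and proxy<->server
    return requests_per_sec * packets_per_request * sides


if __name__ == "__main__":
    # Reported slowdown point: ~30 MB/s at ~100 kpps
    print(avg_packet_bytes(30e6, 100e3))      # average bytes per packet
    # HTTP-mode figure quoted in the reply: ~180k req/s end-to-end
    print(proxy_pps_per_direction(180_000))   # packets/s in each direction
```

Under those assumptions the average packet is a few hundred bytes, which fits the "many small requests" description, and the reported ~100kpps is well below the HTTP-mode packet rate quoted in the reply.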
Re: TCP_NODELAY in tcp mode
> On 30 Aug 2015, at 22:29, Willy Tarreau wrote:
>
> On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
>>>> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
>>>
>>> What do you mean? I get the following warning when trying to use this
>>> option in tcp backend/frontend:
>>
>> Yes I know (I didn't realize you are using tcp mode). I don't mean the
>> warning is the bug, I mean the tcp mode is supposed to not cause any
>> delays by default, if I'm not mistaken.
>
> You're not mistaken, tcp_nodelay is unconditional in TCP mode and MSG_MORE
> is not used there since we never know if more data follows. In fact there's
> only one case where it can happen, it's when data wrap at the end of the
> buffer and we want to send them together.

Hello,

yes, you are right, the problem is not TCP_NODELAY. I performed some testing:

Under low network load, passing a TCP connection through haproxy involves
almost zero overhead.
When load grows, at some point haproxy starts to slow things down.

In our testing scenario the application establishes a long-lived TCP
connection to the server and sends many small requests.
Typical traffic at which adding haproxy in the middle causes a measurable
slowdown is ~30MB/sec, ~100kpps.

haproxy process CPU usage is about 15-20%.
Re: TCP_NODELAY in tcp mode
On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
> >> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
> >
> > What do you mean? I get the following warning when trying to use this
> > option in tcp backend/frontend:
>
> Yes I know (I didn't realize you are using tcp mode). I don't mean the
> warning is the bug, I mean the tcp mode is supposed to not cause any
> delays by default, if I'm not mistaken.

You're not mistaken, tcp_nodelay is unconditional in TCP mode and MSG_MORE
is not used there since we never know if more data follows. In fact there's
only one case where it can happen: when data wrap at the end of the
buffer and we want to send them together.

> You are running freebsd, so splicing (Linux) can't be an issue either.
> Is strace available on your OS (afaik 64bit freebsd doesn't have strace)?
>
> Can you try disabling kqueue [1], to see if the behavior changes? If
> not, try disabling poll as well [2]. That way haproxy falls back to
> select().
>
> Having all syscalls (strace) and tcpdumps from the front and backend
> traffic would be helpful. Especially interesting would be if haproxy sets
> TCP_NODELAY and MSG_MORE. It should set the former, but not the
> latter.

In the past, on old Linux versions, I found that forcing TCP_NODELAY could
end up with a higher latency than desired. This is due to the fact that
you're not supposed to send anything after an incomplete TCP PUSH until it's
been ACKed. I even used to see this cause slowdowns on some proxies. But
1 or 2 years ago, while I was discussing this on the HTTP WG with the
Chromium developers, I couldn't reproduce it anymore, which means that the
behaviour has changed, at least on Linux. I would not be surprised if it
still exists on other OSes. A tcpdump will definitely tell us if that's the
case, because we'll see that a new segment is emitted immediately once the
previous one gets ACKed.

There's nothing that can be done about this (except switching to another
stack or changing the application of course), because:

  - without TCP_NODELAY, you face Nagle and your data may wait up to 40ms
  - with TCP_NODELAY you can be blocked here.

In practice, any application should only send a push when it has nothing
more to send and is waiting for the other side to respond, so if the
application sends many small messages, only the last one of each batch
should have the PUSH flag set. I know it's not always easy to do, especially
when you forward data that comes from an uncontrolled source :-)

Regards,
Willy
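
On the application side, the usual way to follow the advice above is to hand a whole batch of small messages to the kernel in one write instead of one write per message. The sketch below is illustrative only: `send_batch`, `recv_until_closed` and `demo` are made-up names, the demo uses a local socket pair rather than a real TCP connection, and the claim that the stack marks only the end of a single write with PUSH is an assumption about common TCP stacks that should be confirmed with tcpdump on the OS in question.

```python
import socket
from typing import Sequence


def send_batch(sock: socket.socket, messages: Sequence[bytes]) -> int:
    """Send several small messages with a single write.

    Joining the batch first means the kernel sees one buffer, so it can
    pack the data into as few segments as possible.
    """
    payload = b"".join(messages)
    sock.sendall(payload)
    return len(payload)


def recv_until_closed(sock: socket.socket) -> bytes:
    """Read from the socket until the peer closes its end."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    """Send three small requests as one batch over a local socket pair."""
    a, b = socket.socketpair()
    try:
        send_batch(a, [b"req1\n", b"req2\n", b"req3\n"])
        a.close()  # signal end of stream so the reader terminates
        return recv_until_closed(b)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

A proxy forwarding data from an uncontrolled source cannot always do this, since it does not know whether more data will follow, which is the difficulty mentioned at the end of the message.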
RE: TCP_NODELAY in tcp mode
>> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
>
> What do you mean? I get the following warning when trying to use this
> option in tcp backend/frontend:

Yes I know (I didn't realize you are using tcp mode). I don't mean the
warning is the bug, I mean the tcp mode is supposed to not cause any
delays by default, if I'm not mistaken.

You are running freebsd, so splicing (Linux) can't be an issue either.
Is strace available on your OS (afaik 64bit freebsd doesn't have strace)?

Can you try disabling kqueue [1], to see if the behavior changes? If
not, try disabling poll as well [2]. That way haproxy falls back to
select().

Having all syscalls (strace) and tcpdumps from the front and backend
traffic would be helpful. Especially interesting would be if haproxy sets
TCP_NODELAY and MSG_MORE. It should set the former, but not the
latter.

Regards,
Lukas

[1] http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#3.2-nokqueue
[2] http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#nopoll
Re: TCP_NODELAY in tcp mode
> On 28 Aug 2015, at 12:18, Lukas Tribus wrote:
>
>>> Use "option http-no-delay" [1] to disable Nagle unconditionally.
>>
>> This option requires HTTP mode, but I must use TCP mode because our
>> protocol is not HTTP (some custom protocol over TCP)
>
> Ok, you may be hitting a bug. Can you provide haproxy -vv output?

What do you mean? I get the following warning when trying to use this
option in tcp backend/frontend:

[WARNING] 239/121424 (71492) : config : 'option http-no-delay' ignored for frontend 'shard0-front' as it requires HTTP mode.
[WARNING] 239/121424 (71492) : config : 'option http-no-delay' ignored for backend 'shard0-back' as it requires HTTP mode.

So it is clear that this option is intended for HTTP mode only.

For reference:

HA-Proxy version 1.5.11 2015/01/31
Copyright 2000-2015 Willy Tarreau

Build options :
  TARGET  = freebsd
  CPU     = generic
  CC      = cc
  CFLAGS  = -O2 -pipe -O2 -fno-strict-aliasing -pipe -fstack-protector -DFREEBSD_PORTS
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_OPENSSL=1 USE_STATIC_PCRE=1 USE_PCRE_JIT=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1l-freebsd 15 Jan 2015
Running on OpenSSL version : OpenSSL 1.0.1l-freebsd 15 Jan 2015
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.35 2014-04-04
PCRE library supports JIT : yes
Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY

Available polling systems :
     kqueue : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use kqueue.
RE: TCP_NODELAY in tcp mode
>> Use "option http-no-delay" [1] to disable Nagle unconditionally.
>
> This option requires HTTP mode, but I must use TCP mode because our
> protocol is not HTTP (some custom protocol over TCP)

Ok, you may be hitting a bug. Can you provide haproxy -vv output?

Thanks,
Lukas
Re: TCP_NODELAY in tcp mode
> On 28 Aug 2015, at 12:12, Lukas Tribus wrote:
>
>> Hello,
>>
>> The flag TCP_NODELAY is unconditionally set on each TCP (ipv4/ipv6)
>> connection between haproxy and the server, and between the client and
>> haproxy.
>
> That may be true, however HAProxy uses MSG_MORE to disable and
> enable Nagle based on the individual situation.
>
> Use "option http-no-delay" [1] to disable Nagle unconditionally.

This option requires HTTP mode, but I must use TCP mode because our
protocol is not HTTP (some custom protocol over TCP)

> Regards,
>
> Lukas
>
> [1] http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4-option%20http-no-delay
RE: TCP_NODELAY in tcp mode
> Hello,
>
> The flag TCP_NODELAY is unconditionally set on each TCP (ipv4/ipv6)
> connection between haproxy and the server, and between the client and
> haproxy.

That may be true, however HAProxy uses MSG_MORE to disable and
enable Nagle based on the individual situation.

Use "option http-no-delay" [1] to disable Nagle unconditionally.

Regards,
Lukas

[1] http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4-option%20http-no-delay
Re: TCP_NODELAY in tcp mode
On Thu, 27 Aug 2015 20:34:35 +0300 Dmitry Sivachenko wrote:
> Hello,
>
> we have a client-server application which establishes a long-lived TCP
> connection and generates a lot of small request-response packets which need
> to be processed very fast.
> Setting TCP_NODELAY on sockets speeds things up about 3 times.
>
> Now I want to put a haproxy in the middle so it balances traffic between
> several servers.
>
> Something like
>
> defaults
>     mode tcp
>
> frontend shard0-front
>     bind *:9000
>     default_backend shard0-back
>
> backend shard0-back
>     server srv1 srv1:3456 check
>     server srv2 srv2:3456 check
>
> In such a configuration the application slows down significantly. I suspect
> that setting the TCP_NODELAY option on the frontend's and backend's sockets
> would help, as it did without haproxy involved. Is there any parameter which
> allows me to set the TCP_NODELAY option?

Hello,

The flag TCP_NODELAY is unconditionally set on each TCP (ipv4/ipv6)
connection between haproxy and the server, and between the client and
haproxy.

You can use "strace" to display the system calls and verify for yourself
that the TCP_NODELAY flag is set after each "accept()" and after each
"connect()".

Thierry
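
For readers unfamiliar with the pattern described above, here is a minimal sketch of what "set TCP_NODELAY after each accept() and after each connect()" looks like for a generic TCP relay. It is not HAProxy's source code; the helper names (`accept_nodelay`, `connect_nodelay`, `loopback_demo`) are hypothetical, and the demo only uses the loopback interface.

```python
import socket


def set_nodelay(sock: socket.socket) -> None:
    """Disable Nagle's algorithm on a TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


def nodelay_enabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


def accept_nodelay(listener: socket.socket) -> socket.socket:
    """Client side of a relay: accept(), then set TCP_NODELAY."""
    conn, _peer = listener.accept()
    set_nodelay(conn)
    return conn


def connect_nodelay(addr) -> socket.socket:
    """Server side of a relay: connect(), then set TCP_NODELAY."""
    sock = socket.create_connection(addr, timeout=5)
    set_nodelay(sock)
    return sock


def loopback_demo():
    """Open one connection over loopback and check the flag on both ends."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    outgoing = connect_nodelay(listener.getsockname())
    incoming = accept_nodelay(listener)
    result = (nodelay_enabled(incoming), nodelay_enabled(outgoing))
    for s in (incoming, outgoing, listener):
        s.close()
    return result


if __name__ == "__main__":
    print(loopback_demo())
```

In a syscall trace of such a program, the `setsockopt` call with `TCP_NODELAY` would appear right after each `accept()` and `connect()`, which is the check suggested above.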
TCP_NODELAY in tcp mode
Hello,

we have a client-server application which establishes a long-lived TCP
connection and generates a lot of small request-response packets which need
to be processed very fast.
Setting TCP_NODELAY on sockets speeds things up about 3 times.

Now I want to put a haproxy in the middle so it balances traffic between
several servers.

Something like

defaults
    mode tcp

frontend shard0-front
    bind *:9000
    default_backend shard0-back

backend shard0-back
    server srv1 srv1:3456 check
    server srv2 srv2:3456 check

In such a configuration the application slows down significantly. I suspect
that setting the TCP_NODELAY option on the frontend's and backend's sockets
would help, as it did without haproxy involved. Is there any parameter which
allows me to set the TCP_NODELAY option?

Thanks!