Re: TCP_NODELAY in tcp mode

2015-09-11 Thread Willy Tarreau
Hi Dmitry,

On Fri, Sep 11, 2015 at 01:58:42PM +0300, Dmitry Sivachenko wrote:
> For reference: I tracked this down to a FreeBSD-specific problem:
> https://lists.freebsd.org/pipermail/freebsd-net/2015-September/043314.html
> 
> Thanks all for your help.

Thanks for the update. What I'm seeing in your description looks
very much like equivalent issues we used to face with softirq on
Linux, so it's possible that you're in the worst case where work
cannot be aggregated but comes with a huge overhead. Also maybe
you have pf or something like this eating some extra CPU. I can't
be specific, I don't use FreeBSD myself, but like Linux it's a
modern and performant OS so I think you'll come to a solution.

I don't know if you can pin processes to CPUs but there could be
interesting tests to run regarding how processes and interrupts
are pinned.

Also if your NIC supports multiple queues, you'll need to check
how interrupts are delivered. It is possible that you're facing
a scalability issue in the network driver or stack, maybe only
in the case where too few sockets are used or when packets get
highly reordered.

Cheers,
Willy




Re: TCP_NODELAY in tcp mode

2015-09-11 Thread Dmitry Sivachenko

> On 8 Sep 2015, at 18:33, Willy Tarreau wrote:
> 
> Hi Dmitry,
> 
> On Tue, Sep 08, 2015 at 05:25:33PM +0300, Dmitry Sivachenko wrote:
>> 
>>> On 30 Aug 2015, at 22:29, Willy Tarreau wrote:
>>> 
>>> On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
>>>>>> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
>>>>>> 
>>>>> 
>>>>> 
>>>>> What do you mean? I get the following warning when trying to use this
>>>>> option in tcp backend/frontend:
>>>> 
>>>> Yes I know (I didn't realize you are using tcp mode). I don't mean the
>>>> warning is the bug, I mean the tcp mode is supposed to not cause any
>>>> delays by default, if I'm not mistaken.
>>> 
>>> You're not mistaken, tcp_nodelay is unconditional in TCP mode and MSG_MORE
>>> is not used there since we never know if more data follows. In fact there's
>>> only one case where it can happen, it's when data wrap at the end of the
>>> buffer and we want to send them together.
>>> 
>> 
>> 
>> Hello,
>> 
>> yes, you are right, the problem is not TCP_NODELAY.  I performed some 
>> testing:
>> 
>> Under low network load, passing TCP connection through haproxy involves 
>> almost zero overhead.
>> When load grows, at some point haproxy starts to slow things down.
>> 
>> In our testing scenario the application establishes long-lived TCP 
>> connection to server and sends many small requests.
>> Typical traffic at which adding haproxy in the middle causes measurable 
>> slowdown is ~30MB/sec, ~100kpps.
> 
> This is not huge, it's smaller than what can be achieved in pure HTTP mode,
> where I could achieve about 180k req/s end-to-end, which means at least 180kpps
> in both directions on both sides, so 360kpps in each direction.
> 


For reference: I tracked this down to a FreeBSD-specific problem:
https://lists.freebsd.org/pipermail/freebsd-net/2015-September/043314.html

Thanks all for your help.




Re: TCP_NODELAY in tcp mode

2015-09-08 Thread Willy Tarreau
Hi Dmitry,

On Tue, Sep 08, 2015 at 05:25:33PM +0300, Dmitry Sivachenko wrote:
> 
> > On 30 Aug 2015, at 22:29, Willy Tarreau wrote:
> > 
> > On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
> >>>> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
> >>>> 
> >>> 
> >>> 
> >>> What do you mean? I get the following warning when trying to use this
> >>> option in tcp backend/frontend:
> >> 
> >> Yes I know (I didn't realize you are using tcp mode). I don't mean the
> >> warning is the bug, I mean the tcp mode is supposed to not cause any
> >> delays by default, if I'm not mistaken.
> > 
> > You're not mistaken, tcp_nodelay is unconditional in TCP mode and MSG_MORE
> > is not used there since we never know if more data follows. In fact there's
> > only one case where it can happen, it's when data wrap at the end of the
> > buffer and we want to send them together.
> > 
> 
> 
> Hello,
> 
> yes, you are right, the problem is not TCP_NODELAY.  I performed some testing:
> 
> Under low network load, passing TCP connection through haproxy involves 
> almost zero overhead.
> When load grows, at some point haproxy starts to slow things down.
> 
> In our testing scenario the application establishes long-lived TCP connection 
> to server and sends many small requests.
> Typical traffic at which adding haproxy in the middle causes measurable 
> slowdown is ~30MB/sec, ~100kpps.

This is not huge, it's smaller than what can be achieved in pure HTTP mode,
where I could achieve about 180k req/s end-to-end, which means at least 180kpps
in both directions on both sides, so 360kpps in each direction.
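The packet-rate arithmetic above can be sketched as follows. This is only a back-of-the-envelope model, assuming one packet per request and one per response and ignoring pure ACKs (hence "at least"); the function names are made up for illustration. The second helper divides the ~30MB/sec by the ~100kpps quoted earlier to get a rough average packet size, assuming MB means 10^6 bytes and that both figures count the same packets.

```python
def proxy_packet_rates(requests_per_second, packets_per_message=1):
    """Return (received_pps, sent_pps) as seen by a proxy in the middle.

    Assumption: each request and each response fits in
    `packets_per_message` packets; pure ACKs are not counted.
    """
    per_hop = requests_per_second * packets_per_message
    # The proxy receives requests from the client and responses from the server.
    received = 2 * per_hop
    # The proxy sends requests to the server and responses to the client.
    sent = 2 * per_hop
    return received, sent


def average_packet_size(bytes_per_second, packets_per_second):
    """Rough average number of bytes carried per packet."""
    return bytes_per_second / packets_per_second


if __name__ == "__main__":
    print(proxy_packet_rates(180_000))               # (360000, 360000)
    print(average_packet_size(30_000_000, 100_000))  # 300.0
```

Under these assumptions the reported workload averages about 300 bytes per packet, which is consistent with "many small requests".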

> haproxy process CPU usage is about 15-20%.

And the rest is for the system ?

Willy




Re: TCP_NODELAY in tcp mode

2015-09-08 Thread Dmitry Sivachenko

> On 30 Aug 2015, at 22:29, Willy Tarreau wrote:
> 
> On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
>>>> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
>>>> 
>>> 
>>> 
>>> What do you mean? I get the following warning when trying to use this
>>> option in tcp backend/frontend:
>> 
>> Yes I know (I didn't realize you are using tcp mode). I don't mean the
>> warning is the bug, I mean the tcp mode is supposed to not cause any
>> delays by default, if I'm not mistaken.
> 
> You're not mistaken, tcp_nodelay is unconditional in TCP mode and MSG_MORE
> is not used there since we never know if more data follows. In fact there's
> only one case where it can happen, it's when data wrap at the end of the
> buffer and we want to send them together.
> 


Hello,

yes, you are right, the problem is not TCP_NODELAY.  I performed some testing:

Under low network load, passing a TCP connection through haproxy involves
almost zero overhead.
When load grows, at some point haproxy starts to slow things down.

In our testing scenario the application establishes a long-lived TCP connection
to the server and sends many small requests.
Typical traffic at which adding haproxy in the middle causes measurable
slowdown is ~30MB/sec, ~100kpps.

haproxy process CPU usage is about 15-20%.


Re: TCP_NODELAY in tcp mode

2015-08-30 Thread Willy Tarreau
On Fri, Aug 28, 2015 at 11:40:18AM +0200, Lukas Tribus wrote:
> >> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
> >>
> >
> >
> > What do you mean? I get the following warning when trying to use this
> > option in tcp backend/frontend:
> 
> Yes I know (I didn't realize you are using tcp mode). I don't mean the
> warning is the bug, I mean the tcp mode is supposed to not cause any
> delays by default, if I'm not mistaken.

You're not mistaken, tcp_nodelay is unconditional in TCP mode and MSG_MORE
is not used there since we never know if more data follows. In fact there's
only one case where it can happen, it's when data wrap at the end of the
buffer and we want to send them together.
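For readers who want to see what "TCP_NODELAY is set" means at the socket API level, here is a minimal sketch using Python's standard socket module. It is not haproxy's code; the helper names are hypothetical. It only shows setting the option on a TCP socket and reading it back, which is the same flag a syscall trace would show after accept() or connect().

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nodelay_enabled(sock):
    """Read the option back; a non-zero value means TCP_NODELAY is set."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    sock = make_nodelay_socket()
    print("TCP_NODELAY set:", nodelay_enabled(sock))
    sock.close()
```

The option is per socket, so a proxy has to set it on both the client-facing and the server-facing connection.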

> You are running freebsd, so splicing (Linux) can't be an issue either.
> Is strace available on your OS (afaik 64bit freebsd doesn't have strace)?
> 
> Can you try disabling kqueue [1], to see if the behavior changes? If
> not, try disabling poll as well [2]. That way haproxy falls back to
> select().
> 
> Having all syscalls (strace) and tcpdumps from the front and backend
> traffic would be helpful. Especially interesting would be if haproxy sets
> TCP_NODELAY and MSG_MORE. It should set the former, but not the
> latter.

In the past I found on (old versions of) Linux that forcing TCP_NODELAY
could end up with an actually higher latency than desired. This is due to
the fact that you're not supposed to send anything after an incomplete TCP
PUSH until it's been ACKed. I used to see this even cause slowdowns on some
proxies. But something like 1 or 2 years ago, while I was discussing this
on the HTTP WG with the Chromium developers, I couldn't reproduce it
anymore, which means that the behaviour has changed at least on Linux. I
would not be surprised if it still exists on other OSes.

A tcpdump will definitely tell us if that's the case because we'll see
that a new segment is emitted immediately once the previous one gets ACKed.

There's nothing that can be done about this (except switching to another
stack or changing the application of course), because :
  - without TCP_NODELAY, you face Nagle and your data may wait up to 40ms
  - with TCP_NODELAY you can be blocked here.

In practice, any application should only send a push when it has nothing
more to send and is waiting for the other side to respond, so if the
application sends many small messages, only the last one of each batch
should have the PUSH flag set. I know it's not always easy to do especially
when you forward data that comes from an uncontrolled source :-)
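The batching advice above can be sketched from the application side. This is an illustration only, assuming the application already knows which small messages belong to one batch; `send_batch` and `recv_exact` are hypothetical helper names. Joining the batch and handing it to the kernel in a single call gives the stack the chance to mark only the end of the batch as pushed, instead of emitting one pushed segment per message.

```python
import socket


def send_batch(sock, messages):
    """Join a batch of small messages and pass them to the kernel in one call.

    Returns the number of bytes handed over.
    """
    payload = b"".join(messages)
    sock.sendall(payload)
    return len(payload)


def recv_exact(sock, size):
    """Read exactly `size` bytes, or fewer if the peer closes early."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # A local socket pair stands in for a real TCP connection here.
    left, right = socket.socketpair()
    sent = send_batch(left, [b"req1\n", b"req2\n", b"req3\n"])
    print(recv_exact(right, sent))
    left.close()
    right.close()
```

How the segments actually look on the wire still depends on the TCP stack, so a tcpdump remains the way to confirm the effect.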

Regards,
Willy




RE: TCP_NODELAY in tcp mode

2015-08-28 Thread Lukas Tribus
>> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
>>
>
>
> What do you mean? I get the following warning when trying to use this
> option in tcp backend/frontend:

Yes I know (I didn't realize you are using tcp mode). I don't mean the
warning is the bug, I mean the tcp mode is supposed to not cause any
delays by default, if I'm not mistaken.

You are running freebsd, so splicing (Linux) can't be an issue either.
Is strace available on your OS (afaik 64bit freebsd doesn't have strace)?

Can you try disabling kqueue [1], to see if the behavior changes? If
not, try disabling poll as well [2]. That way haproxy falls back to
select().

Having all syscalls (strace) and tcpdumps from the front and backend
traffic would be helpful. Especially interesting would be if haproxy sets
TCP_NODELAY and MSG_MORE. It should set the former, but not the
latter.



Regards,

Lukas





[1] http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#3.2-nokqueue
[2] http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#nopoll


  


Re: TCP_NODELAY in tcp mode

2015-08-28 Thread Dmitry Sivachenko

> On 28 Aug 2015, at 12:18, Lukas Tribus wrote:
> 
>>> Use "option http-no-delay" [1] to disable Nagle unconditionally.
>> 
>> 
>> This option requires HTTP mode, but I must use TCP mode because our
>> protocol is not HTTP (some custom protocol over TCP)
> 
> Ok, you may be hitting a bug. Can you provide haproxy -vv output?
> 


What do you mean?  I get the following warning when trying to use this option 
in tcp backend/frontend:

[WARNING] 239/121424 (71492) : config : 'option http-no-delay' ignored for frontend 'shard0-front' as it requires HTTP mode.
[WARNING] 239/121424 (71492) : config : 'option http-no-delay' ignored for backend 'shard0-back' as it requires HTTP mode.

So it is clear that this option is intended for HTTP mode only.  For reference:

HA-Proxy version 1.5.11 2015/01/31
Copyright 2000-2015 Willy Tarreau 

Build options :
  TARGET  = freebsd
  CPU = generic
  CC  = cc
  CFLAGS  = -O2 -pipe -O2 -fno-strict-aliasing -pipe -fstack-protector -DFREEBSD_PORTS
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_OPENSSL=1 USE_STATIC_PCRE=1 USE_PCRE_JIT=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1l-freebsd 15 Jan 2015
Running on OpenSSL version : OpenSSL 1.0.1l-freebsd 15 Jan 2015
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.35 2014-04-04
PCRE library supports JIT : yes
Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY

Available polling systems :
 kqueue : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use kqueue.





RE: TCP_NODELAY in tcp mode

2015-08-28 Thread Lukas Tribus
>> Use "option http-no-delay" [1] to disable Nagle unconditionally.
>
>
> This option requires HTTP mode, but I must use TCP mode because our
> protocol is not HTTP (some custom protocol over TCP)

Ok, you may be hitting a bug. Can you provide haproxy -vv output?


Thanks,

Lukas

  


Re: TCP_NODELAY in tcp mode

2015-08-28 Thread Dmitry Sivachenko

> On 28 Aug 2015, at 12:12, Lukas Tribus wrote:
> 
>> Hello,
>> 
>> The flag TCP_NODELAY is unconditionally set on each TCP (ipv4/ipv6)
>> connection between haproxy and the server, and between the client and
>> haproxy.
> 
> That may be true, however HAProxy uses MSG_MORE to disable and
> enable Nagle based on the individual situation.
> 
> Use "option http-no-delay" [1] to disable Nagle unconditionally.


This option requires HTTP mode, but I must use TCP mode because our protocol is 
not HTTP (some custom protocol over TCP)


> 
> 
> 
> Regards,
> 
> Lukas
> 
> 
> [1] 
> http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4-option%20http-no-delay
>




RE: TCP_NODELAY in tcp mode

2015-08-28 Thread Lukas Tribus
> Hello,
>
> The flag TCP_NODELAY is unconditionally set on each TCP (ipv4/ipv6)
> connection between haproxy and the server, and between the client and
> haproxy.

That may be true, however HAProxy uses MSG_MORE to disable and
enable Nagle based on the individual situation.

Use "option http-no-delay" [1] to disable Nagle unconditionally.



Regards,

Lukas


[1] 
http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4-option%20http-no-delay
 


Re: TCP_NODELAY in tcp mode

2015-08-27 Thread thierry . fournier
On Thu, 27 Aug 2015 20:34:35 +0300
Dmitry Sivachenko  wrote:

> Hello,
> 
> we have a client-server application which establishes a long-lived TCP
> connection and generates a lot of small request-response packets which need
> to be processed very fast.
> Setting TCP_NODELAY on the sockets speeds things up about 3 times.
> 
> Now I want to put haproxy in the middle so it balances traffic between
> several servers.
> 
> Something like 
> 
> defaults
>  mode tcp
> 
> frontend shard0-front
>  bind *:9000
>  default_backend shard0-back
> 
> backend shard0-back
>  server srv1 srv1:3456 check
>  server srv2 srv2:3456 check
> 
> With such a configuration the application slows down significantly.  I suspect
> that setting the TCP_NODELAY option on the frontend's and backend's sockets
> would help, as it did without haproxy involved.  Is there any parameter which
> allows me to set the TCP_NODELAY option?


Hello,

The flag TCP_NODELAY is unconditionally set on each TCP (ipv4/ipv6)
connection between haproxy and the server, and between the client and
haproxy.

You can use "strace" to display the system calls and verify for yourself
that the TCP_NODELAY flag is set after each "accept()" and after each
"connect()".

Thierry



TCP_NODELAY in tcp mode

2015-08-27 Thread Dmitry Sivachenko
Hello,

we have a client-server application which establishes a long-lived TCP
connection and generates a lot of small request-response packets which need to
be processed very fast.
Setting TCP_NODELAY on the sockets speeds things up about 3 times.

Now I want to put haproxy in the middle so it balances traffic between
several servers.

Something like 

defaults
 mode tcp

frontend shard0-front
 bind *:9000
 default_backend shard0-back

backend shard0-back
 server srv1 srv1:3456 check
 server srv2 srv2:3456 check

With such a configuration the application slows down significantly.  I suspect
that setting the TCP_NODELAY option on the frontend's and backend's sockets
would help, as it did without haproxy involved.  Is there any parameter which
allows me to set the TCP_NODELAY option?

Thanks!