Hi Lukas,

On Mon, Dec 16, 2013 at 11:52:27PM +0100, Lukas Tribus wrote:
> Hi Willy and everyone,
> 
> 
> > Subject: [ANNOUNCE] haproxy-1.5-dev20
> >
> > Hi all,
> >
> > here is probably the largest update we ever had, it's composed of 345
> > patches!
> 
> Wow, thats one hell of a -dev release, nice work :)

yes, now you understand why seeing the release being postponed forever started
to look like a nightmare to me :-)

> > - keep-alive: the dynamic allocation of the connection and applet in the
> > session now allows to reuse or kill a connection that was previously
> > associated with the session. Thus we now have a very basic support for
> > keep-alive to the servers. There is even an option to relax the load
> > balancing to try to keep the same connection. Right now we don't do
> > any connection sharing so the main use is for static servers and for
> > far remote servers or those which require the broken NTLM auth. That
> > said, the performance tests I have run show an increase from 71000
> > connections per second to 150000 keep-alive requests per second running
> > on one core of a Xeon E5 3.6 GHz. This doubled to 300k requests per
> > second with two cores. I didn't test above, I lacked injection tools :-)
> 
> Very nice.
> 
> A few questions about this:
> 
> I keep reading that server side keepalive is not a big win on low latency
> LAN networks (the docs also suggest this to some extent), but I am really
> not so sure about this, and your tests seem to contradict it also.
> 
> Did you do your test in a LAN environment with low latency (I suspect you
> did) and what was the object size?

The optimal case: empty objects (equivalent to a 304 Not Modified), just a few
headers. This was done on a 10G network with very low latency. Of course,
if I had to reach a far-away server, keep-alive would be very useful, but this
is generally not the case.

The reason for the small gain is that the protocol is very ping-pong like:
you send a tiny packet, you receive a tiny packet, you may even ACK it,
etc. It's totally suboptimal. The real gain comes from saving the connect()
and close() syscalls, both of which take a lock in the system, when searching
for a free source port and when releasing the FD. And of course, it saves two
packets for the connection establishment and one for the close.
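To make that concrete, here is a rough sketch in plain C sockets. It is not
haproxy code: the request string is made up, "srv" is assumed to be an already
filled server address, and error handling and partial reads are ignored.

#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

static const char req[] = "GET / HTTP/1.1\r\nHost: example\r\n\r\n";

/* close mode: every request pays for socket()+connect()+close(),
 * plus the SYN/SYN-ACK/ACK and the FIN packets on the wire. */
static void one_request_per_connection(const struct sockaddr_in *srv)
{
    char buf[4096];
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    connect(fd, (const struct sockaddr *)srv, sizeof(*srv)); /* source port search */
    send(fd, req, sizeof(req) - 1, 0);
    recv(fd, buf, sizeof(buf), 0);
    close(fd);                                               /* FD release */
}

/* keep-alive mode: connect once, then only send()/recv() per request. */
static void requests_over_one_connection(const struct sockaddr_in *srv, int n)
{
    char buf[4096];
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int i;

    connect(fd, (const struct sockaddr *)srv, sizeof(*srv));
    for (i = 0; i < n; i++) {
        send(fd, req, sizeof(req) - 1, 0);
        recv(fd, buf, sizeof(buf), 0);
    }
    close(fd);
}

In the keep-alive variant the per-request cost shrinks to send()/recv(); the
source-port search and the FD release happen only once for the whole batch.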

For sure if you're running with a stateful firewall (eg: iptables+conntrack)
on the load balancer or the server, it will make a significant difference,
but people dealing with very high traffic rates already remove all this
stuff.

In my opinion we'll get the real savings the day we're able to fuse multiple
requests from multiple clients into the same TCP segments going to the same
server in pipeline mode. But here again, pipelining requests to servers is
fine if you're certain that objects are all approximately the same size, or
very small.
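Just to show what I mean by fusing requests, here is a purely hypothetical
sketch (again not haproxy code, and not something haproxy does today):

#include <string.h>
#include <sys/socket.h>

/* Coalesce two requests, possibly from two different clients, into a single
 * send() toward the server so they can share TCP segments. */
static void pipeline_two_requests(int server_fd,
                                  const char *req1, size_t len1,
                                  const char *req2, size_t len2)
{
    char buf[8192];

    /* assume both requests fit in buf for the example */
    memcpy(buf, req1, len1);
    memcpy(buf + len1, req2, len2);

    /* one syscall and fewer segments on the wire; the hard part is then
     * reading the responses back in order and dispatching each one to the
     * right client, which is why one huge response in the middle stalls
     * everything queued behind it. */
    send(server_fd, buf, len1 + len2, 0);
}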

Another thing I really don't like about keep-alive is that we maintain open
connections to the servers while we otherwise strive to optimize server
connection usage as much as possible. When a server is able to respond in
10 ms and then close, and you suddenly decide to keep the connections open
for 1 second, you multiply its concurrent connections by 100; that's really
bad :-/

> How does end-to-end keep-alive work in conjunction with:
>  timeout http-keep-alive <timeout>
> 
> When does the server side connection get closed? When the original client
> connection closes?

That does not change. The client-side connection still holds the session.
It's terminated on the http-keep-alive timeout if the client remains idle,
which triggers the termination of the session and of the server-side
connection. Until a few minutes ago, there was not even any monitoring of
the server-side connection, so even if the server closed, we didn't see
it.

> I guess once we have connection sharing, it will make sense to have
> independent keep-alive timeouts on the client and the server side and
> perhaps even "connection-pools" to backends, with min/max values for
> idle keep-alive connections.

Yes absolutely! I'm all for sharing server-side connections. But keeping
them dedicated to a client is the worst thing HTTP has ever invented.

> But that is 1.6 material I suspect ;)

Most likely, yes.

> IIRC one of the Amazon cloud load-balancer solutions (or was it cloudflare?)
> always maintains at least a single tcp session to every backend.

It's probably used by the health checks, to avoid triggering connection logs
on intermediate firewalls.

> Shouldn't server side TCP Fast Open have a similar effect on performance, btw?

Yes indeed, though slightly less because you still have connect/close
(sendto/close to be exact). However I tried a few months ago to implement
TFO towards the server and discovered it was not as obvious as I would have
expected it to be. I stopped because my kernel did not yet reliably
support it, so I couldn't test much anyway. But I'll probably have to
give it another try. IIRC one of the problems was that currently, once
we have sent something, the data are flushed out of the buffer. But
with sendto() on TFO, the data were marked sent even before receiving
any ACK, preventing us from performing any retry or redispatch if
required.
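For reference, client-side TFO (what we would use towards the server)
basically looks like the sketch below. It's a hypothetical example assuming a
Linux kernel with client-side TFO enabled, not the haproxy code, and error
handling is reduced to a minimum:

#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef MSG_FASTOPEN
#define MSG_FASTOPEN 0x20000000
#endif

/* Hand the first data to the kernel together with the SYN, replacing the
 * usual connect()+send() pair. Returns the connected fd or -1. */
static int tfo_send_request(const struct sockaddr_in *srv,
                            const char *req, size_t len)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* SYN + data in one call, no separate connect(). sendto() reports the
     * data as taken before any ACK comes back from the server, which is
     * exactly why the buffer can no longer simply be replayed for a retry
     * or a redispatch if the connection attempt fails. */
    if (sendto(fd, req, len, MSG_FASTOPEN,
               (const struct sockaddr *)srv, sizeof(*srv)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}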
 
> > I still have to perform difficult changes on the health checks system to
> > adapt the agent-check after identifying some limitations caused by the
> > original design we decided on a few months ago.
> >
> > Another set of pending changes concerns the polling. Yes I know, each time
> > I touch the pollers I break things. But I need to do them, as now with
> > keep-alive it becomes obvious that we waste too much time enabling and
> > disabling polling because we miss one flag ("this FD last returned EAGAIN").
> > The good point is that it will simplify the connection management and 
> > checks.
> >
> > If these points are done quick enough, I'll see if I can implement a very
> > basic support for server connection sharing (without which I
> > still consider keep-alive as not a huge improvement).
> 
> It would be great to have this in 1.5!

Yes, but I want to remain reasonable as well :-) I mean, I keep an eye on it,
but not with the top priority. I'd rather backport a few patches later if it
makes sense than break everything in a few patches!

> Keep up the good work!

Thanks!
Willy

