Hello Patrick,

Sorry for the late reply. I preferred to focus on reproducing your peer CLOSE_WAIT issue.

See my answers below.

On 05/30/2017 05:29 PM, Patrick Hemmer wrote:


On 2017/5/29 16:04, Frederic Lecaille wrote:
On 05/29/2017 06:12 PM, Patrick Hemmer wrote:

On 2017/5/29 08:22, Frederic Lecaille wrote:

Hi Patrick,

First thank you for this nice and helpful report.

Would it be possible to have an output of this command the next time
you reproduce such an issue please?

    echo "show sess" | socat stdio <haproxy stats socket path>

Unfortunately this would not be possible. When the issue occurs, the
haproxy process has stopped accepting connections on all sockets. If I
were to run this command, it would be sent to the new process, not the
one that won't shut down.


If you send a SIGHUP to haproxy-systemd-wrapper, it asks the old
process to gracefully stop.
Yes, that is what my issue report is about. When sent a SIGHUP, the new
process comes up, but the old process won't shut down.


Please have a look to this documentation:

https://cbonte.github.io/haproxy-dconv/1.7/management.html#4

So you are right: if everything goes well, no more connections are
accept()'ed by the old process (the sockets have been unbound). But in
your reported case the peer sockets are not closed because they are still in
CLOSE_WAIT state, so they are still being processed, and their session
information is still available from the stats socket.
The information might still be tracked within the process, but there is
no way to query the information because the process is no longer
accepting new connections. The new process has taken over control of the
admin socket.

Of course, but the new process may use another stats socket.
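For instance (a hypothetical sketch; the socket path is made up), a dedicated stats socket can be declared in the global section of the new process's configuration:

```
global
    stats socket /var/run/haproxy-new.sock mode 600 level admin
```

and then queried as above: echo "show sess" | socat stdio /var/run/haproxy-new.sock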

Here is an example of a peer TCP session in CLOSE_WAIT state which will never be closed:

0x82ff420: [13/Jun/2017:13:15:18.527593] id=0 proto=tcpv4
  flags=0x23006, conn_retries=0, srv_conn=(nil), pend_pos=(nil)
  frontend=hostB (id=4294967295 mode=tcp), listener=? (id=0)
  backend=<NONE> (id=-1 mode=-) addr=127.0.0.1:51488
  server=<NONE> (id=-1) addr=127.0.0.10:10000
  task=0x82ff3c8 (state=0x08 nice=0 calls=2 exp=<NEVER> age=12m11s)
si[0]=0x82ff570 (state=CLO flags=0x4040 endp0=APPCTX:0x8303bd0 exp=<NEVER>, et=0x000) si[1]=0x82ff58c (state=CON flags=0x50 endp1=CONN:0x8303a88 exp=<NEVER>, et=0x020)
  app0=0x8303bd0 st0=11 st1=0 st2=0 applet=<PEER>
  co1=0x8303a88 ctrl=tcpv4 xprt=RAW data=STRM target=PROXY:0x82b8288
      flags=0x0020b310 fd=1 fd.state=22 fd.cache=0 updt=0
  req=0x82ff42c (f=0x80a020 an=0x0 pipe=0 tofwd=0 total=0)
      an_exp=<NEVER> rex=<NEVER> wex=<NEVER>
      buf=0x81562a8 data=0x81562b8 o=0 p=0 req.next=0 i=0 size=0
  res=0x82ff460 (f=0x80402020 an=0x0 pipe=0 tofwd=0 total=0)
      an_exp=<NEVER> rex=<NEVER> wex=<NEVER>
      buf=0x81562a8 data=0x81562b8 o=0 p=0 rsp.next=0 i=0 size=0


If I have missed something, please do not hesitate to yell at me ;) .

I have been told that "show sess *all*" gives more information.


I have only one question (see below).

On 05/24/2017 10:40 AM, Willy Tarreau wrote:
Hi Patrick,

On Tue, May 23, 2017 at 01:49:42PM -0400, Patrick Hemmer wrote:
(...)
haproxy 28856 root    1u     IPv4          420797940      0t0  TCP 10.0.33.145:35754->10.0.33.147:1029 (CLOSE_WAIT)
haproxy 28856 root    2u     IPv4          420266351      0t0  TCP 10.0.33.145:52898->10.0.33.147:1029 (CLOSE_WAIT)
haproxy 28856 root    3r      REG                0,3        0  4026531956 net
haproxy 28856 root    4u     IPv4          422150834      0t0  TCP 10.0.33.145:38874->10.0.33.147:1029 (CLOSE_WAIT)

These ones are very interesting.

These traces also seem interesting to me.

# strace -p 28856
Process 28856 attached
epoll_wait(0, {}, 200, 319)             = 0
epoll_wait(0, {}, 200, 0)               = 0
epoll_wait(0, {}, 200, 362)             = 0
epoll_wait(0, {}, 200, 0)               = 0
epoll_wait(0, {}, 200, 114)             = 0
epoll_wait(0, {}, 200, 0)               = 0
epoll_wait(0, {}, 200, 203)             = 0
epoll_wait(0, {}, 200, 0)               = 0
epoll_wait(0, {}, 200, 331)             = 0
epoll_wait(0, {}, 200, 0)


Were such "epoll_wait(0, {}, 200, 0)" calls displayed indefinitely?
Yes



In fact I am wondering if it is normal to have so many epoll_wait(0,
{}, 200, 0) calls for a haproxy process which has shut down.

I suspect they are related to peer tasks (which have obviously
expired).

If this is the case, then with configurations made up of only peer tasks,
haproxy would definitely hang, consuming a lot of CPU resources.
HAProxy was not consuming high CPU. Note that in every other call to
`epoll_wait`, the 4th value was >0. If every single timeout value were
0, then yes, it would spin consuming CPU.


Agreed... but perhaps your configuration does not use only peer tasks,
contrary to mine... it is your traces which led me to
check how peer task expiration is handled in configurations with
only peers as backends.

In my case, with only two peers, I see the following traces after a
peer has sent a synchronization request:

    epoll_wait(0, {}, 200, 1000)
    epoll_wait(0, {}, 200, 1000)
    epoll_wait(0, {}, 200, 1000)
    epoll_wait(0, {}, 200, 1000)
    epoll_wait(0, {}, 200, X)    # with X < 1000

followed by a big loop of

    epoll_wait(0, {}, 200, 0)    # consuming high CPU resources for a fraction of a second

then:

    shutdown(SHUT_WR)            # FIN TCP segment at about 5s after the first epoll_wait(0, {}, 200, 1000) above.

then again:

    epoll_wait(0, {}, 200, 0)

until the remote peer, which is in CLOSE_WAIT state, shuts down its
socket.
This doesn't make sense. CLOSE_WAIT means that the remote side has
already closed the socket, and the application on the local side needs
to issue a close() on it.

Agreed Patrick. That is also what I meant above, but from the haproxy point of view: haproxy shuts down (SHUT_WR) its local side, meaning "I have nothing left to write". So, as you wrote, the remote side is then in CLOSE_WAIT state and has to close its socket, because from its point of view the remote side (haproxy) has shut down.

The bug here is that it never closes its socket.

I finally managed to reproduce (on demand) such a CLOSE_WAIT issue thanks to this blog post of Willy's:

https://www.haproxy.com/blog/truly-seamless-reloads-with-haproxy-no-more-hacks/

which refers to this Yelp workaround:

https://engineeringblog.yelp.com/2015/04/true-zero-downtime-haproxy-reloads.html

As I had captured such a CLOSE_WAIT issue with wireshark *only once*, and could not deduce enough information from it to produce and validate any fix, I realized that with a *plug* qdisc it was feasible to temporarily block/unblock incoming connections from within the haproxy process itself, so as to reproduce, as often as required, the network traffic which leads to such CLOSE_WAIT issues.
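For reference, the Yelp article drives a plug qdisc from the command line with nl-qdisc-add from the libnl-utils package, roughly as below (device, parent, and handle values are illustrative, and a classful parent qdisc with handle 1: is assumed to already exist on the device; requires root):

```shell
# Create a plug qdisc with a 1 MiB buffer under the existing parent.
nl-qdisc-add --dev=lo --parent=1:4 --id=40: plug --limit 1048576

# Start buffering (blocking) traffic passing through the qdisc.
nl-qdisc-add --dev=lo --parent=1:4 --id=40: --update plug --buffer

# Release everything buffered so far and stop blocking.
nl-qdisc-add --dev=lo --parent=1:4 --id=40: --update plug --release-indefinite
```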

So I compiled haproxy with the little src/plug_qdisc.c source file (attached to this mail), highly inspired by the libnl-utils package sources, and managed to make a haproxy peer block and unblock incoming connections from another remote peer by calling plug_qdisc_release_indefinite_buffer() and plug_qdisc_plug_buffer() at the right places, and only on one peer side; the CLOSE_WAIT issue then happens on the remote side.

So I realized that there is a condition where peer sessions are shut down just after having been created. This prevents their peer I/O handler from ever being called, so the underlying TCP session is never closed.

I hope the patch attached to this mail will definitively fix such peer CLOSE_WAIT issues.

Regards,

Fred.


#include <stdio.h>
#include <stdint.h>
#include <netlink/cache.h>
#include <netlink/cli/utils.h>
#include <netlink/cli/tc.h>
#include <netlink/cli/qdisc.h>
#include <netlink/cli/link.h>
#include <netlink/route/qdisc/plug.h>

static struct nl_sock *nl_sock;
static struct nl_cache *link_cache;
struct rtnl_qdisc *qdisc;
struct rtnl_tc *tc;

static int qdisc_init(void)
{
	nl_sock = nl_cli_alloc_socket();
	nl_cli_connect(nl_sock, NETLINK_ROUTE);
	link_cache = nl_cli_link_alloc_cache(nl_sock);
	qdisc = nl_cli_qdisc_alloc();
	tc = (struct rtnl_tc *)qdisc;

	return 0;
}

int plug_qdisc_release_indefinite_buffer(void)
{
	rtnl_qdisc_plug_release_indefinite(qdisc);
	return rtnl_qdisc_add(nl_sock, qdisc, 0);
}

int plug_qdisc_plug_buffer(void)
{
	rtnl_qdisc_plug_buffer(qdisc);
	return rtnl_qdisc_add(nl_sock, qdisc, 0);
}

int plug_qdisc_attach(char *device, char *parent, char *id, uint32_t limit)
{
	int ret;

	if (!tc && qdisc_init() == -1)
		return -1;

	nl_cli_tc_parse_dev(tc, link_cache, device);
	nl_cli_tc_parse_parent(tc, parent);
	if (!rtnl_tc_get_ifindex(tc))
		return -1;

	if (!rtnl_tc_get_parent(tc))
		return -1;
	if (id)
		nl_cli_tc_parse_handle(tc, id, 1);

	rtnl_tc_set_kind(tc, "plug");
	if (limit)
		rtnl_qdisc_plug_set_limit(qdisc, limit);

	ret = rtnl_qdisc_add(nl_sock, qdisc, NLM_F_CREATE);
	if (ret < 0) {
		fprintf(stderr, "Could not attach qdisc: %s\n", nl_geterror(ret));
		return -1;
	}
	/* Release buffer. */
	plug_qdisc_release_indefinite_buffer();

	return 0;
}
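A hypothetical usage sketch of the helpers above (not part of the attachment; linking it requires the attached file plus libnl, and running it requires root; the device/parent/handle values are illustrative and assume a classful parent qdisc with handle 1: already exists):

```c
#include <stdint.h>

/* Prototypes from the attached src/plug_qdisc.c. */
int plug_qdisc_attach(char *device, char *parent, char *id, uint32_t limit);
int plug_qdisc_plug_buffer(void);
int plug_qdisc_release_indefinite_buffer(void);

int main(void)
{
	/* Create the plug qdisc with a 1 MiB buffer. */
	if (plug_qdisc_attach("lo", "1:4", "40:", 1048576) < 0)
		return 1;

	/* Hold incoming traffic while reproducing the reload... */
	plug_qdisc_plug_buffer();

	/* ...then release everything buffered and stop blocking. */
	plug_qdisc_release_indefinite_buffer();
	return 0;
}
```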

From 97a6d0714502c348842e66f531e2dce724131f64 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20L=C3=A9caille?= <[email protected]>
Date: Tue, 13 Jun 2017 16:39:57 +0200
Subject: [PATCH] MINOR: peers: Peers CLOSE_WAIT issue.

A peer session which has just been created upon reconnect timeout expiration
could be shut down (at peer session level) right afterwards, because the remote
peer could also have connected right afterwards, leading the peer I/O handler
to never be called. In such a case the TCP session was still running
(connect()/accept()) and was finally left in CLOSE_WAIT state after the remote
side stopped writing (shutdown(SHUT_WR)) because, at this point, only the
peer session I/O handler is able to close a TCP session. As the peer
session is shut down, the I/O handler is never called.

From now on, with this patch, we never shut down peer sessions which have just
been created. We let them connect to the remote peer, which is already
connected and must shut down its own peer session.

Thanks to Patrick Hemmer for reporting this issue and to Willy's and Yelp's
blog posts which helped me fix it.
(See https://www.haproxy.com/blog/truly-seamless-reloads-with-haproxy-no-more-hacks/ and
https://engineeringblog.yelp.com/2015/04/true-zero-downtime-haproxy-reloads.html)
---
 src/peers.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/peers.c b/src/peers.c
index 0c8861f..af9d03e 100644
--- a/src/peers.c
+++ b/src/peers.c
@@ -1719,7 +1719,12 @@ static void peer_session_forceshutdown(struct appctx *appctx)
 {
 	struct peer *ps;
 
-	if (!appctx)
+	/* Note that peer sessions which have just been created
+	 * (->st0 == PEER_SESS_SC_CONNECTCODE) must not
+	 * be shut down. Otherwise the peer I/O handler
+	 * would never be called.
+	 */
+	if (!appctx || appctx->st0 == PEER_SESS_SC_CONNECTCODE)
 		return;
 
 	if (appctx->applet != &peer_applet)
-- 
2.1.4
