Re: dev18 splice-auto
Hi Sander,

The patch I proposed was not enough; it only fixed a few of the occurrences. The issue was introduced in dev12 with the connection rework. Please use the attached patch instead, which I have tested here to confirm the fix, and merged.

The issue mainly happens with chunked-encoded responses, where splice() may read more data than expected, causing the subsequent read() to fail to get the chunk size and aborting the connection.

Regards,
Willy

From 4fc90efed039f23d0d48cde895939d3e94410e54 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Sat, 6 Apr 2013 11:29:39 +0200
Subject: BUG/MEDIUM: splicing is broken since 1.5-dev12

Commit 96199b10 reintroduced the splice() mechanism in the new connection
system. However, it failed to account for the number of transferred bytes,
allowing more bytes than scheduled to be transferred to the client. This
can cause an issue with small-chunked responses, where each packet from
the server may contain several chunks: a single splice() call may succeed,
then try to splice() a second time since the pipe is not full, thus
consuming the next chunk size.

This patch also reverts commit baf2a5 ("OPTIM: splice: detect shutdowns...")
because it introduced a related regression. The issue is that splice() may
return less data than available when the pipe is full, so seeing EPOLLRDHUP
after splice() returns less than expected is not a sufficient indication
that the input is empty.

In both cases, the issue may be detected by the presence of SD termination
flags in the logs, and worked around by disabling splicing (using -dS).

This problem was reported by Sander Klein. No backport is needed.
---
 src/raw_sock.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/raw_sock.c b/src/raw_sock.c
index ad4a0aa..e030253 100644
--- a/src/raw_sock.c
+++ b/src/raw_sock.c
@@ -159,10 +159,7 @@ int raw_sock_to_pipe(struct connection *conn, struct pipe *pipe, unsigned int co
 		retval += ret;
 		pipe->data += ret;
-
-		/* if a POLL_HUP was present, we've read the last segment */
-		if ((fdtab[conn->t.sock.fd].ev & (FD_POLL_ERR|FD_POLL_HUP)) == FD_POLL_HUP)
-			goto out_read0;
+		count -= ret;

 		if (pipe->data >= SPLICE_FULL_HINT || ret >= global.tune.recv_enough) {
 			/* We've read enough of it for this time, let's stop before
--
1.7.12.2.21.g234cd45.dirty
Re: dev18 splice-auto
Heh, I didn't have time to test the previous one, but I'll test this one this evening.

Greets,
Sander

On 6 apr. 2013, at 11:50, Willy Tarreau <w...@1wt.eu> wrote:

> Hi Sander,
>
> The patch I proposed was not enough; it only fixed a few of the occurrences. The issue was introduced in dev12 with the connection rework. Please use the attached patch instead, which I have tested here to confirm the fix, and merged.
>
> The issue mainly happens with chunked-encoded responses, where splice() may read more data than expected, causing the subsequent read() to fail to get the chunk size and aborting the connection.
>
> Regards,
> Willy
>
> <0001-BUG-MEDIUM-splicing-is-broken-since-1.5-dev12.patch>
Re: dev18 splice-auto
On 06.04.2013 11:50, Willy Tarreau wrote:
> Hi Sander,
>
> The patch I proposed was not enough; it only fixed a few of the occurrences. The issue was introduced in dev12 with the connection rework. Please use the attached patch instead, which I have tested here to confirm the fix, and merged.
>
> The issue mainly happens with chunked-encoded responses, where splice() may read more data than expected, causing the subsequent read() to fail to get the chunk size and aborting the connection.

Just to confirm: with the patch applied and splice-auto enabled again, everything seems to be fine. Thanks again, Willy.

Greets,
Sander
Re: dev18 splice-auto
On Sat, Apr 06, 2013 at 08:22:23PM +0200, Sander Klein wrote:
> On 06.04.2013 11:50, Willy Tarreau wrote:
>> Hi Sander,
>>
>> The patch I proposed was not enough; it only fixed a few of the occurrences. The issue was introduced in dev12 with the connection rework. Please use the attached patch instead, which I have tested here to confirm the fix, and merged.
>>
>> The issue mainly happens with chunked-encoded responses, where splice() may read more data than expected, causing the subsequent read() to fail to get the chunk size and aborting the connection.
>
> Just to confirm: with the patch applied and splice-auto enabled again, everything seems to be fine. Thanks again, Willy.

Cool, thanks to you for testing!

Willy
RE: dev18 splice-auto
Hi Lukas,

On 05.04.2013 12:00, Lukas Tribus wrote:
> What is the percentage of requests failing this way?

I'm not sure, but I think it's less than 1%. We do a couple of hundred requests per second, and about every second I see one failed request.

> Do you know if this is an issue introduced by a certain haproxy build, and thus was working previously, or did you only recently enable splice-auto? Are you able to reproduce this in dev17 or in stable 1.4.23 (but you probably rely on 1.5 features)?

I cannot try 1.4 because I indeed rely on 1.5 features. But I did try dev18 and dev17-ss-20130125, and both give the same problems. I cannot go any further back because I had some issues with versions before 20130125, if I recall correctly. I'm not sure what it was anymore. :-)

> Can you remove splice-auto, and check whether splice-request or splice-response or both are affected?

Using splice-request and splice-response together, I get the same issue. Using splice-request alone gives no problems. Using splice-response alone, I get the issue again.

> Do you see this in a lab setup as well or do you need to troubleshoot this with production services?

I do not have a big lab setup in which I can reproduce this.

> Are you able to tcpdump an affected session (both front and backend traffic)?

It is possible to do that, but only if really necessary, and I would probably only want to share it with the HAProxy developers directly.

>> I use kernel 3.2.40 with grsec patch
>
> Any kernel messages in dmesg?

Nope, nothing out of the ordinary.

> Do you have the possibility to install a stable but recent vanilla kernel from kernel.org (I suppose 3.8.5 would be a good choice)? This may as well be a kernel issue.

Vanilla 3.2.x would be possible; anything else is a bit more problematic. Not impossible, but I only want to do that if everything else fails.

Greets,
Sander
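For readers following the isolation test above: the direction-specific options being toggled are real haproxy 1.5 proxy options; the section below is a hypothetical minimal sketch (not Sander's actual configuration), with comments reflecting the results he reports:

```
defaults
    mode http

    # option splice-auto        # let haproxy splice in both directions
    option splice-request       # request direction alone: no problems seen
    # option splice-response    # response direction alone: reproduces the issue
```

Narrowing the failure to splice-response is what points the investigation at the server-to-client path in raw_sock_to_pipe().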
RE: dev18 splice-auto
> It is possible to do that, but only if really necessary. And I probably only want to share that with direct HAProxy developers.

No problem with that, but we need to get more information somehow, and I have the feeling that this is hard to reproduce...

Can you give us the output of the failed request with "show errors" [1] on the unix socket? If we can't see anything with "show errors", then I'm afraid you're going to need to tcpdump the failed request (likely on the backend) and send it to Willy.

Regards,
Lukas

[1] http://cbonte.github.com/haproxy-dconv/configuration-1.5.html#9-show%20errors
Re: dev18 splice-auto
Hi Sander,

On Fri, Apr 05, 2013 at 02:51:54PM +0200, Sander Klein wrote:
> Hi Lukas,
>
> On 05.04.2013 12:00, Lukas Tribus wrote:
>> What is the percentage of requests failing this way?
>
> I'm not sure, but I think it's less than 1%. We do a couple of hundred requests per second, and about every second I see one failed request.
>
>> Do you know if this is an issue introduced by a certain haproxy build, and thus was working previously, or did you only recently enable splice-auto? Are you able to reproduce this in dev17 or in stable 1.4.23 (but you probably rely on 1.5 features)?
>
> I cannot try 1.4 because I indeed rely on 1.5 features. But I did try dev18 and dev17-ss-20130125, and both give the same problems. I cannot go any further back because I had some issues with versions before 20130125, if I recall correctly. I'm not sure what it was anymore. :-)

Could you test if the following patch fixes the problem? It comments out an optim that was merged in 20130107 (before the snapshot you describe), but now I'm having a doubt about it: I think that if the input pipe is full just before storing the whole response, and the server closes along with the response, it might result in truncating the response to the pipe's size.

Thanks,
Willy

diff --git a/src/raw_sock.c b/src/raw_sock.c
index ad4a0aa..6b9a13d 100644
--- a/src/raw_sock.c
+++ b/src/raw_sock.c
@@ -161,8 +161,8 @@ int raw_sock_to_pipe(struct connection *conn, struct pipe *pipe, unsigned int co
 		pipe->data += ret;

 		/* if a POLL_HUP was present, we've read the last segment */
-		if ((fdtab[conn->t.sock.fd].ev & (FD_POLL_ERR|FD_POLL_HUP)) == FD_POLL_HUP)
-			goto out_read0;
+		//if ((fdtab[conn->t.sock.fd].ev & (FD_POLL_ERR|FD_POLL_HUP)) == FD_POLL_HUP)
+		//	goto out_read0;

 		if (pipe->data >= SPLICE_FULL_HINT || ret >= global.tune.recv_enough) {
 			/* We've read enough of it for this time, let's stop before