Re: dev18 splice-auto

2013-04-06 Thread Willy Tarreau
Hi Sander,

the patch I proposed was not enough; it only fixed a few of the
occurrences. The issue was introduced in dev12 with the connection
rework.

Please use the attached patch, which I have tested here and confirmed
to fix the issue, and which I have merged.

The issue mainly happens with chunked-encoded responses, where splice()
may read more data than expected, causing the subsequent read() to miss
the chunk size and abort the connection.
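
To illustrate the idea, here is a minimal standalone sketch (not the
actual haproxy code) of a receive loop that decrements the scheduled
byte count after each transfer, so that splice() can never consume
bytes belonging to the next chunk; fd is assumed to be a non-blocking
socket and pipefd a pipe prepared by the caller:

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Forward at most <count> bytes from socket <fd> into pipe <pipefd>.
 * Decrementing <count> after each call is the point of the fix: without
 * it, a second splice() on the same event could swallow the bytes that
 * form the next chunk-size line of a chunked response.
 */
static ssize_t splice_in_bounded(int fd, int pipefd[2], unsigned int count)
{
	ssize_t total = 0;

	while (count) {
		ssize_t ret = splice(fd, NULL, pipefd[1], NULL, count,
		                     SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
		if (ret < 0) {
			if (errno == EAGAIN)
				break;      /* socket empty or pipe full, retry later */
			return -1;          /* real error */
		}
		if (ret == 0)
			break;              /* end of input */
		total += ret;
		count -= ret;               /* never request more than scheduled */
	}
	return total;
}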

Regards,
Willy

From 4fc90efed039f23d0d48cde895939d3e94410e54 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Sat, 6 Apr 2013 11:29:39 +0200
Subject: BUG/MEDIUM: splicing is broken since 1.5-dev12

Commit 96199b10 reintroduced the splice() mechanism in the new connection
system. However, it failed to account for the number of transferred bytes,
allowing more bytes than scheduled to be transferred to the client. This
can cause an issue with small chunked responses, where each packet from
the server may contain several chunks: after a first splice() call
succeeds, the code may splice() a second time since the pipe is not full,
thus consuming the next chunk size.

This patch also reverts commit baf2a5 (OPTIM: splice: detect shutdowns...)
because it introduced a related regression. The issue is that splice() may
also return less data than available when the pipe is full, so seeing
EPOLLRDHUP after splice() returns less than expected is not a sufficient
indication that the input is empty.

In both cases, the issue may be detected by the presence of SD termination
flags in the logs, and worked around by disabling splicing (using -dS).

This problem was reported by Sander Klein, and no backport is needed.
---
 src/raw_sock.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/raw_sock.c b/src/raw_sock.c
index ad4a0aa..e030253 100644
--- a/src/raw_sock.c
+++ b/src/raw_sock.c
@@ -159,10 +159,7 @@ int raw_sock_to_pipe(struct connection *conn, struct pipe *pipe, unsigned int count)
 
 		retval += ret;
 		pipe->data += ret;
-
-		/* if a POLL_HUP was present, we've read the last segment */
-		if ((fdtab[conn->t.sock.fd].ev & (FD_POLL_ERR|FD_POLL_HUP)) == FD_POLL_HUP)
-			goto out_read0;
+		count -= ret;
 
 		if (pipe->data >= SPLICE_FULL_HINT || ret >= global.tune.recv_enough) {
 			/* We've read enough of it for this time, let's stop before
-- 
1.7.12.2.21.g234cd45.dirty



Re: dev18 splice-auto

2013-04-06 Thread Sander Klein
Heh, I didn't have time to test the previous one, but I'll test this one this 
evening. 

Greets,

Sander

On 6 Apr 2013, at 11:50, Willy Tarreau <w...@1wt.eu> wrote:

> Hi Sander,
>
> the patch I proposed was not enough; it only fixed a few of the
> occurrences. The issue was introduced in dev12 with the connection
> rework.
>
> Please use the attached patch, which I have tested here and confirmed
> to fix the issue, and which I have merged.
>
> The issue mainly happens with chunked-encoded responses, where splice()
> may read more data than expected, causing the subsequent read() to miss
> the chunk size and abort the connection.
>
> Regards,
> Willy
>
> 0001-BUG-MEDIUM-splicing-is-broken-since-1.5-dev12.patch



Re: dev18 splice-auto

2013-04-06 Thread Sander Klein

On 06.04.2013 11:50, Willy Tarreau wrote:

> Hi Sander,
>
> the patch I proposed was not enough; it only fixed a few of the
> occurrences. The issue was introduced in dev12 with the connection
> rework.
>
> Please use the attached patch, which I have tested here and confirmed
> to fix the issue, and which I have merged.
>
> The issue mainly happens with chunked-encoded responses, where splice()
> may read more data than expected, causing the subsequent read() to miss
> the chunk size and abort the connection.


Just to confirm: with the patch applied and splice-auto enabled again,
everything seems to be fine.

Thanks again, Willy.

Greets,

Sander



Re: dev18 splice-auto

2013-04-06 Thread Willy Tarreau
On Sat, Apr 06, 2013 at 08:22:23PM +0200, Sander Klein wrote:
> On 06.04.2013 11:50, Willy Tarreau wrote:
> > Hi Sander,
> >
> > the patch I proposed was not enough; it only fixed a few of the
> > occurrences. The issue was introduced in dev12 with the connection
> > rework.
> >
> > Please use the attached patch, which I have tested here and confirmed
> > to fix the issue, and which I have merged.
> >
> > The issue mainly happens with chunked-encoded responses, where splice()
> > may read more data than expected, causing the subsequent read() to miss
> > the chunk size and abort the connection.
>
> Just to confirm: with the patch applied and splice-auto enabled again,
> everything seems to be fine.
>
> Thanks again, Willy.

Cool, thanks to you for testing!

Willy




RE: dev18 splice-auto

2013-04-05 Thread Sander Klein

Hi Lukas,

On 05.04.2013 12:00, Lukas Tribus wrote:

> What is the percentage of requests failing this way?


I'm not sure, but I think it's less than 1%. We do a few hundred
requests per second, and about every second I see one failed request.


> Do you know if this is an issue introduced by a certain haproxy build,
> and thus was working previously, or did you only recently enable
> splice-auto? Are you able to reproduce this in dev17 or in stable
> 1.4.23 (but you probably rely on 1.5 features)?


I cannot try 1.4 because I indeed rely on 1.5 features. But I did try
dev18 and dev17-ss-20130125. Both give the same problems. I cannot go
any further back because I had some issues with versions before
20130125, if I recall correctly. I'm not sure what it was anymore :-)


> Can you remove splice-auto, and check whether splice-request or
> splice-response or both are affected?


Using both splice-request and splice-response I get the same issue.
Using only splice-request gives no problems.
Using only splice-response I get the issue again.

> Do you see this in a lab setup as well, or do you need to troubleshoot
> this with production services?


I do not have a big lab setup in which I can reproduce this.

> Are you able to tcpdump an affected session (both front and backend
> traffic)?


It is possible to do that, but only if really necessary. And I probably 
only want to share that with direct HAProxy developers.
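
For reference, such a capture could look like the following, run once on
the frontend side and once on the backend side; the interface name and
the backend address are placeholders to adapt:

tcpdump -i eth0 -s 0 -w affected-session.pcap host 192.0.2.10 and tcp port 80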



> > I use kernel 3.2.40 with grsec patch


> Any kernel messages in dmesg?


Nope, nothing out of the ordinary.

> Do you have the possibility to install a stable but recent vanilla
> kernel from kernel.org (I suppose 3.8.5 would be a good choice)? This
> may as well be a kernel issue.


Vanilla 3.2.x would be possible; anything else is a bit more
problematic. Not impossible, but I only want to do that if everything
else fails.


Greets,

Sander



RE: dev18 splice-auto

2013-04-05 Thread Lukas Tribus
> It is possible to do that, but only if really necessary. And I probably
> only want to share that with direct HAProxy developers.

No problem with that; but we need to get more information somehow, and I
have the feeling that this is hard to reproduce...

Can you give us the output of the failed request with "show errors" [1]
on the unix socket?
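
For example, assuming the stats socket is configured at
/var/run/haproxy.sock (the path depends on your "stats socket"
directive):

echo "show errors" | socat stdio unix-connect:/var/run/haproxy.sock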

If we can't see anything with "show errors", I'm afraid you're going to
need to tcpdump the failed request (likely on the backend) and send it
to Willy.


Regards,
Lukas


[1] http://cbonte.github.com/haproxy-dconv/configuration-1.5.html#9-show%20errors


Re: dev18 splice-auto

2013-04-05 Thread Willy Tarreau
Hi Sander,

On Fri, Apr 05, 2013 at 02:51:54PM +0200, Sander Klein wrote:
> Hi Lukas,
>
> On 05.04.2013 12:00, Lukas Tribus wrote:
> > What is the percentage of requests failing this way?
>
> I'm not sure, but I think it's less than 1%. We do a few hundred
> requests per second, and about every second I see one failed request.
>
> > Do you know if this is an issue introduced by a certain haproxy build,
> > and thus was working previously, or did you only recently enable
> > splice-auto? Are you able to reproduce this in dev17 or in stable
> > 1.4.23 (but you probably rely on 1.5 features)?
>
> I cannot try 1.4 because I indeed rely on 1.5 features. But I did try
> dev18 and dev17-ss-20130125. Both give the same problems. I cannot go
> any further back because I had some issues with versions before
> 20130125, if I recall correctly. I'm not sure what it was anymore :-)

Could you test if the following patch fixes the problem? It comments out
an optim that was merged in 20130107 (before the snapshot you describe),
but now I'm having doubts about it: I think that if the input pipe is
full just before storing the whole response, and the server closes along
with the response, it might result in truncating it to the pipe's size.

Thanks,
Willy
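
As a side note on the scenario above: a pipe has a fixed capacity, so
splice(), like write(), can return less than requested while the socket
still holds data. A small standalone demo of that limit (not haproxy
code; assumes Linux, where F_GETPIPE_SZ exists since 2.6.35):

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	int p[2];
	char buf[4096];
	long total = 0;

	/* non-blocking pipe: writes fail with EAGAIN once it is full */
	if (pipe2(p, O_NONBLOCK) < 0)
		return 1;

	memset(buf, 'x', sizeof(buf));
	for (;;) {
		ssize_t ret = write(p[1], buf, sizeof(buf));
		if (ret < 0 && errno == EAGAIN)
			break;	/* pipe full: further input would have to wait */
		if (ret < 0)
			return 1;
		total += ret;
	}
	printf("pipe filled after %ld bytes (kernel reports %d)\n",
	       total, fcntl(p[0], F_GETPIPE_SZ));
	return 0;
}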

diff --git a/src/raw_sock.c b/src/raw_sock.c
index ad4a0aa..6b9a13d 100644
--- a/src/raw_sock.c
+++ b/src/raw_sock.c
@@ -161,8 +161,8 @@ int raw_sock_to_pipe(struct connection *conn, struct pipe *pipe, unsigned int count)
 		pipe->data += ret;
 
 		/* if a POLL_HUP was present, we've read the last segment */
-		if ((fdtab[conn->t.sock.fd].ev & (FD_POLL_ERR|FD_POLL_HUP)) == FD_POLL_HUP)
-			goto out_read0;
+		//if ((fdtab[conn->t.sock.fd].ev & (FD_POLL_ERR|FD_POLL_HUP)) == FD_POLL_HUP)
+		//	goto out_read0;
 
 		if (pipe->data >= SPLICE_FULL_HINT || ret >= global.tune.recv_enough) {
 			/* We've read enough of it for this time, let's stop before