OK so my assumption was right and I could even model a reproducer against
this bug :-)

Interestingly it doesn't affect 1.6+ because we added some protections
against polling loops in 1.6, so a failed read is properly detected and
disabled. Janusz, that explains why 1.6.4 doesn't cause the spikes.

Janusz and Nenad, please apply the following patch to your 1.5 tree.
It works for me and does what the code is supposed to do (ie: subtract
outgoing data from the reserve as done in channel_full() when deciding
to re-enable polling).

I'll issue a new 1.5 soon with this fix to limit the breakage for people
who have not upgraded yet.

Thanks,
Willy
>From 8a32106fffb4642687af76c78fd3c171847c917e Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Mon, 11 Apr 2016 17:04:02 +0200
Subject: BUG/MEDIUM: channel: fix miscalculation of available buffer space
 (2nd try)

Commit 999f643 ("BUG/MEDIUM: channel: fix miscalculation of available buffer
space.") introduced a bug which made output data to be ignored when computing
the remaining room in a buffer. The problem is that channel_may_recv()
properly considers them and may declare that the FD may be polled for read
events, but once the even strikes, channel_recv_limit() called before recv()
says the opposite. In 1.6 and later this case is automatically caught by
polling loop detection at the connection level and is harmless. But the
backport in 1.5 ends up with a busy polling loop as soon as it becomes
possible to have a buffer with this conflict. In order to reproduce it, it
is necessary to have less than [maxrewrite] bytes available in a buffer, no
forwarding enabled (end of transfer) and [buf->o >= maxrewrite - free space].

Since this heavily depends on socket buffers, it will randomly strike users.
On 1.5 with 8kB buffers it was possible to reproduce it with httpterm using
the following command line :

   $ (printf "GET /?s=675000 HTTP/1.0\r\n\r\n"; sleep 60) | \
       nc6 --rcvbuf-size 1 --send-only 127.0.0.1 8002

This bug is only medium in 1.6 and later but is major in the 1.5 backport,
so it must be backported there.

Thanks to Nenad Merdanovic and Janusz Dziemidowicz for reporting this issue
with enough elements to help understand it.
---
 include/proto/channel.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/proto/channel.h b/include/proto/channel.h
index 848ab02..31993fa 100644
--- a/include/proto/channel.h
+++ b/include/proto/channel.h
@@ -342,7 +342,7 @@ static inline int channel_recv_limit(const struct channel 
*chn)
            chn->to_forward == CHN_INFINITE_FORWARD)
                goto end;
 
-       transit = chn->to_forward - chn->buf->i;
+       transit = chn->buf->o + chn->to_forward - chn->buf->i;
        if (transit < 0)
                transit = 0;
 
-- 
1.7.12.1

Reply via email to