Hi Willy, Thierry, others,

On 13-10-2015 at 18:29, Willy Tarreau wrote:
Hi again :-)

On Tue, Oct 13, 2015 at 06:10:33PM +0200, Willy Tarreau wrote:
I can't reproduce either unfortunately. I'm seeing some other minor
issues related to how the closed input is handled and showing that
pipelining doesn't work (only the first request is handled) but that's
all I'm seeing I'm sorry.

I've tried injecting on stats in parallel to the other frontend, I've
tried with close and keep-alive etc... I tried to change the poller
just in case you would be facing a race condition, no way :-(

In general it's good to keep in mind that buffer_slow_realign() is
called to realign wrapped requests, so that normally means that
pipelining is needed. But even then for now I can't succeed.
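
To illustrate what "realigning a wrapped request" means in general, here is a minimal sketch. It is not HAProxy's buffer_slow_realign(); demo_realign() and its parameters are hypothetical names, and it simply assumes a circular storage area where the data starts at offset head and may wrap past the end:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not HAProxy code: copy `len` bytes that start at
 * offset `head` inside a circular storage area of `size` bytes into `out`,
 * so that data which wrapped past the end becomes contiguous again.
 * The caller must ensure head < size, len <= size and that `out` can hold
 * `len` bytes. */
void demo_realign(const char *data, size_t size, size_t head,
                  size_t len, char *out)
{
    size_t first = size - head;          /* bytes before the wrap point */

    if (first > len)
        first = len;                     /* data does not wrap at all */

    memcpy(out, data + head, first);     /* part located at the end */
    memcpy(out + first, data, len - first); /* part that wrapped to the start */
}
```

With an 8-byte area holding "ld!..wor" and head = 5, copying 6 bytes yields the contiguous string "world!".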
As usual, sending an e-mail scares the bug and it starts to shake the
white flag :-)

So by configuring the buffer size to 10000 and sending large 8kB requests,
I'm seeing a random behaviour. First, most of the time I end up with a
stuck session which never ends (no expiration timer set). And from time
to time it may crash. This time it was not in buffer_slow_realign() but
in buffer_insert_line2(), though the problem is the same :

(gdb) up
#2  0x000000000046e094 in http_header_add_tail2 (msg=0x7ce628, hdr_idx=0x7ce5c8, text=0x53b339 "Connection: close", len=17) at src/proto_http.c:595
595             bytes = buffer_insert_line2(msg->chn->buf, msg->chn->buf->p + msg->eoh, text, len);

(gdb) p msg->eoh
$6 = 8057
(gdb) p *msg->chn->buf
$7 = {p = 0x7f8e7b44bf9e "3456789.123456789\n", 'P' <repeats 182 times>..., size = 10008, i = 0, o = 8058, data = 0x7f8e7b44a024 "GET /1234567"}

(gdb) p msg->chn->buf->p - msg->chn->buf->data
$8 = 8058

As one may notice, since p is already 8kB from the beginning of the buffer
(hence 2kB from the end), writing at p + eoh is definitely wrong. So the
problem here is that either msg->eoh or buf->p is wrong.
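
The arithmetic behind that observation can be sketched as follows. This is only an illustration using the values from the gdb dump above; struct demo_buf and the two helpers are hypothetical and much simpler than HAProxy's real buffer structure:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the buffer shown in the gdb dump;
 * this is not HAProxy's struct buffer. */
struct demo_buf {
    size_t size;   /* total storage size in bytes */
    size_t p_off;  /* offset of p from the start of the storage (p - data) */
};

/* Bytes left between p and the end of the storage. */
size_t demo_room_after_p(const struct demo_buf *b)
{
    assert(b->p_off <= b->size);
    return b->size - b->p_off;
}

/* Number of bytes by which (p + off) lies beyond the end of the storage,
 * or 0 if that position is at or before the end. */
size_t demo_overrun(const struct demo_buf *b, size_t off)
{
    size_t room = demo_room_after_p(b);

    return off > room ? off - room : 0;
}
```

With size = 10008 and p - data = 8058, only 1950 bytes remain after p, so an offset of msg->eoh = 8057 points 6107 bytes past the end of the storage.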

My opinion here is that buf->p is the wrong one, since we're dealing with an
8kB request, so it should definitely have been realigned. Or maybe it was
stripped and removed from the request buffer with HTTP processing still

All this part is still totally unclear to me I'm afraid. I suggest that we
don't rush too fast on lua services and try to fix that during the stable
cycle. I don't want to postpone the release any further for something that
was added very recently and that is not causing any regression to existing

Best regards,

Ok, got some good news here :) The 1.6.0 release no longer has the error I encountered.

The commit below fixed the issue already.
CLEANUP: cli: ensure we can never double-free error messages

I was still testing with 1.6-dev7; the fix above came the day after. You're probably testing with HEAD, which is why it doesn't happen for you. Using snapshots or HEAD is not as easy as just following dev releases, so I usually stick to those unless I have reason to believe a newer version might already fix the issue. I should have tested again sooner, sorry. (I actually did test the latest snapshot available at the moment when I first reported the issue.)

Anyway, I burned some more hours on both your side and mine than was probably needed.
One more issue gone :)

Thanks for the support!


