Re: HAProxy 1.9.6 unresponsive

2019-05-13 Thread Willy Tarreau
Hi Patrick,

On Mon, May 13, 2019 at 08:18:21AM -0400, Patrick Hemmer wrote:
> There's been mention of releasing 1.9.8. Will that release contain a fix for
> the issue reported in this thread?

Sorry, I mixed it up with the other threads about 100% CPU. I've
re-read the whole thread and yours was related to the trailers in HTX
mode, which was indeed addressed in 1.9 by commit ec4ae19eb ("BUG/MEDIUM:
mux-h2/htx: never wait for EOM when processing trailers").
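
The idea behind that fix, as a minimal sketch with simplified types (not
the real HTX/mux-h2 code): once a complete trailers block is seen, emit
the trailing HEADERS frame with END_STREAM right away instead of waiting
for a separate end-of-message block that may only show up in a later
buffer.

#include <stdio.h>

/* Simplified illustration only: these types are NOT the real HTX/mux-h2 API. */
enum blk_type { BLK_DATA, BLK_TLR, BLK_EOM };

struct h2_out {
    int end_stream;   /* END_STREAM flag carried by the emitted frame */
    int consumed;     /* blocks consumed, i.e. forward progress       */
};

static int send_trailers(struct h2_out *out, enum blk_type blk, int eom_seen)
{
    if (blk != BLK_TLR)
        return 0;

    /* pre-fix logic: if (!eom_seen) return 0;  -> no progress, endless loop
     * while the EOM block is still in flight */
    (void)eom_seen;

    out->end_stream = 1;   /* the trailing HEADERS must close the stream anyway */
    out->consumed++;
    return 1;              /* block consumed, caller makes progress */
}

int main(void)
{
    struct h2_out out = { 0, 0 };

    /* trailers present, EOM not yet received: the fixed logic still advances */
    if (send_trailers(&out, BLK_TLR, 0))
        printf("trailers emitted, END_STREAM=%d\n", out.end_stream);
    return 0;
}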

I've been trying to release 1.9.8 over the week-end and again today, but
got distracted by reviews, last-minute fixes and carefully checking that
we don't need any new patch. Everything looks OK now, so I'm trying to
get back to it.

Willy



Re: HAProxy 1.9.6 unresponsive

2019-05-13 Thread Patrick Hemmer




*From:* Willy Tarreau [mailto:w...@1wt.eu]
*Sent:* Saturday, May 11, 2019, 06:10 EDT
*To:* Patrick Hemmer 
*Cc:* haproxy@formilux.org
*Subject:* HAProxy 1.9.6 unresponsive


Hi Patrick,

On Fri, May 10, 2019 at 09:17:25AM -0400, Patrick Hemmer wrote:

So I see a few updates on some of the other 100% CPU usage threads, and that
some fixes have been pushed. Are any of those in relation to this issue? Or
is this one still outstanding?

Apparently we've pulled a long piece of string and uncovered a series of
such bugs. It's likely that different persons have been affected by
different bugs. We still have the issue Maciej is experiencing that I'd
really like to nail down, given the last occurrence doesn't seem to make
sense as the code looks right after Olivier's fix.

Thanks,
Willy


Thanks, but I'm unsure if that means the issue I reported is fixed, or 
if other related issues are fixed and this one is still outstanding.
There's been mention of releasing 1.9.8. Will that release contain a fix 
for the issue reported in this thread?


-Patrick


Re: HAProxy 1.9.6 unresponsive

2019-05-11 Thread Willy Tarreau
Hi Patrick,

On Fri, May 10, 2019 at 09:17:25AM -0400, Patrick Hemmer wrote:
> So I see a few updates on some of the other 100% CPU usage threads, and that
> some fixes have been pushed. Are any of those in relation to this issue? Or
> is this one still outstanding?

Apparently we've pulled a long piece of string and uncovered a series of
such bugs. It's likely that different persons have been affected by
different bugs. We still have the issue Maciej is experiencing that I'd
really like to nail down, given the last occurrence doesn't seem to make
sense as the code looks right after Olivier's fix.

Thanks,
Willy



Re: HAProxy 1.9.6 unresponsive

2019-05-10 Thread Patrick Hemmer




*From:* Willy Tarreau [mailto:w...@1wt.eu]
*Sent:* Tuesday, May 7, 2019, 14:46 EDT
*To:* Patrick Hemmer 
*Cc:* haproxy@formilux.org
*Subject:* HAProxy 1.9.6 unresponsive


Hi Patrick,

On Tue, May 07, 2019 at 02:01:33PM -0400, Patrick Hemmer wrote:

Just in case it's useful, we had the issue recur today. However I gleaned a
little more information from this recurrence. Provided below are several
outputs from a gdb `bt full`. The important bit is that in the captures, the
last frame which doesn't change between each capture is the `si_cs_send`
function. The last stack capture provided has the shortest stack depth of
all the captures, and is inside `h2_snd_buf`.

Thank you. At first glance this remains similar. Christopher and I have
been studying these issues intensely these days because they have deep
roots into some design choices and tradeoffs we've had to make and that
we're relying on, and we've come to conclusions about some long term
changes to address the causes, and some fixes for 1.9 that now appear
valid. We're still carefully reviewing our changes before pushing them.
Then I think we'll emit 1.9.8 anyway since it will already fix quite a
number of issues addressed since 1.9.7, so for you it will probably be
easier to try again.
  
So I see a few updates on some of the other 100% CPU usage threads, and 
that some fixes have been pushed. Are any of those in relation to this 
issue? Or is this one still outstanding?


Thanks

-Patrick



Re: HAProxy 1.9.6 unresponsive

2019-05-07 Thread Willy Tarreau
Hi Patrick,

On Tue, May 07, 2019 at 02:01:33PM -0400, Patrick Hemmer wrote:
> Just in case it's useful, we had the issue recur today. However I gleaned a
> little more information from this recurrence. Provided below are several
> outputs from a gdb `bt full`. The important bit is that in the captures, the
> last frame which doesn't change between each capture is the `si_cs_send`
> function. The last stack capture provided has the shortest stack depth of
> all the captures, and is inside `h2_snd_buf`.

Thank you. At first glance this remains similar. Christopher and I have
been studying these issues intensely these days because they have deep
roots into some design choices and tradeoffs we've had to make and that
we're relying on, and we've come to conclusions about some long term
changes to address the causes, and some fixes for 1.9 that now appear
valid. We're still carefully reviewing our changes before pushing them.
Then I think we'll emit 1.9.8 anyway since it will already fix quite a
number of issues addressed since 1.9.7, so for you it will probably be
easier to try again.
 
> Otherwise the behavior is still the same as last time, with `strace`
> showing absolutely nothing, so it's still looping.

I'm not surprised. We managed to break that loop in a dirty way a first
time, but it came with side effects (some random errors could be emitted
depending on the frame sizes, which is obviously not acceptable). But yes,
this loop has no way to give up. That's the second argument convincing me
to finish the watchdog so that at least the process dies when this happens!
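
For reference, one classic way to build such a watchdog (an illustration
only, not HAProxy's actual wdt code): each thread bumps a progress counter,
and a periodic timer signal aborts the process when the counter stops
moving, so that at least a core dump is left behind instead of a silent
spin:

#include <signal.h>
#include <stdlib.h>
#include <sys/time.h>
#include <time.h>

static volatile sig_atomic_t progress;   /* bumped whenever the loop makes progress */
static volatile sig_atomic_t last_seen;  /* snapshot taken at the previous tick     */

static void watchdog_tick(int sig)
{
    (void)sig;
    if (progress == last_seen)  /* no progress since the previous tick */
        abort();                /* die loudly and leave a core to analyze */
    last_seen = progress;
}

int main(void)
{
    struct sigaction sa = { 0 };
    struct itimerval it = { .it_interval = { .tv_sec = 1 },
                            .it_value    = { .tv_sec = 1 } };
    time_t start = time(NULL);

    sa.sa_handler = watchdog_tick;
    sigaction(SIGALRM, &sa, NULL);
    setitimer(ITIMER_REAL, &it, NULL);

    /* stand-in for the event loop: report progress for 3 seconds, then get
     * "stuck"; shortly afterwards the watchdog fires and aborts the process. */
    for (;;) {
        if (time(NULL) - start < 3)
            progress++;
    }
}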

Expect some updates on this this week.

Cheers,
Willy



Re: HAProxy 1.9.6 unresponsive

2019-05-07 Thread Patrick Hemmer




*From:* Willy Tarreau [mailto:w...@1wt.eu]
*Sent:* Monday, May 6, 2019, 08:42 EDT
*To:* Patrick Hemmer 
*Cc:* haproxy@formilux.org
*Subject:* HAProxy 1.9.6 unresponsive


On Sun, May 05, 2019 at 09:40:02AM +0200, Willy Tarreau wrote:

With this said, after studying the code a little bit more, I'm seeing a
potential case where, if we have a trailers entry in the HTX buffer but
no end of message, we could loop forever there without consuming this block.
I have no idea whether this is possible in an HTX message; I'll ask Christopher
tomorrow. In any case we need to address this one way or another, possibly
reporting an error instead if required. Thus I'm postponing 1.9.8 until
tomorrow.

So the case is indeed possible, and at the moment all we can do is try to
minimize the probability of producing it :-(  The issue arises when we've
received the end of the trailers but not yet the end of the message.
From the H2 protocol perspective, if we've sent the END_STREAM flag, the
stream is closed, and a closed stream gets detached and cannot receive
new traffic, so at best we'll occasionally close too early and report
client failures at the upper layers while everything went OK. We cannot
send trailers without the END_STREAM flag since no frame may follow.
Abusing CONTINUATION is out of the question here as this would require
completely freezing the whole connection (including control frames) for
the time it takes to get this final EOM block. I thought about simply
reporting an error when we're in this situation between trailers and EOM,
but it would mean that occasionally some chunked responses with trailers
and sizes close to a multiple of 16 kB may err out, which is not
acceptable either.

For 2.0 we have a rough idea of what needs to be modified to address this
situation, but that will not be trivial and will not be backportable.

For 1.9 I'm still trying to figure out what the "best" solution is. I may
finally end up marking the stream as closed as soon as we see the
trailers pushed down. I'm just unsure right now about all the possible
consequences and need to study the edge cases. Also I fear that this
will be something hard to unroll later, so I'm still studying.

Willy


Just in case it's useful, we had the issue recur today. However I 
gleaned a little more information from this recurrence. Provided below 
are several outputs from a gdb `bt full`. The important bit is that in 
the captures, the last frame which doesn't change between each capture 
is the `si_cs_send` function. The last stack capture provided has the 
shortest stack depth of all the captures, and is inside `h2_snd_buf`.


Otherwise the behavior is still the same as last time, with `strace`
showing absolutely nothing, so it's still looping.





#0  h1_headers_to_hdr_list (start=0x7f5a4ea6b5fb "grpco\243?", stop=0x7f5a4ea6b5ff "o\243?", hdr=hdr@entry=0x7ffdc58f6400, hdr_num=hdr_num@entry=101, h1m=h1m@entry=0x7ffdc58f63d0, slp=slp@entry=0x0) at src/h1.c:793
    ret = <optimized out>
    state = <optimized out>
    ptr = <optimized out>
    end = <optimized out>
    hdr_count = <optimized out>
    skip = 0
    sol = <optimized out>
    col = <optimized out>
    eol = <optimized out>
    sov = <optimized out>
    sl = <optimized out>
    skip_update = <optimized out>
    restarting = <optimized out>
    n = <optimized out>
    v = {ptr = 0x7f5a4eb51453 "LZ\177", len = 140025825685243}
#1  0x7f5a4d862539 in h2s_htx_make_trailers (h2s=h2s@entry=0x7f5a4ecc7860, htx=htx@entry=0x7f5a4ea67630) at src/mux_h2.c:4996
    list = {{n = {ptr = 0x0, len = 0}, v = {ptr = 0x0, len = 0}} }
    h2c = 0x7f5a4ec56610
    blk = <optimized out>
    blk_end = 0x0
    outbuf = {size = 140025844274259, area = 0x7f5a4d996efb "\205\300~\aHc\320H\001SXH\205\355t\026Lc\310E1\300D\211\351L\211⾃", data = 16472, head = 140025845781936}
    h1m = {state = H1_MSG_HDR_NAME, flags = 2056, curr_len = 0, body_len = 0, next = 4, err_pos = 0, err_state = 1320431563}
    type = <optimized out>
    ret = <optimized out>
    hdr = 0
    idx = 5
    start = <optimized out>
#2  0x7f5a4d866ef5 in h2_snd_buf (cs=0x7f5a4e9a8980, buf=0x7f5a4e777d78, count=4, flags=<optimized out>) at src/mux_h2.c:5372
    h2s = <optimized out>
    orig_count = <optimized out>
    total = 16291
    ret = <optimized out>
    htx = 0x7f5a4ea67630
    blk = <optimized out>
    btype = <optimized out>
    idx = <optimized out>
#3  0x7f5a4d8f4be4 in si_cs_send (cs=cs@entry=0x7f5a4e9a8980) at src/stream_interface.c:691
    send_flag = <optimized out>
    conn = 0x7f5a4e86f4c0
    si = 0x7f5a4e777f98
    oc = 0x7f5a4e777d70
    ret = <optimized out>
    did_send = 0
#4  0x7f5a4d8f6305 in si_cs_io_cb (t=<optimized out>, ctx=0x7f5a4e777f98, state=<optimized out>) at src/stream_interface.c:737
    si = 0x7f5a4e777f98
    cs = 0x7f5a4e9a8980
    ret = 0
#5  0x7f5a4d925f02 in process_runnable_tasks () at src/task.c:437
    t = <optimized out>
    state = <optimized out>
    ctx = <optimized out>
    process = <optimized out>
    t = <optimized out>
    max_processed = <optimized out>
#6  0x7f5a4d89f6ff in run_poll_loop () at 

Re: HAProxy 1.9.6 unresponsive

2019-05-06 Thread Willy Tarreau
On Sun, May 05, 2019 at 09:40:02AM +0200, Willy Tarreau wrote:
> With this said, after studying the code a little bit more, I'm seeing a
> potential case where, if we have a trailers entry in the HTX buffer but
> no end of message, we could loop forever there without consuming this block.
> I have no idea whether this is possible in an HTX message; I'll ask Christopher
> tomorrow. In any case we need to address this one way or another, possibly
> reporting an error instead if required. Thus I'm postponing 1.9.8 until
> tomorrow.

So the case is indeed possible, and at the moment all we can do is try to
minimize the probability of producing it :-(  The issue arises when we've
received the end of the trailers but not yet the end of the message.
From the H2 protocol perspective, if we've sent the END_STREAM flag, the
stream is closed, and a closed stream gets detached and cannot receive
new traffic, so at best we'll occasionally close too early and report
client failures at the upper layers while everything went OK. We cannot
send trailers without the END_STREAM flag since no frame may follow.
Abusing CONTINUATION is out of the question here as this would require
completely freezing the whole connection (including control frames) for
the time it takes to get this final EOM block. I thought about simply
reporting an error when we're in this situation between trailers and EOM,
but it would mean that occasionally some chunked responses with trailers
and sizes close to a multiple of 16 kB may err out, which is not
acceptable either.
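
To make the dilemma more concrete, here is a tiny standalone model
(simplified block types, nothing from the real HTX API) of the state we
end up in:

#include <stdio.h>

/* Simplified model of the state described above: the buffer already holds
 * the trailers block, but the EOM block proving the message is really over
 * may only arrive with the next buffer fill, which happens when the message
 * size is close to a multiple of the 16kB buffer.  H2 only allows sending
 * the trailing HEADERS frame with END_STREAM set, so the sender must either
 * close possibly too early or keep the block and retry forever. */
enum htx_blk_type { HTX_BLK_DATA, HTX_BLK_TLR, HTX_BLK_EOM };

int main(void)
{
    /* what the mux sees in this buffer: some data, then trailers, no EOM */
    enum htx_blk_type buf[] = { HTX_BLK_DATA, HTX_BLK_TLR };
    size_t i, n = sizeof(buf) / sizeof(buf[0]);
    int eom_seen = 0;

    for (i = 0; i < n; i++) {
        if (buf[i] == HTX_BLK_EOM)
            eom_seen = 1;
        if (buf[i] == HTX_BLK_TLR && !eom_seen)
            printf("trailers without EOM: closing now risks truncating, "
                   "waiting risks looping forever\n");
    }
    return 0;
}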

For 2.0 we have a rough idea of what needs to be modified to address this
situation, but that will not be trivial and will not be backportable.

For 1.9 I'm still trying to figure out what the "best" solution is. I may
finally end up marking the stream as closed as soon as we see the
trailers pushed down. I'm just unsure right now about all the possible
consequences and need to study the edge cases. Also I fear that this
will be something hard to unroll later, so I'm still studying.

Willy



Re: HAProxy 1.9.6 unresponsive

2019-05-05 Thread Willy Tarreau
Hi Patrick,

On Fri, May 03, 2019 at 04:33:07PM -0400, Patrick Hemmer wrote:
> We are running HAProxy 1.9.6 and managed to get into a state where HAProxy
> was completely unresponsive. It was pegged at 100% like many of the other
> experiences here on the mailing list lately. But in addition it wouldn't
> respond to anything. The stats socket wasn't even responsive.
> 
> When I attached an strace, it sat there with no activity. When I attached
> GDB I got the following stack:
(...)
> Our config is big and complex, and not something I want to post here (I may
> be able to provide directly if required). However I think the important bit
> is that we have a frontend and backend which are used for load balancing
> gRPC traffic (thus h2). The backend servers are h2c (no SSL).
(...)

Function h2s_htx_make_trailers() is called in loops here, and I see no way
this function can return without consuming the block, marking an error or
indicating that it's blocked. Thus I suspect this one could be a consequence
of the bug fixed by commit 9a0f559 ("BUG/MEDIUM: h2: Make sure we're not
already in the send_list in h2_subscribe().") which was backported into
1.9.7. Do not rush an upgrade though, I'm going to issue 1.9.8 soon with
a few more fixes.
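
To illustrate why that contract matters, here is a rough sketch
(simplified, not the actual h2_snd_buf/si_cs_send code) of the calling
pattern: the walk over the blocks only stops on progress, an error, or a
"blocked" indication, so a handler that provides none of the three spins
without ever making a syscall, which is exactly what the empty strace
shows:

#include <stddef.h>

/* Rough sketch of the failure mode: the caller walks the blocks queued for
 * sending and relies on each handler to consume its block, flag an error,
 * or flag itself as blocked.  If the trailers handler does none of the
 * three, the index never advances and the loop below never exits. */
enum blk_type { BLK_DATA, BLK_TLR, BLK_END };

struct sender {
    int error;    /* handler reported a fatal problem    */
    int blocked;  /* handler asked to be woken up later  */
};

/* handler returns the number of blocks it consumed (0 = no progress) */
typedef size_t (*blk_handler)(struct sender *s, enum blk_type blk);

static size_t send_blocks(struct sender *s, const enum blk_type *blks, blk_handler h)
{
    size_t idx = 0, sent = 0;

    while (blks[idx] != BLK_END && !s->error && !s->blocked) {
        size_t ret = h(s, blks[idx]);

        if (ret == 0)
            continue;   /* no progress and no flag set: the same block is
                           re-examined forever, 100% CPU and no syscalls */
        idx += ret;
        sent += ret;
    }
    return sent;
}

static size_t consume_one(struct sender *s, enum blk_type blk)
{
    (void)s; (void)blk;
    return 1;   /* a well-behaved handler always makes progress */
}

int main(void)
{
    enum blk_type blks[] = { BLK_DATA, BLK_TLR, BLK_END };
    struct sender s = { 0, 0 };

    return send_blocks(&s, blks, consume_one) == 2 ? 0 : 1;
}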

With this said, after studying the code a little bit more, I'm seeing a
potential case where, if we have a trailers entry in the HTX buffer but
no end of message, we could loop forever there without consuming this block.
I have no idea whether this is possible in an HTX message; I'll ask Christopher
tomorrow. In any case we need to address this one way or another, possibly
reporting an error instead if required. Thus I'm postponing 1.9.8 until
tomorrow.

> The service has been restarted, so it cannot be probed any more. However I
> did capture a core file before doing so.

That might actually be useful to study the sequence of HTX messages there.
I may ask you to dig a little bit into it.

Thanks!
Willy



HAProxy 1.9.6 unresponsive

2019-05-03 Thread Patrick Hemmer
We are running HAProxy 1.9.6 and managed to get into a state where 
HAProxy was completely unresponsive. It was pegged at 100% like many of 
the other experiences here on the mailing list lately. But in addition 
it wouldn't respond to anything. The stats socket wasn't even responsive.


When I attached an strace, it sat there with no activity. When I 
attached GDB I got the following stack:


        (gdb) bt full
        #0  htx_get_head (htx=0x7fbeb666eba0) at include/common/htx.h:357
        No locals.
        #1  h2s_htx_make_trailers (h2s=h2s@entry=0x7fbeb625f9f0, htx=htx@entry=0x7fbeb666eba0) at src/mux_h2.c:4975
                        list = {{n = {ptr = 0x0, len = 0}, v = {ptr = 0x0, len = 0}} }
                        h2c = 0x7fbeb6372320
                        blk = <optimized out>
                        blk_end = 0x0
                        outbuf = {size = 140722044755807, area = 0x0, data = 140457080712096, head = 140457060939041}
                        h1m = {state = H1_MSG_HDR_NAME, flags = 2056, curr_len = 140457077580664, body_len = 16384, next = 2, err_pos = 0, err_state = -1237668736}
                        type = <optimized out>
                        ret = 0
                        hdr = 0
                        idx = <optimized out>
                        start = <optimized out>
        #2  0x7fbeb50f2ef5 in h2_snd_buf (cs=0x7fbeb63ea9a0, buf=0x7fbeb6127048, count=2, flags=<optimized out>) at src/mux_h2.c:5372
                        h2s = <optimized out>
                        orig_count = <optimized out>
                        total = 15302
                        ret = <optimized out>
                        htx = 0x7fbeb666eba0
                        blk = <optimized out>
                        btype = <optimized out>
                        idx = <optimized out>
        #3  0x7fbeb5180be4 in si_cs_send (cs=0x7fbeb63ea9a0) at src/stream_interface.c:691
                        send_flag = <optimized out>
                        conn = 0x7fbeb6051a70
                        si = 0x7fbeb6127268
                        oc = 0x7fbeb6127040
                        ret = <optimized out>
                        did_send = 0
        #4  0x7fbeb51817c8 in si_update_both (si_f=si_f@entry=0x7fbeb6127268, si_b=si_b@entry=0x7fbeb61272a8) at src/stream_interface.c:850
                        req = 0x7fbeb6126fe0
                        res = <optimized out>
                        cs = <optimized out>
        #5  0x7fbeb50ea2e1 in process_stream (t=<optimized out>, context=0x7fbeb6126fd0, state=<optimized out>) at src/stream.c:2502
                        srv = <optimized out>
                        s = 0x7fbeb6126fd0
                        sess = <optimized out>
                        rqf_last = <optimized out>
                        rpf_last = 3255042562
                        rq_prod_last = <optimized out>
                        rq_cons_last = <optimized out>
                        rp_cons_last = 7
                        rp_prod_last = 7
                        req_ana_back = <optimized out>
                        req = 0x7fbeb6126fe0
                        res = 0x7fbeb6127040
                        si_f = 0x7fbeb6127268
                        si_b = 0x7fbeb61272a8
        #6  0x7fbeb51b20a8 in process_runnable_tasks () at src/task.c:434
                        t = <optimized out>
                        state = <optimized out>
                        ctx = <optimized out>
                        process = <optimized out>
                        t = <optimized out>
                        max_processed = <optimized out>
        #7  0x7fbeb512b6ff in run_poll_loop () at src/haproxy.c:2642
                        next = <optimized out>
                        exp = <optimized out>
        #8  run_thread_poll_loop (data=data@entry=0x7fbeb5d84620) at src/haproxy.c:2707
                        ptif = <optimized out>
                        ptdf = <optimized out>
                        start_lock = 0
        #9  0x7fbeb507d2b5 in main (argc=<optimized out>, argv=0x7ffc677d73b8) at src/haproxy.c:3343
                        tids = 0x7fbeb5d84620
                        threads = 0x7fbeb5eb6d90
                        i = <optimized out>
                        old_sig = {__val = {68097, 0, 511101108338, 0, 
140722044760335, 140457059422467, 140722044760392, 140454020513805, 124, 
140457064304960, 390842023936, 140457064395072, 48, 140457035994976, 
18446603351664791121, 140454020513794}}

        ---Type <return> to continue, or q <return> to quit---
                        blocked_sig = {__val = {1844674406710583, 
18446744073709551615 }}

                        err = <optimized out>
                        retry = <optimized out>
                        limit = {rlim_cur = 131300, rlim_max = 131300}
                        errmsg = 
"\000@\000\000\000\000\000\000\002\366\210\263\276\177\000\000\300\364m\265\276\177\000\000`\227\274\263\276\177\000\000\030\000\000\000\000\000\000\000>\001\000\024\000\000\000\000p$o\265\276\177\000\000@>k\265\276\177\000\000\000\320$\265\276\177\000\000\274\276\177\000\000 
t}g\374\177\000\000\000\000\000\000\000\000\000\000P\367m\265"

                        pidfd = <optimized out>

Our config is big and complex, and not something I want to post here (I 
may be able to provide directly if required). However I think the 
important bit is that we have a frontend and backend which are used 
for load balancing gRPC traffic (thus h2). The backend servers are h2c (no SSL).