Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-13 Thread Maciej Zdeb
Ok, I'll wait for an update from you! :)

Mon, 13 May 2019 at 08:03, Willy Tarreau wrote:

> Hi Maciej,
>
> On Mon, May 13, 2019 at 07:21:59AM +0200, Maciej Zdeb wrote:
> > Hi,
> >
> > I'm not observing any issues, so I think it's fixed. :)
> >
> > Willy, Olivier thank you very much!
>
> Great, thanks. I'm going to issue 1.9.8 with the patch I sent you then.
> However, after discussing it with Olivier, we came to the conclusion
> that it is not the cleanest solution, as it addresses the consequence of
> the problem (being flow controlled while still being in the send list)
> instead of its cause. Olivier proposed a different one that I'd very much
> appreciate you testing after the release; I don't want to take any new
> risk for 1.9.8 for now. We'll keep you updated.
>
> Cheers,
> Willy
>


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-13 Thread Willy Tarreau
Hi Maciej,

On Mon, May 13, 2019 at 07:21:59AM +0200, Maciej Zdeb wrote:
> Hi,
> 
> I'm not observing any issues, so I think it's fixed. :)
> 
> Willy, Olivier thank you very much!

Great, thanks. I'm going to issue 1.9.8 with the patch I sent you then.
However, after discussing it with Olivier, we came to the conclusion
that it is not the cleanest solution, as it addresses the consequence of
the problem (being flow controlled while still being in the send list)
instead of its cause. Olivier proposed a different one that I'd very much
appreciate you testing after the release; I don't want to take any new
risk for 1.9.8 for now. We'll keep you updated.

Cheers,
Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-12 Thread Maciej Zdeb
Hi,

I'm not observing any issues, so I think it's fixed. :)

Willy, Olivier thank you very much!

Kind regards,

Sun, 12 May 2019 at 12:11, Willy Tarreau wrote:

> On Sun, May 12, 2019 at 12:00:44PM +0200, Maciej Zdeb wrote:
> > For now it is stable, no looping. I'll leave it running till Monday and
> > return with feedback! :)
>
> Many thanks Maciej for this positive feedback. I'll merge it and issue
> 1.9.8
> then. There are definitely enough pending fixes for a release!
>
> Cheers,
> Willy
>


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-12 Thread Willy Tarreau
On Sun, May 12, 2019 at 12:00:44PM +0200, Maciej Zdeb wrote:
> For now it is stable, no looping. I'll leave it running till Monday and
> return with feedback! :)

Many thanks Maciej for this positive feedback. I'll merge it and issue 1.9.8
then. There are definitely enough pending fixes for a release!

Cheers,
Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-12 Thread Maciej Zdeb
For now it is stable, no looping. I'll leave it running till Monday and
return with feedback! :)

Thanks,

Sat, 11 May 2019 at 15:44, Maciej Zdeb wrote:

> Patch applied, fingers crossed, testing! :-)
>
> Thanks!
>
> Sat, 11 May 2019 at 14:58, Willy Tarreau wrote:
>
>> On Sat, May 11, 2019 at 11:01:42AM +0200, Willy Tarreau wrote:
>> > On Sat, May 11, 2019 at 10:52:35AM +0200, Willy Tarreau wrote:
>> > > I certainly made a few reasoning mistakes above but I don't see
>> anything
>> > > in the code preventing this case from happening.
>> > >
>> > > Thus I'd like you to try the attached patch which is supposed to
>> prevent
>> > > this scenario from happening. At least I've verified that it doesn't
>> > > break the h2spec test suite.
>> >
>> > While trying to check if it still applied to the latest 1.9 I figured
>> > that it corresponds to what Olivier had also found and fixed in his
>> > latest patch :-/  The positive point is that my analysis was correct.
>> >
>> > So I'm afraid that if it still fails with his fix, we'll need another
>> > core :-(
>>
>> Actually not, Olivier's fix is incomplete regarding the scenario I
>> proposed :
>> - in h2s_frt_make_resp_data() we can set H2_SF_BLK_SFCTL and remove the
>>   element from the list
>> - then in h2_shutr() and h2_shutw(), we check if the list is empty before
>>   subscribing the element, which is true after the case above
>> - then in h2c_update_all_ws() we still have H2_SF_BLK_SFCTL with the item
>>   in the send_list, thus LIST_ADDQ() adds it a second time.
>>
>> Thus the first part of the patch I sent is still required, I'm attaching
>> it
>> again, rebased on top of Olivier's patch and simplified so that we don't
>> detach then re-attach.
>>
>> I'm still keeping hope ;-)
>>
>> Willy
>>
>


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-11 Thread Maciej Zdeb
Patch applied, fingers crossed, testing! :-)

Thanks!

Sat, 11 May 2019 at 14:58, Willy Tarreau wrote:

> On Sat, May 11, 2019 at 11:01:42AM +0200, Willy Tarreau wrote:
> > On Sat, May 11, 2019 at 10:52:35AM +0200, Willy Tarreau wrote:
> > > I certainly made a few reasoning mistakes above but I don't see
> anything
> > > in the code preventing this case from happening.
> > >
> > > Thus I'd like you to try the attached patch which is supposed to
> prevent
> > > this scenario from happening. At least I've verified that it doesn't
> > > break the h2spec test suite.
> >
> > While trying to check if it still applied to the latest 1.9 I figured
> > that it corresponds to what Olivier had also found and fixed in his
> > latest patch :-/  The positive point is that my analysis was correct.
> >
> > So I'm afraid that if it still fails with his fix, we'll need another
> > core :-(
>
> Actually not, Olivier's fix is incomplete regarding the scenario I
> proposed :
> - in h2s_frt_make_resp_data() we can set H2_SF_BLK_SFCTL and remove the
>   element from the list
> - then in h2_shutr() and h2_shutw(), we check if the list is empty before
>   subscribing the element, which is true after the case above
> - then in h2c_update_all_ws() we still have H2_SF_BLK_SFCTL with the item
>   in the send_list, thus LIST_ADDQ() adds it a second time.
>
> Thus the first part of the patch I sent is still required, I'm attaching it
> again, rebased on top of Olivier's patch and simplified so that we don't
> detach then re-attach.
>
> I'm still keeping hope ;-)
>
> Willy
>


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-11 Thread Willy Tarreau
On Sat, May 11, 2019 at 11:01:42AM +0200, Willy Tarreau wrote:
> On Sat, May 11, 2019 at 10:52:35AM +0200, Willy Tarreau wrote:
> > I certainly made a few reasoning mistakes above but I don't see anything
> > in the code preventing this case from happening.
> > 
> > Thus I'd like you to try the attached patch which is supposed to prevent
> > this scenario from happening. At least I've verified that it doesn't
> > break the h2spec test suite.
> 
> While trying to check if it still applied to the latest 1.9 I figured
> that it corresponds to what Olivier had also found and fixed in his
> latest patch :-/  The positive point is that my analysis was correct.
> 
> So I'm afraid that if it still fails with his fix, we'll need another
> core :-(

Actually not, Olivier's fix is incomplete regarding the scenario I proposed:
- in h2s_frt_make_resp_data() we can set H2_SF_BLK_SFCTL and remove the
  element from the list
- then in h2_shutr() and h2_shutw(), we check if the list is empty before
  subscribing the element, which is true after the case above
- then in h2c_update_all_ws() we still have H2_SF_BLK_SFCTL with the item
  in the send_list, thus LIST_ADDQ() adds it a second time.

Thus the first part of the patch I sent is still required; I'm attaching it
again, rebased on top of Olivier's patch and simplified so that we don't
detach then re-attach.

I'm still keeping hope ;-)

Willy
diff --git a/src/mux_h2.c b/src/mux_h2.c
index d201921..94e4e5e 100644
--- a/src/mux_h2.c
+++ b/src/mux_h2.c
@@ -1484,9 +1484,8 @@ static void h2c_update_all_ws(struct h2c *h2c, int diff)
 
if (h2s->mws > 0 && (h2s->flags & H2_SF_BLK_SFCTL)) {
h2s->flags &= ~H2_SF_BLK_SFCTL;
-   if (h2s->send_wait)
+   if (h2s->send_wait && LIST_ISEMPTY(&h2s->list))
LIST_ADDQ(&h2c->send_list, &h2s->list);
-
}
 
node = eb32_next(node);
@@ -1791,9 +1790,8 @@ static int h2c_handle_window_update(struct h2c *h2c, struct h2s *h2s)
h2s->mws += inc;
if (h2s->mws > 0 && (h2s->flags & H2_SF_BLK_SFCTL)) {
h2s->flags &= ~H2_SF_BLK_SFCTL;
-   if (h2s->send_wait)
+   if (h2s->send_wait && LIST_ISEMPTY(&h2s->list))
LIST_ADDQ(&h2c->send_list, &h2s->list);
-
}
}
else {


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-11 Thread Maciej Zdeb
> What I find very strange is why you're possibly the only one seeing this
> (and maybe also @serimin on github issue #94). If we could figure what
> makes your case specific it could help narrow the issue down. I'm seeing
> that you have a very simple Lua service to respond to health checks, so
> I've been thinking that maybe we do have some remaining bugs when Lua is
> accessed over H2 (e.g. incorrect length or whatever), but it's not the
> case on your connection since there are something like 17 streams so we
> can rule out the hypothesis of a health check connection, and thus that
> Lua was used.
>

Correct, Lua applets are used for health checking, and only for a specific
list of clients (distinguished by source IP). Health checks are made over
HTTP/1.0 (not even 1.1), so they should not affect H2 in HAProxy. This
particular HAProxy serves images, so the responses are quite large and so is
the traffic. I'm using HAProxy 1.9.7 on other machines without this problem,
so it must be the traffic profile or the configuration that matters.

The difference from the other HAProxy 1.9.7 instances I'm using is the
consistent hashing (hash-type consistent, balance hdr(...) and a server id
specified in the config for each server) and also the inter, fastinter,
downinter and slowstart settings.
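
For reference, here is a minimal, hypothetical configuration sketch of the setup
described above. The backend name, header name, addresses and timing values are
invented for illustration; only the directives mentioned in this thread are shown,
this is not the real configuration:

# hypothetical excerpt, not the real configuration
backend images
    balance hdr(X-Image-Key)
    hash-type consistent
    server img1 10.0.0.11:80 id 1 check inter 2s fastinter 500ms downinter 5s slowstart 30s
    server img2 10.0.0.12:80 id 2 check inter 2s fastinter 500ms downinter 5s slowstart 30s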


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-11 Thread Willy Tarreau
On Sat, May 11, 2019 at 10:52:35AM +0200, Willy Tarreau wrote:
> I certainly made a few reasoning mistakes above but I don't see anything
> in the code preventing this case from happening.
> 
> Thus I'd like you to try the attached patch which is supposed to prevent
> this scenario from happening. At least I've verified that it doesn't
> break the h2spec test suite.

While trying to check if it still applied to the latest 1.9 I figured
that it corresponds to what Olivier had also found and fixed in his
latest patch :-/  The positive point is that my analysis was correct.

So I'm afraid that if it still fails with his fix, we'll need another
core :-(

Thanks,
Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-11 Thread Willy Tarreau
On Sat, May 11, 2019 at 09:56:18AM +0200, Willy Tarreau wrote:
> I'm back to auditing the code to figure how we can free an h2s without
> first detaching it from the lists. I hope to have yet another patch to
> propose to you.

So I'm seeing something which bothers me in the code. Since I'm not at
ease with these parts I may say stupid things, but overall the idea is
the following:

- everywhere before releasing an h2s we remove it from its lists, so it
  cannot exist in a pool while still belonging to a list; that's fine.
  => it means the element we're looping on is still live in the connection.

- at every place we're adding an element to the list, we take care of
  removing it first, or checking that it was not attached to a list. Every
  single place except one:

  static void h2c_update_all_ws(struct h2c *h2c, int diff)
  (...)
if (h2s->mws > 0 && (h2s->flags & H2_SF_BLK_SFCTL)) {
h2s->flags &= ~H2_SF_BLK_SFCTL;
if (h2s->send_wait)
LIST_ADDQ(&h2c->send_list, &h2s->list);

}

  Instead the check is made on send_wait being non-null.

- in h2s_frt_make_resp_data() we test this send_wait pointer before deciding
  to remove the element from the list in case we're blocking on flow control:

if (size <= 0) {
h2s->flags |= H2_SF_BLK_SFCTL;
if (h2s->send_wait) {
LIST_DEL(&h2s->list);
LIST_INIT(&h2s->list);
}
goto end;
}

  So if for any reason send_wait is not set, nobody removes the element
  from the list.

- this function h2s_frt_make_resp_data() is called from h2_snd_buf(), which
  starts by resetting h2s->send_wait. And h2_snd_buf() only deletes the stream
  from the list in the case where it managed to send some data.

So what I'm wondering is if something like the following scenario may happen :

1) a stream wants to send data, it calls h2_snd_buf() a first time. The
   output buffer is full, the call fails

2) the caller (si_cs_send()) detects the error and calls h2_subscribe() to
   attach the stream to the send_list

3) some room is made in the buffer, the list is walked over and this stream
   is woken up.

4) h2_snd_buf() is called again to try to send, send_wait is immediately reset

5) h2s_frt_make_resp_data() detects that the stream's flow control credit
   is exhausted and declines. It sets H2_SF_BLK_SFCTL but doesn't delete
   the element from the send list since send_wait is NULL

6) another stream still has credit and fills the connection's buffer with
   its data.

7) the server the first stream is associated to delivers more data while the
   stream is still waiting for a window update, and attempts to send again
   via h2_snd_buf()

8) h2_snd_buf() doesn't detect that the stream is blocked because the check
   on sending_list doesn't match, so it tries again and sets send_wait to
   NULL again, then tries to send. It fails again because the connection's
   buffer is full

9) si_cs_send() above calls h2_subscribe() again to set send_wait and try
   to add the stream to the send_list. But since the stream has the SFCTL
   flag it's not added again, and it simply remains where it already sits
   in the send_list.

10) a window update finally arrives for this stream (or an initial window
size update for all streams). h2c_handle_window_update() is called,
sees the stream has the SFCTL flag, removes it and appends it to the
end of the send_list, where it already was.

11) h2_process_mux() scans the list, finds the first reference to the list
element (where it was first added), and wakes it up. When going to the
next one however, it directly jumps to the one that was added past the
second addition, ignoring the orphans in the list that sit between the
first place of the element and its last one.

12) h2_snd_buf() is called again for this element, then deletes it from the
list, which has the consequence of unlinking the element from its last
predecessor and successors, but not from the first one. Once the element
is deleted, it's initialized and its next points to itself (an empty
list head).

13) h2_process_mux() scans the list again to process more streams, and
reaches the first predecessor of this element, then visits the next
which is this element again, looping onto itself.

I certainly made a few reasoning mistakes above but I don't see anything
in the code preventing this case from happening.
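
To make this easier to follow, here is a minimal standalone C model of that
scenario. It uses simplified clones of HAProxy's struct list / LIST_ADDQ() /
LIST_DEL() primitives, so it is only a sketch, not the real mux code: an element
re-queued while already in the send_list, then deleted and re-initialized, leaves
its first predecessor pointing at a self-looping node, so a (deliberately bounded)
walk never gets back to the head. The LIST_ISEMPTY() guard from the attached patch
refuses the second LIST_ADDQ() and avoids the corruption.

#include <stdio.h>

/* simplified clones of HAProxy's circular doubly-linked list primitives */
struct list { struct list *n, *p; };

#define LIST_INIT(l)      do { (l)->n = (l); (l)->p = (l); } while (0)
#define LIST_ISEMPTY(l)   ((l)->n == (l))
#define LIST_ADDQ(lh, el) do { (el)->p = (lh)->p; (el)->p->n = (el); \
                               (el)->n = (lh);    (lh)->p = (el); } while (0)
#define LIST_DEL(el)      do { (el)->n->p = (el)->p; (el)->p->n = (el)->n; } while (0)

int main(void)
{
    struct list send_list, s1, s2, *pos;
    int steps;

    LIST_INIT(&send_list);
    LIST_INIT(&s1);
    LIST_INIT(&s2);

    LIST_ADDQ(&send_list, &s1);  /* stream 1 subscribes and gets queued (step 2)  */
    LIST_ADDQ(&send_list, &s2);  /* another stream gets queued behind it (step 6) */
    LIST_ADDQ(&send_list, &s1);  /* stream 1 is appended again while still queued */
                                 /* (steps 9-10: nothing checks LIST_ISEMPTY())   */

    LIST_DEL(&s1);               /* step 12: only the *last* insertion is unlinked */
    LIST_INIT(&s1);              /* s1 is now an empty, self-pointing head, yet    */
                                 /* send_list.n still references it                */

    /* step 13: walking the send_list never gets back to the head */
    for (steps = 0, pos = send_list.n; steps < 10 && pos != &send_list; steps++)
        pos = pos->n;

    if (pos != &send_list)
        printf("stuck after %d steps on a self-looping node\n", steps);

    /* the guard from the attached patch would have skipped the second enqueue:
     *     if (h2s->send_wait && LIST_ISEMPTY(&h2s->list))
     *             LIST_ADDQ(&h2c->send_list, &h2s->list);
     */
    return 0;
}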

Thus I'd like you to try the attached patch which is supposed to prevent
this scenario from happening. At least I've verified that it doesn't
break the h2spec test suite.

Thanks,
Willy
diff --git a/src/mux_h2.c b/src/mux_h2.c
index 67a297a..e2b1533 100644
--- a/src/mux_h2.c
+++ b/src/mux_h2.c
@@ -1484,9 +1484,10 @@ static void h2c_update_all_ws(struct h2c *h2c, int diff)
 
if (h2s->mws > 0 && (h2s->flags & H2_SF_BLK_SFCTL)) {

Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-11 Thread Willy Tarreau
Hi Maciej,

On Fri, May 10, 2019 at 06:45:21PM +0200, Maciej Zdeb wrote:
> Olivier, it's still looping, but differently:
> 
> 2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
> (gdb) n
> 2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
> (gdb)
(...)
> (gdb) p *h2s
> $1 = {cs = 0x2f84190, sess = 0x819580 , h2c = 0x2f841a0, h1m
> = {state = 48, flags = 0, curr_len = 38317, body_len = 103852, next = 413,
> err_pos = -1, err_state = 0}, by_id = {node = {branches = {b = {0x34c0260,
>   0x321d330}}, node_p = 0x0, leaf_p = 0x0, bit = 1, pfx = 47005},
> key = 3}, id = 3, flags = 28675, mws = 1017461, errcode = H2_ERR_NO_ERROR,
> st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf = {size = 0, area =
> 0x0, data = 0,
> head = 0}, wait_event = {task = 0x2cd0ed0, handle = 0x3, events = 0},
> recv_wait = 0x0, send_wait = 0x321d390, list = {n = 0x321d3b8, p =
> 0x321d3b8}, sending_list = {n = 0x3174cf8, p = 0x3174cf8}}
(...)

In fact it's exactly the same. I've analyzed the core you sent me (thanks
a lot for this by the way), and it explains very well why it's looping,
though I don't yet understand how we entered this situation. It also
explains how it can randomly crash instead of looping.

What happens is that the item you're inspecting above has an empty
list element which points to itself. Thus it doesn't belong to a list
and its next is itself. But it's still attached to its predecessor. In
your core file it was very visible as the h2c list's head points to the
element and the list tail's tail points to it as well. So what we're
seeing is an element that was *not* unlinked from a list before being
removed / recycled. Once it was reused, it was initialized with an
empty list header and the next time the connection's list was visited,
it caused the loop.
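
For what it's worth, here is a small hedged sketch (plain C, not HAProxy code) of a
bounded consistency walk over a circular doubly-linked list using the same n/p
convention as struct list. The names and the rebuilt state are made up to mirror
the core: the head still references a node whose n and p point to itself, so the
check reports the broken link immediately instead of spinning the way
h2_process_mux() does.

#include <stdio.h>

struct list { struct list *n, *p; };

/* walk at most max_steps nodes and verify n/p symmetry at each hop */
static int list_check(const struct list *head, int max_steps)
{
    const struct list *pos = head;
    int i;

    for (i = 0; i < max_steps; i++) {
        if (pos->n->p != pos || pos->p->n != pos) {
            printf("broken link after %d hop(s) at %p\n", i, (void *)pos);
            return -1;
        }
        pos = pos->n;
        if (pos == head)
            return 0;                /* ring closed: list is consistent */
    }
    printf("no way back to the head after %d hops: self-loop suspected\n", max_steps);
    return -1;
}

int main(void)
{
    struct list head, el;

    /* rebuild the state seen in the core: the send_list head still
     * references the element, but the element was recycled and
     * re-initialized to an empty, self-pointing list */
    head.n = &el; head.p = &el;
    el.n = &el;   el.p = &el;

    return list_check(&head, 16) ? 1 : 0;
}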

What I find very strange is why you're possibly the only one seeing this
(and maybe also @serimin on github issue #94). If we could figure what
makes your case specific it could help narrow the issue down. I'm seeing
that you have a very simple Lua service to respond to health checks, so
I've been thinking that maybe we do have some remaining bugs when Lua is
accessed over H2 (e.g. incorrect length or whatever), but it's not the
case on your connection since there are something like 17 streams so we
can rule out the hypothesis of a health check connection, and thus that
Lua was used.

I'm back to auditing the code to figure how we can free an h2s without
first detaching it from the lists. I hope to have yet another patch to
propose to you.

Thanks again for your traces, they've been amazingly useful!
Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-10 Thread Maciej Zdeb
Olivier, it's still looping, but differently:

2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb) n
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb) gcore
warning: target file /proc/12265/cmdline contained unexpected null
characters
warning: Memory read failed for corefile section, 12288 bytes at
0x7fff17ff3000.
Saved corefile core.12265
(gdb) p *h2s
$1 = {cs = 0x2f84190, sess = 0x819580 , h2c = 0x2f841a0, h1m
= {state = 48, flags = 0, curr_len = 38317, body_len = 103852, next = 413,
err_pos = -1, err_state = 0}, by_id = {node = {branches = {b = {0x34c0260,
  0x321d330}}, node_p = 0x0, leaf_p = 0x0, bit = 1, pfx = 47005},
key = 3}, id = 3, flags = 28675, mws = 1017461, errcode = H2_ERR_NO_ERROR,
st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf = {size = 0, area =
0x0, data = 0,
head = 0}, wait_event = {task = 0x2cd0ed0, handle = 0x3, events = 0},
recv_wait = 0x0, send_wait = 0x321d390, list = {n = 0x321d3b8, p =
0x321d3b8}, sending_list = {n = 0x3174cf8, p = 0x3174cf8}}
(gdb) p *h2s_back
$2 = {cs = 0x2f84190, sess = 0x819580 , h2c = 0x2f841a0, h1m
= {state = 48, flags = 0, curr_len = 38317, body_len = 103852, next = 413,
err_pos = -1, err_state = 0}, by_id = {node = {branches = {b = {0x34c0260,
  0x321d330}}, node_p = 0x0, leaf_p = 0x0, bit = 1, pfx = 47005},
key = 3}, id = 3, flags = 28675, mws = 1017461, errcode = H2_ERR_NO_ERROR,
st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf = {size = 0, area =
0x0, data = 0,
head = 0}, wait_event = {task = 0x2cd0ed0, handle = 0x3, events = 0},
recv_wait = 0x0, send_wait = 0x321d390, list = {n = 0x321d3b8, p =
0x321d3b8}, sending_list = {n = 0x3174cf8, p = 0x3174cf8}}
(gdb) p *h2c
$3 = {conn = 0x2d10700, st0 = H2_CS_FRAME_H, errcode = H2_ERR_NO_ERROR,
flags = 0, streams_limit = 100, max_id = 9, rcvd_c = 0, rcvd_s = 0, ddht =
0x2f311a0, dbuf = {size = 0, area = 0x0, data = 0, head = 0}, dsi = 3, dfl
= 4,
  dft = 8 '\b', dff = 0 '\000', dpl = 0 '\000', last_sid = -1, mbuf = {size
= 16384, area = 0x34f0a10 "", 

Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-10 Thread Maciej Zdeb
I've just sent some additional data to Willy. :)

Sure, I'll test your patch!

Fri, 10 May 2019 at 15:11, Olivier Houchard wrote:

> Hi Maciej,
>
> On Thu, May 09, 2019 at 07:25:54PM +0200, Maciej Zdeb wrote:
> > Hi again,
> >
> > I have bad news, HAProxy 1.9.7-35b44da still looping :/
> >
> > gdb session:
> > h2_process_mux (h2c=0x1432420) at src/mux_h2.c:2609
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb) n
> > 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> > H2_CF_MUX_BLOCK_ANY)
> > (gdb)
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb)
> > 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> > H2_CF_MUX_BLOCK_ANY)
> > (gdb)
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb)
> > 2613if (!LIST_ISEMPTY(>sending_list))
> > (gdb)
> > 2619if (!h2s->send_wait) {
> > (gdb)
> > 2620LIST_DEL_INIT(>list);
> > (gdb)
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb)
> > 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> > H2_CF_MUX_BLOCK_ANY)
> > (gdb)
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb)
> > 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> > H2_CF_MUX_BLOCK_ANY)
> > (gdb)
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb)
> > 2613if (!LIST_ISEMPTY(>sending_list))
> > (gdb)
> > 2619if (!h2s->send_wait) {
> > (gdb)
> > 2620LIST_DEL_INIT(>list);
> > (gdb)
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb)
> > 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> > H2_CF_MUX_BLOCK_ANY)
> > (gdb)
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb)
> > 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> > H2_CF_MUX_BLOCK_ANY)
> > (gdb)
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb)
> > 2613if (!LIST_ISEMPTY(>sending_list))
> > (gdb)
> > 2619if (!h2s->send_wait) {
> > (gdb)
> > 2620LIST_DEL_INIT(>list);
> > (gdb)
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb) p *h2s
> > $1 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420,
> h1m
> > = {state = H1_MSG_DONE, flags = 29, curr_len = 0, body_len = 111976,
> next =
> > 411, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
> > {0x13dcf50,
> >   0x15b3120}}, node_p = 0x125ab90, leaf_p = 0x15b3121, bit = 1,
> pfx
> > = 0}, key = 11}, id = 11, flags = 28675, mws = 977198, errcode =
> > H2_ERR_NO_ERROR, st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf =
> > {size = 0, area = 0x0,
> > data = 0, head = 0}, wait_event = {task = 0x15077a0, handle = 0x0,
> > events = 0}, recv_wait = 0x0, send_wait = 0x0, list = {n = 0x15b31a8, p =
> > 0x15b31a8}, sending_list = {n = 0x15b31b8, p = 0x15b31b8}}
> > (gdb) p *h2s_back
> > $2 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420,
> h1m
> > = {state = H1_MSG_DONE, flags = 29, curr_len = 0, body_len = 111976,
> next =
> > 411, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
> > {0x13dcf50,
> >   0x15b3120}}, node_p = 0x125ab90, leaf_p = 0x15b3121, bit = 1,
> pfx
> > = 0}, key = 11}, id = 11, flags = 28675, mws = 977198, errcode =
> > H2_ERR_NO_ERROR, st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf =
> > {size = 0, area = 0x0,
> > data = 0, head = 0}, wait_event = {task = 0x15077a0, handle = 0x0,
> > events = 0}, recv_wait = 0x0, send_wait = 0x0, list = {n = 0x15b31a8, p =
> > 0x15b31a8}, sending_list = {n = 0x15b31b8, p = 0x15b31b8}}
> > (gdb) p *h2c
> > $3 = {conn = 0x17e3310, st0 = H2_CS_FRAME_H, errcode = H2_ERR_NO_ERROR,
> > flags = 0, streams_limit = 100, max_id = 13, rcvd_c = 0, rcvd_s = 0,
> ddht =
> > 0x1e99a40, dbuf = {size = 0, area = 0x0, data = 0, head = 0}, dsi = 13,
> dfl
> > = 4,
> >   dft = 8 '\b', dff = 0 '\000', dpl = 0 '\000', last_sid = -1, mbuf =
> {size
> > = 16384, area = 0x1e573a0 "", data = 13700, head = 0}, msi = -1, mfl = 0,
> > mft = 0 '\000', mff = 0 '\000', miw = 65535, mws = 10159243, mfs = 16384,
> >   timeout = 2, shut_timeout = 2, nb_streams = 2, nb_cs = 3,
> > nb_reserved = 0, stream_cnt = 7, proxy = 0xb85fc0, task = 0x126aa30,
> > streams_by_id = {b = {0x125ab91, 0x0}}, send_list = {n = 0x15b31a8, p =
> > 0x125ac18}, fctl_list = {
> > n = 0x14324f8, p = 0x14324f8}, sending_list = {n = 0x1432508, p =
> > 0x1432508}, buf_wait = {target = 0x0, wakeup_cb = 0x0, list = {n =
> > 0x1432528, p = 0x1432528}}, wait_event = {task = 0x1420fa0, handle = 0x0,
> > events = 1}}
> > (gdb) p list
> > $4 = (int *) 0x0
> > (gdb) n
> > 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> > H2_CF_MUX_BLOCK_ANY)
> > (gdb) n
> > 2609list_for_each_entry_safe(h2s, 

Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-10 Thread Olivier Houchard
Hi Maciej,

On Thu, May 09, 2019 at 07:25:54PM +0200, Maciej Zdeb wrote:
> Hi again,
> 
> I have bad news, HAProxy 1.9.7-35b44da still looping :/
> 
> gdb session:
> h2_process_mux (h2c=0x1432420) at src/mux_h2.c:2609
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb) n
> 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb)
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb)
> 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb)
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb)
> 2613if (!LIST_ISEMPTY(>sending_list))
> (gdb)
> 2619if (!h2s->send_wait) {
> (gdb)
> 2620LIST_DEL_INIT(>list);
> (gdb)
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb)
> 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb)
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb)
> 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb)
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb)
> 2613if (!LIST_ISEMPTY(>sending_list))
> (gdb)
> 2619if (!h2s->send_wait) {
> (gdb)
> 2620LIST_DEL_INIT(>list);
> (gdb)
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb)
> 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb)
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb)
> 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb)
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb)
> 2613if (!LIST_ISEMPTY(>sending_list))
> (gdb)
> 2619if (!h2s->send_wait) {
> (gdb)
> 2620LIST_DEL_INIT(>list);
> (gdb)
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb) p *h2s
> $1 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420, h1m
> = {state = H1_MSG_DONE, flags = 29, curr_len = 0, body_len = 111976, next =
> 411, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
> {0x13dcf50,
>   0x15b3120}}, node_p = 0x125ab90, leaf_p = 0x15b3121, bit = 1, pfx
> = 0}, key = 11}, id = 11, flags = 28675, mws = 977198, errcode =
> H2_ERR_NO_ERROR, st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf =
> {size = 0, area = 0x0,
> data = 0, head = 0}, wait_event = {task = 0x15077a0, handle = 0x0,
> events = 0}, recv_wait = 0x0, send_wait = 0x0, list = {n = 0x15b31a8, p =
> 0x15b31a8}, sending_list = {n = 0x15b31b8, p = 0x15b31b8}}
> (gdb) p *h2s_back
> $2 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420, h1m
> = {state = H1_MSG_DONE, flags = 29, curr_len = 0, body_len = 111976, next =
> 411, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
> {0x13dcf50,
>   0x15b3120}}, node_p = 0x125ab90, leaf_p = 0x15b3121, bit = 1, pfx
> = 0}, key = 11}, id = 11, flags = 28675, mws = 977198, errcode =
> H2_ERR_NO_ERROR, st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf =
> {size = 0, area = 0x0,
> data = 0, head = 0}, wait_event = {task = 0x15077a0, handle = 0x0,
> events = 0}, recv_wait = 0x0, send_wait = 0x0, list = {n = 0x15b31a8, p =
> 0x15b31a8}, sending_list = {n = 0x15b31b8, p = 0x15b31b8}}
> (gdb) p *h2c
> $3 = {conn = 0x17e3310, st0 = H2_CS_FRAME_H, errcode = H2_ERR_NO_ERROR,
> flags = 0, streams_limit = 100, max_id = 13, rcvd_c = 0, rcvd_s = 0, ddht =
> 0x1e99a40, dbuf = {size = 0, area = 0x0, data = 0, head = 0}, dsi = 13, dfl
> = 4,
>   dft = 8 '\b', dff = 0 '\000', dpl = 0 '\000', last_sid = -1, mbuf = {size
> = 16384, area = 0x1e573a0 "", data = 13700, head = 0}, msi = -1, mfl = 0,
> mft = 0 '\000', mff = 0 '\000', miw = 65535, mws = 10159243, mfs = 16384,
>   timeout = 2, shut_timeout = 2, nb_streams = 2, nb_cs = 3,
> nb_reserved = 0, stream_cnt = 7, proxy = 0xb85fc0, task = 0x126aa30,
> streams_by_id = {b = {0x125ab91, 0x0}}, send_list = {n = 0x15b31a8, p =
> 0x125ac18}, fctl_list = {
> n = 0x14324f8, p = 0x14324f8}, sending_list = {n = 0x1432508, p =
> 0x1432508}, buf_wait = {target = 0x0, wakeup_cb = 0x0, list = {n =
> 0x1432528, p = 0x1432528}}, wait_event = {task = 0x1420fa0, handle = 0x0,
> events = 1}}
> (gdb) p list
> $4 = (int *) 0x0
> (gdb) n
> 2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb) n
> 2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
> (gdb) p *h2s
> $5 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420, h1m
> = {state = H1_MSG_DONE, flags = 29, curr_len = 0, body_len = 111976, next =
> 411, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
> {0x13dcf50,
>   0x15b3120}}, node_p = 0x125ab90, leaf_p = 0x15b3121, bit = 1, pfx
> = 0}, key = 11}, id = 11, flags = 28675, 

Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-10 Thread Maciej Zdeb
I'm getting old... I forgot to dump the core :( and I've already killed
the process. Sorry, the issue will have to reoccur and I can't say how long
that may take.

As soon as I get a core dump I'll get back to you.

Fri, 10 May 2019 at 10:35, Willy Tarreau wrote:

> Hi Maciej,
>
> On Thu, May 09, 2019 at 07:25:54PM +0200, Maciej Zdeb wrote:
> > Hi again,
> >
> > I have bad news, HAProxy 1.9.7-35b44da still looping :/
>
> Well, it's getting really annoying. Something's definitely wrong in
> this list and I can't figure what.
>
> > 2609list_for_each_entry_safe(h2s, h2s_back, >send_list,
> list) {
> > (gdb) p *h2s
> > $1 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420,
> h1m
> ^^^
>
> Seeing things like the above make me doubt about the list's integrity, thus
> it could again hold an element that was already reused somewhere else.
> Could
> it be possible for you to share your unstripped executable, a core dump and
> your config ? (not to the list! just send a private link to Olivier or me).
>
> At this point either we find what's happening or we'll have to issue 1.9.8
> with this bug still alive, which doesn't make me feel comfortable to say
> the least :-/
>
> Willy
>


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-10 Thread Willy Tarreau
Hi Maciej,

On Thu, May 09, 2019 at 07:25:54PM +0200, Maciej Zdeb wrote:
> Hi again,
> 
> I have bad news, HAProxy 1.9.7-35b44da still looping :/

Well, it's getting really annoying. Something's definitely wrong in
this list and I can't figure what.

> 2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
> (gdb) p *h2s
> $1 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420, h1m
^^^

Seeing things like the above makes me doubt the list's integrity; it could
again hold an element that was already reused somewhere else. Could it be
possible for you to share your unstripped executable, a core dump and
your config? (not to the list! just send a private link to Olivier or me).

At this point either we find what's happening or we'll have to issue 1.9.8
with this bug still alive, which doesn't make me feel comfortable to say 
the least :-/

Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-09 Thread Maciej Zdeb
Hi again,

I have bad news, HAProxy 1.9.7-35b44da still looping :/

gdb session:
h2_process_mux (h2c=0x1432420) at src/mux_h2.c:2609
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb) n
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb)
2619            if (!h2s->send_wait) {
(gdb)
2620                    LIST_DEL_INIT(&h2s->list);
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb)
2619            if (!h2s->send_wait) {
(gdb)
2620                    LIST_DEL_INIT(&h2s->list);
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2610            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb)
2613            if (!LIST_ISEMPTY(&h2s->sending_list))
(gdb)
2619            if (!h2s->send_wait) {
(gdb)
2620                    LIST_DEL_INIT(&h2s->list);
(gdb)
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb) p *h2s
$1 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420, h1m
= {state = H1_MSG_DONE, flags = 29, curr_len = 0, body_len = 111976, next =
411, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
{0x13dcf50,
  0x15b3120}}, node_p = 0x125ab90, leaf_p = 0x15b3121, bit = 1, pfx
= 0}, key = 11}, id = 11, flags = 28675, mws = 977198, errcode =
H2_ERR_NO_ERROR, st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf =
{size = 0, area = 0x0,
data = 0, head = 0}, wait_event = {task = 0x15077a0, handle = 0x0,
events = 0}, recv_wait = 0x0, send_wait = 0x0, list = {n = 0x15b31a8, p =
0x15b31a8}, sending_list = {n = 0x15b31b8, p = 0x15b31b8}}
(gdb) p *h2s_back
$2 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420, h1m
= {state = H1_MSG_DONE, flags = 29, curr_len = 0, body_len = 111976, next =
411, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
{0x13dcf50,
  0x15b3120}}, node_p = 0x125ab90, leaf_p = 0x15b3121, bit = 1, pfx
= 0}, key = 11}, id = 11, flags = 28675, mws = 977198, errcode =
H2_ERR_NO_ERROR, st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf =
{size = 0, area = 0x0,
data = 0, head = 0}, wait_event = {task = 0x15077a0, handle = 0x0,
events = 0}, recv_wait = 0x0, send_wait = 0x0, list = {n = 0x15b31a8, p =
0x15b31a8}, sending_list = {n = 0x15b31b8, p = 0x15b31b8}}
(gdb) p *h2c
$3 = {conn = 0x17e3310, st0 = H2_CS_FRAME_H, errcode = H2_ERR_NO_ERROR,
flags = 0, streams_limit = 100, max_id = 13, rcvd_c = 0, rcvd_s = 0, ddht =
0x1e99a40, dbuf = {size = 0, area = 0x0, data = 0, head = 0}, dsi = 13, dfl
= 4,
  dft = 8 '\b', dff = 0 '\000', dpl = 0 '\000', last_sid = -1, mbuf = {size
= 16384, area = 0x1e573a0 "", data = 13700, head = 0}, msi = -1, mfl = 0,
mft = 0 '\000', mff = 0 '\000', miw = 65535, mws = 10159243, mfs = 16384,
  timeout = 2, shut_timeout = 2, nb_streams = 2, nb_cs = 3,
nb_reserved = 0, stream_cnt = 7, proxy = 0xb85fc0, task = 0x126aa30,
streams_by_id = {b = {0x125ab91, 0x0}}, send_list = {n = 0x15b31a8, p =
0x125ac18}, fctl_list = {
n = 0x14324f8, p = 0x14324f8}, sending_list = {n = 0x1432508, p =
0x1432508}, buf_wait = {target = 0x0, wakeup_cb = 0x0, list = {n =
0x1432528, p = 0x1432528}}, wait_event = {task = 0x1420fa0, handle = 0x0,
events = 1}}
(gdb) p list
$4 = (int *) 0x0
(gdb) n
2610if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
H2_CF_MUX_BLOCK_ANY)
(gdb) n
2609list_for_each_entry_safe(h2s, h2s_back, >send_list, list) {
(gdb) p *h2s
$5 = {cs = 0x1499030, sess = 0x819580 , h2c = 0x1432420, h1m
= {state = H1_MSG_DONE, flags = 29, curr_len = 0, body_len = 111976, next =
411, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
{0x13dcf50,
  0x15b3120}}, node_p = 0x125ab90, leaf_p = 0x15b3121, bit = 1, pfx
= 0}, key = 11}, id = 11, flags = 28675, mws = 977198, errcode =
H2_ERR_NO_ERROR, st = H2_SS_CLOSED, status = 200, body_len = 0, rxbuf =
{size = 0, area = 0x0,
data = 0, head = 0}, wait_event = {task = 0x15077a0, handle = 0x0,
events = 0}, recv_wait = 0x0, send_wait = 0x0, list = {n = 0x15b31a8, p =
0x15b31a8}, sending_list = {n = 

Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-09 Thread Maciej Zdeb
I'm happy to help! :) Checking Olivier's patch.

Thanks!

Thu, 9 May 2019 at 14:34, Willy Tarreau wrote:

> On Thu, May 09, 2019 at 02:31:58PM +0200, Maciej Zdeb wrote:
> > What a bad luck :D I must have compiled it just before you pushed that
> > change (segfault above is from haproxy 1.9.7-9b8ac0f).
>
> Great, so there's still some hope. I really appreciate your help and
> feedback here, such issues are extremely difficult to track down and
> even to reproduce and your help is invaluable here.
>
> Cheers,
> Willy
>


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-09 Thread Willy Tarreau
On Thu, May 09, 2019 at 02:31:58PM +0200, Maciej Zdeb wrote:
> What bad luck :D I must have compiled it just before you pushed that
> change (the segfault above is from haproxy 1.9.7-9b8ac0f).

Great, so there's still some hope. I really appreciate your help and
feedback here; such issues are extremely difficult to track down and
even to reproduce, and your help is invaluable.

Cheers,
Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-09 Thread Willy Tarreau
On Thu, May 09, 2019 at 02:19:26PM +0200, Maciej Zdeb wrote:
> Hi Willy,
> 
> I've built 1.9 from head, unfortunately something is wrong, right now I've
> got segfault:
> 
> Core was generated by `/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p
> /var/run/haproxy.pid -D -sf 75'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00484ab4 in h2_process_mux (h2c=0x1cda990) at src/mux_h2.c:2609
> 2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
> (gdb) bt f
> #0  0x00484ab4 in h2_process_mux (h2c=0x1cda990) at src/mux_h2.c:2609
> h2s = 0x2
  ^

So now we have the proof that the memory is corrupted! Olivier thinks
he found a possible culprit one hour ago, which could cause a detached
stream to stay referenced, and thus to corrupt the lists. I'm unsure
to what extent it can cause the presence of a dead stream in this list,
but from the beginning we're seeing wrong stuff there in your reports
and it affects the same list :-/  For reference the commit ID in 1.9
is 07a9f0265 ("BUG/MEDIUM: h2: Make sure we set send_list to NULL in
h2_detach()."). I hope that's not what you're running, so that there's
still some hope. Care to double-check?

Thanks,
Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-09 Thread Maciej Zdeb
Hi Willy,

I've built 1.9 from HEAD; unfortunately something is wrong, and right now I've
got a segfault:

Core was generated by `/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p
/var/run/haproxy.pid -D -sf 75'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00484ab4 in h2_process_mux (h2c=0x1cda990) at src/mux_h2.c:2609
2609    list_for_each_entry_safe(h2s, h2s_back, &h2c->send_list, list) {
(gdb) bt f
#0  0x00484ab4 in h2_process_mux (h2c=0x1cda990) at src/mux_h2.c:2609
h2s = 0x2
h2s_back = 
#1  h2_send (h2c=0x1cda990) at src/mux_h2.c:2747
flags = 
conn = 0x2d12190
done = 0
sent = 0
#2  0x0048d488 in pool_free (ptr=0x1cdaa68, pool=0x0) at
include/common/memory.h:327
No locals.
#3  __b_drop (buf=) at include/common/buffer.h:112
No locals.
#4  b_drop (buf=0x1cda9b8) at include/common/buffer.h:119
No locals.
#5  b_free (buf=0x1cda9b8) at include/common/buffer.h:125
No locals.
#6  h2_release_buf (h2c=0x1cda990, bptr=0x1cda9b8) at src/mux_h2.c:407
bptr = 0x1cda9b8
#7  h2_process (h2c=0x1cda990) at src/mux_h2.c:2889
conn = 0xfffdd7b8
#8  0x004bf425 in run_thread_poll_loop (data=0x0) at
src/haproxy.c:2710
ptif = 
ptdf = 
start_lock = 0
#9  0x0001 in ?? ()
No symbol table info available.
#10 0x0041fb86 in main (argc=, argv=0x7ffe885af938)
at src/haproxy.c:3354
tids = 0x1924900
threads = 0xfffc
i = 
old_sig = {__val = {0, 140199073187936, 140199073187072,
140199071033905, 0, 140199073173696, 1, 0, 18446603340516163585,
140199073187072, 1, 140199073235400, 0, 140199073236256, 0, 24}}
blocked_sig = {__val = {1844674406710583, 18446744073709551615
}}
err = 
retry = 
limit = {rlim_cur = 100, rlim_max = 100}
errmsg = "\000\000\000\000\000\000\000\000n\000\000\000w", '\000'
,
"p1K\243\202\177\000\000\000\000\000\000\000\000\000\000`'\\\242\202\177\000\000\030\000\000\000\000\000\000\000\200f4\001\000\000\000\000>\001\000\024\000\000\000\000
\352J\243\202\177\000\000`\221\065\001\000\000\000\000\340*(\242\202\177\000\000\370\371Z\210"
pidfd = 

Thu, 9 May 2019 at 10:51, Willy Tarreau wrote:

> Hi Maciej,
>
> I've just pushed a number of fixes into 1.9-master, including the one
> I was talking about, if you want to try again.
>
> Cheers,
> Willy
>


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-09 Thread Willy Tarreau
Hi Maciej,

I've just pushed a number of fixes into 1.9-master, including the one
I was talking about, if you want to try again.

Cheers,
Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-08 Thread Maciej Zdeb
Ok, thanks!

Wed, 8 May 2019 at 16:50, Willy Tarreau wrote:

> On Wed, May 08, 2019 at 04:48:55PM +0200, Maciej Zdeb wrote:
> > I'll gladly test Olivier patch after backporting. :)
>
> Thank you Maciej. I still have other stuff to deal with before going back
> to the pending 1.9 backports, and I'll merge it.
>
> Willy
>


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-08 Thread Maciej Zdeb
I'll gladly test Olivier patch after backporting. :)

Wed, 8 May 2019 at 15:29, Willy Tarreau wrote:

> On Wed, May 08, 2019 at 03:03:23PM +0200, Olivier Houchard wrote:
> > > > I can't seem to remember :)
> > >
> > > Given the number of bugs we've dealt with in the last few weeks, you're
> > > forgiven :-)
> > >
> >
> > I'm afraid I'm getting old :/
>
> Ah! you see how it feels!
>
> > > > I think that patch is safe to backport anyway, but I wouldn't swear
> it solves
> > > > Maciej's issue.
> > >
> > > OK. I'll take it then so that we can study the situation under better
> > > conditions if the problem happens again. It may be possible that we've
> > > also overlooked other situations where this infinite loop could happen
> > > with SUB_CALL_UNSUBSCRIBE and that were addressed by your patch.
> > >
> >
> > Yes it is certainly possible it helps, I just can affirm it with
> confidence.
>
> This is why I prefer to take it. The amount of changes that went into 1.9
> was significant enough to get such hard-to-debug issues. As long as fixes
> don't harm, I prefer to take them than having to fight one full week with
> a bug just to discover that it was already fixed upstream.
>
> Thanks,
> Willy
>


Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-08 Thread Willy Tarreau
On Wed, May 08, 2019 at 03:03:23PM +0200, Olivier Houchard wrote:
> > > I can't seem to remember :)
> > 
> > Given the number of bugs we've dealt with in the last few weeks, you're
> > forgiven :-)
> > 
> 
> I'm afraid I'm getting old :/

Ah! you see how it feels!

> > > I think that patch is safe to backport anyway, but I wouldn't swear it 
> > > solves
> > > Maciej's issue.
> > 
> > OK. I'll take it then so that we can study the situation under better
> > conditions if the problem happens again. It may be possible that we've
> > also overlooked other situations where this infinite loop could happen
> > with SUB_CALL_UNSUBSCRIBE and that were addressed by your patch.
> > 
> 
> Yes it is certainly possible it helps, I just can't affirm it with confidence.

This is why I prefer to take it. The amount of changes that went into 1.9
was significant enough to produce such hard-to-debug issues. As long as fixes
don't harm, I prefer to take them rather than fight for one full week with
a bug just to discover that it was already fixed upstream.

Thanks,
Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-08 Thread Olivier Houchard
On Wed, May 08, 2019 at 02:30:07PM +0200, Willy Tarreau wrote:
> On Wed, May 08, 2019 at 01:56:05PM +0200, Olivier Houchard wrote:
> > > > One of processes stuck in infinite loop, admin socket is not responsive 
> > > > so
> > > > I've got information only from gdb:
> > > > 
> > > > 0x00484ab8 in h2_process_mux (h2c=0x2e8ff30) at 
> > > > src/mux_h2.c:2589
> > > > 2589if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
> > > > (gdb) n
> > > (...)
> > > 
> > > CCing Olivier. Olivier, I'm wondering if this is not directly related to
> > > what you addressed with this fix merged in 2.0 but not backported :
> > > 
> > >   998410a ("BUG/MEDIUM: h2: Revamp the way send subscriptions works.")
> > > 
> > > From what I'm seeing there's no error, the stream is in the sending list,
> > > there's no blocking flag, well, everything looks OK, but we're looping
> > > on SUB_CALL_UNSUBSCRIBE which apprently should not if I understand it
> > > right. Do you think we should backport this patch ?
> > > 
> > > Remaining of the trace below for reference.
> > > 
> > > THanks,
> > > Willy
> > > 
> > 
> > I can't seem to remember :)
> 
> Given the number of bugs we've dealt with in the last few weeks, you're
> forgiven :-)
> 

I'm afraid I'm getting old :/

> > I remember a bug involving the h2s being added to the send_list twice, and
> > thus leading to an infinite loop when parsing the send_list, but it was
> > fixed by 9a0f559676c4d309edbe42ba33197e7dd8935f1c, and backported.
> > For me, this one was mostly to make me happier about the code, and stop
> > abusing SUB_CALL_UNSUBSCRIBE, and if I didn't mention it should be 
> > backported
> > it's probably I considered the code in 1.9 was OK (or I just forgot), but
> > it's still flagged as "BUG" too (but that could be me having "BUG/MEDIUM"
> > hardcoded in my finger).
> > I think that patch is safe to backport anyway, but I wouldn't swear it 
> > solves
> > Maciej's issue.
> 
> OK. I'll take it then so that we can study the situation under better
> conditions if the problem happens again. It may be possible that we've
> also overlooked other situations where this infinite loop could happen
> with SUB_CALL_UNSUBSCRIBE and that were addressed by your patch.
> 

Yes it is certainly possible it helps, I just can't affirm it with confidence.

Regards,

Olivier



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-08 Thread Willy Tarreau
On Wed, May 08, 2019 at 01:56:05PM +0200, Olivier Houchard wrote:
> > > One of processes stuck in infinite loop, admin socket is not responsive so
> > > I've got information only from gdb:
> > > 
> > > 0x00484ab8 in h2_process_mux (h2c=0x2e8ff30) at src/mux_h2.c:2589
> > > 2589if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
> > > (gdb) n
> > (...)
> > 
> > CCing Olivier. Olivier, I'm wondering if this is not directly related to
> > what you addressed with this fix merged in 2.0 but not backported :
> > 
> >   998410a ("BUG/MEDIUM: h2: Revamp the way send subscriptions works.")
> > 
> > From what I'm seeing there's no error, the stream is in the sending list,
> > there's no blocking flag, well, everything looks OK, but we're looping
> > on SUB_CALL_UNSUBSCRIBE which apprently should not if I understand it
> > right. Do you think we should backport this patch ?
> > 
> > Remaining of the trace below for reference.
> > 
> > THanks,
> > Willy
> > 
> 
> I can't seem to remember :)

Given the number of bugs we've dealt with in the last few weeks, you're
forgiven :-)

> I remember a bug involving the h2s being added to the send_list twice, and
> thus leading to an infinite loop when parsing the send_list, but it was
> fixed by 9a0f559676c4d309edbe42ba33197e7dd8935f1c, and backported.
> For me, this one was mostly to make me happier about the code, and stop
> abusing SUB_CALL_UNSUBSCRIBE, and if I didn't mention it should be backported
> it's probably I considered the code in 1.9 was OK (or I just forgot), but
> it's still flagged as "BUG" too (but that could be me having "BUG/MEDIUM"
> hardcoded in my finger).
> I think that patch is safe to backport anyway, but I wouldn't swear it solves
> Maciej's issue.

OK. I'll take it then so that we can study the situation under better
conditions if the problem happens again. It may be possible that we've
also overlooked other situations where this infinite loop could happen
with SUB_CALL_UNSUBSCRIBE and that were addressed by your patch.

Thanks,
Willy



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-08 Thread Olivier Houchard
Hi,

On Wed, May 08, 2019 at 07:11:27AM +0200, Willy Tarreau wrote:
> Hi Maciej,
> 
> On Tue, May 07, 2019 at 07:08:47PM +0200, Maciej Zdeb wrote:
> > Hi,
> > 
> > I've got another bug with 100% CPU on HAProxy process, it is built from
> > HEAD of 1.9 branch.
> > 
> > One of processes stuck in infinite loop, admin socket is not responsive so
> > I've got information only from gdb:
> > 
> > 0x00484ab8 in h2_process_mux (h2c=0x2e8ff30) at src/mux_h2.c:2589
> > 2589if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
> > (gdb) n
> (...)
> 
> CCing Olivier. Olivier, I'm wondering if this is not directly related to
> what you addressed with this fix merged in 2.0 but not backported :
> 
>   998410a ("BUG/MEDIUM: h2: Revamp the way send subscriptions works.")
> 
> From what I'm seeing there's no error, the stream is in the sending list,
> there's no blocking flag, well, everything looks OK, but we're looping
> on SUB_CALL_UNSUBSCRIBE which apprently should not if I understand it
> right. Do you think we should backport this patch ?
> 
> Remaining of the trace below for reference.
> 
> THanks,
> Willy
> 

I can't seem to remember :)
I remember a bug involving the h2s being added to the send_list twice, and
thus leading to an infinite loop when parsing the send_list, but it was
fixed by 9a0f559676c4d309edbe42ba33197e7dd8935f1c, and backported.
For me, this one was mostly to make me happier about the code and to stop
abusing SUB_CALL_UNSUBSCRIBE, and if I didn't mention it should be backported
it's probably because I considered the code in 1.9 to be OK (or I just forgot),
but it's still flagged as "BUG" too (though that could be me having
"BUG/MEDIUM" hardcoded in my fingers).
I think that patch is safe to backport anyway, but I wouldn't swear it solves
Maciej's issue.

Regards,

Olivier



Re: [1.9 HEAD] HAProxy using 100% CPU

2019-05-07 Thread Willy Tarreau
Hi Maciej,

On Tue, May 07, 2019 at 07:08:47PM +0200, Maciej Zdeb wrote:
> Hi,
> 
> I've got another bug with 100% CPU on HAProxy process, it is built from
> HEAD of 1.9 branch.
> 
> One of processes stuck in infinite loop, admin socket is not responsive so
> I've got information only from gdb:
> 
> 0x00484ab8 in h2_process_mux (h2c=0x2e8ff30) at src/mux_h2.c:2589
> 2589if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
> (gdb) n
(...)

CCing Olivier. Olivier, I'm wondering if this is not directly related to
what you addressed with this fix merged in 2.0 but not backported :

  998410a ("BUG/MEDIUM: h2: Revamp the way send subscriptions works.")

From what I'm seeing there's no error, the stream is in the sending list,
there's no blocking flag, well, everything looks OK, but we're looping
on SUB_CALL_UNSUBSCRIBE, which apparently should not happen if I understand
it right. Do you think we should backport this patch?

The remainder of the trace is below for reference.

Thanks,
Willy

---

> 2587if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb)
> 2586list_for_each_entry(h2s, >send_list, list) {
> (gdb)
> 2587if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb)
> 2589if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
> (gdb)
> 2587if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb)
> 2586list_for_each_entry(h2s, >send_list, list) {
> (gdb)
> 2587if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb) p h2c
> $1 = (struct h2c *) 0x2e8ff30
> (gdb) p *h2c
> $2 = {conn = 0x2b4c900, st0 = H2_CS_FRAME_H, errcode = H2_ERR_NO_ERROR,
> flags = 0, streams_limit = 100, max_id = 149, rcvd_c = 0, rcvd_s = 0, ddht
> = 0x34099c0, dbuf = {size = 0, area = 0x0, data = 0, head = 0}, dsi = 149,
> dfl = 0,
>   dft = 1 '\001', dff = 37 '%', dpl = 0 '\000', last_sid = -1, mbuf = {size
> = 16384, area = 0x2ec3d50 "", data = 0, head = 0}, msi = -1, mfl = 0, mft =
> 0 '\000', mff = 0 '\000', miw = 6291456, mws = 15443076, mfs = 16384,
>   timeout = 2, shut_timeout = 2, nb_streams = 53, nb_cs = 53,
> nb_reserved = 0, stream_cnt = 75, proxy = 0x219ffe0, task = 0x34081d0,
> streams_by_id = {b = {0x2adc2e1, 0x0}}, send_list = {n = 0x2ac5b38, p =
> 0x3093c18},
>   fctl_list = {n = 0x2e90008, p = 0x2e90008}, sending_list = {n =
> 0x2ac5b48, p = 0x2ec2798}, buf_wait = {target = 0x0, wakeup_cb = 0x0, list
> = {n = 0x2e90038, p = 0x2e90038}}, wait_event = {task = 0x2b2ae90, handle =
> 0x0, events = 1}}
> (gdb) n
> 2589if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
> (gdb) p *h2s
> $3 = {cs = 0x297bdb0, sess = 0x819580 , h2c = 0x2e8ff30, h1m
> = {state = H1_MSG_RPBEFORE, flags = 12, curr_len = 0, body_len = 0, next =
> 0, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
> {0x2a72250,
>   0x2961c30}}, node_p = 0x2a72251, leaf_p = 0x2961c31, bit = 1, pfx
> = 49017}, key = 103}, id = 103, flags = 16385, mws = 6291456, errcode =
> H2_ERR_NO_ERROR, st = H2_SS_HREM, status = 0, body_len = 0, rxbuf = {size =
> 0,
> area = 0x0, data = 0, head = 0}, wait_event = {task = 0x2fb3ee0, handle
> = 0x0, events = 0}, recv_wait = 0x2b8d700, send_wait = 0x2b8d700, list = {n
> = 0x3130108, p = 0x2b02238}, sending_list = {n = 0x3130118, p = 0x2b02248}}
> (gdb) n
> 2587if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb) p *h2c
> $4 = {conn = 0x2b4c900, st0 = H2_CS_FRAME_H, errcode = H2_ERR_NO_ERROR,
> flags = 0, streams_limit = 100, max_id = 149, rcvd_c = 0, rcvd_s = 0, ddht
> = 0x34099c0, dbuf = {size = 0, area = 0x0, data = 0, head = 0}, dsi = 149,
> dfl = 0,
>   dft = 1 '\001', dff = 37 '%', dpl = 0 '\000', last_sid = -1, mbuf = {size
> = 16384, area = 0x2ec3d50 "", data = 0, head = 0}, msi = -1, mfl = 0, mft =
> 0 '\000', mff = 0 '\000', miw = 6291456, mws = 15443076, mfs = 16384,
>   timeout = 2, shut_timeout = 2, nb_streams = 53, nb_cs = 53,
> nb_reserved = 0, stream_cnt = 75, proxy = 0x219ffe0, task = 0x34081d0,
> streams_by_id = {b = {0x2adc2e1, 0x0}}, send_list = {n = 0x2ac5b38, p =
> 0x3093c18},
>   fctl_list = {n = 0x2e90008, p = 0x2e90008}, sending_list = {n =
> 0x2ac5b48, p = 0x2ec2798}, buf_wait = {target = 0x0, wakeup_cb = 0x0, list
> = {n = 0x2e90038, p = 0x2e90038}}, wait_event = {task = 0x2b2ae90, handle =
> 0x0, events = 1}}
> (gdb) n
> 2586list_for_each_entry(h2s, >send_list, list) {
> (gdb) n
> 2587if (h2c->st0 >= H2_CS_ERROR || h2c->flags &
> H2_CF_MUX_BLOCK_ANY)
> (gdb) n
> 2589if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
> 
> 
> HAProxy info:
> HA-Proxy version 1.9.7-207ba5a 2019/05/05 - https://haproxy.org/
> Build options :
>   TARGET  = linux2628
>   CPU = generic
>   CC  = gcc
>   CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
> -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter
> 

[1.9 HEAD] HAProxy using 100% CPU

2019-05-07 Thread Maciej Zdeb
Hi,

I've got another bug with 100% CPU on a HAProxy process; it is built from
the HEAD of the 1.9 branch.

One of the processes is stuck in an infinite loop, and the admin socket is
not responsive, so the only information I've got is from gdb:

0x00484ab8 in h2_process_mux (h2c=0x2e8ff30) at src/mux_h2.c:2589
2589            if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
(gdb) n
2587            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2586    list_for_each_entry(h2s, &h2c->send_list, list) {
(gdb)
2587            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2589            if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
(gdb)
2587            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb)
2586    list_for_each_entry(h2s, &h2c->send_list, list) {
(gdb)
2587            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb) p h2c
$1 = (struct h2c *) 0x2e8ff30
(gdb) p *h2c
$2 = {conn = 0x2b4c900, st0 = H2_CS_FRAME_H, errcode = H2_ERR_NO_ERROR,
flags = 0, streams_limit = 100, max_id = 149, rcvd_c = 0, rcvd_s = 0, ddht
= 0x34099c0, dbuf = {size = 0, area = 0x0, data = 0, head = 0}, dsi = 149,
dfl = 0,
  dft = 1 '\001', dff = 37 '%', dpl = 0 '\000', last_sid = -1, mbuf = {size
= 16384, area = 0x2ec3d50 "", data = 0, head = 0}, msi = -1, mfl = 0, mft =
0 '\000', mff = 0 '\000', miw = 6291456, mws = 15443076, mfs = 16384,
  timeout = 2, shut_timeout = 2, nb_streams = 53, nb_cs = 53,
nb_reserved = 0, stream_cnt = 75, proxy = 0x219ffe0, task = 0x34081d0,
streams_by_id = {b = {0x2adc2e1, 0x0}}, send_list = {n = 0x2ac5b38, p =
0x3093c18},
  fctl_list = {n = 0x2e90008, p = 0x2e90008}, sending_list = {n =
0x2ac5b48, p = 0x2ec2798}, buf_wait = {target = 0x0, wakeup_cb = 0x0, list
= {n = 0x2e90038, p = 0x2e90038}}, wait_event = {task = 0x2b2ae90, handle =
0x0, events = 1}}
(gdb) n
2589            if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)
(gdb) p *h2s
$3 = {cs = 0x297bdb0, sess = 0x819580 , h2c = 0x2e8ff30, h1m
= {state = H1_MSG_RPBEFORE, flags = 12, curr_len = 0, body_len = 0, next =
0, err_pos = -1, err_state = 0}, by_id = {node = {branches = {b =
{0x2a72250,
  0x2961c30}}, node_p = 0x2a72251, leaf_p = 0x2961c31, bit = 1, pfx
= 49017}, key = 103}, id = 103, flags = 16385, mws = 6291456, errcode =
H2_ERR_NO_ERROR, st = H2_SS_HREM, status = 0, body_len = 0, rxbuf = {size =
0,
area = 0x0, data = 0, head = 0}, wait_event = {task = 0x2fb3ee0, handle
= 0x0, events = 0}, recv_wait = 0x2b8d700, send_wait = 0x2b8d700, list = {n
= 0x3130108, p = 0x2b02238}, sending_list = {n = 0x3130118, p = 0x2b02248}}
(gdb) n
2587            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb) p *h2c
$4 = {conn = 0x2b4c900, st0 = H2_CS_FRAME_H, errcode = H2_ERR_NO_ERROR,
flags = 0, streams_limit = 100, max_id = 149, rcvd_c = 0, rcvd_s = 0, ddht
= 0x34099c0, dbuf = {size = 0, area = 0x0, data = 0, head = 0}, dsi = 149,
dfl = 0,
  dft = 1 '\001', dff = 37 '%', dpl = 0 '\000', last_sid = -1, mbuf = {size
= 16384, area = 0x2ec3d50 "", data = 0, head = 0}, msi = -1, mfl = 0, mft =
0 '\000', mff = 0 '\000', miw = 6291456, mws = 15443076, mfs = 16384,
  timeout = 2, shut_timeout = 2, nb_streams = 53, nb_cs = 53,
nb_reserved = 0, stream_cnt = 75, proxy = 0x219ffe0, task = 0x34081d0,
streams_by_id = {b = {0x2adc2e1, 0x0}}, send_list = {n = 0x2ac5b38, p =
0x3093c18},
  fctl_list = {n = 0x2e90008, p = 0x2e90008}, sending_list = {n =
0x2ac5b48, p = 0x2ec2798}, buf_wait = {target = 0x0, wakeup_cb = 0x0, list
= {n = 0x2e90038, p = 0x2e90038}}, wait_event = {task = 0x2b2ae90, handle =
0x0, events = 1}}
(gdb) n
2586    list_for_each_entry(h2s, &h2c->send_list, list) {
(gdb) n
2587            if (h2c->st0 >= H2_CS_ERROR || h2c->flags & H2_CF_MUX_BLOCK_ANY)
(gdb) n
2589            if (h2s->send_wait->events & SUB_CALL_UNSUBSCRIBE)


HAProxy info:
HA-Proxy version 1.9.7-207ba5a 2019/05/05 - https://haproxy.org/
Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
-fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter
-Wno-old-style-declaration -Wno-ignored-qualifiers -Wno-clobbered
-Wno-missing-field-initializers -Wtype-limits -DIP_BIND_ADDRESS_NO_PORT=24
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_DL=1
USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1 USE_PCRE_JIT=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.1.1b  26 Feb 2019
Running on OpenSSL version : OpenSSL 1.1.1b  26 Feb 2019
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.5
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
IP_FREEBIND
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported :