Hi guys,
On Thu, Sep 17, 2020 at 11:05:31AM +1000, Igor Cicimov wrote:
(...)
> > Coredump fragment from thread1:
> > (gdb) bt
> > #0 0x000055cbbf6ed64b in h2s_notify_recv (h2s=0x7f65b8b55130) at
> > src/mux_h2.c:783
So the code is this one:
  777  static void __maybe_unused h2s_notify_recv(struct h2s *h2s)
  778  {
  779      struct wait_event *sw;
  780
  781      if (h2s->recv_wait) {
  782          sw = h2s->recv_wait;
  783          sw->events &= ~SUB_RETRY_RECV;
  784          tasklet_wakeup(sw->tasklet);
  785          h2s->recv_wait = NULL;
  786      }
  787  }
The trace says that sw = 0xffffffff. In all the places where
h2s->recv_wait is modified, it is set either to NULL or to a valid pointer
to some structure. We could have imagined that for whatever reason h2s is
wrong here, but this call only happens when its state is still valid, and
it is dereferenced twice before landing here, which tends to indicate that
the h2s pointer is OK. Thus the only hypothesis I have for now is memory
corruption :-/ That field would have been overwritten with (int)-1 for
whatever reason, maybe a wrong cast somewhere, but it's not as if we had
many of these.
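To illustrate what kind of write would leave exactly 0xffffffff in a 64-bit
pointer field, here is a minimal standalone sketch. It is not haproxy code:
struct demo_h2s and the three helper functions are hypothetical names made
up for the demonstration, and it assumes a 64-bit little-endian target such
as x86-64.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of struct h2s. */
struct demo_h2s {
    void *recv_wait;    /* NULL or a valid pointer in correct code */
};

/* Case 1: a stray 32-bit store of -1 lands on a NULL pointer field.
 * Only the first 4 bytes of the field are overwritten. */
uintptr_t stray_int_store(void)
{
    struct demo_h2s s = { .recv_wait = NULL };
    int32_t v = -1;

    memcpy(&s.recv_wait, &v, sizeof(v));
    return (uintptr_t)s.recv_wait;
}

/* Case 2: a signed int -1 converted to pointer width is sign-extended. */
uintptr_t signed_cast(void)
{
    int v = -1;

    return (uintptr_t)(intptr_t)v;
}

/* Case 3: the same bit pattern held in an unsigned int is zero-extended. */
uintptr_t unsigned_cast(void)
{
    unsigned int v = (unsigned int)-1;

    return (uintptr_t)v;
}
```

Under those assumptions, stray_int_store() and unsigned_cast() should both
return 0xffffffff, which matches the sw value in the trace, while
signed_cast() should return 0xffffffffffffffff. So a plain sign-extending
cast of (int)-1 to a pointer would not produce the observed value by
itself; a 4-byte store on top of a NULL field, or a value passing through
an unsigned 32-bit type, would.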
> I'm not one of the devs but obviously many of us using v2.0 will be
> interested in the answer. Assuming you do not install from packages can you
> please provide some more background on how you produce the binary, like if
> you compile then what OS and kernel is this compiled on and what OS and
> kernel this crashes on? Again if compiled any other custom compiled
> packages in use, like OpenSSL, lua etc, you might be using or have compiled
> haproxy against etc.?
>
> Also if this is a bug and you have hit some corner case with your config
> (many are using 2.0 but we have not seen crashes) you should provide a
> stripped down version (not too stripped though just the sensitive data) of
> your config too.
I agree with Igor here, any info to try to narrow down a reproducer, both
in terms of config and operations, would be tremendously helpful!
Thanks,
Willy