On Sat, Jan 07, 2023 at 02:22:01PM +0100, Willy Tarreau wrote:
> Hi Luke,
> On Sat, Jan 07, 2023 at 01:44:30PM +0100, Luke Seelenbinder wrote:
> > Hi list,
> >
> > We've been running 2.7.1 on a subset of our edge servers with QUIC + HTTP/3
> > enabled, and we're seeing routine, but infrequent (~daily), crashes (mix of
> > SIGABRT / SIGSEGV). I have coredumps and there doesn't seem to be any common
> > thread across crashes / machines, but it's possible I'm missing something.
> > Two of the coredumps show the following backtrace:
> >
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0  0x000055b0fe319ce7 in qc_release_frm (qc=0x55b101236570,
> > frm=0x7fd8201fbbf0 <main_arena+112>) at src/quic_conn.c:1569
> > 1569                pn = f->pkt->pn_node.key;
> >
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0  qc_release_frm (qc=0x5652aa588fc0, frm=0x5652aa2537d0) at
> > src/quic_conn.c:1564
> > 1564        list_for_each_entry_safe(f, tmp, &origin->reflist, ref) {
> >
> > which seem similar enough to possibly share a common cause. The other
> > crashes occur in quictls (sigabrt), htx.h (sigsegv), and ebtree.h (sigsegv).
> >
> > Are there known fixes from 2.8-dev or internal trackers that could be
> > related? I can dig deeper, but for now I'll probably disable quic since that
> > seems to be the most likely culprit.
>
> I'm seeing the following patch for QUIC, which was merged right after
> 2.7.1 was emitted and which suggests potential crashes:
>   15337fd80 ("BUG/MEDIUM: mux-quic: fix double delete from qcc.opening_list")
> So you might possibly be hitting that bug, indeed. If you're interested
> in giving 2.8-dev1 a try, it would confirm whether you're facing this
> exact issue. But at the moment we're not aware of any remaining crash-
> inducing bugs in 2.8-dev, so if it would still fail for you it would
> indicate a new unknown bug.

Luke, the crashes you reported look very similar to the ones I saw
before I introduced the fix. You should try 2.8-dev1 if you can
and report back to us on whether it solves the issue.

Thanks for your help,

-- 
Amaury Denoyelle
