Hi Patrick,

On Tue, May 07, 2019 at 02:01:33PM -0400, Patrick Hemmer wrote:
> Just in case it's useful, we had the issue recur today. However I gleaned a
> little more information from this recurrence. Provided below are several
> outputs from a gdb `bt full`. The important bit is that in the captures, the
> last frame which doesn't change between each capture is the `si_cs_send`
> function. The last stack capture provided has the shortest stack depth of
> all the captures, and is inside `h2_snd_buf`.

Thank you. At first glance this remains similar. Christopher and I have
been studying these issues intensely these days because they have deep
roots into some design choices and tradeoffs we've had to make and that
we're relying on, and we've come to conclusions about some long term
changes to address the causes, and some fixes for 1.9 that now appear
valid. We're still carefully reviewing our changes before pushing them.
Then I think we'll emit 1.9.8 anyway since it will already fix quite a
number of issues addressed since 1.9.7, so for you it will probably be
easier to try again.

> Otherwise it's still the behavior is the same as last time with `strace`
> showing absolutely nothing, so it's still looping.

I'm not surprised. We managed to break that loop in a dirty way a first
time but it came with impacts (some random errors could be spewed
depending on the frame sizes, which is obviously not acceptable). But
yes, this loop has no way to give up. That's the second argument
convincing me of finishing the watchdog so that at least it dies when
this happens!

Expect some updates on this this week.

Cheers,
Willy