On Tue, 21 Jul 2020 19:23:44 +0100 Julian Smith <[email protected]> wrote:
> On Mon, 20 Jul 2020 17:18:19 +0100
> Julian Smith <[email protected]> wrote:
>
> > On Mon, 20 Jul 2020 15:26:11 +0000
> > Visa Hankala <[email protected]> wrote:
> >
> > > On Mon, Jul 20, 2020 at 04:35:12AM +0000, Visa Hankala wrote:
> > > > On Sun, Jul 19, 2020 at 09:47:54PM +0100, Julian Smith wrote:
> > > >
> > > > > I've been finding egdb and gdb rather easily get stuck in an
> > > > > uninterruptible wait, e.g. when running the 'next' command
> > > > > after hitting a breakpoint.
> > > > [...]
> >
> > > > The single-thread check done by wait4() is non-interruptible.
> > > > When the debugger gets stuck, is it blocked in "suspend" state?
> > > >
> >
> > ps reports it to be in state 'D'.
> >
> > > >
> > > > However, I think there is a bug in the single-thread switch
> > > > code. It looks that ps_singlecount can be decremented too much.
> > > > This probably is a regression of making ps_singlecount unsigned
> > > > and letting single_thread_check() run without the kernel lock.
> > > >
> > > > The bug might go away if single_thread_check() made sure that
> > > > P_SUSPSINGLE is set before the thread suspends.
> > >
> > > Below is an updated patch for testing. It extends the scope of
> > > SCHED_LOCK() so that there are fewer chances of interleaving of
> > > single_thread_set() and single_thread_check().
> >
> > Many thanks for these patches. I'll try to test in the next couple
> > of days. Though the last time i built an OpenBSD kernel was well
> > over a decade ago, so it might take me a little longer.
>
> I managed to build a patched kernel, and it seems to fix the problem -
> i haven't been able to get egdb into an uninterruptable wait state.
>
> Also, i've been running the patched kernel all day now and it doesn't
> seem to be causing any problems elsewhere.

Unfortunately the same problem has just occurred again.
I've run egdb quite a few times since I updated the kernel, so the
patch has definitely reduced the problem, but it doesn't seem to have
eliminated it.

Let me know if there is anything I could do to find out more information.

Thanks,

- Jules

-- 
http://op59.net

