> Date: Wed, 3 May 2017 21:05:24 +0100
> From: Stuart Henderson <s...@spacehopper.org>
>
> On 2017/05/03 15:12, Mark Kettenis wrote:
> > > Date: Wed, 3 May 2017 13:51:22 +0100
> > > From: Stuart Henderson <s...@spacehopper.org>
> > >
> > > On 2017/05/01 22:18, Mark Kettenis wrote:
> > > > > Date: Mon, 1 May 2017 20:58:29 +0100
> > > > > From: Stuart Henderson <s...@spacehopper.org>
> > > > >
> > > > > Userland is non-responsive, machine is pingable, tcp connections open
> > > > > but no banner from ssh. No failed pool requests. This kernel is from
> > > > > today's snapshot but I saw the same with one from a couple of days
> > > > > ago. Is there anything else I can get that might be useful?
> > > > >
> > > ..
> > > > > 71034 186155 65198 0 3 0x11 vp perl
> > > ..
> > > >
> > > > The diff below might fix thise. Or it might actually turn this into a
> > > > hard hang...
> > > >
> > > > Nevertheless, could you try running with it?
> > >
> > > I haven't seen this happen again with your diff, and haven't seen any
> > > hangs. Probably still too early to say for sure that it fixes things,
> > > but it looks promising so far.
> >
> > Thanks. Since Dale ok'ed it and I had been running with it for a
> > while already, I committed it last night.
> >
>
> Ha. As is traditional, not long after sending that message I've hit
> a hard lock - no DDB.

I'm sure it wouldn't have happened if I hadn't committed it ;).

Could you change the PR_NOWAIT back into PR_WAITOK and see if you can
reproduce the hang and break into ddb? Meanwhile I'll think about
what information to print once you've hit it ;).