On Wed, Dec 11, 2019 at 09:06:33AM +1100, matthew green wrote:

> "Andrew Doran" writes:
> > Module Name: src
> > Committed By: ad
> > Date: Mon Dec 9 21:02:10 UTC 2019
> >
> > Modified Files:
> > src/sys/kern: kern_rwlock.c
> >
> > Log Message:
> > Expunge the panicstr checks, we don't need them.
>
> can you explain why?

Sure, I have developed a bit of a feel for it after years of watching
the thing panic. The checks that ease off on the assertions once
panicstr is set are pretty good in my experience; we don't want a
panic to snowball. The ones that disable locking were of some use (at
least to me) back in 2007, before we had a decent LOCKDEBUG setup for
the newlock2 primitives. So they sprang out of development requirements
and an uncertainty about how all this new synchronization stuff was
going to behave in practice, more than out of a desire to do the right
thing.

On a uniprocessor or dual core box back then the panicstr checks didn't
really seem to have many bad effects, but with more CPUs they often seem
to make the crash much worse than it needs to be, and I keep bumping
into the effects of that. Here's a craptacular example:

http://www.netbsd.org/~ad/

That's kind of amusing to look at, and it's only my framebuffer memory,
but I worry that it could just as well be inodes or mbufs or anything
else that belongs to the user. I think that until the CPUs are locked up
and activity has stopped, the thing needs to keep working as properly as
it can.

> what sort of crash-time testing did you perform?

That the system can be debugged and reset cleanly. If code in DDB hangs
or crash dumps don't work, then that's something we should fix. I've not
run into any such problems in a long time; they seem to get fixed.

Do you have a particular concern or something else in mind?

Cheers,
Andrew