Re: svn commit: r331298 - head/sys/dev/syscons
On Fri, Mar 23, 2018 at 01:21:43AM +1100, Bruce Evans wrote: > I don't like having a whole task for this. The whole thing is just a > hack to work around some the upper layers of the tty driver and some > lower layers not having any input [escape] sequences to control the > kernel. Only the syscons and vt lower layers have such sequences > (where they are actually key combinations that are converted to control > operations instead of to input [escape] sequences). To work around for > hardware ttys, the filter for kdb sequences is abused to implement a > non-kdb sequence for rebooting. > > The tty input methods could check for kernel-control sequences and safely > signal init. This is a bit too complicated for syscons and vt since they > can more easily check for key combinations, but wouldhave to convert these > to standard sequences to get the tty layer to do the same thing. (They > have many more kernel-control key combinations and the non-kdb one for > rebooting is just a bug for them.) This is a bit complicated for hardware > tty drivers too -- some use the tty bulk-input method and this shouldn't > check for sequences, but should reduce to bcopy(). Hoever, to detect the > kdb sequences, these drivers han to check at a low level anyway. They > can be clever about this and only check for the console device[s] which are > usually only used for for input at a low rate. This is both complicated and mostly pointless. The task mechanism provides the easy solution, and more, the task mechanism was specifically designed to allow to schedule activities in the more allowing context, from a more restrictive context. I have no intent to start the self-inflicting activity to code the attempt to modify a lot of drivers to do what can be done in 10 lines of code. > Calling kern_reboot() from fast interrupt handlers is still invalid. > It is quite likely to deadlock. In particular, it should deadlock in > when kern_reboot() prints messages to the console. Most console drivers > have races instead of deadlocks by dropping their lock[s] in their > fast interrupt handler before calling the buggy alt escape function. I agree that it is mostly invalid. If SCHEDULER_STOPPED() evaluates to true, then the clean shutdown is not possible at all on modern hardware. Most drivers are asynchronous, and even the hardware reboot command management is async. Similarly, if initproc is not yet initialized, it is silly to expect clean shutdown to occur. I just keep the existing behavior for these corner cases, perhaps it was added for some reasons and might sometime work. ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r331298 - head/sys/dev/syscons
On Thu, Mar 22, 2018 at 5:42 AM, Konstantin Belousovwrote: > On Thu, Mar 22, 2018 at 05:50:57PM +1100, Bruce Evans wrote: > > On Wed, 21 Mar 2018, Warner Losh wrote: > > > > > On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousov < > kostik...@gmail.com> > > > wrote: > > >> ... > > >> Are you saying that fast interrupt handlers call shutdown_nice() ? > This > > >> is the quite serious bug on its own. To fix it, shutdown_nice() > should > > >> use a fast taskqueue to schedule the task which would lock the process > > >> and send the signal. > > > > > > Is there some way we know we're in a fast interrupt handler? If so, it > > > should be simple to fix. If not, then there's an API change ahead of > us... > > > > There is a td_intr_nesting_level flag that might work. (I invented this, > > but don't like its current use.) > But why do we need to know this ? We can always schedule a task in the > fast taskqueue if init is present and can be scheduled. Ah, good point. Seems like a poster child example of doing a fast task queue deferment. That's always safe, except maybe in BDE's fast interrupt world. > > But bde is right: the system has to be in good enough shape to cope. I > > > wonder if we should put that coping into kdb_reboot() instead. It's > only an > > > issue for TILDE ^R, which is a fairly edge case. > > > > shutdown_nice() is also called directly from syscons and vt for the > reboot, > > halt and poweroff keys. This happens in normal interrupt handler > context, > > so there is only a problem when init is not running. > > diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c > index e5ea9644ad3..e7c6d4c92b2 100644 > --- a/sys/kern/kern_shutdown.c > +++ b/sys/kern/kern_shutdown.c > @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$"); > #include > #include > #include > +#include > #include > #include > > @@ -276,6 +277,28 @@ sys_reboot(struct thread *td, struct reboot_args *uap) > return (error); > } > > +static void > +shutdown_nice_task_fn(void *arg, int pending __unused) > +{ > + int howto; > + > + howto = (uintptr_t)arg; > + /* Send a signal to init(8) and have it shutdown the world. */ > + PROC_LOCK(initproc); > + if (howto & RB_POWEROFF) > + kern_psignal(initproc, SIGUSR2); > + else if (howto & RB_POWERCYCLE) > + kern_psignal(initproc, SIGWINCH); > + else if (howto & RB_HALT) > + kern_psignal(initproc, SIGUSR1); > + else > + kern_psignal(initproc, SIGINT); > + PROC_UNLOCK(initproc); > +} > + > +static struct task shutdown_nice_task = TASK_INITIALIZER(0, > +_nice_task_fn, NULL); > + > /* > * Called by events that want to shut down.. e.g on a PC > */ > @@ -283,20 +306,14 @@ void > shutdown_nice(int howto) > { > > - if (initproc != NULL) { > - /* Send a signal to init(8) and have it shutdown the > world. */ > - PROC_LOCK(initproc); > - if (howto & RB_POWEROFF) > - kern_psignal(initproc, SIGUSR2); > - else if (howto & RB_POWERCYCLE) > - kern_psignal(initproc, SIGWINCH); > - else if (howto & RB_HALT) > - kern_psignal(initproc, SIGUSR1); > - else > - kern_psignal(initproc, SIGINT); > - PROC_UNLOCK(initproc); > + if (initproc != NULL && !SCHEDULER_STOPPED()) { > + shutdown_nice_task.ta_context = (void *)(uintptr_t)howto; > + taskqueue_enqueue(taskqueue_fast, _nice_task); > } else { > - /* No init(8) running, so simply reboot. */ > + /* > +* No init(8) running, or scheduler would not allow it > +* to run, so simply reboot. > +*/ > kern_reboot(howto | RB_NOSYNC); > } > } > I like this elegance. I think we should commit this. While it would be nice to support the super-fast interrupts that bde has done, we already don't support it in a number of places. Conceptually, I like the notion of the tilde escape machine in the tty layer, but I don't like the trade offs. Here we only defer to reboot, which requires a fair amount of machinery working to work right. it's not the dependencies I'd prefer, but usually it doesn't matter. However, to do that, it would also mean that breaking to the debugger would need to be done in the tty layer, which requires more of the context switching mechanisms to be working, which would be a bigger loss. So this is a good trade-off. If kern_reboot() can't be called from a fast interrupt handler in bdeBSD, it would be a simple matter to put a wrapper around it here so his kernel can context switch. Though, if the scheduler isn't running, I'm not sure what you can do in this situation. The system is already hung or in a really bad way, so it would have to jump to somewhere late in
Re: svn commit: r331298 - head/sys/dev/syscons
On Thu, 22 Mar 2018, Konstantin Belousov wrote: On Thu, Mar 22, 2018 at 05:50:57PM +1100, Bruce Evans wrote: On Wed, 21 Mar 2018, Warner Losh wrote: On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousovwrote: ... Are you saying that fast interrupt handlers call shutdown_nice() ? This is the quite serious bug on its own. To fix it, shutdown_nice() should use a fast taskqueue to schedule the task which would lock the process and send the signal. Is there some way we know we're in a fast interrupt handler? If so, it should be simple to fix. If not, then there's an API change ahead of us... There is a td_intr_nesting_level flag that might work. (I invented this, but don't like its current use.) But why do we need to know this ? We can always schedule a task in the fast taskqueue if init is present and can be scheduled. Not quite always. In my version, fast interrupt handlers are actually fast, so they can interrupt any spin mutex and cannot call any scheduling function. This is enforced by setting the pcpu pointer to NULL. Taskqueues and even SWIs are unavailable for fast interrupt handlers. Scheduling is done by setting a flag that is checked in timeout handlers like it was in FreeBSD-1. ... shutdown_nice() is also called directly from syscons and vt for the reboot, halt and poweroff keys. This happens in normal interrupt handler context, so there is only a problem when init is not running. diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c index e5ea9644ad3..e7c6d4c92b2 100644 --- a/sys/kern/kern_shutdown.c +++ b/sys/kern/kern_shutdown.c @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include @@ -276,6 +277,28 @@ sys_reboot(struct thread *td, struct reboot_args *uap) return (error); } +static void +shutdown_nice_task_fn(void *arg, int pending __unused) +{ + int howto; + + howto = (uintptr_t)arg; + /* Send a signal to init(8) and have it shutdown the world. */ + PROC_LOCK(initproc); + if (howto & RB_POWEROFF) + kern_psignal(initproc, SIGUSR2); + else if (howto & RB_POWERCYCLE) + kern_psignal(initproc, SIGWINCH); + else if (howto & RB_HALT) + kern_psignal(initproc, SIGUSR1); + else + kern_psignal(initproc, SIGINT); + PROC_UNLOCK(initproc); +} + +static struct task shutdown_nice_task = TASK_INITIALIZER(0, +_nice_task_fn, NULL); + I don't like having a whole task for this. The whole thing is just a hack to work around some the upper layers of the tty driver and some lower layers not having any input [escape] sequences to control the kernel. Only the syscons and vt lower layers have such sequences (where they are actually key combinations that are converted to control operations instead of to input [escape] sequences). To work around for hardware ttys, the filter for kdb sequences is abused to implement a non-kdb sequence for rebooting. The tty input methods could check for kernel-control sequences and safely signal init. This is a bit too complicated for syscons and vt since they can more easily check for key combinations, but wouldhave to convert these to standard sequences to get the tty layer to do the same thing. (They have many more kernel-control key combinations and the non-kdb one for rebooting is just a bug for them.) This is a bit complicated for hardware tty drivers too -- some use the tty bulk-input method and this shouldn't check for sequences, but should reduce to bcopy(). Hoever, to detect the kdb sequences, these drivers han to check at a low level anyway. They can be clever about this and only check for the console device[s] which are usually only used for for input at a low rate. /* * Called by events that want to shut down.. e.g on a PC */ @@ -283,20 +306,14 @@ void shutdown_nice(int howto) { - if (initproc != NULL) { - /* Send a signal to init(8) and have it shutdown the world. */ - PROC_LOCK(initproc); - if (howto & RB_POWEROFF) - kern_psignal(initproc, SIGUSR2); - else if (howto & RB_POWERCYCLE) - kern_psignal(initproc, SIGWINCH); - else if (howto & RB_HALT) - kern_psignal(initproc, SIGUSR1); - else - kern_psignal(initproc, SIGINT); - PROC_UNLOCK(initproc); + if (initproc != NULL && !SCHEDULER_STOPPED()) { + shutdown_nice_task.ta_context = (void *)(uintptr_t)howto; + taskqueue_enqueue(taskqueue_fast, _nice_task); } else { - /* No init(8) running, so simply reboot. */ + /* +* No init(8) running, or scheduler would not allow it +* to run, so simply reboot. +*/ kern_reboot(howto | RB_NOSYNC); } } Calling
Re: svn commit: r331298 - head/sys/dev/syscons
On Thu, Mar 22, 2018 at 05:50:57PM +1100, Bruce Evans wrote: > On Wed, 21 Mar 2018, Warner Losh wrote: > > > On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousov> > wrote: > >> ... > >> Are you saying that fast interrupt handlers call shutdown_nice() ? This > >> is the quite serious bug on its own. To fix it, shutdown_nice() should > >> use a fast taskqueue to schedule the task which would lock the process > >> and send the signal. > > > > Is there some way we know we're in a fast interrupt handler? If so, it > > should be simple to fix. If not, then there's an API change ahead of us... > > There is a td_intr_nesting_level flag that might work. (I invented this, > but don't like its current use.) But why do we need to know this ? We can always schedule a task in the fast taskqueue if init is present and can be scheduled. > > > But bde is right: the system has to be in good enough shape to cope. I > > wonder if we should put that coping into kdb_reboot() instead. It's only an > > issue for TILDE ^R, which is a fairly edge case. > > shutdown_nice() is also called directly from syscons and vt for the reboot, > halt and poweroff keys. This happens in normal interrupt handler context, > so there is only a problem when init is not running. diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c index e5ea9644ad3..e7c6d4c92b2 100644 --- a/sys/kern/kern_shutdown.c +++ b/sys/kern/kern_shutdown.c @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include @@ -276,6 +277,28 @@ sys_reboot(struct thread *td, struct reboot_args *uap) return (error); } +static void +shutdown_nice_task_fn(void *arg, int pending __unused) +{ + int howto; + + howto = (uintptr_t)arg; + /* Send a signal to init(8) and have it shutdown the world. */ + PROC_LOCK(initproc); + if (howto & RB_POWEROFF) + kern_psignal(initproc, SIGUSR2); + else if (howto & RB_POWERCYCLE) + kern_psignal(initproc, SIGWINCH); + else if (howto & RB_HALT) + kern_psignal(initproc, SIGUSR1); + else + kern_psignal(initproc, SIGINT); + PROC_UNLOCK(initproc); +} + +static struct task shutdown_nice_task = TASK_INITIALIZER(0, +_nice_task_fn, NULL); + /* * Called by events that want to shut down.. e.g on a PC */ @@ -283,20 +306,14 @@ void shutdown_nice(int howto) { - if (initproc != NULL) { - /* Send a signal to init(8) and have it shutdown the world. */ - PROC_LOCK(initproc); - if (howto & RB_POWEROFF) - kern_psignal(initproc, SIGUSR2); - else if (howto & RB_POWERCYCLE) - kern_psignal(initproc, SIGWINCH); - else if (howto & RB_HALT) - kern_psignal(initproc, SIGUSR1); - else - kern_psignal(initproc, SIGINT); - PROC_UNLOCK(initproc); + if (initproc != NULL && !SCHEDULER_STOPPED()) { + shutdown_nice_task.ta_context = (void *)(uintptr_t)howto; + taskqueue_enqueue(taskqueue_fast, _nice_task); } else { - /* No init(8) running, so simply reboot. */ + /* +* No init(8) running, or scheduler would not allow it +* to run, so simply reboot. +*/ kern_reboot(howto | RB_NOSYNC); } } ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r331298 - head/sys/dev/syscons
On Wed, 21 Mar 2018, Warner Losh wrote: On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousovwrote: ... Are you saying that fast interrupt handlers call shutdown_nice() ? This is the quite serious bug on its own. To fix it, shutdown_nice() should use a fast taskqueue to schedule the task which would lock the process and send the signal. Is there some way we know we're in a fast interrupt handler? If so, it should be simple to fix. If not, then there's an API change ahead of us... There is a td_intr_nesting_level flag that might work. (I invented this, but don't like its current use.) But bde is right: the system has to be in good enough shape to cope. I wonder if we should put that coping into kdb_reboot() instead. It's only an issue for TILDE ^R, which is a fairly edge case. shutdown_nice() is also called directly from syscons and vt for the reboot, halt and poweroff keys. This happens in normal interrupt handler context, so there is only a problem when init is not running. Bruce ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r331298 - head/sys/dev/syscons
On Wed, 21 Mar 2018, Warner Losh wrote: On Wed, Mar 21, 2018 at 11:53 AM, Bruce Evanswrote: On Wed, 21 Mar 2018, Warner Losh wrote: Log: Unlock giant when calling shutdown_nice() This breaks the driver. Giant is syscons' driver lock, and also the interrupt handler lock for at least the atkbd keyboard driver, so vt sometimes holds the lock for. OK. I got carried away. You're right. The proper fix is to unlock Giant at the top of kern_reboot() instead. This handles the case where we call shutdown_nice() with no init running and have to call kern_reboot directly. Otherwise it's perfectly fine to just call shutdown_nice() with Giant held since we just signal init from there. So I'll revert this change and make that other change instead. ... Good point. I think the following change is good for everything except calling shutdown_nice() from a fast interrupt handler with noinit running: diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c index e5ea9644ad3f..564aecd811be 100644 --- a/sys/kern/kern_shutdown.c +++ b/sys/kern/kern_shutdown.c @@ -366,6 +366,12 @@ kern_reboot(int howto) { static int once = 0; + /* +* Drop Giant once and for all. +*/ + while (mtx_owned()) + mtx_unlock(); + #if defined(SMP) /* * Bind us to the first CPU so that all shutdown code runs there. Some Comments? Try putting this in vfs_mountroot_parse() and/or the second clause of shutdown_nice() only. The only other calls are from is from sys_reboot() and vpanic(). sys_reboot() is either MPSAFE or not, and it should know when it is safe to drop its Giant lock if at all. vpanic() sets SCHEDULER_STOPPED() to avoid seeing problems with Giant or any other mutex. Bruce ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r331298 - head/sys/dev/syscons
On Wed, 21 Mar 2018, Konstantin Belousov wrote: On Thu, Mar 22, 2018 at 04:53:22AM +1100, Bruce Evans wrote: Serial console drivers with fast interrupt handlers have much more broken locking for ddb special keys. It is invalid to either drop locks or call the "any" function from a fast interrupt handler, but buggy serial console drivers calls kbd_alt_break(), and that now calls shutdown_nice() and other functions that cannot be called from a fast interrupt handler. ddb keys supply most of the shutdown_nice() functionality for serial consoles, and there are no escape sequence to get this without ddb or maybe another debugger, so these bugs don't affect most users. Handling this correctly requires much the same fix as an unsafe signal handler, and fixes have much the same problems -- not much more than setting a flag is safe, and the flag might never be looked at if the system is in a bad state. However, if a nice shutdown is possible then the sytem must be in a good enough state to poll for flags. Are you saying that fast interrupt handlers call shutdown_nice() ? This is the quite serious bug on its own. To fix it, shutdown_nice() should use a fast taskqueue to schedule the task which would lock the process and send the signal. Yes. See kdb_reboot(). This is called for an escape sequence from kdb_alt_break_internal(). The other calls in kdb_alt_break_internal() don't wander into a function that calls shutdown_nice() and are relatively safe. They have to be not completely safe to work in "any" context. kdb_reboot() is the opposite -- we want it to fail if the context is not nice enough to reboot nicely. BTW, I've often wanted to be able to send more general signals to init using keystrokes, independent of being logged in. Mainly SIGHUP to shut down to single user mode. This has some security problems, especially for the "any" signal to the "any" process. At least syscons defaults to allowing Ctrl-Alt-Del to reboot. I like that and miss it on serial consoles and on some non-FreeBSD OS's (the alt-break sequence to reach kdb_reboot() is not equivalent, since it is only available if much more insecure sequences are also allowed). Shutdown isn't as fundamentally insecure as sending arbitrary signals and most systems allow it without much more than a password for users that have physical access to the system. OTOH, I don't like the syscons key that suspends. This is also allowed by default. I sometimes press it by mistake on a system that can suspend but not resume. Unlike the reboot key, there are no knobs for controlling it. There are 3 static configuration knobs, 2 sysctls and the keymap which must be understood for securing the main reboot key in syscons, and a slightly different set of controls in vt :-(. For suspend, there is only the keymap. Syscons also has a standby key and more shutdown keys, but these are in more obscure parts of keymaps and I've never noticed typing them by mistake. Bruce ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r331298 - head/sys/dev/syscons
On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousovwrote: > On Thu, Mar 22, 2018 at 04:53:22AM +1100, Bruce Evans wrote: > > Serial console drivers with fast interrupt handlers have much more > > broken locking for ddb special keys. It is invalid to either drop locks > > or call the "any" function from a fast interrupt handler, but buggy > > serial console drivers calls kbd_alt_break(), and that now calls > > shutdown_nice() and other functions that cannot be called from a fast > > interrupt handler. ddb keys supply most of the shutdown_nice() > > functionality for serial consoles, and there are no escape sequence to > > get this without ddb or maybe another debugger, so these bugs don't > > affect most users. > > > > Handling this correctly requires much the same fix as an unsafe signal > > handler, and fixes have much the same problems -- not much more than > > setting a flag is safe, and the flag might never be looked at if the > > system is in a bad state. However, if a nice shutdown is possible then > > the sytem must be in a good enough state to poll for flags. > > Are you saying that fast interrupt handlers call shutdown_nice() ? This > is the quite serious bug on its own. To fix it, shutdown_nice() should > use a fast taskqueue to schedule the task which would lock the process > and send the signal. > Is there some way we know we're in a fast interrupt handler? If so, it should be simple to fix. If not, then there's an API change ahead of us... But bde is right: the system has to be in good enough shape to cope. I wonder if we should put that coping into kdb_reboot() instead. It's only an issue for TILDE ^R, which is a fairly edge case. Warner ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r331298 - head/sys/dev/syscons
On Wed, Mar 21, 2018 at 11:53 AM, Bruce Evanswrote: > On Wed, 21 Mar 2018, Warner Losh wrote: > > Log: >> Unlock giant when calling shutdown_nice() >> > > This breaks the driver. Giant is syscons' driver lock, and also the > interrupt handler lock for at least the atkbd keyboard driver, so vt > sometimes holds the lock for. > OK. I got carried away. You're right. The proper fix is to unlock Giant at the top of kern_reboot() instead. This handles the case where we call shutdown_nice() with no init running and have to call kern_reboot directly. Otherwise it's perfectly fine to just call shutdown_nice() with Giant held since we just signal init from there. So I'll revert this change and make that other change instead. > vt has to sprinkle lots of Giant locking and unlocking where the > corresponding locking is automatic for syscons, but this seems to be > nonsense in its event handler. vt has to acquire Giant when calling > the keyboard driver, but event handling is a call from the keyboard > driver, usually from the interrupt handler with a suitable lock held. > For atkbd, the suitable lock is Giant -- see atkbd_intr(). (All keyboard > drivers that use the kbd(4) for the top level use Giant looking since > toe top level specifies D_NEEDGIANT in its cdevsw. I think that is all > keyboard drivers.) So where vt_kbdevent() sprinkles Giant, that has no > effect since Giant is held by the caller. So vt_kbdevent() calls > vt_processkey() with Giant held despite it not acquiring Giant explicitly; > vt_processkey() calls vt_machine_kbdevent() and that calls various > shutdown functions, all with Giant held. True, and covered above. > Modified: head/sys/dev/syscons/syscons.c >> >> == >> --- head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:08 2018 >> (r331297) >> +++ head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:12 2018 >> (r331298) >> @@ -3858,22 +3858,28 @@ next_code: >> >> case RBT: >> #ifndef SC_DISABLE_REBOOT >> - if (enable_reboot && !(flags & SCGETC_CN)) >> + if (enable_reboot && !(flags & SCGETC_CN)) { >> + mtx_unlock(); >> shutdown_nice(0); >> + } >> #endif >> break; >> >> case HALT: >> #ifndef SC_DISABLE_REBOOT >> - if (enable_reboot && !(flags & SCGETC_CN)) >> + if (enable_reboot && !(flags & SCGETC_CN)) { >> + mtx_unlock(); >> shutdown_nice(RB_HALT); >> + } >> #endif >> break; >> >> case PDWN: >> #ifndef SC_DISABLE_REBOOT >> - if (enable_reboot && !(flags & SCGETC_CN)) >> + if (enable_reboot && !(flags & SCGETC_CN)) { >> + mtx_unlock(); >> shutdown_nice(RB_HALT|RB_POWEROFF); >> + } >> #endif >> break; >> > > The new bugs should cause assertion failures. shutdown_nice() just > signals init and returns. Giant is not re-acquired, so assertions > should fail when the caller drops the lock. So the shutdown should > be a nasty panic. > I must have been using vt when I tested it, since I didn't see that. > Transiently dropping the lock is probably not fatal depending on what > the caller does. On x86 with atkbd, nested interrupts are prevented > by masking the in the ATPIC or APIC -- Giant is not really the driver > lock, but just a lock for interfacing between related screen and keyboard > drivers. > > Serial console drivers with fast interrupt handlers have much more > broken locking for ddb special keys. It is invalid to either drop locks > or call the "any" function from a fast interrupt handler, but buggy > serial console drivers calls kbd_alt_break(), and that now calls > shutdown_nice() and other functions that cannot be called from a fast > interrupt handler. ddb keys supply most of the shutdown_nice() > functionality for serial consoles, and there are no escape sequence to > get this without ddb or maybe another debugger, so these bugs don't > affect most users. > It's called it before my changes... I'll make a note to look into this. > Handling this correctly requires much the same fix as an unsafe signal > handler, and fixes have much the same problems -- not much more than > setting a flag is safe, and the flag might never be looked at if the > system is in a bad state. However, if a nice shutdown is possible then > the sytem must be in a good enough state to poll for flags. > > For normal signal handlers, there is no problem sending a signal to init > or at least with setting a flag and waking up some thread to check the > flag. > > I don't quite understand this commit. It should be valid to send a > signal to init() in proc or non-fast ithread context including with > Giant held. shutdown_nice() starts with
Re: svn commit: r331298 - head/sys/dev/syscons
On Thu, Mar 22, 2018 at 04:53:22AM +1100, Bruce Evans wrote: > Serial console drivers with fast interrupt handlers have much more > broken locking for ddb special keys. It is invalid to either drop locks > or call the "any" function from a fast interrupt handler, but buggy > serial console drivers calls kbd_alt_break(), and that now calls > shutdown_nice() and other functions that cannot be called from a fast > interrupt handler. ddb keys supply most of the shutdown_nice() > functionality for serial consoles, and there are no escape sequence to > get this without ddb or maybe another debugger, so these bugs don't > affect most users. > > Handling this correctly requires much the same fix as an unsafe signal > handler, and fixes have much the same problems -- not much more than > setting a flag is safe, and the flag might never be looked at if the > system is in a bad state. However, if a nice shutdown is possible then > the sytem must be in a good enough state to poll for flags. Are you saying that fast interrupt handlers call shutdown_nice() ? This is the quite serious bug on its own. To fix it, shutdown_nice() should use a fast taskqueue to schedule the task which would lock the process and send the signal. ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r331298 - head/sys/dev/syscons
On Thu, 22 Mar 2018, Bruce Evans wrote: On Wed, 21 Mar 2018, Warner Losh wrote: Log: Unlock giant when calling shutdown_nice() ... This breaks the driver. Giant is syscons' driver lock, and also the interrupt handler lock for at least the atkbd keyboard driver, so vt sometimes holds the lock for. [That should have been "... lock too".] ... Actual testing shows that doesn't cause a panic, but it also doesn't actually unlock for shutdown_nice(), since the lock is acquired twice and only released once. syscons has much the same extra lock sprinkling for event handling as vt: - intr_event_execute_handlers() acquires Giant and calls atkbdintr() - atkbdintr() calls sckbdevent() - sckbdevent() unnecessarily acquires Giant again - the buggy unlocking drops Giant just once - shutdown_nice() is called with Giant held - the buggy unlocking fails to re-acquire Giant - sckbdevent() releases Giant, leaving it not held - sckbdevent() returns - atkbdintr() returns - intr_event_execute_handlers() releases Giant. This should panic, but it apparently blocks for long enough for init to shut down first. When I trace the last step, I get a panic which might be either from the different timing or just a bug in kdb. Testing with a kernel with other bugs fixed shows that the invariants violation really is detected, causing a panic instead of a nice shutdown for Ctrl-Alt-Del, but recursive panics pile up and printf() in -current is too broken to print anything. A better kernel prints: XX ppkernel trap 12 with interrupts disabled XX panic: kernel trap doesn't have ucred XX cpuid = 32 XX time = 1521655530 XX KDB: enter: panic XX panic: kernel trap doesn't have ucred XX cpuid = 32 XX time = 1521655530 XX KDB: enter: panic XX panic: kernel trap doesn't have ucred XX cpuid = 32 XX time = 1521655530 XX KDB: enter: panic XX panic: kernel trap doesn't have ucrepanic: kernel trap doesn't have ucred XX cpuid = 32 XX time = 1521655530 XX KDB: enter: panic where some messages are lost and some are duplicated (there are 8 CPUs). I got control by putting a breakpoint after stop_cpus_hard() in vpanic(). (A breakpoint at panic() crashes, probably for multiple CPUs hitting it, though this is supposed to be fixed in the test version.) The backtrace is then: XX Breakpoint at vpanic+0x4a:popl%ecx XX db> t XX Tracing pid 11 tid 100042 td 0xd6a61360 XX vpanic(c0916c91,d6685ab4,d6685ab4,c098f050,c098f040,...) at vpanic+0x4a/frame 0xd6685a8c XX kassert_panic(c0916c91,c093b956,c0930a24,c09066cf,557) at kassert_panic+0x49/frame 0xd6685aa8 XX witness_unlock(c098f040,8,c09066c6,557) at witness_unlock+0xe7/frame 0xd6685af0 XX __mtx_unlock_flags(c098f050,0,c09066c6,557) at __mtx_unlock_flags+0x65/frame 0xd6685b14 XX intr_event_execute_handlers(c8f466b0) at intr_event_execute_handlers+0xed/frame 0xd6685b40 XX ithread_execute_handlers(c8f466b0,0,80202,d6a61360,c8f466b0,...) at ithread_execute_handlers+0x21/frame 0xd6685b54 XX ithread_loop(d6cda250,d6685ba8,0,d6cda250,c06a2c56,...) at ithread_loop+0x5f/frame 0xd6685b74 XX fork_exit(c06a2c56,d6cda250,d6685ba8) at fork_exit+0x83/frame 0xd6685b94 XX fork_trampoline() at fork_trampoline+0x8 The stack trace is messed up are mis-decoded despite attempts to avoid this (use i386, don't use clang, and turn off auto-inlining...). panic() is not shown, and args passed in registers are not shown. There are just enough args to find the panic message. It is as expected: XX db> x/s 0xd6685ab4 XX 0xd6685ab4: V\271\223\300$\012\223\300\317f\220\300W\005 XX db> x/s 0xc0916c91 XX __func__.16466+0x1c41: lock (%s) %s not locked @ %s:%d Bruce ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r331298 - head/sys/dev/syscons
On Wed, 21 Mar 2018, Warner Losh wrote: Log: Unlock giant when calling shutdown_nice() This breaks the driver. Giant is syscons' driver lock, and also the interrupt handler lock for at least the atkbd keyboard driver, so vt sometimes holds the lock for. vt has to sprinkle lots of Giant locking and unlocking where the corresponding locking is automatic for syscons, but this seems to be nonsense in its event handler. vt has to acquire Giant when calling the keyboard driver, but event handling is a call from the keyboard driver, usually from the interrupt handler with a suitable lock held. For atkbd, the suitable lock is Giant -- see atkbd_intr(). (All keyboard drivers that use the kbd(4) for the top level use Giant looking since toe top level specifies D_NEEDGIANT in its cdevsw. I think that is all keyboard drivers.) So where vt_kbdevent() sprinkles Giant, that has no effect since Giant is held by the caller. So vt_kbdevent() calls vt_processkey() with Giant held despite it not acquiring Giant explicitly; vt_processkey() calls vt_machine_kbdevent() and that calls various shutdown functions, all with Giant held. Modified: head/sys/dev/syscons/syscons.c == --- head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:08 2018 (r331297) +++ head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:12 2018 (r331298) @@ -3858,22 +3858,28 @@ next_code: case RBT: #ifndef SC_DISABLE_REBOOT - if (enable_reboot && !(flags & SCGETC_CN)) + if (enable_reboot && !(flags & SCGETC_CN)) { + mtx_unlock(); shutdown_nice(0); + } #endif break; case HALT: #ifndef SC_DISABLE_REBOOT - if (enable_reboot && !(flags & SCGETC_CN)) + if (enable_reboot && !(flags & SCGETC_CN)) { + mtx_unlock(); shutdown_nice(RB_HALT); + } #endif break; case PDWN: #ifndef SC_DISABLE_REBOOT - if (enable_reboot && !(flags & SCGETC_CN)) + if (enable_reboot && !(flags & SCGETC_CN)) { + mtx_unlock(); shutdown_nice(RB_HALT|RB_POWEROFF); + } #endif break; The new bugs should cause assertion failures. shutdown_nice() just signals init and returns. Giant is not re-acquired, so assertions should fail when the caller drops the lock. So the shutdown should be a nasty panic. Transiently dropping the lock is probably not fatal depending on what the caller does. On x86 with atkbd, nested interrupts are prevented by masking the in the ATPIC or APIC -- Giant is not really the driver lock, but just a lock for interfacing between related screen and keyboard drivers. Serial console drivers with fast interrupt handlers have much more broken locking for ddb special keys. It is invalid to either drop locks or call the "any" function from a fast interrupt handler, but buggy serial console drivers calls kbd_alt_break(), and that now calls shutdown_nice() and other functions that cannot be called from a fast interrupt handler. ddb keys supply most of the shutdown_nice() functionality for serial consoles, and there are no escape sequence to get this without ddb or maybe another debugger, so these bugs don't affect most users. Handling this correctly requires much the same fix as an unsafe signal handler, and fixes have much the same problems -- not much more than setting a flag is safe, and the flag might never be looked at if the system is in a bad state. However, if a nice shutdown is possible then the sytem must be in a good enough state to poll for flags. For normal signal handlers, there is no problem sending a signal to init or at least with setting a flag and waking up some thread to check the flag. I don't quite understand this commit. It should be valid to send a signal to init() in proc or non-fast ithread context including with Giant held. shutdown_nice() starts with PROC_LOCK(initproc), so it would be a LOR to call it with a spinlock held, and is even more obviously wrong to call it from kbd_alt_break() with interrupts masked and possibly a spinlock held, but Giant is a special sleep lock that causes fewer LORs than most sleep locks. Actual testing shows that doesn't cause a panic, but it also doesn't actually unlock for shutdown_nice(), since the lock is acquired twice and only released once. syscons has much the same extra lock sprinkling for event handling as vt: - intr_event_execute_handlers() acquires Giant and calls atkbdintr() - atkbdintr() calls sckbdevent() - sckbdevent() unnecessarily acquires Giant again - the buggy unlocking drops Giant just once - shutdown_nice() is called with Giant held - the buggy unlocking fails to re-acquire Giant - sckbdevent() releases Giant, leaving it not held -
svn commit: r331298 - head/sys/dev/syscons
Author: imp Date: Wed Mar 21 14:47:12 2018 New Revision: 331298 URL: https://svnweb.freebsd.org/changeset/base/331298 Log: Unlock giant when calling shutdown_nice() Modified: head/sys/dev/syscons/syscons.c Modified: head/sys/dev/syscons/syscons.c == --- head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:08 2018 (r331297) +++ head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:12 2018 (r331298) @@ -3858,22 +3858,28 @@ next_code: case RBT: #ifndef SC_DISABLE_REBOOT - if (enable_reboot && !(flags & SCGETC_CN)) + if (enable_reboot && !(flags & SCGETC_CN)) { + mtx_unlock(); shutdown_nice(0); + } #endif break; case HALT: #ifndef SC_DISABLE_REBOOT - if (enable_reboot && !(flags & SCGETC_CN)) + if (enable_reboot && !(flags & SCGETC_CN)) { + mtx_unlock(); shutdown_nice(RB_HALT); + } #endif break; case PDWN: #ifndef SC_DISABLE_REBOOT - if (enable_reboot && !(flags & SCGETC_CN)) + if (enable_reboot && !(flags & SCGETC_CN)) { + mtx_unlock(); shutdown_nice(RB_HALT|RB_POWEROFF); + } #endif break; ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"