Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-22 Thread Konstantin Belousov
On Fri, Mar 23, 2018 at 01:21:43AM +1100, Bruce Evans wrote:
> I don't like having a whole task for this.  The whole thing is just a
> hack to work around some the upper layers of the tty driver and some
> lower layers not having any input [escape] sequences to control the
> kernel.  Only the syscons and vt lower layers have such sequences
> (where they are actually key combinations that are converted to control
> operations instead of to input [escape] sequences).  To work around for
> hardware ttys, the filter for kdb sequences is abused to implement a
> non-kdb sequence for rebooting.
> 
> The tty input methods could check for kernel-control sequences and safely
> signal init.  This is a bit too complicated for syscons and vt since they
> can more easily check for key combinations, but wouldhave to convert these
> to standard sequences to get the tty layer to do the same thing.  (They
> have many more kernel-control key combinations and the non-kdb one for
> rebooting is just a bug for them.)  This is a bit complicated for hardware
> tty drivers too -- some use the tty bulk-input method and this shouldn't
> check for sequences, but should reduce to bcopy().  Hoever, to detect the
> kdb sequences, these drivers han to check at a low level anyway.  They
> can be clever about this and only check for the console device[s] which are
> usually only used for for input at a low rate.
This is both complicated and mostly pointless. The task mechanism
provides the easy solution, and more, the task mechanism was
specifically designed to allow to schedule activities in the more
allowing context, from a more restrictive context. I have no intent to
start the self-inflicting activity to code the attempt to modify a lot
of drivers to do what can be done in 10 lines of code.

> Calling kern_reboot() from fast interrupt handlers is still invalid.
> It is quite likely to deadlock.  In particular, it should deadlock in
> when kern_reboot() prints messages to the console.  Most console drivers
> have races instead of deadlocks by dropping their lock[s] in their
> fast interrupt handler before calling the buggy alt escape function.
I agree that it is mostly invalid. If SCHEDULER_STOPPED() evaluates to
true, then the clean shutdown is not possible at all on modern hardware.
Most drivers are asynchronous, and even the hardware reboot command
management is async.

Similarly, if initproc is not yet initialized, it is silly to expect
clean shutdown to occur. I just keep the existing behavior for these
corner cases, perhaps it was added for some reasons and might sometime
work.
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-22 Thread Warner Losh
On Thu, Mar 22, 2018 at 5:42 AM, Konstantin Belousov 
wrote:

> On Thu, Mar 22, 2018 at 05:50:57PM +1100, Bruce Evans wrote:
> > On Wed, 21 Mar 2018, Warner Losh wrote:
> >
> > > On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousov <
> kostik...@gmail.com>
> > > wrote:
> > >> ...
> > >> Are you saying that fast interrupt handlers call shutdown_nice() ?
> This
> > >> is the quite serious bug on its own.  To fix it, shutdown_nice()
> should
> > >> use a fast taskqueue to schedule the task which would lock the process
> > >> and send the signal.
> > >
> > > Is there some way we know we're in a fast interrupt handler? If so, it
> > > should be simple to fix. If not, then there's an API change ahead of
> us...
> >
> > There is a td_intr_nesting_level flag that might work.  (I invented this,
> > but don't like its current use.)
> But why do we need to know this ?  We can always schedule a task in the
> fast taskqueue if init is present and can be scheduled.


Ah, good point. Seems like a poster child example of doing a fast task
queue deferment. That's always safe, except maybe in BDE's fast interrupt
world.

> > But bde is right: the system has to be in good enough shape to cope. I
> > > wonder if we should put that coping into kdb_reboot() instead. It's
> only an
> > > issue for  TILDE ^R, which is a fairly edge case.
> >
> > shutdown_nice() is also called directly from syscons and vt for the
> reboot,
> > halt and poweroff keys.  This happens in normal interrupt handler
> context,
> > so there is only a problem when init is not running.
>
> diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c
> index e5ea9644ad3..e7c6d4c92b2 100644
> --- a/sys/kern/kern_shutdown.c
> +++ b/sys/kern/kern_shutdown.c
> @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>
> @@ -276,6 +277,28 @@ sys_reboot(struct thread *td, struct reboot_args *uap)
> return (error);
>  }
>
> +static void
> +shutdown_nice_task_fn(void *arg, int pending __unused)
> +{
> +   int howto;
> +
> +   howto = (uintptr_t)arg;
> +   /* Send a signal to init(8) and have it shutdown the world. */
> +   PROC_LOCK(initproc);
> +   if (howto & RB_POWEROFF)
> +   kern_psignal(initproc, SIGUSR2);
> +   else if (howto & RB_POWERCYCLE)
> +   kern_psignal(initproc, SIGWINCH);
> +   else if (howto & RB_HALT)
> +   kern_psignal(initproc, SIGUSR1);
> +   else
> +   kern_psignal(initproc, SIGINT);
> +   PROC_UNLOCK(initproc);
> +}
> +
> +static struct task shutdown_nice_task = TASK_INITIALIZER(0,
> +_nice_task_fn, NULL);
> +
>  /*
>   * Called by events that want to shut down.. e.g   on a PC
>   */
> @@ -283,20 +306,14 @@ void
>  shutdown_nice(int howto)
>  {
>
> -   if (initproc != NULL) {
> -   /* Send a signal to init(8) and have it shutdown the
> world. */
> -   PROC_LOCK(initproc);
> -   if (howto & RB_POWEROFF)
> -   kern_psignal(initproc, SIGUSR2);
> -   else if (howto & RB_POWERCYCLE)
> -   kern_psignal(initproc, SIGWINCH);
> -   else if (howto & RB_HALT)
> -   kern_psignal(initproc, SIGUSR1);
> -   else
> -   kern_psignal(initproc, SIGINT);
> -   PROC_UNLOCK(initproc);
> +   if (initproc != NULL && !SCHEDULER_STOPPED()) {
> +   shutdown_nice_task.ta_context = (void *)(uintptr_t)howto;
> +   taskqueue_enqueue(taskqueue_fast, _nice_task);
> } else {
> -   /* No init(8) running, so simply reboot. */
> +   /*
> +* No init(8) running, or scheduler would not allow it
> +* to run, so simply reboot.
> +*/
> kern_reboot(howto | RB_NOSYNC);
> }
>  }
>

I like this elegance. I think we should commit this. While it would be nice
to support the super-fast interrupts that bde has done, we already don't
support it in a number of places. Conceptually, I like the notion of the
tilde escape machine in the tty layer, but I don't like the trade offs.
Here we only defer to reboot, which requires a fair amount of machinery
working to work right. it's not the dependencies I'd prefer, but usually it
doesn't matter. However, to do that, it would also mean that breaking to
the debugger would need to be done in the tty layer, which requires more of
the context switching mechanisms to be working, which would be a bigger
loss. So this is a good trade-off.

If kern_reboot() can't be called from a fast interrupt handler in bdeBSD,
it would be a simple matter to put a wrapper around it here so his kernel
can context switch. Though, if the scheduler isn't running, I'm not sure
what you can do in this situation. The system is already hung or in a
really bad way, so it would have to jump to somewhere late in 

Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-22 Thread Bruce Evans

On Thu, 22 Mar 2018, Konstantin Belousov wrote:


On Thu, Mar 22, 2018 at 05:50:57PM +1100, Bruce Evans wrote:

On Wed, 21 Mar 2018, Warner Losh wrote:


On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousov 
wrote:

...
Are you saying that fast interrupt handlers call shutdown_nice() ?  This
is the quite serious bug on its own.  To fix it, shutdown_nice() should
use a fast taskqueue to schedule the task which would lock the process
and send the signal.


Is there some way we know we're in a fast interrupt handler? If so, it
should be simple to fix. If not, then there's an API change ahead of us...


There is a td_intr_nesting_level flag that might work.  (I invented this,
but don't like its current use.)

But why do we need to know this ?  We can always schedule a task in the
fast taskqueue if init is present and can be scheduled.


Not quite always.  In my version, fast interrupt handlers are actually
fast, so they can interrupt any spin mutex and cannot call any
scheduling function.  This is enforced by setting the pcpu pointer to
NULL.  Taskqueues and even SWIs are unavailable for fast interrupt
handlers.  Scheduling is done by setting a flag that is checked in
timeout handlers like it was in FreeBSD-1.


...

shutdown_nice() is also called directly from syscons and vt for the reboot,
halt and poweroff keys.  This happens in normal interrupt handler context,
so there is only a problem when init is not running.


diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c
index e5ea9644ad3..e7c6d4c92b2 100644
--- a/sys/kern/kern_shutdown.c
+++ b/sys/kern/kern_shutdown.c
@@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
#include 
#include 
#include 
+#include 
#include 
#include 

@@ -276,6 +277,28 @@ sys_reboot(struct thread *td, struct reboot_args *uap)
return (error);
}

+static void
+shutdown_nice_task_fn(void *arg, int pending __unused)
+{
+   int howto;
+
+   howto = (uintptr_t)arg;
+   /* Send a signal to init(8) and have it shutdown the world. */
+   PROC_LOCK(initproc);
+   if (howto & RB_POWEROFF)
+   kern_psignal(initproc, SIGUSR2);
+   else if (howto & RB_POWERCYCLE)
+   kern_psignal(initproc, SIGWINCH);
+   else if (howto & RB_HALT)
+   kern_psignal(initproc, SIGUSR1);
+   else
+   kern_psignal(initproc, SIGINT);
+   PROC_UNLOCK(initproc);
+}
+
+static struct task shutdown_nice_task = TASK_INITIALIZER(0,
+_nice_task_fn, NULL);
+


I don't like having a whole task for this.  The whole thing is just a
hack to work around some the upper layers of the tty driver and some
lower layers not having any input [escape] sequences to control the
kernel.  Only the syscons and vt lower layers have such sequences
(where they are actually key combinations that are converted to control
operations instead of to input [escape] sequences).  To work around for
hardware ttys, the filter for kdb sequences is abused to implement a
non-kdb sequence for rebooting.

The tty input methods could check for kernel-control sequences and safely
signal init.  This is a bit too complicated for syscons and vt since they
can more easily check for key combinations, but wouldhave to convert these
to standard sequences to get the tty layer to do the same thing.  (They
have many more kernel-control key combinations and the non-kdb one for
rebooting is just a bug for them.)  This is a bit complicated for hardware
tty drivers too -- some use the tty bulk-input method and this shouldn't
check for sequences, but should reduce to bcopy().  Hoever, to detect the
kdb sequences, these drivers han to check at a low level anyway.  They
can be clever about this and only check for the console device[s] which are
usually only used for for input at a low rate.


/*
 * Called by events that want to shut down.. e.g   on a PC
 */
@@ -283,20 +306,14 @@ void
shutdown_nice(int howto)
{

-   if (initproc != NULL) {
-   /* Send a signal to init(8) and have it shutdown the world. */
-   PROC_LOCK(initproc);
-   if (howto & RB_POWEROFF)
-   kern_psignal(initproc, SIGUSR2);
-   else if (howto & RB_POWERCYCLE)
-   kern_psignal(initproc, SIGWINCH);
-   else if (howto & RB_HALT)
-   kern_psignal(initproc, SIGUSR1);
-   else
-   kern_psignal(initproc, SIGINT);
-   PROC_UNLOCK(initproc);
+   if (initproc != NULL && !SCHEDULER_STOPPED()) {
+   shutdown_nice_task.ta_context = (void *)(uintptr_t)howto;
+   taskqueue_enqueue(taskqueue_fast, _nice_task);
} else {
-   /* No init(8) running, so simply reboot. */
+   /*
+* No init(8) running, or scheduler would not allow it
+* to run, so simply reboot.
+*/
kern_reboot(howto | RB_NOSYNC);
}
}


Calling 

Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-22 Thread Konstantin Belousov
On Thu, Mar 22, 2018 at 05:50:57PM +1100, Bruce Evans wrote:
> On Wed, 21 Mar 2018, Warner Losh wrote:
> 
> > On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousov 
> > wrote:
> >> ...
> >> Are you saying that fast interrupt handlers call shutdown_nice() ?  This
> >> is the quite serious bug on its own.  To fix it, shutdown_nice() should
> >> use a fast taskqueue to schedule the task which would lock the process
> >> and send the signal.
> >
> > Is there some way we know we're in a fast interrupt handler? If so, it
> > should be simple to fix. If not, then there's an API change ahead of us...
> 
> There is a td_intr_nesting_level flag that might work.  (I invented this,
> but don't like its current use.)
But why do we need to know this ?  We can always schedule a task in the
fast taskqueue if init is present and can be scheduled.

> 
> > But bde is right: the system has to be in good enough shape to cope. I
> > wonder if we should put that coping into kdb_reboot() instead. It's only an
> > issue for  TILDE ^R, which is a fairly edge case.
> 
> shutdown_nice() is also called directly from syscons and vt for the reboot,
> halt and poweroff keys.  This happens in normal interrupt handler context,
> so there is only a problem when init is not running.

diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c
index e5ea9644ad3..e7c6d4c92b2 100644
--- a/sys/kern/kern_shutdown.c
+++ b/sys/kern/kern_shutdown.c
@@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -276,6 +277,28 @@ sys_reboot(struct thread *td, struct reboot_args *uap)
return (error);
 }
 
+static void
+shutdown_nice_task_fn(void *arg, int pending __unused)
+{
+   int howto;
+
+   howto = (uintptr_t)arg;
+   /* Send a signal to init(8) and have it shutdown the world. */
+   PROC_LOCK(initproc);
+   if (howto & RB_POWEROFF)
+   kern_psignal(initproc, SIGUSR2);
+   else if (howto & RB_POWERCYCLE)
+   kern_psignal(initproc, SIGWINCH);
+   else if (howto & RB_HALT)
+   kern_psignal(initproc, SIGUSR1);
+   else
+   kern_psignal(initproc, SIGINT);
+   PROC_UNLOCK(initproc);
+}
+
+static struct task shutdown_nice_task = TASK_INITIALIZER(0,
+_nice_task_fn, NULL);
+
 /*
  * Called by events that want to shut down.. e.g   on a PC
  */
@@ -283,20 +306,14 @@ void
 shutdown_nice(int howto)
 {
 
-   if (initproc != NULL) {
-   /* Send a signal to init(8) and have it shutdown the world. */
-   PROC_LOCK(initproc);
-   if (howto & RB_POWEROFF)
-   kern_psignal(initproc, SIGUSR2);
-   else if (howto & RB_POWERCYCLE)
-   kern_psignal(initproc, SIGWINCH);
-   else if (howto & RB_HALT)
-   kern_psignal(initproc, SIGUSR1);
-   else
-   kern_psignal(initproc, SIGINT);
-   PROC_UNLOCK(initproc);
+   if (initproc != NULL && !SCHEDULER_STOPPED()) {
+   shutdown_nice_task.ta_context = (void *)(uintptr_t)howto;
+   taskqueue_enqueue(taskqueue_fast, _nice_task);
} else {
-   /* No init(8) running, so simply reboot. */
+   /*
+* No init(8) running, or scheduler would not allow it
+* to run, so simply reboot.
+*/
kern_reboot(howto | RB_NOSYNC);
}
 }
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-22 Thread Bruce Evans

On Wed, 21 Mar 2018, Warner Losh wrote:


On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousov 
wrote:

...
Are you saying that fast interrupt handlers call shutdown_nice() ?  This
is the quite serious bug on its own.  To fix it, shutdown_nice() should
use a fast taskqueue to schedule the task which would lock the process
and send the signal.


Is there some way we know we're in a fast interrupt handler? If so, it
should be simple to fix. If not, then there's an API change ahead of us...


There is a td_intr_nesting_level flag that might work.  (I invented this,
but don't like its current use.)


But bde is right: the system has to be in good enough shape to cope. I
wonder if we should put that coping into kdb_reboot() instead. It's only an
issue for  TILDE ^R, which is a fairly edge case.


shutdown_nice() is also called directly from syscons and vt for the reboot,
halt and poweroff keys.  This happens in normal interrupt handler context,
so there is only a problem when init is not running.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-22 Thread Bruce Evans

On Wed, 21 Mar 2018, Warner Losh wrote:


On Wed, Mar 21, 2018 at 11:53 AM, Bruce Evans  wrote:


On Wed, 21 Mar 2018, Warner Losh wrote:

Log:

 Unlock giant when calling shutdown_nice()


This breaks the driver.  Giant is syscons' driver lock, and also the
interrupt handler lock for at least the atkbd keyboard driver, so vt
sometimes holds the lock for.


OK. I got carried away. You're right. The proper fix is to unlock Giant at
the top of kern_reboot() instead. This handles the case where we call
shutdown_nice() with no init running and have to call kern_reboot directly.
Otherwise it's perfectly fine to just call shutdown_nice() with Giant held
since we just signal init from there. So I'll revert this change and make
that other change instead.
...
Good point. I think the following change is good for everything except
calling shutdown_nice() from a fast interrupt handler with noinit running:

diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c
index e5ea9644ad3f..564aecd811be 100644
--- a/sys/kern/kern_shutdown.c
+++ b/sys/kern/kern_shutdown.c
@@ -366,6 +366,12 @@ kern_reboot(int howto)
{
   static int once = 0;

+   /*
+* Drop Giant once and for all.
+*/
+   while (mtx_owned())
+   mtx_unlock();
+
#if defined(SMP)
   /*
* Bind us to the first CPU so that all shutdown code runs there.
Some

Comments?


Try putting this in vfs_mountroot_parse() and/or the second clause of
shutdown_nice() only.  The only other calls are from is from sys_reboot()
and vpanic().  sys_reboot() is either MPSAFE or not, and it should know
when it is safe to drop its Giant lock if at all.  vpanic() sets
SCHEDULER_STOPPED() to avoid seeing problems with Giant or any other
mutex.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-22 Thread Bruce Evans

On Wed, 21 Mar 2018, Konstantin Belousov wrote:


On Thu, Mar 22, 2018 at 04:53:22AM +1100, Bruce Evans wrote:

Serial console drivers with fast interrupt handlers have much more
broken locking for ddb special keys.  It is invalid to either drop locks
or call the "any" function from a fast interrupt handler, but buggy
serial console drivers calls kbd_alt_break(), and that now calls
shutdown_nice() and other functions that cannot be called from a fast
interrupt handler.  ddb keys supply most of the shutdown_nice()
functionality for serial consoles, and there are no escape sequence to
get this without ddb or maybe another debugger, so these bugs don't
affect most users.

Handling this correctly requires much the same fix as an unsafe signal
handler, and fixes have much the same problems -- not much more than
setting a flag is safe, and the flag might never be looked at if the
system is in a bad state.  However, if a nice shutdown is possible then
the sytem must be in a good enough state to poll for flags.


Are you saying that fast interrupt handlers call shutdown_nice() ?  This
is the quite serious bug on its own.  To fix it, shutdown_nice() should
use a fast taskqueue to schedule the task which would lock the process
and send the signal.


Yes.  See kdb_reboot().  This is called for an escape sequence from
kdb_alt_break_internal().  The other calls in kdb_alt_break_internal()
don't wander into a function that calls shutdown_nice() and are
relatively safe.  They have to be not completely safe to work in "any"
context.  kdb_reboot() is the opposite -- we want it to fail if the
context is not nice enough to reboot nicely.

BTW, I've often wanted to be able to send more general signals to init
using keystrokes, independent of being logged in.  Mainly SIGHUP to
shut down to single user mode.  This has some security problems,
especially for the "any" signal to the "any" process.  At least syscons
defaults to allowing Ctrl-Alt-Del to reboot.  I like that and miss it
on serial consoles and on some non-FreeBSD OS's (the alt-break sequence
to reach kdb_reboot() is not equivalent, since it is only available
if much more insecure sequences are also allowed).  Shutdown isn't as
fundamentally insecure as sending arbitrary signals and most systems
allow it without much more than a password for users that have physical
access to the system.

OTOH, I don't like the syscons key that suspends.  This is also allowed
by default.  I sometimes press it by mistake on a system that can suspend
but not resume.  Unlike the reboot key, there are no knobs for controlling
it.  There are 3 static configuration knobs, 2 sysctls and the keymap
which must be understood for securing the main reboot key in syscons, and a
slightly different set of controls in vt :-(.  For suspend, there is only
the keymap.  Syscons also has a standby key and more shutdown keys, but
these are in more obscure parts of keymaps and I've never noticed typing
them by mistake.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-21 Thread Warner Losh
On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousov 
wrote:

> On Thu, Mar 22, 2018 at 04:53:22AM +1100, Bruce Evans wrote:
> > Serial console drivers with fast interrupt handlers have much more
> > broken locking for ddb special keys.  It is invalid to either drop locks
> > or call the "any" function from a fast interrupt handler, but buggy
> > serial console drivers calls kbd_alt_break(), and that now calls
> > shutdown_nice() and other functions that cannot be called from a fast
> > interrupt handler.  ddb keys supply most of the shutdown_nice()
> > functionality for serial consoles, and there are no escape sequence to
> > get this without ddb or maybe another debugger, so these bugs don't
> > affect most users.
> >
> > Handling this correctly requires much the same fix as an unsafe signal
> > handler, and fixes have much the same problems -- not much more than
> > setting a flag is safe, and the flag might never be looked at if the
> > system is in a bad state.  However, if a nice shutdown is possible then
> > the sytem must be in a good enough state to poll for flags.
>
> Are you saying that fast interrupt handlers call shutdown_nice() ?  This
> is the quite serious bug on its own.  To fix it, shutdown_nice() should
> use a fast taskqueue to schedule the task which would lock the process
> and send the signal.
>

Is there some way we know we're in a fast interrupt handler? If so, it
should be simple to fix. If not, then there's an API change ahead of us...

But bde is right: the system has to be in good enough shape to cope. I
wonder if we should put that coping into kdb_reboot() instead. It's only an
issue for  TILDE ^R, which is a fairly edge case.

Warner
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-21 Thread Warner Losh
On Wed, Mar 21, 2018 at 11:53 AM, Bruce Evans  wrote:

> On Wed, 21 Mar 2018, Warner Losh wrote:
>
> Log:
>>  Unlock giant when calling shutdown_nice()
>>
>
> This breaks the driver.  Giant is syscons' driver lock, and also the
> interrupt handler lock for at least the atkbd keyboard driver, so vt
> sometimes holds the lock for.
>

OK. I got carried away. You're right. The proper fix is to unlock Giant at
the top of kern_reboot() instead. This handles the case where we call
shutdown_nice() with no init running and have to call kern_reboot directly.
Otherwise it's perfectly fine to just call shutdown_nice() with Giant held
since we just signal init from there. So I'll revert this change and make
that other change instead.


> vt has to sprinkle lots of Giant locking and unlocking where the
> corresponding locking is automatic for syscons, but this seems to be
> nonsense in its event handler.  vt has to acquire Giant when calling
> the keyboard driver, but event handling is a call from the keyboard
> driver, usually from the interrupt handler with a suitable lock held.
> For atkbd, the suitable lock is Giant -- see atkbd_intr().  (All keyboard
> drivers that use the kbd(4) for the top level use Giant looking since
> toe top level specifies D_NEEDGIANT in its cdevsw.  I think that is all
> keyboard drivers.)  So where vt_kbdevent() sprinkles Giant, that has no
> effect since Giant is held by the caller.  So vt_kbdevent() calls
> vt_processkey() with Giant held despite it not acquiring Giant explicitly;
> vt_processkey() calls vt_machine_kbdevent() and that calls various
> shutdown functions, all with Giant held.


True, and covered above.


> Modified: head/sys/dev/syscons/syscons.c
>> 
>> ==
>> --- head/sys/dev/syscons/syscons.c  Wed Mar 21 14:47:08 2018
>> (r331297)
>> +++ head/sys/dev/syscons/syscons.c  Wed Mar 21 14:47:12 2018
>> (r331298)
>> @@ -3858,22 +3858,28 @@ next_code:
>>
>> case RBT:
>> #ifndef SC_DISABLE_REBOOT
>> -   if (enable_reboot && !(flags & SCGETC_CN))
>> +   if (enable_reboot && !(flags & SCGETC_CN)) {
>> +   mtx_unlock();
>> shutdown_nice(0);
>> +   }
>> #endif
>> break;
>>
>> case HALT:
>> #ifndef SC_DISABLE_REBOOT
>> -   if (enable_reboot && !(flags & SCGETC_CN))
>> +   if (enable_reboot && !(flags & SCGETC_CN)) {
>> +   mtx_unlock();
>> shutdown_nice(RB_HALT);
>> +   }
>> #endif
>> break;
>>
>> case PDWN:
>> #ifndef SC_DISABLE_REBOOT
>> -   if (enable_reboot && !(flags & SCGETC_CN))
>> +   if (enable_reboot && !(flags & SCGETC_CN)) {
>> +   mtx_unlock();
>> shutdown_nice(RB_HALT|RB_POWEROFF);
>> +   }
>> #endif
>> break;
>>
>
> The new bugs should cause assertion failures.  shutdown_nice() just
> signals init and returns.  Giant is not re-acquired, so assertions
> should fail when the caller drops the lock.  So the shutdown should
> be a nasty panic.
>

I must have been using vt when I tested it, since I didn't see that.


> Transiently dropping the lock is probably not fatal depending on what
> the caller does.  On x86 with atkbd, nested interrupts are prevented
> by masking the in the ATPIC or APIC -- Giant is not really the driver
> lock, but just a lock for interfacing between related screen and keyboard
> drivers.
>
> Serial console drivers with fast interrupt handlers have much more
> broken locking for ddb special keys.  It is invalid to either drop locks
> or call the "any" function from a fast interrupt handler, but buggy
> serial console drivers calls kbd_alt_break(), and that now calls
> shutdown_nice() and other functions that cannot be called from a fast
> interrupt handler.  ddb keys supply most of the shutdown_nice()
> functionality for serial consoles, and there are no escape sequence to
> get this without ddb or maybe another debugger, so these bugs don't
> affect most users.
>

It's called it before my changes... I'll make a note to look into this.


> Handling this correctly requires much the same fix as an unsafe signal
> handler, and fixes have much the same problems -- not much more than
> setting a flag is safe, and the flag might never be looked at if the
> system is in a bad state.  However, if a nice shutdown is possible then
> the sytem must be in a good enough state to poll for flags.
>
> For normal signal handlers, there is no problem sending a signal to init
> or at least with setting a flag and waking up some thread to check the
> flag.
>
> I don't quite understand this commit.  It should be valid to send a
> signal to init() in proc or non-fast ithread context including with
> Giant held.  shutdown_nice() starts with 

Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-21 Thread Konstantin Belousov
On Thu, Mar 22, 2018 at 04:53:22AM +1100, Bruce Evans wrote:
> Serial console drivers with fast interrupt handlers have much more
> broken locking for ddb special keys.  It is invalid to either drop locks
> or call the "any" function from a fast interrupt handler, but buggy
> serial console drivers calls kbd_alt_break(), and that now calls
> shutdown_nice() and other functions that cannot be called from a fast
> interrupt handler.  ddb keys supply most of the shutdown_nice()
> functionality for serial consoles, and there are no escape sequence to
> get this without ddb or maybe another debugger, so these bugs don't
> affect most users.
> 
> Handling this correctly requires much the same fix as an unsafe signal
> handler, and fixes have much the same problems -- not much more than
> setting a flag is safe, and the flag might never be looked at if the
> system is in a bad state.  However, if a nice shutdown is possible then
> the sytem must be in a good enough state to poll for flags.

Are you saying that fast interrupt handlers call shutdown_nice() ?  This
is the quite serious bug on its own.  To fix it, shutdown_nice() should
use a fast taskqueue to schedule the task which would lock the process
and send the signal.
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-21 Thread Bruce Evans

On Thu, 22 Mar 2018, Bruce Evans wrote:


On Wed, 21 Mar 2018, Warner Losh wrote:


Log:
 Unlock giant when calling shutdown_nice()

...
This breaks the driver.  Giant is syscons' driver lock, and also the
interrupt handler lock for at least the atkbd keyboard driver, so vt
sometimes holds the lock for.


[That should have been "... lock too".]


...
Actual testing shows that doesn't cause a panic, but it also doesn't
actually unlock for shutdown_nice(), since the lock is acquired twice
and only released once.  syscons has much the same extra lock sprinkling
for event handling as vt:

- intr_event_execute_handlers() acquires Giant and calls atkbdintr()
- atkbdintr() calls sckbdevent()
- sckbdevent() unnecessarily acquires Giant again
- the buggy unlocking drops Giant just once
- shutdown_nice() is called with Giant held
- the buggy unlocking fails to re-acquire Giant
- sckbdevent() releases Giant, leaving it not held
- sckbdevent() returns
- atkbdintr() returns
- intr_event_execute_handlers() releases Giant.  This should panic, but
 it apparently blocks for long enough for init to shut down first.

When I trace the last step, I get a panic which might be either from the
different timing or just a bug in kdb.


Testing with a kernel with other bugs fixed shows that the invariants
violation really is detected, causing a panic instead of a nice shutdown
for Ctrl-Alt-Del, but recursive panics pile up and printf() in -current
is too broken to print anything.  A better kernel prints:

XX ppkernel trap 12 with interrupts disabled
XX panic: kernel trap doesn't have ucred
XX cpuid = 32
XX time = 1521655530
XX KDB: enter: panic
XX panic: kernel trap doesn't have ucred
XX cpuid = 32
XX time = 1521655530
XX KDB: enter: panic
XX panic: kernel trap doesn't have ucred
XX cpuid = 32
XX time = 1521655530
XX KDB: enter: panic
XX panic: kernel trap doesn't have ucrepanic: kernel trap doesn't have ucred
XX cpuid = 32
XX time = 1521655530
XX KDB: enter: panic

where some messages are lost and some are duplicated (there are 8 CPUs).

I got control by putting a breakpoint after stop_cpus_hard() in vpanic().
(A breakpoint at panic() crashes, probably for multiple CPUs hitting it,
though this is supposed to be fixed in the test version.)  The backtrace
is then:

XX Breakpoint at   vpanic+0x4a:popl%ecx
XX db> t
XX Tracing pid 11 tid 100042 td 0xd6a61360
XX vpanic(c0916c91,d6685ab4,d6685ab4,c098f050,c098f040,...) at 
vpanic+0x4a/frame 0xd6685a8c
XX kassert_panic(c0916c91,c093b956,c0930a24,c09066cf,557) at 
kassert_panic+0x49/frame 0xd6685aa8
XX witness_unlock(c098f040,8,c09066c6,557) at witness_unlock+0xe7/frame 
0xd6685af0
XX __mtx_unlock_flags(c098f050,0,c09066c6,557) at __mtx_unlock_flags+0x65/frame 
0xd6685b14
XX intr_event_execute_handlers(c8f466b0) at 
intr_event_execute_handlers+0xed/frame 0xd6685b40
XX ithread_execute_handlers(c8f466b0,0,80202,d6a61360,c8f466b0,...) at 
ithread_execute_handlers+0x21/frame 0xd6685b54
XX ithread_loop(d6cda250,d6685ba8,0,d6cda250,c06a2c56,...) at 
ithread_loop+0x5f/frame 0xd6685b74
XX fork_exit(c06a2c56,d6cda250,d6685ba8) at fork_exit+0x83/frame 0xd6685b94
XX fork_trampoline() at fork_trampoline+0x8

The stack trace is messed up are mis-decoded despite attempts to avoid this
(use i386, don't use clang, and turn off auto-inlining...).   panic() is not
shown, and args passed in registers are not shown.  There are just enough
args to find the panic message.  It is as expected:

XX db> x/s 0xd6685ab4
XX 0xd6685ab4: V\271\223\300$\012\223\300\317f\220\300W\005
XX db> x/s 0xc0916c91
XX __func__.16466+0x1c41:  lock (%s) %s not locked @ %s:%d

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r331298 - head/sys/dev/syscons

2018-03-21 Thread Bruce Evans

On Wed, 21 Mar 2018, Warner Losh wrote:


Log:
 Unlock giant when calling shutdown_nice()


This breaks the driver.  Giant is syscons' driver lock, and also the
interrupt handler lock for at least the atkbd keyboard driver, so vt
sometimes holds the lock for.

vt has to sprinkle lots of Giant locking and unlocking where the
corresponding locking is automatic for syscons, but this seems to be
nonsense in its event handler.  vt has to acquire Giant when calling
the keyboard driver, but event handling is a call from the keyboard
driver, usually from the interrupt handler with a suitable lock held.
For atkbd, the suitable lock is Giant -- see atkbd_intr().  (All keyboard
drivers that use the kbd(4) for the top level use Giant looking since
toe top level specifies D_NEEDGIANT in its cdevsw.  I think that is all
keyboard drivers.)  So where vt_kbdevent() sprinkles Giant, that has no
effect since Giant is held by the caller.  So vt_kbdevent() calls
vt_processkey() with Giant held despite it not acquiring Giant explicitly;
vt_processkey() calls vt_machine_kbdevent() and that calls various
shutdown functions, all with Giant held.


Modified: head/sys/dev/syscons/syscons.c
==
--- head/sys/dev/syscons/syscons.c  Wed Mar 21 14:47:08 2018
(r331297)
+++ head/sys/dev/syscons/syscons.c  Wed Mar 21 14:47:12 2018
(r331298)
@@ -3858,22 +3858,28 @@ next_code:

case RBT:
#ifndef SC_DISABLE_REBOOT
-   if (enable_reboot && !(flags & SCGETC_CN))
+   if (enable_reboot && !(flags & SCGETC_CN)) {
+   mtx_unlock();
shutdown_nice(0);
+   }
#endif
break;

case HALT:
#ifndef SC_DISABLE_REBOOT
-   if (enable_reboot && !(flags & SCGETC_CN))
+   if (enable_reboot && !(flags & SCGETC_CN)) {
+   mtx_unlock();
shutdown_nice(RB_HALT);
+   }
#endif
break;

case PDWN:
#ifndef SC_DISABLE_REBOOT
-   if (enable_reboot && !(flags & SCGETC_CN))
+   if (enable_reboot && !(flags & SCGETC_CN)) {
+   mtx_unlock();
shutdown_nice(RB_HALT|RB_POWEROFF);
+   }
#endif
break;


The new bugs should cause assertion failures.  shutdown_nice() just
signals init and returns.  Giant is not re-acquired, so assertions
should fail when the caller drops the lock.  So the shutdown should
be a nasty panic.

Transiently dropping the lock is probably not fatal depending on what
the caller does.  On x86 with atkbd, nested interrupts are prevented
by masking the in the ATPIC or APIC -- Giant is not really the driver
lock, but just a lock for interfacing between related screen and keyboard
drivers.

Serial console drivers with fast interrupt handlers have much more
broken locking for ddb special keys.  It is invalid to either drop locks
or call the "any" function from a fast interrupt handler, but buggy
serial console drivers calls kbd_alt_break(), and that now calls
shutdown_nice() and other functions that cannot be called from a fast
interrupt handler.  ddb keys supply most of the shutdown_nice()
functionality for serial consoles, and there are no escape sequence to
get this without ddb or maybe another debugger, so these bugs don't
affect most users.

Handling this correctly requires much the same fix as an unsafe signal
handler, and fixes have much the same problems -- not much more than
setting a flag is safe, and the flag might never be looked at if the
system is in a bad state.  However, if a nice shutdown is possible then
the sytem must be in a good enough state to poll for flags.

For normal signal handlers, there is no problem sending a signal to init
or at least with setting a flag and waking up some thread to check the
flag.

I don't quite understand this commit.  It should be valid to send a
signal to init() in proc or non-fast ithread context including with
Giant held.  shutdown_nice() starts with PROC_LOCK(initproc), so it
would be a LOR to call it with a spinlock held, and is even more
obviously wrong to call it from kbd_alt_break() with interrupts masked
and possibly a spinlock held, but Giant is a special sleep lock that
causes fewer LORs than most sleep locks.

Actual testing shows that doesn't cause a panic, but it also doesn't
actually unlock for shutdown_nice(), since the lock is acquired twice
and only released once.  syscons has much the same extra lock sprinkling
for event handling as vt:

- intr_event_execute_handlers() acquires Giant and calls atkbdintr()
- atkbdintr() calls sckbdevent()
- sckbdevent() unnecessarily acquires Giant again
- the buggy unlocking drops Giant just once
- shutdown_nice() is called with Giant held
- the buggy unlocking fails to re-acquire Giant
- sckbdevent() releases Giant, leaving it not held
- 

svn commit: r331298 - head/sys/dev/syscons

2018-03-21 Thread Warner Losh
Author: imp
Date: Wed Mar 21 14:47:12 2018
New Revision: 331298
URL: https://svnweb.freebsd.org/changeset/base/331298

Log:
  Unlock giant when calling shutdown_nice()

Modified:
  head/sys/dev/syscons/syscons.c

Modified: head/sys/dev/syscons/syscons.c
==
--- head/sys/dev/syscons/syscons.c  Wed Mar 21 14:47:08 2018
(r331297)
+++ head/sys/dev/syscons/syscons.c  Wed Mar 21 14:47:12 2018
(r331298)
@@ -3858,22 +3858,28 @@ next_code:
 
case RBT:
 #ifndef SC_DISABLE_REBOOT
-   if (enable_reboot && !(flags & SCGETC_CN))
+   if (enable_reboot && !(flags & SCGETC_CN)) {
+   mtx_unlock();
shutdown_nice(0);
+   }
 #endif
break;
 
case HALT:
 #ifndef SC_DISABLE_REBOOT
-   if (enable_reboot && !(flags & SCGETC_CN))
+   if (enable_reboot && !(flags & SCGETC_CN)) {
+   mtx_unlock();
shutdown_nice(RB_HALT);
+   }
 #endif
break;
 
case PDWN:
 #ifndef SC_DISABLE_REBOOT
-   if (enable_reboot && !(flags & SCGETC_CN))
+   if (enable_reboot && !(flags & SCGETC_CN)) {
+   mtx_unlock();
shutdown_nice(RB_HALT|RB_POWEROFF);
+   }
 #endif
break;
 
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"