Re: PROPOSAL: config_* with device_t references

2023-05-12 Thread Taylor R Campbell
> Date: Wed, 10 May 2023 01:08:27 +
> From: Taylor R Campbell 
> 
> I propose to add new config_*_acquire/release functions: [...]

Updated patch so that the legacy config_* operations require the
caller to hold the kernel lock, while the new config_*_acquire/release
ones do not (but take it internally until we make autoconf internals
MP-safe).

This way you can eliminate KERNEL_LOCKs around finding and detaching
children, without introducing races with concurrent drvctl -d:

// no KERNEL_LOCK(1, NULL);
dev = config_found_acquire(...);
mutex_enter(&sc->sc_childlock);
sc->sc_dev = dev;
mutex_exit(&sc->sc_childlock);
device_release(dev);
// no KERNEL_UNLOCK_ONE(NULL);

// no KERNEL_LOCK(1, NULL);
mutex_enter(&sc->sc_childlock);
if ((dev = sc->sc_dev) != NULL)
device_acquire(dev);
mutex_exit(&sc->sc_childlock);
config_detach_release(dev, ...);
// no KERNEL_UNLOCK_ONE(NULL);

Also included:

- Conversion of dk(4) to config_*_acquire/release -- lightly tested.
- Conversion of usb(4) to config_*_acquire/release -- compile-tested.

(Coming up is a series of other changes to fix related races in dk(4)
and simplify some of the logic.  Also need to think about how devices
like vnd(4), cgd(4), &c., can safely work -- these logical drivers
sometimes do config_attach_pseudo in open, and sometimes do
config_detach in close, so their needs are a bit different from
drivers for physical hardware devices.)
>From baba80e17fbb254e50b242d71dc7e1de9bdc5520 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell 
Date: Fri, 26 Aug 2022 10:35:08 +
Subject: [PATCH] autoconf(9): New functions for referenced attach/detach.

New functions:

- config_found_acquire(dev, aux, print, cfargs)
- config_attach_acquire(parent, cf, aux, print, cfargs)
- config_attach_pseudo_acquire(cf, aux)
- config_detach_release(dev, flags)
- device_acquire(dev)

The config_*_acquire functions are like the non-acquire versions, but
they return a referenced device_t, which is guaranteed to be safe to
use until released.  The device's detach function may run while it is
referenced, but the device_t will not be freed and the parent's
.ca_childdetached routine will not be called.

=> config_attach_pseudo_acquire additionally lets you pass an aux
   argument to the device's .ca_attach routine, unlike
   config_attach_pseudo which always passes NULL.

=> Eventually, config_found, config_attach, and config_attach_pseudo
   should be made to return void, because use of the device_t they
   return is unsafe without the kernel lock and difficult to use
   safely even with the kernel lock or in a UP system.  For now,
   they require the caller to hold the kernel lock, while
   config_*_acquire do not.

config_detach_release is like device_release and then config_detach,
but avoids the race inherent with that sequence.

=> Eventually, config_detach should be eliminated, because getting at
   the device_t it needs is unsafe without the kernel lock and
   difficult to use safely even with the kernel lock or in a UP
   system.  For now, it requires the caller to hold the kernel lock,
   while config_detach_release does not.

device_acquire acquires a reference to a device.  It never fails and
can be used in thread context (but not interrupt context, hard or
soft).  Caller is responsible for ensuring that the device_t cannot
be freed; in other words, the device_t must be made unavailable to
any device_acquire callers before the .ca_detach function returns.
Typically device_acquire will be used in a read section (mutex,
rwlock, pserialize, &c.) in a data structure lookup, with
corresponding logic in the .ca_detach function to remove the device
from the data structure and wait for all read sections to complete.
---
 sys/kern/subr_autoconf.c | 185 ---
 sys/sys/device.h |   7 ++
 2 files changed, 181 insertions(+), 11 deletions(-)

diff --git a/sys/kern/subr_autoconf.c b/sys/kern/subr_autoconf.c
index e60be2d7aad8..55c04ff8ca9e 100644
--- a/sys/kern/subr_autoconf.c
+++ b/sys/kern/subr_autoconf.c
@@ -1259,17 +1259,21 @@ static const char * const msgs[] = {
  * not configured, call the given `print' function and return NULL.
  */
 device_t
-config_found(device_t parent, void *aux, cfprint_t print,
+config_found_acquire(device_t parent, void *aux, cfprint_t print,
 const struct cfargs * const cfargs)
 {
cfdata_t cf;
struct cfargs_internal store;
const struct cfargs_internal * const args =
cfargs_canonicalize(cfargs, &store);
+   device_t dev;
+
+   KERNEL_LOCK(1, NULL);
 
cf = config_search_internal(parent, aux, args);
if (cf != NULL) {
-   return config_attach_internal(parent, cf, aux, print, args);
+   dev = config_attach_internal(parent, cf, aux, print, args);
+   goto out;
}
 
if (print) {
@@ -1283,7

Re: PROPOSAL: config_* with device_t references

2023-05-10 Thread David Holland
On Wed, May 10, 2023 at 01:08:27AM +, Taylor R Campbell wrote:
 > I propose to add new config_*_acquire/release functions:
 >  dev = config_found_acquire(...);
 > ...
 >  config_detach_release(dev);

Seems reasonable...

(note, haven't read the diff carefully yet)

-- 
David A. Holland
dholl...@netbsd.org


PROPOSAL: config_* with device_t references

2023-05-09 Thread Taylor R Campbell
I propose to add new config_*_acquire/release functions:

dev = config_found_acquire(...);
is essentially
dev = config_found(...);
device_acquire(dev);

and

config_detach_release(dev);
is essentially
device_release(dev);
config_detach(dev);

but they work atomically without races with, e.g., concurrent drvctl
or physical device removal in order to eliminate a class of driver
bugs.  Similarly config_attach_acquire, config_attach_pseudo_acquire.

Longer-term, we can eliminate the non-acquire/release versions (or
make config_attach/found return void since that's safe) once every
driver is converted, but adding new names enables the compiler to
assist with auditing the conversion until we're done.

Thoughts?


DETAILS

The logic

sc->sc_child = config_found(...);

and

config_detach(sc->sc_child, flags);

found in many drivers, is fundamentally buggy on any system with MP or
kpreemption.

It's buggy because, for example, a parallel call to `drvctl -d childN'
can detach the device and free it, before sc->sc_child has been
initialized on return from config_found or after it has been read but
before control has entered config_detach.  This can lead to
use-after-free.

The kernel lock provides weak protection against this class of bugs
via

KERNEL_LOCK(1, NULL);
sc->sc_child = config_found(...);
KERNEL_UNLOCK_ONE(NULL);

because it has the opportunity to exclude concurrent .ca_childdetached
callbacks.  But we need to get more things off the kernel lock, not
add new dependencies on it.  And it only helps as long as nothing in
config_found blocks (e.g., on an adaptive mutex) before it returns the
device_t.


We can rewrite this instead as:

child = config_found_acquire(...);

mutex_enter(&sc->sc_childlock);
sc->sc_child = child;   
mutex_exit(&sc->sc_childlock);

device_release(child);

and

mutex_enter(&sc->sc_childlock);
child = sc->sc_child;
device_acquire(child);
mutex_exit(&sc->sc_childlock);

config_detach_release(child);

Here sc->sc_childlock is also taken by the device's .ca_childdetached
routine:

mutex_enter(&sc->sc_childlock);
if (dev == sc->sc_child)
sc->sc_child = NULL;
mutex_exit(&sc->sc_childlock);

Other use of the child, e.g. in a keyboard interrupt routine calling
wskbd_input, can safely use sc_childlock to coordinate access to
sc_child without worrying about spinning for the kernel lock or
sleeping while another thread runs a detach routine -- it's only
needed to serialize access to sc->sc_child, so it can use a driver's
own fine-grained locks.

(It's the driver's responsibility to ensure that .ca_detach routine
can't return while device_acquire is still possible, e.g. by quiescing
interrupt handlers that could call device_acquire.)

A minor additional change in this is to pass a cookie through
config_attach_pseudo_acquire as the aux argument to the .ca_attach
function.
>From 066f934f6c0dd3c3884b827d1b78f02c4b50fc3e Mon Sep 17 00:00:00 2001
From: Taylor R Campbell 
Date: Fri, 26 Aug 2022 10:35:08 +
Subject: [PATCH] autoconf(9): New functions for referenced attach/detach.

New functions:

- config_found_acquire(dev, aux, print, cfargs)
- config_attach_acquire(parent, cf, aux, print, cfargs)
- config_attach_pseudo_acquire(cf, aux)
- config_detach_release(dev, flags)
- device_acquire(dev)

The config_*_acquire functions are like the non-acquire versions, but
they return a referenced device_t, which is guaranteed to be safe to
use until released.  The device's detach function may run while it is
referenced, but the device_t will not be freed and the parent's
.ca_childdetached routine will not be called.

=> config_attach_pseudo_acquire additionally lets you pass an aux
   argument to the device's .ca_attach routine, unlike
   config_attach_pseudo which always passes NULL.

=> Eventually, config_found, config_attach, and config_attach_pseudo
   should be made to return void, because use of the device_t they
   return is fundamentally unsafe.

config_detach_release is like device_release and then config_detach,
but avoids the race inherent with that sequence.

=> Eventually, config_detach should be eliminated, because it's
   fundamentally MP-unsafe (and difficult to use safely even in a UP
   system or with the kernel lock).

device_acquire acquires a reference to a device.  It never fails.
The caller is responsible for ensuring that the device_t cannot be
freed.  Typically this will be used in a read section (mutex, rwlock,
pserialize, &c.) in a data structure lookup, with corresponding logic
in the .ca_detach function to remove the device from the data
structure and wait for all read sections to complete.
---
 sys/kern/subr_autoconf.c | 137 +--
 sys/sys/device.h |   7 ++
 2 files changed, 137 insertions(+), 7 deletions(-)

diff --git a/sys/kern/su