On Mon, Jan 26, 2026 at 2:50 PM Johan Hovold <[email protected]> wrote:
>
> On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg Kroah-Hartman wrote:
> > On Sat, Jan 24, 2026 at 08:08:28PM +0100, Danilo Krummrich wrote:
> > > On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> > > > this does not look like the right interface for the chardev unplug
> > > > issue.
> > >
> > > I think it depends, we should do everything to prevent having the
> > > issue in the first place, e.g. ensure that we synchronize the unplug
> > > properly on device driver unbind.
> > >
> > > Sometimes, however, this isn't possible; this is where a revocable
> > > mechanism can come in handy to prevent UAF of device resources -- DRM
> > > is a good example for this.
> >
> > This is not "possible" for almost all real devices so we need something
> > like this for almost all classes of devices, DRM just shows the extremes
> > involved, v4l2 is also another good example.
>
> It's certainly possible to handle the chardev unplug issue without
> revocable as several subsystems already do. All you need is a refcount,
> a lock and a flag.
>
> It may be possible to provide a generic solutions at the chardev level
> or some kind of helper implementation (similar to revocable) for
> subsystems to use directly.
>

This echoes the heated exchange I recently had with Johan elsewhere so I
would like to chime in and use the wider forum of driver core maintainers
to settle an important question.

It seems there are two camps in this discussion. One sees the problem as
limited to character devices being referenced from user-space at the time
of the driver unbind, and favors fixing the issues at the vfs level. The
other extends the problem to any driver unbind where we cannot ensure
proper ordering of the teardown (for whatever reason: fw_devlink=off,
helper auxiliary devices acting as intermediates, or even user-space
unbinding a driver manually with bus-level sysfs attributes). In those
cases, consumers of resources exposed by providers that are gone are left
with dangling references, so this camp focuses the solutions on the
subsystem level.

The question is: should we work towards making the kernel gracefully
handle any such situation, or is it acceptable that if we do
"non-standard" things, we can trigger invalid memory accesses from
user-space?

I'm asking this because I've been sending patches to several subsystems
addressing lifetime issues at the subsystem level with SRCU and I've
faced resistance from Johan at least twice - not based on the
implementation details but on the philosophy itself of synchronizing all
accesses from consumers to providers (SRCU, revocable or otherwise).

I myself am in the latter camp. My thinking is: if we expose an
interface, it should work correctly. In particular: it should not allow
the user (even root!) to crash the kernel.

In addition: there seems to be agreement that Rust in Linux is good
because of its memory safety features. The issues we're discussing would
never have happened had the code been written in Rust, so we should not
just accept them as normal in C or tell the user to "just not do it".

Bartosz
