On Fri, Apr 02, 2021 at 06:30:16PM +0000, Luis Chamberlain wrote:
> On Fri, Apr 02, 2021 at 09:54:12AM +0200, Greg KH wrote:
> > On Thu, Apr 01, 2021 at 11:59:25PM +0000, Luis Chamberlain wrote:
> > > As for the sysfs deadlock possible with drivers, this fixes it in a
> > > generic way:
> > > 
> > > commit fac43d8025727a74f80a183cc5eb74ed902a5d14
> > > Author: Luis Chamberlain <mcg...@kernel.org>
> > > Date:   Sat Mar 27 14:58:15 2021 +0000
> > > 
> > >     sysfs: add optional module_owner to attribute
> > >     
> > >     This is needed as otherwise the owner of the attribute
> > >     or group read/store might have a shared lock used on driver removal,
> > >     and deadlock if we race with driver removal.
> > >     
> > >     Signed-off-by: Luis Chamberlain <mcg...@kernel.org>
> > 
> > No, please no.  Module removal is a "best effort",
> 
> Not for live patching. I am not sure if I am missing any other valid
> use case?

live patching removes modules?  We have so many code paths that are
"best effort" when it comes to module unloading, trying to resolve this
one is a valiant try, but not realistic.

> > if the system dies when it happens, that's on you. 
> 
> I think the better approach for now is simply to call testers / etc to
> deal with this open coded. I cannot be sure that other than live
> patching there may be other valid use cases for module removal, and for
> races we really may care for where userspace *will* typically be mucking
> with sysfs attributes. Monitoring my system's sysfs attributes I am
> actually quite surprised at the random pokes at them.
> 
> > I am not willing to expend extra energy
> > and maintenance of core things like sysfs for stuff like this that does
> > not matter in any system other than a developer's box.
> 
> Should we document this as well? Without this it is unclear that tons of
> random tests are sanely nullified. At least this deadlock I spotted can
> be a pretty common form on many drivers.

What other drivers have this problem?

> > Lock data, not code please.  Trying to tie data structure's lifespans
> > to the lifespan of code is a tangled mess, and one that I do not want to
> > add to in any form.
> 
> Driver developers will simply have to open code these protections. In
> light of what I see on LTP / fuzzing, I suspect the use case will grow
> and we'll have to revisit this in the future. But for now, sure, we can
> just open code the required protections everywhere to not crash on module
> removal.

LTP and fuzzing too do not remove modules.  So I do not understand the
root problem here, that's just something that does not happen on a real
system.

thanks,

greg k-h
