On Thu, 2026-05-28 at 17:24 +0200, Krzysztof Niemiec wrote:
> On 2026-05-25 at 10:16:32 GMT, Janusz Krzysztofik wrote:
> > Hi Krzysztof,
> >
> > On Fri, 2026-05-22 at 17:40 +0200, Krzysztof Niemiec wrote:
> > > The dev->registered variable was initially added in such a way that it
> > > is only set after all functions that could have failed have been called
> > > and haven't returned an error. However, since then, the function
> > > drm_modeset_register_all() is being checked for its return value, which
> > > opens a possibility of failing the register after setting the registered
> > > variable anyway. drm_dev_register() cleans up after itself in case of
> > > failure, mimicking the functions called in drm_dev_unregister(), with
> > > the exception of drm_client_sysrq_unregister() and
> > > drm_panic_unregister() (_register() counterparts of which are not
> > > checked for their return values), and the dev->registered flag.
> > >
> > > This creates a situation in which after calling drm_dev_register()
> > > nothing is registered (as the function entered an error path and cleaned
> > > up), but the device is reported as being registered according to the
> > > dev->registered variable.
> > >
> > > This, for example, confuses the WARN_ON() in drm_mode_object_register(),
> > > which
> > > is raised if a non-dynamic mode object is attempted to be unregistered
> > > in between the drm_dev_register() and drm_dev_unregister() calls. If
> > > drm_dev_register() fails - meaning the device *isn't* registered -
> > > currently it still reports dev->registered = true, which triggers the
> > > WARN_ON(), even though the usage is correct in that instance.
> > >
> > > Set dev->registered back to false in the error path to prevent this.
> >
> > While you've also submitted a patch ("drm/i915: Remove drm_dev_unregister()
> > from the error path during i915_driver_register()") that adjusts i915
> > handling of register / unregister steps after this fix, have you checked if
> > other drivers are prepared for that and won't be affected?
>
> I just did; there'd be a problem if in the error path triggered by
> failing drm_dev_register, some driver relies on the dev->registered
> variable to be true to do its work.
OK, maybe my doubts about other drivers were groundless. The flag is
described as DRM internal after all, and cleaning it in the error unwind
path seems fully justified. However, please review your commit
description. Trying to understand why drm_mode_object_register() might
call WARN_ON() when dev->registered is true, I found it doesn't, and that's
probably its _unregister() counterpart that may do that. With no stack
trace attached, that's confusing for a reviewer. Such WARN_ON()s may be
hit not only due to inconsistencies internal to DRM, but also as a result
of drivers incorrectly / unexpectedly calling some DRM functions.
Thanks,
Janusz
>
> There are two main patterns of doing the error path in the DRM drivers
> I've seen - one is just directly returning from the overarching function
> with drm_dev_register()'s error return value. The other is invoking the
> .unbind handler of the driver via component_unbind_all(), and most of
> the time those call drm_dev_unregister() right away, which sets the
> dev->registered value to false anyway. In these two cases (if
> drm_dev_unregister() is called in the second one), the problem doesn't
> affect the driver at all.
>
> In the case of i915 not setting the variable to false on failure
> triggered a WARN_ON() in drm_mode_object_unregister() because that is
> invoked indirectly in the error path due to the complexity of our
> init code (even xe just returns directly instead). If another driver
> calls that function in its error path, it will just silence (correctly)
> the WARN_ONs there.
>
> The only driver that explicitly checks for dev->registerd in its error
> path is msm, which does that to see if it needs to invoke
> drm_dev_unregister() during unload. This patch actually fixes a bug there
> because otherwise _unregister() would be called even if _register() fails,
> which is not correct.
>
> All other uses of the variable are outside the error path of their
> respective drivers, so they should function as intended even after the
> change.
>
> Thanks
> Krzysztof
>
> >
> > Thanks,
> > Janusz
> >
> >
> > >
> > > Signed-off-by: Krzysztof Niemiec <[email protected]>
> > > ---
> > > drivers/gpu/drm/drm_drv.c | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > > index 985c283cf59f..92403d79bb64 100644
> > > --- a/drivers/gpu/drm/drm_drv.c
> > > +++ b/drivers/gpu/drm/drm_drv.c
> > > @@ -1116,6 +1116,7 @@ int drm_dev_register(struct drm_device *dev,
> > > unsigned long flags)
> > > if (dev->driver->unload)
> > > dev->driver->unload(dev);
> > > err_minors:
> > > + dev->registered = false;
> > > remove_compat_control_link(dev);
> > > drm_minor_unregister(dev, DRM_MINOR_ACCEL);
> > > drm_minor_unregister(dev, DRM_MINOR_PRIMARY);