Hi Lyude,

On Fri, Jan 30, 2026 at 07:13:30PM -0500, Lyude Paul wrote:
> One of the tricky things about DRM bindings in Rust is the fact that
> initialization of a DRM device is a multi-step process. It's quite normal
> for a device driver to start making use of its DRM device for tasks like
> creating GEM objects before userspace registration happens. This is an
> issue in rust though, since prior to userspace registration the device is
> only partly initialized. This means there's a plethora of DRM device
> operations we can't yet expose without opening up the door to UB if the DRM
> device in question isn't yet registered.
> 
> Additionally, this isn't something we can reliably check at runtime. And
> even if we could, performing an operation which requires the device be
> registered when the device isn't actually registered is a programmer bug,
> meaning there's no real way to gracefully handle such a mistake at runtime.
> And even if that wasn't the case, it would be horrendously annoying and
> noisy to have to check if a device is registered constantly throughout a
> driver.
> 
> In order to solve this, we first take inspiration from
> `kernel::device::DeviceContext` and introduce `kernel::drm::DeviceContext`.
> This provides us with a ZST type that we can generalize over to represent
> contexts where a device is known to have been registered with userspace at
> some point in time (`Registered`), along with contexts where we can't make
> such a guarantee (`Uninit`).
> 
> It's important to note we intentionally do not provide a `DeviceContext`
> which represents an unregistered device. This is because there's no
> reasonable way to guarantee that a device with long-living references to
> itself will not be registered eventually with userspace. Instead, we
> provide a new-type for this: `UnregisteredDevice` which can
> provide a guarantee that the `Device` has never been registered with
> userspace. To ensure this, we modify `Registration` so that creating a new
> `Registration` requires passing ownership of an `UnregisteredDevice`.
> 
> Signed-off-by: Lyude Paul <[email protected]>
> 
> ---
> V2:
> * Make sure that `UnregisteredDevice` is not thread-safe (since DRM device
>   initialization is also not thread-safe)
> * Rename from AnyCtx to Uninit, I think this name actually makes a bit more
>   sense.
> * Change assume_registered() to assume_ctx()
>   Since it looks like in some situations, we'll want to update the
>   DeviceContext of a object to the latest DeviceContext we know the Device
>   to be in.
> * Rename Init to Uninit
>   When we eventually add KMS support, we're going to have 3 different
>   DeviceContexts - Uninit, Init, Registered. Additionally, aside from not
>   being registered there are a number of portions of the rest of the Device
>   which also aren't usable before at least the Init context - so the naming
>   of Uninit makes this a little clearer.
> * s/DeviceContext/DeviceContext/
>   For consistency with the rest of the kernel
> * Drop as_ref::<Device<T, Uninit>>() for now since I don't actually think
>   we need this quite yet
> V3:
> * Get rid of drm_dev_ctx!, as we don't actually need to implement Send or
>   Sync ourselves
> * Remove mention of C function in drm::device::Registration rustdoc
> * Add more documentation to the DeviceContext trait, go into detail about
>   the various setup phases and such.
> * Add missing period to comment in `UnregisteredDevice::new()`.
> V4:
> * Address some comments from Danilo I missed last round:
>   * Remove leftover rebase detritus from new_foreign_owned()
>     (the seemingly useless cast)
>   * Remove no-op mention in Registered device context
> 
> Signed-off-by: Lyude Paul <[email protected]>
> ---
>  drivers/gpu/drm/nova/driver.rs |   8 +-
>  drivers/gpu/drm/tyr/driver.rs  |  10 +-
>  rust/kernel/drm/device.rs      | 178 +++++++++++++++++++++++++++------
>  rust/kernel/drm/driver.rs      |  35 +++++--
>  rust/kernel/drm/mod.rs         |   4 +
>  5 files changed, 190 insertions(+), 45 deletions(-)
> 
...
> --- a/rust/kernel/drm/device.rs
> +++ b/rust/kernel/drm/device.rs
...

> +pub struct UnregisteredDevice<T: drm::Driver>(ARef<Device<T, Uninit>>, 
> NotThreadSafe);
> +
> +impl<T: drm::Driver> Deref for UnregisteredDevice<T> {
> +    type Target = Device<T, Uninit>;
> +
> +    fn deref(&self) -> &Self::Target {
> +        &self.0
> +    }
>  }
>  
> -impl<T: drm::Driver> Device<T> {
> +impl<T: drm::Driver> UnregisteredDevice<T> {
>      const VTABLE: bindings::drm_driver = drm_legacy_fields! {
>          load: None,
>          open: Some(drm::File::<T::File>::open_callback),
>          postclose: Some(drm::File::<T::File>::postclose_callback),
>          unload: None,
> -        release: Some(Self::release),
> +        release: Some(Device::<T>::release),
>          master_set: None,
>          master_drop: None,
>          debugfs_init: None,
> @@ -108,8 +185,10 @@ impl<T: drm::Driver> Device<T> {
>  
>      const GEM_FOPS: bindings::file_operations = drm::gem::create_fops();
>  
> -    /// Create a new `drm::Device` for a `drm::Driver`.
> -    pub fn new(dev: &device::Device, data: impl PinInit<T::Data, Error>) -> 
> Result<ARef<Self>> {
> +    /// Create a new `UnregisteredDevice` for a `drm::Driver`.
> +    ///
> +    /// This can be used to create a 
> [`Registration`](kernel::drm::Registration).
> +    pub fn new(dev: &device::Device, data: impl PinInit<T::Data, Error>) -> 
> Result<Self> {
>          // `__drm_dev_alloc` uses `kmalloc()` to allocate memory, hence 
> ensure a `kmalloc()`
>          // compatible `Layout`.
>          let layout = Kmalloc::aligned_layout(Layout::new::<Self>());
> @@ -117,12 +196,12 @@ pub fn new(dev: &device::Device, data: impl 
> PinInit<T::Data, Error>) -> Result<A
>          // SAFETY:
>          // - `VTABLE`, as a `const` is pinned to the read-only section of 
> the compilation,
>          // - `dev` is valid by its type invarants,
> -        let raw_drm: *mut Self = unsafe {
> +        let raw_drm: *mut Device<T, Uninit> = unsafe {
>              bindings::__drm_dev_alloc(
>                  dev.as_raw(),
>                  &Self::VTABLE,
>                  layout.size(),
> -                mem::offset_of!(Self, dev),
> +                mem::offset_of!(Device<T, Uninit>, dev),
>              )
>          }
>          .cast();

When I was testing this series with Tyr, I got this kernel Oops:

[   31.351423] tyr fb000000.gpu: mali-unknown id 0xa867 major 0x67 minor 0x0 
status 0x5
[   31.352324] tyr fb000000.gpu: Features: L2:0x7120306 Tiler:0x809 Mem:0x301 
MMU:0x2830 AS:0xff
[   31.353299] tyr fb000000.gpu: shader_present=0x0000000000050005 
l2_present=0x0000000000000001 tiler_present=0x0000000000000001
[   31.357585] Unable to handle kernel paging request at virtual address 
ffff003064726553
[   31.358324] Mem abort info:
[   31.358579]   ESR = 0x0000000096000005
[   31.358956]   EC = 0x25: DABT (current EL), IL = 32 bits
[   31.359592]   SET = 0, FnV = 0
[   31.359898]   EA = 0, S1PTW = 0
[   31.360185]   FSC = 0x05: level 1 translation fault
[   31.360620] Data abort info:
[   31.360882]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[   31.361370]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[   31.361832]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[   31.361839] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000002926000
[   31.361847] [ffff003064726553] pgd=0000000000000000, p4d=18000002fffff403, 
pud=0000000000000000

[   31.364156] Internal error: Oops: 0000000096000005 [#1]  SMP
[   31.365050] Modules linked in: tyr(+) soundcore sha256 cfg80211 rfkill 
pci_endpoint_test fuse dm_mod ipv6
[   31.366212] CPU: 3 UID: 0 PID: 254 Comm: (udev-worker) Not tainted 
6.19.0-rc5 #281 PREEMPT
[   31.366942] Hardware name: Radxa ROCK 5B (DT)
[   31.367325] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   31.367935] pc : drmm_kmalloc+0x34/0x128
[   31.368286] lr : drm_gem_init+0x70/0xb8
[   31.368627] sp : ffff800081ec3250
[   31.368918] x29: ffff800081ec3250 x28: ffff800082c01000 x27: ffff800081ec3420
[   31.369549] x26: ffff000100dc8a00 x25: ffff0001009ab960 x24: ffff000107a14840
[   31.370179] x23: ffff0001009ab7e0 x22: 0000000000000138 x21: 0000000000000dc0
[   31.370808] x20: ffff000100a801d0 x19: ffff000100a801d0 x18: 0000000000000003
[   31.371438] x17: 0000000000000000 x16: ffffc82e1355a228 x15: ffff000107c08978
[   31.372068] x14: ffffffffffffffff x13: 0000000000000044 x12: 0000000000000000
[   31.372697] x11: 0000000000000030 x10: ffffc82e12bd2db0 x9 : ffff003064726163
[   31.373327] x8 : 00000000000001b8 x7 : ffffc82e107c8dc8 x6 : 0000000000000000
[   31.373956] x5 : 0000000000000000 x4 : 0000000000000003 x3 : 0000000000000000
[   31.374591] x2 : 0000000000000dc0 x1 : 0000000000000dc0 x0 : 00000000000001b8
[   31.375221] Call trace:
[  OK  ] Listening on systemd-hostnamed.sock[   31.375438]  
drmm_kmalloc+0x34/0x128 (P)
et - Hostname Service Socket.
[   31.376264]  drm_gem_init+0x70/0xb8
[   31.376838]  drm_dev_init+0x27c/0x30c
[   31.377164]  __drm_dev_alloc+0x44/0x88
[   31.377496]  
_RNvMs3_NtNtCs8pcx3nCgX4X_6kernel3drm6deviceINtB5_18UnregisteredDeviceNtNtCs5CCNNSUyUK_3tyr6driver12TyrDrmDriverE3newB19_+0x50/0xe4
 [tyr]
[   31.378690]  
_RNvXs0_NtCs5CCNNSUyUK_3tyr6driverNtB5_21TyrPlatformDeviceDataNtNtCs8pcx3nCgX4X_6kernel8platform6Driver5probe+0x30c/0x4ec
 [tyr]
[   31.379794]  
_RNvMs0_NtCs8pcx3nCgX4X_6kernel8platformINtB5_7AdapterNtNtCs5CCNNSUyUK_3tyr6driver21TyrPlatformDeviceDataE14probe_callbackBT_+0x60/0x220
 [tyr]
[   31.381010]  platform_probe+0x5c/0x9c
[   31.381337]  really_probe+0x154/0x43c
[   31.381665]  __driver_probe_device+0xa4/0x118
[   31.382052]  driver_probe_device+0x40/0x230
[   31.382423]  __driver_attach+0xf8/0x288
[   31.382764]  bus_for_each_dev+0xec/0x144
[   31.383112]  driver_attach+0x24/0x30
[   31.383430]  bus_add_driver+0x14c/0x2a0
[   31.383771]  driver_register+0x68/0x100
[   31.384109]  __platform_driver_register+0x20/0x2c
[   31.384524]  init_module+0x84/0xfa0 [tyr]
[   31.384885]  do_one_initcall+0x104/0x35c
[   31.385235]  do_init_module+0x58/0x228
[   31.385569]  load_module+0x16f4/0x18f8
[   31.385901]  __arm64_sys_finit_module+0x24c/0x368
[   31.386317]  invoke_syscall+0x40/0xcc
[   31.386646]  el0_svc_common+0x80/0xd8
[   31.386652]  do_el0_svc+0x1c/0x28
[  OK  ] Reached target sockets.target -[   31.387270]  el0_svc+0x54/0x1d4
 Socket Units.
[   31.388011]  el0t_64_sync_handler+0x84/0x12c
[   31.388509]  el0t_64_sync+0x198/0x19c
[   31.388838] Code: 54000682 f9400689 aa0803e0 2a1503e1 (b943f122)
[   31.389372] ---[ end trace 0000000000000000 ]---
[   31.396698] Unable to handle kernel paging request at virtual address 
0030303030306266

I just fixed it by changing the layout size from Self -> Device<T, Uninit>

After this fix, it's working great to load the firmware with the
Unregistered Device.

thanks,
Deborah

Reply via email to