Currently the GSP is left running and the WPR2 memory region untouched when the driver is unbound. This is obviously not ideal for at least two reasons:

- Probing requires setting up the WPR2 region, which cannot be done if
  there is already one in place. This is why the GPU currently has to be
  reset (using e.g. `echo 1 >/sys/bus/pci/devices/.../reset`) before the
  driver can be probed again after removal.
- The running GSP may still attempt to access shared memory regions,
  which the kernel might recycle.

This patchset does what is needed to leave the GPU in a clean state
after unbind.

First are a few preparatory patches:

- Running the unload sequence requires mutable access to the driver
  data, but the current device unbind method only passes a non-mutable
  reference to it. Since the driver data is destroyed after the call to
  `unbind`, we can just give ownership back to the driver at this stage
  to solve this issue. The need for mutable access is likely to go away
  in Nova once we support concurrency on the command queue, but for now
  we need it and it looks like a sensible design direction anyway.
- A `warn_on_err` macro is introduced to call `warn_on` if the passed
  `Result` is an error. This simplifies the code of the unbind sequence,
  as we need to proceed to the next step even if the previous one
  failed.
- A fix (?) to the automatically-generated pin-projected structures,
  suppressing the warnings emitted when they are only partially used.

With these in place, the rest of the patchset is relatively trivial. We
change the signatures of the methods related to unbinding to work with
mutable pinned driver data, then implement the two steps of the GPU
unbind sequence: asking the GSP to shut down, and removing the WPR2
protected memory area.

This series sits on top of the following:

- Nova fixes for this cycle [1].
- Nova misc improvements [2].
- Transmute on ZSTs [3].

A tree with all the required patches is available in [4].
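
To illustrate the control flow described above, here is a minimal
userspace sketch, not the code from the patches. Everything in it is an
assumption made for illustration: the `Gpu` type, its fields, the
`Error` enum and the two step methods are hypothetical, the macro prints
to stderr instead of calling the kernel's `warn_on`, and the exact shape
of the real `warn_on_err` (including whether it returns the `Result`)
may differ. The real `unbind` also works on pinned driver data rather
than a plain value.

```rust
#[derive(Debug)]
enum Error {
    Timeout,
}

// Hypothetical stand-in for `warn_on_err`: report an `Err`, then hand the
// `Result` back unchanged so the caller decides what to do with it.
macro_rules! warn_on_err {
    ($e:expr) => {{
        let res = $e;
        if let Err(err) = &res {
            eprintln!("WARNING at {}:{}: {:?}", file!(), line!(), err);
        }
        res
    }};
}

// Hypothetical driver data; the fields only model the state discussed above.
struct Gpu {
    gsp_responsive: bool,
    gsp_running: bool,
    wpr2_set: bool,
}

impl Gpu {
    // Step 1: ask the GSP to shut down. Needs `&mut self`.
    fn shutdown_gsp(&mut self) -> Result<(), Error> {
        if !self.gsp_responsive {
            return Err(Error::Timeout);
        }
        self.gsp_running = false;
        Ok(())
    }

    // Step 2: remove the WPR2 protected memory area. Needs `&mut self`.
    fn remove_wpr2(&mut self) -> Result<(), Error> {
        self.wpr2_set = false;
        Ok(())
    }
}

// `unbind` receives the driver data by value, so mutable access is free.
// The final state is returned only so that the sketch can be inspected.
fn unbind(mut gpu: Gpu) -> Gpu {
    // A failed step is reported but does not stop the sequence.
    let _ = warn_on_err!(gpu.shutdown_gsp());
    let _ = warn_on_err!(gpu.remove_wpr2());
    gpu
}
```

In this sketch, a GSP that fails to answer the shutdown request produces
a warning, and the WPR2 removal step is still attempted afterwards.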

[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/
[3] https://lore.kernel.org/all/[email protected]/
[4] https://github.com/Gnurou/linux/tree/b4/nova-unload

Signed-off-by: Alexandre Courbot <[email protected]>
---
Alexandre Courbot (7):
      rust: pci: pass driver data by value to `unbind`
      rust: add warn_on_err macro
      gpu: nova-core: use warn_on_err macro
      [RFC] rust: pin-init: allow `dead_code` on projection structure
      gpu: nova-nova: use pin-init projections
      gpu: nova-core: send UNLOADING_GUEST_DRIVER GSP command GSP upon unloading
      gpu: nova-core: run Booter Unloader and FWSEC-SB upon unbinding

 drivers/gpu/nova-core/driver.rs                   |  4 +-
 drivers/gpu/nova-core/firmware/booter.rs          |  1 -
 drivers/gpu/nova-core/firmware/fwsec.rs           |  1 -
 drivers/gpu/nova-core/gpu.rs                      | 25 ++++++--
 drivers/gpu/nova-core/gsp/boot.rs                 | 77 +++++++++++++++++++++++
 drivers/gpu/nova-core/gsp/commands.rs             | 42 +++++++++++++
 drivers/gpu/nova-core/gsp/fw.rs                   |  4 ++
 drivers/gpu/nova-core/gsp/fw/commands.rs          | 27 ++++++++
 drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs |  8 +++
 drivers/gpu/nova-core/regs.rs                     |  5 ++
 rust/kernel/bug.rs                                | 10 +++
 rust/kernel/pci.rs                                |  4 +-
 rust/pin-init/src/macros.rs                       |  1 +
 samples/rust/rust_driver_pci.rs                   |  2 +-
 14 files changed, 198 insertions(+), 13 deletions(-)
---
base-commit: 8d4031f6a53fe47449b91f30cd7aa5b439558874
change-id: 20251216-nova-unload-4029b3b76950

Best regards,
--
Alexandre Courbot <[email protected]>
