On 4/22/26 6:40 AM, Alexandre Courbot wrote:
> Currently the GSP is left running and the WPR2 memory region untouched
> when the driver is unbound. This is obviously not ideal for at least two
> reasons:

Hi,

Is this ready to merge, or are you looking for more reviews?

thanks,
-- 
John Hubbard

>
> - Probing requires setting up the WPR2 region, which cannot be done if
>   there is already one in place. Hence the current requirement to reset
>   the GPU (using e.g. `echo 1 >/sys/bus/pci/devices/.../reset`) before
>   the driver can be probed again after removal.
> - The running GSP may still attempt to access shared memory regions
>   which the kernel might recycle.
>
> On top of that, there is a nasty bug in the Blackwell VBIOS that
> sometimes borks the GPU upon PCI reset, requiring a reboot. So relying
> on the PCI reset to unload/reload Nova is really not practical here.
>
> This series does what is needed to leave the GPU in a clean state after
> unbind, for all currently supported GPUs. Blackwell support is trivial
> and will be added alongside the Blackwell series [1] if this can be
> merged first.
>
> The first patch adds a `warn_on_err` utility macro to the kernel crate
> as it is useful to warn on failures in the driver unbind path, but I can
> remove it if it is not deemed useful.
>
> This series applies cleanly on `master` as of today.
>
> [1] https://lore.kernel.org/all/[email protected]/
>
> Signed-off-by: Alexandre Courbot <[email protected]>
> ---
> Changes in v3:
> - Disambiguate doccomment for `warn_on_err`.
> - Test the correct bit instead of the whole register value to determine
>   that the GSP has stopped.
> - Use an enum instead of a boolean to encode the power level when
>   shutting down the GSP.
> - Add missing newline to `dev_err`.
> - Add missing doccomments for new types.
> - Use values from bindings instead of magic numbers.
> - Remove the redundant `get_gsp_info` function.
> - Better document Booter Unloader mailbox sentinel value, and check the
>   value of mbox0 upon return.
> - Link to v2:
>   https://patch.msgid.link/[email protected]
>
> Changes in v2:
> - Rebase on top of `master` and remove unneeded/obsolete preparatory patches.
> - Tidy up the imports of commands from the `fw` module in the `gsp` module.
> - Link to v1:
>   https://patch.msgid.link/[email protected]
>
> ---
> Alexandre Courbot (6):
>       rust: add warn_on_err macro
>       gpu: nova-core: use warn_on_err macro
>       gpu: nova-core: remove unneeded get_gsp_info proxy function
>       gpu: nova-core: do not import firmware commands into GSP command module
>       gpu: nova-core: send UNLOADING_GUEST_DRIVER GSP command upon unloading
>       gpu: nova-core: run Booter Unloader and FWSEC-SB upon unbinding
>
>  drivers/gpu/nova-core/firmware/booter.rs          |   1 -
>  drivers/gpu/nova-core/firmware/fwsec.rs           |   1 -
>  drivers/gpu/nova-core/gpu.rs                      |  21 +++--
>  drivers/gpu/nova-core/gsp/boot.rs                 | 100 +++++++++++++++++++++-
>  drivers/gpu/nova-core/gsp/commands.rs             |  69 +++++++++++----
>  drivers/gpu/nova-core/gsp/fw.rs                   |   4 +
>  drivers/gpu/nova-core/gsp/fw/commands.rs          |  44 ++++++++++
>  drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs |  11 +++
>  drivers/gpu/nova-core/regs.rs                     |   5 ++
>  rust/kernel/bug.rs                                |  10 +++
>  10 files changed, 241 insertions(+), 25 deletions(-)
> ---
> base-commit: b4e07588e743c989499ca24d49e752c074924a9a
> change-id: 20251216-nova-unload-4029b3b76950
>
> Best regards,
> --
> Alexandre Courbot <[email protected]>
>
