Re: Device / Driver and PCI Rust abstractions
https://lore.kernel.org/rust-for-linux/20240520172554.182094-1-d...@redhat.com/
[RFC PATCH 8/8] nova: add initial driver stub
Add the initial driver stub of Nova, a Rust-based GSP-only driver for Nvidia GPUs. Nova, in the long term, is intended to serve as the successor of Nouveau for GSP-firmware-based GPUs. [1] As a stub driver Nova's focus is to make use of the most basic device / driver infrastructure required to build a DRM driver on the PCI bus and serve as demonstration example and justification for this infrastructure. In further consequence, the idea is to develop Nova continuously upstream, using those increments to lift further Rust abstractions and infrastructure upstream. Link: https://lore.kernel.org/dri-devel/Zfsj0_tb-0-tNrJy@cassiopeiae/T/#u [1] Signed-off-by: Danilo Krummrich --- MAINTAINERS| 10 ++ drivers/gpu/drm/Kconfig| 2 + drivers/gpu/drm/Makefile | 1 + drivers/gpu/drm/nova/Kconfig | 11 +++ drivers/gpu/drm/nova/Makefile | 3 + drivers/gpu/drm/nova/driver.rs | 110 + drivers/gpu/drm/nova/file.rs | 71 ++ drivers/gpu/drm/nova/gem.rs| 50 ++ drivers/gpu/drm/nova/gpu.rs| 172 + drivers/gpu/drm/nova/nova.rs | 20 include/uapi/drm/nova_drm.h| 101 +++ rust/uapi/uapi_helper.h| 1 + 12 files changed, 552 insertions(+) create mode 100644 drivers/gpu/drm/nova/Kconfig create mode 100644 drivers/gpu/drm/nova/Makefile create mode 100644 drivers/gpu/drm/nova/driver.rs create mode 100644 drivers/gpu/drm/nova/file.rs create mode 100644 drivers/gpu/drm/nova/gem.rs create mode 100644 drivers/gpu/drm/nova/gpu.rs create mode 100644 drivers/gpu/drm/nova/nova.rs create mode 100644 include/uapi/drm/nova_drm.h diff --git a/MAINTAINERS b/MAINTAINERS index aa3b947fb080..1ca0ea445e3f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6863,6 +6863,16 @@ T: git https://gitlab.freedesktop.org/drm/nouveau.git F: drivers/gpu/drm/nouveau/ F: include/uapi/drm/nouveau_drm.h +DRM DRIVER (STUB) FOR NVIDIA GSP GPUS [RUST] +M: Danilo Krummrich +L: dri-devel@lists.freedesktop.org +L: nouv...@lists.freedesktop.org +S: Supported +C: irc://irc.oftc.net/nouveau +T: git https://gitlab.freedesktop.org/drm/nova.git +F: 
drivers/gpu/drm/nova/ +F: include/uapi/drm/nova_drm.h + DRM DRIVER FOR OLIMEX LCD-OLINUXINO PANELS M: Stefan Mavrodiev S: Maintained diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 5a0c476361c3..0bb0442a252e 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -274,6 +274,8 @@ source "drivers/gpu/drm/amd/amdgpu/Kconfig" source "drivers/gpu/drm/nouveau/Kconfig" +source "drivers/gpu/drm/nova/Kconfig" + source "drivers/gpu/drm/i915/Kconfig" source "drivers/gpu/drm/xe/Kconfig" diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 104b42df2e95..de6dc006cb7f 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -143,6 +143,7 @@ obj-$(CONFIG_DRM_VMWGFX)+= vmwgfx/ obj-$(CONFIG_DRM_VGEM) += vgem/ obj-$(CONFIG_DRM_VKMS) += vkms/ obj-$(CONFIG_DRM_NOUVEAU) +=nouveau/ +obj-$(CONFIG_DRM_NOVA_STUB) += nova/ obj-$(CONFIG_DRM_EXYNOS) +=exynos/ obj-$(CONFIG_DRM_ROCKCHIP) +=rockchip/ obj-$(CONFIG_DRM_GMA500) += gma500/ diff --git a/drivers/gpu/drm/nova/Kconfig b/drivers/gpu/drm/nova/Kconfig new file mode 100644 index ..1ae2d1322a92 --- /dev/null +++ b/drivers/gpu/drm/nova/Kconfig @@ -0,0 +1,11 @@ +config DRM_NOVA_STUB + tristate "Nova GPU driver stub" + depends on DRM + depends on PCI + depends on RUST + default n + help + Choose this if you want to build the Nova stub driver for Nvidia + GSP-based GPUs. + + If M is selected, the module will be called nova. 
diff --git a/drivers/gpu/drm/nova/Makefile b/drivers/gpu/drm/nova/Makefile new file mode 100644 index ..733ac5fb9f4f --- /dev/null +++ b/drivers/gpu/drm/nova/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-$(CONFIG_DRM_NOVA_STUB) += nova.o diff --git a/drivers/gpu/drm/nova/driver.rs b/drivers/gpu/drm/nova/driver.rs new file mode 100644 index ..47eaa2eddf16 --- /dev/null +++ b/drivers/gpu/drm/nova/driver.rs @@ -0,0 +1,110 @@ +// SPDX-License-Identifier: GPL-2.0 + +use alloc::boxed::Box; +use core::pin::Pin; +use kernel::{ +bindings, c_str, device, driver, drm, +drm::{drv, ioctl}, +pci, +pci::define_pci_id_table, +prelude::*, +sync::Arc, +}; + +use crate::{file::File, gpu::Gpu}; + +pub(crate) struct NovaDriver; + +/// Convienence type alias for the DRM device type for this driver +pub(crate) type NovaDevice = drm::device::Device; + +#[allow(dead_code)] +pub(crate) struct NovaData { +pub(crate) gpu: Arc, +pub(crate) pdev: pci::Device, +} + +type DeviceData = device::Data, NovaData>; + +const INFO: drm::drv::DriverInfo = drm::drv::DriverInfo { +major: 0, +minor: 0,
[RFC PATCH 7/8] rust: add firmware abstractions
Add an abstraction around the kernel's firmware API to request firmware images. The abstraction provides functions to access the firmware buffer and / or copy it to a new buffer allocated with a given allocator backend. The firmware is released once the abstraction is dropped.

Signed-off-by: Danilo Krummrich
---
 rust/bindings/bindings_helper.h |  1 +
 rust/kernel/firmware.rs         | 74 +++++++++++++++++++++++++++++++++
 rust/kernel/lib.rs              |  1 +
 3 files changed, 76 insertions(+)
 create mode 100644 rust/kernel/firmware.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index b245db8d5a87..e4ffc47da5ec 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include <linux/firmware.h>
 #include
 #include
 #include
diff --git a/rust/kernel/firmware.rs b/rust/kernel/firmware.rs
new file mode 100644
index ..700504fb3c9c
--- /dev/null
+++ b/rust/kernel/firmware.rs
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Firmware abstraction
+//!
+//! C header: [`include/linux/firmware.h`](../../../../include/linux/firmware.h)
+
+use crate::{bindings, device::Device, error::Error, error::Result, str::CStr, types::Opaque};
+
+/// Abstraction around a C firmware struct.
+///
+/// This is a simple abstraction around the C firmware API. Just like with the C API, firmware can
+/// be requested. Once requested the abstraction provides direct access to the firmware buffer as
+/// `&[u8]`. Alternatively, the firmware can be copied to a new buffer using `Firmware::copy`. The
+/// firmware is released once [`Firmware`] is dropped.
+///
+/// # Examples
+///
+/// ```
+/// let fw = Firmware::request("path/to/firmware.bin", dev.as_ref())?;
+/// driver_load_firmware(fw.data());
+/// ```
+pub struct Firmware(Opaque<*const bindings::firmware>);
+
+impl Firmware {
+    /// Send a firmware request and wait for it. See also `bindings::request_firmware`.
+    pub fn request(name: &CStr, dev: &Device) -> Result<Self> {
+        let fw = Opaque::uninit();
+
+        let ret = unsafe { bindings::request_firmware(fw.get(), name.as_char_ptr(), dev.as_raw()) };
+        if ret != 0 {
+            return Err(Error::from_errno(ret));
+        }
+
+        Ok(Firmware(fw))
+    }
+
+    /// Send a request for an optional firmware image. See also
+    /// `bindings::firmware_request_nowarn`.
+    pub fn request_nowarn(name: &CStr, dev: &Device) -> Result<Self> {
+        let fw = Opaque::uninit();
+
+        let ret = unsafe {
+            bindings::firmware_request_nowarn(fw.get(), name.as_char_ptr(), dev.as_raw())
+        };
+        if ret != 0 {
+            return Err(Error::from_errno(ret));
+        }
+
+        Ok(Firmware(fw))
+    }
+
+    /// Returns the size of the requested firmware in bytes.
+    pub fn size(&self) -> usize {
+        unsafe { (*(*self.0.get())).size }
+    }
+
+    /// Returns the requested firmware as `&[u8]`.
+    pub fn data(&self) -> &[u8] {
+        unsafe { core::slice::from_raw_parts((*(*self.0.get())).data, self.size()) }
+    }
+}
+
+impl Drop for Firmware {
+    fn drop(&mut self) {
+        unsafe { bindings::release_firmware(*self.0.get()) };
+    }
+}
+
+// SAFETY: `Firmware` only holds a pointer to a C firmware struct, which is safe to be used from
+// any thread.
+unsafe impl Send for Firmware {}
+
+// SAFETY: `Firmware` only holds a pointer to a C firmware struct, references to which are safe to
+// be used from any thread.
+unsafe impl Sync for Firmware {}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 6415968ee3b8..ed97d131661a 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -35,6 +35,7 @@
 #[cfg(CONFIG_DRM = "y")]
 pub mod drm;
 pub mod error;
+pub mod firmware;
 pub mod init;
 pub mod ioctl;
 #[cfg(CONFIG_KUNIT)]
--
2.45.1
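The core idea of the `Firmware` abstraction is RAII: the resource is acquired in a constructor and released unconditionally in `Drop`. The following standalone sketch mirrors that shape in userspace Rust. Everything here (`mock_request_firmware`, `RELEASE_CALLS`, the byte values) is an invented stand-in for the C firmware API, not kernel code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts calls to the release_firmware() analogue, so we can observe Drop.
static RELEASE_CALLS: AtomicUsize = AtomicUsize::new(0);

struct RawFirmware {
    data: Vec<u8>,
}

// Stand-in for bindings::request_firmware(): succeeds for ".bin" names.
fn mock_request_firmware(name: &str) -> Option<Box<RawFirmware>> {
    if name.ends_with(".bin") {
        Some(Box::new(RawFirmware { data: vec![0xde, 0xad] }))
    } else {
        None // mimics request_firmware() failing with -ENOENT
    }
}

// Stand-in for bindings::release_firmware().
fn mock_release_firmware(_fw: Box<RawFirmware>) {
    RELEASE_CALLS.fetch_add(1, Ordering::SeqCst);
}

/// Mirrors the kernel `Firmware` type: the buffer is only reachable while
/// the handle lives, and release happens exactly once, in `drop`.
pub struct Firmware(Option<Box<RawFirmware>>);

impl Firmware {
    pub fn request(name: &str) -> Result<Self, i32> {
        mock_request_firmware(name)
            .map(|fw| Firmware(Some(fw)))
            .ok_or(-2) // -ENOENT
    }

    pub fn size(&self) -> usize {
        self.0.as_ref().unwrap().data.len()
    }

    pub fn data(&self) -> &[u8] {
        &self.0.as_ref().unwrap().data
    }
}

impl Drop for Firmware {
    fn drop(&mut self) {
        mock_release_firmware(self.0.take().unwrap());
    }
}

pub fn release_count() -> usize {
    RELEASE_CALLS.load(Ordering::SeqCst)
}

fn main() {
    {
        let fw = Firmware::request("gpu.bin").expect("firmware");
        assert_eq!(fw.size(), 2);
    } // fw dropped here, so the release_firmware() analogue runs
    println!("releases: {}", release_count());
}
```

Because the borrow returned by `data()` is tied to the handle, the firmware buffer cannot outlive its release, which is the property the kernel abstraction encodes.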
[RFC PATCH 6/8] rust: drm: gem: Add GEM object abstraction
From: Asahi Lina The DRM GEM subsystem is the DRM memory management subsystem used by most modern drivers. Add a Rust abstraction to allow Rust DRM driver implementations to use it. Signed-off-by: Asahi Lina Co-developed-by: Danilo Krummrich Signed-off-by: Danilo Krummrich --- rust/bindings/bindings_helper.h | 1 + rust/helpers.c | 23 ++ rust/kernel/drm/drv.rs | 4 +- rust/kernel/drm/gem/mod.rs | 406 rust/kernel/drm/mod.rs | 1 + 5 files changed, 433 insertions(+), 2 deletions(-) create mode 100644 rust/kernel/drm/gem/mod.rs diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index c591811ccb67..b245db8d5a87 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -9,6 +9,7 @@ #include #include #include +#include #include #include #include diff --git a/rust/helpers.c b/rust/helpers.c index dc2405772b1a..30e86bf00337 100644 --- a/rust/helpers.c +++ b/rust/helpers.c @@ -20,6 +20,7 @@ * Sorted alphabetically. */ +#include #include #include #include @@ -302,6 +303,28 @@ u64 rust_helper_pci_resource_len(struct pci_dev *pdev, int barnr) return pci_resource_len(pdev, barnr); } +#ifdef CONFIG_DRM + +void rust_helper_drm_gem_object_get(struct drm_gem_object *obj) +{ + drm_gem_object_get(obj); +} +EXPORT_SYMBOL_GPL(rust_helper_drm_gem_object_get); + +void rust_helper_drm_gem_object_put(struct drm_gem_object *obj) +{ + drm_gem_object_put(obj); +} +EXPORT_SYMBOL_GPL(rust_helper_drm_gem_object_put); + +__u64 rust_helper_drm_vma_node_offset_addr(struct drm_vma_offset_node *node) +{ + return drm_vma_node_offset_addr(node); +} +EXPORT_SYMBOL_GPL(rust_helper_drm_vma_node_offset_addr); + +#endif + /* * `bindgen` binds the C `size_t` type as the Rust `usize` type, so we can * use it in contexts where Rust expects a `usize` like slice (array) indices. 
diff --git a/rust/kernel/drm/drv.rs b/rust/kernel/drm/drv.rs index c5a63663ea21..063b420f57e5 100644 --- a/rust/kernel/drm/drv.rs +++ b/rust/kernel/drm/drv.rs @@ -113,7 +113,7 @@ pub struct AllocOps { } /// Trait for memory manager implementations. Implemented internally. -pub trait AllocImpl: Sealed { +pub trait AllocImpl: Sealed + drm::gem::IntoGEMObject { /// The C callback operations for this memory manager. const ALLOC_OPS: AllocOps; } @@ -243,7 +243,7 @@ pub fn new(parent: ::Device) -> Result { drm: drm::device::Device::new(parent, )?, registered: false, vtable, -fops: Default::default(), // TODO: GEM abstraction +fops: drm::gem::create_fops(), _pin: PhantomPinned, _p: PhantomData, }) diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs new file mode 100644 index ..4cd85d5f1df8 --- /dev/null +++ b/rust/kernel/drm/gem/mod.rs @@ -0,0 +1,406 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT + +//! DRM GEM API +//! +//! C header: [`include/linux/drm/drm_gem.h`](../../../../include/linux/drm/drm_gem.h) + +use alloc::boxed::Box; + +use crate::{ +alloc::flags::*, +bindings, +drm::{device, drv, file}, +error::{to_result, Result}, +prelude::*, +}; +use core::{marker::PhantomPinned, mem, ops::Deref, ops::DerefMut}; + +/// GEM object functions, which must be implemented by drivers. +pub trait BaseDriverObject: Sync + Send + Sized { +/// Create a new driver data object for a GEM object of a given size. +fn new(dev: ::Device, size: usize) -> impl PinInit; + +/// Open a new handle to an existing object, associated with a File. +fn open( +_obj: &<::Driver as drv::Driver>::Object, +_file: ::File<<::Driver as drv::Driver>::File>, +) -> Result { +Ok(()) +} + +/// Close a handle to an existing object, associated with a File. 
+fn close( +_obj: &<::Driver as drv::Driver>::Object, +_file: ::File<<::Driver as drv::Driver>::File>, +) { +} +} + +/// Trait that represents a GEM object subtype +pub trait IntoGEMObject: Sized + crate::private::Sealed { +/// Owning driver for this type +type Driver: drv::Driver; + +/// Returns a reference to the raw `drm_gem_object` structure, which must be valid as long as +/// this owning object is valid. +fn gem_obj() -> ::drm_gem_object; + +/// Converts a pointer to a `drm_gem_object` into a pointer to this type. +fn from_gem_obj(obj: *mut bindings::drm_gem_object) -> *mut Self; +} + +/// Trait which must be implemented by drivers using base GEM objects. +pub trait DriverObject: BaseDriverObject> { +/// Parent `Driver` for this object. +type Driver: drv::Driver; +} + +unsafe extern "C" fn free_callback(obj: *mut bindings::drm_gem_object) { +// SAFETY: All of our objects are Object. +let this = unsafe { cr
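The `from_gem_obj()` conversion above works because the driver's object type embeds the base `drm_gem_object` as its first field in a `repr(C)` struct, so a pointer to the base can be cast back to a pointer to the container. A standalone sketch of that layout trick, with invented field names:

```rust
#[repr(C)]
struct GemObject {
    // Stand-in for bindings::drm_gem_object.
    refcount: u32,
    size: usize,
}

#[repr(C)]
struct DriverObject {
    // Base object first: repr(C) guarantees it sits at offset 0,
    // so &driver_obj.gem and &driver_obj share an address.
    gem: GemObject,
    cookie: u64, // driver-private payload
}

// Analogue of IntoGEMObject::from_gem_obj(): recover the container
// from a pointer to the embedded base object.
fn from_gem_obj(obj: *mut GemObject) -> *mut DriverObject {
    obj as *mut DriverObject
}

fn main() {
    let mut d = Box::new(DriverObject {
        gem: GemObject { refcount: 1, size: 4096 },
        cookie: 0xabcd,
    });
    let gem_ptr: *mut GemObject = &mut d.gem;
    let back = from_gem_obj(gem_ptr);
    let obj_addr = &*d as *const DriverObject as usize;
    assert_eq!(back as usize, obj_addr);
    unsafe {
        assert_eq!((*back).cookie, 0xabcd);
    }
}
```

This is the Rust-side equivalent of C's `container_of()`; it is only sound while the layout guarantee (`repr(C)`, base at a known offset) holds.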
[RFC PATCH 5/8] rust: drm: file: Add File abstraction
From: Asahi Lina A DRM File is the DRM counterpart to a kernel file structure, representing an open DRM file descriptor. Add a Rust abstraction to allow drivers to implement their own File types that implement the DriverFile trait. Signed-off-by: Asahi Lina Signed-off-by: Danilo Krummrich --- rust/bindings/bindings_helper.h | 1 + rust/kernel/drm/drv.rs | 7 +- rust/kernel/drm/file.rs | 113 rust/kernel/drm/mod.rs | 1 + 4 files changed, 120 insertions(+), 2 deletions(-) create mode 100644 rust/kernel/drm/file.rs diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index 831fbfe03a47..c591811ccb67 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -8,6 +8,7 @@ #include #include +#include #include #include #include diff --git a/rust/kernel/drm/drv.rs b/rust/kernel/drm/drv.rs index 5dd8f3f8df7c..c5a63663ea21 100644 --- a/rust/kernel/drm/drv.rs +++ b/rust/kernel/drm/drv.rs @@ -131,6 +131,9 @@ pub trait Driver { /// Should be either `drm::gem::Object` or `drm::gem::shmem::Object`. type Object: AllocImpl; +/// The type used to represent a DRM File (client) +type File: drm::file::DriverFile; + /// Driver metadata const INFO: DriverInfo; @@ -200,8 +203,8 @@ macro_rules! drm_device_register { impl Registration { const VTABLE: bindings::drm_driver = drm_legacy_fields! { load: None, -open: None, // TODO: File abstraction -postclose: None, // TODO: File abstraction +open: Some(drm::file::open_callback::), +postclose: Some(drm::file::postclose_callback::), lastclose: None, unload: None, release: None, diff --git a/rust/kernel/drm/file.rs b/rust/kernel/drm/file.rs new file mode 100644 index ..a25b8c61f4ee --- /dev/null +++ b/rust/kernel/drm/file.rs @@ -0,0 +1,113 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT + +//! DRM File objects. +//! +//! 
C header: [`include/linux/drm/drm_file.h`](../../../../include/linux/drm/drm_file.h)
+
+use crate::{bindings, drm, error::Result};
+use alloc::boxed::Box;
+use core::marker::PhantomData;
+use core::pin::Pin;
+
+/// Trait that must be implemented by DRM drivers to represent a DRM File (a client instance).
+pub trait DriverFile {
+    /// The parent `Driver` implementation for this `DriverFile`.
+    type Driver: drm::drv::Driver;
+
+    /// Open a new file (called when a client opens the DRM device).
+    fn open(device: &drm::device::Device<Self::Driver>) -> Result<Pin<Box<Self>>>;
+}
+
+/// An open DRM File.
+///
+/// # Invariants
+/// `raw` is a valid pointer to a `drm_file` struct.
+#[repr(transparent)]
+pub struct File<T: DriverFile> {
+    raw: *mut bindings::drm_file,
+    _p: PhantomData<T>,
+}
+
+pub(super) unsafe extern "C" fn open_callback<T: DriverFile>(
+    raw_dev: *mut bindings::drm_device,
+    raw_file: *mut bindings::drm_file,
+) -> core::ffi::c_int {
+    let drm = unsafe { drm::device::Device::borrow(raw_dev) };
+    // SAFETY: This reference won't escape this function
+    let file = unsafe { &mut *raw_file };
+
+    let inner = match T::open(drm) {
+        Err(e) => {
+            return e.to_errno();
+        }
+        Ok(i) => i,
+    };
+
+    // SAFETY: This pointer is treated as pinned, and the Drop guarantee is upheld below.
+    file.driver_priv = Box::into_raw(unsafe { Pin::into_inner_unchecked(inner) }) as *mut _;
+
+    0
+}
+
+pub(super) unsafe extern "C" fn postclose_callback<T: DriverFile>(
+    _dev: *mut bindings::drm_device,
+    raw_file: *mut bindings::drm_file,
+) {
+    // SAFETY: This reference won't escape this function
+    let file = unsafe { &*raw_file };
+
+    // Drop the DriverFile
+    unsafe {
+        let _ = Box::from_raw(file.driver_priv as *mut T);
+    };
+}
+
+impl<T: DriverFile> File<T> {
+    // Not intended to be called externally, except via declare_drm_ioctls!()
+    #[doc(hidden)]
+    pub unsafe fn from_raw(raw_file: *mut bindings::drm_file) -> File<T> {
+        File {
+            raw: raw_file,
+            _p: PhantomData,
+        }
+    }
+
+    #[allow(dead_code)]
+    /// Return the raw pointer to the underlying `drm_file`.
+    pub(super) fn raw(&self) -> *const bindings::drm_file {
+        self.raw
+    }
+
+    /// Return an immutable reference to the raw `drm_file` structure.
+    pub(super) fn file(&self) -> &bindings::drm_file {
+        unsafe { &*self.raw }
+    }
+
+    /// Return a pinned reference to the driver file structure.
+    pub fn inner(&self) -> Pin<&T> {
+        unsafe { Pin::new_unchecked(&*(self.file().driver_priv as *const T)) }
+    }
+}
+
+impl<T: DriverFile> crate::private::Sealed for File<T> {}
+
+/// Generic trait to allow users that don't care about driver specifics to accept any File.
+///
+/// # Safety
+/// Must only be implemented for File and return the pointer, following the normal invariants
+/// of that type.
+pub unsafe trait GenericFile: crate::private::Sealed {
+    //
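The lifecycle in this patch is: `open_callback()` stores the per-client state in the C `drm_file::driver_priv` pointer via `Box::into_raw()`, ioctl paths borrow it back without taking ownership, and `postclose_callback()` reclaims ownership with `Box::from_raw()` so the state is dropped exactly once. A userspace sketch of that round trip (all names here are invented stand-ins, not the kernel API):

```rust
use std::ffi::c_void;

struct DriverFile {
    counter: u32, // hypothetical per-client state
}

// open: allocate per-client state and stash it behind an opaque pointer,
// as open_callback() does with file->driver_priv.
fn open() -> *mut c_void {
    Box::into_raw(Box::new(DriverFile { counter: 7 })) as *mut c_void
}

// ioctl path: borrow the state back without taking ownership,
// mirroring File::inner().
//
// SAFETY: `priv_` must come from open() and not yet be passed to postclose().
unsafe fn inner<'a>(priv_: *mut c_void) -> &'a DriverFile {
    &*(priv_ as *const DriverFile)
}

// postclose: reclaim ownership so Rust drops the state exactly once.
//
// SAFETY: `priv_` must come from open() and must not be used afterwards.
unsafe fn postclose(priv_: *mut c_void) {
    drop(Box::from_raw(priv_ as *mut DriverFile));
}

fn main() {
    let p = open();
    unsafe {
        assert_eq!(inner(p).counter, 7);
        postclose(p);
    }
    // Using `p` after postclose() would be use-after-free; the kernel
    // abstraction prevents this by only handing out File<T> borrows
    // while the drm_file is alive.
}
```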
[RFC PATCH 4/8] rust: drm: implement `AsRef` for DRM device
Implement `AsRef<device::Device>` for `drm::device::Device` such that the `dev_*` print macros can be used conveniently.

Signed-off-by: Danilo Krummrich
---
 rust/kernel/drm/device.rs | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs
index f72bab8dd42d..aef947893dab 100644
--- a/rust/kernel/drm/device.rs
+++ b/rust/kernel/drm/device.rs
@@ -79,6 +79,14 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
     }
 }
 
+impl<T: drm::drv::Driver> AsRef<device::Device> for Device<T> {
+    fn as_ref(&self) -> &device::Device {
+        // SAFETY: `bindings::drm_device::dev` is valid as long as the DRM device itself is
+        // valid, which is guaranteed by the type invariant.
+        unsafe { device::Device::as_ref((*self.as_raw()).dev) }
+    }
+}
+
 // SAFETY: `Device` only holds a pointer to a C device, which is safe to be used from any thread.
 unsafe impl<T: drm::drv::Driver> Send for Device<T> {}
 
-- 
2.45.1
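The point of the `AsRef` impl is that helpers generic over `AsRef<device::Device>` (like the `dev_*` print macros) accept the DRM device directly. A minimal standalone sketch of the pattern, with invented types:

```rust
// Stand-in for the base device::Device.
struct Device {
    name: String,
}

// Stand-in for drm::device::Device<T>, which embeds/points to a base device.
struct DrmDevice {
    base: Device,
}

impl AsRef<Device> for DrmDevice {
    fn as_ref(&self) -> &Device {
        &self.base
    }
}

// Analogue of a dev_* print macro: accepts anything that can be viewed
// as a base device, so it works on DrmDevice without explicit conversion.
fn dev_name<D: AsRef<Device>>(dev: &D) -> &str {
    &dev.as_ref().name
}

fn main() {
    let drm = DrmDevice {
        base: Device { name: "card0".into() },
    };
    assert_eq!(dev_name(&drm), "card0");
}
```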
[RFC PATCH 3/8] rust: drm: Add Device and Driver abstractions
From: Asahi Lina Add abstractions for DRM drivers and devices. These go together in one commit since both are fairly tightly coupled types. A few things have been stubbed out, to be implemented as further bits of the DRM subsystem are introduced. Signed-off-by: Asahi Lina Co-developed-by: Danilo Krummrich Signed-off-by: Danilo Krummrich --- rust/bindings/bindings_helper.h | 2 + rust/kernel/drm/device.rs | 87 + rust/kernel/drm/drv.rs | 318 rust/kernel/drm/mod.rs | 2 + 4 files changed, 409 insertions(+) create mode 100644 rust/kernel/drm/device.rs create mode 100644 rust/kernel/drm/drv.rs diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index 14188034285a..831fbfe03a47 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -6,6 +6,8 @@ * Sorted alphabetically. */ +#include +#include #include #include #include diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs new file mode 100644 index ..f72bab8dd42d --- /dev/null +++ b/rust/kernel/drm/device.rs @@ -0,0 +1,87 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT + +//! DRM device. +//! +//! C header: [`include/linux/drm/drm_device.h`](../../../../include/linux/drm/drm_device.h) + +use crate::{ +bindings, device, drm, +error::code::*, +error::from_err_ptr, +error::Result, +types::{ARef, AlwaysRefCounted, ForeignOwnable, Opaque}, +}; +use alloc::boxed::Box; +use core::{ffi::c_void, marker::PhantomData, pin::Pin, ptr::NonNull}; + +/// A typed DRM device with a specific driver. The device is always reference-counted. +#[repr(transparent)] +pub struct Device(Opaque, PhantomData); + +impl Device { +pub(crate) fn new( +dev: ::Device, +vtable: >, +) -> Result> { +let raw_drm = unsafe { bindings::drm_dev_alloc(&**vtable, dev.as_raw()) }; +let raw_drm = NonNull::new(from_err_ptr(raw_drm)? as *mut _).ok_or(ENOMEM)?; + +// SAFETY: The reference count is one, and now we take ownership of that reference as a +// drm::device::Device. 
+        Ok(unsafe { ARef::from_raw(raw_drm) })
+    }
+
+    pub(crate) fn as_raw(&self) -> *mut bindings::drm_device {
+        self.0.get()
+    }
+
+    // Not intended to be called externally, except via declare_drm_ioctls!()
+    #[doc(hidden)]
+    pub unsafe fn borrow<'a>(raw: *const bindings::drm_device) -> &'a Self {
+        unsafe { &*(raw as *const Self) }
+    }
+
+    pub(crate) fn raw_data(&self) -> *const c_void {
+        // SAFETY: `self` is guaranteed to hold a valid `bindings::drm_device` pointer.
+        unsafe { (*self.as_raw()).dev_private }
+    }
+
+    // SAFETY: Must be called only once after device creation.
+    pub(crate) unsafe fn set_raw_data(&self, ptr: *const c_void) {
+        // SAFETY: Safe as by the safety precondition.
+        unsafe { (*self.as_raw()).dev_private = ptr as _ };
+    }
+
+    /// Returns a borrowed reference to the user data associated with this Device.
+    pub fn data(&self) -> Option<<T::Data as ForeignOwnable>::Borrowed<'_>> {
+        let dev_private = self.raw_data();
+
+        if dev_private.is_null() {
+            None
+        } else {
+            // SAFETY: `dev_private` is NULL before the DRM device is registered; after the DRM
+            // device has been registered dev_private is guaranteed to be valid.
+            Some(unsafe { T::Data::borrow(dev_private) })
+        }
+    }
+}
+
+// SAFETY: DRM device objects are always reference counted and the get/put functions
+// satisfy the requirements.
+unsafe impl<T: drm::drv::Driver> AlwaysRefCounted for Device<T> {
+    fn inc_ref(&self) {
+        unsafe { bindings::drm_dev_get(self.as_raw()) };
+    }
+
+    unsafe fn dec_ref(obj: NonNull<Self>) {
+        // SAFETY: The Device type has the same layout as drm_device, so we can just cast.
+        unsafe { bindings::drm_dev_put(obj.as_ptr() as *mut _) };
+    }
+}
+
+// SAFETY: `Device` only holds a pointer to a C device, which is safe to be used from any thread.
+unsafe impl<T: drm::drv::Driver> Send for Device<T> {}
+
+// SAFETY: `Device` only holds a pointer to a C device, references to which are safe to be used
+// from any thread.
+unsafe impl<T: drm::drv::Driver> Sync for Device<T> {}
diff --git a/rust/kernel/drm/drv.rs b/rust/kernel/drm/drv.rs
new file mode 100644
index ..5dd8f3f8df7c
--- /dev/null
+++ b/rust/kernel/drm/drv.rs
@@ -0,0 +1,318 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+
+//! DRM driver core.
+//!
+//! C header: [`include/linux/drm/drm_drv.h`](../../../../include/linux/drm/drm_drv.h)
+
+use crate::{
+    alloc::flags::*,
+    bindings, device, drm,
+    error::code::*,
+    error::{Error, Result},
+    prelude::*,
+    private::Sealed,
+    str::CStr,
+    types::{ARef, ForeignOwnable},
+    ThisModule,
+};
+use core::{
+    marker::{PhantomData, PhantomPinned},
+    pin::Pin,
+};
+use macros::vtable;
+
+/// Driver use the GE
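The `AlwaysRefCounted` contract above (`inc_ref`/`dec_ref` wrapping `drm_dev_get()`/`drm_dev_put()`, with `ARef` owning exactly one reference) can be mocked in userspace to show why the device cannot be freed while any `ARef` exists. All names below are invented stand-ins:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Mock of a refcounted C object; drm_dev_get()/drm_dev_put() analogues.
struct RawDev {
    refs: AtomicUsize,
}

fn dev_get(d: &RawDev) {
    d.refs.fetch_add(1, Ordering::Relaxed);
}

fn dev_put(d: &RawDev) -> usize {
    // Returns the new count; at 0 the real kernel would free the device.
    d.refs.fetch_sub(1, Ordering::Release) - 1
}

// Analogue of ARef<Device>: owns exactly one reference, drops it on Drop.
struct ARef<'a>(&'a RawDev);

impl<'a> ARef<'a> {
    // Like ARef::from_raw(): takes over an already-held reference
    // (e.g. the one drm_dev_alloc() returns) without incrementing.
    fn from_existing(d: &'a RawDev) -> Self {
        ARef(d)
    }

    // Cloning an owning handle must take a new reference.
    fn clone_ref(&self) -> Self {
        dev_get(self.0);
        ARef(self.0)
    }
}

impl Drop for ARef<'_> {
    fn drop(&mut self) {
        dev_put(self.0);
    }
}

fn main() {
    // drm_dev_alloc() leaves the refcount at one.
    let raw = RawDev { refs: AtomicUsize::new(1) };
    {
        let a = ARef::from_existing(&raw);
        let b = a.clone_ref();
        assert_eq!(raw.refs.load(Ordering::Relaxed), 2);
        drop(b); // back to 1; `a` still keeps the device alive
    } // `a` dropped: count reaches 0, i.e. the device may be freed
    assert_eq!(raw.refs.load(Ordering::Relaxed), 0);
}
```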
[RFC PATCH 2/8] rust: Add a Sealed trait
From: Asahi Lina

Some traits exposed by the kernel crate may not be intended to be implemented by downstream modules. Add a `Sealed` trait, so that such traits can be protected with the sealed-trait pattern.

Signed-off-by: Asahi Lina
Signed-off-by: Danilo Krummrich
---
 rust/kernel/lib.rs | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index cb0dd3d76729..6415968ee3b8 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -65,6 +65,11 @@
 #[doc(hidden)]
 pub use build_error::build_error;
 
+pub(crate) mod private {
+    #[allow(unreachable_pub)]
+    pub trait Sealed {}
+}
+
 /// Prefix to appear before log messages printed from within the `kernel` crate.
 const __LOG_PREFIX: &[u8] = b"rust_kernel\0";
 
-- 
2.45.1
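The sealed-trait pattern works by giving a public trait a supertrait that lives in a non-public module: downstream code can use the trait but cannot implement it, because it cannot name the supertrait. A standalone sketch (here `mod kernel` plays the role of the kernel crate; `AllocImpl` and `GemObject` are invented examples):

```rust
mod kernel {
    // In the real kernel crate this module is pub(crate), so code in
    // *other* crates cannot name `Sealed` at all.
    pub(crate) mod private {
        pub trait Sealed {}
    }

    // A public trait downstream code can use but not implement:
    // the Sealed supertrait bound does the sealing.
    pub trait AllocImpl: private::Sealed {
        fn alloc_ops(&self) -> &'static str;
    }

    pub struct GemObject;

    impl private::Sealed for GemObject {}

    impl AllocImpl for GemObject {
        fn alloc_ops(&self) -> &'static str {
            "gem"
        }
    }
}

fn main() {
    use kernel::AllocImpl;
    let obj = kernel::GemObject;
    assert_eq!(obj.alloc_ops(), "gem");
    // From another crate, `impl kernel::AllocImpl for MyType` would fail
    // to compile: `kernel::private::Sealed` is not nameable there.
}
```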
[RFC PATCH 1/8] rust: drm: ioctl: Add DRM ioctl abstraction
From: Asahi Lina DRM drivers need to be able to declare which driver-specific ioctls they support. Add an abstraction implementing the required types and a helper macro to generate the ioctl definition inside the DRM driver. Note that this macro is not usable until further bits of the abstraction are in place (but it will not fail to compile on its own, if not called). Signed-off-by: Asahi Lina Signed-off-by: Danilo Krummrich --- rust/bindings/bindings_helper.h | 1 + rust/kernel/drm/ioctl.rs| 153 rust/kernel/drm/mod.rs | 5 ++ rust/kernel/lib.rs | 2 + rust/uapi/uapi_helper.h | 1 + 5 files changed, 162 insertions(+) create mode 100644 rust/kernel/drm/ioctl.rs create mode 100644 rust/kernel/drm/mod.rs diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index 32221de16e57..14188034285a 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -6,6 +6,7 @@ * Sorted alphabetically. */ +#include #include #include #include diff --git a/rust/kernel/drm/ioctl.rs b/rust/kernel/drm/ioctl.rs new file mode 100644 index ..79d720e9d18e --- /dev/null +++ b/rust/kernel/drm/ioctl.rs @@ -0,0 +1,153 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +#![allow(non_snake_case)] + +//! DRM IOCTL definitions. +//! +//! C header: [`include/linux/drm/drm_ioctl.h`](../../../../include/linux/drm/drm_ioctl.h) + +use crate::ioctl; + +const BASE: u32 = uapi::DRM_IOCTL_BASE as u32; + +/// Construct a DRM ioctl number with no argument. +#[inline(always)] +pub const fn IO(nr: u32) -> u32 { +ioctl::_IO(BASE, nr) +} + +/// Construct a DRM ioctl number with a read-only argument. +#[inline(always)] +pub const fn IOR(nr: u32) -> u32 { +ioctl::_IOR::(BASE, nr) +} + +/// Construct a DRM ioctl number with a write-only argument. +#[inline(always)] +pub const fn IOW(nr: u32) -> u32 { +ioctl::_IOW::(BASE, nr) +} + +/// Construct a DRM ioctl number with a read-write argument. 
+#[inline(always)] +pub const fn IOWR(nr: u32) -> u32 { +ioctl::_IOWR::(BASE, nr) +} + +/// Descriptor type for DRM ioctls. Use the `declare_drm_ioctls!{}` macro to construct them. +pub type DrmIoctlDescriptor = bindings::drm_ioctl_desc; + +/// This is for ioctl which are used for rendering, and require that the file descriptor is either +/// for a render node, or if it’s a legacy/primary node, then it must be authenticated. +pub const AUTH: u32 = bindings::drm_ioctl_flags_DRM_AUTH; + +/// This must be set for any ioctl which can change the modeset or display state. Userspace must +/// call the ioctl through a primary node, while it is the active master. +/// +/// Note that read-only modeset ioctl can also be called by unauthenticated clients, or when a +/// master is not the currently active one. +pub const MASTER: u32 = bindings::drm_ioctl_flags_DRM_MASTER; + +/// Anything that could potentially wreak a master file descriptor needs to have this flag set. +/// +/// Current that’s only for the SETMASTER and DROPMASTER ioctl, which e.g. logind can call to +/// force a non-behaving master (display compositor) into compliance. +/// +/// This is equivalent to callers with the SYSADMIN capability. +pub const ROOT_ONLY: u32 = bindings::drm_ioctl_flags_DRM_ROOT_ONLY; + +/// This is used for all ioctl needed for rendering only, for drivers which support render nodes. +/// This should be all new render drivers, and hence it should be always set for any ioctl with +/// `AUTH` set. Note though that read-only query ioctl might have this set, but have not set +/// DRM_AUTH because they do not require authentication. +pub const RENDER_ALLOW: u32 = bindings::drm_ioctl_flags_DRM_RENDER_ALLOW; + +/// Internal structures used by the [`declare_drm_ioctls!{}`] macro. Do not use directly. +#[doc(hidden)] +pub mod internal { +pub use bindings::drm_device; +pub use bindings::drm_file; +pub use bindings::drm_ioctl_desc; +} + +/// Declare the DRM ioctls for a driver. 
+/// +/// Each entry in the list should have the form: +/// +/// `(ioctl_number, argument_type, flags, user_callback),` +/// +/// `argument_type` is the type name within the `bindings` crate. +/// `user_callback` should have the following prototype: +/// +/// ``` +/// fn foo(device: ::drm::device::Device, +///data: bindings::argument_type, +///file: ::drm::file::File, +/// ) +/// ``` +/// where `Self` is the drm::drv::Driver implementation these ioctls are being declared within. +/// +/// # Examples +/// +/// ``` +/// kernel::declare_drm_ioctls! { +/// (FOO_GET_PARAM, drm_foo_get_param, ioctl::RENDER_ALLOW, my_get_param_handler), +/// } +/// ``` +/// +#[macro_export] +macro_rules! declare_drm_ioctls { +( $(($cmd:ident, $struct:ident, $flags:expr, $func:expr)),* $(,)? ) => { +const IOCTLS: &'static [$crate::drm::ioctl::DrmIoctlDescriptor] = { +use $crate::uapi::*;
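The `IO`/`IOR`/`IOW`/`IOWR` helpers above pack the ioctl number from four fields, following the asm-generic `_IOC` layout: direction (2 bits), argument size (14 bits), type byte (`DRM_IOCTL_BASE`, the character `'d'`), and command number (8 bits). A standalone sketch of that arithmetic; in the kernel helpers the size comes from `size_of::<T>()` via `_IOR::<T>` etc., whereas the sizes here are illustrative constants:

```rust
// Linux ioctl number layout (asm-generic/ioctl.h): dir | size | type | nr
const NRBITS: u32 = 8;
const TYPEBITS: u32 = 8;
const SIZEBITS: u32 = 14;

const NRSHIFT: u32 = 0;
const TYPESHIFT: u32 = NRSHIFT + NRBITS; // 8
const SIZESHIFT: u32 = TYPESHIFT + TYPEBITS; // 16
const DIRSHIFT: u32 = SIZESHIFT + SIZEBITS; // 30

const NONE: u32 = 0;
const WRITE: u32 = 1;
const READ: u32 = 2;

const DRM_IOCTL_BASE: u32 = b'd' as u32; // 0x64

const fn ioc(dir: u32, ty: u32, nr: u32, size: u32) -> u32 {
    (dir << DIRSHIFT) | (size << SIZESHIFT) | (ty << TYPESHIFT) | (nr << NRSHIFT)
}

// Analogues of drm::ioctl::IO / IOWR, with the size passed explicitly.
pub const fn io(nr: u32) -> u32 {
    ioc(NONE, DRM_IOCTL_BASE, nr, 0)
}

pub const fn iowr(nr: u32, size: u32) -> u32 {
    ioc(READ | WRITE, DRM_IOCTL_BASE, nr, size)
}

fn main() {
    // A read-write DRM ioctl, nr 0x00, with a 64-byte argument struct:
    // dir=3 -> 0xc0000000, size=0x40 -> 0x00400000, type 'd' -> 0x6400.
    assert_eq!(iowr(0x00, 0x40), 0xc040_6400);
    assert_eq!(io(0x1f), 0x0000_641f);
}
```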
[RFC PATCH 0/8] [RFC] DRM Rust abstractions and Nova
This patch series implements some basic DRM Rust abstractions and a stub implementation of the Nova GPU driver. Nova is intended to be developed upstream, starting out with just a stub driver to lift some initial required infrastructure upstream. A more detailed explanation can be found in [1]. This patch series is based on the "Device / Driver and PCI Rust abstractions" series [2]. The DRM bits can also be found in [3] and the Nova bits in [4]. [1] https://lore.kernel.org/dri-devel/Zfsj0_tb-0-tNrJy@cassiopeiae/T/#u [2] Reply to this mail titled "Device / Driver and PCI Rust abstractions". [3] https://gitlab.freedesktop.org/drm/misc/kernel/-/tree/topic/rust-drm [4] https://gitlab.freedesktop.org/drm/nova/-/tree/nova-next Asahi Lina (5): rust: drm: ioctl: Add DRM ioctl abstraction rust: Add a Sealed trait rust: drm: Add Device and Driver abstractions rust: drm: file: Add File abstraction rust: drm: gem: Add GEM object abstraction Danilo Krummrich (3): rust: drm: implement `AsRef` for DRM device rust: add firmware abstractions nova: add initial driver stub MAINTAINERS | 10 + drivers/gpu/drm/Kconfig | 2 + drivers/gpu/drm/Makefile| 1 + drivers/gpu/drm/nova/Kconfig| 11 + drivers/gpu/drm/nova/Makefile | 3 + drivers/gpu/drm/nova/driver.rs | 110 + drivers/gpu/drm/nova/file.rs| 71 ++ drivers/gpu/drm/nova/gem.rs | 50 drivers/gpu/drm/nova/gpu.rs | 172 ++ drivers/gpu/drm/nova/nova.rs| 20 ++ include/uapi/drm/nova_drm.h | 101 rust/bindings/bindings_helper.h | 6 + rust/helpers.c | 23 ++ rust/kernel/drm/device.rs | 95 rust/kernel/drm/drv.rs | 321 + rust/kernel/drm/file.rs | 113 + rust/kernel/drm/gem/mod.rs | 406 rust/kernel/drm/ioctl.rs| 153 rust/kernel/drm/mod.rs | 9 + rust/kernel/firmware.rs | 74 ++ rust/kernel/lib.rs | 8 + rust/uapi/uapi_helper.h | 2 + 22 files changed, 1761 insertions(+) create mode 100644 drivers/gpu/drm/nova/Kconfig create mode 100644 drivers/gpu/drm/nova/Makefile create mode 100644 drivers/gpu/drm/nova/driver.rs create mode 100644 
drivers/gpu/drm/nova/file.rs create mode 100644 drivers/gpu/drm/nova/gem.rs create mode 100644 drivers/gpu/drm/nova/gpu.rs create mode 100644 drivers/gpu/drm/nova/nova.rs create mode 100644 include/uapi/drm/nova_drm.h create mode 100644 rust/kernel/drm/device.rs create mode 100644 rust/kernel/drm/drv.rs create mode 100644 rust/kernel/drm/file.rs create mode 100644 rust/kernel/drm/gem/mod.rs create mode 100644 rust/kernel/drm/ioctl.rs create mode 100644 rust/kernel/drm/mod.rs create mode 100644 rust/kernel/firmware.rs base-commit: 4a67fc8e2330cbd57b49d8ea703ffbf7292ef828 -- 2.45.1
Re: [PATCH v4] drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations
Hi Mohamed,

Thank you for fixing this up!

On 5/9/24 22:43, Mohamed Ahmed wrote:

Allows PTE kind and tile mode on BO create with VM_BIND, and adds a
GETPARAM to indicate this change. This is needed to support

It's usually better to use imperative verb form for commit messages.
No need to send a new version though.

modifiers in NVK and ensure correctness when dealing with the nouveau
GL driver. The userspace modifiers implementation this is for can be
found here:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795

Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
Signed-off-by: Mohamed Ahmed

Applied to drm-misc-next-fixes.

Generally, please make sure to use scripts/get_maintainer.pl before
sending patches.

- Danilo

---
 drivers/gpu/drm/nouveau/nouveau_abi16.c |  3 ++
 drivers/gpu/drm/nouveau/nouveau_bo.c    | 44 +++--
 include/uapi/drm/nouveau_drm.h          |  7
 3 files changed, 29 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c b/drivers/gpu/drm/nouveau/nouveau_abi16.c
index 80f74ee0f..47e53e17b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
@@ -272,6 +272,9 @@ nouveau_abi16_ioctl_getparam(ABI16_IOCTL_ARGS)
 		getparam->value = (u64)ttm_resource_manager_usage(vram_mgr);
 		break;
 	}
+	case NOUVEAU_GETPARAM_HAS_VMA_TILEMODE:
+		getparam->value = 1;
+		break;
 	default:
 		NV_PRINTK(dbg, cli, "unknown parameter %lld\n", getparam->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index db8cbf615..186add400 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -241,28 +241,28 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 *size, int *align, u32 domain,
 	}
 
 	nvbo->contig = !(tile_flags & NOUVEAU_GEM_TILE_NONCONTIG);
-	if (!nouveau_cli_uvmm(cli) || internal) {
-		/* for BO noVM allocs, don't assign kinds */
-		if (cli->device.info.family >= NV_DEVICE_INFO_V0_FERMI) {
-			nvbo->kind = (tile_flags & 0x0000ff00) >> 8;
-			if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
-				kfree(nvbo);
-				return ERR_PTR(-EINVAL);
-			}
-
-			nvbo->comp = mmu->kind[nvbo->kind] != nvbo->kind;
-		} else if (cli->device.info.family >= NV_DEVICE_INFO_V0_TESLA) {
-			nvbo->kind = (tile_flags & 0x00007f00) >> 8;
-			nvbo->comp = (tile_flags & 0x00030000) >> 16;
-			if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
-				kfree(nvbo);
-				return ERR_PTR(-EINVAL);
-			}
-		} else {
-			nvbo->zeta = (tile_flags & 0x00000007);
+	if (cli->device.info.family >= NV_DEVICE_INFO_V0_FERMI) {
+		nvbo->kind = (tile_flags & 0x0000ff00) >> 8;
+		if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
+			kfree(nvbo);
+			return ERR_PTR(-EINVAL);
+		}
+
+		nvbo->comp = mmu->kind[nvbo->kind] != nvbo->kind;
+	} else if (cli->device.info.family >= NV_DEVICE_INFO_V0_TESLA) {
+		nvbo->kind = (tile_flags & 0x00007f00) >> 8;
+		nvbo->comp = (tile_flags & 0x00030000) >> 16;
+		if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
+			kfree(nvbo);
+			return ERR_PTR(-EINVAL);
 		}
-		nvbo->mode = tile_mode;
+	} else {
+		nvbo->zeta = (tile_flags & 0x00000007);
+	}
+
+	nvbo->mode = tile_mode;
+
+	if (!nouveau_cli_uvmm(cli) || internal) {
 		/* Determine the desirable target GPU page size for the buffer. */
 		for (i = 0; i < vmm->page_nr; i++) {
 			/* Because we cannot currently allow VMM maps to fail
@@ -304,12 +304,6 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 *size, int *align, u32 domain,
 		}
 		nvbo->page = vmm->page[pi].shift;
 	} else {
-		/* reject other tile flags when in VM mode. */
-		if (tile_mode)
-			return ERR_PTR(-EINVAL);
-		if (tile_flags & ~NOUVEAU_GEM_TILE_NONCONTIG)
-			return ERR_PTR(-EINVAL);
-
 		/* Determine the desirable target GPU page size for the buffer. */
 		for (i = 0; i < vmm->page_nr; i++) {
 			/* Because we cannot currently allow VMM maps to fail
diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h
index cd84227f1..5402f77ee 100644
--- a/include/uapi/drm/nouveau_drm.h
+++
Re: [PATCH] drm/i915: fix missing linux/debugfs.h includes
On 4/30/24 16:53, Jani Nikula wrote: On Tue, 30 Apr 2024, Danilo Krummrich wrote: After dropping linux/debugfs.h include from drm/drm_print.h the following files in i915 miss the linux/debugfs.h include: i915_debugfs.c, i915_debugfs_params.c and i915_gpu_error.c. Add the include to fix the corresponding build errors. Reported-by: kernel test robot Fixes: 33d5ae6cacf4 ("drm/print: drop include debugfs.h and include where needed") Closes: https://lore.kernel.org/oe-kbuild-all/202404260726.nyguvl59-...@intel.com/ Signed-off-by: Danilo Krummrich Thanks, but it's already fixed by commit 7fa043eafdb7 ("drm/i915: fix build with missing debugfs includes") in drm-next. Even better, note that drm-misc-next is still broken though. - Danilo BR, Jani. --- drivers/gpu/drm/i915/i915_debugfs.c| 1 + drivers/gpu/drm/i915/i915_debugfs_params.c | 1 + drivers/gpu/drm/i915/i915_gpu_error.c | 1 + 3 files changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 24c78873b3cf..b552e27cdcd5 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -26,6 +26,7 @@ * */ +#include #include #include #include diff --git a/drivers/gpu/drm/i915/i915_debugfs_params.c b/drivers/gpu/drm/i915/i915_debugfs_params.c index 8bca02025e09..295486b8ceb1 100644 --- a/drivers/gpu/drm/i915/i915_debugfs_params.c +++ b/drivers/gpu/drm/i915/i915_debugfs_params.c @@ -3,6 +3,7 @@ * Copyright © 2019 Intel Corporation */ +#include #include #include "i915_debugfs_params.h" diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 2594eb10c559..625b3c024540 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -28,6 +28,7 @@ */ #include +#include #include #include #include base-commit: 4a9a567ab101e659a4fafb7a691ff6b84531a10a
Re: [PATCH] nouveau: Add missing break statement
On 4/30/24 15:18, Chaitanya Kumar Borah wrote: Add the missing break statement that causes the following build error CC [M] drivers/gpu/drm/i915/display/intel_display_device.o ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c: In function ‘build_registry’: ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1266:3: error: label at end of compound statement 1266 | default: | ^~~ CC [M] drivers/gpu/drm/amd/amdgpu/gfx_v10_0.o HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg.h CC [M] drivers/gpu/drm/amd/amdgpu/imu_v11_0.o make[7]: *** [../scripts/Makefile.build:244: drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.o] Error 1 make[7]: *** Waiting for unfinished jobs Fixes: b58a0bc904ff ("nouveau: add command-line GSP-RM registry support") Closes: https://lore.kernel.org/all/913052ca6c0988db1bab293cfae38529251b4594.ca...@nvidia.com/T/#m3c9acebac754f2e74a85b76c858c093bb1aacaf0 Signed-off-by: Chaitanya Kumar Borah Applied to drm-misc-next, thanks! --- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c index 0b46db5c77b8..63619512e7f6 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c @@ -1264,6 +1264,7 @@ static void build_registry(struct nvkm_gsp *gsp, PACKED_REGISTRY_TABLE *registry str_offset += reg->vlen; break; default: + break; } i++;
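For reference, the "label at end of compound statement" error above comes straight from the C grammar: before C23, a label must be followed by a statement, so a bare `default:` immediately before the closing brace of a switch does not compile. A minimal illustration of the fix pattern (the function and names here are made up, not the r535.c code):

```c
#include <assert.h>

/*
 * In C before C23, a label must precede a statement, so a switch
 * ending in a bare "default:" right before "}" fails to compile
 * with "label at end of compound statement". A trailing break;
 * gives the label a statement to attach to.
 */
static int classify(int type)
{
	int handled = 0;

	switch (type) {
	case 1:
		handled = 1;
		break;
	default:
		break;	/* removing this break; reproduces the build error */
	}

	return handled;
}
```

C23 relaxes this and allows a label before a closing brace, but kernel code must stay buildable with older language modes.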
[PATCH] drm/i915: fix missing linux/debugfs.h includes
After dropping linux/debugfs.h include from drm/drm_print.h the following files in i915 miss the linux/debugfs.h include: i915_debugfs.c, i915_debugfs_params.c and i915_gpu_error.c. Add the include to fix the corresponding build errors. Reported-by: kernel test robot Fixes: 33d5ae6cacf4 ("drm/print: drop include debugfs.h and include where needed") Closes: https://lore.kernel.org/oe-kbuild-all/202404260726.nyguvl59-...@intel.com/ Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/i915/i915_debugfs.c| 1 + drivers/gpu/drm/i915/i915_debugfs_params.c | 1 + drivers/gpu/drm/i915/i915_gpu_error.c | 1 + 3 files changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 24c78873b3cf..b552e27cdcd5 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -26,6 +26,7 @@ * */ +#include #include #include #include diff --git a/drivers/gpu/drm/i915/i915_debugfs_params.c b/drivers/gpu/drm/i915/i915_debugfs_params.c index 8bca02025e09..295486b8ceb1 100644 --- a/drivers/gpu/drm/i915/i915_debugfs_params.c +++ b/drivers/gpu/drm/i915/i915_debugfs_params.c @@ -3,6 +3,7 @@ * Copyright © 2019 Intel Corporation */ +#include #include #include "i915_debugfs_params.h" diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 2594eb10c559..625b3c024540 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -28,6 +28,7 @@ */ #include +#include #include #include #include base-commit: 4a9a567ab101e659a4fafb7a691ff6b84531a10a -- 2.44.0
Re: [PATCH] [v7] nouveau: add command-line GSP-RM registry support
On 4/30/24 15:06, Lucas De Marchi wrote: On Fri, Apr 26, 2024 at 06:08:19PM GMT, Danilo Krummrich wrote: On 4/25/24 18:38, Timur Tabi wrote: On Thu, 2024-04-25 at 15:22 +0200, Danilo Krummrich wrote: + size_t length; + + /* Remove any whitespace from the parameter string */ + length = strip(p, " \t\n"); With that, I see the following warning compiling this patch. warning: variable ‘length’ set but not used [-Wunused-but-set-variable] Did you intend to use the length for anything? No, and I could have sworn I fixed that before sending out v7. I think I originally intended 'length' to determine when I finished parsing the string. Also, looking at the warning made me aware of 'p' potentially being NULL. If you agree, I can fix the warning and add the corresponding NULL check when applying the patch. Yes, that would be great. You can just delete 'length'. The NULL check for 'p' should call clean_registry() before returning -ENOMEM. Applied to drm-misc-next, thanks! since this commit our CI is failing to build with the following error: CC [M] drivers/gpu/drm/i915/display/intel_display_device.o ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c: In function ‘build_registry’: ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1266:3: error: label at end of compound statement 1266 | default: | ^~~ CC [M] drivers/gpu/drm/amd/amdgpu/gfx_v10_0.o HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg.h CC [M] drivers/gpu/drm/amd/amdgpu/imu_v11_0.o make[7]: *** [../scripts/Makefile.build:244: drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.o] Error 1 make[7]: *** Waiting for unfinished jobs you are missing a `break;` after that default. @Timur, do you want to send a quick fix for this yourself? Otherwise, I can send one as well. We should fix this asap. - Danilo Thanks for catching this.
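For context, the `strip(p, " \t\n")` call discussed above removes every whitespace character from the parameter string in place and returns the new length (the return value that ended up unused). A hedged sketch of such a helper — this illustrates the interface only, not nouveau's actual implementation:

```c
#include <assert.h>
#include <string.h>

/*
 * Remove every occurrence of the characters in 'reject' from 's'
 * in place and return the resulting length. Sketch of the strip()
 * interface discussed in the thread, not the nouveau code.
 */
static size_t strip(char *s, const char *reject)
{
	char *dst = s;
	const char *src;

	for (src = s; *src; src++) {
		if (!strchr(reject, *src))
			*dst++ = *src;
	}
	*dst = '\0';

	return (size_t)(dst - s);
}
```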
[PATCH] drm/nouveau: use vmemdup_array_user() in u_memcpya()
Now that we have vmemdup_array_user(), make use of it. Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/nouveau_drv.h | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h index e239c6bf4afa..2038d60958e3 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drv.h +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h @@ -190,11 +190,8 @@ static inline void * u_memcpya(uint64_t user, unsigned int nmemb, unsigned int size) { void __user *userptr = u64_to_user_ptr(user); - size_t bytes; - if (unlikely(check_mul_overflow(nmemb, size, ))) - return ERR_PTR(-EOVERFLOW); - return vmemdup_user(userptr, bytes); + return vmemdup_array_user(userptr, nmemb, size); } #include base-commit: a57e191ebbaa0363dbf352cc37447c2230573e29 prerequisite-patch-id: d18b2d6615c58084a7027f4f1d9b51bd3f9dd83f -- 2.44.0
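The point of `vmemdup_array_user()` is that it folds the `nmemb * size` overflow check into the helper, so callers no longer need the explicit `check_mul_overflow()` dance. A userspace sketch of the same idea — `memdup_array()` is a made-up name, not a kernel API, and `__builtin_mul_overflow()` is what the kernel's `check_mul_overflow()` wraps on GCC/clang:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Duplicate an nmemb-element array of 'size'-byte elements,
 * detecting multiplication overflow before allocating.
 */
static void *memdup_array(const void *src, size_t nmemb, size_t size)
{
	size_t bytes;
	void *dst;

	/* check_mul_overflow() in the kernel wraps this builtin */
	if (__builtin_mul_overflow(nmemb, size, &bytes)) {
		errno = EOVERFLOW;
		return NULL;
	}

	dst = malloc(bytes);
	if (dst)
		memcpy(dst, src, bytes);

	return dst;
}
```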
[PATCH] drm/nouveau: fix duplicate pointer to struct drm_device
nouveau_uvmm_ioctl_vm_init() already has a pointer to struct drm_device, no need to derive another one from struct drm_file. Fixes: 266f7618e761 ("drm/nouveau: separately allocate struct nouveau_uvmm") Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c index 0a0a11dc9ec0..929afd5bc773 100644 --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c @@ -1812,13 +1812,12 @@ static const struct drm_gpuvm_ops gpuvm_ops = { }; int -nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, +nouveau_uvmm_ioctl_vm_init(struct drm_device *drm, void *data, struct drm_file *file_priv) { struct nouveau_uvmm *uvmm; struct nouveau_cli *cli = nouveau_cli(file_priv); - struct drm_device *drm = cli->drm->dev; struct drm_gem_object *r_obj; struct drm_nouveau_vm_init *init = data; u64 kernel_managed_end; base-commit: a57e191ebbaa0363dbf352cc37447c2230573e29 prerequisite-patch-id: 2d453c82161ab6a188caae12e558e8c8d6c1a450 -- 2.44.0
Re: [PATCH] nouveau: fix instmem race condition around ptr stores
On 4/11/24 03:15, Dave Airlie wrote: From: Dave Airlie Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0008 PGD 800114e6e067 P4D 800114e6e067 PUD 109046067 PMD 0 Oops: [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: :ac20c5857838 EFLAGS: 00010202 RAX: RBX: 004d8001 RCX: 0001 RDX: 004d8001 RSI: 06d8 RDI: a07afe332180 RBP: 06d8 R08: ac20c5857ad0 R09: 0010 R10: 0001 R11: a07af27e2de0 R12: 001c R13: ac20c5857ad0 R14: a07a96fe9040 R15: 001c FS: 7fe395eed7c0() GS:a07e2c98() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0008 CR3: 00011febe001 CR4: 003706f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. 
If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. v2: use paired smp_rmb/smp_wmb. Cc: linux-stable Signed-off-by: Dave Airlie Added a "Fixes:" tag and applied to drm-misc-fixes. --- drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c index a7f3fc342d87..dd5b5a17ece0 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c @@ -222,8 +222,11 @@ nv50_instobj_acquire(struct nvkm_memory *memory) void __iomem *map = NULL; /* Already mapped? */ - if (refcount_inc_not_zero(>maps)) + if (refcount_inc_not_zero(>maps)) { + /* read barrier match the wmb on refcount set */ + smp_rmb(); return iobj->map; + } /* Take the lock, and re-check that another thread hasn't * already mapped the object in the meantime. @@ -250,6 +253,8 @@ nv50_instobj_acquire(struct nvkm_memory *memory) iobj->base.memory.ptrs = _instobj_fast; else iobj->base.memory.ptrs = _instobj_slow; + /* barrier to ensure the ptrs are written before refcount is set */ + smp_wmb(); refcount_set(>maps, 1); }
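The `smp_wmb()`/`smp_rmb()` pairing applied above maps onto the classic publish/consume pattern: the writer must make the `ptrs` store visible before the refcount becomes non-zero, and the lockless reader must order its `ptrs` load after observing the non-zero refcount. A userspace analog using C11 release/acquire — names are illustrative, not nouveau's, and this is a sketch of the ordering idea, not the kernel code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Simplified stand-in for nv50_instobj's map state. */
struct mapping {
	void *ptrs;
	atomic_int maps;	/* refcount-like publish flag */
};

/* Writer: publish ptrs, then make the mapping visible. */
static void publish(struct mapping *m, void *ptrs)
{
	m->ptrs = ptrs;
	/* release: the ptrs store happens-before maps becoming 1 */
	atomic_store_explicit(&m->maps, 1, memory_order_release);
}

/* Reader fast path: only touch ptrs after seeing maps != 0. */
static void *try_fast_path(struct mapping *m)
{
	/* acquire: pairs with the release store above */
	if (atomic_load_explicit(&m->maps, memory_order_acquire))
		return m->ptrs;

	return NULL;
}
```

Without the release/acquire (or `smp_wmb()`/`smp_rmb()`) pairing, a second thread can observe `maps != 0` while `ptrs` is still the stale NULL, which is exactly the crash described in the commit message.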
Re: [PATCH v2] drm: nv04: Fix out of bounds access
On 4/11/24 13:08, Mikhail Kobuk wrote:

When the Output Resource (dcb->or) value is assigned in
fabricate_dcb_output(), there may be an out of bounds access to the
dac_users array in case dcb->or is zero, because ffs(dcb->or) is used
as an index there. The 'or' argument of fabricate_dcb_output() must be
interpreted as the number of the bit to set, not a value. Utilize
macros from 'enum nouveau_or' in calls instead of hardcoding.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 2e5702aff395 ("drm/nouveau: fabricate DCB encoder table for iMac G4")
Fixes: 670820c0e6a9 ("drm/nouveau: Workaround incorrect DCB entry on a GeForce3 Ti 200.")
Signed-off-by: Mikhail Kobuk

Applied to drm-misc-fixes, thanks!

---
Changes in v2:
- Instead of checking ffs(dcb->or), adjust function calls to match
  argument semantics
- Link to v1: https://lore.kernel.org/all/20240331064552.6112-1-m.ko...@ispras.ru/

 drivers/gpu/drm/nouveau/nouveau_bios.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bios.c b/drivers/gpu/drm/nouveau/nouveau_bios.c
index 479effcf607e..79cfab53f80e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bios.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bios.c
@@ -23,6 +23,7 @@
  */
 
 #include "nouveau_drv.h"
+#include "nouveau_bios.h"
 #include "nouveau_reg.h"
 #include "dispnv04/hw.h"
 #include "nouveau_encoder.h"
@@ -1677,7 +1678,7 @@ apply_dcb_encoder_quirks(struct drm_device *dev, int idx, u32 *conn, u32 *conf)
 	 */
 	if (nv_match_device(dev, 0x0201, 0x1462, 0x8851)) {
 		if (*conn == 0xf2005014 && *conf == 0x00000000) {
-			fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 1, 1, 1);
+			fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 1, 1, DCB_OUTPUT_B);
 			return false;
 		}
 	}
@@ -1763,26 +1764,26 @@ fabricate_dcb_encoder_table(struct drm_device *dev, struct nvbios *bios)
 #ifdef __powerpc__
 	/* Apple iMac G4 NV17 */
 	if (of_machine_is_compatible("PowerMac4,5")) {
-		fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 0, all_heads, 1);
-		fabricate_dcb_output(dcb, DCB_OUTPUT_ANALOG, 1, all_heads, 2);
+		fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 0, all_heads, DCB_OUTPUT_B);
+		fabricate_dcb_output(dcb, DCB_OUTPUT_ANALOG, 1, all_heads, DCB_OUTPUT_C);
 		return;
 	}
 #endif
 
 	/* Make up some sane defaults */
 	fabricate_dcb_output(dcb, DCB_OUTPUT_ANALOG,
-			     bios->legacy.i2c_indices.crt, 1, 1);
+			     bios->legacy.i2c_indices.crt, 1, DCB_OUTPUT_B);
 
 	if (nv04_tv_identify(dev, bios->legacy.i2c_indices.tv) >= 0)
 		fabricate_dcb_output(dcb, DCB_OUTPUT_TV,
 				     bios->legacy.i2c_indices.tv,
-				     all_heads, 0);
+				     all_heads, DCB_OUTPUT_A);
 
 	else if (bios->tmds.output0_script_ptr ||
 		 bios->tmds.output1_script_ptr)
 		fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS,
 				     bios->legacy.i2c_indices.panel,
-				     all_heads, 1);
+				     all_heads, DCB_OUTPUT_B);
 }
 
 static int
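The root cause in this thread hinges on `ffs()` semantics: `ffs(0)` returns 0, so `dac_users[ffs(dcb->or) - 1]` indexes at -1 when the 'or' argument was hardcoded as 0 instead of a `DCB_OUTPUT_*` bit. A small demonstration, hedged as an illustration (userspace `ffs()` from `<strings.h>` behaves like the kernel's here; `or_to_index()` is a made-up helper):

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

enum nouveau_or {
	DCB_OUTPUT_A = (1 << 0),
	DCB_OUTPUT_B = (1 << 1),
	DCB_OUTPUT_C = (1 << 2),
};

/*
 * The 'or' argument is a bit mask, so ffs(or) - 1 recovers the
 * array index 0/1/2 used for dac_users[]. Passing a plain 0 or 1
 * instead of a DCB_OUTPUT_* bit makes ffs() return 0 or 1, so the
 * "- 1" yields -1 (out of bounds) or simply the wrong slot.
 */
static int or_to_index(int or)
{
	return ffs(or) - 1;
}
```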
Re: [PATCH] drm: nv04: Add check to avoid out of bounds access
On 4/10/24 17:39, Mikhail Kobuk wrote: On 08/04/2024 16:23, Danilo Krummrich wrote: On 4/5/24 22:05, Lyude Paul wrote: On Fri, 2024-04-05 at 17:53 +0200, Danilo Krummrich wrote: On 3/31/24 08:45, Mikhail Kobuk wrote: Output Resource (dcb->or) value is not guaranteed to be non-zero (i.e. in drivers/gpu/drm/nouveau/nouveau_bios.c, in 'fabricate_dcb_encoder_table()' 'dcb->or' is assigned value '0' in call to 'fabricate_dcb_output()'). I don't really know much about the semantics of this code. Looking at fabricate_dcb_output() though I wonder if the intention was to assign BIT(or) to entry->or. @Lyude, can you help here? This code is definitely a bit before my time as well - but I think you're completely correct. Especially considering this bit I found in nouveau_bios.h: Thanks for confirming. @Mikhail, I think we should rather fix this assignment then. Thank you all for a thorough look! - Danilo enum nouveau_or { DCB_OUTPUT_A = (1 << 0), DCB_OUTPUT_B = (1 << 1), DCB_OUTPUT_C = (1 << 2) }; Considering this code bit, and the fact that fabricate_dcb_output() is called in drivers/gpu/drm/nouveau/nouveau_bios.c only, there's option to adjust function calls instead of adding BIT(or), i.e.: fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 0, all_heads, DCB_OUTPUT_B); instead of current: fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 0, all_heads, 1); and etc. Should I make a new patch with adjusted calls or stick with BIT(or)? Please send a new patch adjusting the calls using enum nouveau_or, that seems to be cleaner. - Danilo Otherwise, for parsing the DCB entries, it seems that the bound checks are happening in olddcb_outp_foreach() [1]. [1] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nouveau_bios.c#L1331 Add check to validate 'dcb->or' before it's used. Found by Linux Verification Center (linuxtesting.org) with SVACE. 
Fixes: 2e5702aff395 ("drm/nouveau: fabricate DCB encoder table for iMac G4") Signed-off-by: Mikhail Kobuk --- drivers/gpu/drm/nouveau/dispnv04/dac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv04/dac.c b/drivers/gpu/drm/nouveau/dispnv04/dac.c index d6b8e0cce2ac..0c8d4fc95ff3 100644 --- a/drivers/gpu/drm/nouveau/dispnv04/dac.c +++ b/drivers/gpu/drm/nouveau/dispnv04/dac.c @@ -428,7 +428,7 @@ void nv04_dac_update_dacclk(struct drm_encoder *encoder, bool enable) struct drm_device *dev = encoder->dev; struct dcb_output *dcb = nouveau_encoder(encoder)->dcb; - if (nv_gf4_disp_arch(dev)) { + if (nv_gf4_disp_arch(dev) && ffs(dcb->or)) { uint32_t *dac_users = _display(dev)- dac_users[ffs(dcb->or) - 1]; int dacclk_off = NV_PRAMDAC_DACCLK + nv04_dac_output_offset(encoder); uint32_t dacclk = NVReadRAMDAC(dev, 0, dacclk_off); @@ -453,7 +453,7 @@ bool nv04_dac_in_use(struct drm_encoder *encoder) struct drm_device *dev = encoder->dev; struct dcb_output *dcb = nouveau_encoder(encoder)->dcb; - return nv_gf4_disp_arch(encoder->dev) && + return nv_gf4_disp_arch(encoder->dev) && ffs(dcb->or) && (nv04_display(dev)->dac_users[ffs(dcb->or) - 1] & ~(1 << dcb->index)); }
Re: [PATCH] nouveau: fix instmem race condition around ptr stores
On 4/9/24 10:27, Lucas Stach wrote: Am Dienstag, dem 09.04.2024 um 10:34 +1000 schrieb Dave Airlie: From: Dave Airlie Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0008 PGD 800114e6e067 P4D 800114e6e067 PUD 109046067 PMD 0 Oops: [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: :ac20c5857838 EFLAGS: 00010202 RAX: RBX: 004d8001 RCX: 0001 RDX: 004d8001 RSI: 06d8 RDI: a07afe332180 RBP: 06d8 R08: ac20c5857ad0 R09: 0010 R10: 0001 R11: a07af27e2de0 R12: 001c R13: ac20c5857ad0 R14: a07a96fe9040 R15: 001c FS: 7fe395eed7c0() GS:a07e2c98() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0008 CR3: 00011febe001 CR4: 003706f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. 
If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. Good catch! Cc: linux-stable Signed-off-by: Dave Airlie --- drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c index a7f3fc342d87..cbacc7b11f8c 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c @@ -250,6 +250,9 @@ nv50_instobj_acquire(struct nvkm_memory *memory) iobj->base.memory.ptrs = _instobj_fast; else iobj->base.memory.ptrs = _instobj_slow; + /* barrier to ensure ptrs is written before another thread + does refcount_inc_not_zero successfully. */ + smp_mb(); Doesn't this miss the corresponding smp_rmb after refcount_inc_not_zero? Without it a sufficiently speculating CPU might still hoist the NULL ptr load across the refcount increase. Agree, also think this one could be smp_wmb() only. I also think it's reasonable to keep "the fast path refcount_inc_not_zero that doesn't take the lock", since the scope for this being potentially racy is limited to this function only. Regards, Lucas
Re: [PATCH] drm: nv04: Add check to avoid out of bounds access
On 4/5/24 22:05, Lyude Paul wrote: On Fri, 2024-04-05 at 17:53 +0200, Danilo Krummrich wrote: On 3/31/24 08:45, Mikhail Kobuk wrote: Output Resource (dcb->or) value is not guaranteed to be non-zero (i.e. in drivers/gpu/drm/nouveau/nouveau_bios.c, in 'fabricate_dcb_encoder_table()' 'dcb->or' is assigned value '0' in call to 'fabricate_dcb_output()'). I don't really know much about the semantics of this code. Looking at fabricate_dcb_output() though I wonder if the intention was to assign BIT(or) to entry->or. @Lyude, can you help here? This code is definitely a bit before my time as well - but I think you're completely correct. Especially considering this bit I found in nouveau_bios.h: Thanks for confirming. @Mikhail, I think we should rather fix this assignment then. - Danilo enum nouveau_or { DCB_OUTPUT_A = (1 << 0), DCB_OUTPUT_B = (1 << 1), DCB_OUTPUT_C = (1 << 2) }; Otherwise, for parsing the DCB entries, it seems that the bound checks are happening in olddcb_outp_foreach() [1]. [1] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nouveau_bios.c#L1331 Add check to validate 'dcb->or' before it's used. Found by Linux Verification Center (linuxtesting.org) with SVACE. 
Fixes: 2e5702aff395 ("drm/nouveau: fabricate DCB encoder table for iMac G4") Signed-off-by: Mikhail Kobuk --- drivers/gpu/drm/nouveau/dispnv04/dac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv04/dac.c b/drivers/gpu/drm/nouveau/dispnv04/dac.c index d6b8e0cce2ac..0c8d4fc95ff3 100644 --- a/drivers/gpu/drm/nouveau/dispnv04/dac.c +++ b/drivers/gpu/drm/nouveau/dispnv04/dac.c @@ -428,7 +428,7 @@ void nv04_dac_update_dacclk(struct drm_encoder *encoder, bool enable) struct drm_device *dev = encoder->dev; struct dcb_output *dcb = nouveau_encoder(encoder)->dcb; - if (nv_gf4_disp_arch(dev)) { + if (nv_gf4_disp_arch(dev) && ffs(dcb->or)) { uint32_t *dac_users = _display(dev)- dac_users[ffs(dcb->or) - 1]; int dacclk_off = NV_PRAMDAC_DACCLK + nv04_dac_output_offset(encoder); uint32_t dacclk = NVReadRAMDAC(dev, 0, dacclk_off); @@ -453,7 +453,7 @@ bool nv04_dac_in_use(struct drm_encoder *encoder) struct drm_device *dev = encoder->dev; struct dcb_output *dcb = nouveau_encoder(encoder)->dcb; - return nv_gf4_disp_arch(encoder->dev) && + return nv_gf4_disp_arch(encoder->dev) && ffs(dcb->or) && (nv04_display(dev)->dac_users[ffs(dcb->or) - 1] & ~(1 << dcb->index)); }
Re: [PATCH] nouveau: fix devinit paths to only handle display on GSP.
On 4/8/24 08:42, Dave Airlie wrote:

This reverts "nouveau/gsp: don't check devinit disable on GSP." and
applies a further fix.

It turns out the open gpu driver checks this register, but only for
display. Match that behaviour and only disable displays on Turings.

Fixes: 5d4e8ae6e57b ("nouveau/gsp: don't check devinit disable on GSP.")
Signed-off-by: Dave Airlie

Reviewed-by: Danilo Krummrich

---
 drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c | 12
 drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c  |  1 +
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c
index 7bcbc4895ec2..271bfa038f5b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c
@@ -25,6 +25,7 @@
 
 #include
 #include
+#include
 
 void
 gm107_devinit_disable(struct nvkm_devinit *init)
@@ -33,10 +34,13 @@ gm107_devinit_disable(struct nvkm_devinit *init)
 	u32 r021c00 = nvkm_rd32(device, 0x021c00);
 	u32 r021c04 = nvkm_rd32(device, 0x021c04);
 
-	if (r021c00 & 0x0001)
-		nvkm_subdev_disable(device, NVKM_ENGINE_CE, 0);
-	if (r021c00 & 0x0004)
-		nvkm_subdev_disable(device, NVKM_ENGINE_CE, 2);
+	/* gsp only wants to enable/disable display */
+	if (!nvkm_gsp_rm(device->gsp)) {
+		if (r021c00 & 0x0001)
+			nvkm_subdev_disable(device, NVKM_ENGINE_CE, 0);
+		if (r021c00 & 0x0004)
+			nvkm_subdev_disable(device, NVKM_ENGINE_CE, 2);
+	}
 	if (r021c04 & 0x0001)
 		nvkm_subdev_disable(device, NVKM_ENGINE_DISP, 0);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
index 11b4c9c274a1..666eb93b1742 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
@@ -41,6 +41,7 @@ r535_devinit_new(const struct nvkm_devinit_func *hw,
 
 	rm->dtor = r535_devinit_dtor;
 	rm->post = hw->post;
+	rm->disable = hw->disable;
 
 	ret = nv50_devinit_new_(rm, device, type, inst, pdevinit);
 	if (ret)
Re: [PATCH] [v2] nouveau: fix function cast warning
On 4/4/24 18:02, Arnd Bergmann wrote:

From: Arnd Bergmann

Calling a function through an incompatible pointer type breaks kcfi,
so clang warns about the assignment:

drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c:73:10: error: cast from 'void (*)(const void *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
   73 |         .fini = (void(*)(void *))kfree,

Avoid this with a trivial wrapper.

Fixes: c39f472e9f14 ("drm/nouveau: remove symlinks, move core/ to nvkm/ (no code changes)")
Signed-off-by: Arnd Bergmann

Applied to drm-misc-fixes, thanks!

---
v2: avoid 'return kfree()' expression returning a void
---
 drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c
index 4bf486b57101..cb05f7f48a98 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c
@@ -66,11 +66,16 @@ of_init(struct nvkm_bios *bios, const char *name)
 	return ERR_PTR(-EINVAL);
 }
 
+static void of_fini(void *p)
+{
+	kfree(p);
+}
+
 const struct nvbios_source
 nvbios_of = {
 	.name = "OpenFirmware",
 	.init = of_init,
-	.fini = (void(*)(void *))kfree,
+	.fini = of_fini,
 	.read = of_read,
 	.size = of_size,
 	.rw = false,
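The wrapper pattern applied here generalizes: kCFI (and `-Wcast-function-type-strict`) validates the callee's actual prototype at indirect call sites, so calling `kfree` (which takes `const void *`) through a `void (*)(void *)` pointer is flagged even though it "works" on common ABIs. A userspace sketch of the pattern under the stated assumptions — the struct and names below are illustrative, not nouveau's `nvbios_source`:

```c
#include <assert.h>
#include <stdlib.h>

static int fini_called;

/*
 * A one-line wrapper with exactly the signature the function
 * pointer expects. This is the kCFI-safe alternative to
 * ".fini = (void (*)(void *))free".
 */
static void of_fini(void *p)
{
	fini_called = 1;
	free(p);
}

struct source_ops {
	void (*fini)(void *p);
};

static const struct source_ops ops = {
	.fini = of_fini,	/* not a function-pointer cast */
};
```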
Re: [PATCH] nouveau/gsp: Avoid addressing beyond end of rpc->entries
On 3/30/24 15:12, Kees Cook wrote: Using the end of rpc->entries[] for addressing runs into both compile-time and run-time detection of accessing beyond the end of the array. Use the base pointer instead, since was allocated with the additional bytes for storing the strings. Avoids the following warning in future GCC releases with support for __counted_by: In function 'fortify_memcpy_chk', inlined from 'r535_gsp_rpc_set_registry' at ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1123:3: ../include/linux/fortify-string.h:553:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 553 | __write_overflow_field(p_size_field, size); | ^~ for this code: strings = (char *)>entries[NV_GSP_REG_NUM_ENTRIES]; ... memcpy(strings, r535_registry_entries[i].name, name_len); Signed-off-by: Kees Cook Applied to drm-misc-fixes, thanks! --- Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Cc: David Airlie Cc: Daniel Vetter Cc: Dave Airlie Cc: Ben Skeggs Cc: Timur Tabi Cc: dri-devel@lists.freedesktop.org Cc: nouv...@lists.freedesktop.org --- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c index 9994cbd6f1c4..9858c1438aa7 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c @@ -1112,7 +1112,7 @@ r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp) rpc->numEntries = NV_GSP_REG_NUM_ENTRIES; str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]); - strings = (char *)>entries[NV_GSP_REG_NUM_ENTRIES]; + strings = (char *)rpc + str_offset; for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++) { int name_len = strlen(r535_registry_entries[i].name) + 1;
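The fix above relies on a subtlety of FORTIFY/`__counted_by` bounds tracking: `&rpc->entries[N]` is treated as an address derived from the `entries` array and is therefore "one past the end" of that array, while `(char *)rpc + str_offset` keeps the arithmetic on the allocation's base pointer, which the checkers accept. A userspace sketch under stated assumptions — the struct layout and `build_rpc()` helper are illustrative, not the actual GSP RPC layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NUM_ENTRIES 2

/* Toy analog of the registry RPC: fixed entries, strings after. */
struct rpc {
	int num_entries;
	struct { int name_offset; } entries[];
};

static struct rpc *build_rpc(const char *name)
{
	/* Offset of the string area right after the entries array. */
	size_t str_offset = offsetof(struct rpc, entries[NUM_ENTRIES]);
	struct rpc *rpc = malloc(str_offset + strlen(name) + 1);
	char *strings;

	if (!rpc)
		return NULL;

	rpc->num_entries = NUM_ENTRIES;
	/* instead of: strings = (char *)&rpc->entries[NUM_ENTRIES]; */
	strings = (char *)rpc + str_offset;
	strcpy(strings, name);
	rpc->entries[0].name_offset = (int)str_offset;

	return rpc;
}
```

Both expressions compute the same address; only the provenance the compiler tracks differs, which is why the base-pointer form survives `__counted_by` annotations.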
Re: [PATCH v0 13/14] drm/nouveau: Make I2C terminology more inclusive
Hi Easwar, On 3/29/24 18:00, Easwar Hariharan wrote: I2C v7, SMBus 3.2, and I3C specifications have replaced "master/slave" with more appropriate terms. Inspired by and following on to Wolfram's series to fix drivers/i2c/[1], fix the terminology for users of I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists in the specification. Compile tested, no functionality changes intended [1]: https://lore.kernel.org/all/20240322132619.6389-1-wsa+rene...@sang-engineering.com/ Signed-off-by: Easwar Hariharan --- drivers/gpu/drm/nouveau/dispnv04/dfp.c | 14 +++--- .../gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h | 2 +- drivers/gpu/drm/nouveau/nouveau_bios.c | 4 ++-- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv04/dfp.c b/drivers/gpu/drm/nouveau/dispnv04/dfp.c index d5b129dc623b..65b791006b19 100644 --- a/drivers/gpu/drm/nouveau/dispnv04/dfp.c +++ b/drivers/gpu/drm/nouveau/dispnv04/dfp.c @@ -149,7 +149,7 @@ void nv04_dfp_update_fp_control(struct drm_encoder *encoder, int mode) } } -static struct drm_encoder *get_tmds_slave(struct drm_encoder *encoder) +static struct drm_encoder *get_tmds_client(struct drm_encoder *encoder) { struct drm_device *dev = encoder->dev; struct dcb_output *dcb = nouveau_encoder(encoder)->dcb; @@ -172,7 +172,7 @@ static struct drm_encoder *get_tmds_slave(struct drm_encoder *encoder) struct dcb_output *slave_dcb = nouveau_encoder(slave)->dcb; if (slave_dcb->type == DCB_OUTPUT_TMDS && get_slave_funcs(slave) && - slave_dcb->tmdsconf.slave_addr == dcb->tmdsconf.slave_addr) + slave_dcb->tmdsconf.client_addr == dcb->tmdsconf.client_addr) return slave; While, personally, I think master/slave was well suiting for the device relationship on those busses, I think that if we change it up to conform with the change in specification, we should make sure to update drivers consistently. 
We should make sure to also change the names of the surrounding structures and variable names, otherwise we just make this code harder to read. - Danilo } @@ -471,7 +471,7 @@ static void nv04_dfp_commit(struct drm_encoder *encoder) NVWriteRAMDAC(dev, 0, NV_PRAMDAC_TEST_CONTROL + nv04_dac_output_offset(encoder), 0x0010); /* Init external transmitters */ - slave_encoder = get_tmds_slave(encoder); + slave_encoder = get_tmds_client(encoder); if (slave_encoder) get_slave_funcs(slave_encoder)->mode_set( slave_encoder, &nv_encoder->mode, &nv_encoder->mode); @@ -621,7 +621,7 @@ static void nv04_dfp_destroy(struct drm_encoder *encoder) kfree(nv_encoder); } -static void nv04_tmds_slave_init(struct drm_encoder *encoder) +static void nv04_tmds_client_init(struct drm_encoder *encoder) { struct drm_device *dev = encoder->dev; struct dcb_output *dcb = nouveau_encoder(encoder)->dcb; @@ -632,7 +632,7 @@ static void nv04_tmds_slave_init(struct drm_encoder *encoder) { { .type = "sil164", - .addr = (dcb->tmdsconf.slave_addr == 0x7 ? 0x3a : 0x38), + .addr = (dcb->tmdsconf.client_addr == 0x7 ? 
0x3a : 0x38), .platform_data = &(struct sil164_encoder_params) { SIL164_INPUT_EDGE_RISING } @@ -642,7 +642,7 @@ static void nv04_tmds_slave_init(struct drm_encoder *encoder) }; int type; - if (!nv_gf4_disp_arch(dev) || !bus || get_tmds_slave(encoder)) + if (!nv_gf4_disp_arch(dev) || !bus || get_tmds_client(encoder)) return; type = nvkm_i2c_bus_probe(bus, "TMDS transmitter", info, NULL, NULL); @@ -716,7 +716,7 @@ nv04_dfp_create(struct drm_connector *connector, struct dcb_output *entry) if (entry->type == DCB_OUTPUT_TMDS && entry->location != DCB_LOC_ON_CHIP) - nv04_tmds_slave_init(encoder); + nv04_tmds_client_init(encoder); drm_connector_attach_encoder(connector, encoder); return 0; diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h index 73f9d9947e7e..5da40cf74b1a 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h @@ -50,7 +50,7 @@ struct dcb_output { } dpconf; struct { struct sor_conf sor; - int slave_addr; + int client_addr; } tmdsconf; }; bool i2c_upper_default; diff --git a/drivers/gpu/drm/nouveau/nouveau_bios.c b/drivers/gpu/drm/nouveau/nouveau_bios.c index 479effcf607e..a91a5d3df3ca 100644 ---
Re: [PATCH] drm: nv04: Add check to avoid out of bounds access
On 3/31/24 08:45, Mikhail Kobuk wrote: Output Resource (dcb->or) value is not guaranteed to be non-zero (i.e. in drivers/gpu/drm/nouveau/nouveau_bios.c, in 'fabricate_dcb_encoder_table()' 'dcb->or' is assigned value '0' in call to 'fabricate_dcb_output()'). I don't really know much about the semantics of this code. Looking at fabricate_dcb_output() though I wonder if the intention was to assign BIT(or) to entry->or. @Lyude, can you help here? Otherwise, for parsing the DCB entries, it seems that the bound checks are happening in olddcb_outp_foreach() [1]. [1] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nouveau_bios.c#L1331 Add check to validate 'dcb->or' before it's used. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 2e5702aff395 ("drm/nouveau: fabricate DCB encoder table for iMac G4") Signed-off-by: Mikhail Kobuk --- drivers/gpu/drm/nouveau/dispnv04/dac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv04/dac.c b/drivers/gpu/drm/nouveau/dispnv04/dac.c index d6b8e0cce2ac..0c8d4fc95ff3 100644 --- a/drivers/gpu/drm/nouveau/dispnv04/dac.c +++ b/drivers/gpu/drm/nouveau/dispnv04/dac.c @@ -428,7 +428,7 @@ void nv04_dac_update_dacclk(struct drm_encoder *encoder, bool enable) struct drm_device *dev = encoder->dev; struct dcb_output *dcb = nouveau_encoder(encoder)->dcb; - if (nv_gf4_disp_arch(dev)) { + if (nv_gf4_disp_arch(dev) && ffs(dcb->or)) { uint32_t *dac_users = &nv04_display(dev)->dac_users[ffs(dcb->or) - 1]; int dacclk_off = NV_PRAMDAC_DACCLK + nv04_dac_output_offset(encoder); uint32_t dacclk = NVReadRAMDAC(dev, 0, dacclk_off); @@ -453,7 +453,7 @@ bool nv04_dac_in_use(struct drm_encoder *encoder) struct drm_device *dev = encoder->dev; struct dcb_output *dcb = nouveau_encoder(encoder)->dcb; - return nv_gf4_disp_arch(encoder->dev) && + return nv_gf4_disp_arch(encoder->dev) && ffs(dcb->or) && (nv04_display(dev)->dac_users[ffs(dcb->or) - 1] & ~(1 << dcb->index)); }
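For reference, the whole fix hinges on ffs() semantics: ffs(0) returns 0, so indexing dac_users[ffs(dcb->or) - 1] with a zeroed `or` reads one element before the array. A small sketch of the guarded index computation — the array length is illustrative, not taken from the driver:

```c
#include <strings.h>	/* ffs() */

#define DAC_USERS_LEN 3	/* illustrative; sized for the bits expected in `or` */

/*
 * Translate an output-resource bitmask into an index into dac_users[].
 * ffs() returns the 1-based position of the least significant set bit,
 * or 0 when no bit is set — in which case "ffs(or) - 1" would be -1,
 * an out-of-bounds read. Return -1 to signal "no valid index".
 */
static int dac_user_index(int or_mask)
{
	int bit = ffs(or_mask);

	if (!bit || bit > DAC_USERS_LEN)
		return -1;
	return bit - 1;
}
```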
Re: [PATCH] nouveau/uvmm: fix addr/range calcs for remap operations
On 3/28/24 03:43, Dave Airlie wrote: From: Dave Airlie dEQP-VK.sparse_resources.image_rebind.2d_array.r64i.128_128_8 was causing a remap operation like the below. op_remap: prev: 003fffed 000f a5abd18a op_remap: next: op_remap: unmap: 003fffed 0010 0 op_map: map: 003c 0001 5b1ba33c 000e This was resulting in an unmap operation from 0x3fffed+0xf, 0x10 which was corrupting the pagetables and oopsing the kernel. Good catch, thanks for looking into that. Fixes the prev + unmap range calcs to use start/end and map back to addr/range. I like how using start/end instead fixes the issue and keeps it simple. Signed-off-by: Dave Airlie Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI") Cc: Danilo Krummrich Applied the patch to drm-misc-fixes. --- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c index 9675ef25b16d..87bce1a9d073 100644 --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c @@ -813,15 +813,15 @@ op_remap(struct drm_gpuva_op_remap *r, struct drm_gpuva_op_unmap *u = r->unmap; struct nouveau_uvma *uvma = uvma_from_va(u->va); u64 addr = uvma->va.va.addr; - u64 range = uvma->va.va.range; + u64 end = uvma->va.va.addr + uvma->va.va.range; if (r->prev) addr = r->prev->va.addr + r->prev->va.range; if (r->next) - range = r->next->va.addr - addr; + end = r->next->va.addr; - op_unmap_range(u, addr, range); + op_unmap_range(u, addr, end - addr); } static int
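The corrected op_remap() arithmetic is easier to reason about in start/end form: the region to unmap begins where the kept prev region ends and stops where the kept next region begins. A standalone sketch of that calculation — the struct is reduced to addr/range, and the test values below are chosen for illustration rather than taken from the dEQP trace:

```c
struct va_region {
	unsigned long long addr;
	unsigned long long range;
};

/*
 * Compute the region that must actually be unmapped when a mapping is
 * split by a remap: start after the kept prev region (if any) and end
 * where the kept next region (if any) begins. Working with start/end
 * avoids the bug fixed by the patch, where the old code advanced `addr`
 * past the prev region but kept the original `range`, so addr + range
 * ran past the end of the mapping and corrupted the page tables.
 */
static void remap_unmap_region(const struct va_region *orig,
			       const struct va_region *prev,
			       const struct va_region *next,
			       unsigned long long *unmap_addr,
			       unsigned long long *unmap_range)
{
	unsigned long long addr = orig->addr;
	unsigned long long end = orig->addr + orig->range;

	if (prev)
		addr = prev->addr + prev->range;
	if (next)
		end = next->addr;

	*unmap_addr = addr;
	*unmap_range = end - addr;
}
```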
Re: [PATCH][next] drm/nouveau/gr/gf100: Remove second semicolon
On 3/15/24 10:09, Colin Ian King wrote: There is a statement with two semicolons. Remove the second one, it is redundant. Signed-off-by: Colin Ian King Applied, thanks! --- drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c index 986e8d547c94..060c74a80eb1 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c @@ -420,7 +420,7 @@ gf100_gr_chan_new(struct nvkm_gr *base, struct nvkm_chan *fifoch, return ret; } else { ret = nvkm_memory_map(gr->attrib_cb, 0, chan->vmm, chan->attrib_cb, - , sizeof(args));; + , sizeof(args)); if (ret) return ret; }
Nova and staging Rust abstractions
Hi all, In this mail I briefly want to announce the Nova project and subsequently talk about the first efforts taken in order to upstream required Rust abstractions: We just started to work on Nova, a Rust-based GSP-only driver for Nvidia GPUs. Nova, in the long term, is intended to serve as the successor of Nouveau for GSP-firmware-based GPUs. With Nova we see the chance to significantly decrease the complexity of the driver compared to Nouveau for mainly two reasons. First, Nouveau's historic architecture, especially around nvif/nvkm, is rather complicated and inflexible and requires major rework to solve certain problems (such as the locking hierarchy in VMM / MMU code for VM_BIND, currently being solved with a workaround) and second, with a GSP-only driver there is no need to maintain compatibility with pre-GSP code. Besides that, we also want to take the chance to contribute to the Rust efforts in the kernel and benefit from the memory safety offered by the Rust programming language. Ideally, all that leads to better maintainability and a driver that makes it easy for people to get involved in this project. With the choice of Rust the first problem to deal with is missing C binding abstractions for integral kernel infrastructure (e.g. device / driver abstractions). Since this is a bit of a chicken and egg problem - we need a user to upstream abstractions, but we also need the abstractions to create a driver - we want to develop Nova upstream and start with just a driver stub that only makes use of some basic Rust abstractions. In particular, we want to start with basic device / driver, DRM and PCI abstractions and a Nova stub driver making use of them. Fortunately, a lot of those abstractions already existed in various downstream trees (big thanks to all the authors). We started picking up existing work, figured out the dependencies, fixed a few issues and warnings and applied some structure by organizing it in the following branches. 
Remotes: drm-misc [1], rust (Rust-for-Linux) [2], nova [3] - drm-misc/topic/rust-drm: [4] contains basic DRM abstractions, e.g. DRM device, GEM; depends on rust-device - rust/staging/rust-device: [5] contains all common device / driver abstractions - rust/staging/rust-pci: [6] contains basic PCI abstractions, e.g. PCI device, but also other dependencies such as IoMem and Resources; depends on rust-device - rust/staging/dev: [7] integration branch for all staging Rust abstractions - nova/stub/nova: [8] the nova stub driver; depends on rust-drm and rust-pci All branches and commits are functional, but the code and commit messages are not yet in a state for sending actual patch series. While we are working on improving those patches we would like to ask for feedback and / or improvements already. @Greg, can you please have a first quick look at rust-device [5]? - Danilo [1] https://cgit.freedesktop.org/drm/drm-misc [2] https://github.com/Rust-for-Linux/linux [3] https://gitlab.freedesktop.org/drm/nova [4] https://cgit.freedesktop.org/drm/drm-misc/log/?h=topic/rust-drm [5] https://github.com/Rust-for-Linux/linux/tree/staging/rust-device [6] https://github.com/Rust-for-Linux/linux/tree/staging/rust-pci [7] https://github.com/Rust-for-Linux/linux/tree/staging/dev [8] https://gitlab.freedesktop.org/drm/nova/-/tree/stub/nova
Re: [PATCH] nouveau/gsp: don't check devinit disable on GSP.
On 3/19/24 18:38, Timothy Maden wrote: Hello Can this be merged please ? Applied this to drm-misc-next-fixes a couple of hours ago. - Danilo On 3/14/24 03:45, Dave Airlie wrote: From: Dave Airlie GSP should be handling this and I can see no evidence in opengpu driver that this register should be touched. Fixed acceleration on 2080 Ti GPUs. Fixes: 15740541e8f0 ("drm/nouveau/devinit/tu102-: prepare for GSP-RM") Signed-off-by: Dave Airlie --- drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c index 666eb93b1742..11b4c9c274a1 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c @@ -41,7 +41,6 @@ r535_devinit_new(const struct nvkm_devinit_func *hw, rm->dtor = r535_devinit_dtor; rm->post = hw->post; - rm->disable = hw->disable; ret = nv50_devinit_new_(rm, device, type, inst, pdevinit); if (ret)
Re: [RESEND v3 1/6] drm/mst: read sideband messaging cap
On 3/19/24 10:20, Jani Nikula wrote: On Tue, 19 Mar 2024, Jani Nikula wrote: Amend drm_dp_read_mst_cap() to return an enum, indicating "SST", "SST with sideband messaging", or "MST". Modify all call sites to take the new return value into account. drm-misc and nouveau maintainers, ack for merging this via drm-intel, please? Sure, please go ahead. Thanks, Danilo BR, Jani. v2: - Rename enumerators (Ville) Cc: Arun R Murthy Cc: Ville Syrjälä Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Reviewed-by: Ville Syrjälä Signed-off-by: Jani Nikula --- drivers/gpu/drm/display/drm_dp_mst_topology.c | 20 ++-- drivers/gpu/drm/i915/display/intel_dp.c | 4 ++-- drivers/gpu/drm/nouveau/nouveau_dp.c | 2 +- include/drm/display/drm_dp_mst_helper.h | 23 ++- 4 files changed, 38 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c index 03d528209426..c193be3577f7 100644 --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c @@ -3608,24 +3608,30 @@ fixed20_12 drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr, EXPORT_SYMBOL(drm_dp_get_vc_payload_bw); /** - * drm_dp_read_mst_cap() - check whether or not a sink supports MST + * drm_dp_read_mst_cap() - Read the sink's MST mode capability * @aux: The DP AUX channel to use * @dpcd: A cached copy of the DPCD capabilities for this sink * - * Returns: %True if the sink supports MST, %false otherwise + * Returns: enum drm_dp_mst_mode to indicate MST mode capability */ -bool drm_dp_read_mst_cap(struct drm_dp_aux *aux, -const u8 dpcd[DP_RECEIVER_CAP_SIZE]) +enum drm_dp_mst_mode drm_dp_read_mst_cap(struct drm_dp_aux *aux, +const u8 dpcd[DP_RECEIVER_CAP_SIZE]) { u8 mstm_cap; if (dpcd[DP_DPCD_REV] < DP_DPCD_REV_12) - return false; + return DRM_DP_SST; if (drm_dp_dpcd_readb(aux, DP_MSTM_CAP, _cap) != 1) - return false; + return DRM_DP_SST; + + if (mstm_cap & DP_MST_CAP) + return DRM_DP_MST; + + if 
(mstm_cap & DP_SINGLE_STREAM_SIDEBAND_MSG) + return DRM_DP_SST_SIDEBAND_MSG; - return mstm_cap & DP_MST_CAP; + return DRM_DP_SST; } EXPORT_SYMBOL(drm_dp_read_mst_cap); diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index af7ca00e9bc0..91c42949ac7e 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -4046,7 +4046,7 @@ intel_dp_can_mst(struct intel_dp *intel_dp) return i915->display.params.enable_dp_mst && intel_dp_mst_source_support(intel_dp) && - drm_dp_read_mst_cap(_dp->aux, intel_dp->dpcd); + drm_dp_read_mst_cap(_dp->aux, intel_dp->dpcd) == DRM_DP_MST; } static void @@ -4055,7 +4055,7 @@ intel_dp_configure_mst(struct intel_dp *intel_dp) struct drm_i915_private *i915 = dp_to_i915(intel_dp); struct intel_encoder *encoder = _to_dig_port(intel_dp)->base; - bool sink_can_mst = drm_dp_read_mst_cap(_dp->aux, intel_dp->dpcd); + bool sink_can_mst = drm_dp_read_mst_cap(_dp->aux, intel_dp->dpcd) == DRM_DP_MST; drm_dbg_kms(>drm, "[ENCODER:%d:%s] MST support: port: %s, sink: %s, modparam: %s\n", diff --git a/drivers/gpu/drm/nouveau/nouveau_dp.c b/drivers/gpu/drm/nouveau/nouveau_dp.c index 7de7707ec6a8..fb06ee17d9e5 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dp.c +++ b/drivers/gpu/drm/nouveau/nouveau_dp.c @@ -181,7 +181,7 @@ nouveau_dp_probe_dpcd(struct nouveau_connector *nv_connector, if (nouveau_mst) { mstm = outp->dp.mstm; if (mstm) - mstm->can_mst = drm_dp_read_mst_cap(aux, dpcd); + mstm->can_mst = drm_dp_read_mst_cap(aux, dpcd) == DRM_DP_MST; } if (nouveau_dp_has_sink_count(connector, outp)) { diff --git a/include/drm/display/drm_dp_mst_helper.h b/include/drm/display/drm_dp_mst_helper.h index 3ae88a383a41..cbcb49cb6a46 100644 --- a/include/drm/display/drm_dp_mst_helper.h +++ b/include/drm/display/drm_dp_mst_helper.h @@ -817,7 +817,28 @@ int drm_dp_mst_topology_mgr_init(struct drm_dp_mst_topology_mgr *mgr, void drm_dp_mst_topology_mgr_destroy(struct drm_dp_mst_topology_mgr 
*mgr); -bool drm_dp_read_mst_cap(struct drm_dp_aux *aux, const u8 dpcd[DP_RECEIVER_CAP_SIZE]); +/** + * enum drm_dp_mst_mode - sink's MST mode capability + */ +enum drm_dp_mst_mode { + /** +* @DRM_DP_SST: The sink does not support MST nor single stream sideband +* messaging. +*/ + DRM_DP_SST,
Re: [RESEND v3 1/6] drm/mst: read sideband messaging cap
On 3/19/24 10:12, Jani Nikula wrote: Amend drm_dp_read_mst_cap() to return an enum, indicating "SST", "SST with sideband messaging", or "MST". Modify all call sites to take the new return value into account. v2: - Rename enumerators (Ville) Cc: Arun R Murthy Cc: Ville Syrjälä Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Reviewed-by: Ville Syrjälä Signed-off-by: Jani Nikula Acked-by: Danilo Krummrich --- drivers/gpu/drm/display/drm_dp_mst_topology.c | 20 ++-- drivers/gpu/drm/i915/display/intel_dp.c | 4 ++-- drivers/gpu/drm/nouveau/nouveau_dp.c | 2 +- include/drm/display/drm_dp_mst_helper.h | 23 ++- 4 files changed, 38 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c index 03d528209426..c193be3577f7 100644 --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c @@ -3608,24 +3608,30 @@ fixed20_12 drm_dp_get_vc_payload_bw(const struct drm_dp_mst_topology_mgr *mgr, EXPORT_SYMBOL(drm_dp_get_vc_payload_bw); /** - * drm_dp_read_mst_cap() - check whether or not a sink supports MST + * drm_dp_read_mst_cap() - Read the sink's MST mode capability * @aux: The DP AUX channel to use * @dpcd: A cached copy of the DPCD capabilities for this sink * - * Returns: %True if the sink supports MST, %false otherwise + * Returns: enum drm_dp_mst_mode to indicate MST mode capability */ -bool drm_dp_read_mst_cap(struct drm_dp_aux *aux, -const u8 dpcd[DP_RECEIVER_CAP_SIZE]) +enum drm_dp_mst_mode drm_dp_read_mst_cap(struct drm_dp_aux *aux, +const u8 dpcd[DP_RECEIVER_CAP_SIZE]) { u8 mstm_cap; if (dpcd[DP_DPCD_REV] < DP_DPCD_REV_12) - return false; + return DRM_DP_SST; if (drm_dp_dpcd_readb(aux, DP_MSTM_CAP, _cap) != 1) - return false; + return DRM_DP_SST; + + if (mstm_cap & DP_MST_CAP) + return DRM_DP_MST; + + if (mstm_cap & DP_SINGLE_STREAM_SIDEBAND_MSG) + return DRM_DP_SST_SIDEBAND_MSG; - return mstm_cap & DP_MST_CAP; + return DRM_DP_SST; } 
EXPORT_SYMBOL(drm_dp_read_mst_cap); diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index af7ca00e9bc0..91c42949ac7e 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -4046,7 +4046,7 @@ intel_dp_can_mst(struct intel_dp *intel_dp) return i915->display.params.enable_dp_mst && intel_dp_mst_source_support(intel_dp) && - drm_dp_read_mst_cap(_dp->aux, intel_dp->dpcd); + drm_dp_read_mst_cap(_dp->aux, intel_dp->dpcd) == DRM_DP_MST; } static void @@ -4055,7 +4055,7 @@ intel_dp_configure_mst(struct intel_dp *intel_dp) struct drm_i915_private *i915 = dp_to_i915(intel_dp); struct intel_encoder *encoder = _to_dig_port(intel_dp)->base; - bool sink_can_mst = drm_dp_read_mst_cap(_dp->aux, intel_dp->dpcd); + bool sink_can_mst = drm_dp_read_mst_cap(_dp->aux, intel_dp->dpcd) == DRM_DP_MST; drm_dbg_kms(>drm, "[ENCODER:%d:%s] MST support: port: %s, sink: %s, modparam: %s\n", diff --git a/drivers/gpu/drm/nouveau/nouveau_dp.c b/drivers/gpu/drm/nouveau/nouveau_dp.c index 7de7707ec6a8..fb06ee17d9e5 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dp.c +++ b/drivers/gpu/drm/nouveau/nouveau_dp.c @@ -181,7 +181,7 @@ nouveau_dp_probe_dpcd(struct nouveau_connector *nv_connector, if (nouveau_mst) { mstm = outp->dp.mstm; if (mstm) - mstm->can_mst = drm_dp_read_mst_cap(aux, dpcd); + mstm->can_mst = drm_dp_read_mst_cap(aux, dpcd) == DRM_DP_MST; } if (nouveau_dp_has_sink_count(connector, outp)) { diff --git a/include/drm/display/drm_dp_mst_helper.h b/include/drm/display/drm_dp_mst_helper.h index 3ae88a383a41..cbcb49cb6a46 100644 --- a/include/drm/display/drm_dp_mst_helper.h +++ b/include/drm/display/drm_dp_mst_helper.h @@ -817,7 +817,28 @@ int drm_dp_mst_topology_mgr_init(struct drm_dp_mst_topology_mgr *mgr, void drm_dp_mst_topology_mgr_destroy(struct drm_dp_mst_topology_mgr *mgr); -bool drm_dp_read_mst_cap(struct drm_dp_aux *aux, const u8 dpcd[DP_RECEIVER_CAP_SIZE]); +/** + * enum drm_dp_mst_mode - sink's 
MST mode capability + */ +enum drm_dp_mst_mode { + /** +* @DRM_DP_SST: The sink does not support MST nor single stream sideband +* messaging. +*/ + DRM_DP_SST, + /** +* @DRM_DP_MST: Sink supports MST, more than one stream and single +* stream sideband messaging. +*/ + DRM_DP
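The decision tree the new drm_dp_read_mst_cap() encodes can be isolated from the AUX plumbing: given the DPCD revision check and the MSTM_CAP byte, classify the sink. A hedged standalone version follows; the two capability bit values are assumed here as bit 0 and bit 1 for illustration (the real defines live in the kernel's drm_dp.h):

```c
enum drm_dp_mst_mode {
	DRM_DP_SST,			/* no MST, no sideband messaging */
	DRM_DP_SST_SIDEBAND_MSG,	/* single stream, but sideband messaging supported */
	DRM_DP_MST,			/* full multi-stream transport */
};

/* Bit layout of the DP_MSTM_CAP register assumed for this sketch. */
#define DP_MST_CAP			(1 << 0)
#define DP_SINGLE_STREAM_SIDEBAND_MSG	(1 << 1)

/*
 * Mirror of the classification order in the patched drm_dp_read_mst_cap():
 * sinks below DPCD 1.2 are plain SST, MST takes precedence over
 * sideband-only, and anything else falls back to SST.
 */
static enum drm_dp_mst_mode classify_mst_cap(int dpcd_rev_at_least_12,
					     unsigned char mstm_cap)
{
	if (!dpcd_rev_at_least_12)
		return DRM_DP_SST;
	if (mstm_cap & DP_MST_CAP)
		return DRM_DP_MST;
	if (mstm_cap & DP_SINGLE_STREAM_SIDEBAND_MSG)
		return DRM_DP_SST_SIDEBAND_MSG;
	return DRM_DP_SST;
}
```

Call sites that only care about full MST then compare the result against DRM_DP_MST, exactly as the i915 and nouveau hunks in the patch do.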
Re: [PATCH v3] nouveau/dmem: handle kcalloc() allocation failure
On 3/6/24 06:01, Duoming Zhou wrote: The kcalloc() in nouveau_dmem_evict_chunk() will return null if the physical memory has run out. As a result, if we dereference src_pfns, dst_pfns or dma_addrs, the null pointer dereference bugs will happen. Moreover, the GPU is going away. If the kcalloc() fails, we could not evict all pages mapping a chunk. So this patch adds a __GFP_NOFAIL flag in kcalloc(). Finally, as there is no need to have physically contiguous memory, this patch switches kcalloc() to kvcalloc() in order to avoid failing allocations. Fixes: 249881232e14 ("nouveau/dmem: evict device private memory during release") Suggested-by: Danilo Krummrich Signed-off-by: Duoming Zhou Applied to drm-misc-fixes, thanks! --- Changes in v3: - Switch kcalloc() to kvcalloc(). drivers/gpu/drm/nouveau/nouveau_dmem.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 12feecf71e7..6fb65b01d77 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -378,9 +378,9 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk) dma_addr_t *dma_addrs; struct nouveau_fence *fence; - src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL); - dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL); - dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL); + src_pfns = kvcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL); + dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL); + dma_addrs = kvcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL | __GFP_NOFAIL); migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT, npages); @@ -406,11 +406,11 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk) migrate_device_pages(src_pfns, dst_pfns, npages); nouveau_dmem_fence_done(); migrate_device_finalize(src_pfns, dst_pfns, npages); - kfree(src_pfns); - kfree(dst_pfns); + kvfree(src_pfns); + 
kvfree(dst_pfns); for (i = 0; i < npages; i++) dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL); - kfree(dma_addrs); + kvfree(dma_addrs); } void
Re: [PATCH v2] nouveau/dmem: handle kcalloc() allocation failure
Hi Duoming, thanks for sending a V2. On 3/5/24 15:39, Duoming Zhou wrote: The kcalloc() in nouveau_dmem_evict_chunk() will return null if the physical memory has run out. As a result, if we dereference src_pfns, dst_pfns or dma_addrs, the null pointer dereference bugs will happen. Moreover, the GPU is going away. If the kcalloc() fails, we could not evict all pages mapping a chunk. So this patch adds a __GFP_NOFAIL flag in kcalloc(). Fixes: 249881232e14 ("nouveau/dmem: evict device private memory during release") Signed-off-by: Duoming Zhou --- Changes in v2: - Allocate with __GFP_NOFAIL. drivers/gpu/drm/nouveau/nouveau_dmem.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 12feecf71e7..f5ae9724ee2 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -378,9 +378,9 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk) dma_addr_t *dma_addrs; struct nouveau_fence *fence; - src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL); - dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL); - dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL); + src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL); + dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL); + dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL | __GFP_NOFAIL); I think we should also switch to kvcalloc(), AFAICS we don't need physically contiguous memory. Sorry I did not mention that in V1 already. - Danilo migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT, npages);
Re: [PATCH] nouveau/dmem: handle kcalloc() allocation failure
Hi Duoming, On 3/3/24 08:53, Duoming Zhou wrote: The kcalloc() in nouveau_dmem_evict_chunk() will return null if the physical memory has run out. As a result, if we dereference src_pfns, dst_pfns or dma_addrs, the null pointer dereference bugs will happen. This patch uses stack variables to replace the kcalloc(). Fixes: 249881232e14 ("nouveau/dmem: evict device private memory during release") Signed-off-by: Duoming Zhou --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 12feecf71e7..9a578262c6d 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -374,13 +374,13 @@ static void nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk) { unsigned long i, npages = range_len(&chunk->pagemap.range) >> PAGE_SHIFT; - unsigned long *src_pfns, *dst_pfns; - dma_addr_t *dma_addrs; + unsigned long src_pfns[npages], dst_pfns[npages]; + dma_addr_t dma_addrs[npages]; Please don't use variable length arrays, this can potentially blow up the stack. As a fix I think we should allocate with __GFP_NOFAIL instead. - Danilo struct nouveau_fence *fence; - src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL); - dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL); - dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL); + memset(src_pfns, 0, npages); + memset(dst_pfns, 0, npages); + memset(dma_addrs, 0, npages); migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT, npages); @@ -406,11 +406,8 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk) migrate_device_pages(src_pfns, dst_pfns, npages); nouveau_dmem_fence_done(); migrate_device_finalize(src_pfns, dst_pfns, npages); - kfree(src_pfns); - kfree(dst_pfns); for (i = 0; i < npages; i++) dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL); - kfree(dma_addrs); } void
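Two problems with the v1 approach are worth spelling out: `npages`-sized VLAs can easily exceed the fixed kernel stack (16 KiB on x86-64), and `memset(..., 0, npages)` zeroes npages *bytes*, not npages array elements. A quick userspace calculation of the stack cost, assuming a 64-bit build with dma_addr_t taken as 8 bytes:

```c
#include <stddef.h>

/*
 * Bytes the three on-stack arrays would need for `npages` entries:
 * two unsigned long arrays (src_pfns, dst_pfns) plus one dma_addr_t
 * array (assumed 8 bytes per element here).
 */
static size_t evict_stack_bytes(size_t npages)
{
	return npages * (2 * sizeof(unsigned long) + 8);
}
```

Even a few thousand pages per chunk blows well past a 16 KiB stack, which is why the thread converges on heap allocation (kvcalloc with __GFP_NOFAIL in v3) instead.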
Re: [PATCH 2/2] drm/nouveau: move more missing UAPI bits
On 3/4/24 19:31, Karol Herbst wrote: Those are already de-facto UAPI, so let's just move it into the uapi header. Signed-off-by: Karol Herbst Reviewed-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/nouveau_abi16.c | 20 +++- drivers/gpu/drm/nouveau/nouveau_abi16.h | 12 include/uapi/drm/nouveau_drm.h | 22 ++ 3 files changed, 37 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c b/drivers/gpu/drm/nouveau/nouveau_abi16.c index cd14f993bdd1b..92f9127b284ac 100644 --- a/drivers/gpu/drm/nouveau/nouveau_abi16.c +++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c @@ -312,11 +312,21 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS) if (device->info.family >= NV_DEVICE_INFO_V0_KEPLER) { if (init->fb_ctxdma_handle == ~0) { switch (init->tt_ctxdma_handle) { - case 0x01: engine = NV_DEVICE_HOST_RUNLIST_ENGINES_GR ; break; - case 0x02: engine = NV_DEVICE_HOST_RUNLIST_ENGINES_MSPDEC; break; - case 0x04: engine = NV_DEVICE_HOST_RUNLIST_ENGINES_MSPPP ; break; - case 0x08: engine = NV_DEVICE_HOST_RUNLIST_ENGINES_MSVLD ; break; - case 0x30: engine = NV_DEVICE_HOST_RUNLIST_ENGINES_CE ; break; + case NOUVEAU_FIFO_ENGINE_GR: + engine = NV_DEVICE_HOST_RUNLIST_ENGINES_GR; + break; + case NOUVEAU_FIFO_ENGINE_VP: + engine = NV_DEVICE_HOST_RUNLIST_ENGINES_MSPDEC; + break; + case NOUVEAU_FIFO_ENGINE_PPP: + engine = NV_DEVICE_HOST_RUNLIST_ENGINES_MSPPP; + break; + case NOUVEAU_FIFO_ENGINE_BSP: + engine = NV_DEVICE_HOST_RUNLIST_ENGINES_MSVLD; + break; + case NOUVEAU_FIFO_ENGINE_CE: + engine = NV_DEVICE_HOST_RUNLIST_ENGINES_CE; + break; default: return nouveau_abi16_put(abi16, -ENOSYS); } diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.h b/drivers/gpu/drm/nouveau/nouveau_abi16.h index 11c8c4a80079b..661b901d8ecc9 100644 --- a/drivers/gpu/drm/nouveau/nouveau_abi16.h +++ b/drivers/gpu/drm/nouveau/nouveau_abi16.h @@ -50,18 +50,6 @@ struct drm_nouveau_grobj_alloc { int class; }; -struct drm_nouveau_notifierobj_alloc { - uint32_t channel; - uint32_t handle; - 
uint32_t size; - uint32_t offset; -}; - -struct drm_nouveau_gpuobj_free { - int channel; - uint32_t handle; -}; - struct drm_nouveau_setparam { uint64_t param; uint64_t value; diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h index 77d7ff0d5b110..5404d4cfff4c2 100644 --- a/include/uapi/drm/nouveau_drm.h +++ b/include/uapi/drm/nouveau_drm.h @@ -73,6 +73,16 @@ struct drm_nouveau_getparam { __u64 value; }; +/* + * Those are used to support selecting the main engine used on Kepler. + * This goes into drm_nouveau_channel_alloc::tt_ctxdma_handle + */ +#define NOUVEAU_FIFO_ENGINE_GR 0x01 +#define NOUVEAU_FIFO_ENGINE_VP 0x02 +#define NOUVEAU_FIFO_ENGINE_PPP 0x04 +#define NOUVEAU_FIFO_ENGINE_BSP 0x08 +#define NOUVEAU_FIFO_ENGINE_CE 0x30 + struct drm_nouveau_channel_alloc { __u32 fb_ctxdma_handle; __u32 tt_ctxdma_handle; @@ -95,6 +105,18 @@ struct drm_nouveau_channel_free { __s32 channel; }; +struct drm_nouveau_notifierobj_alloc { + __u32 channel; + __u32 handle; + __u32 size; + __u32 offset; +}; + +struct drm_nouveau_gpuobj_free { + __s32 channel; + __u32 handle; +}; + #define NOUVEAU_GEM_DOMAIN_CPU (1 << 0) #define NOUVEAU_GEM_DOMAIN_VRAM (1 << 1) #define NOUVEAU_GEM_DOMAIN_GART (1 << 2)
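With the magic numbers promoted to named uAPI defines, the ioctl's switch reads as a plain lookup from the Kepler tt_ctxdma_handle selector to a host runlist engine. A standalone sketch of that mapping; the short engine-name strings stand in for the NV_DEVICE_HOST_RUNLIST_ENGINES_* constants:

```c
#include <stddef.h>

/* uAPI selectors from include/uapi/drm/nouveau_drm.h (as moved by this patch). */
#define NOUVEAU_FIFO_ENGINE_GR	0x01
#define NOUVEAU_FIFO_ENGINE_VP	0x02
#define NOUVEAU_FIFO_ENGINE_PPP	0x04
#define NOUVEAU_FIFO_ENGINE_BSP	0x08
#define NOUVEAU_FIFO_ENGINE_CE	0x30

/*
 * Map the selector to its runlist engine name; NULL mirrors the
 * default case of nouveau_abi16_ioctl_channel_alloc(), which
 * returns -ENOSYS for unknown selectors.
 */
static const char *kepler_engine_name(unsigned int tt_ctxdma_handle)
{
	switch (tt_ctxdma_handle) {
	case NOUVEAU_FIFO_ENGINE_GR:	return "GR";
	case NOUVEAU_FIFO_ENGINE_VP:	return "MSPDEC";
	case NOUVEAU_FIFO_ENGINE_PPP:	return "MSPPP";
	case NOUVEAU_FIFO_ENGINE_BSP:	return "MSVLD";
	case NOUVEAU_FIFO_ENGINE_CE:	return "CE";
	default:			return NULL;
	}
}
```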
Re: [PATCH 1/2] drm/nouveau: fix stale locked mutex in nouveau_gem_ioctl_pushbuf
On 3/4/24 19:31, Karol Herbst wrote: If VM_BIND is enabled on the client the legacy submission ioctl can't be used, however if a client tries to do so regardless it will return an error. In this case the clients mutex remained unlocked leading to a deadlock inside nouveau_drm_postclose or any other nouveau ioctl call. Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI") Cc: Danilo Krummrich Signed-off-by: Karol Herbst Should add a stable tag for that one, otherwise: Reviewed-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index 49c2bcbef1299..5a887d67dc0e8 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -764,7 +764,7 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data, return -ENOMEM; if (unlikely(nouveau_cli_uvmm(cli))) - return -ENOSYS; + return nouveau_abi16_put(abi16, -ENOSYS); list_for_each_entry(temp, >channels, head) { if (temp->chan->chid == req->channel) {
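The bug class here is a classic: an early `return` on an error path that skips the unlock performed by the normal exit helper. nouveau_abi16_get() takes the client mutex and nouveau_abi16_put() releases it while passing the return value through, so every exit must funnel through the put helper. A userspace sketch of the pattern with a pthread mutex — the function names mimic the nouveau helpers and the error value is illustrative:

```c
#include <pthread.h>

static pthread_mutex_t cli_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Release the lock taken on entry and pass the result through,
 * in the style of nouveau_abi16_put(). */
static int abi16_put(int ret)
{
	pthread_mutex_unlock(&cli_mutex);
	return ret;
}

static int pushbuf_ioctl(int vm_bind_enabled)
{
	pthread_mutex_lock(&cli_mutex);	/* the abi16_get() side effect */

	if (vm_bind_enabled)
		return abi16_put(-38);	/* -ENOSYS; a bare `return -38` would leak the lock */

	/* ... legacy submission path ... */
	return abi16_put(0);
}
```

The one-line fix in the patch is exactly this: replacing `return -ENOSYS;` with `return nouveau_abi16_put(abi16, -ENOSYS);` so the mutex is dropped before returning.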
Re: [PATCH stable v6.7] drm/nouveau: don't fini scheduler before entity flush
On 3/4/24 18:55, Greg KH wrote: On Mon, Mar 04, 2024 at 06:01:46PM +0100, Danilo Krummrich wrote: Cc: # v6.7 only You say 6.7 only, but this commit is in 6.6, so why not 6.6 also? Good catch, I was sure I originally merged this for 6.7. This fix should indeed be applied to 6.6 as well. Should have double checked that, my bad. - Danilo thanks, greg k-h
[PATCH stable v6.7] drm/nouveau: don't fini scheduler before entity flush
_sched] [ 305.226418] nouveau_drm_device_fini+0x2a0/0x490 [nouveau] [ 305.226624] nouveau_drm_remove+0x18e/0x220 [nouveau] [ 305.226832] pci_device_remove+0xa3/0x1d0 [ 305.226836] device_release_driver_internal+0x379/0x540 [ 305.226840] driver_detach+0xc5/0x180 [ 305.226843] bus_remove_driver+0x11e/0x2a0 [ 305.226847] pci_unregister_driver+0x2a/0x250 [ 305.226850] nouveau_drm_exit+0x1f/0x970 [nouveau] [ 305.227056] __do_sys_delete_module+0x350/0x580 [ 305.227060] do_syscall_64+0x61/0xe0 [ 305.227064] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 305.227070] The buggy address belongs to the object at 8881440a8f00 which belongs to the cache kmalloc-128 of size 128 [ 305.227073] The buggy address is located 72 bytes inside of freed 128-byte region [8881440a8f00, 8881440a8f80) [ 305.227078] The buggy address belongs to the physical page: [ 305.227081] page:627efa0a refcount:1 mapcount:0 mapping: index:0x0 pfn:0x1440a8 [ 305.227085] head:627efa0a order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 305.227088] flags: 0x17c840(slab|head|node=0|zone=2|lastcpupid=0x1f) [ 305.227093] page_type: 0x() [ 305.227097] raw: 0017c840 8881000428c0 ea0005b33500 dead0002 [ 305.227100] raw: 00200020 0001 [ 305.227102] page dumped because: kasan: bad access detected [ 305.227106] Memory state around the buggy address: [ 305.227109] 8881440a8e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 305.227112] 8881440a8e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 305.227114] >8881440a8f00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 305.227117] ^ [ 305.227120] 8881440a8f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 305.227122] 8881440a9000: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc [ 305.227125] == Cc: # v6.7 only Reported-by: Karol Herbst Closes: https://gist.githubusercontent.com/karolherbst/a20eb0f937a06ed6aabe2ac2ca3d11b5/raw/9cd8b1dc5894872d0eeebbee3dd0fdd28bb576bc/gistfile1.txt Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI") 
Signed-off-by: Danilo Krummrich
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 50589f982d1a..75545da9d1e9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -708,10 +708,11 @@ nouveau_drm_device_fini(struct drm_device *dev)
 	}
 	mutex_unlock(&drm->clients_lock);
 
-	nouveau_sched_fini(drm);
-
 	nouveau_cli_fini(&drm->client);
 	nouveau_cli_fini(&drm->master);
+
+	nouveau_sched_fini(drm);
+
 	nvif_parent_dtor(&drm->parent);
 	mutex_destroy(&drm->clients_lock);
 	kfree(drm);
-- 
2.44.0
Re: [PATCH 1/2] nouveau: lock the client object tree.
On 3/1/24 04:42, Dave Airlie wrote:

From: Dave Airlie

It appears the client object tree has no locking unless I've missed something else. Fix races around adding/removing client objects, mostly vram bar mappings.

[ 4562.099306] general protection fault, probably for non-canonical address 0x6677ed422bceb80c: [#1] PREEMPT SMP PTI
[ 4562.099314] CPU: 2 PID: 23171 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
[ 4562.099324] Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
[ 4562.099330] RIP: 0010:nvkm_object_search+0x1d/0x70 [nouveau]
[ 4562.099503] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 48 89 f8 48 85 f6 74 39 48 8b 87 a0 00 00 00 48 85 c0 74 12 <48> 8b 48 f8 48 39 ce 73 15 48 8b 40 10 48 85 c0 75 ee 48 c7 c0 fe
[ 4562.099506] RSP: :a94cc420bbf8 EFLAGS: 00010206
[ 4562.099512] RAX: 6677ed422bceb814 RBX: 98108791f400 RCX: 9810f26b8f58
[ 4562.099517] RDX: RSI: 9810f26b9158 RDI: 98108791f400
[ 4562.099519] RBP: 9810f26b9158 R08: R09:
[ 4562.099521] R10: a94cc420bc48 R11: 0001 R12: 9810f02a7cc0
[ 4562.099526] R13: R14: 00ff R15: 0007
[ 4562.099528] FS: 7f629c5017c0() GS:98142c70() knlGS:
[ 4562.099534] CS: 0010 DS: ES: CR0: 80050033
[ 4562.099536] CR2: 7f629a882000 CR3: 00017019e004 CR4: 003706f0
[ 4562.099541] DR0: DR1: DR2:
[ 4562.099542] DR3: DR6: fffe0ff0 DR7: 0400
[ 4562.099544] Call Trace:
[ 4562.099555]
[ 4562.099573] ? die_addr+0x36/0x90
[ 4562.099583] ? exc_general_protection+0x246/0x4a0
[ 4562.099593] ? asm_exc_general_protection+0x26/0x30
[ 4562.099600] ? nvkm_object_search+0x1d/0x70 [nouveau]
[ 4562.099730] nvkm_ioctl+0xa1/0x250 [nouveau]
[ 4562.099861] nvif_object_map_handle+0xc8/0x180 [nouveau]
[ 4562.099986] nouveau_ttm_io_mem_reserve+0x122/0x270 [nouveau]
[ 4562.100156] ? dma_resv_test_signaled+0x26/0xb0
[ 4562.100163] ttm_bo_vm_fault_reserved+0x97/0x3c0 [ttm]
[ 4562.100182] ? __mutex_unlock_slowpath+0x2a/0x270
[ 4562.100189] nouveau_ttm_fault+0x69/0xb0 [nouveau]
[ 4562.100356] __do_fault+0x32/0x150
[ 4562.100362] do_fault+0x7c/0x560
[ 4562.100369] __handle_mm_fault+0x800/0xc10
[ 4562.100382] handle_mm_fault+0x17c/0x3e0
[ 4562.100388] do_user_addr_fault+0x208/0x860
[ 4562.100395] exc_page_fault+0x7f/0x200
[ 4562.100402] asm_exc_page_fault+0x26/0x30
[ 4562.100412] RIP: 0033:0x9b9870
[ 4562.100419] Code: 85 a8 f7 ff ff 8b 8d 80 f7 ff ff 89 08 e9 18 f2 ff ff 0f 1f 84 00 00 00 00 00 44 89 32 e9 90 fa ff ff 0f 1f 84 00 00 00 00 00 <44> 89 32 e9 f8 f1 ff ff 0f 1f 84 00 00 00 00 00 66 44 89 32 e9 e7
[ 4562.100422] RSP: 002b:7fff9ba2dc70 EFLAGS: 00010246
[ 4562.100426] RAX: 0004 RBX: 0dd65e10 RCX: 00fff000
[ 4562.100428] RDX: 7f629a882000 RSI: 7f629a882000 RDI: 0066
[ 4562.100432] RBP: 7fff9ba2e570 R08: R09: 000123ddf000
[ 4562.100434] R10: 0001 R11: 0246 R12: 7fff
[ 4562.100436] R13: R14: R15:
[ 4562.100446]
[ 4562.100448] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink cmac bnep sunrpc iwlmvm intel_rapl_msr intel_rapl_common snd_sof_pci_intel_cnl x86_pkg_temp_thermal intel_powerclamp snd_sof_intel_hda_common mac80211 coretemp snd_soc_acpi_intel_match kvm_intel snd_soc_acpi snd_soc_hdac_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof kvm snd_sof_utils snd_soc_core snd_hda_codec_realtek libarc4 snd_hda_codec_generic snd_compress snd_hda_ext_core vfat fat snd_hda_intel snd_intel_dspcfg irqbypass iwlwifi snd_hda_codec snd_hwdep snd_hda_core btusb btrtl mei_hdcp iTCO_wdt rapl mei_pxp btintel snd_seq iTCO_vendor_support btbcm snd_seq_device intel_cstate bluetooth snd_pcm cfg80211 intel_wmi_thunderbolt wmi_bmof intel_uncore snd_timer mei_me snd ecdh_generic i2c_i801
[ 4562.100541] ecc mei i2c_smbus soundcore rfkill intel_pch_thermal acpi_pad zram nouveau drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_gpuvm drm_exec mxm_wmi drm_display_helper drm_kms_helper drm crct10dif_pclmul crc32_pclmul nvme e1000e crc32c_intel nvme_core ghash_clmulni_intel video wmi pinctrl_cannonlake ip6_tables ip_tables fuse
[ 4562.100616] ---[ end trace ]---

Signed-off-by: Dave Airlie
Cc: sta...@vger.kernel.org
---
 .../drm/nouveau/include/nvkm/core/client.h | 1 +
 drivers/gpu/drm/nouveau/nvkm/core/client.c | 1 +
Re: [PATCH] drm/nouveau: use dedicated wq for fence uevents work
On Fri, Feb 23, 2024 at 10:14:53AM +1000, Dave Airlie wrote: > On Fri, 23 Feb 2024 at 00:45, Danilo Krummrich wrote: > > > > Using the kernel global workqueue to signal fences can lead to > > unexpected deadlocks. Some other work (e.g. from a different driver) > > could directly or indirectly depend on this fence to be signaled. > > However, if the WQ_MAX_ACTIVE limit is reached by waiters, this can > > prevent the work signaling the fence from running. > > > > While this seems fairly unlikely, it's potentially exploitable. > > LGTM > > Reviewed-by: Dave Airlie > > probably should go into drm-misc-fixes? Yes, however, 39126abc5e20 went in through drm-fixes directly it seems, since it's not in drm-misc-fixes. Guess you want me to cherry-pick 39126abc5e20 to drm-misc-fixes rather than take this one through drm-fixes as well? > > > > > Fixes: 39126abc5e20 ("nouveau: offload fence uevents work to workqueue") > > Signed-off-by: Danilo Krummrich > > --- > > drivers/gpu/drm/nouveau/nouveau_drm.c | 13 +++-- > > drivers/gpu/drm/nouveau/nouveau_drv.h | 3 +++ > > drivers/gpu/drm/nouveau/nouveau_fence.c | 3 ++- > > drivers/gpu/drm/nouveau/nouveau_fence.h | 2 ++ > > 4 files changed, 18 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c > > b/drivers/gpu/drm/nouveau/nouveau_drm.c > > index 6f6c31a9937b..6be202081077 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c > > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c > > @@ -598,9 +598,15 @@ nouveau_drm_device_init(struct drm_device *dev) > > goto fail_alloc; > > } > > > > + drm->fence_wq = alloc_workqueue("nouveau_fence_wq", 0, > > WQ_MAX_ACTIVE); > > + if (!drm->fence_wq) { > > + ret = -ENOMEM; > > + goto fail_sched_wq; > > + } > > + > > ret = nouveau_cli_init(drm, "DRM-master", >master); > > if (ret) > > - goto fail_wq; > > + goto fail_fence_wq; > > > > ret = nouveau_cli_init(drm, "DRM", >client); > > if (ret) > > @@ -670,7 +676,9 @@ nouveau_drm_device_init(struct drm_device *dev) > > 
nouveau_cli_fini(>client); > > fail_master: > > nouveau_cli_fini(>master); > > -fail_wq: > > +fail_fence_wq: > > + destroy_workqueue(drm->fence_wq); > > +fail_sched_wq: > > destroy_workqueue(drm->sched_wq); > > fail_alloc: > > nvif_parent_dtor(>parent); > > @@ -725,6 +733,7 @@ nouveau_drm_device_fini(struct drm_device *dev) > > > > nouveau_cli_fini(>client); > > nouveau_cli_fini(>master); > > + destroy_workqueue(drm->fence_wq); > > destroy_workqueue(drm->sched_wq); > > nvif_parent_dtor(>parent); > > mutex_destroy(>clients_lock); > > diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h > > b/drivers/gpu/drm/nouveau/nouveau_drv.h > > index 8a6d94c8b163..b43619a213a4 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_drv.h > > +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h > > @@ -261,6 +261,9 @@ struct nouveau_drm { > > /* Workqueue used for channel schedulers. */ > > struct workqueue_struct *sched_wq; > > > > + /* Workqueue used to signal fences. */ > > + struct workqueue_struct *fence_wq; > > + > > /* context for accelerated drm-internal operations */ > > struct nouveau_channel *cechan; > > struct nouveau_channel *channel; > > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c > > b/drivers/gpu/drm/nouveau/nouveau_fence.c > > index 93f08f9479d8..c3ea3cd933cd 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c > > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c > > @@ -174,7 +174,7 @@ static int > > nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, > > u32 repc) > > { > > struct nouveau_fence_chan *fctx = container_of(event, > > typeof(*fctx), event); > > - schedule_work(>uevent_work); > > + queue_work(fctx->wq, >uevent_work); > > return NVIF_EVENT_KEEP; > > } > > > > @@ -194,6 +194,7 @@ nouveau_fence_context_new(struct nouveau_channel *chan, > > struct nouveau_fence_cha > > INIT_LIST_HEAD(>pending); > > spin_lock_init(>lock); > > fctx->context = chan->drm->runl[chan->runli
[PATCH] drm/nouveau: use dedicated wq for fence uevents work
Using the kernel global workqueue to signal fences can lead to unexpected deadlocks. Some other work (e.g. from a different driver) could directly or indirectly depend on this fence to be signaled. However, if the WQ_MAX_ACTIVE limit is reached by waiters, this can prevent the work signaling the fence from running. While this seems fairly unlikely, it's potentially exploitable.

Fixes: 39126abc5e20 ("nouveau: offload fence uevents work to workqueue")
Signed-off-by: Danilo Krummrich
---
 drivers/gpu/drm/nouveau/nouveau_drm.c   | 13 +++++++++++--
 drivers/gpu/drm/nouveau/nouveau_drv.h   |  3 +++
 drivers/gpu/drm/nouveau/nouveau_fence.c |  3 ++-
 drivers/gpu/drm/nouveau/nouveau_fence.h |  2 ++
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 6f6c31a9937b..6be202081077 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -598,9 +598,15 @@ nouveau_drm_device_init(struct drm_device *dev)
 		goto fail_alloc;
 	}
 
+	drm->fence_wq = alloc_workqueue("nouveau_fence_wq", 0, WQ_MAX_ACTIVE);
+	if (!drm->fence_wq) {
+		ret = -ENOMEM;
+		goto fail_sched_wq;
+	}
+
 	ret = nouveau_cli_init(drm, "DRM-master", &drm->master);
 	if (ret)
-		goto fail_wq;
+		goto fail_fence_wq;
 
 	ret = nouveau_cli_init(drm, "DRM", &drm->client);
 	if (ret)
@@ -670,7 +676,9 @@ nouveau_drm_device_init(struct drm_device *dev)
 	nouveau_cli_fini(&drm->client);
 fail_master:
 	nouveau_cli_fini(&drm->master);
-fail_wq:
+fail_fence_wq:
+	destroy_workqueue(drm->fence_wq);
+fail_sched_wq:
 	destroy_workqueue(drm->sched_wq);
 fail_alloc:
 	nvif_parent_dtor(&drm->parent);
@@ -725,6 +733,7 @@ nouveau_drm_device_fini(struct drm_device *dev)
 
 	nouveau_cli_fini(&drm->client);
 	nouveau_cli_fini(&drm->master);
+	destroy_workqueue(drm->fence_wq);
 	destroy_workqueue(drm->sched_wq);
 	nvif_parent_dtor(&drm->parent);
 	mutex_destroy(&drm->clients_lock);
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 8a6d94c8b163..b43619a213a4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -261,6 +261,9 @@ struct nouveau_drm {
 	/* Workqueue used for channel schedulers. */
 	struct workqueue_struct *sched_wq;
 
+	/* Workqueue used to signal fences. */
+	struct workqueue_struct *fence_wq;
+
 	/* context for accelerated drm-internal operations */
 	struct nouveau_channel *cechan;
 	struct nouveau_channel *channel;
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 93f08f9479d8..c3ea3cd933cd 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -174,7 +174,7 @@ static int
 nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, u32 repc)
 {
 	struct nouveau_fence_chan *fctx = container_of(event, typeof(*fctx), event);
-	schedule_work(&fctx->uevent_work);
+	queue_work(fctx->wq, &fctx->uevent_work);
 	return NVIF_EVENT_KEEP;
 }
 
@@ -194,6 +194,7 @@ nouveau_fence_context_new(struct nouveau_channel *chan, struct nouveau_fence_cha
 	INIT_LIST_HEAD(&fctx->pending);
 	spin_lock_init(&fctx->lock);
 	fctx->context = chan->drm->runl[chan->runlist].context_base + chan->chid;
+	fctx->wq = chan->drm->fence_wq;
 
 	if (chan == chan->drm->cechan)
 		strcpy(fctx->name, "copy engine channel");
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 8bc065acfe35..bc13110bdfa4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -44,7 +44,9 @@ struct nouveau_fence_chan {
 	u32 context;
 	char name[32];
 
+	struct workqueue_struct *wq;
 	struct work_struct uevent_work;
+
 	struct nvif_event event;
 	int notify_ref, dead, killed;
 };

base-commit: 1f4c6f11a557642505e5f403e0dfabbaff9c529a
-- 
2.43.0
Re: [PATCH 1/2] drm/nouveau: don't fini scheduler if not initialized
On 2/20/24 20:45, Timur Tabi wrote: On Mon, 2024-02-19 at 10:32 +0100, Danilo Krummrich wrote: Looks like I spoke too soon, I just hit the problem with the drm-next tree. Did you apply the patch to drm-next? Ugh, you're right. I don't know how I got confused, but I could have sworn that I saw your two patches in drm-next, but they are not there. Sorry for the noise. No worries, thanks for testing! :-)
Re: [PATCH] nouveau: offload fence uevents work to workqueue
On 2/16/24 17:41, Daniel Vetter wrote: On Tue, Feb 13, 2024 at 06:39:20PM +0100, Danilo Krummrich wrote: On 2/9/24 19:52, Daniel Vetter wrote: On Fri, Feb 09, 2024 at 06:41:32PM +0100, Danilo Krummrich wrote: On 2/6/24 15:03, Daniel Vetter wrote: On Mon, Feb 05, 2024 at 11:00:04PM +0100, Danilo Krummrich wrote: On 2/5/24 22:08, Dave Airlie wrote: On Tue, 6 Feb 2024 at 02:22, Danilo Krummrich wrote: On 1/29/24 02:50, Dave Airlie wrote: From: Dave Airlie This should break the deadlock between the fctx lock and the irq lock. This offloads the processing off the work from the irq into a workqueue. Signed-off-by: Dave Airlie Nouveau's scheduler uses a dedicated wq, hence from this perspective it's safe deferring fence signalling to the kernel global wq. However, I wonder if we could create deadlocks by building dependency chains into other drivers / kernel code that, by chance, makes use of the kernel global wq as well. Admittedly, even if, it's gonna be extremely unlikely given that WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq. Also, do we need to CC stable? I pushed this to Linus at the end of last week, since the hangs in 6.7 take out the complete system and I wanted it in stable. It might be safer to use a dedicated wq, is the concern someone is waiting on a fence in a workqueue somewhere else so we will never signal it? Yes, if some other work is waiting for this fence (or something else in the same dependency chain) to signal it can prevent executing the work signaling this fence, in case both are scheduled on the same wq. As mentioned, with the kernel global wq this would be rather unlikely to lead to an actual stall with WQ_MAX_ACTIVE == 512, but formally the race condition exists. I guess a malicious attacker could try to intentionally push jobs directly or indirectly depending on this fence to a driver which queues them up on a scheduler using the kernel global wq. 
I think if you add dma_fence_signalling annotations (aside, there's some patch from iirc Thomas Hellstrom to improve them and cut down on some false positives, but I lost track) then I think you won't get any splats because the wq subsystem assumes that WQ_MAX_ACTIVE is close enough to infinity to not matter. As mentioned, for the kernel global wq it's 512. (Intentionally) feeding the kernel with enough jobs to provoke a deadlock doesn't seem impossible to me. I think it'd be safer to just establish not to use the kernel global wq for executing work in the fence signalling critical path. We could also run into similar problems with a dedicated wq, e.g. when drivers share a wq between drm_gpu_scheduler instances (see [1]), however, I'm not sure we can catch that with lockdep. I think if you want to fix it perfectly you'd need to set the max number of wq to the number of engines (or for dynamic/fw scheduled engines to the number of contexts) you have. Or whatever limit to the number of parallel timelines there is. I guess this would need a new wq function to update? drm/sched code could be able to set that for drivers, so drivers cannot get this wrong. Not sure I can follow. The scheduler instance might be per context and bind queue. In this case it gets the shared wq passed, but doesn't know how many other scheduler instances share the same one. Yeah that's why maybe more of that logic should be in the drm/sched code instead of drivers just cleverly using what's there ... Agree, that's gonna be a huge design change though. Additionally, there might be drivers not using the DRM scheduler for bind queues at all (I think Xe does not). Uh ... maybe we should do this the same across all drivers? But I also thought that Xe was flat-out synchronous and only had an out-fence since you need a userspace thread in mesa for vk semantics anyway. If xe hand-rolled a scheduler I'm not going to be very amused.
I really don't know the details, but there are similarities at least. There is the rebind work, which seems to be called in some VM_BIND cases and in the context of an EXEC ioctl and seems to signal a fence. It seems valid to not stuff this into the scheduler. There are also cases like this one, where we have fence signalling critical code in wqs outside the context of a scheduler instance. If we don't do something like that then I'm not sure there's really much benefit - instead of carefully timing 512 dma_fence on the global wq you just need to create a pile of contexts (at least on intel's guc that's absolutely no issue) and then carefully time them. Well, that's true. I'd still argue that there is a slight difference. From a driver's isolated perspective using the global kernel wq might be entirely fine, as in this patch. However, in combination with another driver doing the same thing, things can blow up. For the case you illustrated it's at least possible to spot it from a driver's perspective. I still feel like we have bigger fish to
Re: [PATCH 1/2] drm/nouveau: don't fini scheduler if not initialized
On 2/15/24 00:48, Timur Tabi wrote: On Fri, 2024-02-02 at 17:14 +, Timur Tabi wrote: On Fri, 2024-02-02 at 01:05 +0100, Danilo Krummrich wrote: nouveau_abi16_ioctl_channel_alloc() and nouveau_cli_init() simply call their corresponding *_fini() counterpart. This can lead to nouveau_sched_fini() being called without struct nouveau_sched ever being initialized in the first place. Thanks, I've confirmed that these patches do fix the problem. Looks like I spoke too soon, I just hit the problem with the drm-next tree. Did you apply the patch to drm-next? I'm able to repro the problem by having r535_gsp_init() return an error. r535_gsp_rpc_poll returns -EINVAL (I'm testing my own GSP-RM build) and nouveau_sched_fini() is called even though nouveau_sched_init() was never called.
Re: [PATCH] drm/nouveau/mmu/r535: uninitialized variable in r535_bar_new_()
On 2/13/24 19:09, Dan Carpenter wrote:

If gf100_bar_new_() fails then "bar" is not initialized.

Fixes: 5bf0257136a2 ("drm/nouveau/mmu/r535: initial support")
Signed-off-by: Dan Carpenter

Applied to drm-misc-fixes, thanks!

---
It looks like this was intended to handle a failure from the "rm" func but "rm" can't actually fail so it's easier to write the error handling for the code as-is.

 drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c
index 4135690326f4..3a30bea30e36 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c
@@ -168,12 +168,11 @@ r535_bar_new_(const struct nvkm_bar_func *hw, struct nvkm_device *device,
 	rm->flush = r535_bar_flush;
 
 	ret = gf100_bar_new_(rm, device, type, inst, &bar);
-	*pbar = bar;
 	if (ret) {
-		if (!bar)
-			kfree(rm);
+		kfree(rm);
 		return ret;
 	}
+	*pbar = bar;
 
 	bar->flushBAR2PhysMode = ioremap(device->func->resource_addr(device, 3), PAGE_SIZE);
 	if (!bar->flushBAR2PhysMode)
Re: [PATCH] nouveau: fix function cast warnings
On 2/13/24 10:57, Arnd Bergmann wrote: From: Arnd Bergmann clang-16 warns about casting between incompatible function types: drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c:161:10: error: cast from 'void (*)(const struct firmware *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict] 161 | .fini = (void(*)(void *))release_firmware, This one was done to use the generic shadow_fw_release() function as a callback for struct nvbios_source. Change it to use the same prototype as the other five instances, with a trivial helper function that actually calls release_firmware. Fixes: 70c0f263cc2e ("drm/nouveau/bios: pull in basic vbios subdev, more to come later") Signed-off-by: Arnd Bergmann Applied to drm-misc-fixes, thanks! --- drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c index 19188683c8fc..8c2bf1c16f2a 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c @@ -154,11 +154,17 @@ shadow_fw_init(struct nvkm_bios *bios, const char *name) return (void *)fw; } +static void +shadow_fw_release(void *fw) +{ + release_firmware(fw); +} + static const struct nvbios_source shadow_fw = { .name = "firmware", .init = shadow_fw_init, - .fini = (void(*)(void *))release_firmware, + .fini = shadow_fw_release, .read = shadow_fw_read, .rw = false, };
Re: [PATCH 2/2] [v3] drm/nouveau: expose GSP-RM logging buffers via debugfs
On 2/13/24 18:10, Timur Tabi wrote: On Tue, 2024-02-13 at 17:57 +0100, Danilo Krummrich wrote: + struct debugfs_blob_wrapper blob_init; + struct debugfs_blob_wrapper blob_intr; + struct debugfs_blob_wrapper blob_rm; + struct debugfs_blob_wrapper blob_pmu; + struct dentry *debugfs_logging_dir; I think we should not create those from within the nvkm layer, but rather pass them down through nvkm_device_pci_new(). Should they be created in nvkm_device_pci_new() also, even though we have no idea whether GSP is involved at that point? We can pass some structure to nvkm_device_pci_new() where GSP sets the pointers to the buffers and maybe a callback to clean them up. Once nvkm_device_pci_new() returns we'd see if something has been set or not. If so, we can go ahead and create the debugfs nodes. Lifecycle wise I think we should ensure that removing the Nouveau kernel module also cleans up those buffers. Even though keep-gsp-logging is considered unsafe, we shouldn't leak memory. I agree, but then there needs to be some way to keep these debugfs entries until the driver unloads. I don't know how to do that without creating some ugly global variables. I fear that might be the only option. However, I still prefer a global list over a memory leak. For instance, can we allocate corresponding buffers in the driver layer, copy things over and keep those buffers until nouveau_drm_exit()? This would also avoid exposing those DMA buffers via debugfs. The whole point behind this patch is to expose the buffers via debugfs. How else should they be exposed? I think I only thought about the fail case where we could just copy over the data from the DMA buffer to another buffer and expose that one instead. However, that doesn't work if GSP loads successfully. 
Hence, as mentioned above, we might just want to have some structure that we can add to the global list that contains the pointers to the DMA buffers and maybe a callback function to clean them up defined by the GSP code that we can simply call from nouveau_drm_exit().
Re: [PATCH] nouveau: offload fence uevents work to workqueue
On 2/9/24 19:52, Daniel Vetter wrote: On Fri, Feb 09, 2024 at 06:41:32PM +0100, Danilo Krummrich wrote: On 2/6/24 15:03, Daniel Vetter wrote: On Mon, Feb 05, 2024 at 11:00:04PM +0100, Danilo Krummrich wrote: On 2/5/24 22:08, Dave Airlie wrote: On Tue, 6 Feb 2024 at 02:22, Danilo Krummrich wrote: On 1/29/24 02:50, Dave Airlie wrote: From: Dave Airlie This should break the deadlock between the fctx lock and the irq lock. This offloads the processing off the work from the irq into a workqueue. Signed-off-by: Dave Airlie Nouveau's scheduler uses a dedicated wq, hence from this perspective it's safe deferring fence signalling to the kernel global wq. However, I wonder if we could create deadlocks by building dependency chains into other drivers / kernel code that, by chance, makes use of the kernel global wq as well. Admittedly, even if, it's gonna be extremely unlikely given that WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq. Also, do we need to CC stable? I pushed this to Linus at the end of last week, since the hangs in 6.7 take out the complete system and I wanted it in stable. It might be safer to use a dedicated wq, is the concern someone is waiting on a fence in a workqueue somewhere else so we will never signal it? Yes, if some other work is waiting for this fence (or something else in the same dependency chain) to signal it can prevent executing the work signaling this fence, in case both are scheduled on the same wq. As mentioned, with the kernel global wq this would be rather unlikely to lead to an actual stall with WQ_MAX_ACTIVE == 512, but formally the race condition exists. I guess a malicious attacker could try to intentionally push jobs directly or indirectly depending on this fence to a driver which queues them up on a scheduler using the kernel global wq. 
I think if you add dma_fence_signalling annotations (aside, there's some patch from iirc Thomas Hellstrom to improve them and cut down on some false positives, but I lost track) then I think you won't get any splats because the wq subsystem assumes that WQ_MAX_ACTIVE is close enough to infinity to not matter. As mentioned, for the kernel global wq it's 512. (Intentionally) feeding the kernel with enough jobs to provoke a deadlock doesn't seem impossible to me. I think it'd be safer to just establish not to use the kernel global wq for executing work in the fence signalling critical path. We could also run into similar problems with a dedicated wq, e.g. when drivers share a wq between drm_gpu_scheduler instances (see [1]), however, I'm not sure we can catch that with lockdep. I think if you want to fix it perfectly you'd need to set the max number of wq to the number of engines (or for dynamic/fw scheduled engines to the number of contexts) you have. Or whatever limit to the number of parallel timelines there is. I guess this would need a new wq function to update? drm/sched code could be able to set that for drivers, so drivers cannot get this wrong. Not sure I can follow. The scheduler instance might be per context and bind queue. In this case it gets the shared wq passed, but doesn't know how many other scheduler instances share the same one. Additionally, there might be drivers not using the DRM scheduler for bind queues at all (I think Xe does not). If we don't do something like that then I'm not sure there's really much benefit - instead of carefully timing 512 dma_fence on the global wq you just need to create a pile of contexts (at least on intel's guc that's absolutely no issue) and then carefully time them. Well, that's true. I'd still argue that there is a slight difference. From a driver's isolated perspective using the global kernel wq might be entirely fine, as in this patch.
However, in combination with another driver doing the same thing, things can blow up. For the case you illustrated it's at least possible to spot it from a driver's perspective. I still feel like we have bigger fish to fry ... But might be worth the effort to at least make the parallel timeline limit a lot more explicit? I agree, and it'd be great if we can find a solution such that bugs can be detected systematically (e.g. through lockdep), but maybe we can start to at least document that we should never use the kernel global wq and where we need to be careful in sharing driver wqs. - Danilo Cheers, Sima [1] https://elixir.bootlin.com/linux/v6.8-rc3/source/drivers/gpu/drm/nouveau/nouveau_drm.c#L313 I'm not sure we should care differently, but I guess it'd be good to annotate it all in case the wq subsystem's idea of how much such deadlocks are real changes. Also Teo is on a mission to get rid of all the global wq flushes, so there should also be no source of deadlocks from that kind of cross-driver dependency. Or at least shouldn't be in the future, I'm not sure it all landed. -Sima
Re: [PATCH 1/2] drm/scheduler: improve GPU scheduler documentation v2
Hi Christian, What's the status of this effort? Was there ever a follow-up? - Danilo On 11/16/23 23:23, Danilo Krummrich wrote: Hi Christian, Thanks for sending an update of this patch! On Thu, Nov 16, 2023 at 03:15:46PM +0100, Christian König wrote: Start to improve the scheduler document. Especially document the lifetime of each of the objects as well as the restrictions around DMA-fence handling and userspace compatibility. v2: Some improvements suggested by Danilo, add section about error handling. Signed-off-by: Christian König --- Documentation/gpu/drm-mm.rst | 36 + drivers/gpu/drm/scheduler/sched_main.c | 174 + 2 files changed, 188 insertions(+), 22 deletions(-) diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index acc5901ac840..112463fa9f3a 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -552,12 +552,48 @@ Overview .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c :doc: Overview +Job Object +-- + +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c + :doc: Job Object + +Entity Object +- + +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c + :doc: Entity Object + +Hardware Fence Object +- + +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c + :doc: Hardware Fence Object + +Scheduler Fence Object +-- + +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c + :doc: Scheduler Fence Object + +Scheduler and Run Queue Objects +--- + +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c + :doc: Scheduler and Run Queue Objects + Flow Control .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c :doc: Flow Control +Error and Timeout handling +-- + +.. 
kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c + :doc: Error and Timeout handling + Scheduler Function References - diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 044a8c4875ba..026123497b0e 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -24,28 +24,122 @@ /** * DOC: Overview * - * The GPU scheduler provides entities which allow userspace to push jobs - * into software queues which are then scheduled on a hardware run queue. - * The software queues have a priority among them. The scheduler selects the entities - * from the run queue using a FIFO. The scheduler provides dependency handling - * features among jobs. The driver is supposed to provide callback functions for - * backend operations to the scheduler like submitting a job to hardware run queue, - * returning the dependencies of a job etc. - * - * The organisation of the scheduler is the following: - * - * 1. Each hw run queue has one scheduler - * 2. Each scheduler has multiple run queues with different priorities - *(e.g., HIGH_HW,HIGH_SW, KERNEL, NORMAL) - * 3. Each scheduler run queue has a queue of entities to schedule - * 4. Entities themselves maintain a queue of jobs that will be scheduled on - *the hardware. - * - * The jobs in a entity are always scheduled in the order that they were pushed. - * - * Note that once a job was taken from the entities queue and pushed to the - * hardware, i.e. the pending queue, the entity must not be referenced anymore - * through the jobs entity pointer. + * The GPU scheduler implements some logic to decide which command submission + * to push next to the hardware. Another major use case of the GPU scheduler + * is to enforce correct driver behavior around those command submissions. + * Because of this it's also used by drivers which don't need the actual + * scheduling functionality. This reads a bit like we're already right in the middle of the documentation. 
I'd propose to start with something like "The DRM GPU scheduler serves as a common component intended to help drivers to manage command submissions." And then mention the different solutions the scheduler provides, e.g. ordering of command submissions, dma-fence handling in the context of command submissions, etc. Also, I think it would be good to give a rough overview of the topology of the scheduler's components. And since you already mention that the component is "also used by drivers which don't need the actual scheduling functionality", I'd also mention the special case of a single entity and a single run-queue per scheduler. + * + * All callbacks the driver needs to implement are restricted by DMA-fence + * signaling rules to guarantee deadlock free forward progress. This especially + * means that for normal operation no memory can be allocated in a callback. + * All memory which is needed for pushing the job to the hardware must be + * allocated before arming a job. It also means that no locks can be taken +
Re: [PATCH 2/2] [v3] drm/nouveau: expose GSP-RM logging buffers via debugfs
On 2/12/24 22:15, Timur Tabi wrote: The LOGINIT, LOGINTR, LOGRM, and LOGPMU buffers are circular buffers that have printf-like logs from GSP-RM and PMU encoded in them. LOGINIT, LOGINTR, and LOGRM are allocated by Nouveau and their DMA addresses are passed to GSP-RM during initialization. The buffers are required for GSP-RM to initialize properly. LOGPMU is also allocated by Nouveau, but its contents are updated when Nouveau receives an NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPC from GSP-RM. Nouveau then copies the RPC to the buffer. The messages are encoded as an array of variable-length structures that contain the parameters to an NV_PRINTF call. The format string and parameter count are stored in a special ELF image that contains only logging strings. This image is not currently shipped with the Nvidia driver. There are two methods to extract the logs. OpenRM tries to load the logging ELF, and if present, parses the log buffers in real time and outputs the strings to the kernel console. Alternatively, and this is the method used by this patch, the buffers can be exposed to user space, and a user-space tool (along with the logging ELF image) can parse the buffer and dump the logs. This method has the advantage that it allows the buffers to be parsed even when the logging ELF file is not available to the user. However, it has the disadvantage that the debugfs entries need to remain until the driver is unloaded. The buffers are exposed via debugfs. The debugfs entries must be created before GSP-RM is started, to ensure that they are available during GSP-RM initialization. If GSP-RM fails to initialize, then Nouveau immediately shuts down the GSP interface. This would normally also deallocate the logging buffers, thereby preventing the user from capturing the debug logs. To avoid this, the keep-gsp-logging command line parameter can be specified.
This parameter is marked as *unsafe* (thereby tainting the kernel) because the DMA buffer and debugfs entries are never deallocated, even if the driver unloads. This gives the user the time to capture the logs, but it also means that resources can only be recovered by a reboot. An end-user can capture the logs using the following commands: cp /sys/kernel/debug/dri//loginit loginit cp /sys/kernel/debug/dri//logrm logrm cp /sys/kernel/debug/dri//logintr logintr cp /sys/kernel/debug/dri//logpmu logpmu where is the PCI ID of the GPU (e.g. :65:00.0). If keep-gsp-logging is specified, then the is the same but with -debug appended (e.g. :65:00.0-debug). Since LOGPMU is not needed for normal GSP-RM operation, it is only created if debugfs is available. Otherwise, the NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPCs are ignored. Signed-off-by: Timur Tabi --- v3: reworked r535_gsp_libos_debugfs_init, rebased for drm-next .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h | 12 + .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c| 215 +- 2 files changed, 223 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h index 3fbc57b16a05..2ee44bdf8be7 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h @@ -5,6 +5,8 @@ #include #include +#include + #define GSP_PAGE_SHIFT 12 #define GSP_PAGE_SIZE BIT(GSP_PAGE_SHIFT) @@ -217,6 +219,16 @@ struct nvkm_gsp { /* The size of the registry RPC */ size_t registry_rpc_size; + + /* +* Logging buffers in debugfs. The wrapper objects need to remain +* in memory until the dentry is deleted. +*/ + struct debugfs_blob_wrapper blob_init; + struct debugfs_blob_wrapper blob_intr; + struct debugfs_blob_wrapper blob_rm; + struct debugfs_blob_wrapper blob_pmu; + struct dentry *debugfs_logging_dir; I think we should not create those from within the nvkm layer, but rather pass them down through nvkm_device_pci_new().
Lifecycle wise I think we should ensure that removing the Nouveau kernel module also cleans up those buffers. Even though keep-gsp-logging is considered unsafe, we shouldn't leak memory. For instance, can we allocate corresponding buffers in the driver layer, copy things over and keep those buffers until nouveau_drm_exit()? This would also avoid exposing those DMA buffers via debugfs. }; static inline bool diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c index 86b62c7e1229..56209bf81360 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c @@ -26,6 +26,7 @@ #include #include #include +#include #include @@ -1979,6 +1980,196 @@ r535_gsp_rmargs_init(struct nvkm_gsp *gsp, bool resume) return 0; } +#define NV_GSP_MSG_EVENT_UCODE_LIBOS_CLASS_PMU 0xf3d722 + +/** + *
Re: [PATCH 1/2] [v3] nouveau: add command-line GSP-RM registry support
On 2/12/24 22:15, Timur Tabi wrote: Add the NVreg_RegistryDwords command line parameter, which allows specifying additional registry keys to be sent to GSP-RM. This allows additional configuration, debugging, and experimentation with GSP-RM, which uses these keys to alter its behavior. Note that these keys are passed as-is to GSP-RM, and Nouveau does not parse them. This is in contrast to the Nvidia driver, which may parse some of the keys to configure some functionality in concert with GSP-RM. Therefore, any keys which also require action by the driver may not function correctly when passed by Nouveau. Caveat emptor. The name and format of NVreg_RegistryDwords is the same as used by the Nvidia driver, to maintain compatibility. Signed-off-by: Timur Tabi --- v3: rebased to drm-next .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h | 6 + .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c| 279 -- 2 files changed, 261 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h index 6f5d376d8fcc..3fbc57b16a05 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h @@ -211,6 +211,12 @@ struct nvkm_gsp { struct mutex mutex;; struct idr idr; } client_id; + + /* A linked list of registry items. The registry RPC will be built from it. 
*/ + struct list_head registry_list; + + /* The size of the registry RPC */ + size_t registry_rpc_size; }; static inline bool diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c index 1c46e3f919be..86b62c7e1229 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c @@ -54,6 +54,8 @@ #include #include +#include +#include #define GSP_MSG_MIN_SIZE GSP_PAGE_SIZE #define GSP_MSG_MAX_SIZE GSP_PAGE_MIN_SIZE * 16 @@ -1082,51 +1084,280 @@ r535_gsp_rpc_unloading_guest_driver(struct nvkm_gsp *gsp, bool suspend) return nvkm_gsp_rpc_wr(gsp, rpc, true); } +struct registry_list_entry { + struct list_head list; Nit: 'head' or 'entry' might be more suitable. + size_t name_len; + u32 type; I prefer to represent type as an enum or use a define for '1 = integer' instead. This also removes the need to explicitly explain its meaning in the documentation of add_registry(). + u32 data; + u32 length; + char name[]; +}; + +/** + * add_registry -- adds a registry entry + * @gsp: gsp pointer + * @name: name of the registry key + * @type: type of data (1 = integer) + * @data: value + * @length: size of data, in bytes + * + * Adds a registry key/value pair to the registry database. + * + * Currently, only 32-bit integers (type == 1, length == 4) are supported. What if we pass something else? This function doesn't seem to ensure that nothing else is accepted. Although I recognize that it's only used by add_registry_num(), which enforces this limitation by its function signature. + * + * This function collects the registry information in a linked list. After + * all registry keys have been added, build_registry() is used to create the + * RPC data structure. + * + * registry_rpc_size is a running total of the size of all registry keys. + * It's used to avoid an O(n) calculation of the size when the RPC is built. + * + * Returns 0 on success, or negative error code on error.
+ */ +static int add_registry(struct nvkm_gsp *gsp, const char *name, u32 type, u32 data, u32 length) +{ + struct registry_list_entry *reg; + size_t nlen = strlen(name) + 1; + + /* Set an arbitrary limit to avoid problems with broken command lines */ + if (strlen(name) > 64) Could re-use nlen for this check. + return -EFBIG; This error code doesn't seem to apply here, prefer EINVAL. + + reg = kmalloc(sizeof(struct registry_list_entry) + nlen, GFP_KERNEL); + if (!reg) + return -ENOMEM; + + memcpy(reg->name, name, nlen); + reg->name_len = nlen; + reg->type = type; + reg->data = data; + reg->length = length; + + nvkm_debug(&gsp->subdev, "adding GSP-RM registry '%s=%u'\n", name, data); + list_add_tail(&reg->list, &gsp->registry_list); + gsp->registry_rpc_size += sizeof(PACKED_REGISTRY_ENTRY) + nlen; + + return 0; +} + +static int add_registry_num(struct nvkm_gsp *gsp, const char *name, u32 value) +{ + return add_registry(gsp, name, 1, value, sizeof(u32)); +} + +/** + * build_registry -- create the registry RPC data + * @gsp: gsp pointer + * @registry: pointer to the RPC payload to fill + * + * After all registry key/value pairs have been added, call this function to + * build the RPC. + * + * The registry RPC looks like this: + * + * +-+ + * |NvU32 size; | + * |NvU32
Re: [PATCH] nouveau/svm: fix kvcalloc() argument order
On 2/12/24 12:22, Arnd Bergmann wrote: From: Arnd Bergmann The conversion to kvcalloc() mixed up the object size and count arguments, causing a warning: drivers/gpu/drm/nouveau/nouveau_svm.c: In function 'nouveau_svm_fault_buffer_ctor': drivers/gpu/drm/nouveau/nouveau_svm.c:1010:40: error: 'kvcalloc' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args] 1010 | buffer->fault = kvcalloc(sizeof(*buffer->fault), buffer->entries, GFP_KERNEL); |^ drivers/gpu/drm/nouveau/nouveau_svm.c:1010:40: note: earlier argument should specify number of elements, later size of each element The behavior is still correct aside from the warning, but fixing it avoids the warnings and can help the compiler track the individual objects better. Fixes: 71e4bbca070e ("nouveau/svm: Use kvcalloc() instead of kvzalloc()") Signed-off-by: Arnd Bergmann Applied to drm-misc-fixes, thanks! --- drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index 4d1008915499..b4da82ddbb6b 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -1007,7 +1007,7 @@ nouveau_svm_fault_buffer_ctor(struct nouveau_svm *svm, s32 oclass, int id) if (ret) return ret; - buffer->fault = kvcalloc(sizeof(*buffer->fault), buffer->entries, GFP_KERNEL); + buffer->fault = kvcalloc(buffer->entries, sizeof(*buffer->fault), GFP_KERNEL); if (!buffer->fault) return -ENOMEM;
Re: [PATCH] nouveau: offload fence uevents work to workqueue
On 2/6/24 15:03, Daniel Vetter wrote: On Mon, Feb 05, 2024 at 11:00:04PM +0100, Danilo Krummrich wrote: On 2/5/24 22:08, Dave Airlie wrote: On Tue, 6 Feb 2024 at 02:22, Danilo Krummrich wrote: On 1/29/24 02:50, Dave Airlie wrote: From: Dave Airlie This should break the deadlock between the fctx lock and the irq lock. This offloads the processing off the work from the irq into a workqueue. Signed-off-by: Dave Airlie Nouveau's scheduler uses a dedicated wq, hence from this perspective it's safe deferring fence signalling to the kernel global wq. However, I wonder if we could create deadlocks by building dependency chains into other drivers / kernel code that, by chance, makes use of the kernel global wq as well. Admittedly, even if, it's gonna be extremely unlikely given that WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq. Also, do we need to CC stable? I pushed this to Linus at the end of last week, since the hangs in 6.7 take out the complete system and I wanted it in stable. It might be safer to use a dedicated wq, is the concern someone is waiting on a fence in a workqueue somewhere else so we will never signal it? Yes, if some other work is waiting for this fence (or something else in the same dependency chain) to signal it can prevent executing the work signaling this fence, in case both are scheduled on the same wq. As mentioned, with the kernel global wq this would be rather unlikely to lead to an actual stall with WQ_MAX_ACTIVE == 512, but formally the race condition exists. I guess a malicious attacker could try to intentionally push jobs directly or indirectly depending on this fence to a driver which queues them up on a scheduler using the kernel global wq. 
I think if you add dma_fence_signalling annotations (aside, there's some patch from iirc Thomas Hellstrom to improve them and cut down on some false positives, but I lost track) then I think you won't get any splats because the wq subsystem assumes that WQ_MAX_ACTIVE is close enough to infinity to not matter. As mentioned, for the kernel global wq it's 512. (Intentionally) feeding the kernel with enough jobs to provoke a deadlock doesn't seem impossible to me. I think it'd be safer to just establish not to use the kernel global wq for executing work in the fence signalling critical path. We could also run into similar problems with a dedicated wq, e.g. when drivers share a wq between drm_gpu_scheduler instances (see [1]), however, I'm not sure we can catch that with lockdep. [1] https://elixir.bootlin.com/linux/v6.8-rc3/source/drivers/gpu/drm/nouveau/nouveau_drm.c#L313 I'm not sure we should care differently, but I guess it'd be good to annotate it all in case the wq subsystem's idea of how real such deadlocks are changes. Also Teo is on a mission to get rid of all the global wq flushes, so there should also be no source of deadlocks from that kind of cross-driver dependency. Or at least shouldn't be in the future, I'm not sure it all landed. -Sima
Re: [PATCH] nouveau: offload fence uevents work to workqueue
On 2/5/24 22:08, Dave Airlie wrote: On Tue, 6 Feb 2024 at 02:22, Danilo Krummrich wrote: On 1/29/24 02:50, Dave Airlie wrote: From: Dave Airlie This should break the deadlock between the fctx lock and the irq lock. This offloads the processing off the work from the irq into a workqueue. Signed-off-by: Dave Airlie Nouveau's scheduler uses a dedicated wq, hence from this perspective it's safe deferring fence signalling to the kernel global wq. However, I wonder if we could create deadlocks by building dependency chains into other drivers / kernel code that, by chance, makes use of the kernel global wq as well. Admittedly, even if, it's gonna be extremely unlikely given that WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq. Also, do we need to CC stable? I pushed this to Linus at the end of last week, since the hangs in 6.7 take out the complete system and I wanted it in stable. It might be safer to use a dedicated wq, is the concern someone is waiting on a fence in a workqueue somewhere else so we will never signal it? Yes, if some other work is waiting for this fence (or something else in the same dependency chain) to signal it can prevent executing the work signaling this fence, in case both are scheduled on the same wq. As mentioned, with the kernel global wq this would be rather unlikely to lead to an actual stall with WQ_MAX_ACTIVE == 512, but formally the race condition exists. I guess a malicious attacker could try to intentionally push jobs directly or indirectly depending on this fence to a driver which queues them up on a scheduler using the kernel global wq. Dave.
Re: [PATCH] nouveau/gsp: use correct size for registry rpc.
On 1/30/24 04:26, Dave Airlie wrote: From: Dave Airlie Timur pointed this out before, and it just slipped my mind, but this might help some things work better, around pcie power management. Fixes: 8d55b0a940bb ("nouveau/gsp: add some basic registry entries.") Signed-off-by: Dave Airlie Added stable CC for v6.7 and applied it to drm-misc-fixes. --- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c index 9ee58e2a0eb2..5e1fa176aac4 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c @@ -1078,7 +1078,6 @@ r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp) if (IS_ERR(rpc)) return PTR_ERR(rpc); - rpc->size = sizeof(*rpc); rpc->numEntries = NV_GSP_REG_NUM_ENTRIES; str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]); @@ -1094,6 +1093,7 @@ r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp) strings += name_len; str_offset += name_len; } + rpc->size = str_offset; return nvkm_gsp_rpc_wr(gsp, rpc, false); }
Re: [PATCH] nouveau: offload fence uevents work to workqueue
On 1/29/24 02:50, Dave Airlie wrote: From: Dave Airlie This should break the deadlock between the fctx lock and the irq lock. This offloads the processing off the work from the irq into a workqueue. Signed-off-by: Dave Airlie Nouveau's scheduler uses a dedicated wq, hence from this perspective it's safe deferring fence signalling to the kernel global wq. However, I wonder if we could create deadlocks by building dependency chains into other drivers / kernel code that, by chance, makes use of the kernel global wq as well. Admittedly, even if, it's gonna be extremely unlikely given that WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq. Also, do we need to CC stable? --- drivers/gpu/drm/nouveau/nouveau_fence.c | 24 ++-- drivers/gpu/drm/nouveau/nouveau_fence.h | 1 + 2 files changed, 19 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index ca762ea55413..93f08f9479d8 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -103,6 +103,7 @@ nouveau_fence_context_kill(struct nouveau_fence_chan *fctx, int error) void nouveau_fence_context_del(struct nouveau_fence_chan *fctx) { + cancel_work_sync(>uevent_work); nouveau_fence_context_kill(fctx, 0); nvif_event_dtor(>event); fctx->dead = 1; @@ -145,12 +146,13 @@ nouveau_fence_update(struct nouveau_channel *chan, struct nouveau_fence_chan *fc return drop; } -static int -nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, u32 repc) +static void +nouveau_fence_uevent_work(struct work_struct *work) { - struct nouveau_fence_chan *fctx = container_of(event, typeof(*fctx), event); + struct nouveau_fence_chan *fctx = container_of(work, struct nouveau_fence_chan, + uevent_work); unsigned long flags; - int ret = NVIF_EVENT_KEEP; + int drop = 0; spin_lock_irqsave(>lock, flags); if (!list_empty(>pending)) { @@ -160,11 +162,20 @@ nouveau_fence_wait_uevent_handler(struct nvif_event 
*event, void *repv, u32 repc fence = list_entry(fctx->pending.next, typeof(*fence), head); chan = rcu_dereference_protected(fence->channel, lockdep_is_held(&fctx->lock)); if (nouveau_fence_update(chan, fctx)) - ret = NVIF_EVENT_DROP; + drop = 1; } + if (drop) + nvif_event_block(&fctx->event); + spin_unlock_irqrestore(&fctx->lock, flags); +} - return ret; +static int +nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, u32 repc) +{ + struct nouveau_fence_chan *fctx = container_of(event, typeof(*fctx), event); + schedule_work(&fctx->uevent_work); + return NVIF_EVENT_KEEP; } void @@ -178,6 +189,7 @@ nouveau_fence_context_new(struct nouveau_channel *chan, struct nouveau_fence_cha } args; int ret; + INIT_WORK(&fctx->uevent_work, nouveau_fence_uevent_work); INIT_LIST_HEAD(&fctx->flip); INIT_LIST_HEAD(&fctx->pending); spin_lock_init(&fctx->lock); diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h index 64d33ae7f356..8bc065acfe35 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.h +++ b/drivers/gpu/drm/nouveau/nouveau_fence.h @@ -44,6 +44,7 @@ struct nouveau_fence_chan { u32 context; char name[32]; + struct work_struct uevent_work; struct nvif_event event; int notify_ref, dead, killed; };
Re: [PATCH v4 00/14] drm: Add a driver for CSF-based Mali GPUs
On 2/5/24 10:03, Boris Brezillon wrote: +Danilo for the panthor gpuvm-needs update. On Sun, 4 Feb 2024 09:14:44 +0800 (CST) "Andy Yan" wrote: Hi Boris: I saw this warning sometimes(Run on a armbain based bookworm),not sure is a know issue or something else。 [15368.293031] systemd-journald[715]: Received client request to relinquish /var/log/journal/1bc4a340506142af9bd31a6a3d2170ba access. [37743.040737] [ cut here ] [37743.040764] panthor fb00.gpu: drm_WARN_ON(shmem->pages_use_count) [37743.040890] WARNING: CPU: 2 PID: 5702 at drivers/gpu/drm/drm_gem_shmem_helper.c:158 drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper] [37743.040929] Modules linked in: joydev rfkill sunrpc lz4hc lz4 zram binfmt_misc hantro_vpu crct10dif_ce v4l2_vp9 v4l2_h264 snd_soc_simple_amplifier v4l2_mem2mem videobuf2_dma_contig snd_soc_es8328_i2c videobuf2_memops rk_crypto2 snd_soc_es8328 videobuf2_v4l2 sm3_generic videodev crypto_engine sm3 rockchip_rng videobuf2_common nvmem_rockchip_otp snd_soc_rockchip_i2s_tdm snd_soc_hdmi_codec snd_soc_simple_card mc snd_soc_simple_card_utils snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_pcm snd_timer snd soundcore dm_mod ip_tables x_tables autofs4 dw_hdmi_qp_i2s_audio dw_hdmi_qp_cec rk808_regulator rockchipdrm dw_mipi_dsi dw_hdmi_qp dw_hdmi analogix_dp drm_dma_helper fusb302 display_connector rk8xx_spi drm_display_helper phy_rockchip_snps_pcie3 phy_rockchip_samsung_hdptx_hdmi panthor tcpm rk8xx_core cec drm_gpuvm gpu_sched drm_kms_helper drm_shmem_helper drm_exec r8169 drm pwm_bl adc_keys [37743.041108] CPU: 2 PID: 5702 Comm: kworker/u16:8 Not tainted 6.8.0-rc1-edge-rockchip-rk3588 #2 [37743.041115] Hardware name: Rockchip RK3588 EVB1 V10 Board (DT) [37743.041120] Workqueue: panthor-cleanup panthor_vm_bind_job_cleanup_op_ctx_work [panthor] [37743.041151] pstate: 6049 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [37743.041157] pc : drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper] [37743.041169] lr : drm_gem_shmem_free+0x144/0x14c 
[drm_shmem_helper] [37743.041181] sp : 80008d37bcc0 [37743.041184] x29: 80008d37bcc0 x28: 800081d379c0 x27: 800081d37000 [37743.041196] x26: 00019909a280 x25: 00019909a2c0 x24: 0001017a4c05 [37743.041206] x23: dead0100 x22: dead0122 x21: 0001627ac1a0 [37743.041217] x20: x19: 0001627ac000 x18: [37743.041227] x17: 00040044 x16: 005000f2b5503510 x15: fff91b77 [37743.041238] x14: 0001 x13: 03c5 x12: ffea [37743.041248] x11: dfff x10: dfff x9 : 800081e0e818 [37743.041259] x8 : 0002ffe8 x7 : c000dfff x6 : 000affa8 [37743.041269] x5 : 1fff x4 : x3 : 8000819a6008 [37743.041279] x2 : x1 : x0 : 00018465e900 [37743.041290] Call trace: [37743.041293] drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper] [37743.041306] panthor_gem_free_object+0x24/0xa0 [panthor] [37743.041321] drm_gem_object_free+0x1c/0x30 [drm] [37743.041452] panthor_vm_bo_put+0xc4/0x12c [panthor] [37743.041475] panthor_vm_cleanup_op_ctx.constprop.0+0xb0/0x104 [panthor] [37743.041491] panthor_vm_bind_job_cleanup_op_ctx_work+0x28/0xd0 [panthor] Ok, I think I found the culprit: there's a race between the drm_gpuvm_bo_put() call in panthor_vm_bo_put() and the list iteration done by drm_gpuvm_prepare_objects(). Because we're not setting DRM_GPUVM_RESV_PROTECTED, the code goes through the 'lockless' iteration loop, and takes/release a vm_bo ref at each iteration. This means our 'were we the last vm_bo user?' test in panthor_vm_bo_put() might return false even if we were actually the last user, and when for_each_vm_bo_in_list() releases the ref it acquired, it not only leaks the pin reference, thus leaving GEM pages pinned (which explains this WARN_ON() splat), but it also calls drm_gpuvm_bo_destroy() in a path where we don't hold the GPUVA list lock, which is bad. IIUC, GPUVM generally behaves correctly. It's just that the return value of drm_gpuvm_bo_put() needs to be treated with care. 
Maybe we should extend c50a291d621a ("drm/gpuvm: Let drm_gpuvm_bo_put() report when the vm_bo object is destroyed") by a note explaining this unexpected case, or, if not required anymore, simply revert the patch. Long story short, I'll have to use DRM_GPUVM_RESV_PROTECTED, which is fine because I'm deferring vm_bo removal to a work where taking the VM resv lock is allowed. Since I was the one asking for this lockless iterator in the first place, I wonder if we should kill that and make DRM_GPUVM_RESV_PROTECTED the default (this would greatly simplify the code). AFAICT, The PowerVR driver shouldn't be impacted because it's using drm_gpuvm in synchronous mode only, and Xe already uses the resv-protected mode. That leaves Nouveau, but IIRC, it's also doing VM updates in the ioctl path. Danilo, any
Re: [PATCH 1/2] drm/nouveau: don't fini scheduler if not initialized
On 2/2/24 18:14, Timur Tabi wrote: On Fri, 2024-02-02 at 01:05 +0100, Danilo Krummrich wrote: nouveau_abi16_ioctl_channel_alloc() and nouveau_cli_init() simply call their corresponding *_fini() counterpart. This can lead to nouveau_sched_fini() being called without struct nouveau_sched ever being initialized in the first place. Thanks, I've confirmed that these patches do fix the problem Cool, gonna add your 'Tested-by' then. - Danilo
[PATCH 2/2] drm/nouveau: omit to create schedulers using the legacy uAPI
Omit creating scheduler instances when using the legacy uAPI. When using the legacy NOUVEAU_GEM_PUSHBUF ioctl no scheduler instance is required; hence, omit creating scheduler instances in nouveau_abi16_ioctl_channel_alloc(). Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/nouveau_abi16.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c b/drivers/gpu/drm/nouveau/nouveau_abi16.c index ca4b5ab3e59e..d1bb8151a1df 100644 --- a/drivers/gpu/drm/nouveau/nouveau_abi16.c +++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c @@ -339,10 +339,16 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS) if (ret) goto done; - ret = nouveau_sched_create(&chan->sched, drm, drm->sched_wq, - chan->chan->dma.ib_max); - if (ret) - goto done; + /* If we're not using the VM_BIND uAPI, we don't need a scheduler. +* +* The client lock is already acquired by nouveau_abi16_get(). +*/ + if (nouveau_cli_uvmm(cli)) { + ret = nouveau_sched_create(&chan->sched, drm, drm->sched_wq, + chan->chan->dma.ib_max); + if (ret) + goto done; + } init->channel = chan->chan->chid; -- 2.43.0
[PATCH 1/2] drm/nouveau: don't fini scheduler if not initialized
nouveau_abi16_ioctl_channel_alloc() and nouveau_cli_init() simply call their corresponding *_fini() counterpart. This can lead to nouveau_sched_fini() being called without struct nouveau_sched ever being initialized in the first place. Instead of embedding struct nouveau_sched into struct nouveau_cli and struct nouveau_chan_abi16, allocate struct nouveau_sched separately, such that we can check whether the corresponding pointer is NULL in the particular *_fini() functions. It makes sense to allocate struct nouveau_sched separately anyway, since in a subsequent commit we can also avoid allocating a struct nouveau_sched in nouveau_abi16_ioctl_channel_alloc() at all, if the VM_BIND uAPI has been disabled due to the legacy uAPI being used. Fixes: 5f03a507b29e ("drm/nouveau: implement 1:1 scheduler - entity relationship") Reported-by: Timur Tabi Closes: https://lore.kernel.org/nouveau/20240131213917.1545604-1-tt...@nvidia.com/ Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/nouveau_abi16.c | 10 --- drivers/gpu/drm/nouveau/nouveau_abi16.h | 2 +- drivers/gpu/drm/nouveau/nouveau_drm.c | 7 +++-- drivers/gpu/drm/nouveau/nouveau_drv.h | 2 +- drivers/gpu/drm/nouveau/nouveau_exec.c | 2 +- drivers/gpu/drm/nouveau/nouveau_sched.c | 38 +++-- drivers/gpu/drm/nouveau/nouveau_sched.h | 6 ++-- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 2 +- 8 files changed, 53 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c b/drivers/gpu/drm/nouveau/nouveau_abi16.c index a04156ca8390..ca4b5ab3e59e 100644 --- a/drivers/gpu/drm/nouveau/nouveau_abi16.c +++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c @@ -128,12 +128,14 @@ nouveau_abi16_chan_fini(struct nouveau_abi16 *abi16, struct nouveau_abi16_ntfy *ntfy, *temp; /* Cancel all jobs from the entity's queue.
*/ - drm_sched_entity_fini(&chan->sched.entity); + if (chan->sched) + drm_sched_entity_fini(&chan->sched->entity); if (chan->chan) nouveau_channel_idle(chan->chan); - nouveau_sched_fini(&chan->sched); + if (chan->sched) + nouveau_sched_destroy(&chan->sched); /* cleanup notifier state */ list_for_each_entry_safe(ntfy, temp, &chan->notifiers, head) { @@ -337,8 +339,8 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS) if (ret) goto done; - ret = nouveau_sched_init(&chan->sched, drm, drm->sched_wq, -chan->chan->dma.ib_max); + ret = nouveau_sched_create(&chan->sched, drm, drm->sched_wq, + chan->chan->dma.ib_max); if (ret) goto done; diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.h b/drivers/gpu/drm/nouveau/nouveau_abi16.h index 1f5e243c0c75..11c8c4a80079 100644 --- a/drivers/gpu/drm/nouveau/nouveau_abi16.h +++ b/drivers/gpu/drm/nouveau/nouveau_abi16.h @@ -26,7 +26,7 @@ struct nouveau_abi16_chan { struct nouveau_bo *ntfy; struct nouveau_vma *ntfy_vma; struct nvkm_mm heap; - struct nouveau_sched sched; + struct nouveau_sched *sched; }; struct nouveau_abi16 { diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index 6f6c31a9937b..a947e1d5f309 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -201,7 +201,8 @@ nouveau_cli_fini(struct nouveau_cli *cli) WARN_ON(!list_empty(&cli->worker)); usif_client_fini(cli); - nouveau_sched_fini(&cli->sched); + if (cli->sched) + nouveau_sched_destroy(&cli->sched); if (uvmm) nouveau_uvmm_fini(uvmm); nouveau_vmm_fini(&cli->svm); @@ -311,7 +312,7 @@ nouveau_cli_init(struct nouveau_drm *drm, const char *sname, cli->mem = [ret]; /* Don't pass in the (shared) sched_wq in order to let -* nouveau_sched_init() create a dedicated one for VM_BIND jobs. +* nouveau_sched_create() create a dedicated one for VM_BIND jobs.
* * This is required to ensure that for VM_BIND jobs free_job() work and * run_job() work can always run concurrently and hence, free_job() work @@ -320,7 +321,7 @@ nouveau_cli_init(struct nouveau_drm *drm, const char *sname, * locks which indirectly or directly are held for allocations * elsewhere. */ - ret = nouveau_sched_init(&cli->sched, drm, NULL, 1); + ret = nouveau_sched_create(&cli->sched, drm, NULL, 1); if (ret) goto done; diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h index 8a6d94c8b163..e239c6bf4afa 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drv.h +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h @@ -98,7 +98,7 @@ struct nouveau_cli { bool disabled; } uvmm; - struct nouveau_sched sched; +
Re: [PATCH v3 1/2] drm: enable (most) W=1 warnings by default across the subsystem
On 2/1/24 15:33, Jani Nikula wrote: At least the i915 and amd drivers enable a bunch more compiler warnings than the kernel defaults. Extend most of the W=1 warnings to the entire drm subsystem by default. Use the copy-pasted warnings from scripts/Makefile.extrawarn with s/KBUILD_CFLAGS/subdir-ccflags-y/ to make it easier to compare and keep up with them in the future. This is similar to the approach currently used in i915. Some of the -Wextra warnings do need to be disabled, just like in Makefile.extrawarn, but take care to not disable them for W=2 or W=3 builds, depending on the warning. There are too many -Wformat-truncation warnings to cleanly fix up front; leave that warning disabled for now. v3: - Drop -Wmissing-declarations (already enabled by default) - Drop -Wmissing-prototypes (already enabled by default) v2: - Drop -Wformat-truncation (too many warnings) - Drop -Wstringop-overflow (already enabled by default) Cc: David Airlie Cc: Daniel Vetter Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: Alex Deucher Cc: Christian König Cc: Pan, Xinhui Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Cc: Rob Clark Cc: Abhinav Kumar Cc: Dmitry Baryshkov Cc: Sean Paul Cc: Marijn Suijten Cc: Hamza Mahfooz Acked-by: Javier Martinez Canillas Acked-by: Thomas Zimmermann Acked-by: Sui Jingfeng Signed-off-by: Jani Nikula Acked-by: Danilo Krummrich --- drivers/gpu/drm/Makefile | 25 + 1 file changed, 25 insertions(+) diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 104b42df2e95..04722a5dfb78 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -5,6 +5,31 @@ CFLAGS-$(CONFIG_DRM_USE_DYNAMIC_DEBUG) += -DDYNAMIC_DEBUG_MODULE +# Unconditionally enable W=1 warnings locally +# --- begin copy-paste W=1 warnings from scripts/Makefile.extrawarn +subdir-ccflags-y += -Wextra -Wunused -Wno-unused-parameter +subdir-ccflags-y += $(call cc-option, -Wrestrict) +subdir-ccflags-y += -Wmissing-format-attribute +subdir-ccflags-y += 
-Wold-style-definition +subdir-ccflags-y += -Wmissing-include-dirs +subdir-ccflags-y += $(call cc-option, -Wunused-but-set-variable) +subdir-ccflags-y += $(call cc-option, -Wunused-const-variable) +subdir-ccflags-y += $(call cc-option, -Wpacked-not-aligned) +subdir-ccflags-y += $(call cc-option, -Wformat-overflow) +# FIXME: fix -Wformat-truncation warnings and uncomment +#subdir-ccflags-y += $(call cc-option, -Wformat-truncation) +subdir-ccflags-y += $(call cc-option, -Wstringop-truncation) +# The following turn off the warnings enabled by -Wextra +ifeq ($(findstring 2, $(KBUILD_EXTRA_WARN)),) +subdir-ccflags-y += -Wno-missing-field-initializers +subdir-ccflags-y += -Wno-type-limits +subdir-ccflags-y += -Wno-shift-negative-value +endif +ifeq ($(findstring 3, $(KBUILD_EXTRA_WARN)),) +subdir-ccflags-y += -Wno-sign-compare +endif +# --- end copy-paste + drm-y := \ drm_aperture.o \ drm_atomic.o \
Re: [PATCH 2/6] drm/nouveau/svm: remove unused but set variables
On 1/10/24 18:39, Jani Nikula wrote: Fix the W=1 warning -Wunused-but-set-variable. Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Cc: nouv...@lists.freedesktop.org Signed-off-by: Jani Nikula Reviewed-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/nouveau_svm.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index cc03e0c22ff3..4d1008915499 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -112,7 +112,7 @@ nouveau_svmm_bind(struct drm_device *dev, void *data, { struct nouveau_cli *cli = nouveau_cli(file_priv); struct drm_nouveau_svm_bind *args = data; - unsigned target, cmd, priority; + unsigned target, cmd; unsigned long addr, end; struct mm_struct *mm; @@ -136,9 +136,6 @@ nouveau_svmm_bind(struct drm_device *dev, void *data, return -EINVAL; } - priority = args->header >> NOUVEAU_SVM_BIND_PRIORITY_SHIFT; - priority &= NOUVEAU_SVM_BIND_PRIORITY_MASK; - /* FIXME support CPU target ie all target value < GPU_VRAM */ target = args->header >> NOUVEAU_SVM_BIND_TARGET_SHIFT; target &= NOUVEAU_SVM_BIND_TARGET_MASK; @@ -926,15 +923,14 @@ nouveau_pfns_map(struct nouveau_svmm *svmm, struct mm_struct *mm, unsigned long addr, u64 *pfns, unsigned long npages) { struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns); - int ret; args->p.addr = addr; args->p.size = npages << PAGE_SHIFT; mutex_lock(&svmm->mutex); - ret = nvif_object_ioctl(&svmm->vmm->vmm.object, args, - struct_size(args, p.phys, npages), NULL); + nvif_object_ioctl(&svmm->vmm->vmm.object, args, + struct_size(args, p.phys, npages), NULL); mutex_unlock(&svmm->mutex); }
Re: [PATCH 1/6] drm/nouveau/acr/ga102: remove unused but set variable
On 1/10/24 18:39, Jani Nikula wrote: Fix the W=1 warning -Wunused-but-set-variable. Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Cc: nouv...@lists.freedesktop.org Signed-off-by: Jani Nikula Reviewed-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c index f36a359d4531..bd104a030243 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c @@ -218,7 +218,7 @@ nvkm_acr_lsfw_load_sig_image_desc_v2(struct nvkm_subdev *subdev, const struct firmware *hsbl; const struct nvfw_ls_hsbl_bin_hdr *hdr; const struct nvfw_ls_hsbl_hdr *hshdr; - u32 loc, sig, cnt, *meta; + u32 sig, cnt, *meta; ret = nvkm_firmware_load_name(subdev, path, "hs_bl_sig", ver, &hsbl); if (ret) @@ -227,7 +227,6 @@ nvkm_acr_lsfw_load_sig_image_desc_v2(struct nvkm_subdev *subdev, hdr = nvfw_ls_hsbl_bin_hdr(subdev, hsbl->data); hshdr = nvfw_ls_hsbl_hdr(subdev, hsbl->data + hdr->header_offset); meta = (u32 *)(hsbl->data + hshdr->meta_data_offset); - loc = *(u32 *)(hsbl->data + hshdr->patch_loc); sig = *(u32 *)(hsbl->data + hshdr->patch_sig); cnt = *(u32 *)(hsbl->data + hshdr->num_sig);
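The class of issue these two W=1 patches remove is the same: a value is computed and stored but never consumed, which GCC reports as -Wunused-but-set-variable. A small illustration of the before/after shape (function names are made up for the example):

```c
#include <stdint.h>

/* Before: 'loc' is assigned but never read afterwards -- exactly what
 * -Wunused-but-set-variable flags under W=1. */
static uint32_t read_sig_with_dead_store(const uint32_t *data)
{
	uint32_t loc, sig;

	loc = data[0];	/* dead store */
	sig = data[1];
	(void)loc;	/* hand-silenced so this sketch compiles cleanly;
			 * the patches simply delete the variable instead */
	return sig;
}

/* After: only the value that is actually consumed is read. */
static uint32_t read_sig_fixed(const uint32_t *data)
{
	return data[1];
}
```

Deleting the variable, as the patches do, is preferable to a `(void)` cast because it also removes the misleading read of memory that nothing uses.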
Re: Making drm_gpuvm work across gpu devices
On 1/24/24 04:57, Zeng, Oak wrote: Thanks a lot Danilo. Maybe I wasn't clear enough. In the solution I proposed, each device still have separate vm/page tables. Each device still need to manage the mapping, page table flags etc. It is just in svm use case, all devices share one drm_gpuvm instance. As I understand it, drm_gpuvm's main function is the va range split and merging. I don't see why it doesn't work across gpu devices. I'm pretty sure it does work. You can indeed use GPUVM for tracking mappings using the split and merge feature only, ignoring all other features it provides. However, I don't think it's a good idea to have a single GPUVM instance to track the memory mappings of different devices with different page tables, different object life times, etc. But I read more about drm_gpuvm. Its split merge function takes a drm_gem_object parameter, see drm_gpuvm_sm_map_ops_create and drm_gpuvm_sm_map. Actually the whole drm_gpuvm is designed for BO-centric driver, for example, it has a drm_gpuvm_bo concept to keep track of the 1BO:Ngpuva mapping. The whole purpose of leveraging drm_gpuvm is to re-use the va split/merge functions for SVM. But in our SVM implementation, there is no buffer object at all. So I don't think our SVM codes can leverage drm_gpuvm. That's all optional features. As mentioned above, you can use GPUVM for tracking mappings using the split and merge feature only. The drm_gem_object parameter in drm_gpuvm_sm_map_ops_create() can simply be NULL. Afaik, Xe already does that for userptr stuff already. But again, I don't think it's a good idea to track memory mappings of multiple independent physical devices and driver instances in a single different place whether you use GPUVM or a custom implementation. - Danilo I will give up this approach, unless Matt or Brian can see a way. 
A few replies inline @Welty, Brian I had more thoughts inline to one of your original question -Original Message- From: Danilo Krummrich Sent: Tuesday, January 23, 2024 6:57 PM To: Zeng, Oak ; Christian König ; Dave Airlie ; Daniel Vetter ; Felix Kuehling Cc: Welty, Brian ; dri-devel@lists.freedesktop.org; intel- x...@lists.freedesktop.org; Bommu, Krishnaiah ; Ghimiray, Himal Prasad ; thomas.hellst...@linux.intel.com; Vishwanathapura, Niranjana ; Brost, Matthew ; Gupta, saurabhg Subject: Re: Making drm_gpuvm work across gpu devices Hi Oak, On 1/23/24 20:37, Zeng, Oak wrote: Thanks Christian. I have some comment inline below. Danilo, can you also take a look and give your feedback? Thanks. I agree with everything Christian already wrote. Except for the KFD parts, which I'm simply not familiar with, I had exactly the same thoughts after reading your initial mail. Please find some more comments below. -Original Message- From: Christian König Sent: Tuesday, January 23, 2024 6:13 AM To: Zeng, Oak ; Danilo Krummrich ; Dave Airlie ; Daniel Vetter Cc: Welty, Brian ; dri-devel@lists.freedesktop.org; intel- x...@lists.freedesktop.org; Bommu, Krishnaiah ; Ghimiray, Himal Prasad ; thomas.hellst...@linux.intel.com; Vishwanathapura, Niranjana ; Brost, Matthew Subject: Re: Making drm_gpuvm work across gpu devices Hi Oak, Am 23.01.24 um 04:21 schrieb Zeng, Oak: Hi Danilo and all, During the work of Intel's SVM code, we came up the idea of making drm_gpuvm to work across multiple gpu devices. See some discussion here: https://lore.kernel.org/dri- devel/PH7PR11MB70049E7E6A2F40BF6282ECC292742@PH7PR11MB7004.namprd 11.prod.outlook.com/ The reason we try to do this is, for a SVM (shared virtual memory across cpu program and all gpu program on all gpu devices) process, the address space has to be across all gpu devices. So if we make drm_gpuvm to work across devices, then our SVM code can leverage drm_gpuvm as well. 
At a first look, it seems feasible because drm_gpuvm doesn't really use the drm_device *drm pointer a lot. This param is used only for printing/warning. So I think maybe we can delete this drm field from drm_gpuvm. This way, on a multiple gpu device system, for one process, we can have only one drm_gpuvm instance, instead of multiple drm_gpuvm instances (one for each gpu device). What do you think? Well from the GPUVM side I don't think it would make much difference if we have the drm device or not. But the experience we had with the KFD I think I should mention that we should absolutely *not* deal with multiple devices at the same time in the UAPI or VM objects inside the driver. The background is that all the APIs inside the Linux kernel are build around the idea that they work with only one device at a time. This accounts for both low level APIs like the DMA API as well as pretty high level things like for example file system address space etc... Yes most API are per device based. One exception I know is actually the kfd SVM API. If you look at the svm_ioctl function, it is per-process based. Each
Re: Making drm_gpuvm work across gpu devices
Hi Oak, On 1/23/24 20:37, Zeng, Oak wrote: Thanks Christian. I have some comment inline below. Danilo, can you also take a look and give your feedback? Thanks. I agree with everything Christian already wrote. Except for the KFD parts, which I'm simply not familiar with, I had exactly the same thoughts after reading your initial mail. Please find some more comments below. -Original Message- From: Christian König Sent: Tuesday, January 23, 2024 6:13 AM To: Zeng, Oak ; Danilo Krummrich ; Dave Airlie ; Daniel Vetter Cc: Welty, Brian ; dri-devel@lists.freedesktop.org; intel- x...@lists.freedesktop.org; Bommu, Krishnaiah ; Ghimiray, Himal Prasad ; thomas.hellst...@linux.intel.com; Vishwanathapura, Niranjana ; Brost, Matthew Subject: Re: Making drm_gpuvm work across gpu devices Hi Oak, Am 23.01.24 um 04:21 schrieb Zeng, Oak: Hi Danilo and all, During the work of Intel's SVM code, we came up the idea of making drm_gpuvm to work across multiple gpu devices. See some discussion here: https://lore.kernel.org/dri- devel/PH7PR11MB70049E7E6A2F40BF6282ECC292742@PH7PR11MB7004.namprd 11.prod.outlook.com/ The reason we try to do this is, for a SVM (shared virtual memory across cpu program and all gpu program on all gpu devices) process, the address space has to be across all gpu devices. So if we make drm_gpuvm to work across devices, then our SVM code can leverage drm_gpuvm as well. At a first look, it seems feasible because drm_gpuvm doesn't really use the drm_device *drm pointer a lot. This param is used only for printing/warning. So I think maybe we can delete this drm field from drm_gpuvm. This way, on a multiple gpu device system, for one process, we can have only one drm_gpuvm instance, instead of multiple drm_gpuvm instances (one for each gpu device). What do you think? Well from the GPUVM side I don't think it would make much difference if we have the drm device or not. 
But the experience we had with the KFD I think I should mention that we should absolutely *not* deal with multiple devices at the same time in the UAPI or VM objects inside the driver. The background is that all the APIs inside the Linux kernel are build around the idea that they work with only one device at a time. This accounts for both low level APIs like the DMA API as well as pretty high level things like for example file system address space etc... Yes most API are per device based. One exception I know is actually the kfd SVM API. If you look at the svm_ioctl function, it is per-process based. Each kfd_process represent a process across N gpu devices. Cc Felix. Need to say, kfd SVM represent a shared virtual address space across CPU and all GPU devices on the system. This is by the definition of SVM (shared virtual memory). This is very different from our legacy gpu *device* driver which works for only one device (i.e., if you want one device to access another device's memory, you will have to use dma-buf export/import etc). We have the same design requirement of SVM. For anyone who want to implement the SVM concept, this is a hard requirement. Since now drm has the drm_gpuvm concept which strictly speaking is designed for one device, I want to see whether we can extend drm_gpuvm to make it work for both single device (as used in xe) and multipe devices (will be used in the SVM code). That is why I brought up this topic. So when you have multiple GPUs you either have an inseparable cluster of them which case you would also only have one drm_device. Or you have separated drm_device which also results in separate drm render nodes and separate virtual address spaces and also eventually separate IOMMU domains which gives you separate dma_addresses for the same page and so separate GPUVM page tables I am thinking we can still make each device has its separate drm_device/render node/iommu domains/gpu page table. Just as what we have today. 
I am not plan to change this picture. But the virtual address space will support two modes of operation: 1. one drm_gpuvm per device. This is when svm is not in the picture 2. all devices in the process share one single drm_gpuvm, when svm is in the picture. In xe driver design, we have to support a mixture use of legacy mode (such as gem_create and vm_bind) and svm (such as malloc'ed memory for gpu submission). So whenever SVM is in the picture, we want one single process address space across all devices. Drm_gpuvm doesn't need to be aware of those two operation modes. It is driver's responsibility to use different mode. For example, in mode #1, a driver's vm structure (such as xe_vm) can inherit from drm_gpuvm. In mode #2, a driver's svm structure (xe_svm in this series: https://lore.kernel.org/dri-devel/20240117221223.18540-1-oak.z...@intel.com/) can inherit from drm_gpuvm while each xe_vm (still a per-device based struct) will just have a pointer to the drm_gpuvm structure. This way when svm is in play, we build a 1 process:1
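The two modes of operation described above reduce to whether a device's VM structure embeds its own VA-range tracker or merely points at one shared by the whole process. A toy C sketch of that structural distinction (all names are hypothetical; this is not the drm_gpuvm API):

```c
#include <stdlib.h>

/* toy stand-in for a VA-range tracker */
struct gpuvm {
	unsigned long va_start, va_end;
};

/* Mode 1: one tracker per device -- the device VM embeds it,
 * as a driver struct inheriting from drm_gpuvm would. */
struct dev_vm_embedded {
	struct gpuvm vm;
};

/* Mode 2 (SVM): every device VM in the process points at one shared
 * tracker, while page tables and IOMMU state remain per device. */
struct dev_vm_shared {
	struct gpuvm *shared_vm;
	void *per_device_pagetable;
};

static struct gpuvm *process_svm_init(unsigned long start, unsigned long end)
{
	struct gpuvm *vm = malloc(sizeof(*vm));

	if (!vm)
		return NULL;
	vm->va_start = start;
	vm->va_end = end;
	return vm;
}
```

The sketch only shows the ownership shapes under discussion; the thread's open question is whether the shared-tracker shape is a good idea at all, given that the rest of the kernel's APIs are per-device.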
Re: [PATCH][next] drm/nouveau/fifo/gk104: remove redundant variable ret
On 1/16/24 13:31, Dan Carpenter wrote: On Tue, Jan 16, 2024 at 11:16:09AM +, Colin Ian King wrote: The variable ret is being assigned a value but it isn't being read afterwards. The assignment is redundant and so ret can be removed. Cleans up clang scan build warning: warning: Although the value stored to 'ret' is used in the enclosing expression, the value is never actually read from 'ret' [deadcode.DeadStores] Signed-off-by: Colin Ian King --- drivers/gpu/drm/nouveau/nvif/fifo.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvif/fifo.c b/drivers/gpu/drm/nouveau/nvif/fifo.c index a463289962b2..e96de14ce87e 100644 --- a/drivers/gpu/drm/nouveau/nvif/fifo.c +++ b/drivers/gpu/drm/nouveau/nvif/fifo.c @@ -73,9 +73,9 @@ u64 nvif_fifo_runlist(struct nvif_device *device, u64 engine) { u64 runm = 0; - int ret, i; + int i; - if ((ret = nvif_fifo_runlists(device))) + if (nvif_fifo_runlists(device)) return runm; Could we return a literal zero here? Otherwise, I'm surprised this doesn't trigger a static checker warning. Why do you think so? Conditionally, runm is used later on as well. I don't think the checker should complain about keeping the value single source. If you agree, want to offer your RB? - Danilo regards, dan carpenter
Re: [PATCH] drm/exec, drm/gpuvm: Prefer u32 over uint32_t
On 1/19/24 10:05, Thomas Hellström wrote: The relatively recently introduced drm/exec utility was using uint32_t in its interface, which was then also carried over to drm/gpuvm. Prefer u32 in new code and update drm/exec and drm/gpuvm accordingly. Cc: Christian König Cc: Danilo Krummrich Signed-off-by: Thomas Hellström Reviewed-by: Danilo Krummrich --- drivers/gpu/drm/drm_exec.c | 2 +- include/drm/drm_exec.h | 4 ++-- include/drm/drm_gpuvm.h| 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c index 5d2809de4517..20e59d88218d 100644 --- a/drivers/gpu/drm/drm_exec.c +++ b/drivers/gpu/drm/drm_exec.c @@ -72,7 +72,7 @@ static void drm_exec_unlock_all(struct drm_exec *exec) * * Initialize the object and make sure that we can track locked objects. */ -void drm_exec_init(struct drm_exec *exec, uint32_t flags) +void drm_exec_init(struct drm_exec *exec, u32 flags) { exec->flags = flags; exec->objects = kmalloc(PAGE_SIZE, GFP_KERNEL); diff --git a/include/drm/drm_exec.h b/include/drm/drm_exec.h index b5bf0b6da791..187c3ec44606 100644 --- a/include/drm/drm_exec.h +++ b/include/drm/drm_exec.h @@ -18,7 +18,7 @@ struct drm_exec { /** * @flags: Flags to control locking behavior */ - uint32_t flags; + u32 flags; /** * @ticket: WW ticket used for acquiring locks @@ -135,7 +135,7 @@ static inline bool drm_exec_is_contended(struct drm_exec *exec) return !!exec->contended; } -void drm_exec_init(struct drm_exec *exec, uint32_t flags); +void drm_exec_init(struct drm_exec *exec, u32 flags); void drm_exec_fini(struct drm_exec *exec); bool drm_exec_cleanup(struct drm_exec *exec); int drm_exec_lock_obj(struct drm_exec *exec, struct drm_gem_object *obj); diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h index 48311e6d664c..554046321d24 100644 --- a/include/drm/drm_gpuvm.h +++ b/include/drm/drm_gpuvm.h @@ -514,7 +514,7 @@ struct drm_gpuvm_exec { /** * @flags: the flags for the struct drm_exec */ - uint32_t 
flags; + u32 flags; /** * @vm: the &drm_gpuvm to lock its DMA reservations
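The change is purely stylistic: u32 is the kernel's preferred spelling for a 32-bit unsigned integer and is layout-compatible with userspace's uint32_t, so no ABI changes. A quick userspace check of that equivalence (the local typedef stands in for the kernel's one in <linux/types.h>, and the struct name is invented for the example):

```c
#include <stdint.h>

typedef uint32_t u32;	/* mirrors the kernel's typedef */

struct drm_exec_like {
	u32 flags;	/* the member the patch retypes from uint32_t */
};

/* compile-time proof that the retype cannot change layout */
_Static_assert(sizeof(u32) == sizeof(uint32_t), "u32 matches uint32_t");
_Static_assert(sizeof(u32) == 4, "u32 is 32 bits wide");
```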
Re: [PATCH linux-next v2] drm/nouveau/disp: switch to use kmemdup() helper
On 1/9/24 07:24, yang.gua...@zte.com.cn wrote: From: Chen Haonan Use kmemdup() helper instead of open-coding to simplify the code. Signed-off-by: Chen Haonan Reviewed-by: Yang Guang Applied to drm-misc-next, thanks! --- drivers/gpu/drm/nouveau/nvif/outp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvif/outp.c b/drivers/gpu/drm/nouveau/nvif/outp.c index 5d3190c05250..6daeb7f0b09b 100644 --- a/drivers/gpu/drm/nouveau/nvif/outp.c +++ b/drivers/gpu/drm/nouveau/nvif/outp.c @@ -452,13 +452,12 @@ nvif_outp_edid_get(struct nvif_outp *outp, u8 **pedid) if (ret) goto done; - *pedid = kmalloc(args->size, GFP_KERNEL); + *pedid = kmemdup(args->data, args->size, GFP_KERNEL); if (!*pedid) { ret = -ENOMEM; goto done; } - memcpy(*pedid, args->data, args->size); ret = args->size; done: kfree(args);
Re: [PATCH 1/3] drm/nouveau: include drm/drm_edid.h only where needed
On 1/9/24 10:59, Jani Nikula wrote: On Mon, 08 Jan 2024, Danilo Krummrich wrote: On 1/4/24 21:16, Jani Nikula wrote: Including drm_edid.h from nouveau_connector.h causes the rebuild of 15 files when drm_edid.h is modified, while there are only a few files that actually need to include drm_edid.h. Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Cc: nouv...@lists.freedesktop.org Signed-off-by: Jani Nikula Reviewed-by: Danilo Krummrich Are you going to pick this up via the nouveau tree, or shall I apply it to drm-misc-next? We don't maintain a separate tree, hence feel free to apply it to drm-misc-next. - Danilo BR, Jani. --- drivers/gpu/drm/nouveau/dispnv50/head.c | 1 + drivers/gpu/drm/nouveau/nouveau_connector.h | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c index 5f490fbf1877..83355dbc15ee 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/head.c +++ b/drivers/gpu/drm/nouveau/dispnv50/head.c @@ -32,6 +32,7 @@ #include #include +#include #include #include "nouveau_connector.h" diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.h b/drivers/gpu/drm/nouveau/nouveau_connector.h index a2df4918340c..0608cabed058 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.h +++ b/drivers/gpu/drm/nouveau/nouveau_connector.h @@ -35,7 +35,6 @@ #include #include -#include #include #include @@ -44,6 +43,7 @@ struct nvkm_i2c_port; struct dcb_output; +struct edid; #ifdef CONFIG_DRM_NOUVEAU_BACKLIGHT struct nouveau_backlight {
Re: [PATCH -next] drm/nouveau: uapi: fix kerneldoc warnings
On 12/25/23 07:51, Vegard Nossum wrote: As of commit b77fdd6a48e6 ("scripts/kernel-doc: restore warning for Excess struct/union"), we see the following warnings when running 'make htmldocs': ./include/uapi/drm/nouveau_drm.h:292: warning: Excess struct member 'DRM_NOUVEAU_VM_BIND_OP_MAP' description in 'drm_nouveau_vm_bind_op' ./include/uapi/drm/nouveau_drm.h:292: warning: Excess struct member 'DRM_NOUVEAU_VM_BIND_OP_UNMAP' description in 'drm_nouveau_vm_bind_op' ./include/uapi/drm/nouveau_drm.h:292: warning: Excess struct member 'DRM_NOUVEAU_VM_BIND_SPARSE' description in 'drm_nouveau_vm_bind_op' ./include/uapi/drm/nouveau_drm.h:336: warning: Excess struct member 'DRM_NOUVEAU_VM_BIND_RUN_ASYNC' description in 'drm_nouveau_vm_bind' The problem is that these values are #define constants, but had kerneldoc comments attached to them as if they were actual struct members. There are a number of ways we could fix this, but I chose to draw inspiration from include/uapi/drm/i915_drm.h, which pulls them into the corresponding kerneldoc comment for the struct member that they are intended to be used with. To keep the diff readable, there are a number of things I _didn't_ do in this patch, but which we should also consider: - This is pretty good documentation, but it ends up in gpu/driver-uapi, which is part of subsystem-apis/ when it really ought to display under userspace-api/ (the "Linux kernel user-space API guide" book of the documentation). I agree, it indeed looks like this would make sense, same goes for gpu/drm-uapi.rst. @Jani, Sima: Was this intentional? Or can we change it? - More generally, we might want a warning if include/uapi/ files are kerneldoc'd outside userspace-api/. - I'd consider it cleaner if the #defines appeared between the kerneldoc for the member and the member itself (which is something other DRM- related UAPI docs do). 
- The %IDENTIFIER kerneldoc syntax is intended for "constants", and is more appropriate in this context than ``IDENTIFIER`` or &IDENTIFIER. The DRM docs aren't very consistent on this. Cc: Randy Dunlap Cc: Jonathan Corbet Signed-off-by: Vegard Nossum Applied to drm-misc-next, thanks! --- include/uapi/drm/nouveau_drm.h | 56 -- 1 file changed, 27 insertions(+), 29 deletions(-) diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h index 0bade1592f34..c95ef8a4d94a 100644 --- a/include/uapi/drm/nouveau_drm.h +++ b/include/uapi/drm/nouveau_drm.h @@ -238,34 +238,32 @@ struct drm_nouveau_vm_init { struct drm_nouveau_vm_bind_op { /** * @op: the operation type +* +* Supported values: +* +* %DRM_NOUVEAU_VM_BIND_OP_MAP - Map a GEM object to the GPU's VA +* space. Optionally, the &DRM_NOUVEAU_VM_BIND_SPARSE flag can be +* passed to instruct the kernel to create sparse mappings for the +* given range. +* +* %DRM_NOUVEAU_VM_BIND_OP_UNMAP - Unmap an existing mapping in the +* GPU's VA space. If the region the mapping is located in is a +* sparse region, new sparse mappings are created where the unmapped +* (memory backed) mapping was mapped previously. To remove a sparse +* region the &DRM_NOUVEAU_VM_BIND_SPARSE must be set. */ __u32 op; -/** - * @DRM_NOUVEAU_VM_BIND_OP_MAP: - * - * Map a GEM object to the GPU's VA space. Optionally, the - * &DRM_NOUVEAU_VM_BIND_SPARSE flag can be passed to instruct the kernel to - * create sparse mappings for the given range. - */ #define DRM_NOUVEAU_VM_BIND_OP_MAP 0x0 -/** - * @DRM_NOUVEAU_VM_BIND_OP_UNMAP: - * - * Unmap an existing mapping in the GPU's VA space. If the region the mapping - * is located in is a sparse region, new sparse mappings are created where the - * unmapped (memory backed) mapping was mapped previously. To remove a sparse - * region the &DRM_NOUVEAU_VM_BIND_SPARSE must be set. 
- */ #define DRM_NOUVEAU_VM_BIND_OP_UNMAP 0x1 /** * @flags: the flags for a &drm_nouveau_vm_bind_op +* +* Supported values: +* +* %DRM_NOUVEAU_VM_BIND_SPARSE - Indicates that an allocated VA +* space region should be sparse. */ __u32 flags; -/** - * @DRM_NOUVEAU_VM_BIND_SPARSE: - * - * Indicates that an allocated VA space region should be sparse. - */ #define DRM_NOUVEAU_VM_BIND_SPARSE (1 << 8) /** * @handle: the handle of the DRM GEM object to map @@ -301,17 +299,17 @@ struct drm_nouveau_vm_bind { __u32 op_count; /** * @flags: the flags for a &drm_nouveau_vm_bind ioctl +* +* Supported values: +* +* %DRM_NOUVEAU_VM_BIND_RUN_ASYNC - Indicates that the given VM_BIND +* operation should be executed asynchronously by the kernel. +* +* If this flag is not supplied the kernel executes the associated +* operations synchronously and doesn't accept any
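The root cause of the warnings is that kernel-doc's @name syntax only attaches to real struct members, while these flag values are plain preprocessor constants. A compilable sketch of the resulting layout, using the constant values shown in the patch (the struct name is invented for the example):

```c
/* Flag values as they appear in the patch hunks above. */
#define DRM_NOUVEAU_VM_BIND_OP_MAP	0x0
#define DRM_NOUVEAU_VM_BIND_OP_UNMAP	0x1
#define DRM_NOUVEAU_VM_BIND_SPARSE	(1 << 8)

struct vm_bind_op_like {
	/**
	 * @op: the operation type
	 *
	 * The constants' descriptions live in this member comment, as the
	 * patch arranges it, because kernel-doc cannot attach @-style
	 * documentation to #define constants.
	 */
	unsigned int op;
};
```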
Re: [PATCH 1/3] drm/nouveau: include drm/drm_edid.h only where needed
On 1/4/24 21:16, Jani Nikula wrote: Including drm_edid.h from nouveau_connector.h causes the rebuild of 15 files when drm_edid.h is modified, while there are only a few files that actually need to include drm_edid.h. Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Cc: nouv...@lists.freedesktop.org Signed-off-by: Jani Nikula Reviewed-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/dispnv50/head.c | 1 + drivers/gpu/drm/nouveau/nouveau_connector.h | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c index 5f490fbf1877..83355dbc15ee 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/head.c +++ b/drivers/gpu/drm/nouveau/dispnv50/head.c @@ -32,6 +32,7 @@ #include #include +#include #include #include "nouveau_connector.h" diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.h b/drivers/gpu/drm/nouveau/nouveau_connector.h index a2df4918340c..0608cabed058 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.h +++ b/drivers/gpu/drm/nouveau/nouveau_connector.h @@ -35,7 +35,6 @@ #include #include -#include #include #include @@ -44,6 +43,7 @@ struct nvkm_i2c_port; struct dcb_output; +struct edid; #ifdef CONFIG_DRM_NOUVEAU_BACKLIGHT struct nouveau_backlight {
Re: [PATCH 1/4] drm/nouveau/disp: don't misuse kernel-doc comments
On 1/1/24 00:36, Randy Dunlap wrote: Change kernel-doc "/**" comments to common "/*" comments to prevent kernel-doc warnings: crtc.c:453: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Sets up registers for the given mode/adjusted_mode pair. crtc.c:453: warning: missing initial short description on line: * Sets up registers for the given mode/adjusted_mode pair. crtc.c:629: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Sets up registers for the given mode/adjusted_mode pair. crtc.c:629: warning: missing initial short description on line: * Sets up registers for the given mode/adjusted_mode pair. Signed-off-by: Randy Dunlap Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Cc: nouv...@lists.freedesktop.org Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Series applied to drm-misc-next, thanks! --- drivers/gpu/drm/nouveau/dispnv04/crtc.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff -- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c --- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c @@ -449,7 +449,7 @@ nv_crtc_mode_set_vga(struct drm_crtc *cr regp->Attribute[NV_CIO_AR_CSEL_INDEX] = 0x00; } -/** +/* * Sets up registers for the given mode/adjusted_mode pair. * * The clocks, CRTCs and outputs attached to this CRTC must be off. @@ -625,7 +625,7 @@ nv_crtc_swap_fbs(struct drm_crtc *crtc, return ret; } -/** +/* * Sets up registers for the given mode/adjusted_mode pair. * * The clocks, CRTCs and outputs attached to this CRTC must be off.
Re: [PATCH] drm/nouveau/bios/init: drop kernel-doc notation
On 12/16/23 21:11, Randy Dunlap wrote: The "/**" comments in this file are not kernel-doc comments. They are used on static functions which can have kernel-doc comments, but that is not the primary focus of kernel-doc comments. Since these comments are incomplete for kernel-doc notation, remove the kernel-doc "/**" markers and make them common comments. This prevents scripts/kernel-doc from issuing 68 warnings: init.c:584: warning: Function parameter or member 'init' not described in 'init_reserved' and 67 warnings like this one: init.c:611: warning: expecting prototype for INIT_DONE(). Prototype was for init_done() instead Signed-off-by: Randy Dunlap Cc: Karol Herbst Cc: Lyude Paul Cc: Danilo Krummrich Cc: dri-devel@lists.freedesktop.org Cc: nouv...@lists.freedesktop.org Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter Applied to drm-misc-next, thanks! --- drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c | 136 +++--- 1 file changed, 68 insertions(+), 68 deletions(-) diff -- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c @@ -575,7 +575,7 @@ init_tmds_reg(struct nvbios_init *init, * init opcode handlers */ -/** +/* * init_reserved - stub for various unknown/unused single-byte opcodes * */ @@ -602,7 +602,7 @@ init_reserved(struct nvbios_init *init) init->offset += length; } -/** +/* * INIT_DONE - opcode 0x71 * */ @@ -613,7 +613,7 @@ init_done(struct nvbios_init *init) init->offset = 0x; } -/** +/* * INIT_IO_RESTRICT_PROG - opcode 0x32 * */ @@ -650,7 +650,7 @@ init_io_restrict_prog(struct nvbios_init trace("}]\n"); } -/** +/* * INIT_REPEAT - opcode 0x33 * */ @@ -676,7 +676,7 @@ init_repeat(struct nvbios_init *init) init->repeat = repeat; } -/** +/* * INIT_IO_RESTRICT_PLL - opcode 0x34 * */ @@ -716,7 +716,7 @@ init_io_restrict_pll(struct nvbios_init trace("}]\n"); 
} -/** +/* * INIT_END_REPEAT - opcode 0x36 * */ @@ -732,7 +732,7 @@ init_end_repeat(struct nvbios_init *init } } -/** +/* * INIT_COPY - opcode 0x37 * */ @@ -759,7 +759,7 @@ init_copy(struct nvbios_init *init) init_wrvgai(init, port, index, data); } -/** +/* * INIT_NOT - opcode 0x38 * */ @@ -771,7 +771,7 @@ init_not(struct nvbios_init *init) init_exec_inv(init); } -/** +/* * INIT_IO_FLAG_CONDITION - opcode 0x39 * */ @@ -788,7 +788,7 @@ init_io_flag_condition(struct nvbios_ini init_exec_set(init, false); } -/** +/* * INIT_GENERIC_CONDITION - opcode 0x3a * */ @@ -840,7 +840,7 @@ init_generic_condition(struct nvbios_ini } } -/** +/* * INIT_IO_MASK_OR - opcode 0x3b * */ @@ -859,7 +859,7 @@ init_io_mask_or(struct nvbios_init *init init_wrvgai(init, 0x03d4, index, data &= ~(1 << or)); } -/** +/* * INIT_IO_OR - opcode 0x3c * */ @@ -878,7 +878,7 @@ init_io_or(struct nvbios_init *init) init_wrvgai(init, 0x03d4, index, data | (1 << or)); } -/** +/* * INIT_ANDN_REG - opcode 0x47 * */ @@ -895,7 +895,7 @@ init_andn_reg(struct nvbios_init *init) init_mask(init, reg, mask, 0); } -/** +/* * INIT_OR_REG - opcode 0x48 * */ @@ -912,7 +912,7 @@ init_or_reg(struct nvbios_init *init) init_mask(init, reg, 0, mask); } -/** +/* * INIT_INDEX_ADDRESS_LATCHED - opcode 0x49 * */ @@ -942,7 +942,7 @@ init_idx_addr_latched(struct nvbios_init } } -/** +/* * INIT_IO_RESTRICT_PLL2 - opcode 0x4a * */ @@ -977,7 +977,7 @@ init_io_restrict_pll2(struct nvbios_init trace("}]\n"); } -/** +/* * INIT_PLL2 - opcode 0x4b * */ @@ -994,7 +994,7 @@ init_pll2(struct nvbios_init *init) init_prog_pll(init, reg, freq); } -/** +/* * INIT_I2C_BYTE - opcode 0x4c * */ @@ -1025,7 +1025,7 @@ init_i2c_byte(struct nvbios_init *init) } } -/** +/* * INIT_ZM_I2C_BYTE - opcode 0x4d * */ @@ -1051,7 +1051,7 @@ init_zm_i2c_byte(struct nvbios_init *ini } } -/** +/* * INIT_ZM_I2C - opcode 0x4e * */ @@ -1085,7 +1085,7 @@ init_zm_i2c(struct nvbios_init *init) } } -/** +/* * INIT_TMDS - opcode 0x4f * */ @@ -,7 +,7 @@ init_tmds(struct 
nvbios_init *init) init_wr32(init, reg + 0, addr); } -/** +/* * INIT_ZM_TMDS_GROUP - opcode 0x50 * */ @@ -1138,7 +1138,7 @@ init_zm_tmds_group(struct nvbios_init *i } } -/** +/* * INIT_CR_INDEX_ADDRESS_LATCHED - opcode 0x51 * */ @@ -1168,7 +1168,7 @@ init_cr_idx_
Re: [PATCH] drm/nouveau/fifo: remove duplicated including
Hi Wang, there is another patch [1] to fix this, which already made it upstream. - Danilo [1] https://patchwork.freedesktop.org/patch/msgid/20231122004926.84933-1-yang@linux.alibaba.com On 12/15/23 11:02, Wang Jinchao wrote: rm second including of chid.h Signed-off-by: Wang Jinchao --- drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c index 87a62d4ff4bd..7d4716dcd512 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c @@ -24,7 +24,6 @@ #include "chan.h" #include "chid.h" #include "cgrp.h" -#include "chid.h" #include "runl.h" #include "priv.h"
Re: [PATCH linux-next] drm/nouveau/disp: switch to use kmemdup() helper
Hi Yang, On 12/14/23 13:03, yang.gua...@zte.com.cn wrote: From: Yang Guang Use kmemdup() helper instead of open-coding to simplify the code. Signed-off-by: Chen Haonan Please add your "Signed-off-by" tag to this patch. If the above is intended to indicate that Chen was involved in creating this patch (e.g. as co-author, reporter, etc.) please use the corresponding tag instead. See also [1]. Once this is fixed, I'll apply the patch. - Danilo [1] https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin --- drivers/gpu/drm/nouveau/nvif/outp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvif/outp.c b/drivers/gpu/drm/nouveau/nvif/outp.c index 5d3190c05250..6daeb7f0b09b 100644 --- a/drivers/gpu/drm/nouveau/nvif/outp.c +++ b/drivers/gpu/drm/nouveau/nvif/outp.c @@ -452,13 +452,12 @@ nvif_outp_edid_get(struct nvif_outp *outp, u8 **pedid) if (ret) goto done; - *pedid = kmalloc(args->size, GFP_KERNEL); + *pedid = kmemdup(args->data, args->size, GFP_KERNEL); if (!*pedid) { ret = -ENOMEM; goto done; } - memcpy(*pedid, args->data, args->size); ret = args->size; done: kfree(args);
Re: [PATCH drm-misc-next v2 0/2] PowerVR VM fixes
On Tue, Dec 05, 2023 at 04:35:00PM +0100, Maxime Ripard wrote: > Hi, > > On Wed, Nov 29, 2023 at 11:07:59PM +0100, Danilo Krummrich wrote: > > Some major GPUVM changes landed just before v8 of the PowerVR series. Since > > v8 > > went in rather quickly (review process was finished otherwise) I haven't > > had the > > chance to review the subsequent code changes. > > > > Hence, this series with a few fixes in this context. Plus a minor GPUVM > > patch to > > make the drm_gpuvm_prepare_* helpers useful for PowerVR. > > This doesn't apply on drm-misc-next anymore, could you rebase and > resend? I already applied the two patches to drm-misc-next. - Danilo > > Thanks! > Maxime
Re: [PATCH] drm/gpuvm: Let drm_gpuvm_bo_put() report when the vm_bo object is destroyed
On 12/4/23 16:14, Boris Brezillon wrote: Some users need to release resources attached to the vm_bo object when it's destroyed. In Panthor's case, we need to release the pin ref so BO pages can be returned to the system when all GPU mappings are gone. This could be done through a custom drm_gpuvm::vm_bo_free() hook, but this has all sort of locking implications that would force us to expose a drm_gem_shmem_unpin_locked() helper, not to mention the fact that having a ::vm_bo_free() implementation without a ::vm_bo_alloc() one seems odd. So let's keep things simple, and extend drm_gpuvm_bo_put() to report when the object is destroyed. Signed-off-by: Boris Brezillon Reviewed-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuvm.c | 8 ++-- include/drm/drm_gpuvm.h | 2 +- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c index 54f5e8851de5..ae13e2d63637 100644 --- a/drivers/gpu/drm/drm_gpuvm.c +++ b/drivers/gpu/drm/drm_gpuvm.c @@ -1502,14 +1502,18 @@ drm_gpuvm_bo_destroy(struct kref *kref) * hold the dma-resv or driver specific GEM gpuva lock. * * This function may only be called from non-atomic context. + * + * Returns: true if vm_bo was destroyed, false otherwise. */ -void +bool drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo) { might_sleep(); if (vm_bo) - kref_put(&vm_bo->kref, drm_gpuvm_bo_destroy); + return !!kref_put(&vm_bo->kref, drm_gpuvm_bo_destroy); + + return false; } EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put); diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h index f94fec9a8517..7cc41a7d86d5 100644 --- a/include/drm/drm_gpuvm.h +++ b/include/drm/drm_gpuvm.h @@ -738,7 +738,7 @@ drm_gpuvm_bo_get(struct drm_gpuvm_bo *vm_bo) return vm_bo; } -void drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo); +bool drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo); struct drm_gpuvm_bo * drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
Re: [PATCH] drm/nouveau: Removes unnecessary args check in nouveau_uvmm_sm_prepare
On 11/16/23 21:52, Yuran Pereira wrote: Checking `args` after calling `op_map_prepare` is unnecessary since if `op_map_prepare` was to be called with NULL args, it would lead to a NULL pointer dereference, thus never hitting that check. Hence this check can be removed, and a note added to remind users of this function to ensure that args != NULL when calling this function for a map operation as it was suggested by Danilo [1] [1] https://lore.kernel.org/lkml/6a1ebcef-bade-45a0-9bd9-c05f0226e...@redhat.com Suggested-by: Danilo Krummrich Signed-off-by: Yuran Pereira Applied to drm-misc-next. --- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c index 5cf892c50f43..c8c3f1b1b604 100644 --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c @@ -604,6 +604,10 @@ op_unmap_prepare(struct drm_gpuva_op_unmap *u) drm_gpuva_unmap(u); } +/* + * Note: @args should not be NULL when calling for + * a map operation. + */ static int nouveau_uvmm_sm_prepare(struct nouveau_uvmm *uvmm, struct nouveau_uvma_prealloc *new, @@ -624,7 +628,7 @@ nouveau_uvmm_sm_prepare(struct nouveau_uvmm *uvmm, if (ret) goto unwind; - if (args && vmm_get_range) { + if (vmm_get_range) { ret = nouveau_uvmm_vmm_get(uvmm, vmm_get_start, vmm_get_range); if (ret) {
Re: [PATCH] driver: gpu: Fix warning directly dereferencing a rcu pointer
Hi Abhinav, Thanks for sending this follow-up patch. On 11/26/23 15:57, Abhinav Singh wrote: Fix a sparse warning with this message "warning:dereference of noderef expression". In this context it means we are dereferencing a __rcu tagged pointer directly. We should not be directly dereferencing a rcu pointer. To get a normal (non __rcu tagged pointer) from a __rcu tagged pointer we are using the function unrcu_pointer(...). The non __rcu tagged pointer then can be dereferenced just like a normal pointer. Can you please add a brief explanation why unrcu_pointer() is fine here? I tested with qemu with this command qemu-system-x86_64 \ -m 2G \ -smp 2 \ -kernel bzImage \ -append "console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \ -drive file=bullseye.img,format=raw \ -net user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:10021-:22 \ -net nic,model=e1000 \ -enable-kvm \ -nographic \ -pidfile vm.pid \ 2>&1 | tee vm.log with lockdep enabled. How is that relevant for this patch? - Danilo Signed-off-by: Abhinav Singh --- drivers/gpu/drm/nouveau/nv10_fence.c | 2 +- drivers/gpu/drm/nouveau/nv84_fence.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nv10_fence.c b/drivers/gpu/drm/nouveau/nv10_fence.c index c6a0db5b9e21..845b64c079ed 100644 --- a/drivers/gpu/drm/nouveau/nv10_fence.c +++ b/drivers/gpu/drm/nouveau/nv10_fence.c @@ -32,7 +32,7 @@ int nv10_fence_emit(struct nouveau_fence *fence) { - struct nvif_push *push = fence->channel->chan.push; + struct nvif_push *push = unrcu_pointer(fence->channel)->chan.push; int ret = PUSH_WAIT(push, 2); if (ret == 0) { PUSH_MTHD(push, NV06E, SET_REFERENCE, fence->base.seqno); diff --git a/drivers/gpu/drm/nouveau/nv84_fence.c b/drivers/gpu/drm/nouveau/nv84_fence.c index 812b8c62eeba..d42e72e23dec 100644 --- a/drivers/gpu/drm/nouveau/nv84_fence.c +++ b/drivers/gpu/drm/nouveau/nv84_fence.c @@ -85,7 +85,7 @@ nv84_fence_chid(struct nouveau_channel *chan) static int nv84_fence_emit(struct 
nouveau_fence *fence) { - struct nouveau_channel *chan = fence->channel; + struct nouveau_channel *chan = unrcu_pointer(fence->channel); struct nv84_fence_chan *fctx = chan->fence; u64 addr = fctx->vma->addr + nv84_fence_chid(chan) * 16;
[PATCH drm-misc-next v2 2/2] drm/imagination: vm: make use of GPUVM's drm_exec helper
Make use of GPUVM's drm_exec helper functions preventing direct access to GPUVM internal data structures, such as the external object list. This is especially important to ensure following the locking rules around the GPUVM external object list. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/imagination/pvr_vm.c | 91 +++- 1 file changed, 36 insertions(+), 55 deletions(-) diff --git a/drivers/gpu/drm/imagination/pvr_vm.c b/drivers/gpu/drm/imagination/pvr_vm.c index e0d74d9a6190..c6ab1581d509 100644 --- a/drivers/gpu/drm/imagination/pvr_vm.c +++ b/drivers/gpu/drm/imagination/pvr_vm.c @@ -333,48 +333,6 @@ pvr_vm_bind_op_unmap_init(struct pvr_vm_bind_op *bind_op, return err; } -static int -pvr_vm_bind_op_lock_resvs(struct drm_exec *exec, struct pvr_vm_bind_op *bind_op) -{ - drm_exec_until_all_locked(exec) { - struct drm_gem_object *r_obj = _op->vm_ctx->dummy_gem; - struct drm_gpuvm *gpuvm = _op->vm_ctx->gpuvm_mgr; - struct pvr_gem_object *pvr_obj = bind_op->pvr_obj; - struct drm_gpuvm_bo *gpuvm_bo; - - /* Acquire lock on the vm_context's reserve object. */ - int err = drm_exec_lock_obj(exec, r_obj); - - drm_exec_retry_on_contention(exec); - if (err) - return err; - - /* Acquire lock on all BOs in the context. */ - list_for_each_entry(gpuvm_bo, >extobj.list, - list.entry.extobj) { - err = drm_exec_lock_obj(exec, gpuvm_bo->obj); - - drm_exec_retry_on_contention(exec); - if (err) - return err; - } - - /* Unmap operations don't have an object to lock. */ - if (!pvr_obj) - break; - - /* Acquire lock on the GEM being mapped. */ - err = drm_exec_lock_obj(exec, - gem_from_pvr_gem(bind_op->pvr_obj)); - - drm_exec_retry_on_contention(exec); - if (err) - return err; - } - - return 0; -} - /** * pvr_vm_gpuva_map() - Insert a mapping into a memory context. * @op: gpuva op containing the remap details. 
@@ -731,6 +689,20 @@ void pvr_destroy_vm_contexts_for_file(struct pvr_file *pvr_file) } } +static int +pvr_vm_lock_extra(struct drm_gpuvm_exec *vm_exec) +{ + struct pvr_vm_bind_op *bind_op = vm_exec->extra.priv; + struct pvr_gem_object *pvr_obj = bind_op->pvr_obj; + + /* Unmap operations don't have an object to lock. */ + if (!pvr_obj) + return 0; + + /* Acquire lock on the GEM being mapped. */ + return drm_exec_lock_obj(_exec->exec, gem_from_pvr_gem(pvr_obj)); +} + /** * pvr_vm_map() - Map a section of physical memory into a section of * device-virtual memory. @@ -758,7 +730,15 @@ pvr_vm_map(struct pvr_vm_context *vm_ctx, struct pvr_gem_object *pvr_obj, u64 pvr_obj_offset, u64 device_addr, u64 size) { struct pvr_vm_bind_op bind_op = {0}; - struct drm_exec exec; + struct drm_gpuvm_exec vm_exec = { + .vm = _ctx->gpuvm_mgr, + .flags = DRM_EXEC_INTERRUPTIBLE_WAIT | +DRM_EXEC_IGNORE_DUPLICATES, + .extra = { + .fn = pvr_vm_lock_extra, + .priv = _op, + }, + }; int err = pvr_vm_bind_op_map_init(_op, vm_ctx, pvr_obj, pvr_obj_offset, device_addr, @@ -767,18 +747,15 @@ pvr_vm_map(struct pvr_vm_context *vm_ctx, struct pvr_gem_object *pvr_obj, if (err) return err; - drm_exec_init(, - DRM_EXEC_INTERRUPTIBLE_WAIT | DRM_EXEC_IGNORE_DUPLICATES); - pvr_gem_object_get(pvr_obj); - err = pvr_vm_bind_op_lock_resvs(, _op); + err = drm_gpuvm_exec_lock(_exec); if (err) goto err_cleanup; err = pvr_vm_bind_op_exec(_op); - drm_exec_fini(); + drm_gpuvm_exec_unlock(_exec); err_cleanup: pvr_vm_bind_op_fini(_op); @@ -804,24 +781,28 @@ int pvr_vm_unmap(struct pvr_vm_context *vm_ctx, u64 device_addr, u64 size) { struct pvr_vm_bind_op bind_op = {0}; - struct drm_exec exec; + struct drm_gpuvm_exec vm_exec = { + .vm = _ctx->gpuvm_mgr, + .flags = DRM_EXEC_INTERRUPTIBLE_WAIT | +DRM_EXEC_IGNORE_DUPLICATES, + .extra = { + .fn = pvr_vm_lock_extra, + .priv = _op, + }, + }; int err = pvr_vm_bind_op_unmap_init(_op, vm_ctx, device_addr,
[PATCH drm-misc-next v2 1/2] drm/gpuvm: fall back to drm_exec_lock_obj()
Fall back to drm_exec_lock_obj() if num_fences is zero for the drm_gpuvm_prepare_* function family. Otherwise dma_resv_reserve_fences() would actually allocate slots even though num_fences is zero. Cc: Christian König Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuvm.c | 43 - include/drm/drm_gpuvm.h | 23 +++- 2 files changed, 41 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c index 54f5e8851de5..07a6676bc4f9 100644 --- a/drivers/gpu/drm/drm_gpuvm.c +++ b/drivers/gpu/drm/drm_gpuvm.c @@ -1085,6 +1085,37 @@ drm_gpuvm_put(struct drm_gpuvm *gpuvm) } EXPORT_SYMBOL_GPL(drm_gpuvm_put); +static int +exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj, +unsigned int num_fences) +{ + return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) : + drm_exec_lock_obj(exec, obj); +} + +/** + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv + * @gpuvm: the _gpuvm + * @exec: the _exec context + * @num_fences: the amount of _fences to reserve + * + * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object; if + * @num_fences is zero drm_exec_lock_obj() is called instead. + * + * Using this function directly, it is the drivers responsibility to call + * drm_exec_init() and drm_exec_fini() accordingly. + * + * Returns: 0 on success, negative error code on failure. 
+ */ +int +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, +struct drm_exec *exec, +unsigned int num_fences) +{ + return exec_prepare_obj(exec, gpuvm->r_obj, num_fences); +} +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_vm); + static int __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, struct drm_exec *exec, @@ -1095,7 +1126,7 @@ __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, int ret = 0; for_each_vm_bo_in_list(gpuvm, extobj, , vm_bo) { - ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); + ret = exec_prepare_obj(exec, vm_bo->obj, num_fences); if (ret) break; } @@ -1116,7 +1147,7 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm, drm_gpuvm_resv_assert_held(gpuvm); list_for_each_entry(vm_bo, >extobj.list, list.entry.extobj) { - ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); + ret = exec_prepare_obj(exec, vm_bo->obj, num_fences); if (ret) break; @@ -1134,7 +1165,8 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm, * @num_fences: the amount of _fences to reserve * * Calls drm_exec_prepare_obj() for all _gem_objects the given - * _gpuvm contains mappings of. + * _gpuvm contains mappings of; if @num_fences is zero drm_exec_lock_obj() + * is called instead. * * Using this function directly, it is the drivers responsibility to call * drm_exec_init() and drm_exec_fini() accordingly. @@ -1171,7 +1203,8 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects); * @num_fences: the amount of _fences to reserve * * Calls drm_exec_prepare_obj() for all _gem_objects mapped between @addr - * and @addr + @range. + * and @addr + @range; if @num_fences is zero drm_exec_lock_obj() is called + * instead. * * Returns: 0 on success, negative error code on failure. 
*/ @@ -1186,7 +1219,7 @@ drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec, drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) { struct drm_gem_object *obj = va->gem.obj; - ret = drm_exec_prepare_obj(exec, obj, num_fences); + ret = exec_prepare_obj(exec, obj, num_fences); if (ret) return ret; } diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h index f94fec9a8517..b3f82ec7fb17 100644 --- a/include/drm/drm_gpuvm.h +++ b/include/drm/drm_gpuvm.h @@ -544,26 +544,9 @@ struct drm_gpuvm_exec { } extra; }; -/** - * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv - * @gpuvm: the _gpuvm - * @exec: the _exec context - * @num_fences: the amount of _fences to reserve - * - * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object. - * - * Using this function directly, it is the drivers responsibility to call - * drm_exec_init() and drm_exec_fini() accordingly. - * - * Returns: 0 on success, negative error code on failure. - */ -static inline int -drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, -struct drm_exec *exec, -unsigned int num_fences) -{ - return drm_exec_prepare_obj(exec, gpuvm->r_obj, num_fences); -} +int drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, +struct drm_exec *exec, +unsigned int num_fences); int drm_gpuvm_prepare_objects
[PATCH drm-misc-next v2 0/2] PowerVR VM fixes
Hi, Some major GPUVM changes landed just before v8 of the PowerVR series. Since v8 went in rather quickly (review process was finished otherwise) I haven't had the chance to review the subsequent code changes. Hence, this series with a few fixes in this context. Plus a minor GPUVM patch to make the drm_gpuvm_prepare_* helpers useful for PowerVR. - Danilo Changes in V2 = - GPUVM: update function DOC comment to indicate the passing zero fences to drm_gpuvm_prepare_* functions results into drm_exec_lock_obj() calls rather than drm_exec_prepare_obj() calls. - pvr/vm: use drm_gpuvm_exec wrappers - drop 3 patches which were applied already Danilo Krummrich (2): drm/gpuvm: fall back to drm_exec_lock_obj() drm/imagination: vm: make use of GPUVM's drm_exec helper drivers/gpu/drm/drm_gpuvm.c | 43 +++-- drivers/gpu/drm/imagination/pvr_vm.c | 91 +++- include/drm/drm_gpuvm.h | 23 +-- 3 files changed, 77 insertions(+), 80 deletions(-) base-commit: 83dc1029dcf50b5b849b26679a1b3f860b85d79c -- 2.43.0
Re: [Nouveau] [PATCH][next] nouveau/gsp: replace zero-length array with flex-array member and use __counted_by
On 11/29/23 02:06, Gustavo A. R. Silva wrote: On 11/28/23 19:01, Danilo Krummrich wrote: On 11/16/23 20:55, Timur Tabi wrote: On Thu, 2023-11-16 at 20:45 +0100, Danilo Krummrich wrote: As I already mentioned for Timur's patch [2], I'd prefer to get a fix upstream (meaning [1] in this case). Of course, that's probably more up to Timur to tell if this will work out. Don't count on it. I see. Well, I think it's fine. Once we implement a decent abstraction we likely don't need those header files in the kernel anymore. @Gustavo, if you agree I will discard the indentation change when applying the patch to keep the diff as small as possible. No problem. Applied to drm-misc-fixes. Thanks -- Gustavo
Re: [PATCH] nouveau/gsp/r535: remove a stray unlock in r535_gsp_rpc_send()
On 11/27/23 13:56, Dan Carpenter wrote: This unlock doesn't belong here and it leads to a double unlock in the caller, r535_gsp_rpc_push(). Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM") Signed-off-by: Dan Carpenter Good catch - applied to drm-misc-fixes. - Danilo --- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c index dc44f5c7833f..818e5c73b7a6 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c @@ -365,10 +365,8 @@ r535_gsp_rpc_send(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc) } ret = r535_gsp_cmdq_push(gsp, rpc); - if (ret) { - mutex_unlock(&gsp->cmdq.mutex); + if (ret) return ERR_PTR(ret); - } if (wait) { msg = r535_gsp_msg_recv(gsp, fn, repc);
Re: [Nouveau] [PATCH][next] nouveau/gsp: replace zero-length array with flex-array member and use __counted_by
On 11/16/23 20:55, Timur Tabi wrote: On Thu, 2023-11-16 at 20:45 +0100, Danilo Krummrich wrote: As I already mentioned for Timur's patch [2], I'd prefer to get a fix upstream (meaning [1] in this case). Of course, that's probably more up to Timur to tell if this will work out. Don't count on it. I see. Well, I think it's fine. Once we implement a decent abstraction we likely don't need those header files in the kernel anymore. @Gustavo, if you agree I will discard the indentation change when applying the patch to keep the diff as small as possible. - Danilo Even if I did change [0] to [], I'm not going to be able to add the "__counted_by(numEntries);" because that's just not something that our build system uses. And even then, I would need to change all [0] to []. You're not going to be able to use RM's header files as-is anyway in the long term. If we changed the layout of PACKED_REGISTRY_TABLE, we're not going to create a PACKED_REGISTRY_TABLE2 and keep both around. We're just going to change PACKED_REGISTRY_TABLE and pretend the previous version never existed. You will then have to manually copy the new struct to your header files and and maintain two versions yourself.
Re: [EXTERNAL] [PATCH drm-misc-next 5/5] drm/imagination: vm: make use of GPUVM's drm_exec helper
On 11/28/23 11:47, Donald Robson wrote: Hi Danilo, Apologies - I guess I should have submitted a patch to handle zero fences in your locking functions with the final patch series. On Sat, 2023-11-25 at 00:36 +0100, Danilo Krummrich wrote: *** CAUTION: This email originates from a source not known to Imagination Technologies. Think before you click a link or open an attachment *** Make use of GPUVM's drm_exec helper functions preventing direct access to GPUVM internal data structures, such as the external object list. This is especially important to ensure following the locking rules around the GPUVM external object list. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/imagination/pvr_vm.c | 16 +--- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/imagination/pvr_vm.c b/drivers/gpu/drm/imagination/pvr_vm.c index e0d74d9a6190..3f7888f5cc53 100644 --- a/drivers/gpu/drm/imagination/pvr_vm.c +++ b/drivers/gpu/drm/imagination/pvr_vm.c @@ -337,27 +337,21 @@ static int pvr_vm_bind_op_lock_resvs(struct drm_exec *exec, struct pvr_vm_bind_op *bind_op) { drm_exec_until_all_locked(exec) { - struct drm_gem_object *r_obj = _op->vm_ctx->dummy_gem; struct drm_gpuvm *gpuvm = _op->vm_ctx->gpuvm_mgr; struct pvr_gem_object *pvr_obj = bind_op->pvr_obj; - struct drm_gpuvm_bo *gpuvm_bo; /* Acquire lock on the vm_context's reserve object. */ - int err = drm_exec_lock_obj(exec, r_obj); + int err = drm_gpuvm_prepare_vm(gpuvm, exec, 0); drm_exec_retry_on_contention(exec); if (err) return err; /* Acquire lock on all BOs in the context. 
*/ - list_for_each_entry(gpuvm_bo, >extobj.list, - list.entry.extobj) { - err = drm_exec_lock_obj(exec, gpuvm_bo->obj); - - drm_exec_retry_on_contention(exec); - if (err) - return err; - } + err = drm_gpuvm_prepare_objects(gpuvm, exec, 0); + drm_exec_retry_on_contention(exec); + if (err) + return err; Before I discovered the problem when not reserving fences, I was trying to use drm_gpuvm_exec_lock() with vm_exec->extra.fn() for the part below. Is there a reason not to do that now? No, that works - gonna change that. - Danilo Many thanks, Donald /* Unmap operations don't have an object to lock. */ if (!pvr_obj)
Re: [PATCH v5] Documentation/gpu: VM_BIND locking document
On 11/22/23 08:49, Thomas Hellström wrote: On 11/21/23 14:56, Boris Brezillon wrote: On Tue, 21 Nov 2023 11:40:46 +0100 Thomas Hellström wrote: Add the first version of the VM_BIND locking document which is intended to be part of the xe driver upstreaming agreement. The document describes and discuss the locking used during exec- functions, evicton and for userptr gpu-vmas. Intention is to be using the same nomenclature as the drm-vm-bind-async.rst. v2: - s/gvm/gpu_vm/g (Rodrigo Vivi) - Clarify the userptr seqlock with a pointer to mm/mmu_notifier.c (Rodrigo Vivi) - Adjust commit message accordingly. - Add SPDX license header. v3: - Large update to align with the drm_gpuvm manager locking - Add "Efficient userptr gpu_vma exec function iteration" section - Add "Locking at bind- and unbind time" section. v4: - Fix tabs vs space errors by untabifying (Rodrigo Vivi) - Minor style fixes and typos (Rodrigo Vivi) - Clarify situations where stale GPU mappings are occurring and how access through these mappings are blocked. (Rodrigo Vivi) - Insert into the toctree in implementation_guidelines.rst v5: - Add a section about recoverable page-faults. - Use local references to other documentation where possible (Bagas Sanjaya) - General documentation fixes and typos (Danilo Krummrich and Boris Brezillon) - Improve the documentation around locks that need to be grabbed from the dm-fence critical section (Boris Brezillon) - Add more references to the DRM GPUVM helpers (Danilo Krummrich and Boriz Brezillon) - Update the rfc/xe.rst document. Cc: Rodrigo Vivi Signed-off-by: Thomas Hellström Still have a few comments (see below), and in general, I find some sentences very long, which has the tendency of confusing me (always trying to figure out what was the main point, what the pronouns are referring to, etc). 
Anyway, I think it's better to have something imperfect than nothing at all, so here is my Reviewed-by: Boris Brezillon Feel free to add it even if you decide to ignore my comments. Thanks for reviewing, Boris! I'll make a final version incorporating much of the comments and suggestions, much appreciated. I still think, though, that in principle the referral between gpuvm and this document should be the other way around, or it should all sit in gpuvm. In any case this is an initial version and as more comments and suggestions land, we can certainly update. I think if we agree that GPUVM should be the common component that we recommend drivers to use, we should reference to GPUVM whenever possible. However, I'm not sure whether we'd want to dedicate *all* documentation to GPUVM. Since the topic is rather complex, I can see that it might be beneficial to do both, discuss it from a more abstract point of view and document the corresponding common component. Reviewed-by: Danilo Krummrich Thanks, Thomas
Re: [PATCH drm-misc-next 4/5] drm/gpuvm: fall back to drm_exec_lock_obj()
Hi, On 11/27/23 14:52, Christian König wrote: Am 25.11.23 um 00:40 schrieb Danilo Krummrich: Hi Christian, do you think it makes sense to handle this in drm_exec_prepare_obj() or even dma_resv_reserve_fences() even? IIRC for drm_exec the discussion has gone a bit back and forth between handling 0 and having a separate function which doesn't allocate fences. We ended up with the solution using separate calls since you either know that you don't need fences (because you for example only need to look something up) or you need fences and eventually calculate the number you need dynamically and if you then end up with 0 it's a bug. I don't have a strong opinion on that. Though, in GPUVM I'd probably still need some wrapper like +exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj, + unsigned int num_fences) +{ +return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) : +drm_exec_lock_obj(exec, obj); +} to prevent either duplicate code or rather unnecessary complicated abstractions. And I'm not sure it's super nice that drm_gpuvm_prepare_objects() allows zero fences, whereas drm_exec_prepare_obj() does not. So to sum it up the conclusion was that it's more defensive to complain about 0 fences to reserve (which reminds me that dma_resv_reserve_fences() still needs to get a warning for 0 fences as well). What's the logic you'd want to apply there? WARN() but still allocate at least 4 slots or WARN() and return doing nothing? - Danilo Regards, Christian. - Danilo On 11/25/23 00:36, Danilo Krummrich wrote: Fall back to drm_exec_lock_obj() if num_fences is zero for the drm_gpuvm_prepare_* function family. Otherwise dma_resv_reserve_fences() would actually allocate slots even though num_fences is zero. 
Cc: Christian König Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuvm.c | 36 +--- include/drm/drm_gpuvm.h | 23 +++ 2 files changed, 36 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c index 54f5e8851de5..d1d1c2379e44 100644 --- a/drivers/gpu/drm/drm_gpuvm.c +++ b/drivers/gpu/drm/drm_gpuvm.c @@ -1085,6 +1085,36 @@ drm_gpuvm_put(struct drm_gpuvm *gpuvm) } EXPORT_SYMBOL_GPL(drm_gpuvm_put); +static int +exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj, + unsigned int num_fences) +{ + return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) : + drm_exec_lock_obj(exec, obj); +} + +/** + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv + * @gpuvm: the _gpuvm + * @exec: the _exec context + * @num_fences: the amount of _fences to reserve + * + * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object. + * + * Using this function directly, it is the drivers responsibility to call + * drm_exec_init() and drm_exec_fini() accordingly. + * + * Returns: 0 on success, negative error code on failure. 
+ */ +int +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, + struct drm_exec *exec, + unsigned int num_fences) +{ + return exec_prepare_obj(exec, gpuvm->r_obj, num_fences); +} +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_vm); + static int __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, struct drm_exec *exec, @@ -1095,7 +1125,7 @@ __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, int ret = 0; for_each_vm_bo_in_list(gpuvm, extobj, , vm_bo) { - ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); + ret = exec_prepare_obj(exec, vm_bo->obj, num_fences); if (ret) break; } @@ -1116,7 +1146,7 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm, drm_gpuvm_resv_assert_held(gpuvm); list_for_each_entry(vm_bo, >extobj.list, list.entry.extobj) { - ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); + ret = exec_prepare_obj(exec, vm_bo->obj, num_fences); if (ret) break; @@ -1186,7 +1216,7 @@ drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec, drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) { struct drm_gem_object *obj = va->gem.obj; - ret = drm_exec_prepare_obj(exec, obj, num_fences); + ret = exec_prepare_obj(exec, obj, num_fences); if (ret) return ret; } diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h index f94fec9a8517..b3f82ec7fb17 100644 --- a/include/drm/drm_gpuvm.h +++ b/include/drm/drm_gpuvm.h @@ -544,26 +544,9 @@ struct drm_gpuvm_exec { } extra; }; -/** - * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv - * @gpuvm: the _gpuvm - * @exec: the _exec context - * @num_fences: the amount of _fences to reserve - * - * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object. - * - * Using this function directly, it is the drivers r
Re: [PATCH drm-misc-next 4/5] drm/gpuvm: fall back to drm_exec_lock_obj()
Hi Christian, do you think it makes sense to handle this in drm_exec_prepare_obj() or even dma_resv_reserve_fences() even? - Danilo On 11/25/23 00:36, Danilo Krummrich wrote: Fall back to drm_exec_lock_obj() if num_fences is zero for the drm_gpuvm_prepare_* function family. Otherwise dma_resv_reserve_fences() would actually allocate slots even though num_fences is zero. Cc: Christian König Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuvm.c | 36 +--- include/drm/drm_gpuvm.h | 23 +++ 2 files changed, 36 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c index 54f5e8851de5..d1d1c2379e44 100644 --- a/drivers/gpu/drm/drm_gpuvm.c +++ b/drivers/gpu/drm/drm_gpuvm.c @@ -1085,6 +1085,36 @@ drm_gpuvm_put(struct drm_gpuvm *gpuvm) } EXPORT_SYMBOL_GPL(drm_gpuvm_put); +static int +exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj, +unsigned int num_fences) +{ + return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) : + drm_exec_lock_obj(exec, obj); +} + +/** + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv + * @gpuvm: the _gpuvm + * @exec: the _exec context + * @num_fences: the amount of _fences to reserve + * + * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object. + * + * Using this function directly, it is the drivers responsibility to call + * drm_exec_init() and drm_exec_fini() accordingly. + * + * Returns: 0 on success, negative error code on failure. 
+ */ +int +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, +struct drm_exec *exec, +unsigned int num_fences) +{ + return exec_prepare_obj(exec, gpuvm->r_obj, num_fences); +} +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_vm); + static int __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, struct drm_exec *exec, @@ -1095,7 +1125,7 @@ __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, int ret = 0; for_each_vm_bo_in_list(gpuvm, extobj, , vm_bo) { - ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); + ret = exec_prepare_obj(exec, vm_bo->obj, num_fences); if (ret) break; } @@ -1116,7 +1146,7 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm, drm_gpuvm_resv_assert_held(gpuvm); list_for_each_entry(vm_bo, >extobj.list, list.entry.extobj) { - ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); + ret = exec_prepare_obj(exec, vm_bo->obj, num_fences); if (ret) break; @@ -1186,7 +1216,7 @@ drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec, drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) { struct drm_gem_object *obj = va->gem.obj; - ret = drm_exec_prepare_obj(exec, obj, num_fences); + ret = exec_prepare_obj(exec, obj, num_fences); if (ret) return ret; } diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h index f94fec9a8517..b3f82ec7fb17 100644 --- a/include/drm/drm_gpuvm.h +++ b/include/drm/drm_gpuvm.h @@ -544,26 +544,9 @@ struct drm_gpuvm_exec { } extra; }; -/** - * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv - * @gpuvm: the _gpuvm - * @exec: the _exec context - * @num_fences: the amount of _fences to reserve - * - * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object. - * - * Using this function directly, it is the drivers responsibility to call - * drm_exec_init() and drm_exec_fini() accordingly. - * - * Returns: 0 on success, negative error code on failure. 
- */ -static inline int -drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, -struct drm_exec *exec, -unsigned int num_fences) -{ - return drm_exec_prepare_obj(exec, gpuvm->r_obj, num_fences); -} +int drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, +struct drm_exec *exec, +unsigned int num_fences); int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
[PATCH drm-misc-next 5/5] drm/imagination: vm: make use of GPUVM's drm_exec helper
Make use of GPUVM's drm_exec helper functions to prevent direct access to GPUVM-internal data structures, such as the external object list. This is especially important to ensure the locking rules around the GPUVM external object list are followed. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/imagination/pvr_vm.c | 16 +--- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/imagination/pvr_vm.c b/drivers/gpu/drm/imagination/pvr_vm.c index e0d74d9a6190..3f7888f5cc53 100644 --- a/drivers/gpu/drm/imagination/pvr_vm.c +++ b/drivers/gpu/drm/imagination/pvr_vm.c @@ -337,27 +337,21 @@ static int pvr_vm_bind_op_lock_resvs(struct drm_exec *exec, struct pvr_vm_bind_op *bind_op) { drm_exec_until_all_locked(exec) { - struct drm_gem_object *r_obj = _op->vm_ctx->dummy_gem; struct drm_gpuvm *gpuvm = _op->vm_ctx->gpuvm_mgr; struct pvr_gem_object *pvr_obj = bind_op->pvr_obj; - struct drm_gpuvm_bo *gpuvm_bo; /* Acquire lock on the vm_context's reserve object. */ - int err = drm_exec_lock_obj(exec, r_obj); + int err = drm_gpuvm_prepare_vm(gpuvm, exec, 0); drm_exec_retry_on_contention(exec); if (err) return err; /* Acquire lock on all BOs in the context. */ - list_for_each_entry(gpuvm_bo, >extobj.list, - list.entry.extobj) { - err = drm_exec_lock_obj(exec, gpuvm_bo->obj); - - drm_exec_retry_on_contention(exec); - if (err) - return err; - } + err = drm_gpuvm_prepare_objects(gpuvm, exec, 0); + drm_exec_retry_on_contention(exec); + if (err) + return err; /* Unmap operations don't have an object to lock. */ if (!pvr_obj) -- 2.42.0
[PATCH drm-misc-next 4/5] drm/gpuvm: fall back to drm_exec_lock_obj()
Fall back to drm_exec_lock_obj() if num_fences is zero for the drm_gpuvm_prepare_* function family. Otherwise dma_resv_reserve_fences() would actually allocate slots even though num_fences is zero. Cc: Christian König Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuvm.c | 36 +--- include/drm/drm_gpuvm.h | 23 +++ 2 files changed, 36 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c index 54f5e8851de5..d1d1c2379e44 100644 --- a/drivers/gpu/drm/drm_gpuvm.c +++ b/drivers/gpu/drm/drm_gpuvm.c @@ -1085,6 +1085,36 @@ drm_gpuvm_put(struct drm_gpuvm *gpuvm) } EXPORT_SYMBOL_GPL(drm_gpuvm_put); +static int +exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj, +unsigned int num_fences) +{ + return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) : + drm_exec_lock_obj(exec, obj); +} + +/** + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv + * @gpuvm: the _gpuvm + * @exec: the _exec context + * @num_fences: the amount of _fences to reserve + * + * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object. + * + * Using this function directly, it is the drivers responsibility to call + * drm_exec_init() and drm_exec_fini() accordingly. + * + * Returns: 0 on success, negative error code on failure. 
+ */ +int +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, +struct drm_exec *exec, +unsigned int num_fences) +{ + return exec_prepare_obj(exec, gpuvm->r_obj, num_fences); +} +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_vm); + static int __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, struct drm_exec *exec, @@ -1095,7 +1125,7 @@ __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, int ret = 0; for_each_vm_bo_in_list(gpuvm, extobj, , vm_bo) { - ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); + ret = exec_prepare_obj(exec, vm_bo->obj, num_fences); if (ret) break; } @@ -1116,7 +1146,7 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm, drm_gpuvm_resv_assert_held(gpuvm); list_for_each_entry(vm_bo, >extobj.list, list.entry.extobj) { - ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences); + ret = exec_prepare_obj(exec, vm_bo->obj, num_fences); if (ret) break; @@ -1186,7 +1216,7 @@ drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec, drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) { struct drm_gem_object *obj = va->gem.obj; - ret = drm_exec_prepare_obj(exec, obj, num_fences); + ret = exec_prepare_obj(exec, obj, num_fences); if (ret) return ret; } diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h index f94fec9a8517..b3f82ec7fb17 100644 --- a/include/drm/drm_gpuvm.h +++ b/include/drm/drm_gpuvm.h @@ -544,26 +544,9 @@ struct drm_gpuvm_exec { } extra; }; -/** - * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv - * @gpuvm: the _gpuvm - * @exec: the _exec context - * @num_fences: the amount of _fences to reserve - * - * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object. - * - * Using this function directly, it is the drivers responsibility to call - * drm_exec_init() and drm_exec_fini() accordingly. - * - * Returns: 0 on success, negative error code on failure. 
- */ -static inline int -drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, -struct drm_exec *exec, -unsigned int num_fences) -{ - return drm_exec_prepare_obj(exec, gpuvm->r_obj, num_fences); -} +int drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, +struct drm_exec *exec, +unsigned int num_fences); int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, struct drm_exec *exec, -- 2.42.0
[PATCH drm-misc-next 3/5] drm/imagination: vm: fix drm_gpuvm reference count
The driver-specific reference count indicates whether the VM should be torn down, whereas GPUVM's reference count indicates whether the VM structure can finally be freed. Hence, free the VM structure in pvr_gpuvm_free() and drop the last GPUVM reference after tearing down the VM. Generally, this prevents lifetime issues such as the VM being freed while drm_gpuvm_bo structures still hold references to the VM. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/imagination/pvr_vm.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/imagination/pvr_vm.c b/drivers/gpu/drm/imagination/pvr_vm.c index 1e89092c3dcc..e0d74d9a6190 100644 --- a/drivers/gpu/drm/imagination/pvr_vm.c +++ b/drivers/gpu/drm/imagination/pvr_vm.c @@ -64,6 +64,12 @@ struct pvr_vm_context { struct drm_gem_object dummy_gem; }; +static inline +struct pvr_vm_context *to_pvr_vm_context(struct drm_gpuvm *gpuvm) +{ + return container_of(gpuvm, struct pvr_vm_context, gpuvm_mgr); +} + struct pvr_vm_context *pvr_vm_context_get(struct pvr_vm_context *vm_ctx) { if (vm_ctx) @@ -535,7 +541,7 @@ pvr_device_addr_and_size_are_valid(struct pvr_vm_context *vm_ctx, void pvr_gpuvm_free(struct drm_gpuvm *gpuvm) { - + kfree(to_pvr_vm_context(gpuvm)); } static const struct drm_gpuvm_ops pvr_vm_gpuva_ops = { @@ -655,12 +661,11 @@ pvr_vm_context_release(struct kref *ref_count) WARN_ON(pvr_vm_unmap(vm_ctx, vm_ctx->gpuvm_mgr.mm_start, vm_ctx->gpuvm_mgr.mm_range)); - drm_gpuvm_put(_ctx->gpuvm_mgr); pvr_mmu_context_destroy(vm_ctx->mmu_ctx); drm_gem_private_object_fini(_ctx->dummy_gem); mutex_destroy(_ctx->lock); - kfree(vm_ctx); + drm_gpuvm_put(_ctx->gpuvm_mgr); } /** -- 2.42.0
[PATCH drm-misc-next 0/5] PowerVR VM fixes
Hi,

Some major GPUVM changes landed just before v8 of the PowerVR series. Since v8 went in rather quickly (the review process was otherwise already finished), I didn't have the chance to review the subsequent code changes. Hence, this series with a few fixes in this context, plus a minor GPUVM patch to make the drm_gpuvm_prepare_* helpers useful for PowerVR.

- Danilo

Danilo Krummrich (5): drm/imagination: vm: prevent duplicate drm_gpuvm_bo instances drm/imagination: vm: check for drm_gpuvm_range_valid() drm/imagination: vm: fix drm_gpuvm reference count drm/gpuvm: fall back to drm_exec_lock_obj() drm/imagination: vm: make use of GPUVM's drm_exec helper drivers/gpu/drm/drm_gpuvm.c | 36 -- drivers/gpu/drm/imagination/pvr_vm.c | 46 +++- drivers/gpu/drm/imagination/pvr_vm.h | 3 +- include/drm/drm_gpuvm.h | 23 ++ 4 files changed, 63 insertions(+), 45 deletions(-) base-commit: 46990918f35c1bf6e367cf8e0423e7344fec9fcb -- 2.42.0
[PATCH drm-misc-next 2/5] drm/imagination: vm: check for drm_gpuvm_range_valid()
Extend pvr_device_addr_and_size_are_valid() with the corresponding GPUVM sanity checks. This includes a previously missing overflow check for the base address and size of the requested mapping. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/imagination/pvr_vm.c | 9 ++--- drivers/gpu/drm/imagination/pvr_vm.h | 3 ++- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/imagination/pvr_vm.c b/drivers/gpu/drm/imagination/pvr_vm.c index 09d481c575b0..1e89092c3dcc 100644 --- a/drivers/gpu/drm/imagination/pvr_vm.c +++ b/drivers/gpu/drm/imagination/pvr_vm.c @@ -239,7 +239,7 @@ pvr_vm_bind_op_map_init(struct pvr_vm_bind_op *bind_op, return -EINVAL; } - if (!pvr_device_addr_and_size_are_valid(device_addr, size) || + if (!pvr_device_addr_and_size_are_valid(vm_ctx, device_addr, size) || offset & ~PAGE_MASK || size & ~PAGE_MASK || offset >= pvr_obj_size || offset_plus_size > pvr_obj_size) return -EINVAL; @@ -295,7 +295,7 @@ pvr_vm_bind_op_unmap_init(struct pvr_vm_bind_op *bind_op, { int err; - if (!pvr_device_addr_and_size_are_valid(device_addr, size)) + if (!pvr_device_addr_and_size_are_valid(vm_ctx, device_addr, size)) return -EINVAL; bind_op->type = PVR_VM_BIND_TYPE_UNMAP; @@ -505,6 +505,7 @@ pvr_device_addr_is_valid(u64 device_addr) /** * pvr_device_addr_and_size_are_valid() - Tests whether a device-virtual * address and associated size are both valid. + * @vm_ctx: Target VM context. * @device_addr: Virtual device address to test. * @size: Size of the range based at @device_addr to test. * * @@ -523,9 +524,11 @@ pvr_device_addr_is_valid(u64 device_addr) * * %false otherwise. 
*/ bool -pvr_device_addr_and_size_are_valid(u64 device_addr, u64 size) +pvr_device_addr_and_size_are_valid(struct pvr_vm_context *vm_ctx, + u64 device_addr, u64 size) { return pvr_device_addr_is_valid(device_addr) && + drm_gpuvm_range_valid(_ctx->gpuvm_mgr, device_addr, size) && size != 0 && (size & ~PVR_DEVICE_PAGE_MASK) == 0 && (device_addr + size <= PVR_PAGE_TABLE_ADDR_SPACE_SIZE); } diff --git a/drivers/gpu/drm/imagination/pvr_vm.h b/drivers/gpu/drm/imagination/pvr_vm.h index cf8b97553dc8..f2a6463f2b05 100644 --- a/drivers/gpu/drm/imagination/pvr_vm.h +++ b/drivers/gpu/drm/imagination/pvr_vm.h @@ -29,7 +29,8 @@ struct drm_exec; /* Functions defined in pvr_vm.c */ bool pvr_device_addr_is_valid(u64 device_addr); -bool pvr_device_addr_and_size_are_valid(u64 device_addr, u64 size); +bool pvr_device_addr_and_size_are_valid(struct pvr_vm_context *vm_ctx, + u64 device_addr, u64 size); struct pvr_vm_context *pvr_vm_create_context(struct pvr_device *pvr_dev, bool is_userspace_context); -- 2.42.0
[PATCH drm-misc-next 1/5] drm/imagination: vm: prevent duplicate drm_gpuvm_bo instances
Use drm_gpuvm_bo_obtain() instead of drm_gpuvm_bo_create(). The latter should only be used in conjunction with drm_gpuvm_bo_obtain_prealloc(). drm_gpuvm_bo_obtain() re-uses existing instances of a given VM and BO combination, whereas drm_gpuvm_bo_create() would always create a new instance of struct drm_gpuvm_bo and hence leave us with duplicates. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/imagination/pvr_vm.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/imagination/pvr_vm.c b/drivers/gpu/drm/imagination/pvr_vm.c index 3ad1366294b9..09d481c575b0 100644 --- a/drivers/gpu/drm/imagination/pvr_vm.c +++ b/drivers/gpu/drm/imagination/pvr_vm.c @@ -224,6 +224,7 @@ pvr_vm_bind_op_map_init(struct pvr_vm_bind_op *bind_op, struct pvr_gem_object *pvr_obj, u64 offset, u64 device_addr, u64 size) { + struct drm_gem_object *obj = gem_from_pvr_gem(pvr_obj); const bool is_user = vm_ctx == vm_ctx->pvr_dev->kernel_vm_ctx; const u64 pvr_obj_size = pvr_gem_object_size(pvr_obj); struct sg_table *sgt; @@ -245,10 +246,11 @@ pvr_vm_bind_op_map_init(struct pvr_vm_bind_op *bind_op, bind_op->type = PVR_VM_BIND_TYPE_MAP; - bind_op->gpuvm_bo = drm_gpuvm_bo_create(_ctx->gpuvm_mgr, - gem_from_pvr_gem(pvr_obj)); - if (!bind_op->gpuvm_bo) - return -ENOMEM; + dma_resv_lock(obj->resv, NULL); + bind_op->gpuvm_bo = drm_gpuvm_bo_obtain(_ctx->gpuvm_mgr, obj); + dma_resv_unlock(obj->resv); + if (IS_ERR(bind_op->gpuvm_bo)) + return PTR_ERR(bind_op->gpuvm_bo); bind_op->new_va = kzalloc(sizeof(*bind_op->new_va), GFP_KERNEL); bind_op->prev_va = kzalloc(sizeof(*bind_op->prev_va), GFP_KERNEL); -- 2.42.0
Re: [PATCH v3] driver: gpu: Fixing warning directly dereferencing a rcu pointer
On 11/13/23 20:13, Abhinav Singh wrote: This patch fixes a sparse warning with this message "warning: dereference of noderef expression". In this context, it means we are dereferencing a __rcu-tagged pointer directly. We should not be directly dereferencing an __rcu pointer. To get a normal (non-__rcu-tagged) pointer from a __rcu-tagged pointer, we use the function unrcu_pointer(...). The non-__rcu-tagged pointer can then be dereferenced just like a normal pointer. I tested with qemu with this command qemu-system-x86_64 \ -m 2G \ -smp 2 \ -kernel bzImage \ -append "console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \ -drive file=bullseye.img,format=raw \ -net user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:10021-:22 \ -net nic,model=e1000 \ -enable-kvm \ -nographic \ -pidfile vm.pid \ 2>&1 | tee vm.log with lockdep enabled. Signed-off-by: Abhinav Singh Applied, thanks! There are a few more such occurrences. [1][2] Plan to fix them as well? [1] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nv10_fence.c#L35 [2] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nv84_fence.c#L88 --- v1 -> v2 : Replaced the rcu_dereference(...) with unrcu_pointer(...) and also removed the rcu locking and unlocking function call. v2 -> v3 : Changed the description of the patch to match it with the actual implementation. drivers/gpu/drm/nouveau/nv04_fence.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nv04_fence.c b/drivers/gpu/drm/nouveau/nv04_fence.c index 5b71a5a5cd85..cdbc75e3d1f6 100644 --- a/drivers/gpu/drm/nouveau/nv04_fence.c +++ b/drivers/gpu/drm/nouveau/nv04_fence.c @@ -39,7 +39,7 @@ struct nv04_fence_priv { static int nv04_fence_emit(struct nouveau_fence *fence) { - struct nvif_push *push = fence->channel->chan.push; + struct nvif_push *push = unrcu_pointer(fence->channel)->chan.push; int ret = PUSH_WAIT(push, 2); if (ret == 0) { PUSH_NVSQ(push, NV_SW, 0x0150, fence->base.seqno);
Re: [PATCH 1/2] drm/scheduler: improve GPU scheduler documentation v2
Hi Christian, Thanks for sending an update of this patch! On Thu, Nov 16, 2023 at 03:15:46PM +0100, Christian König wrote: > Start to improve the scheduler document. Especially document the > lifetime of each of the objects as well as the restrictions around > DMA-fence handling and userspace compatibility. > > v2: Some improvements suggested by Danilo, add section about error > handling. > > Signed-off-by: Christian König > --- > Documentation/gpu/drm-mm.rst | 36 + > drivers/gpu/drm/scheduler/sched_main.c | 174 + > 2 files changed, 188 insertions(+), 22 deletions(-) > > diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst > index acc5901ac840..112463fa9f3a 100644 > --- a/Documentation/gpu/drm-mm.rst > +++ b/Documentation/gpu/drm-mm.rst > @@ -552,12 +552,48 @@ Overview > .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c > :doc: Overview > > +Job Object > +-- > + > +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c > + :doc: Job Object > + > +Entity Object > +- > + > +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c > + :doc: Entity Object > + > +Hardware Fence Object > +- > + > +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c > + :doc: Hardware Fence Object > + > +Scheduler Fence Object > +-- > + > +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c > + :doc: Scheduler Fence Object > + > +Scheduler and Run Queue Objects > +--- > + > +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c > + :doc: Scheduler and Run Queue Objects > + > Flow Control > > > .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c > :doc: Flow Control > > +Error and Timeout handling > +-- > + > +.. 
kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c > + :doc: Error and Timeout handling > + > Scheduler Function References > - > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index 044a8c4875ba..026123497b0e 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -24,28 +24,122 @@ > /** > * DOC: Overview > * > - * The GPU scheduler provides entities which allow userspace to push jobs > - * into software queues which are then scheduled on a hardware run queue. > - * The software queues have a priority among them. The scheduler selects the > entities > - * from the run queue using a FIFO. The scheduler provides dependency > handling > - * features among jobs. The driver is supposed to provide callback functions > for > - * backend operations to the scheduler like submitting a job to hardware run > queue, > - * returning the dependencies of a job etc. > - * > - * The organisation of the scheduler is the following: > - * > - * 1. Each hw run queue has one scheduler > - * 2. Each scheduler has multiple run queues with different priorities > - *(e.g., HIGH_HW,HIGH_SW, KERNEL, NORMAL) > - * 3. Each scheduler run queue has a queue of entities to schedule > - * 4. Entities themselves maintain a queue of jobs that will be scheduled on > - *the hardware. > - * > - * The jobs in a entity are always scheduled in the order that they were > pushed. > - * > - * Note that once a job was taken from the entities queue and pushed to the > - * hardware, i.e. the pending queue, the entity must not be referenced > anymore > - * through the jobs entity pointer. > + * The GPU scheduler implements some logic to decide which command submission > + * to push next to the hardware. Another major use case of the GPU scheduler > + * is to enforce correct driver behavior around those command submissions. 
> + * Because of this it's also used by drivers which don't need the actual > + * scheduling functionality. This reads a bit like we're already right in the middle of the documentation. I'd propose to start with something like "The DRM GPU scheduler serves as a common component intended to help drivers to manage command submissions." And then mention the different solutions the scheduler provides, e.g. ordering of command submissions, dma-fence handling in the context of command submissions, etc. Also, I think it would be good to give a rough overview of the topology of the scheduler's components. And since you already mention that the component is "also used by drivers which don't need the actual scheduling functionality", I'd also mention the special case of a single entity and a single run-queue per scheduler. > + * > + * All callbacks the driver needs to implement are restricted by DMA-fence > + * signaling rules to guarantee deadlock free forward progress. This > especially > + * means that for normal operation no memory can be allocated in a callback. > + * All memory which is needed for pushing the job to the hardware must be > + * allocated before