Re: Device / Driver and PCI Rust abstractions

2024-05-20 Thread Danilo Krummrich
https://lore.kernel.org/rust-for-linux/20240520172554.182094-1-d...@redhat.com/



[RFC PATCH 8/8] nova: add initial driver stub

2024-05-20 Thread Danilo Krummrich
Add the initial driver stub of Nova, a Rust-based GSP-only driver for
Nvidia GPUs. Nova, in the long term, is intended to serve as the
successor of Nouveau for GSP-firmware-based GPUs. [1]

As a stub driver, Nova's focus is to make use of the most basic device /
driver infrastructure required to build a DRM driver on the PCI bus, and
to serve as a demonstration example of and justification for this
infrastructure.

From there, the idea is to develop Nova continuously upstream, using
those increments to lift further Rust abstractions and infrastructure
upstream.

Link: https://lore.kernel.org/dri-devel/Zfsj0_tb-0-tNrJy@cassiopeiae/T/#u [1]
Signed-off-by: Danilo Krummrich 
---
 MAINTAINERS|  10 ++
 drivers/gpu/drm/Kconfig|   2 +
 drivers/gpu/drm/Makefile   |   1 +
 drivers/gpu/drm/nova/Kconfig   |  11 +++
 drivers/gpu/drm/nova/Makefile  |   3 +
 drivers/gpu/drm/nova/driver.rs | 110 +
 drivers/gpu/drm/nova/file.rs   |  71 ++
 drivers/gpu/drm/nova/gem.rs|  50 ++
 drivers/gpu/drm/nova/gpu.rs| 172 +
 drivers/gpu/drm/nova/nova.rs   |  20 
 include/uapi/drm/nova_drm.h| 101 +++
 rust/uapi/uapi_helper.h|   1 +
 12 files changed, 552 insertions(+)
 create mode 100644 drivers/gpu/drm/nova/Kconfig
 create mode 100644 drivers/gpu/drm/nova/Makefile
 create mode 100644 drivers/gpu/drm/nova/driver.rs
 create mode 100644 drivers/gpu/drm/nova/file.rs
 create mode 100644 drivers/gpu/drm/nova/gem.rs
 create mode 100644 drivers/gpu/drm/nova/gpu.rs
 create mode 100644 drivers/gpu/drm/nova/nova.rs
 create mode 100644 include/uapi/drm/nova_drm.h

diff --git a/MAINTAINERS b/MAINTAINERS
index aa3b947fb080..1ca0ea445e3f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6863,6 +6863,16 @@ T:   git https://gitlab.freedesktop.org/drm/nouveau.git
 F: drivers/gpu/drm/nouveau/
 F: include/uapi/drm/nouveau_drm.h
 
+DRM DRIVER (STUB) FOR NVIDIA GSP GPUS [RUST]
+M: Danilo Krummrich 
+L: dri-devel@lists.freedesktop.org
+L: nouv...@lists.freedesktop.org
+S: Supported
+C: irc://irc.oftc.net/nouveau
+T: git https://gitlab.freedesktop.org/drm/nova.git
+F: drivers/gpu/drm/nova/
+F: include/uapi/drm/nova_drm.h
+
 DRM DRIVER FOR OLIMEX LCD-OLINUXINO PANELS
 M: Stefan Mavrodiev 
 S: Maintained
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 5a0c476361c3..0bb0442a252e 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -274,6 +274,8 @@ source "drivers/gpu/drm/amd/amdgpu/Kconfig"
 
 source "drivers/gpu/drm/nouveau/Kconfig"
 
+source "drivers/gpu/drm/nova/Kconfig"
+
 source "drivers/gpu/drm/i915/Kconfig"
 
 source "drivers/gpu/drm/xe/Kconfig"
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 104b42df2e95..de6dc006cb7f 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -143,6 +143,7 @@ obj-$(CONFIG_DRM_VMWGFX)+= vmwgfx/
 obj-$(CONFIG_DRM_VGEM) += vgem/
 obj-$(CONFIG_DRM_VKMS) += vkms/
 obj-$(CONFIG_DRM_NOUVEAU) +=nouveau/
+obj-$(CONFIG_DRM_NOVA_STUB) += nova/
 obj-$(CONFIG_DRM_EXYNOS) +=exynos/
 obj-$(CONFIG_DRM_ROCKCHIP) +=rockchip/
 obj-$(CONFIG_DRM_GMA500) += gma500/
diff --git a/drivers/gpu/drm/nova/Kconfig b/drivers/gpu/drm/nova/Kconfig
new file mode 100644
index ..1ae2d1322a92
--- /dev/null
+++ b/drivers/gpu/drm/nova/Kconfig
@@ -0,0 +1,11 @@
+config DRM_NOVA_STUB
+   tristate "Nova GPU driver stub"
+   depends on DRM
+   depends on PCI
+   depends on RUST
+   default n
+   help
+ Choose this if you want to build the Nova stub driver for Nvidia
+ GSP-based GPUs.
+
+ If M is selected, the module will be called nova.
diff --git a/drivers/gpu/drm/nova/Makefile b/drivers/gpu/drm/nova/Makefile
new file mode 100644
index ..733ac5fb9f4f
--- /dev/null
+++ b/drivers/gpu/drm/nova/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_DRM_NOVA_STUB) += nova.o
diff --git a/drivers/gpu/drm/nova/driver.rs b/drivers/gpu/drm/nova/driver.rs
new file mode 100644
index ..47eaa2eddf16
--- /dev/null
+++ b/drivers/gpu/drm/nova/driver.rs
@@ -0,0 +1,110 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use alloc::boxed::Box;
+use core::pin::Pin;
+use kernel::{
+bindings, c_str, device, driver, drm,
+drm::{drv, ioctl},
+pci,
+pci::define_pci_id_table,
+prelude::*,
+sync::Arc,
+};
+
+use crate::{file::File, gpu::Gpu};
+
+pub(crate) struct NovaDriver;
+
+/// Convenience type alias for the DRM device type for this driver
+pub(crate) type NovaDevice = drm::device::Device<NovaDriver>;
+
+#[allow(dead_code)]
+pub(crate) struct NovaData {
+pub(crate) gpu: Arc<Gpu>,
+pub(crate) pdev: pci::Device,
+}
+
+type DeviceData = device::Data, NovaData>;
+
+const INFO: drm::drv::DriverInfo = drm::drv::DriverInfo {
+major: 0,
+minor: 0,

[RFC PATCH 7/8] rust: add firmware abstractions

2024-05-20 Thread Danilo Krummrich
Add an abstraction around the kernel's firmware API to request firmware
images. The abstraction provides functions to access the firmware
buffer and / or copy it to a new buffer allocated with a given allocator
backend.

The firmware is released once the abstraction is dropped.
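
For illustration, a minimal usage sketch against this API; the firmware
path, the `pdev` PCI device and the `load_gsp_firmware` helper are made
up, and `c_str!`/`pr_info!` are assumed from the kernel prelude:

```
fn load_gsp_firmware(pdev: &pci::Device) -> Result {
    // Request the image and wait for it; on failure the C error code is
    // translated into a kernel `Error`.
    let fw = Firmware::request(c_str!("nova/gsp.bin"), pdev.as_ref())?;

    // Borrow-checked access to the firmware buffer.
    pr_info!("firmware size: {} bytes\n", fw.size());
    let _image: &[u8] = fw.data();

    Ok(())
    // `fw` drops here, which calls release_firmware() on the C struct.
}
```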

Signed-off-by: Danilo Krummrich 
---
 rust/bindings/bindings_helper.h |  1 +
 rust/kernel/firmware.rs | 74 +
 rust/kernel/lib.rs  |  1 +
 3 files changed, 76 insertions(+)
 create mode 100644 rust/kernel/firmware.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index b245db8d5a87..e4ffc47da5ec 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include <linux/firmware.h>
 #include 
 #include 
 #include 
diff --git a/rust/kernel/firmware.rs b/rust/kernel/firmware.rs
new file mode 100644
index ..700504fb3c9c
--- /dev/null
+++ b/rust/kernel/firmware.rs
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Firmware abstraction
+//!
+//! C header: [`include/linux/firmware.h`](../../../../include/linux/firmware.h)
+
+use crate::{bindings, device::Device, error::Error, error::Result, str::CStr, types::Opaque};
+
+/// Abstraction around a C firmware struct.
+///
+/// This is a simple abstraction around the C firmware API. Just like with the C API, firmware can
+/// be requested. Once requested the abstraction provides direct access to the firmware buffer as
+/// `&[u8]`. Alternatively, the firmware can be copied to a new buffer using `Firmware::copy`. The
+/// firmware is released once [`Firmware`] is dropped.
+///
+/// # Examples
+///
+/// ```
+/// let fw = Firmware::request(c_str!("path/to/firmware.bin"), dev.as_ref())?;
+/// driver_load_firmware(fw.data());
+/// ```
+pub struct Firmware(Opaque<*const bindings::firmware>);
+
+impl Firmware {
+    /// Send a firmware request and wait for it. See also `bindings::request_firmware`.
+    pub fn request(name: &CStr, dev: &Device) -> Result<Firmware> {
+        let fw = Opaque::uninit();
+
+        let ret = unsafe { bindings::request_firmware(fw.get(), name.as_char_ptr(), dev.as_raw()) };
+        if ret != 0 {
+            return Err(Error::from_errno(ret));
+        }
+
+        Ok(Firmware(fw))
+    }
+
+    /// Send a request for an optional fw module. See also `bindings::firmware_request_nowarn`.
+    pub fn request_nowarn(name: &CStr, dev: &Device) -> Result<Firmware> {
+        let fw = Opaque::uninit();
+
+        let ret = unsafe {
+            bindings::firmware_request_nowarn(fw.get(), name.as_char_ptr(), dev.as_raw())
+        };
+        if ret != 0 {
+            return Err(Error::from_errno(ret));
+        }
+
+        Ok(Firmware(fw))
+    }
+
+    /// Returns the size of the requested firmware in bytes.
+    pub fn size(&self) -> usize {
+        unsafe { (*(*self.0.get())).size }
+    }
+
+    /// Returns the requested firmware as `&[u8]`.
+    pub fn data(&self) -> &[u8] {
+        unsafe { core::slice::from_raw_parts((*(*self.0.get())).data, self.size()) }
+    }
+}
+
+impl Drop for Firmware {
+    fn drop(&mut self) {
+        unsafe { bindings::release_firmware(*self.0.get()) };
+    }
+}
+
+// SAFETY: `Firmware` only holds a pointer to a C firmware struct, which is safe to be used from any
+// thread.
+unsafe impl Send for Firmware {}
+
+// SAFETY: `Firmware` only holds a pointer to a C firmware struct, references to which are safe to
+// be used from any thread.
+unsafe impl Sync for Firmware {}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 6415968ee3b8..ed97d131661a 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -35,6 +35,7 @@
 #[cfg(CONFIG_DRM = "y")]
 pub mod drm;
 pub mod error;
+pub mod firmware;
 pub mod init;
 pub mod ioctl;
 #[cfg(CONFIG_KUNIT)]
-- 
2.45.1



[RFC PATCH 6/8] rust: drm: gem: Add GEM object abstraction

2024-05-20 Thread Danilo Krummrich
From: Asahi Lina 

The DRM GEM subsystem is the DRM memory management subsystem used by
most modern drivers. Add a Rust abstraction to allow Rust DRM driver
implementations to use it.
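
As a hedged, driver-side sketch of how the traits below compose; the
type names and the `try_pin_init!` usage are illustrative reconstructions,
not part of this patch:

```
struct MyObject {
    id: u32,
}

impl drm::gem::BaseDriverObject<drm::gem::Object<MyObject>> for MyObject {
    // Called by the abstraction when a new GEM object is allocated.
    fn new(_dev: &drm::device::Device<MyDriver>, _size: usize) -> impl PinInit<Self, Error> {
        try_pin_init!(MyObject { id: 0 }? Error)
    }
}

impl drm::gem::DriverObject for MyObject {
    // `MyDriver` is assumed to implement `drm::drv::Driver`.
    type Driver = MyDriver;
}
```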

Signed-off-by: Asahi Lina 
Co-developed-by: Danilo Krummrich 
Signed-off-by: Danilo Krummrich 
---
 rust/bindings/bindings_helper.h |   1 +
 rust/helpers.c  |  23 ++
 rust/kernel/drm/drv.rs  |   4 +-
 rust/kernel/drm/gem/mod.rs  | 406 
 rust/kernel/drm/mod.rs  |   1 +
 5 files changed, 433 insertions(+), 2 deletions(-)
 create mode 100644 rust/kernel/drm/gem/mod.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index c591811ccb67..b245db8d5a87 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include <drm/drm_gem.h>
 #include 
 #include 
 #include 
diff --git a/rust/helpers.c b/rust/helpers.c
index dc2405772b1a..30e86bf00337 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -20,6 +20,7 @@
  * Sorted alphabetically.
  */
 
+#include <drm/drm_gem.h>
 #include 
 #include 
 #include 
@@ -302,6 +303,28 @@ u64 rust_helper_pci_resource_len(struct pci_dev *pdev, int barnr)
return pci_resource_len(pdev, barnr);
 }
 
+#ifdef CONFIG_DRM
+
+void rust_helper_drm_gem_object_get(struct drm_gem_object *obj)
+{
+   drm_gem_object_get(obj);
+}
+EXPORT_SYMBOL_GPL(rust_helper_drm_gem_object_get);
+
+void rust_helper_drm_gem_object_put(struct drm_gem_object *obj)
+{
+   drm_gem_object_put(obj);
+}
+EXPORT_SYMBOL_GPL(rust_helper_drm_gem_object_put);
+
+__u64 rust_helper_drm_vma_node_offset_addr(struct drm_vma_offset_node *node)
+{
+   return drm_vma_node_offset_addr(node);
+}
+EXPORT_SYMBOL_GPL(rust_helper_drm_vma_node_offset_addr);
+
+#endif
+
 /*
  * `bindgen` binds the C `size_t` type as the Rust `usize` type, so we can
  * use it in contexts where Rust expects a `usize` like slice (array) indices.
diff --git a/rust/kernel/drm/drv.rs b/rust/kernel/drm/drv.rs
index c5a63663ea21..063b420f57e5 100644
--- a/rust/kernel/drm/drv.rs
+++ b/rust/kernel/drm/drv.rs
@@ -113,7 +113,7 @@ pub struct AllocOps {
 }
 
 /// Trait for memory manager implementations. Implemented internally.
-pub trait AllocImpl: Sealed {
+pub trait AllocImpl: Sealed + drm::gem::IntoGEMObject {
 /// The C callback operations for this memory manager.
 const ALLOC_OPS: AllocOps;
 }
@@ -243,7 +243,7 @@ pub fn new(parent: &device::Device) -> Result<Self> {
 drm: drm::device::Device::new(parent, &vtable)?,
 registered: false,
 vtable,
-fops: Default::default(), // TODO: GEM abstraction
+fops: drm::gem::create_fops(),
 _pin: PhantomPinned,
 _p: PhantomData,
 })
diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs
new file mode 100644
index ..4cd85d5f1df8
--- /dev/null
+++ b/rust/kernel/drm/gem/mod.rs
@@ -0,0 +1,406 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+
+//! DRM GEM API
+//!
+//! C header: [`include/linux/drm/drm_gem.h`](../../../../include/linux/drm/drm_gem.h)
+
+use alloc::boxed::Box;
+
+use crate::{
+alloc::flags::*,
+bindings,
+drm::{device, drv, file},
+error::{to_result, Result},
+prelude::*,
+};
+use core::{marker::PhantomPinned, mem, ops::Deref, ops::DerefMut};
+
+/// GEM object functions, which must be implemented by drivers.
+pub trait BaseDriverObject<T: IntoGEMObject>: Sync + Send + Sized {
+/// Create a new driver data object for a GEM object of a given size.
+fn new(dev: &device::Device<T::Driver>, size: usize) -> impl PinInit<Self, Error>;
+
+/// Open a new handle to an existing object, associated with a File.
+fn open(
+_obj: &<<T as IntoGEMObject>::Driver as drv::Driver>::Object,
+_file: &file::File<<<T as IntoGEMObject>::Driver as drv::Driver>::File>,
+) -> Result {
+Ok(())
+}
+
+/// Close a handle to an existing object, associated with a File.
+fn close(
+_obj: &<<T as IntoGEMObject>::Driver as drv::Driver>::Object,
+_file: &file::File<<<T as IntoGEMObject>::Driver as drv::Driver>::File>,
+) {
+}
+}
+
+/// Trait that represents a GEM object subtype
+pub trait IntoGEMObject: Sized + crate::private::Sealed {
+/// Owning driver for this type
+type Driver: drv::Driver;
+
+/// Returns a reference to the raw `drm_gem_object` structure, which must be valid as long as
+/// this owning object is valid.
+fn gem_obj(&self) -> &bindings::drm_gem_object;
+
+/// Converts a pointer to a `drm_gem_object` into a pointer to this type.
+fn from_gem_obj(obj: *mut bindings::drm_gem_object) -> *mut Self;
+}
+
+/// Trait which must be implemented by drivers using base GEM objects.
+pub trait DriverObject: BaseDriverObject<Object<Self>> {
+/// Parent `Driver` for this object.
+type Driver: drv::Driver;
+}
+
+unsafe extern "C" fn free_callback(obj: *mut 
bindings::drm_gem_object) {
+// SAFETY: All of our objects are Object.
+let this = unsafe { cr

[RFC PATCH 5/8] rust: drm: file: Add File abstraction

2024-05-20 Thread Danilo Krummrich
From: Asahi Lina 

A DRM File is the DRM counterpart to a kernel file structure,
representing an open DRM file descriptor. Add a Rust abstraction to
allow drivers to implement their own File types that implement the
DriverFile trait.
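
A hypothetical implementation sketch, assuming a `MyDriver` type that
implements `drm::drv::Driver` and the `dev_info!` macro from the
prerequisite device series:

```
struct MyFile;

impl drm::file::DriverFile for MyFile {
    type Driver = MyDriver;

    // Called once per open file descriptor on the DRM device node.
    fn open(device: &drm::device::Device<Self::Driver>) -> Result<Pin<Box<Self>>> {
        dev_info!(device.as_ref(), "DRM file opened\n");
        Ok(Box::into_pin(Box::try_new(MyFile)?))
    }
}
```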

Signed-off-by: Asahi Lina 
Signed-off-by: Danilo Krummrich 
---
 rust/bindings/bindings_helper.h |   1 +
 rust/kernel/drm/drv.rs  |   7 +-
 rust/kernel/drm/file.rs | 113 
 rust/kernel/drm/mod.rs  |   1 +
 4 files changed, 120 insertions(+), 2 deletions(-)
 create mode 100644 rust/kernel/drm/file.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 831fbfe03a47..c591811ccb67 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include <drm/drm_file.h>
 #include 
 #include 
 #include 
diff --git a/rust/kernel/drm/drv.rs b/rust/kernel/drm/drv.rs
index 5dd8f3f8df7c..c5a63663ea21 100644
--- a/rust/kernel/drm/drv.rs
+++ b/rust/kernel/drm/drv.rs
@@ -131,6 +131,9 @@ pub trait Driver {
 /// Should be either `drm::gem::Object` or `drm::gem::shmem::Object`.
 type Object: AllocImpl;
 
+/// The type used to represent a DRM File (client)
+type File: drm::file::DriverFile;
+
 /// Driver metadata
 const INFO: DriverInfo;
 
@@ -200,8 +203,8 @@ macro_rules! drm_device_register {
 impl Registration {
 const VTABLE: bindings::drm_driver = drm_legacy_fields! {
 load: None,
-open: None, // TODO: File abstraction
-postclose: None, // TODO: File abstraction
+open: Some(drm::file::open_callback::<T::File>),
+postclose: Some(drm::file::postclose_callback::<T::File>),
 lastclose: None,
 unload: None,
 release: None,
diff --git a/rust/kernel/drm/file.rs b/rust/kernel/drm/file.rs
new file mode 100644
index ..a25b8c61f4ee
--- /dev/null
+++ b/rust/kernel/drm/file.rs
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+
+//! DRM File objects.
+//!
+//! C header: [`include/linux/drm/drm_file.h`](../../../../include/linux/drm/drm_file.h)
+
+use crate::{bindings, drm, error::Result};
+use alloc::boxed::Box;
+use core::marker::PhantomData;
+use core::pin::Pin;
+
+/// Trait that must be implemented by DRM drivers to represent a DRM File (a client instance).
+pub trait DriverFile {
+/// The parent `Driver` implementation for this `DriverFile`.
+type Driver: drm::drv::Driver;
+
+/// Open a new file (called when a client opens the DRM device).
+fn open(device: &drm::device::Device<Self::Driver>) -> Result<Pin<Box<Self>>>;
+}
+
+/// An open DRM File.
+///
+/// # Invariants
+/// `raw` is a valid pointer to a `drm_file` struct.
+#[repr(transparent)]
+pub struct File<T: DriverFile> {
+raw: *mut bindings::drm_file,
+_p: PhantomData<T>,
+}
+
+pub(super) unsafe extern "C" fn open_callback(
+raw_dev: *mut bindings::drm_device,
+raw_file: *mut bindings::drm_file,
+) -> core::ffi::c_int {
+let drm = unsafe { drm::device::Device::borrow(raw_dev) };
+// SAFETY: This reference won't escape this function
+let file = unsafe { &mut *raw_file };
+
+let inner = match T::open(drm) {
+Err(e) => {
+return e.to_errno();
+}
+Ok(i) => i,
+};
+
+// SAFETY: This pointer is treated as pinned, and the Drop guarantee is upheld below.
+file.driver_priv = Box::into_raw(unsafe { Pin::into_inner_unchecked(inner) }) as *mut _;
+
+0
+}
+
+pub(super) unsafe extern "C" fn postclose_callback(
+_dev: *mut bindings::drm_device,
+raw_file: *mut bindings::drm_file,
+) {
+// SAFETY: This reference won't escape this function
+let file = unsafe { &*raw_file };
+
+// Drop the DriverFile
+unsafe {
+let _ = Box::from_raw(file.driver_priv as *mut T);
+};
+}
+
+impl<T: DriverFile> File<T> {
+// Not intended to be called externally, except via declare_drm_ioctls!()
+#[doc(hidden)]
+pub unsafe fn from_raw(raw_file: *mut bindings::drm_file) -> File<T> {
+File {
+raw: raw_file,
+_p: PhantomData,
+}
+}
+
+#[allow(dead_code)]
+/// Return the raw pointer to the underlying `drm_file`.
+pub(super) fn raw(&self) -> *const bindings::drm_file {
+self.raw
+}
+
+/// Return an immutable reference to the raw `drm_file` structure.
+pub(super) fn file(&self) -> &bindings::drm_file {
+unsafe { &*self.raw }
+}
+
+/// Return a pinned reference to the driver file structure.
+pub fn inner(&self) -> Pin<&T> {
+unsafe { Pin::new_unchecked(&*(self.file().driver_priv as *const T)) }
+}
+}
+
+impl<T: DriverFile> crate::private::Sealed for File<T> {}
+
+/// Generic trait to allow users that don't care about driver specifics to accept any File.
+///
+/// # Safety
+/// Must only be implemented for File and return the pointer, following the normal invariants
+/// of that type.
+pub unsafe trait GenericFile: crate::private::Sealed {
+//

[RFC PATCH 4/8] rust: drm: implement `AsRef` for DRM device

2024-05-20 Thread Danilo Krummrich
Implement `AsRef<device::Device>` for `drm::device::Device` such that
`dev_*` print macros can be used conveniently.

Signed-off-by: Danilo Krummrich 
---
 rust/kernel/drm/device.rs | 8 
 1 file changed, 8 insertions(+)

diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs
index f72bab8dd42d..aef947893dab 100644
--- a/rust/kernel/drm/device.rs
+++ b/rust/kernel/drm/device.rs
@@ -79,6 +79,14 @@ unsafe fn dec_ref(obj: NonNull) {
 }
 }
 
+impl<T: drm::drv::Driver> AsRef<device::Device> for Device<T> {
+fn as_ref(&self) -> &device::Device {
+// SAFETY: `bindings::drm_device::dev` is valid as long as the DRM device itself is valid,
+// which is guaranteed by the type invariant.
+unsafe { device::Device::as_ref((*self.as_raw()).dev) }
+}
+}
+
 // SAFETY: `Device` only holds a pointer to a C device, which is safe to be 
used from any thread.
 unsafe impl Send for Device {}
 
-- 
2.45.1



[RFC PATCH 3/8] rust: drm: Add Device and Driver abstractions

2024-05-20 Thread Danilo Krummrich
From: Asahi Lina 

Add abstractions for DRM drivers and devices. These go together in one
commit since both are fairly tightly coupled types.

A few things have been stubbed out, to be implemented as further bits of
the DRM subsystem are introduced.
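
For orientation, a hedged sketch of the driver-side metadata these
abstractions consume; the exact `DriverInfo` field set is inferred from
the Nova stub above and may differ:

```
const INFO: drm::drv::DriverInfo = drm::drv::DriverInfo {
    major: 0,
    minor: 0,
    patchlevel: 0,
    name: c_str!("mydrv"),          // illustrative module name
    desc: c_str!("Example driver"), // illustrative description
    date: c_str!("20240520"),
};
```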

Signed-off-by: Asahi Lina 
Co-developed-by: Danilo Krummrich 
Signed-off-by: Danilo Krummrich 
---
 rust/bindings/bindings_helper.h |   2 +
 rust/kernel/drm/device.rs   |  87 +
 rust/kernel/drm/drv.rs  | 318 
 rust/kernel/drm/mod.rs  |   2 +
 4 files changed, 409 insertions(+)
 create mode 100644 rust/kernel/drm/device.rs
 create mode 100644 rust/kernel/drm/drv.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 14188034285a..831fbfe03a47 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -6,6 +6,8 @@
  * Sorted alphabetically.
  */
 
+#include <drm/drm_device.h>
+#include <drm/drm_drv.h>
 #include 
 #include 
 #include 
diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs
new file mode 100644
index ..f72bab8dd42d
--- /dev/null
+++ b/rust/kernel/drm/device.rs
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+
+//! DRM device.
+//!
+//! C header: [`include/linux/drm/drm_device.h`](../../../../include/linux/drm/drm_device.h)
+
+use crate::{
+bindings, device, drm,
+error::code::*,
+error::from_err_ptr,
+error::Result,
+types::{ARef, AlwaysRefCounted, ForeignOwnable, Opaque},
+};
+use alloc::boxed::Box;
+use core::{ffi::c_void, marker::PhantomData, pin::Pin, ptr::NonNull};
+
+/// A typed DRM device with a specific driver. The device is always reference-counted.
+#[repr(transparent)]
+pub struct Device<T: drm::drv::Driver>(Opaque<bindings::drm_device>, PhantomData<T>);
+
+impl<T: drm::drv::Driver> Device<T> {
+pub(crate) fn new(
+dev: &device::Device,
+vtable: &Pin<Box<bindings::drm_driver>>,
+) -> Result<ARef<Self>> {
+let raw_drm = unsafe { bindings::drm_dev_alloc(&**vtable, dev.as_raw()) };
+let raw_drm = NonNull::new(from_err_ptr(raw_drm)? as *mut _).ok_or(ENOMEM)?;
+
+// SAFETY: The reference count is one, and now we take ownership of that reference as a
+// drm::device::Device.
+Ok(unsafe { ARef::from_raw(raw_drm) })
+}
+
+pub(crate) fn as_raw(&self) -> *mut bindings::drm_device {
+self.0.get()
+}
+
+// Not intended to be called externally, except via declare_drm_ioctls!()
+#[doc(hidden)]
+pub unsafe fn borrow<'a>(raw: *const bindings::drm_device) -> &'a Self {
+unsafe { &*(raw as *const Self) }
+}
+
+pub(crate) fn raw_data(&self) -> *const c_void {
+// SAFETY: `self` is guaranteed to hold a valid `bindings::drm_device` pointer.
+unsafe { *self.as_raw() }.dev_private
+}
+
+// SAFETY: Must be called only once after device creation.
+pub(crate) unsafe fn set_raw_data(&self, ptr: *const c_void) {
+// SAFETY: Safe as by the safety precondition.
+unsafe { &mut *self.as_raw() }.dev_private = ptr as _;
+}
+
+/// Returns a borrowed reference to the user data associated with this Device.
+pub fn data(&self) -> Option<<T::Data as ForeignOwnable>::Borrowed<'_>> {
+let dev_private = self.raw_data();
+
+if dev_private.is_null() {
+None
+} else {
+// SAFETY: `dev_private` is NULL before the DRM device is registered; after the DRM
+// device has been registered dev_private is guaranteed to be valid.
+Some(unsafe { T::Data::borrow(dev_private) })
+}
+}
+}
+
+// SAFETY: DRM device objects are always reference counted and the get/put functions
+// satisfy the requirements.
+unsafe impl<T: drm::drv::Driver> AlwaysRefCounted for Device<T> {
+fn inc_ref(&self) {
+unsafe { bindings::drm_dev_get(self.as_raw() as *const _ as *mut _) };
+}
+
+unsafe fn dec_ref(obj: NonNull<Self>) {
+// SAFETY: The Device type has the same layout as drm_device, so we can just cast.
+unsafe { bindings::drm_dev_put(obj.as_ptr() as *mut _) };
+}
+}
+
+// SAFETY: `Device` only holds a pointer to a C device, which is safe to be used from any thread.
+unsafe impl<T: drm::drv::Driver> Send for Device<T> {}
+
+// SAFETY: `Device` only holds a pointer to a C device, references to which are safe to be used
+// from any thread.
+unsafe impl<T: drm::drv::Driver> Sync for Device<T> {}
diff --git a/rust/kernel/drm/drv.rs b/rust/kernel/drm/drv.rs
new file mode 100644
index ..5dd8f3f8df7c
--- /dev/null
+++ b/rust/kernel/drm/drv.rs
@@ -0,0 +1,318 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+
+//! DRM driver core.
+//!
+//! C header: [`include/linux/drm/drm_drv.h`](../../../../include/linux/drm/drm_drv.h)
+
+use crate::{
+alloc::flags::*,
+bindings, device, drm,
+error::code::*,
+error::{Error, Result},
+prelude::*,
+private::Sealed,
+str::CStr,
+types::{ARef, ForeignOwnable},
+ThisModule,
+};
+use core::{
+marker::{PhantomData, PhantomPinned},
+pin::Pin,
+};
+use macros::vtable;
+
+/// Driver use the GE

[RFC PATCH 2/8] rust: Add a Sealed trait

2024-05-20 Thread Danilo Krummrich
From: Asahi Lina 

Some traits exposed by the kernel crate may not be intended to be
implemented by downstream modules. Add a Sealed trait to prevent this,
following the sealed trait pattern.
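
The pattern in a nutshell (self-contained sketch; `KernelTrait` and
`InternalType` are illustrative names):

```
pub(crate) mod private {
    pub trait Sealed {}
}

// Downstream code can name and use `KernelTrait`, but cannot implement
// it, because it cannot reach `private::Sealed`.
pub trait KernelTrait: private::Sealed {
    fn do_something(&self);
}

struct InternalType;
impl private::Sealed for InternalType {} // only possible inside this crate
impl KernelTrait for InternalType {
    fn do_something(&self) {}
}
```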

Signed-off-by: Asahi Lina 
Signed-off-by: Danilo Krummrich 
---
 rust/kernel/lib.rs | 5 +
 1 file changed, 5 insertions(+)

diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index cb0dd3d76729..6415968ee3b8 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -65,6 +65,11 @@
 #[doc(hidden)]
 pub use build_error::build_error;
 
+pub(crate) mod private {
+#[allow(unreachable_pub)]
+pub trait Sealed {}
+}
+
/// Prefix to appear before log messages printed from within the `kernel` crate.
 const __LOG_PREFIX: &[u8] = b"rust_kernel\0";
 
-- 
2.45.1



[RFC PATCH 1/8] rust: drm: ioctl: Add DRM ioctl abstraction

2024-05-20 Thread Danilo Krummrich
From: Asahi Lina 

DRM drivers need to be able to declare which driver-specific ioctls they
support. Add an abstraction implementing the required types and a helper
macro to generate the ioctl definition inside the DRM driver.

Note that this macro is not usable until further bits of the abstraction
are in place (but it will not fail to compile on its own, if not called).
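
A hypothetical driver-side invocation of the macro; the ioctl name,
argument struct and handler below are made up for illustration:

```
kernel::declare_drm_ioctls! {
    (MYDRV_GETPARAM, drm_mydrv_getparam, ioctl::RENDER_ALLOW, MyFile::getparam),
}
```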

Signed-off-by: Asahi Lina 
Signed-off-by: Danilo Krummrich 
---
 rust/bindings/bindings_helper.h |   1 +
 rust/kernel/drm/ioctl.rs| 153 
 rust/kernel/drm/mod.rs  |   5 ++
 rust/kernel/lib.rs  |   2 +
 rust/uapi/uapi_helper.h |   1 +
 5 files changed, 162 insertions(+)
 create mode 100644 rust/kernel/drm/ioctl.rs
 create mode 100644 rust/kernel/drm/mod.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 32221de16e57..14188034285a 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -6,6 +6,7 @@
  * Sorted alphabetically.
  */
 
+#include <drm/drm_ioctl.h>
 #include 
 #include 
 #include 
diff --git a/rust/kernel/drm/ioctl.rs b/rust/kernel/drm/ioctl.rs
new file mode 100644
index ..79d720e9d18e
--- /dev/null
+++ b/rust/kernel/drm/ioctl.rs
@@ -0,0 +1,153 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+#![allow(non_snake_case)]
+
+//! DRM IOCTL definitions.
+//!
+//! C header: [`include/linux/drm/drm_ioctl.h`](../../../../include/linux/drm/drm_ioctl.h)
+
+use crate::ioctl;
+
+const BASE: u32 = uapi::DRM_IOCTL_BASE as u32;
+
+/// Construct a DRM ioctl number with no argument.
+#[inline(always)]
+pub const fn IO(nr: u32) -> u32 {
+ioctl::_IO(BASE, nr)
+}
+
+/// Construct a DRM ioctl number with a read-only argument.
+#[inline(always)]
+pub const fn IOR<T>(nr: u32) -> u32 {
+ioctl::_IOR::<T>(BASE, nr)
+}
+
+/// Construct a DRM ioctl number with a write-only argument.
+#[inline(always)]
+pub const fn IOW<T>(nr: u32) -> u32 {
+ioctl::_IOW::<T>(BASE, nr)
+}
+
+/// Construct a DRM ioctl number with a read-write argument.
+#[inline(always)]
+pub const fn IOWR<T>(nr: u32) -> u32 {
+ioctl::_IOWR::<T>(BASE, nr)
+}
+
+/// Descriptor type for DRM ioctls. Use the `declare_drm_ioctls!{}` macro to construct them.
+pub type DrmIoctlDescriptor = bindings::drm_ioctl_desc;
+
+/// This is for ioctl which are used for rendering, and require that the file descriptor is either
+/// for a render node, or if it's a legacy/primary node, then it must be authenticated.
+pub const AUTH: u32 = bindings::drm_ioctl_flags_DRM_AUTH;
+
+/// This must be set for any ioctl which can change the modeset or display state. Userspace must
+/// call the ioctl through a primary node, while it is the active master.
+///
+/// Note that read-only modeset ioctl can also be called by unauthenticated clients, or when a
+/// master is not the currently active one.
+pub const MASTER: u32 = bindings::drm_ioctl_flags_DRM_MASTER;
+
+/// Anything that could potentially wreak a master file descriptor needs to have this flag set.
+///
+/// Currently that's only for the SETMASTER and DROPMASTER ioctl, which e.g. logind can call to
+/// force a non-behaving master (display compositor) into compliance.
+///
+/// This is equivalent to callers with the SYSADMIN capability.
+pub const ROOT_ONLY: u32 = bindings::drm_ioctl_flags_DRM_ROOT_ONLY;
+
+/// This is used for all ioctl needed for rendering only, for drivers which support render nodes.
+/// This should be all new render drivers, and hence it should be always set for any ioctl with
+/// `AUTH` set. Note though that read-only query ioctl might have this set, but have not set
+/// DRM_AUTH because they do not require authentication.
+pub const RENDER_ALLOW: u32 = bindings::drm_ioctl_flags_DRM_RENDER_ALLOW;
+
+/// Internal structures used by the [`declare_drm_ioctls!{}`] macro. Do not use directly.
+#[doc(hidden)]
+pub mod internal {
+pub use bindings::drm_device;
+pub use bindings::drm_file;
+pub use bindings::drm_ioctl_desc;
+}
+
+/// Declare the DRM ioctls for a driver.
+///
+/// Each entry in the list should have the form:
+///
+/// `(ioctl_number, argument_type, flags, user_callback),`
+///
+/// `argument_type` is the type name within the `bindings` crate.
+/// `user_callback` should have the following prototype:
+///
+/// ```
+/// fn foo(device: &kernel::drm::device::Device<Self>,
+///        data: &mut bindings::argument_type,
+///        file: &kernel::drm::file::File<Self::File>,
+/// )
+/// ```
+/// where `Self` is the drm::drv::Driver implementation these ioctls are being declared within.
+///
+/// # Examples
+///
+/// ```
+/// kernel::declare_drm_ioctls! {
+/// (FOO_GET_PARAM, drm_foo_get_param, ioctl::RENDER_ALLOW, my_get_param_handler),
+/// }
+/// ```
+///
+#[macro_export]
+macro_rules! declare_drm_ioctls {
+( $(($cmd:ident, $struct:ident, $flags:expr, $func:expr)),* $(,)? ) => {
+const IOCTLS: &'static [$crate::drm::ioctl::DrmIoctlDescriptor] = {
+use $crate::uapi::*;

[RFC PATCH 0/8] [RFC] DRM Rust abstractions and Nova

2024-05-20 Thread Danilo Krummrich
This patch series implements some basic DRM Rust abstractions and a stub
implementation of the Nova GPU driver.

Nova is intended to be developed upstream, starting out with just a stub driver
to lift some initial required infrastructure upstream. A more detailed
explanation can be found in [1].

This patch series is based on the "Device / Driver and PCI Rust abstractions"
series [2].

The DRM bits can also be found in [3] and the Nova bits in [4].

[1] https://lore.kernel.org/dri-devel/Zfsj0_tb-0-tNrJy@cassiopeiae/T/#u
[2] Reply to this mail titled "Device / Driver and PCI Rust abstractions".
[3] https://gitlab.freedesktop.org/drm/misc/kernel/-/tree/topic/rust-drm
[4] https://gitlab.freedesktop.org/drm/nova/-/tree/nova-next

Asahi Lina (5):
  rust: drm: ioctl: Add DRM ioctl abstraction
  rust: Add a Sealed trait
  rust: drm: Add Device and Driver abstractions
  rust: drm: file: Add File abstraction
  rust: drm: gem: Add GEM object abstraction

Danilo Krummrich (3):
  rust: drm: implement `AsRef` for DRM device
  rust: add firmware abstractions
  nova: add initial driver stub

 MAINTAINERS |  10 +
 drivers/gpu/drm/Kconfig |   2 +
 drivers/gpu/drm/Makefile|   1 +
 drivers/gpu/drm/nova/Kconfig|  11 +
 drivers/gpu/drm/nova/Makefile   |   3 +
 drivers/gpu/drm/nova/driver.rs  | 110 +
 drivers/gpu/drm/nova/file.rs|  71 ++
 drivers/gpu/drm/nova/gem.rs |  50 
 drivers/gpu/drm/nova/gpu.rs | 172 ++
 drivers/gpu/drm/nova/nova.rs|  20 ++
 include/uapi/drm/nova_drm.h | 101 
 rust/bindings/bindings_helper.h |   6 +
 rust/helpers.c  |  23 ++
 rust/kernel/drm/device.rs   |  95 
 rust/kernel/drm/drv.rs  | 321 +
 rust/kernel/drm/file.rs | 113 +
 rust/kernel/drm/gem/mod.rs  | 406 
 rust/kernel/drm/ioctl.rs| 153 
 rust/kernel/drm/mod.rs  |   9 +
 rust/kernel/firmware.rs |  74 ++
 rust/kernel/lib.rs  |   8 +
 rust/uapi/uapi_helper.h |   2 +
 22 files changed, 1761 insertions(+)
 create mode 100644 drivers/gpu/drm/nova/Kconfig
 create mode 100644 drivers/gpu/drm/nova/Makefile
 create mode 100644 drivers/gpu/drm/nova/driver.rs
 create mode 100644 drivers/gpu/drm/nova/file.rs
 create mode 100644 drivers/gpu/drm/nova/gem.rs
 create mode 100644 drivers/gpu/drm/nova/gpu.rs
 create mode 100644 drivers/gpu/drm/nova/nova.rs
 create mode 100644 include/uapi/drm/nova_drm.h
 create mode 100644 rust/kernel/drm/device.rs
 create mode 100644 rust/kernel/drm/drv.rs
 create mode 100644 rust/kernel/drm/file.rs
 create mode 100644 rust/kernel/drm/gem/mod.rs
 create mode 100644 rust/kernel/drm/ioctl.rs
 create mode 100644 rust/kernel/drm/mod.rs
 create mode 100644 rust/kernel/firmware.rs


base-commit: 4a67fc8e2330cbd57b49d8ea703ffbf7292ef828
-- 
2.45.1



Re: [PATCH v4] drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations

2024-05-13 Thread Danilo Krummrich

Hi Mohamed,

Thank you for fixing this up!

On 5/9/24 22:43, Mohamed Ahmed wrote:

Allows PTE kind and tile mode on BO create with VM_BIND,
and adds a GETPARAM to indicate this change. This is needed to support


It's usually better to use imperative verb form for commit messages. No
need to send a new version though.


modifiers in NVK and ensure correctness when dealing with the nouveau
GL driver.

The userspace modifiers implementation this is for can be found here:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795

Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
Signed-off-by: Mohamed Ahmed 


Applied to drm-misc-next-fixes.

Generally, please make sure to use scripts/get_maintainer.pl before sending
patches.

- Danilo


---
  drivers/gpu/drm/nouveau/nouveau_abi16.c |  3 ++
  drivers/gpu/drm/nouveau/nouveau_bo.c| 44 +++--
  include/uapi/drm/nouveau_drm.h  |  7 
  3 files changed, 29 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c 
b/drivers/gpu/drm/nouveau/nouveau_abi16.c
index 80f74ee0f..47e53e17b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
@@ -272,6 +272,9 @@ nouveau_abi16_ioctl_getparam(ABI16_IOCTL_ARGS)
getparam->value = (u64)ttm_resource_manager_usage(vram_mgr);
break;
}
+   case NOUVEAU_GETPARAM_HAS_VMA_TILEMODE:
+   getparam->value = 1;
+   break;
default:
NV_PRINTK(dbg, cli, "unknown parameter %lld\n", 
getparam->param);
return -EINVAL;
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index db8cbf615..186add400 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -241,28 +241,28 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 *size, int 
*align, u32 domain,
}
  
  	nvbo->contig = !(tile_flags & NOUVEAU_GEM_TILE_NONCONTIG);

-   if (!nouveau_cli_uvmm(cli) || internal) {
-   /* for BO noVM allocs, don't assign kinds */
-   if (cli->device.info.family >= NV_DEVICE_INFO_V0_FERMI) {
-   nvbo->kind = (tile_flags & 0xff00) >> 8;
-   if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
-   kfree(nvbo);
-   return ERR_PTR(-EINVAL);
-   }
  
-			nvbo->comp = mmu->kind[nvbo->kind] != nvbo->kind;

-   } else if (cli->device.info.family >= NV_DEVICE_INFO_V0_TESLA) {
-   nvbo->kind = (tile_flags & 0x7f00) >> 8;
-   nvbo->comp = (tile_flags & 0x0003) >> 16;
-   if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
-   kfree(nvbo);
-   return ERR_PTR(-EINVAL);
-   }
-   } else {
-   nvbo->zeta = (tile_flags & 0x0007);
+   if (cli->device.info.family >= NV_DEVICE_INFO_V0_FERMI) {
+   nvbo->kind = (tile_flags & 0xff00) >> 8;
+   if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
+   kfree(nvbo);
+   return ERR_PTR(-EINVAL);
+   }
+
+   nvbo->comp = mmu->kind[nvbo->kind] != nvbo->kind;
+   } else if (cli->device.info.family >= NV_DEVICE_INFO_V0_TESLA) {
+   nvbo->kind = (tile_flags & 0x7f00) >> 8;
+   nvbo->comp = (tile_flags & 0x0003) >> 16;
+   if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) {
+   kfree(nvbo);
+   return ERR_PTR(-EINVAL);
}
-   nvbo->mode = tile_mode;
+   } else {
+   nvbo->zeta = (tile_flags & 0x0007);
+   }
+   nvbo->mode = tile_mode;
  
+	if (!nouveau_cli_uvmm(cli) || internal) {

/* Determine the desirable target GPU page size for the buffer. 
*/
for (i = 0; i < vmm->page_nr; i++) {
/* Because we cannot currently allow VMM maps to fail
@@ -304,12 +304,6 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 *size, int 
*align, u32 domain,
}
nvbo->page = vmm->page[pi].shift;
} else {
-   /* reject other tile flags when in VM mode. */
-   if (tile_mode)
-   return ERR_PTR(-EINVAL);
-   if (tile_flags & ~NOUVEAU_GEM_TILE_NONCONTIG)
-   return ERR_PTR(-EINVAL);
-
/* Determine the desirable target GPU page size for the buffer. 
*/
for (i = 0; i < vmm->page_nr; i++) {
/* Because we cannot currently allow VMM maps to fail
diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h
index cd84227f1..5402f77ee 100644
--- a/include/uapi/drm/nouveau_drm.h
+++ 

Re: [PATCH] drm/i915: fix missing linux/debugfs.h includes

2024-04-30 Thread Danilo Krummrich

On 4/30/24 16:53, Jani Nikula wrote:

On Tue, 30 Apr 2024, Danilo Krummrich  wrote:

After dropping linux/debugfs.h include from drm/drm_print.h the following
files in i915 miss the linux/debugfs.h include: i915_debugfs.c,
i915_debugfs_params.c and i915_gpu_error.c.

Add the include to fix the corresponding build errors.

Reported-by: kernel test robot 
Fixes: 33d5ae6cacf4 ("drm/print: drop include debugfs.h and include where 
needed")
Closes: 
https://lore.kernel.org/oe-kbuild-all/202404260726.nyguvl59-...@intel.com/
Signed-off-by: Danilo Krummrich 


Thanks, but it's already fixed by commit 7fa043eafdb7 ("drm/i915: fix
build with missing debugfs includes") in drm-next.


Even better, note that drm-misc-next is still broken though.

- Danilo



BR,
Jani.



---
  drivers/gpu/drm/i915/i915_debugfs.c| 1 +
  drivers/gpu/drm/i915/i915_debugfs_params.c | 1 +
  drivers/gpu/drm/i915/i915_gpu_error.c  | 1 +
  3 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 24c78873b3cf..b552e27cdcd5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -26,6 +26,7 @@
   *
   */
  
+#include <linux/debugfs.h>

  #include 
  #include 
  #include 
diff --git a/drivers/gpu/drm/i915/i915_debugfs_params.c 
b/drivers/gpu/drm/i915/i915_debugfs_params.c
index 8bca02025e09..295486b8ceb1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs_params.c
+++ b/drivers/gpu/drm/i915/i915_debugfs_params.c
@@ -3,6 +3,7 @@
   * Copyright © 2019 Intel Corporation
   */
  
+#include <linux/debugfs.h>

  #include 
  
  #include "i915_debugfs_params.h"

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2594eb10c559..625b3c024540 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -28,6 +28,7 @@
   */
  
  #include 

+#include <linux/debugfs.h>
  #include 
  #include 
  #include 

base-commit: 4a9a567ab101e659a4fafb7a691ff6b84531a10a






Re: [PATCH] nouveau: Add missing break statement

2024-04-30 Thread Danilo Krummrich

On 4/30/24 15:18, Chaitanya Kumar Borah wrote:

Add the missing break statement that causes the following build error

  CC [M]  drivers/gpu/drm/i915/display/intel_display_device.o
../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c: In function ‘build_registry’:
../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1266:3: error: label at end of compound statement
1266 |   default:
  |   ^~~
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/imu_v11_0.o
make[7]: *** [../scripts/Makefile.build:244: 
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.o] Error 1
make[7]: *** Waiting for unfinished jobs

Fixes: b58a0bc904ff ("nouveau: add command-line GSP-RM registry support")
Closes: 
https://lore.kernel.org/all/913052ca6c0988db1bab293cfae38529251b4594.ca...@nvidia.com/T/#m3c9acebac754f2e74a85b76c858c093bb1aacaf0
Signed-off-by: Chaitanya Kumar Borah 


Applied to drm-misc-next, thanks!


---
  drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 0b46db5c77b8..63619512e7f6 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1264,6 +1264,7 @@ static void build_registry(struct nvkm_gsp *gsp, 
PACKED_REGISTRY_TABLE *registry
str_offset += reg->vlen;
break;
default:
+   break;
}
  
  		i++;




[PATCH] drm/i915: fix missing linux/debugfs.h includes

2024-04-30 Thread Danilo Krummrich
After dropping linux/debugfs.h include from drm/drm_print.h the following
files in i915 miss the linux/debugfs.h include: i915_debugfs.c,
i915_debugfs_params.c and i915_gpu_error.c.

Add the include to fix the corresponding build errors.

Reported-by: kernel test robot 
Fixes: 33d5ae6cacf4 ("drm/print: drop include debugfs.h and include where 
needed")
Closes: 
https://lore.kernel.org/oe-kbuild-all/202404260726.nyguvl59-...@intel.com/
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 1 +
 drivers/gpu/drm/i915/i915_debugfs_params.c | 1 +
 drivers/gpu/drm/i915/i915_gpu_error.c  | 1 +
 3 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 24c78873b3cf..b552e27cdcd5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -26,6 +26,7 @@
  *
  */
 
+#include <linux/debugfs.h>
 #include 
 #include 
 #include 
diff --git a/drivers/gpu/drm/i915/i915_debugfs_params.c 
b/drivers/gpu/drm/i915/i915_debugfs_params.c
index 8bca02025e09..295486b8ceb1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs_params.c
+++ b/drivers/gpu/drm/i915/i915_debugfs_params.c
@@ -3,6 +3,7 @@
  * Copyright © 2019 Intel Corporation
  */
 
+#include <linux/debugfs.h>
 #include 
 
 #include "i915_debugfs_params.h"
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2594eb10c559..625b3c024540 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -28,6 +28,7 @@
  */
 
 #include 
+#include <linux/debugfs.h>
 #include 
 #include 
 #include 

base-commit: 4a9a567ab101e659a4fafb7a691ff6b84531a10a
-- 
2.44.0



Re: [PATCH] [v7] nouveau: add command-line GSP-RM registry support

2024-04-30 Thread Danilo Krummrich

On 4/30/24 15:06, Lucas De Marchi wrote:

On Fri, Apr 26, 2024 at 06:08:19PM GMT, Danilo Krummrich wrote:

On 4/25/24 18:38, Timur Tabi wrote:

On Thu, 2024-04-25 at 15:22 +0200, Danilo Krummrich wrote:

+    size_t length;
+
+    /* Remove any whitespace from the parameter string */
+    length = strip(p, " \t\n");


With that, I see the following warning compiling this patch.

warning: variable ‘length’ set but not used [-Wunused-but-set-variable]

Did you intend to use the length for anything?


No, and I could have sworn I fixed that before sending out v7.  I think I
originally intended 'length' to determine when I finished parsing the
string.


Also, looking at the warning made me aware of 'p' potentially being NULL.

If you agree, I can fix the warning and add the corresponding NULL check
when
applying the patch.


Yes, that would be great.  You can just delete 'length'.  The NULL check for
'p' should call clean_registry() before returning -ENOMEM.


Applied to drm-misc-next, thanks!



since this commit our CI is failing to build with the following error:

   CC [M]  drivers/gpu/drm/i915/display/intel_display_device.o
 ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c: In function ‘build_registry’:
 ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1266:3: error: label at end of compound statement
 1266 |   default:
   |   ^~~
   CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.o
   HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg.h
   CC [M]  drivers/gpu/drm/amd/amdgpu/imu_v11_0.o
 make[7]: *** [../scripts/Makefile.build:244: 
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.o] Error 1
 make[7]: *** Waiting for unfinished jobs

you are missing a `break;` after that default.


@Timur, do you want to send a quick fix for this yourself? Otherwise, I can send
one as well.

We should fix this asap.

- Danilo

Thanks for catching this.



[PATCH] drm/nouveau: use vmemdup_array_user() in u_memcpya()

2024-04-15 Thread Danilo Krummrich
Now that we have vmemdup_array_user(), make use of it.

Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/nouveau/nouveau_drv.h | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
b/drivers/gpu/drm/nouveau/nouveau_drv.h
index e239c6bf4afa..2038d60958e3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -190,11 +190,8 @@ static inline void *
 u_memcpya(uint64_t user, unsigned int nmemb, unsigned int size)
 {
void __user *userptr = u64_to_user_ptr(user);
-   size_t bytes;
 
-   if (unlikely(check_mul_overflow(nmemb, size, &bytes)))
-   return ERR_PTR(-EOVERFLOW);
-   return vmemdup_user(userptr, bytes);
+   return vmemdup_array_user(userptr, nmemb, size);
 }
 
 #include 

base-commit: a57e191ebbaa0363dbf352cc37447c2230573e29
prerequisite-patch-id: d18b2d6615c58084a7027f4f1d9b51bd3f9dd83f
-- 
2.44.0



[PATCH] drm/nouveau: fix duplicate pointer to struct drm_device

2024-04-15 Thread Danilo Krummrich
nouveau_uvmm_ioctl_vm_init() already has a pointer to struct drm_device,
no need to derive another one from struct drm_file.

Fixes: 266f7618e761 ("drm/nouveau: separately allocate struct nouveau_uvmm")
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/nouveau/nouveau_uvmm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c 
b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index 0a0a11dc9ec0..929afd5bc773 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -1812,13 +1812,12 @@ static const struct drm_gpuvm_ops gpuvm_ops = {
 };
 
 int
-nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
+nouveau_uvmm_ioctl_vm_init(struct drm_device *drm,
   void *data,
   struct drm_file *file_priv)
 {
struct nouveau_uvmm *uvmm;
struct nouveau_cli *cli = nouveau_cli(file_priv);
-   struct drm_device *drm = cli->drm->dev;
struct drm_gem_object *r_obj;
struct drm_nouveau_vm_init *init = data;
u64 kernel_managed_end;

base-commit: a57e191ebbaa0363dbf352cc37447c2230573e29
prerequisite-patch-id: 2d453c82161ab6a188caae12e558e8c8d6c1a450
-- 
2.44.0



Re: [PATCH] nouveau: fix instmem race condition around ptr stores

2024-04-15 Thread Danilo Krummrich

On 4/11/24 03:15, Dave Airlie wrote:

From: Dave Airlie 

Running a lot of VK CTS in parallel against nouveau, once every
few hours you might see something like this crash.

BUG: kernel NULL pointer dereference, address: 0008
PGD 800114e6e067 P4D 800114e6e067 PUD 109046067 PMD 0
Oops:  [#1] PREEMPT SMP PTI
CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS 
PRO WIFI-CF, BIOS F8 11/05/2021
RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 
da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 
8b 7e 08 48 89 d9 48 8d 75 04 48 c1
RSP: :ac20c5857838 EFLAGS: 00010202
RAX:  RBX: 004d8001 RCX: 0001
RDX: 004d8001 RSI: 06d8 RDI: a07afe332180
RBP: 06d8 R08: ac20c5857ad0 R09: 0010
R10: 0001 R11: a07af27e2de0 R12: 001c
R13: ac20c5857ad0 R14: a07a96fe9040 R15: 001c
FS:  7fe395eed7c0() GS:a07e2c98() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0008 CR3: 00011febe001 CR4: 003706f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:

...

  ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
  ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau]
  nvkm_vmm_iter+0x351/0xa20 [nouveau]
  ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
  ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
  ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
  ? __lock_acquire+0x3ed/0x2170
  ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
  nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau]
  ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
  ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
  nvkm_vmm_map_locked+0x224/0x3a0 [nouveau]

Adding any sort of useful debug usually makes it go away, so I hand
wrote the function in a line, and debugged the asm.

Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in
the nv50_instobj_acquire called from nvkm_kmap.

If Thread A and Thread B both get to nv50_instobj_acquire around
the same time, and Thread A hits the refcount_set line, and in
lockstep thread B succeeds at refcount_inc_not_zero, there is a
chance the ptrs value won't have been stored since refcount_set
is unordered. Force a memory barrier here, I picked smp_mb, since
we want it on all CPUs and it's write followed by a read.

v2: use paired smp_rmb/smp_wmb.
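
The same publication pattern, sketched in userspace Rust purely to
illustrate the pairing: the Release fence orders the ptrs store before
the refcount store, the Acquire fence keeps the ptrs load from being
hoisted above the refcount check.

```
use core::sync::atomic::{fence, AtomicPtr, AtomicUsize, Ordering};

static PTRS: AtomicPtr<u8> = AtomicPtr::new(core::ptr::null_mut());
static MAPS: AtomicUsize = AtomicUsize::new(0);

fn publish(p: *mut u8) {
    PTRS.store(p, Ordering::Relaxed);
    fence(Ordering::Release); // like smp_wmb(): ptrs visible before refcount
    MAPS.store(1, Ordering::Relaxed);
}

fn acquire() -> Option<*mut u8> {
    if MAPS.load(Ordering::Relaxed) != 0 {
        fence(Ordering::Acquire); // like smp_rmb(): don't hoist the ptrs load
        return Some(PTRS.load(Ordering::Relaxed));
    }
    None
}
```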

Cc: linux-stable
Signed-off-by: Dave Airlie 


Added a "Fixes:" tag and applied to drm-misc-fixes.


---
  drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c
index a7f3fc342d87..dd5b5a17ece0 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c
@@ -222,8 +222,11 @@ nv50_instobj_acquire(struct nvkm_memory *memory)
void __iomem *map = NULL;
  
  	/* Already mapped? */

-   if (refcount_inc_not_zero(&iobj->maps))
+   if (refcount_inc_not_zero(&iobj->maps)) {
+   /* read barrier match the wmb on refcount set */
+   smp_rmb();
return iobj->map;
+   }
  
  	/* Take the lock, and re-check that another thread hasn't

 * already mapped the object in the meantime.
@@ -250,6 +253,8 @@ nv50_instobj_acquire(struct nvkm_memory *memory)
iobj->base.memory.ptrs = &nv50_instobj_fast;
else
iobj->base.memory.ptrs = &nv50_instobj_slow;
+   /* barrier to ensure the ptrs are written before refcount is 
set */
+   smp_wmb();
refcount_set(&iobj->maps, 1);
}
  


Re: [PATCH v2] drm: nv04: Fix out of bounds access

2024-04-15 Thread Danilo Krummrich

On 4/11/24 13:08, Mikhail Kobuk wrote:

When Output Resource (dcb->or) value is assigned in
fabricate_dcb_output(), there may be out of bounds access to
dac_users array in case dcb->or is zero because ffs(dcb->or) is
used as index there.
The 'or' argument of fabricate_dcb_output() must be interpreted as the
number of the bit to set, not a value.

Utilize macros from 'enum nouveau_or' in calls instead of hardcoding.
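
To see why the distinction matters, an illustrative Rust sketch;
`trailing_zeros()` on a non-zero mask mirrors C's `ffs(or) - 1`:

```
fn dac_index(or: u32) -> Option<usize> {
    // A bit mask like DCB_OUTPUT_B (1 << 1) yields index 1; a plain 0
    // would make C's ffs() return 0 and the index -1, i.e. out of bounds.
    (or != 0).then(|| or.trailing_zeros() as usize)
}
// dac_index(1 << 1) == Some(1)  (DCB_OUTPUT_B)
// dac_index(0)      == None    (the case this patch avoids)
```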

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 2e5702aff395 ("drm/nouveau: fabricate DCB encoder table for iMac G4")
Fixes: 670820c0e6a9 ("drm/nouveau: Workaround incorrect DCB entry on a GeForce3 Ti 
200.")
Signed-off-by: Mikhail Kobuk 


Applied to drm-misc-fixes, thanks!


---
Changes in v2:
- Instead of checking ffs(dcb->or), adjust function calls to match
   argument semantics
- Link to v1: 
https://lore.kernel.org/all/20240331064552.6112-1-m.ko...@ispras.ru/

  drivers/gpu/drm/nouveau/nouveau_bios.c | 13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bios.c 
b/drivers/gpu/drm/nouveau/nouveau_bios.c
index 479effcf607e..79cfab53f80e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bios.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bios.c
@@ -23,6 +23,7 @@
   */
  
  #include "nouveau_drv.h"

+#include "nouveau_bios.h"
  #include "nouveau_reg.h"
  #include "dispnv04/hw.h"
  #include "nouveau_encoder.h"
@@ -1677,7 +1678,7 @@ apply_dcb_encoder_quirks(struct drm_device *dev, int idx, 
u32 *conn, u32 *conf)
 */
if (nv_match_device(dev, 0x0201, 0x1462, 0x8851)) {
if (*conn == 0xf2005014 && *conf == 0x) {
-   fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 1, 1, 1);
+   fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 1, 1, DCB_OUTPUT_B);
return false;
}
}
@@ -1763,26 +1764,26 @@ fabricate_dcb_encoder_table(struct drm_device *dev, 
struct nvbios *bios)
  #ifdef __powerpc__
/* Apple iMac G4 NV17 */
if (of_machine_is_compatible("PowerMac4,5")) {
-   fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 0, all_heads, 1);
-   fabricate_dcb_output(dcb, DCB_OUTPUT_ANALOG, 1, all_heads, 2);
+   fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 0, all_heads, DCB_OUTPUT_B);
+   fabricate_dcb_output(dcb, DCB_OUTPUT_ANALOG, 1, all_heads, DCB_OUTPUT_C);
return;
}
  #endif
  
  	/* Make up some sane defaults */

fabricate_dcb_output(dcb, DCB_OUTPUT_ANALOG,
-bios->legacy.i2c_indices.crt, 1, 1);
+bios->legacy.i2c_indices.crt, 1, DCB_OUTPUT_B);
  
  	if (nv04_tv_identify(dev, bios->legacy.i2c_indices.tv) >= 0)

fabricate_dcb_output(dcb, DCB_OUTPUT_TV,
 bios->legacy.i2c_indices.tv,
-all_heads, 0);
+all_heads, DCB_OUTPUT_A);
  
  	else if (bios->tmds.output0_script_ptr ||

 bios->tmds.output1_script_ptr)
fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS,
 bios->legacy.i2c_indices.panel,
-all_heads, 1);
+all_heads, DCB_OUTPUT_B);
  }
  
  static int




Re: [PATCH] drm: nv04: Add check to avoid out of bounds access

2024-04-10 Thread Danilo Krummrich

On 4/10/24 17:39, Mikhail Kobuk wrote:

On 08/04/2024 16:23, Danilo Krummrich wrote:

On 4/5/24 22:05, Lyude Paul wrote:

On Fri, 2024-04-05 at 17:53 +0200, Danilo Krummrich wrote:

On 3/31/24 08:45, Mikhail Kobuk wrote:

Output Resource (dcb->or) value is not guaranteed to be non-zero
(i.e.
in drivers/gpu/drm/nouveau/nouveau_bios.c, in
'fabricate_dcb_encoder_table()'
'dcb->or' is assigned value '0' in call to
'fabricate_dcb_output()').


I don't really know much about the semantics of this code.

Looking at fabricate_dcb_output() though I wonder if the intention
was to assign
BIT(or) to entry->or.

@Lyude, can you help here?


This code is definitely a bit before my time as well - but I think
you're completely correct. Especially considering this bit I found in
nouveau_bios.h:


Thanks for confirming.

@Mikhail, I think we should rather fix this assignment then.


Thank you all for a thorough look!



- Danilo



enum nouveau_or {
DCB_OUTPUT_A = (1 << 0),
DCB_OUTPUT_B = (1 << 1),
DCB_OUTPUT_C = (1 << 2)
};




Considering this code bit, and the fact that fabricate_dcb_output() is called 
in drivers/gpu/drm/nouveau/nouveau_bios.c only, there's option to adjust 
function calls instead of adding BIT(or), i.e.:

fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 0, all_heads, DCB_OUTPUT_B);

instead of current:

fabricate_dcb_output(dcb, DCB_OUTPUT_TMDS, 0, all_heads, 1);

and etc.

Should I make a new patch with adjusted calls or stick with BIT(or)?


Please send a new patch adjusting the calls using enum nouveau_or, that
seems to be cleaner.

- Danilo





Otherwise, for parsing the DCB entries, it seems that the bound
checks are
happening in olddcb_outp_foreach() [1].

[1]
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nouveau_bios.c#L1331



Add check to validate 'dcb->or' before it's used.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 2e5702aff395 ("drm/nouveau: fabricate DCB encoder table for
iMac G4")
Signed-off-by: Mikhail Kobuk 
---
   drivers/gpu/drm/nouveau/dispnv04/dac.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/dac.c
b/drivers/gpu/drm/nouveau/dispnv04/dac.c
index d6b8e0cce2ac..0c8d4fc95ff3 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/dac.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/dac.c
@@ -428,7 +428,7 @@ void nv04_dac_update_dacclk(struct drm_encoder
*encoder, bool enable)
   struct drm_device *dev = encoder->dev;
   struct dcb_output *dcb = nouveau_encoder(encoder)->dcb;
-    if (nv_gf4_disp_arch(dev)) {
+    if (nv_gf4_disp_arch(dev) && ffs(dcb->or)) {
   uint32_t *dac_users = &nv04_display(dev)->dac_users[ffs(dcb->or) - 1];

   int dacclk_off = NV_PRAMDAC_DACCLK +
nv04_dac_output_offset(encoder);
   uint32_t dacclk = NVReadRAMDAC(dev, 0,
dacclk_off);
@@ -453,7 +453,7 @@ bool nv04_dac_in_use(struct drm_encoder
*encoder)
   struct drm_device *dev = encoder->dev;
   struct dcb_output *dcb = nouveau_encoder(encoder)->dcb;
-    return nv_gf4_disp_arch(encoder->dev) &&
+    return nv_gf4_disp_arch(encoder->dev) && ffs(dcb->or) &&
   (nv04_display(dev)->dac_users[ffs(dcb->or) - 1] & ~(1 << dcb->index));
   }










Re: [PATCH] nouveau: fix instmem race condition around ptr stores

2024-04-09 Thread Danilo Krummrich

On 4/9/24 10:27, Lucas Stach wrote:

Am Dienstag, dem 09.04.2024 um 10:34 +1000 schrieb Dave Airlie:

From: Dave Airlie 

Running a lot of VK CTS in parallel against nouveau, once every
few hours you might see something like this crash.

BUG: kernel NULL pointer dereference, address: 0008
PGD 800114e6e067 P4D 800114e6e067 PUD 109046067 PMD 0
Oops:  [#1] PREEMPT SMP PTI
CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS 
PRO WIFI-CF, BIOS F8 11/05/2021
RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 
da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 
8b 7e 08 48 89 d9 48 8d 75 04 48 c1
RSP: :ac20c5857838 EFLAGS: 00010202
RAX:  RBX: 004d8001 RCX: 0001
RDX: 004d8001 RSI: 06d8 RDI: a07afe332180
RBP: 06d8 R08: ac20c5857ad0 R09: 0010
R10: 0001 R11: a07af27e2de0 R12: 001c
R13: ac20c5857ad0 R14: a07a96fe9040 R15: 001c
FS:  7fe395eed7c0() GS:a07e2c98() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0008 CR3: 00011febe001 CR4: 003706f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:

...

  ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
  ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau]
  nvkm_vmm_iter+0x351/0xa20 [nouveau]
  ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
  ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
  ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
  ? __lock_acquire+0x3ed/0x2170
  ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
  nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau]
  ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
  ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
  nvkm_vmm_map_locked+0x224/0x3a0 [nouveau]

Adding any sort of useful debug usually makes it go away, so I hand
wrote the function in a line, and debugged the asm.

Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in
the nv50_instobj_acquire called from nvkm_kmap.

If Thread A and Thread B both get to nv50_instobj_acquire around
the same time, and Thread A hits the refcount_set line, and in
lockstep thread B succeeds at refcount_inc_not_zero, there is a
chance the ptrs value won't have been stored since refcount_set
is unordered. Force a memory barrier here, I picked smp_mb, since
we want it on all CPUs and it's write followed by a read.


Good catch!



Cc: linux-stable
Signed-off-by: Dave Airlie 
---
  drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c
index a7f3fc342d87..cbacc7b11f8c 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c
@@ -250,6 +250,9 @@ nv50_instobj_acquire(struct nvkm_memory *memory)
iobj->base.memory.ptrs = &nv50_instobj_fast;
else
iobj->base.memory.ptrs = &nv50_instobj_slow;
+   /* barrier to ensure ptrs is written before another thread
+  does refcount_inc_not_zero successfully. */
+   smp_mb();


Doesn't this miss the corresponding smp_rmb after
refcount_inc_not_zero? Without it a sufficiently speculating CPU might
still hoist the NULL ptr load across the refcount increase.


Agree, also think this one could be smp_wmb() only.

I also think it's reasonable to keep "the fast path refcount_inc_not_zero
that doesn't take the lock", since the scope for this being potentially racy
is limited to this function only.



Regards,
Lucas
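
To make the pairing concrete, a minimal sketch of the publish/consume
pattern being discussed (symbol and field names are illustrative, not
the exact nouveau code):

    /* Publisher (nv50_instobj_acquire): make ptrs visible before the
     * refcount, so a reader that observes the count also observes ptrs. */
    iobj->base.memory.ptrs = &nv50_instobj_fast;
    smp_wmb();
    refcount_set(&iobj->maps, 1);

    /* Reader (fast path): pair with a read barrier so the ptrs load
     * cannot be hoisted above the successful increment. */
    if (refcount_inc_not_zero(&iobj->maps)) {
        smp_rmb();
        use_ptrs(iobj->base.memory.ptrs);
    }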


Re: [PATCH] drm: nv04: Add check to avoid out of bounds access

2024-04-08 Thread Danilo Krummrich

On 4/5/24 22:05, Lyude Paul wrote:

On Fri, 2024-04-05 at 17:53 +0200, Danilo Krummrich wrote:

On 3/31/24 08:45, Mikhail Kobuk wrote:

Output Resource (dcb->or) value is not guaranteed to be non-zero
(i.e.
in drivers/gpu/drm/nouveau/nouveau_bios.c, in
'fabricate_dcb_encoder_table()'
'dcb->or' is assigned value '0' in call to
'fabricate_dcb_output()').


I don't really know much about the semantics of this code.

Looking at fabricate_dcb_output() though I wonder if the intention
was to assign
BIT(or) to entry->or.

@Lyude, can you help here?


This code is definitely a bit before my time as well - but I think
you're completely correct. Especially considering this bit I found in
nouveau_bios.h:


Thanks for confirming.

@Mikhail, I think we should rather fix this assignment then.

- Danilo



enum nouveau_or {
DCB_OUTPUT_A = (1 << 0),
DCB_OUTPUT_B = (1 << 1),
DCB_OUTPUT_C = (1 << 2)
};




Otherwise, for parsing the DCB entries, it seems that the bounds
checks are happening in olddcb_outp_foreach() [1].

[1]
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nouveau_bios.c#L1331



Add check to validate 'dcb->or' before it's used.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 2e5702aff395 ("drm/nouveau: fabricate DCB encoder table for
iMac G4")
Signed-off-by: Mikhail Kobuk 
---
   drivers/gpu/drm/nouveau/dispnv04/dac.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/dac.c
b/drivers/gpu/drm/nouveau/dispnv04/dac.c
index d6b8e0cce2ac..0c8d4fc95ff3 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/dac.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/dac.c
@@ -428,7 +428,7 @@ void nv04_dac_update_dacclk(struct drm_encoder
*encoder, bool enable)
    struct drm_device *dev = encoder->dev;
    struct dcb_output *dcb = nouveau_encoder(encoder)->dcb;
   
-	if (nv_gf4_disp_arch(dev)) {

+   if (nv_gf4_disp_arch(dev) && ffs(dcb->or)) {
    uint32_t *dac_users = &nv04_display(dev)->dac_users[ffs(dcb->or) - 1];

    int dacclk_off = NV_PRAMDAC_DACCLK +
nv04_dac_output_offset(encoder);
    uint32_t dacclk = NVReadRAMDAC(dev, 0,
dacclk_off);
@@ -453,7 +453,7 @@ bool nv04_dac_in_use(struct drm_encoder
*encoder)
    struct drm_device *dev = encoder->dev;
    struct dcb_output *dcb = nouveau_encoder(encoder)->dcb;
   
-	return nv_gf4_disp_arch(encoder->dev) &&

+   return nv_gf4_disp_arch(encoder->dev) && ffs(dcb->or) &&
    (nv04_display(dev)->dac_users[ffs(dcb->or) - 1] &
~(1 << dcb->index));
   }
   






Re: [PATCH] nouveau: fix devinit paths to only handle display on GSP.

2024-04-08 Thread Danilo Krummrich

On 4/8/24 08:42, Dave Airlie wrote:

This reverts:
nouveau/gsp: don't check devinit disable on GSP.
and applies a further fix.

It turns out the open GPU driver checks this register, but only for display.

Match that behaviour and only disable displays on Turings.

Fixes: 5d4e8ae6e57b ("nouveau/gsp: don't check devinit disable on GSP.")
Signed-off-by: Dave Airlie 


Reviewed-by: Danilo Krummrich 


---
  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c | 12 
  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c  |  1 +
  2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c
index 7bcbc4895ec2..271bfa038f5b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/gm107.c
@@ -25,6 +25,7 @@
  
  #include 

  #include 
+#include 
  
  void

  gm107_devinit_disable(struct nvkm_devinit *init)
@@ -33,10 +34,13 @@ gm107_devinit_disable(struct nvkm_devinit *init)
u32 r021c00 = nvkm_rd32(device, 0x021c00);
u32 r021c04 = nvkm_rd32(device, 0x021c04);
  
-	if (r021c00 & 0x0001)

-   nvkm_subdev_disable(device, NVKM_ENGINE_CE, 0);
-   if (r021c00 & 0x0004)
-   nvkm_subdev_disable(device, NVKM_ENGINE_CE, 2);
+   /* gsp only wants to enable/disable display */
+   if (!nvkm_gsp_rm(device->gsp)) {
+   if (r021c00 & 0x0001)
+   nvkm_subdev_disable(device, NVKM_ENGINE_CE, 0);
+   if (r021c00 & 0x0004)
+   nvkm_subdev_disable(device, NVKM_ENGINE_CE, 2);
+   }
if (r021c04 & 0x0001)
nvkm_subdev_disable(device, NVKM_ENGINE_DISP, 0);
  }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
index 11b4c9c274a1..666eb93b1742 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
@@ -41,6 +41,7 @@ r535_devinit_new(const struct nvkm_devinit_func *hw,
  
  	rm->dtor = r535_devinit_dtor;

rm->post = hw->post;
+   rm->disable = hw->disable;
  
  	ret = nv50_devinit_new_(rm, device, type, inst, pdevinit);

if (ret)


Re: [PATCH] [v2] nouveau: fix function cast warning

2024-04-05 Thread Danilo Krummrich

On 4/4/24 18:02, Arnd Bergmann wrote:

From: Arnd Bergmann 

Calling a function through an incompatible pointer type breaks kCFI,
so clang warns about the assignment:

drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c:73:10: error: cast from 
'void (*)(const void *)' to 'void (*)(void *)' converts to incompatible 
function type [-Werror,-Wcast-function-type-strict]
73 | .fini = (void(*)(void *))kfree,

Avoid this with a trivial wrapper.

Fixes: c39f472e9f14 ("drm/nouveau: remove symlinks, move core/ to nvkm/ (no code 
changes)")
Signed-off-by: Arnd Bergmann 


Applied to drm-misc-fixes, thanks!


---
v2: avoid 'return kfree()' expression returning a void
---
  drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c
index 4bf486b57101..cb05f7f48a98 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c
@@ -66,11 +66,16 @@ of_init(struct nvkm_bios *bios, const char *name)
return ERR_PTR(-EINVAL);
  }
  
+static void of_fini(void *p)

+{
+   kfree(p);
+}
+
  const struct nvbios_source
  nvbios_of = {
.name = "OpenFirmware",
.init = of_init,
-   .fini = (void(*)(void *))kfree,
+   .fini = of_fini,
.read = of_read,
.size = of_size,
.rw = false,




Re: [PATCH] nouveau/gsp: Avoid addressing beyond end of rpc->entries

2024-04-05 Thread Danilo Krummrich

On 3/30/24 15:12, Kees Cook wrote:

Using the end of rpc->entries[] for addressing runs into both compile-time
and run-time detection of accessing beyond the end of the array. Use the
base pointer instead, since it was allocated with the additional bytes for
storing the strings. Avoids the following warning in future GCC releases
with support for __counted_by:

In function 'fortify_memcpy_chk',
 inlined from 'r535_gsp_rpc_set_registry' at 
../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1123:3:
../include/linux/fortify-string.h:553:25: error: call to 
'__write_overflow_field' declared with attribute warning: detected write beyond 
size of field (1st parameter); maybe use struct_group()? 
[-Werror=attribute-warning]
   553 | __write_overflow_field(p_size_field, size);
   | ^~

for this code:

strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];
...
 memcpy(strings, r535_registry_entries[i].name, name_len);

Signed-off-by: Kees Cook 


Applied to drm-misc-fixes, thanks!


---
Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Cc: Ben Skeggs 
Cc: Timur Tabi 
Cc: dri-devel@lists.freedesktop.org
Cc: nouv...@lists.freedesktop.org
---
  drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 9994cbd6f1c4..9858c1438aa7 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1112,7 +1112,7 @@ r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp)
rpc->numEntries = NV_GSP_REG_NUM_ENTRIES;
  
  	str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]);

-   strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];
+   strings = (char *)rpc + str_offset;
for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++) {
int name_len = strlen(r535_registry_entries[i].name) + 1;
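
As a side note, the general shape of the problem and fix, as an
illustrative sketch (the struct layout, N and strings_len here are
hypothetical, not the actual rpc definition):

    #define N 16    /* hypothetical entry count */

    struct entry { u32 key; u32 value; };
    struct msg {
            u32 count;
            struct entry entries[] __counted_by(count);
    };

    /* The allocation carries extra room for strings after the array: */
    struct msg *m = kmalloc(offsetof(struct msg, entries[N]) + strings_len,
                            GFP_KERNEL);

    /* &m->entries[N] is one past the counted flexible array, so a
     * fortified memcpy() through it is flagged as an overflow ... */
    char *bad = (char *)&m->entries[N];

    /* ... while the same offset taken from the allocation base stays
     * within the object the allocator returned: */
    char *good = (char *)m + offsetof(struct msg, entries[N]);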
  




Re: [PATCH v0 13/14] drm/nouveau: Make I2C terminology more inclusive

2024-04-05 Thread Danilo Krummrich

Hi Easwar,

On 3/29/24 18:00, Easwar Hariharan wrote:

I2C v7, SMBus 3.2, and I3C specifications have replaced "master/slave"
with more appropriate terms. Inspired by and following on to Wolfram's
series to fix drivers/i2c/[1], fix the terminology for users of
I2C_ALGOBIT bitbanging interface, now that the approved verbiage exists
in the specification.

Compile tested, no functionality changes intended

[1]: 
https://lore.kernel.org/all/20240322132619.6389-1-wsa+rene...@sang-engineering.com/

Signed-off-by: Easwar Hariharan 
---
  drivers/gpu/drm/nouveau/dispnv04/dfp.c | 14 +++---
  .../gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h |  2 +-
  drivers/gpu/drm/nouveau/nouveau_bios.c |  4 ++--
  3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/dfp.c 
b/drivers/gpu/drm/nouveau/dispnv04/dfp.c
index d5b129dc623b..65b791006b19 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/dfp.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/dfp.c
@@ -149,7 +149,7 @@ void nv04_dfp_update_fp_control(struct drm_encoder 
*encoder, int mode)
}
  }
  
-static struct drm_encoder *get_tmds_slave(struct drm_encoder *encoder)

+static struct drm_encoder *get_tmds_client(struct drm_encoder *encoder)
  {
struct drm_device *dev = encoder->dev;
struct dcb_output *dcb = nouveau_encoder(encoder)->dcb;
@@ -172,7 +172,7 @@ static struct drm_encoder *get_tmds_slave(struct 
drm_encoder *encoder)
struct dcb_output *slave_dcb = nouveau_encoder(slave)->dcb;
  
  		if (slave_dcb->type == DCB_OUTPUT_TMDS && get_slave_funcs(slave) &&

-   slave_dcb->tmdsconf.slave_addr == dcb->tmdsconf.slave_addr)
+   slave_dcb->tmdsconf.client_addr == 
dcb->tmdsconf.client_addr)
return slave;


While, personally, I think master/slave was well suited for the device
relationship on those busses, if we change it to conform with the updated
specification, we should update drivers consistently.

We should also change the names of the surrounding structures and variable
names; otherwise we just make this code harder to read.

- Danilo


}
  
@@ -471,7 +471,7 @@ static void nv04_dfp_commit(struct drm_encoder *encoder)

NVWriteRAMDAC(dev, 0, NV_PRAMDAC_TEST_CONTROL + 
nv04_dac_output_offset(encoder), 0x0010);
  
  	/* Init external transmitters */

-   slave_encoder = get_tmds_slave(encoder);
+   slave_encoder = get_tmds_client(encoder);
if (slave_encoder)
get_slave_funcs(slave_encoder)->mode_set(
slave_encoder, &nv_encoder->mode, &nv_encoder->mode);
@@ -621,7 +621,7 @@ static void nv04_dfp_destroy(struct drm_encoder *encoder)
kfree(nv_encoder);
  }
  
-static void nv04_tmds_slave_init(struct drm_encoder *encoder)

+static void nv04_tmds_client_init(struct drm_encoder *encoder)
  {
struct drm_device *dev = encoder->dev;
struct dcb_output *dcb = nouveau_encoder(encoder)->dcb;
@@ -632,7 +632,7 @@ static void nv04_tmds_slave_init(struct drm_encoder 
*encoder)
{
{
.type = "sil164",
-   .addr = (dcb->tmdsconf.slave_addr == 0x7 ? 0x3a : 0x38),
+   .addr = (dcb->tmdsconf.client_addr == 0x7 ? 0x3a : 
0x38),
.platform_data = &(struct sil164_encoder_params) {
SIL164_INPUT_EDGE_RISING
 }
@@ -642,7 +642,7 @@ static void nv04_tmds_slave_init(struct drm_encoder 
*encoder)
};
int type;
  
-	if (!nv_gf4_disp_arch(dev) || !bus || get_tmds_slave(encoder))

+   if (!nv_gf4_disp_arch(dev) || !bus || get_tmds_client(encoder))
return;
  
  	type = nvkm_i2c_bus_probe(bus, "TMDS transmitter", info, NULL, NULL);

@@ -716,7 +716,7 @@ nv04_dfp_create(struct drm_connector *connector, struct 
dcb_output *entry)
  
  	if (entry->type == DCB_OUTPUT_TMDS &&

entry->location != DCB_LOC_ON_CHIP)
-   nv04_tmds_slave_init(encoder);
+   nv04_tmds_client_init(encoder);
  
  	drm_connector_attach_encoder(connector, encoder);

return 0;
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h 
b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h
index 73f9d9947e7e..5da40cf74b1a 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h
@@ -50,7 +50,7 @@ struct dcb_output {
} dpconf;
struct {
struct sor_conf sor;
-   int slave_addr;
+   int client_addr;
} tmdsconf;
};
bool i2c_upper_default;
diff --git a/drivers/gpu/drm/nouveau/nouveau_bios.c 
b/drivers/gpu/drm/nouveau/nouveau_bios.c
index 479effcf607e..a91a5d3df3ca 100644
--- 

Re: [PATCH] drm: nv04: Add check to avoid out of bounds access

2024-04-05 Thread Danilo Krummrich

On 3/31/24 08:45, Mikhail Kobuk wrote:

Output Resource (dcb->or) value is not guaranteed to be non-zero (i.e.
in drivers/gpu/drm/nouveau/nouveau_bios.c, in 'fabricate_dcb_encoder_table()'
'dcb->or' is assigned value '0' in call to 'fabricate_dcb_output()').


I don't really know much about the semantics of this code.

Looking at fabricate_dcb_output() though I wonder if the intention was to assign
BIT(or) to entry->or.

@Lyude, can you help here?

Otherwise, for parsing the DCB entries, it seems that the bounds checks are
happening in olddcb_outp_foreach() [1].

[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nouveau_bios.c#L1331



Add check to validate 'dcb->or' before it's used.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 2e5702aff395 ("drm/nouveau: fabricate DCB encoder table for iMac G4")
Signed-off-by: Mikhail Kobuk 
---
  drivers/gpu/drm/nouveau/dispnv04/dac.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/dac.c 
b/drivers/gpu/drm/nouveau/dispnv04/dac.c
index d6b8e0cce2ac..0c8d4fc95ff3 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/dac.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/dac.c
@@ -428,7 +428,7 @@ void nv04_dac_update_dacclk(struct drm_encoder *encoder, 
bool enable)
struct drm_device *dev = encoder->dev;
struct dcb_output *dcb = nouveau_encoder(encoder)->dcb;
  
-	if (nv_gf4_disp_arch(dev)) {

+   if (nv_gf4_disp_arch(dev) && ffs(dcb->or)) {
uint32_t *dac_users = &nv04_display(dev)->dac_users[ffs(dcb->or) - 1];
int dacclk_off = NV_PRAMDAC_DACCLK + 
nv04_dac_output_offset(encoder);
uint32_t dacclk = NVReadRAMDAC(dev, 0, dacclk_off);
@@ -453,7 +453,7 @@ bool nv04_dac_in_use(struct drm_encoder *encoder)
struct drm_device *dev = encoder->dev;
struct dcb_output *dcb = nouveau_encoder(encoder)->dcb;
  
-	return nv_gf4_disp_arch(encoder->dev) &&

+   return nv_gf4_disp_arch(encoder->dev) && ffs(dcb->or) &&
(nv04_display(dev)->dac_users[ffs(dcb->or) - 1] & ~(1 << 
dcb->index));
  }
  




Re: [PATCH] nouveau/uvmm: fix addr/range calcs for remap operations

2024-03-28 Thread Danilo Krummrich

On 3/28/24 03:43, Dave Airlie wrote:

From: Dave Airlie 

dEQP-VK.sparse_resources.image_rebind.2d_array.r64i.128_128_8
was causing a remap operation like the below.

op_remap: prev: 003fffed 000f a5abd18a 

op_remap: next:
op_remap: unmap: 003fffed 0010 0
op_map: map: 003c 0001 5b1ba33c 000e

This was resulting in an unmap operation from 0x3fffed+0xf, 0x10
which was corrupting the pagetables and oopsing the kernel.


Good catch, thanks for looking into that.



Fixes the prev + unmap range calcs to use start/end and map back to addr/range.


I like how using start/end instead fixes the issue and keeps it simple.
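
To illustrate the bug with made-up numbers: take a VA with addr = 0x1000
and range = 0x100 (so end = 0x1100), a prev remap covering
[0x1000, 0x1040), and no next. The old code advanced addr to 0x1040 but
left range at 0x100, so the unmap covered [0x1040, 0x1140), running 0x40
bytes past the VA. Tracking end = 0x1100 and unmapping end - addr = 0xc0
bytes gives the correct [0x1040, 0x1100).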



Signed-off-by: Dave Airlie 
Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
Cc: Danilo Krummrich 


Applied the patch to drm-misc-fixes.


---
  drivers/gpu/drm/nouveau/nouveau_uvmm.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c 
b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index 9675ef25b16d..87bce1a9d073 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -813,15 +813,15 @@ op_remap(struct drm_gpuva_op_remap *r,
struct drm_gpuva_op_unmap *u = r->unmap;
struct nouveau_uvma *uvma = uvma_from_va(u->va);
u64 addr = uvma->va.va.addr;
-   u64 range = uvma->va.va.range;
+   u64 end = uvma->va.va.addr + uvma->va.va.range;
  
  	if (r->prev)

addr = r->prev->va.addr + r->prev->va.range;
  
  	if (r->next)

-   range = r->next->va.addr - addr;
+   end = r->next->va.addr;
  
-	op_unmap_range(u, addr, range);

+   op_unmap_range(u, addr, end - addr);
  }
  
  static int




Re: [PATCH][next] drm/nouveau/gr/gf100: Remove second semicolon

2024-03-28 Thread Danilo Krummrich

On 3/15/24 10:09, Colin Ian King wrote:

There is a statement with two semicolons. Remove the second one; it
is redundant.

Signed-off-by: Colin Ian King 


Applied, thanks!


---
  drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c 
b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
index 986e8d547c94..060c74a80eb1 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
@@ -420,7 +420,7 @@ gf100_gr_chan_new(struct nvkm_gr *base, struct nvkm_chan 
*fifoch,
return ret;
} else {
ret = nvkm_memory_map(gr->attrib_cb, 0, chan->vmm, 
chan->attrib_cb,
- &args, sizeof(args));;
+ &args, sizeof(args));
if (ret)
return ret;
}




Nova and staging Rust abstractions

2024-03-20 Thread Danilo Krummrich

Hi all,

In this mail I briefly want to announce the Nova project and subsequently talk 
about the first
efforts taken in order to upstream required Rust abstractions:

We just started to work on Nova, a Rust-based GSP-only driver for Nvidia GPUs. 
Nova, in the long
term, is intended to serve as the successor of Nouveau for GSP-firmware-based 
GPUs.

With Nova we see the chance to significantly decrease the complexity of the 
driver compared to
Nouveau for mainly two reasons. First, Nouveau's historic architecture, 
especially around nvif/nvkm,
is rather complicated and inflexible and requires major rework to solve certain 
problems (such as
locking hierarchy in VMM / MMU code for VM_BIND currently being solved with a 
workaround); and
second, with a GSP-only driver there is no need to maintain compatibility with 
pre-GSP code.

Besides that, we also want to take the chance to contribute to the Rust efforts 
in the kernel and
benefit from the additional memory safety offered by the Rust programming language. 
Ideally, all that
leads to better maintainability and a driver that makes it easy for people to 
get involved in this
project.

With the choice of Rust, the first problem to deal with is missing C binding 
abstractions for
integral kernel infrastructure (e.g. device / driver abstractions). Since this 
is a bit of a chicken
and egg problem - we need a user to upstream abstractions, but we also need the 
abstractions to
create a driver - we want to develop Nova upstream and start with just a driver 
stub that only makes
use of some basic Rust abstractions.

In particular, we want to start with basic device / driver, DRM and PCI 
abstractions and a Nova stub
driver making use of them.

Fortunately, a lot of those abstractions already existed in various 
downstream trees (big thanks
to all the authors). We started picking up existing work, figured out the 
dependencies, fixed a few
issues and warnings and applied some structure by organizing it in the 
following branches.

Remotes: drm-misc [1], rust (Rust-for-Linux) [2], nova [3]

- drm-misc/topic/rust-drm:  [4] contains basic DRM abstractions, e.g. DRM 
device, GEM; depends on
rust-device
- rust/staging/rust-device: [5] contains all common device / driver abstractions
- rust/staging/rust-pci:[6] contains basic PCI abstractions, e.g. PCI 
device, but also other
dependencies such as IoMem and Resources; 
depends on rust-device
- rust/staging/dev: [7] integration branch for all staging Rust 
abstractions
- nova/stub/nova:   [8] the nova stub driver; depends on rust-drm and 
rust-pci

All branches and commits are functional, but the code and commit messages are 
not yet in a state for
sending actual patch series. While we are working on improving those patches we 
would like to ask
for feedback and / or improvements already.

@Greg, can you please have a first quick look at rust-device [5]?

- Danilo

[1] https://cgit.freedesktop.org/drm/drm-misc
[2] https://github.com/Rust-for-Linux/linux
[3] https://gitlab.freedesktop.org/drm/nova
[4] https://cgit.freedesktop.org/drm/drm-misc/log/?h=topic/rust-drm
[5] https://github.com/Rust-for-Linux/linux/tree/staging/rust-device
[6] https://github.com/Rust-for-Linux/linux/tree/staging/rust-pci
[7] https://github.com/Rust-for-Linux/linux/tree/staging/dev
[8] https://gitlab.freedesktop.org/drm/nova/-/tree/stub/nova



Re: [PATCH] nouveau/gsp: don't check devinit disable on GSP.

2024-03-19 Thread Danilo Krummrich

On 3/19/24 18:38, Timothy Maden wrote:

Hello

Can this be merged please ?


Applied this to drm-misc-next-fixes a couple of hours ago.

- Danilo



On 3/14/24 03:45, Dave Airlie wrote:

From: Dave Airlie 

GSP should be handling this and I can see no evidence in opengpu
driver that this register should be touched.

Fixes acceleration on 2080 Ti GPUs.

Fixes: 15740541e8f0 ("drm/nouveau/devinit/tu102-: prepare for GSP-RM")

Signed-off-by: Dave Airlie 
---
  drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
index 666eb93b1742..11b4c9c274a1 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/r535.c
@@ -41,7 +41,6 @@ r535_devinit_new(const struct nvkm_devinit_func *hw,
  rm->dtor = r535_devinit_dtor;
  rm->post = hw->post;
-    rm->disable = hw->disable;
  ret = nv50_devinit_new_(rm, device, type, inst, pdevinit);
  if (ret)





Re: [RESEND v3 1/6] drm/mst: read sideband messaging cap

2024-03-19 Thread Danilo Krummrich

On 3/19/24 10:20, Jani Nikula wrote:

On Tue, 19 Mar 2024, Jani Nikula  wrote:

Amend drm_dp_read_mst_cap() to return an enum, indicating "SST", "SST
with sideband messaging", or "MST". Modify all call sites to take the
new return value into account.


drm-misc and nouveau maintainers, ack for merging this via drm-intel,
please?


Sure, please go ahead.

Thanks,
Danilo



BR,
Jani.





v2:
- Rename enumerators (Ville)

Cc: Arun R Murthy 
Cc: Ville Syrjälä 
Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Reviewed-by: Ville Syrjälä 
Signed-off-by: Jani Nikula 
---
  drivers/gpu/drm/display/drm_dp_mst_topology.c | 20 ++--
  drivers/gpu/drm/i915/display/intel_dp.c   |  4 ++--
  drivers/gpu/drm/nouveau/nouveau_dp.c  |  2 +-
  include/drm/display/drm_dp_mst_helper.h   | 23 ++-
  4 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c 
b/drivers/gpu/drm/display/drm_dp_mst_topology.c
index 03d528209426..c193be3577f7 100644
--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
@@ -3608,24 +3608,30 @@ fixed20_12 drm_dp_get_vc_payload_bw(const struct 
drm_dp_mst_topology_mgr *mgr,
  EXPORT_SYMBOL(drm_dp_get_vc_payload_bw);
  
  /**

- * drm_dp_read_mst_cap() - check whether or not a sink supports MST
+ * drm_dp_read_mst_cap() - Read the sink's MST mode capability
   * @aux: The DP AUX channel to use
   * @dpcd: A cached copy of the DPCD capabilities for this sink
   *
- * Returns: %True if the sink supports MST, %false otherwise
+ * Returns: enum drm_dp_mst_mode to indicate MST mode capability
   */
-bool drm_dp_read_mst_cap(struct drm_dp_aux *aux,
-const u8 dpcd[DP_RECEIVER_CAP_SIZE])
+enum drm_dp_mst_mode drm_dp_read_mst_cap(struct drm_dp_aux *aux,
+const u8 dpcd[DP_RECEIVER_CAP_SIZE])
  {
u8 mstm_cap;
  
  	if (dpcd[DP_DPCD_REV] < DP_DPCD_REV_12)

-   return false;
+   return DRM_DP_SST;
  
	if (drm_dp_dpcd_readb(aux, DP_MSTM_CAP, &mstm_cap) != 1)

-   return false;
+   return DRM_DP_SST;
+
+   if (mstm_cap & DP_MST_CAP)
+   return DRM_DP_MST;
+
+   if (mstm_cap & DP_SINGLE_STREAM_SIDEBAND_MSG)
+   return DRM_DP_SST_SIDEBAND_MSG;
  
-	return mstm_cap & DP_MST_CAP;

+   return DRM_DP_SST;
  }
  EXPORT_SYMBOL(drm_dp_read_mst_cap);
  
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c

index af7ca00e9bc0..91c42949ac7e 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -4046,7 +4046,7 @@ intel_dp_can_mst(struct intel_dp *intel_dp)
  
  	return i915->display.params.enable_dp_mst &&

intel_dp_mst_source_support(intel_dp) &&
-   drm_dp_read_mst_cap(&intel_dp->aux, intel_dp->dpcd);
+   drm_dp_read_mst_cap(&intel_dp->aux, intel_dp->dpcd) == 
DRM_DP_MST;
  }
  
  static void

@@ -4055,7 +4055,7 @@ intel_dp_configure_mst(struct intel_dp *intel_dp)
struct drm_i915_private *i915 = dp_to_i915(intel_dp);
struct intel_encoder *encoder =
&dp_to_dig_port(intel_dp)->base;
-   bool sink_can_mst = drm_dp_read_mst_cap(&intel_dp->aux, intel_dp->dpcd);
+   bool sink_can_mst = drm_dp_read_mst_cap(&intel_dp->aux, intel_dp->dpcd) 
== DRM_DP_MST;
  
drm_dbg_kms(&i915->drm,

"[ENCODER:%d:%s] MST support: port: %s, sink: %s, modparam: 
%s\n",
diff --git a/drivers/gpu/drm/nouveau/nouveau_dp.c 
b/drivers/gpu/drm/nouveau/nouveau_dp.c
index 7de7707ec6a8..fb06ee17d9e5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dp.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dp.c
@@ -181,7 +181,7 @@ nouveau_dp_probe_dpcd(struct nouveau_connector 
*nv_connector,
if (nouveau_mst) {
mstm = outp->dp.mstm;
if (mstm)
-   mstm->can_mst = drm_dp_read_mst_cap(aux, dpcd);
+   mstm->can_mst = drm_dp_read_mst_cap(aux, dpcd) == 
DRM_DP_MST;
}
  
  	if (nouveau_dp_has_sink_count(connector, outp)) {

diff --git a/include/drm/display/drm_dp_mst_helper.h 
b/include/drm/display/drm_dp_mst_helper.h
index 3ae88a383a41..cbcb49cb6a46 100644
--- a/include/drm/display/drm_dp_mst_helper.h
+++ b/include/drm/display/drm_dp_mst_helper.h
@@ -817,7 +817,28 @@ int drm_dp_mst_topology_mgr_init(struct 
drm_dp_mst_topology_mgr *mgr,
  
  void drm_dp_mst_topology_mgr_destroy(struct drm_dp_mst_topology_mgr *mgr);
  
-bool drm_dp_read_mst_cap(struct drm_dp_aux *aux, const u8 dpcd[DP_RECEIVER_CAP_SIZE]);

+/**
+ * enum drm_dp_mst_mode - sink's MST mode capability
+ */
+enum drm_dp_mst_mode {
+   /**
+* @DRM_DP_SST: The sink does not support MST nor single stream sideband
+* messaging.
+*/
+   DRM_DP_SST,

Re: [RESEND v3 1/6] drm/mst: read sideband messaging cap

2024-03-19 Thread Danilo Krummrich

On 3/19/24 10:12, Jani Nikula wrote:

Amend drm_dp_read_mst_cap() to return an enum, indicating "SST", "SST
with sideband messaging", or "MST". Modify all call sites to take the
new return value into account.

v2:
- Rename enumerators (Ville)

Cc: Arun R Murthy 
Cc: Ville Syrjälä 
Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Reviewed-by: Ville Syrjälä 
Signed-off-by: Jani Nikula 


Acked-by: Danilo Krummrich 


---
  drivers/gpu/drm/display/drm_dp_mst_topology.c | 20 ++--
  drivers/gpu/drm/i915/display/intel_dp.c   |  4 ++--
  drivers/gpu/drm/nouveau/nouveau_dp.c  |  2 +-
  include/drm/display/drm_dp_mst_helper.h   | 23 ++-
  4 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c 
b/drivers/gpu/drm/display/drm_dp_mst_topology.c
index 03d528209426..c193be3577f7 100644
--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
@@ -3608,24 +3608,30 @@ fixed20_12 drm_dp_get_vc_payload_bw(const struct 
drm_dp_mst_topology_mgr *mgr,
  EXPORT_SYMBOL(drm_dp_get_vc_payload_bw);
  
  /**

- * drm_dp_read_mst_cap() - check whether or not a sink supports MST
+ * drm_dp_read_mst_cap() - Read the sink's MST mode capability
   * @aux: The DP AUX channel to use
   * @dpcd: A cached copy of the DPCD capabilities for this sink
   *
- * Returns: %True if the sink supports MST, %false otherwise
+ * Returns: enum drm_dp_mst_mode to indicate MST mode capability
   */
-bool drm_dp_read_mst_cap(struct drm_dp_aux *aux,
-const u8 dpcd[DP_RECEIVER_CAP_SIZE])
+enum drm_dp_mst_mode drm_dp_read_mst_cap(struct drm_dp_aux *aux,
+const u8 dpcd[DP_RECEIVER_CAP_SIZE])
  {
u8 mstm_cap;
  
  	if (dpcd[DP_DPCD_REV] < DP_DPCD_REV_12)

-   return false;
+   return DRM_DP_SST;
  
	if (drm_dp_dpcd_readb(aux, DP_MSTM_CAP, &mstm_cap) != 1)

-   return false;
+   return DRM_DP_SST;
+
+   if (mstm_cap & DP_MST_CAP)
+   return DRM_DP_MST;
+
+   if (mstm_cap & DP_SINGLE_STREAM_SIDEBAND_MSG)
+   return DRM_DP_SST_SIDEBAND_MSG;
  
-	return mstm_cap & DP_MST_CAP;

+   return DRM_DP_SST;
  }
  EXPORT_SYMBOL(drm_dp_read_mst_cap);
  
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c

index af7ca00e9bc0..91c42949ac7e 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -4046,7 +4046,7 @@ intel_dp_can_mst(struct intel_dp *intel_dp)
  
  	return i915->display.params.enable_dp_mst &&

intel_dp_mst_source_support(intel_dp) &&
-   drm_dp_read_mst_cap(&intel_dp->aux, intel_dp->dpcd);
+   drm_dp_read_mst_cap(&intel_dp->aux, intel_dp->dpcd) == 
DRM_DP_MST;
  }
  
  static void

@@ -4055,7 +4055,7 @@ intel_dp_configure_mst(struct intel_dp *intel_dp)
struct drm_i915_private *i915 = dp_to_i915(intel_dp);
struct intel_encoder *encoder =
&dp_to_dig_port(intel_dp)->base;
-   bool sink_can_mst = drm_dp_read_mst_cap(&intel_dp->aux, intel_dp->dpcd);
+   bool sink_can_mst = drm_dp_read_mst_cap(&intel_dp->aux, intel_dp->dpcd) 
== DRM_DP_MST;
  
drm_dbg_kms(&i915->drm,

"[ENCODER:%d:%s] MST support: port: %s, sink: %s, modparam: 
%s\n",
diff --git a/drivers/gpu/drm/nouveau/nouveau_dp.c 
b/drivers/gpu/drm/nouveau/nouveau_dp.c
index 7de7707ec6a8..fb06ee17d9e5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dp.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dp.c
@@ -181,7 +181,7 @@ nouveau_dp_probe_dpcd(struct nouveau_connector 
*nv_connector,
if (nouveau_mst) {
mstm = outp->dp.mstm;
if (mstm)
-   mstm->can_mst = drm_dp_read_mst_cap(aux, dpcd);
+   mstm->can_mst = drm_dp_read_mst_cap(aux, dpcd) == 
DRM_DP_MST;
}
  
  	if (nouveau_dp_has_sink_count(connector, outp)) {

diff --git a/include/drm/display/drm_dp_mst_helper.h 
b/include/drm/display/drm_dp_mst_helper.h
index 3ae88a383a41..cbcb49cb6a46 100644
--- a/include/drm/display/drm_dp_mst_helper.h
+++ b/include/drm/display/drm_dp_mst_helper.h
@@ -817,7 +817,28 @@ int drm_dp_mst_topology_mgr_init(struct 
drm_dp_mst_topology_mgr *mgr,
  
  void drm_dp_mst_topology_mgr_destroy(struct drm_dp_mst_topology_mgr *mgr);
  
-bool drm_dp_read_mst_cap(struct drm_dp_aux *aux, const u8 dpcd[DP_RECEIVER_CAP_SIZE]);

+/**
+ * enum drm_dp_mst_mode - sink's MST mode capability
+ */
+enum drm_dp_mst_mode {
+   /**
+* @DRM_DP_SST: The sink does not support MST nor single stream sideband
+* messaging.
+*/
+   DRM_DP_SST,
+   /**
+* @DRM_DP_MST: Sink supports MST, more than one stream and single
+* stream sideband messaging.
+*/
+   DRM_DP

Re: [PATCH v3] nouveau/dmem: handle kcalloc() allocation failure

2024-03-08 Thread Danilo Krummrich

On 3/6/24 06:01, Duoming Zhou wrote:

The kcalloc() in nouveau_dmem_evict_chunk() will return NULL if
physical memory has run out. As a result, dereferencing src_pfns,
dst_pfns or dma_addrs would lead to NULL pointer dereferences.

Moreover, the GPU is going away. If the kcalloc() fails, we would not
be able to evict all pages mapping a chunk. So this patch adds the
__GFP_NOFAIL flag in kcalloc().

Finally, as there is no need to have physically contiguous memory,
this patch switches kcalloc() to kvcalloc() in order to avoid
failing allocations.

Fixes: 249881232e14 ("nouveau/dmem: evict device private memory during release")
Suggested-by: Danilo Krummrich 
Signed-off-by: Duoming Zhou 


Applied to drm-misc-fixes, thanks!


---
Changes in v3:
   - Switch kcalloc() to kvcalloc().

  drivers/gpu/drm/nouveau/nouveau_dmem.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c 
b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 12feecf71e7..6fb65b01d77 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -378,9 +378,9 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
dma_addr_t *dma_addrs;
struct nouveau_fence *fence;
  
-	src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);

-   dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
-   dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
+   src_pfns = kvcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | 
__GFP_NOFAIL);
+   dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | 
__GFP_NOFAIL);
+   dma_addrs = kvcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL | 
__GFP_NOFAIL);
  
  	migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,

npages);
@@ -406,11 +406,11 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
migrate_device_pages(src_pfns, dst_pfns, npages);
nouveau_dmem_fence_done();
migrate_device_finalize(src_pfns, dst_pfns, npages);
-   kfree(src_pfns);
-   kfree(dst_pfns);
+   kvfree(src_pfns);
+   kvfree(dst_pfns);
for (i = 0; i < npages; i++)
dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, 
DMA_BIDIRECTIONAL);
-   kfree(dma_addrs);
+   kvfree(dma_addrs);
  }
  
  void




Re: [PATCH v2] nouveau/dmem: handle kcalloc() allocation failure

2024-03-05 Thread Danilo Krummrich

Hi Duoming,

thanks for sending a V2.

On 3/5/24 15:39, Duoming Zhou wrote:

The kcalloc() in nouveau_dmem_evict_chunk() will return NULL if
physical memory has run out. As a result, dereferencing src_pfns,
dst_pfns or dma_addrs would lead to NULL pointer dereferences.

Moreover, the GPU is going away. If the kcalloc() fails, we would not
be able to evict all pages mapping a chunk. So this patch adds the
__GFP_NOFAIL flag in kcalloc().

Fixes: 249881232e14 ("nouveau/dmem: evict device private memory during release")
Signed-off-by: Duoming Zhou 
---
Changes in v2:
   - Allocate with __GFP_NOFAIL.

  drivers/gpu/drm/nouveau/nouveau_dmem.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c 
b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 12feecf71e7..f5ae9724ee2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -378,9 +378,9 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
dma_addr_t *dma_addrs;
struct nouveau_fence *fence;
  
-	src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);

-   dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
-   dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
+   src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | 
__GFP_NOFAIL);
+   dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | 
__GFP_NOFAIL);
+   dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL | 
__GFP_NOFAIL);


I think we should also switch to kvcalloc(), AFAICS we don't need 
physically contiguous memory.


Sorry I did not mention that in V1 already.

- Danilo
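
(kvcalloc() tries kmalloc() first and falls back to vmalloc(), so it
only needs virtually contiguous memory; the v3 patch above does exactly
this.)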

  
  	migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,

npages);




Re: [PATCH] nouveau/dmem: handle kcalloc() allocation failure

2024-03-05 Thread Danilo Krummrich

Hi Duoming,

On 3/3/24 08:53, Duoming Zhou wrote:

The kcalloc() in nouveau_dmem_evict_chunk() will return NULL if
physical memory has run out. As a result, dereferencing src_pfns,
dst_pfns or dma_addrs would lead to NULL pointer dereferences.

This patch uses stack variables to replace the kcalloc().

Fixes: 249881232e14 ("nouveau/dmem: evict device private memory during release")
Signed-off-by: Duoming Zhou 
---
  drivers/gpu/drm/nouveau/nouveau_dmem.c | 13 +
  1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c 
b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 12feecf71e7..9a578262c6d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -374,13 +374,13 @@ static void
  nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
  {
unsigned long i, npages = range_len(&chunk->pagemap.range) >> 
PAGE_SHIFT;
-   unsigned long *src_pfns, *dst_pfns;
-   dma_addr_t *dma_addrs;
+   unsigned long src_pfns[npages], dst_pfns[npages];
+   dma_addr_t dma_addrs[npages];


Please don't use variable length arrays, this can potentially blow up
the stack.

As a fix I think we should allocate with __GFP_NOFAIL instead.

- Danilo
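
(__GFP_NOFAIL makes the allocator retry indefinitely instead of
returning NULL, which fits here because this teardown path has no way
to recover from an allocation failure; see the v2 and v3 threads above.)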


struct nouveau_fence *fence;
  
-	src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);

-   dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
-   dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
+   memset(src_pfns, 0, npages);
+   memset(dst_pfns, 0, npages);
+   memset(dma_addrs, 0, npages);
  
  	migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,

npages);
@@ -406,11 +406,8 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
migrate_device_pages(src_pfns, dst_pfns, npages);
nouveau_dmem_fence_done();
migrate_device_finalize(src_pfns, dst_pfns, npages);
-   kfree(src_pfns);
-   kfree(dst_pfns);
for (i = 0; i < npages; i++)
dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, 
DMA_BIDIRECTIONAL);
-   kfree(dma_addrs);
  }
  
  void




Re: [PATCH 2/2] drm/nouveau: move more missing UAPI bits

2024-03-05 Thread Danilo Krummrich

On 3/4/24 19:31, Karol Herbst wrote:

Those are already de-facto UAPI, so let's just move it into the uapi
header.

Signed-off-by: Karol Herbst 


Reviewed-by: Danilo Krummrich 


---
  drivers/gpu/drm/nouveau/nouveau_abi16.c | 20 +++-
  drivers/gpu/drm/nouveau/nouveau_abi16.h | 12 
  include/uapi/drm/nouveau_drm.h  | 22 ++
  3 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c 
b/drivers/gpu/drm/nouveau/nouveau_abi16.c
index cd14f993bdd1b..92f9127b284ac 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
@@ -312,11 +312,21 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS)
if (device->info.family >= NV_DEVICE_INFO_V0_KEPLER) {
if (init->fb_ctxdma_handle == ~0) {
switch (init->tt_ctxdma_handle) {
-   case 0x01: engine = NV_DEVICE_HOST_RUNLIST_ENGINES_GR   
 ; break;
-   case 0x02: engine = 
NV_DEVICE_HOST_RUNLIST_ENGINES_MSPDEC; break;
-   case 0x04: engine = 
NV_DEVICE_HOST_RUNLIST_ENGINES_MSPPP ; break;
-   case 0x08: engine = 
NV_DEVICE_HOST_RUNLIST_ENGINES_MSVLD ; break;
-   case 0x30: engine = NV_DEVICE_HOST_RUNLIST_ENGINES_CE   
 ; break;
+   case NOUVEAU_FIFO_ENGINE_GR:
+   engine = NV_DEVICE_HOST_RUNLIST_ENGINES_GR;
+   break;
+   case NOUVEAU_FIFO_ENGINE_VP:
+   engine = NV_DEVICE_HOST_RUNLIST_ENGINES_MSPDEC;
+   break;
+   case NOUVEAU_FIFO_ENGINE_PPP:
+   engine = NV_DEVICE_HOST_RUNLIST_ENGINES_MSPPP;
+   break;
+   case NOUVEAU_FIFO_ENGINE_BSP:
+   engine = NV_DEVICE_HOST_RUNLIST_ENGINES_MSVLD;
+   break;
+   case NOUVEAU_FIFO_ENGINE_CE:
+   engine = NV_DEVICE_HOST_RUNLIST_ENGINES_CE;
+   break;
default:
return nouveau_abi16_put(abi16, -ENOSYS);
}
diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.h 
b/drivers/gpu/drm/nouveau/nouveau_abi16.h
index 11c8c4a80079b..661b901d8ecc9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.h
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.h
@@ -50,18 +50,6 @@ struct drm_nouveau_grobj_alloc {
int  class;
  };
  
-struct drm_nouveau_notifierobj_alloc {

-   uint32_t channel;
-   uint32_t handle;
-   uint32_t size;
-   uint32_t offset;
-};
-
-struct drm_nouveau_gpuobj_free {
-   int  channel;
-   uint32_t handle;
-};
-
  struct drm_nouveau_setparam {
uint64_t param;
uint64_t value;
diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h
index 77d7ff0d5b110..5404d4cfff4c2 100644
--- a/include/uapi/drm/nouveau_drm.h
+++ b/include/uapi/drm/nouveau_drm.h
@@ -73,6 +73,16 @@ struct drm_nouveau_getparam {
__u64 value;
  };
  
+/*

+ * Those are used to support selecting the main engine used on Kepler.
+ * This goes into drm_nouveau_channel_alloc::tt_ctxdma_handle
+ */
+#define NOUVEAU_FIFO_ENGINE_GR  0x01
+#define NOUVEAU_FIFO_ENGINE_VP  0x02
+#define NOUVEAU_FIFO_ENGINE_PPP 0x04
+#define NOUVEAU_FIFO_ENGINE_BSP 0x08
+#define NOUVEAU_FIFO_ENGINE_CE  0x30
+
  struct drm_nouveau_channel_alloc {
__u32 fb_ctxdma_handle;
__u32 tt_ctxdma_handle;
@@ -95,6 +105,18 @@ struct drm_nouveau_channel_free {
__s32 channel;
  };
  
+struct drm_nouveau_notifierobj_alloc {

+   __u32 channel;
+   __u32 handle;
+   __u32 size;
+   __u32 offset;
+};
+
+struct drm_nouveau_gpuobj_free {
+   __s32 channel;
+   __u32 handle;
+};
+
  #define NOUVEAU_GEM_DOMAIN_CPU   (1 << 0)
  #define NOUVEAU_GEM_DOMAIN_VRAM  (1 << 1)
  #define NOUVEAU_GEM_DOMAIN_GART  (1 << 2)




Re: [PATCH 1/2] drm/nouveau: fix stale locked mutex in nouveau_gem_ioctl_pushbuf

2024-03-05 Thread Danilo Krummrich

On 3/4/24 19:31, Karol Herbst wrote:

If VM_BIND is enabled on the client, the legacy submission ioctl can't be
used; however, if a client tries to do so regardless, it will return an
error. In this case the client's mutex remained locked, leading to a
deadlock inside nouveau_drm_postclose or any other nouveau ioctl call.

Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
Cc: Danilo Krummrich 
Signed-off-by: Karol Herbst 


Should add a stable tag for that one, otherwise:

Reviewed-by: Danilo Krummrich 
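
(For context: nouveau_abi16_get() takes the client's mutex and
nouveau_abi16_put() is what releases it again, hence returning through
it in the one-line fix below.)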


---
  drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 49c2bcbef1299..5a887d67dc0e8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -764,7 +764,7 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void 
*data,
return -ENOMEM;
  
  	if (unlikely(nouveau_cli_uvmm(cli)))

-   return -ENOSYS;
+   return nouveau_abi16_put(abi16, -ENOSYS);
  
list_for_each_entry(temp, &abi16->channels, head) {

if (temp->chan->chid == req->channel) {




Re: [PATCH stable v6.7] drm/nouveau: don't fini scheduler before entity flush

2024-03-04 Thread Danilo Krummrich

On 3/4/24 18:55, Greg KH wrote:

On Mon, Mar 04, 2024 at 06:01:46PM +0100, Danilo Krummrich wrote:

Cc:  # v6.7 only

You say 6.7 only, but this commit is in 6.6, so why not 6.6 also?


Good catch, I was sure I originally merged this for 6.7. This fix should indeed
be applied to 6.6 as well. Should have double checked that, my bad.

- Danilo



thanks,

greg k-h





[PATCH stable v6.7] drm/nouveau: don't fini scheduler before entity flush

2024-03-04 Thread Danilo Krummrich
_sched]
[  305.226418]  nouveau_drm_device_fini+0x2a0/0x490 [nouveau]
[  305.226624]  nouveau_drm_remove+0x18e/0x220 [nouveau]
[  305.226832]  pci_device_remove+0xa3/0x1d0
[  305.226836]  device_release_driver_internal+0x379/0x540
[  305.226840]  driver_detach+0xc5/0x180
[  305.226843]  bus_remove_driver+0x11e/0x2a0
[  305.226847]  pci_unregister_driver+0x2a/0x250
[  305.226850]  nouveau_drm_exit+0x1f/0x970 [nouveau]
[  305.227056]  __do_sys_delete_module+0x350/0x580
[  305.227060]  do_syscall_64+0x61/0xe0
[  305.227064]  entry_SYSCALL_64_after_hwframe+0x6e/0x76

[  305.227070] The buggy address belongs to the object at 8881440a8f00
which belongs to the cache kmalloc-128 of size 128
[  305.227073] The buggy address is located 72 bytes inside of
freed 128-byte region [8881440a8f00, 8881440a8f80)

[  305.227078] The buggy address belongs to the physical page:
[  305.227081] page:627efa0a refcount:1 mapcount:0 
mapping: index:0x0 pfn:0x1440a8
[  305.227085] head:627efa0a order:1 entire_mapcount:0 
nr_pages_mapped:0 pincount:0
[  305.227088] flags: 
0x17c840(slab|head|node=0|zone=2|lastcpupid=0x1f)
[  305.227093] page_type: 0x()
[  305.227097] raw: 0017c840 8881000428c0 ea0005b33500 
dead0002
[  305.227100] raw:  00200020 0001 

[  305.227102] page dumped because: kasan: bad access detected

[  305.227106] Memory state around the buggy address:
[  305.227109]  8881440a8e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00
[  305.227112]  8881440a8e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc
[  305.227114] >8881440a8f00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[  305.227117]   ^
[  305.227120]  8881440a8f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc
[  305.227122]  8881440a9000: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc 
fc
[  305.227125] 
==

Cc:  # v6.7 only
Reported-by: Karol Herbst 
Closes: 
https://gist.githubusercontent.com/karolherbst/a20eb0f937a06ed6aabe2ac2ca3d11b5/raw/9cd8b1dc5894872d0eeebbee3dd0fdd28bb576bc/gistfile1.txt
Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 50589f982d1a..75545da9d1e9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -708,10 +708,11 @@ nouveau_drm_device_fini(struct drm_device *dev)
}
mutex_unlock(&drm->clients_lock);
 
-   nouveau_sched_fini(drm);
-
nouveau_cli_fini(&drm->client);
nouveau_cli_fini(&drm->master);
+
+   nouveau_sched_fini(drm);
+
nvif_parent_dtor(&drm->parent);
mutex_destroy(&drm->clients_lock);
kfree(drm);
-- 
2.44.0



Re: [PATCH 1/2] nouveau: lock the client object tree.

2024-03-04 Thread Danilo Krummrich

On 3/1/24 04:42, Dave Airlie wrote:

From: Dave Airlie 

It appears the client object tree has no locking unless I've missed
something else. Fix races around adding/removing client objects,
mostly vram bar mappings.

  4562.099306] general protection fault, probably for non-canonical address 
0x6677ed422bceb80c:  [#1] PREEMPT SMP PTI
[ 4562.099314] CPU: 2 PID: 23171 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
[ 4562.099324] Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO 
WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
[ 4562.099330] RIP: 0010:nvkm_object_search+0x1d/0x70 [nouveau]
[ 4562.099503] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 
00 48 89 f8 48 85 f6 74 39 48 8b 87 a0 00 00 00 48 85 c0 74 12 <48> 8b 48 f8 48 
39 ce 73 15 48 8b 40 10 48 85 c0 75 ee 48 c7 c0 fe
[ 4562.099506] RSP: :a94cc420bbf8 EFLAGS: 00010206
[ 4562.099512] RAX: 6677ed422bceb814 RBX: 98108791f400 RCX: 9810f26b8f58
[ 4562.099517] RDX:  RSI: 9810f26b9158 RDI: 98108791f400
[ 4562.099519] RBP: 9810f26b9158 R08:  R09: 
[ 4562.099521] R10: a94cc420bc48 R11: 0001 R12: 9810f02a7cc0
[ 4562.099526] R13:  R14: 00ff R15: 0007
[ 4562.099528] FS:  7f629c5017c0() GS:98142c70() 
knlGS:
[ 4562.099534] CS:  0010 DS:  ES:  CR0: 80050033
[ 4562.099536] CR2: 7f629a882000 CR3: 00017019e004 CR4: 003706f0
[ 4562.099541] DR0:  DR1:  DR2: 
[ 4562.099542] DR3:  DR6: fffe0ff0 DR7: 0400
[ 4562.099544] Call Trace:
[ 4562.099555]  
[ 4562.099573]  ? die_addr+0x36/0x90
[ 4562.099583]  ? exc_general_protection+0x246/0x4a0
[ 4562.099593]  ? asm_exc_general_protection+0x26/0x30
[ 4562.099600]  ? nvkm_object_search+0x1d/0x70 [nouveau]
[ 4562.099730]  nvkm_ioctl+0xa1/0x250 [nouveau]
[ 4562.099861]  nvif_object_map_handle+0xc8/0x180 [nouveau]
[ 4562.099986]  nouveau_ttm_io_mem_reserve+0x122/0x270 [nouveau]
[ 4562.100156]  ? dma_resv_test_signaled+0x26/0xb0
[ 4562.100163]  ttm_bo_vm_fault_reserved+0x97/0x3c0 [ttm]
[ 4562.100182]  ? __mutex_unlock_slowpath+0x2a/0x270
[ 4562.100189]  nouveau_ttm_fault+0x69/0xb0 [nouveau]
[ 4562.100356]  __do_fault+0x32/0x150
[ 4562.100362]  do_fault+0x7c/0x560
[ 4562.100369]  __handle_mm_fault+0x800/0xc10
[ 4562.100382]  handle_mm_fault+0x17c/0x3e0
[ 4562.100388]  do_user_addr_fault+0x208/0x860
[ 4562.100395]  exc_page_fault+0x7f/0x200
[ 4562.100402]  asm_exc_page_fault+0x26/0x30
[ 4562.100412] RIP: 0033:0x9b9870
[ 4562.100419] Code: 85 a8 f7 ff ff 8b 8d 80 f7 ff ff 89 08 e9 18 f2 ff ff 0f 1f 84 
00 00 00 00 00 44 89 32 e9 90 fa ff ff 0f 1f 84 00 00 00 00 00 <44> 89 32 e9 f8 
f1 ff ff 0f 1f 84 00 00 00 00 00 66 44 89 32 e9 e7
[ 4562.100422] RSP: 002b:7fff9ba2dc70 EFLAGS: 00010246
[ 4562.100426] RAX: 0004 RBX: 0dd65e10 RCX: 00fff000
[ 4562.100428] RDX: 7f629a882000 RSI: 7f629a882000 RDI: 0066
[ 4562.100432] RBP: 7fff9ba2e570 R08:  R09: 000123ddf000
[ 4562.100434] R10: 0001 R11: 0246 R12: 7fff
[ 4562.100436] R13:  R14:  R15: 
[ 4562.100446]  
[ 4562.100448] Modules linked in: nf_conntrack_netbios_ns 
nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib 
nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat 
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c 
nfnetlink cmac bnep sunrpc iwlmvm intel_rapl_msr intel_rapl_common 
snd_sof_pci_intel_cnl x86_pkg_temp_thermal intel_powerclamp 
snd_sof_intel_hda_common mac80211 coretemp snd_soc_acpi_intel_match kvm_intel 
snd_soc_acpi snd_soc_hdac_hda snd_sof_pci snd_sof_xtensa_dsp 
snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof kvm snd_sof_utils 
snd_soc_core snd_hda_codec_realtek libarc4 snd_hda_codec_generic snd_compress 
snd_hda_ext_core vfat fat snd_hda_intel snd_intel_dspcfg irqbypass iwlwifi 
snd_hda_codec snd_hwdep snd_hda_core btusb btrtl mei_hdcp iTCO_wdt rapl mei_pxp 
btintel snd_seq iTCO_vendor_support btbcm snd_seq_device intel_cstate bluetooth 
snd_pcm cfg80211 intel_wmi_thunderbolt wmi_bmof intel_uncore snd_timer mei_me 
snd ecdh_generic i2c_i801
[ 4562.100541]  ecc mei i2c_smbus soundcore rfkill intel_pch_thermal acpi_pad 
zram nouveau drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_gpuvm drm_exec 
mxm_wmi drm_display_helper drm_kms_helper drm crct10dif_pclmul crc32_pclmul 
nvme e1000e crc32c_intel nvme_core ghash_clmulni_intel video wmi 
pinctrl_cannonlake ip6_tables ip_tables fuse
[ 4562.100616] ---[ end trace  ]---

Signed-off-by: Dave Airlie 
Cc: sta...@vger.kernel.org
---
  .../drm/nouveau/include/nvkm/core/client.h|  1 +
  drivers/gpu/drm/nouveau/nvkm/core/client.c|  1 +
  

Re: [PATCH] drm/nouveau: use dedicated wq for fence uevents work

2024-02-23 Thread Danilo Krummrich
On Fri, Feb 23, 2024 at 10:14:53AM +1000, Dave Airlie wrote:
> On Fri, 23 Feb 2024 at 00:45, Danilo Krummrich  wrote:
> >
> > Using the kernel global workqueue to signal fences can lead to
> > unexpected deadlocks. Some other work (e.g. from a different driver)
> > could directly or indirectly depend on this fence to be signaled.
> > However, if the WQ_MAX_ACTIVE limit is reached by waiters, this can
> > prevent the work signaling the fence from running.
> >
> > While this seems fairly unlikely, it's potentially exploitable.
> 
> LGTM
> 
> Reviewed-by: Dave Airlie 
> 
> probably should go into drm-misc-fixes?

Yes, however, 39126abc5e20 went in through drm-fixes directly it seems, since
it's not in drm-misc-fixes.

Guess you want me to cherry-pick 39126abc5e20 to drm-misc-fixes rather than take
this one through drm-fixes as well?

> 
> >
> > Fixes: 39126abc5e20 ("nouveau: offload fence uevents work to workqueue")
> > Signed-off-by: Danilo Krummrich 
> > ---
> >  drivers/gpu/drm/nouveau/nouveau_drm.c   | 13 +++--
> >  drivers/gpu/drm/nouveau/nouveau_drv.h   |  3 +++
> >  drivers/gpu/drm/nouveau/nouveau_fence.c |  3 ++-
> >  drivers/gpu/drm/nouveau/nouveau_fence.h |  2 ++
> >  4 files changed, 18 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
> > b/drivers/gpu/drm/nouveau/nouveau_drm.c
> > index 6f6c31a9937b..6be202081077 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> > @@ -598,9 +598,15 @@ nouveau_drm_device_init(struct drm_device *dev)
> > goto fail_alloc;
> > }
> >
> > +   drm->fence_wq = alloc_workqueue("nouveau_fence_wq", 0, 
> > WQ_MAX_ACTIVE);
> > +   if (!drm->fence_wq) {
> > +   ret = -ENOMEM;
> > +   goto fail_sched_wq;
> > +   }
> > +
> > ret = nouveau_cli_init(drm, "DRM-master", &drm->master);
> > if (ret)
> > -   goto fail_wq;
> > +   goto fail_fence_wq;
> >
> > ret = nouveau_cli_init(drm, "DRM", &drm->client);
> > if (ret)
> > @@ -670,7 +676,9 @@ nouveau_drm_device_init(struct drm_device *dev)
> > nouveau_cli_fini(&drm->client);
> >  fail_master:
> > nouveau_cli_fini(&drm->master);
> > -fail_wq:
> > +fail_fence_wq:
> > +   destroy_workqueue(drm->fence_wq);
> > +fail_sched_wq:
> > destroy_workqueue(drm->sched_wq);
> >  fail_alloc:
> > nvif_parent_dtor(&drm->parent);
> > @@ -725,6 +733,7 @@ nouveau_drm_device_fini(struct drm_device *dev)
> >
> > nouveau_cli_fini(&drm->client);
> > nouveau_cli_fini(&drm->master);
> > +   destroy_workqueue(drm->fence_wq);
> > destroy_workqueue(drm->sched_wq);
> > nvif_parent_dtor(&drm->parent);
> > mutex_destroy(&drm->clients_lock);
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
> > b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > index 8a6d94c8b163..b43619a213a4 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_drv.h
> > +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > @@ -261,6 +261,9 @@ struct nouveau_drm {
> > /* Workqueue used for channel schedulers. */
> > struct workqueue_struct *sched_wq;
> >
> > +   /* Workqueue used to signal fences. */
> > +   struct workqueue_struct *fence_wq;
> > +
> > /* context for accelerated drm-internal operations */
> > struct nouveau_channel *cechan;
> > struct nouveau_channel *channel;
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
> > b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > index 93f08f9479d8..c3ea3cd933cd 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > @@ -174,7 +174,7 @@ static int
> >  nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, 
> > u32 repc)
> >  {
> > struct nouveau_fence_chan *fctx = container_of(event, 
> > typeof(*fctx), event);
> > -   schedule_work(>uevent_work);
> > +   queue_work(fctx->wq, >uevent_work);
> > return NVIF_EVENT_KEEP;
> >  }
> >
> > @@ -194,6 +194,7 @@ nouveau_fence_context_new(struct nouveau_channel *chan, 
> > struct nouveau_fence_cha
> > INIT_LIST_HEAD(>pending);
> > spin_lock_init(>lock);
> > fctx->context = chan->drm->runl[chan->runlist].context_base + chan->chid;

[PATCH] drm/nouveau: use dedicated wq for fence uevents work

2024-02-22 Thread Danilo Krummrich
Using the kernel global workqueue to signal fences can lead to
unexpected deadlocks. Some other work (e.g. from a different driver)
could directly or indirectly depend on this fence to be signaled.
However, if the WQ_MAX_ACTIVE limit is reached by waiters, this can
prevent the work signaling the fence from running.

While this seems fairly unlikely, it's potentially exploitable.

Fixes: 39126abc5e20 ("nouveau: offload fence uevents work to workqueue")
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/nouveau/nouveau_drm.c   | 13 +++--
 drivers/gpu/drm/nouveau/nouveau_drv.h   |  3 +++
 drivers/gpu/drm/nouveau/nouveau_fence.c |  3 ++-
 drivers/gpu/drm/nouveau/nouveau_fence.h |  2 ++
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 6f6c31a9937b..6be202081077 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -598,9 +598,15 @@ nouveau_drm_device_init(struct drm_device *dev)
goto fail_alloc;
}
 
+   drm->fence_wq = alloc_workqueue("nouveau_fence_wq", 0, WQ_MAX_ACTIVE);
+   if (!drm->fence_wq) {
+   ret = -ENOMEM;
+   goto fail_sched_wq;
+   }
+
ret = nouveau_cli_init(drm, "DRM-master", >master);
if (ret)
-   goto fail_wq;
+   goto fail_fence_wq;
 
ret = nouveau_cli_init(drm, "DRM", >client);
if (ret)
@@ -670,7 +676,9 @@ nouveau_drm_device_init(struct drm_device *dev)
nouveau_cli_fini(>client);
 fail_master:
nouveau_cli_fini(>master);
-fail_wq:
+fail_fence_wq:
+   destroy_workqueue(drm->fence_wq);
+fail_sched_wq:
destroy_workqueue(drm->sched_wq);
 fail_alloc:
nvif_parent_dtor(>parent);
@@ -725,6 +733,7 @@ nouveau_drm_device_fini(struct drm_device *dev)
 
nouveau_cli_fini(>client);
nouveau_cli_fini(>master);
+   destroy_workqueue(drm->fence_wq);
destroy_workqueue(drm->sched_wq);
nvif_parent_dtor(>parent);
mutex_destroy(>clients_lock);
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 8a6d94c8b163..b43619a213a4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -261,6 +261,9 @@ struct nouveau_drm {
/* Workqueue used for channel schedulers. */
struct workqueue_struct *sched_wq;
 
+   /* Workqueue used to signal fences. */
+   struct workqueue_struct *fence_wq;
+
/* context for accelerated drm-internal operations */
struct nouveau_channel *cechan;
struct nouveau_channel *channel;
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 93f08f9479d8..c3ea3cd933cd 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -174,7 +174,7 @@ static int
 nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, u32 
repc)
 {
struct nouveau_fence_chan *fctx = container_of(event, typeof(*fctx), 
event);
-   schedule_work(>uevent_work);
+   queue_work(fctx->wq, >uevent_work);
return NVIF_EVENT_KEEP;
 }
 
@@ -194,6 +194,7 @@ nouveau_fence_context_new(struct nouveau_channel *chan, 
struct nouveau_fence_cha
INIT_LIST_HEAD(>pending);
spin_lock_init(>lock);
fctx->context = chan->drm->runl[chan->runlist].context_base + 
chan->chid;
+   fctx->wq = chan->drm->fence_wq;
 
if (chan == chan->drm->cechan)
strcpy(fctx->name, "copy engine channel");
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h 
b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 8bc065acfe35..bc13110bdfa4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -44,7 +44,9 @@ struct nouveau_fence_chan {
u32 context;
char name[32];
 
+   struct workqueue_struct *wq;
struct work_struct uevent_work;
+
struct nvif_event event;
int notify_ref, dead, killed;
 };

base-commit: 1f4c6f11a557642505e5f403e0dfabbaff9c529a
-- 
2.43.0



Re: [PATCH 1/2] drm/nouveau: don't fini scheduler if not initialized

2024-02-20 Thread Danilo Krummrich

On 2/20/24 20:45, Timur Tabi wrote:

On Mon, 2024-02-19 at 10:32 +0100, Danilo Krummrich wrote:

Looks like I spoke too soon, I just hit the problem with the drm-next tree.


Did you apply the patch to drm-next?


Ugh, you're right.  I don't know how I got confused, but I could have sworn that I 
saw your two patches in drm-next, but they are not there.

Sorry for the noise.


No worries, thanks for testing! :-)



Re: [PATCH] nouveau: offload fence uevents work to workqueue

2024-02-19 Thread Danilo Krummrich

On 2/16/24 17:41, Daniel Vetter wrote:

On Tue, Feb 13, 2024 at 06:39:20PM +0100, Danilo Krummrich wrote:

On 2/9/24 19:52, Daniel Vetter wrote:

On Fri, Feb 09, 2024 at 06:41:32PM +0100, Danilo Krummrich wrote:

On 2/6/24 15:03, Daniel Vetter wrote:

On Mon, Feb 05, 2024 at 11:00:04PM +0100, Danilo Krummrich wrote:

On 2/5/24 22:08, Dave Airlie wrote:

On Tue, 6 Feb 2024 at 02:22, Danilo Krummrich  wrote:


On 1/29/24 02:50, Dave Airlie wrote:

From: Dave Airlie 

This should break the deadlock between the fctx lock and the irq lock.

This offloads the processing of the work from the irq into a workqueue.

Signed-off-by: Dave Airlie 


Nouveau's scheduler uses a dedicated wq, hence from this perspective it's
safe deferring fence signalling to the kernel global wq. However, I wonder
if we could create deadlocks by building dependency chains into other
drivers / kernel code that, by chance, makes use of the kernel global wq as
well.

Admittedly, even if we could, it's gonna be extremely unlikely given that
WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq.

Also, do we need to CC stable?


I pushed this to Linus at the end of last week, since the hangs in 6.7
take out the complete system and I wanted it in stable.

It might be safer to use a dedicated wq, is the concern someone is
waiting on a fence in a workqueue somewhere else so we will never
signal it?


Yes, if some other work is waiting for this fence (or something else in the same
dependency chain) to signal, it can prevent executing the work signaling this 
fence,
in case both are scheduled on the same wq. As mentioned, with the kernel global 
wq
this would be rather unlikely to lead to an actual stall with WQ_MAX_ACTIVE == 
512,
but formally the race condition exists. I guess a malicious attacker could try 
to
intentionally push jobs directly or indirectly depending on this fence to a 
driver
which queues them up on a scheduler using the kernel global wq.
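
To make that concrete, a minimal sketch (helper names hypothetical) of two
independent users of the kernel global wq forming such a chain:

	/* Driver A: signals its fence from a work item on the global wq. */
	static void a_signal_work(struct work_struct *work)
	{
		dma_fence_signal(a_lookup_fence());	/* hypothetical lookup */
	}

	/* Driver B: its work items block on A's fence. If WQ_MAX_ACTIVE
	 * (512) such waiters saturate the global wq before a_signal_work()
	 * gets to run, the signaling work can never execute -- deadlock. */
	static void b_wait_work(struct work_struct *work)
	{
		dma_fence_wait(a_lookup_fence(), false);
	}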


I think if you add dma_fence_signalling annotations (aside, there's some
patch from iirc Thomas Hellstrom to improve them and cut down on some
false positives, but I lost track) then I think you won't get any splats
because the wq subsystem assumes that WQ_MAX_ACTIVE is close enough to
infinity to not matter.
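
For reference, such an annotation wraps the critical section; a minimal
sketch, assuming the work function below is in the fence signalling
critical path (signal_pending_fences() is hypothetical):

	static void fence_signal_work(struct work_struct *work)
	{
		bool cookie = dma_fence_begin_signalling();

		/* Lockdep now flags allocations and other forbidden
		 * dependencies taken in here. */
		signal_pending_fences();

		dma_fence_end_signalling(cookie);
	}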


As mentioned, for the kernel global wq it's 512. (Intentionally) feeding the 
kernel
with enough jobs to provoke a deadlock doesn't seem impossible to me.

I think it'd be safer to just establish not to use the kernel global wq for 
executing
work in the fence signalling critical path.

We could also run into similar problems with a dedicated wq, e.g. when drivers 
share
a wq between drm_gpu_scheduler instances (see [1]), however, I'm not sure we 
can catch
that with lockdep.


I think if you want to fix it perfectly you'd need to set the max number
of wq to the number of engines (or for dynamic/fw scheduled engines to the
number of contexts) you have. Or whatever limit to the number of parallel
timelines there is.

I guess this would need a new wq function to update? drm/sched code could
be able to set that for drivers, so drivers cannot get this wrong.


Not sure I can follow. The scheduler instance might be per context and bind
queue. In this case it gets the shared wq passed, but doesn't know how many
other scheduler instances share the same one.
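
For reference, the sharing pattern from [1], roughly (a sketch, arguments
abbreviated):

	/* One wq, many scheduler instances. Each instance only gets the
	 * wq pointer and cannot tell how many peers share its
	 * WQ_MAX_ACTIVE concurrency budget. */
	wq = alloc_workqueue("shared_sched_wq", 0, WQ_MAX_ACTIVE);
	nouveau_sched_init(&cli_a->sched, drm, wq, credit_limit);
	nouveau_sched_init(&cli_b->sched, drm, wq, credit_limit);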


Yeah that's why maybe more of that logic should be in the drm/sched code
instead of drivers just cleverly using what's there ...


Agree, that's gonna be a huge design change though.




Additionally, there might be drivers not using the DRM scheduler for bind
queues at all (I think Xe does not).


Uh ... maybe we should do this the same across all drivers? But I also
thought that Xe was flat-out synchronous and only had an out-fence since
you need a userspace thread in mesa for vk semantics anyway.

If xe hand-rolled a scheduler I'm not going to be very amused.


I really don't know the details, but there are similarities at least.

There is the the rebind work, which seems to be called in some VM_BIND cases and
in the context of an EXEC ioctl and seems to signal a fence. It seems valid to
not stuff this into the scheduler.

There are also cases like this one, where we have fence signalling critical code
in wqs outside the context of a scheduler instance.




If we don't do something like that then I'm not sure there's really much
benefit - instead of carefully timing 512 dma_fence on the global wq you
just need to create a pile of contexts (at least on intel's guc that's
absolutely no issue) and then careful time them.


Well, that's true. I'd still argue that there is a slight difference. From a
drivers isolated perspective using the global kernel wq might be entirely fine,
as in this patch. However, in combination with another driver doing the same
thing, things can blow up. For the case you illustrated it's at least possible
to spot it from a driver's perspective.



I still feel like we have bigger fish to fry ... But might be worth the
effort to at least make the parallel timeline limit a lot more explicit?

Re: [PATCH 1/2] drm/nouveau: don't fini scheduler if not initialized

2024-02-19 Thread Danilo Krummrich

On 2/15/24 00:48, Timur Tabi wrote:

On Fri, 2024-02-02 at 17:14 +, Timur Tabi wrote:

On Fri, 2024-02-02 at 01:05 +0100, Danilo Krummrich wrote:

nouveau_abi16_ioctl_channel_alloc() and nouveau_cli_init() simply call
their corresponding *_fini() counterpart. This can lead to
nouveau_sched_fini() being called without struct nouveau_sched ever
being initialized in the first place.


Thanks, I've confirmed that these patches do fix the problem.


Looks like I spoke too soon, I just hit the problem with the drm-next tree.


Did you apply the patch to drm-next?



I'm able to repro the problem by having r535_gsp_init() return an error.
r535_gsp_rpc_poll returns -EINVAL (I'm testing my own GSP-RM build) and
nouveau_sched_fini() is called even though nouveau_sched_init() was never
called.




Re: [PATCH] drm/nouveau/mmu/r535: uninitialized variable in r535_bar_new_()

2024-02-16 Thread Danilo Krummrich

On 2/13/24 19:09, Dan Carpenter wrote:

If gf100_bar_new_() fails then "bar" is not initialized.

Fixes: 5bf0257136a2 ("drm/nouveau/mmu/r535: initial support")
Signed-off-by: Dan Carpenter 


Applied to drm-misc-fixes, thanks!


---
It looks like this was intended to handle a failure from the "rm" func
but "rm" can't actually fail so it's easier to write the error handling
for the code as-is.

  drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c
index 4135690326f4..3a30bea30e36 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/r535.c
@@ -168,12 +168,11 @@ r535_bar_new_(const struct nvkm_bar_func *hw, struct 
nvkm_device *device,
rm->flush = r535_bar_flush;
  
  	ret = gf100_bar_new_(rm, device, type, inst, );

-   *pbar = bar;
if (ret) {
-   if (!bar)
-   kfree(rm);
+   kfree(rm);
return ret;
}
+   *pbar = bar;
  
  	bar->flushBAR2PhysMode = ioremap(device->func->resource_addr(device, 3), PAGE_SIZE);

if (!bar->flushBAR2PhysMode)




Re: [PATCH] nouveau: fix function cast warnings

2024-02-16 Thread Danilo Krummrich

On 2/13/24 10:57, Arnd Bergmann wrote:

From: Arnd Bergmann 

clang-16 warns about casting between incompatible function types:

drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c:161:10: error: cast from 
'void (*)(const struct firmware *)' to 'void (*)(void *)' converts to 
incompatible function type [-Werror,-Wcast-function-type-strict]
   161 | .fini = (void(*)(void *))release_firmware,

This one was done to use the generic shadow_fw_release() function as a
callback for struct nvbios_source. Change it to use the same prototype
as the other five instances, with a trivial helper function that actually
calls release_firmware.

Fixes: 70c0f263cc2e ("drm/nouveau/bios: pull in basic vbios subdev, more to come 
later")
Signed-off-by: Arnd Bergmann 


Applied to drm-misc-fixes, thanks!


---
  drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c
index 19188683c8fc..8c2bf1c16f2a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c
@@ -154,11 +154,17 @@ shadow_fw_init(struct nvkm_bios *bios, const char *name)
return (void *)fw;
  }
  
+static void

+shadow_fw_release(void *fw)
+{
+   release_firmware(fw);
+}
+
  static const struct nvbios_source
  shadow_fw = {
.name = "firmware",
.init = shadow_fw_init,
-   .fini = (void(*)(void *))release_firmware,
+   .fini = shadow_fw_release,
.read = shadow_fw_read,
.rw = false,
  };
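
The general pattern, for reference: calling a function through a pointer of
an incompatible type is undefined behavior even when the ABI happens to line
up, so the idiomatic fix is a trivial wrapper with the exact prototype the
callback expects. A generic sketch (release_thing() is hypothetical):

	/* Broken: incompatible function pointer cast, which is what
	 * -Wcast-function-type-strict flags. */
	.fini = (void (*)(void *))release_thing,

	/* Fixed: a trivial wrapper with the expected prototype. */
	static void thing_fini(void *obj)
	{
		release_thing(obj);
	}
	/* ... */
	.fini = thing_fini,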




Re: [PATCH 2/2] [v3] drm/nouveau: expose GSP-RM logging buffers via debugfs

2024-02-13 Thread Danilo Krummrich

On 2/13/24 18:10, Timur Tabi wrote:

On Tue, 2024-02-13 at 17:57 +0100, Danilo Krummrich wrote:

+   struct debugfs_blob_wrapper blob_init;
+   struct debugfs_blob_wrapper blob_intr;
+   struct debugfs_blob_wrapper blob_rm;
+   struct debugfs_blob_wrapper blob_pmu;
+   struct dentry *debugfs_logging_dir;


I think we should not create those from within the nvkm layer, but rather pass
them down through nvkm_device_pci_new().


Should they be created in nvkm_device_pci_new() also, even though we have no 
idea whether GSP is involved at that point?


We can pass some structure to nvkm_device_pci_new() where GSP sets the pointers 
to
the buffers and maybe a callback to clean them up. Once nvkm_device_pci_new() 
returns
we'd see if something has been set or not. If so, we can go ahead and create the
debugfs nodes.
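
Roughly, such a structure could look like this (purely a sketch; none of
these names exist in the tree):

	struct nvkm_gsp_log {
		/* Filled in by the GSP code during device construction. */
		void *loginit, *logintr, *logrm, *logpmu;

		/* Cleanup callback, defined by the GSP code, to be called
		 * by the driver layer on module exit. */
		void (*fini)(struct nvkm_gsp_log *log);
	};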





Lifecycle wise I think we should ensure that removing the Nouveau kernel module
also cleans up those buffers. Even though keep-gsp-logging is considered unsafe,
we shouldn't leak memory.


I agree, but then there needs to be some way to keep these debugfs entries 
until the driver unloads.  I don't know how to do that without creating some 
ugly global variables.


I fear that might be the only option. However, I still prefer a global list 
over a
memory leak.





For instance, can we allocate corresponding buffers in the driver layer, copy
things over and keep those buffers until nouveau_drm_exit()? This would also
avoid exposing those DMA buffers via debugfs.


The whole point behind this patch is to expose the buffers via debugfs.  How 
else should they be exposed?



I think I only thought about the fail case where we could just copy over the 
data from
the DMA buffer to another buffer and expose that one instead. However, that 
doesn't
work if GSP loads successfully.

Hence, as mentioned above, we might just want to have some structure that we
can add to the global list, containing the pointers to the DMA buffers and
maybe a cleanup callback, defined by the GSP code, that we can simply call
from nouveau_drm_exit().



Re: [PATCH] nouveau: offload fence uevents work to workqueue

2024-02-13 Thread Danilo Krummrich

On 2/9/24 19:52, Daniel Vetter wrote:

On Fri, Feb 09, 2024 at 06:41:32PM +0100, Danilo Krummrich wrote:

On 2/6/24 15:03, Daniel Vetter wrote:

On Mon, Feb 05, 2024 at 11:00:04PM +0100, Danilo Krummrich wrote:

On 2/5/24 22:08, Dave Airlie wrote:

On Tue, 6 Feb 2024 at 02:22, Danilo Krummrich  wrote:


On 1/29/24 02:50, Dave Airlie wrote:

From: Dave Airlie 

This should break the deadlock between the fctx lock and the irq lock.

This offloads the processing of the work from the irq into a workqueue.

Signed-off-by: Dave Airlie 


Nouveau's scheduler uses a dedicated wq, hence from this perspective it's
safe deferring fence signalling to the kernel global wq. However, I wonder
if we could create deadlocks by building dependency chains into other
drivers / kernel code that, by chance, makes use of the kernel global wq as
well.

Admittedly, even if we could, it's gonna be extremely unlikely given that
WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq.

Also, do we need to CC stable?


I pushed this to Linus at the end of last week, since the hangs in 6.7
take out the complete system and I wanted it in stable.

It might be safer to use a dedicated wq, is the concern someone is
waiting on a fence in a workqueue somewhere else so we will never
signal it?


Yes, if some other work is waiting for this fence (or something else in the same
dependency chain) to signal, it can prevent executing the work signaling this 
fence,
in case both are scheduled on the same wq. As mentioned, with the kernel global 
wq
this would be rather unlikely to lead to an actual stall with WQ_MAX_ACTIVE == 
512,
but formally the race condition exists. I guess a malicious attacker could try 
to
intentionally push jobs directly or indirectly depending on this fence to a 
driver
which queues them up on a scheduler using the kernel global wq.


I think if you add dma_fence_signalling annotations (aside, there's some
patch from iirc Thomas Hellstrom to improve them and cut down on some
false positives, but I lost track) then I think you won't get any splats
because the wq subsystem assumes that WQ_MAX_ACTIVE is close enough to
infinity to not matter.


As mentioned, for the kernel global wq it's 512. (Intentionally) feeding the 
kernel
with enough jobs to provoke a deadlock doesn't seem impossible to me.

I think it'd be safer to just establish not to use the kernel global wq for 
executing
work in the fence signalling critical path.

We could also run into similar problems with a dedicated wq, e.g. when drivers 
share
a wq between drm_gpu_scheduler instances (see [1]), however, I'm not sure we 
can catch
that with lockdep.


I think if you want to fix it perfectly you'd need to set the max number
of wq to the number of engines (or for dynamic/fw scheduled engines to the
number of contexts) you have. Or whatever limit to the number of parallel
timelines there is.

I guess this would need a new wq function to update? drm/sched code could
be able to set that for drivers, so drivers cannot get this wrong.


Not sure I can follow. The scheduler instance might be per context and bind
queue. In this case it gets the shared wq passed, but doesn't know how many
other scheduler instances share the same one.

Additionally, there might be drivers not using the DRM scheduler for bind
queues at all (I think Xe does not).



If we don't do something like that then I'm not sure there's really much
benefit - instead of carefully timing 512 dma_fence on the global wq you
just need to create a pile of contexts (at least on intel's guc that's
absolutely no issue) and then careful time them.


Well, that's true. I'd still argue that there is a slight difference. From a
drivers isolated perspective using the global kernel wq might be entirely fine,
as in this patch. However, in combination with another driver doing the same
thing, things can blow up. For the case you illustrated it's at least possible
to spot it from a driver's perspective.



I still feel like we have bigger fish to fry ... But might be worth the
effort to at least make the parallel timeline limit a lot more explicit?


I agree, and it'd be great if we can find a solution such that these bugs can be detected
systematically (e.g. through lockdep), but maybe we can start to at least
document that we should never use the kernel global wq and where we need to be
careful in sharing driver wqs.

- Danilo



Cheers, Sima



[1] 
https://elixir.bootlin.com/linux/v6.8-rc3/source/drivers/gpu/drm/nouveau/nouveau_drm.c#L313



I'm not sure we should care differently, but I guess it'd be good to
annotate it all in case the wq subsystem's idea of how much such deadlocks
are real changes.

Also Teo is on a mission to get rid of all the global wq flushes, so there
should also be no source of deadlocks from that kind of cross-driver
dependency. Or at least shouldn't be in the future, I'm not sure it all
landed.
-Sima








Re: [PATCH 1/2] drm/scheduler: improve GPU scheduler documentation v2

2024-02-13 Thread Danilo Krummrich

Hi Christian,

What's the status of this effort? Was there ever a follow-up?

- Danilo

On 11/16/23 23:23, Danilo Krummrich wrote:

Hi Christian,

Thanks for sending an update of this patch!

On Thu, Nov 16, 2023 at 03:15:46PM +0100, Christian König wrote:

Start to improve the scheduler document. Especially document the
lifetime of each of the objects as well as the restrictions around
DMA-fence handling and userspace compatibility.

v2: Some improvements suggested by Danilo, add section about error
 handling.

Signed-off-by: Christian König 
---
  Documentation/gpu/drm-mm.rst   |  36 +
  drivers/gpu/drm/scheduler/sched_main.c | 174 +
  2 files changed, 188 insertions(+), 22 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index acc5901ac840..112463fa9f3a 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -552,12 +552,48 @@ Overview
  .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
 :doc: Overview
  
+Job Object

+--
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Job Object
+
+Entity Object
+-
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Entity Object
+
+Hardware Fence Object
+-
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Hardware Fence Object
+
+Scheduler Fence Object
+--
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Scheduler Fence Object
+
+Scheduler and Run Queue Objects
+---
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Scheduler and Run Queue Objects
+
  Flow Control
  
  
  .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c

 :doc: Flow Control
  
+Error and Timeout handling

+--
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Error and Timeout handling
+
  Scheduler Function References
  -
  
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c

index 044a8c4875ba..026123497b0e 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -24,28 +24,122 @@
  /**
   * DOC: Overview
   *
- * The GPU scheduler provides entities which allow userspace to push jobs
- * into software queues which are then scheduled on a hardware run queue.
- * The software queues have a priority among them. The scheduler selects the 
entities
- * from the run queue using a FIFO. The scheduler provides dependency handling
- * features among jobs. The driver is supposed to provide callback functions 
for
- * backend operations to the scheduler like submitting a job to hardware run 
queue,
- * returning the dependencies of a job etc.
- *
- * The organisation of the scheduler is the following:
- *
- * 1. Each hw run queue has one scheduler
- * 2. Each scheduler has multiple run queues with different priorities
- *(e.g., HIGH_HW,HIGH_SW, KERNEL, NORMAL)
- * 3. Each scheduler run queue has a queue of entities to schedule
- * 4. Entities themselves maintain a queue of jobs that will be scheduled on
- *the hardware.
- *
- * The jobs in a entity are always scheduled in the order that they were 
pushed.
- *
- * Note that once a job was taken from the entities queue and pushed to the
- * hardware, i.e. the pending queue, the entity must not be referenced anymore
- * through the jobs entity pointer.
+ * The GPU scheduler implements some logic to decide which command submission
+ * to push next to the hardware. Another major use case of the GPU scheduler
+ * is to enforce correct driver behavior around those command submissions.
+ * Because of this it's also used by drivers which don't need the actual
+ * scheduling functionality.


This reads a bit like we're already right in the middle of the documentation.
I'd propose to start with something like "The DRM GPU scheduler serves as a
common component intended to help drivers to manage command submissions." And
then mention the different solutions the scheduler provides, e.g. ordering of
command submissions, dma-fence handling in the context of command submissions,
etc.

Also, I think it would be good to give a rough overview of the topology of the
scheduler's components. And since you already mention that the component is
"also used by drivers which don't need the actual scheduling functionality", I'd
also mention the special case of a single entity and a single run-queue per
scheduler.
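
As an illustration of that topology (schematic only, priorities exemplary):

	drm_gpu_scheduler (one per hw run queue)
	 +-- run queue [KERNEL] --> entity --> entity --> ...
	 +-- run queue [HIGH]   --> entity --> ...
	 +-- run queue [NORMAL] --> entity --> ...
	                             |
	                             +-- per-entity job queue (FIFO,
	                                 submission order)

	special case ("no actual scheduling needed"):
	drm_gpu_scheduler --> one run queue --> one entity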


+ *
+ * All callbacks the driver needs to implement are restricted by DMA-fence
+ * signaling rules to guarantee deadlock free forward progress. This especially
+ * means that for normal operation no memory can be allocated in a callback.
+ * All memory which is needed for pushing the job to the hardware must be
+ * allocated before arming a job. It also means that no locks can be taken
+ 

Re: [PATCH 2/2] [v3] drm/nouveau: expose GSP-RM logging buffers via debugfs

2024-02-13 Thread Danilo Krummrich

On 2/12/24 22:15, Timur Tabi wrote:

The LOGINIT, LOGINTR, LOGRM, and LOGPMU buffers are circular buffers
that have printf-like logs from GSP-RM and PMU encoded in them.

LOGINIT, LOGINTR, and LOGRM are allocated by Nouveau and their DMA
addresses are passed to GSP-RM during initialization.  The buffers are
required for GSP-RM to initialize properly.

LOGPMU is also allocated by Nouveau, but its contents are updated
when Nouveau receives an NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPC from
GSP-RM.  Nouveau then copies the RPC to the buffer.

The messages are encoded as an array of variable-length structures that
contain the parameters to an NV_PRINTF call.  The format string and
parameter count are stored in a special ELF image that contains only
logging strings.  This image is not currently shipped with the Nvidia
driver.

There are two methods to extract the logs.

OpenRM tries to load the logging ELF, and if present, parses the log
buffers in real time and outputs the strings to the kernel console.

Alternatively, and this is the method used by this patch, the buffers
can be exposed to user space, and a user-space tool (along with the
logging ELF image) can parse the buffer and dump the logs.

This method has the advantage that it allows the buffers to be parsed
even when the logging ELF file is not available to the user.  However,
it has the disadvantage that the debugfs entries need to remain until the
driver is unloaded.

The buffers are exposed via debugfs.  The debugfs entries must be
created before GSP-RM is started, to ensure that they are available
during GSP-RM initialization.

If GSP-RM fails to initialize, then Nouveau immediately shuts down
the GSP interface.  This would normally also deallocate the logging
buffers, thereby preventing the user from capturing the debug logs.
To avoid this, the keep-gsp-logging command line parameter can be
specified.  This parameter is marked as *unsafe* (thereby tainting the
kernel) because the DMA buffer and debugfs entries are never
deallocated, even if the driver unloads.  This gives the user the
time to capture the logs, but it also means that resources can only
be recovered by a reboot.

An end-user can capture the logs using the following commands:

 cp /sys/kernel/debug/dri/<id>/loginit loginit
 cp /sys/kernel/debug/dri/<id>/logrm logrm
 cp /sys/kernel/debug/dri/<id>/logintr logintr
 cp /sys/kernel/debug/dri/<id>/logpmu logpmu

where <id> is the PCI ID of the GPU (e.g. 0000:65:00.0).  If
keep-gsp-logging is specified, then the <id> is the same but with
-debug appended (e.g. 0000:65:00.0-debug).

Since LOGPMU is not needed for normal GSP-RM operation, it is only
created if debugfs is available.  Otherwise, the
NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPCs are ignored.

Signed-off-by: Timur Tabi 
---
v3: reworked r535_gsp_libos_debugfs_init, rebased for drm-next

  .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |  12 +
  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c| 215 +-
  2 files changed, 223 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h 
b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
index 3fbc57b16a05..2ee44bdf8be7 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
@@ -5,6 +5,8 @@
  #include 
  #include 
  
+#include 

+
  #define GSP_PAGE_SHIFT 12
  #define GSP_PAGE_SIZE  BIT(GSP_PAGE_SHIFT)
  
@@ -217,6 +219,16 @@ struct nvkm_gsp {
  
  	/* The size of the registry RPC */

size_t registry_rpc_size;
+
+   /*
+* Logging buffers in debugfs.  The wrapper objects need to remain
+* in memory until the dentry is deleted.
+*/
+   struct debugfs_blob_wrapper blob_init;
+   struct debugfs_blob_wrapper blob_intr;
+   struct debugfs_blob_wrapper blob_rm;
+   struct debugfs_blob_wrapper blob_pmu;
+   struct dentry *debugfs_logging_dir;


I think we should not create those from within the nvkm layer, but rather pass
them down through nvkm_device_pci_new().

Lifecycle wise I think we should ensure that removing the Nouveau kernel module
also cleans up those buffers. Even though keep-gsp-logging is considered unsafe,
we shouldn't leak memory.

For instance, can we allocate corresponding buffers in the driver layer, copy
things over and keep those buffers until nouveau_drm_exit()? This would also
avoid exposing those DMA buffers via debugfs.
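
For context, each debugfs_blob_wrapper above is wired up with a single
debugfs_create_blob() call; a minimal sketch, where the log buffer field
names are assumptions:

	gsp->blob_init.data = gsp->loginit.data;
	gsp->blob_init.size = gsp->loginit.size;
	debugfs_create_blob("loginit", 0444, gsp->debugfs_logging_dir,
			    &gsp->blob_init);

This is also why the wrapper objects need to stay alive until the dentry
is deleted.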


  };
  
  static inline bool

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 86b62c7e1229..56209bf81360 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -26,6 +26,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include 
  
@@ -1979,6 +1980,196 @@ r535_gsp_rmargs_init(struct nvkm_gsp *gsp, bool resume)

return 0;
  }
  
+#define NV_GSP_MSG_EVENT_UCODE_LIBOS_CLASS_PMU		0xf3d722

+
+/**
+ * 

Re: [PATCH 1/2] [v3] nouveau: add command-line GSP-RM registry support

2024-02-13 Thread Danilo Krummrich

On 2/12/24 22:15, Timur Tabi wrote:

Add the NVreg_RegistryDwords command line parameter, which allows
specifying additional registry keys to be sent to GSP-RM.  This
allows additional configuration, debugging, and experimentation
with GSP-RM, which uses these keys to alter its behavior.

Note that these keys are passed as-is to GSP-RM, and Nouveau does
not parse them.  This is in contrast to the Nvidia driver, which may
parse some of the keys to configure some functionality in concert with
GSP-RM.  Therefore, any keys which also require action by the driver
may not function correctly when passed by Nouveau.  Caveat emptor.

The name and format of NVreg_RegistryDwords is the same as used by
the Nvidia driver, to maintain compatibility.

Signed-off-by: Timur Tabi 
---
v3: rebased to drm-next

  .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |   6 +
  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c| 279 --
  2 files changed, 261 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h 
b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
index 6f5d376d8fcc..3fbc57b16a05 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
@@ -211,6 +211,12 @@ struct nvkm_gsp {
struct mutex mutex;;
struct idr idr;
} client_id;
+
+   /* A linked list of registry items. The registry RPC will be built from 
it. */
+   struct list_head registry_list;
+
+   /* The size of the registry RPC */
+   size_t registry_rpc_size;
  };
  
  static inline bool

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 1c46e3f919be..86b62c7e1229 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -54,6 +54,8 @@
  #include 
  
  #include 

+#include 
+#include 
  
  #define GSP_MSG_MIN_SIZE GSP_PAGE_SIZE

  #define GSP_MSG_MAX_SIZE GSP_PAGE_MIN_SIZE * 16
@@ -1082,51 +1084,280 @@ r535_gsp_rpc_unloading_guest_driver(struct nvkm_gsp 
*gsp, bool suspend)
return nvkm_gsp_rpc_wr(gsp, rpc, true);
  }
  
+struct registry_list_entry {

+   struct list_head list;


Nit: 'head' or 'entry' might be more suitable.


+   size_t name_len;
+   u32 type;


I prefer to represent type as enum or use a define for '1 = integer' instead.
This also gets rid of the need to explicitly explain its meaning in the
documentation of add_registry().
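
For instance (sketch only; name and value chosen to mirror the
'1 = integer' encoding):

	enum registry_type {
		REGISTRY_TYPE_INTEGER = 1,
	};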


+   u32 data;
+   u32 length;
+   char name[];
+};
+
+/**
+ * add_registry -- adds a registry entry
+ * @gsp: gsp pointer
+ * @name: name of the registry key
+ * @type: type of data (1 = integer)
+ * @data: value
+ * @length: size of data, in bytes
+ *
+ * Adds a registry key/value pair to the registry database.
+ *
+ * Currently, only 32-bit integers (type == 1, length == 4) are supported.


What if we pass something else? This function doesn't seem to ensure nothing
else is accepted. Although I recognize that it's only used by add_registry_num()
enforcing this limitation by its function signature.


+ *
+ * This function collects the registry information in a linked list.  After
+ * all registry keys have been added, build_registry() is used to create the
+ * RPC data structure.
+ *
+ * registry_rpc_size is a running total of the size of all registry keys.
+ * It's used to avoid an O(n) calculation of the size when the RPC is built.
+ *
+ * Returns 0 on success, or negative error code on error.
+ */
+static int add_registry(struct nvkm_gsp *gsp, const char *name, u32 type, u32 
data, u32 length)
+{
+   struct registry_list_entry *reg;
+   size_t nlen = strlen(name) + 1;
+
+   /* Set an arbitrary limit to avoid problems with broken command lines */
+   if (strlen(name) > 64)


Could re-use nlen for this check.


+   return -EFBIG;


This error code doesn't seem to apply here, prefer EINVAL.


+
+   reg = kmalloc(sizeof(struct registry_list_entry) + nlen, GFP_KERNEL);
+   if (!reg)
+   return -ENOMEM;
+
+   memcpy(reg->name, name, nlen);
+   reg->name_len = nlen;
+   reg->type = type;
+   reg->data = data;
+   reg->length = length;
+
+   nvkm_debug(>subdev, "adding GSP-RM registry '%s=%u'\n", name, 
data);
+   list_add_tail(>list, >registry_list);
+   gsp->registry_rpc_size += sizeof(PACKED_REGISTRY_ENTRY) + nlen;
+
+   return 0;
+}
+
+static int add_registry_num(struct nvkm_gsp *gsp, const char *name, u32 value)
+{
+   return add_registry(gsp, name, 1, value, sizeof(u32));
+}
+
+/**
+ * build_registry -- create the registry RPC data
+ * @gsp: gsp pointer
+ * @registry: pointer to the RPC payload to fill
+ *
+ * After all registry key/value pairs have been added, call this function to
+ * build the RPC.
+ *
+ * The registry RPC looks like this:
+ *
+ * +-+
+ * |NvU32 size;  |
+ * |NvU32 

Re: [PATCH] nouveau/svm: fix kvcalloc() argument order

2024-02-12 Thread Danilo Krummrich

On 2/12/24 12:22, Arnd Bergmann wrote:

From: Arnd Bergmann 

The conversion to kvcalloc() mixed up the object size and count
arguments, causing a warning:

drivers/gpu/drm/nouveau/nouveau_svm.c: In function 
'nouveau_svm_fault_buffer_ctor':
drivers/gpu/drm/nouveau/nouveau_svm.c:1010:40: error: 'kvcalloc' sizes 
specified with 'sizeof' in the earlier argument and not in the later argument 
[-Werror=calloc-transposed-args]
  1010 | buffer->fault = kvcalloc(sizeof(*buffer->fault), 
buffer->entries, GFP_KERNEL);
   |^
drivers/gpu/drm/nouveau/nouveau_svm.c:1010:40: note: earlier argument should 
specify number of elements, later size of each element

The behavior is still correct aside from the warning, but fixing it avoids
the warnings and can help the compiler track the individual objects better.
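
For reference, kvcalloc() follows the calloc() convention: element count
first, element size second.

	/* void *kvcalloc(size_t n, size_t size, gfp_t flags); */
	buf = kvcalloc(n_elems, sizeof(*buf), GFP_KERNEL);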

Fixes: 71e4bbca070e ("nouveau/svm: Use kvcalloc() instead of kvzalloc()")
Signed-off-by: Arnd Bergmann 


Applied to drm-misc-fixes, thanks!


---
  drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 4d1008915499..b4da82ddbb6b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -1007,7 +1007,7 @@ nouveau_svm_fault_buffer_ctor(struct nouveau_svm *svm, 
s32 oclass, int id)
if (ret)
return ret;
  
-	buffer->fault = kvcalloc(sizeof(*buffer->fault), buffer->entries, GFP_KERNEL);

+   buffer->fault = kvcalloc(buffer->entries, sizeof(*buffer->fault), 
GFP_KERNEL);
if (!buffer->fault)
return -ENOMEM;
  




Re: [PATCH] nouveau: offload fence uevents work to workqueue

2024-02-09 Thread Danilo Krummrich

On 2/6/24 15:03, Daniel Vetter wrote:

On Mon, Feb 05, 2024 at 11:00:04PM +0100, Danilo Krummrich wrote:

On 2/5/24 22:08, Dave Airlie wrote:

On Tue, 6 Feb 2024 at 02:22, Danilo Krummrich  wrote:


On 1/29/24 02:50, Dave Airlie wrote:

From: Dave Airlie 

This should break the deadlock between the fctx lock and the irq lock.

This offloads the processing of the work from the irq into a workqueue.

Signed-off-by: Dave Airlie 


Nouveau's scheduler uses a dedicated wq, hence from this perspective it's
safe deferring fence signalling to the kernel global wq. However, I wonder
if we could create deadlocks by building dependency chains into other
drivers / kernel code that, by chance, makes use of the kernel global wq as
well.

Admittedly, even if we could, it's gonna be extremely unlikely given that
WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq.

Also, do we need to CC stable?


I pushed this to Linus at the end of last week, since the hangs in 6.7
take out the complete system and I wanted it in stable.

It might be safer to use a dedicated wq, is the concern someone is
waiting on a fence in a workqueue somewhere else so we will never
signal it?


Yes, if some other work is waiting for this fence (or something else in the same
dependency chain) to signal, it can prevent executing the work signaling this 
fence,
in case both are scheduled on the same wq. As mentioned, with the kernel global 
wq
this would be rather unlikely to lead to an actual stall with WQ_MAX_ACTIVE == 
512,
but formally the race condition exists. I guess a malicious attacker could try 
to
intentionally push jobs directly or indirectly depending on this fence to a 
driver
which queues them up on a scheduler using the kernel global wq.


I think if you add dma_fence_signalling annotations (aside, there's some
patch from iirc Thomas Hellstrom to improve them and cut down on some
false positives, but I lost track) then I think you won't get any splats
because the wq subsystem assumes that WQ_MAX_ACTIVE is close enough to
infinity to not matter.


As mentioned, for the kernel global wq it's 512. (Intentionally) feeding the 
kernel
with enough jobs to provoke a deadlock doesn't seem impossible to me.

I think it'd be safer to just establish not to use the kernel global wq for 
executing
work in the fence signalling critical path.

We could also run into similar problems with a dedicated wq, e.g. when drivers 
share
a wq between drm_gpu_scheduler instances (see [1]), however, I'm not sure we 
can catch
that with lockdep.

[1] 
https://elixir.bootlin.com/linux/v6.8-rc3/source/drivers/gpu/drm/nouveau/nouveau_drm.c#L313



I'm not sure we should care differently, but I guess it'd be good to
annotate it all in case the wq subsystem's idea of how much such deadlocks
are real changes.

Also Teo is on a mission to get rid of all the global wq flushes, so there
should also be no source of deadlocks from that kind of cross-driver
dependency. Or at least shouldn't be in the future, I'm not sure it all
landed.
-Sima




Re: [PATCH] nouveau: offload fence uevents work to workqueue

2024-02-05 Thread Danilo Krummrich

On 2/5/24 22:08, Dave Airlie wrote:

On Tue, 6 Feb 2024 at 02:22, Danilo Krummrich  wrote:


On 1/29/24 02:50, Dave Airlie wrote:

From: Dave Airlie 

This should break the deadlock between the fctx lock and the irq lock.

This offloads the processing of the work from the irq into a workqueue.

Signed-off-by: Dave Airlie 


Nouveau's scheduler uses a dedicated wq, hence from this perspective it's
safe deferring fence signalling to the kernel global wq. However, I wonder
if we could create deadlocks by building dependency chains into other
drivers / kernel code that, by chance, makes use of the kernel global wq as
well.

Admittedly, even if we could, it's gonna be extremely unlikely given that
WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq.

Also, do we need to CC stable?


I pushed this to Linus at the end of last week, since the hangs in 6.7
take out the complete system and I wanted it in stable.

It might be safer to use a dedicated wq, is the concern someone is
waiting on a fence in a workqueue somewhere else so we will never
signal it?


Yes, if some other work is waiting for this fence (or something else in the same
dependency chain) to signal, it can prevent executing the work signaling this 
fence,
in case both are scheduled on the same wq. As mentioned, with the kernel global 
wq
this would be rather unlikely to lead to an actual stall with WQ_MAX_ACTIVE == 
512,
but formally the race condition exists. I guess a malicious attacker could try 
to
intentionally push jobs directly or indirectly depending on this fence to a 
driver
which queues them up on a scheduler using the kernel global wq.



Dave.





Re: [PATCH] nouveau/gsp: use correct size for registry rpc.

2024-02-05 Thread Danilo Krummrich

On 1/30/24 04:26, Dave Airlie wrote:

From: Dave Airlie 

Timur pointed this out before, and it just slipped my mind,
but this might help some things work better around PCIe power
management.

Fixes: 8d55b0a940bb ("nouveau/gsp: add some basic registry entries.")
Signed-off-by: Dave Airlie 


Added stable CC for v6.7 and applied it to drm-misc-fixes.


---
  drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 9ee58e2a0eb2..5e1fa176aac4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1078,7 +1078,6 @@ r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp)
if (IS_ERR(rpc))
return PTR_ERR(rpc);
  
-	rpc->size = sizeof(*rpc);

rpc->numEntries = NV_GSP_REG_NUM_ENTRIES;
  
  	str_offset = offsetof(typeof(*rpc), entries[NV_GSP_REG_NUM_ENTRIES]);

@@ -1094,6 +1093,7 @@ r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp)
strings += name_len;
str_offset += name_len;
}
+   rpc->size = str_offset;
  
  	return nvkm_gsp_rpc_wr(gsp, rpc, false);

  }


Re: [PATCH] nouveau: offload fence uevents work to workqueue

2024-02-05 Thread Danilo Krummrich

On 1/29/24 02:50, Dave Airlie wrote:

From: Dave Airlie 

This should break the deadlock between the fctx lock and the irq lock.

This offloads the processing of the work from the irq into a workqueue.

Signed-off-by: Dave Airlie 


Nouveau's scheduler uses a dedicated wq, hence from this perspective it's
safe deferring fence signalling to the kernel global wq. However, I wonder
if we could create deadlocks by building dependency chains into other
drivers / kernel code that, by chance, makes use of the kernel global wq as
well.

Admittedly, even if we could, it's gonna be extremely unlikely given that
WQ_MAX_ACTIVE == 512. But maybe it'd be safer to use a dedicated wq.

Also, do we need to CC stable?


---
  drivers/gpu/drm/nouveau/nouveau_fence.c | 24 ++--
  drivers/gpu/drm/nouveau/nouveau_fence.h |  1 +
  2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index ca762ea55413..93f08f9479d8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -103,6 +103,7 @@ nouveau_fence_context_kill(struct nouveau_fence_chan *fctx, 
int error)
  void
  nouveau_fence_context_del(struct nouveau_fence_chan *fctx)
  {
+   cancel_work_sync(>uevent_work);
nouveau_fence_context_kill(fctx, 0);
nvif_event_dtor(>event);
fctx->dead = 1;
@@ -145,12 +146,13 @@ nouveau_fence_update(struct nouveau_channel *chan, struct 
nouveau_fence_chan *fc
return drop;
  }
  
-static int

-nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, u32 
repc)
+static void
+nouveau_fence_uevent_work(struct work_struct *work)
  {
-   struct nouveau_fence_chan *fctx = container_of(event, typeof(*fctx), 
event);
+   struct nouveau_fence_chan *fctx = container_of(work, struct 
nouveau_fence_chan,
+  uevent_work);
unsigned long flags;
-   int ret = NVIF_EVENT_KEEP;
+   int drop = 0;
  
  	spin_lock_irqsave(>lock, flags);

if (!list_empty(>pending)) {
@@ -160,11 +162,20 @@ nouveau_fence_wait_uevent_handler(struct nvif_event 
*event, void *repv, u32 repc
fence = list_entry(fctx->pending.next, typeof(*fence), head);
chan = rcu_dereference_protected(fence->channel, 
lockdep_is_held(>lock));
if (nouveau_fence_update(chan, fctx))
-   ret = NVIF_EVENT_DROP;
+   drop = 1;
}
+   if (drop)
+   nvif_event_block(>event);
+
spin_unlock_irqrestore(>lock, flags);
+}
  
-	return ret;

+static int
+nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, u32 
repc)
+{
+   struct nouveau_fence_chan *fctx = container_of(event, typeof(*fctx), 
event);
+   schedule_work(>uevent_work);
+   return NVIF_EVENT_KEEP;
  }
  
  void

@@ -178,6 +189,7 @@ nouveau_fence_context_new(struct nouveau_channel *chan, 
struct nouveau_fence_cha
} args;
int ret;
  
+	INIT_WORK(>uevent_work, nouveau_fence_uevent_work);

INIT_LIST_HEAD(>flip);
INIT_LIST_HEAD(>pending);
spin_lock_init(>lock);
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h 
b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 64d33ae7f356..8bc065acfe35 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -44,6 +44,7 @@ struct nouveau_fence_chan {
u32 context;
char name[32];
  
+	struct work_struct uevent_work;

struct nvif_event event;
int notify_ref, dead, killed;
  };




Re: [PATCH v4 00/14] drm: Add a driver for CSF-based Mali GPUs

2024-02-05 Thread Danilo Krummrich

On 2/5/24 10:03, Boris Brezillon wrote:

+Danilo for the panthor gpuvm-needs update.

On Sun, 4 Feb 2024 09:14:44 +0800 (CST)
"Andy Yan"  wrote:


Hi Boris:
I saw this warning sometimes (running on an Armbian-based Bookworm); not sure
if it's a known issue or something else.
[15368.293031] systemd-journald[715]: Received client request to relinquish 
/var/log/journal/1bc4a340506142af9bd31a6a3d2170ba access.
[37743.040737] [ cut here ]
[37743.040764] panthor fb00.gpu: drm_WARN_ON(shmem->pages_use_count)
[37743.040890] WARNING: CPU: 2 PID: 5702 at 
drivers/gpu/drm/drm_gem_shmem_helper.c:158 drm_gem_shmem_free+0x144/0x14c 
[drm_shmem_helper]
[37743.040929] Modules linked in: joydev rfkill sunrpc lz4hc lz4 zram 
binfmt_misc hantro_vpu crct10dif_ce v4l2_vp9 v4l2_h264 snd_soc_simple_amplifier 
v4l2_mem2mem videobuf2_dma_contig snd_soc_es8328_i2c videobuf2_memops 
rk_crypto2 snd_soc_es8328 videobuf2_v4l2 sm3_generic videodev crypto_engine sm3 
rockchip_rng videobuf2_common nvmem_rockchip_otp snd_soc_rockchip_i2s_tdm 
snd_soc_hdmi_codec snd_soc_simple_card mc snd_soc_simple_card_utils 
snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_pcm snd_timer snd 
soundcore dm_mod ip_tables x_tables autofs4 dw_hdmi_qp_i2s_audio dw_hdmi_qp_cec 
rk808_regulator rockchipdrm dw_mipi_dsi dw_hdmi_qp dw_hdmi analogix_dp 
drm_dma_helper fusb302 display_connector rk8xx_spi drm_display_helper 
phy_rockchip_snps_pcie3 phy_rockchip_samsung_hdptx_hdmi panthor tcpm rk8xx_core 
cec drm_gpuvm gpu_sched drm_kms_helper drm_shmem_helper drm_exec r8169 drm 
pwm_bl adc_keys
[37743.041108] CPU: 2 PID: 5702 Comm: kworker/u16:8 Not tainted 
6.8.0-rc1-edge-rockchip-rk3588 #2
[37743.041115] Hardware name: Rockchip RK3588 EVB1 V10 Board (DT)
[37743.041120] Workqueue: panthor-cleanup 
panthor_vm_bind_job_cleanup_op_ctx_work [panthor]
[37743.041151] pstate: 6049 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[37743.041157] pc : drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper]
[37743.041169] lr : drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper]
[37743.041181] sp : 80008d37bcc0
[37743.041184] x29: 80008d37bcc0 x28: 800081d379c0 x27: 800081d37000
[37743.041196] x26: 00019909a280 x25: 00019909a2c0 x24: 0001017a4c05
[37743.041206] x23: dead0100 x22: dead0122 x21: 0001627ac1a0
[37743.041217] x20:  x19: 0001627ac000 x18: 
[37743.041227] x17: 00040044 x16: 005000f2b5503510 x15: fff91b77
[37743.041238] x14: 0001 x13: 03c5 x12: ffea
[37743.041248] x11: dfff x10: dfff x9 : 800081e0e818
[37743.041259] x8 : 0002ffe8 x7 : c000dfff x6 : 000affa8
[37743.041269] x5 : 1fff x4 :  x3 : 8000819a6008
[37743.041279] x2 :  x1 :  x0 : 00018465e900
[37743.041290] Call trace:
[37743.041293]  drm_gem_shmem_free+0x144/0x14c [drm_shmem_helper]
[37743.041306]  panthor_gem_free_object+0x24/0xa0 [panthor]
[37743.041321]  drm_gem_object_free+0x1c/0x30 [drm]
[37743.041452]  panthor_vm_bo_put+0xc4/0x12c [panthor]
[37743.041475]  panthor_vm_cleanup_op_ctx.constprop.0+0xb0/0x104 [panthor]
[37743.041491]  panthor_vm_bind_job_cleanup_op_ctx_work+0x28/0xd0 [panthor]


Ok, I think I found the culprit: there's a race between
the drm_gpuvm_bo_put() call in panthor_vm_bo_put() and the list
iteration done by drm_gpuvm_prepare_objects(). Because we're not
setting DRM_GPUVM_RESV_PROTECTED, the code goes through the 'lockless'
iteration loop, and takes/releases a vm_bo ref at each iteration. This
means our 'were we the last vm_bo user?' test in panthor_vm_bo_put()
might return false even if we were actually the last user, and when
for_each_vm_bo_in_list() releases the ref it acquired, it not only leaks
the pin reference, thus leaving GEM pages pinned (which explains this
WARN_ON() splat), but it also calls drm_gpuvm_bo_destroy() in a path
where we don't hold the GPUVA list lock, which is bad.


IIUC, GPUVM generally behaves correctly. It's just that the return value
of drm_gpuvm_bo_put() needs to be treated with care.

Maybe we should extend c50a291d621a ("drm/gpuvm: Let drm_gpuvm_bo_put()
report when the vm_bo object is destroyed") by a note explaining this
unexpected case, or, if not required anymore, simply revert the patch.
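
A minimal sketch of the pitfall (cleanup helper hypothetical):

	/* Under the lockless iterator a concurrent
	 * drm_gpuvm_prepare_objects() may hold a temporary vm_bo ref, so
	 * this can return false even though we were logically the last
	 * user -- any "last user" cleanup must not rely on it. */
	if (drm_gpuvm_bo_put(vm_bo))
		release_pin_ref(bo);	/* hypothetical; may be skipped! */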



Long story short, I'll have to use DRM_GPUVM_RESV_PROTECTED, which is
fine because I'm deferring vm_bo removal to a work where taking the VM
resv lock is allowed. Since I was the one asking for this lockless
iterator in the first place, I wonder if we should kill that and make
DRM_GPUVM_RESV_PROTECTED the default (this would greatly simplify
the code). AFAICT, The PowerVR driver shouldn't be impacted because it's
using drm_gpuvm in synchronous mode only, and Xe already uses the
resv-protected mode. That leaves Nouveau, but IIRC, it's also doing VM
updates in the ioctl path.

Danilo, any 

Re: [PATCH 1/2] drm/nouveau: don't fini scheduler if not initialized

2024-02-02 Thread Danilo Krummrich

On 2/2/24 18:14, Timur Tabi wrote:

On Fri, 2024-02-02 at 01:05 +0100, Danilo Krummrich wrote:

nouveau_abi16_ioctl_channel_alloc() and nouveau_cli_init() simply call
their corresponding *_fini() counterpart. This can lead to
nouveau_sched_fini() being called without struct nouveau_sched ever
being initialized in the first place.


Thanks, I've confirmed that these patches do fix the problem


Cool, gonna add your 'Tested-by' then.

- Danilo
 



[PATCH 2/2] drm/nouveau: omit to create schedulers using the legacy uAPI

2024-02-01 Thread Danilo Krummrich
Omit to create scheduler instances when using the legacy uAPI. When
using the legacy NOUVEAU_GEM_PUSHBUF ioctl no scheduler instance is
required, hence omit creating scheduler instances in
nouveau_abi16_ioctl_channel_alloc().

Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/nouveau/nouveau_abi16.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c 
b/drivers/gpu/drm/nouveau/nouveau_abi16.c
index ca4b5ab3e59e..d1bb8151a1df 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
@@ -339,10 +339,16 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS)
if (ret)
goto done;
 
-   ret = nouveau_sched_create(>sched, drm, drm->sched_wq,
-  chan->chan->dma.ib_max);
-   if (ret)
-   goto done;
+   /* If we're not using the VM_BIND uAPI, we don't need a scheduler.
+*
+* The client lock is already acquired by nouveau_abi16_get().
+*/
+   if (nouveau_cli_uvmm(cli)) {
+   ret = nouveau_sched_create(>sched, drm, drm->sched_wq,
+  chan->chan->dma.ib_max);
+   if (ret)
+   goto done;
+   }
 
init->channel = chan->chan->chid;
 
-- 
2.43.0



[PATCH 1/2] drm/nouveau: don't fini scheduler if not initialized

2024-02-01 Thread Danilo Krummrich
nouveau_abi16_ioctl_channel_alloc() and nouveau_cli_init() simply call
their corresponding *_fini() counterpart. This can lead to
nouveau_sched_fini() being called without struct nouveau_sched ever
being initialized in the first place.

Instead of embedding struct nouveau_sched into struct nouveau_cli and
struct nouveau_chan_abi16, allocate struct nouveau_sched separately,
such that we can check for the corresponding pointer to be NULL in the
particular *_fini() functions.

It makes sense to allocate struct nouveau_sched separately anyway, since
in a subsequent commit we can also avoid to allocate a struct
nouveau_sched in nouveau_abi16_ioctl_channel_alloc() at all, if the
VM_BIND uAPI has been disabled due to the legacy uAPI being used.

Fixes: 5f03a507b29e ("drm/nouveau: implement 1:1 scheduler - entity 
relationship")
Reported-by: Timur Tabi 
Closes: 
https://lore.kernel.org/nouveau/20240131213917.1545604-1-tt...@nvidia.com/
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/nouveau/nouveau_abi16.c | 10 ---
 drivers/gpu/drm/nouveau/nouveau_abi16.h |  2 +-
 drivers/gpu/drm/nouveau/nouveau_drm.c   |  7 +++--
 drivers/gpu/drm/nouveau/nouveau_drv.h   |  2 +-
 drivers/gpu/drm/nouveau/nouveau_exec.c  |  2 +-
 drivers/gpu/drm/nouveau/nouveau_sched.c | 38 +++--
 drivers/gpu/drm/nouveau/nouveau_sched.h |  6 ++--
 drivers/gpu/drm/nouveau/nouveau_uvmm.c  |  2 +-
 8 files changed, 53 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c 
b/drivers/gpu/drm/nouveau/nouveau_abi16.c
index a04156ca8390..ca4b5ab3e59e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
@@ -128,12 +128,14 @@ nouveau_abi16_chan_fini(struct nouveau_abi16 *abi16,
struct nouveau_abi16_ntfy *ntfy, *temp;
 
/* Cancel all jobs from the entity's queue. */
-   drm_sched_entity_fini(>sched.entity);
+   if (chan->sched)
+   drm_sched_entity_fini(>sched->entity);
 
if (chan->chan)
nouveau_channel_idle(chan->chan);
 
-   nouveau_sched_fini(>sched);
+   if (chan->sched)
+   nouveau_sched_destroy(>sched);
 
/* cleanup notifier state */
list_for_each_entry_safe(ntfy, temp, >notifiers, head) {
@@ -337,8 +339,8 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS)
if (ret)
goto done;
 
-   ret = nouveau_sched_init(>sched, drm, drm->sched_wq,
-chan->chan->dma.ib_max);
+   ret = nouveau_sched_create(>sched, drm, drm->sched_wq,
+  chan->chan->dma.ib_max);
if (ret)
goto done;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.h 
b/drivers/gpu/drm/nouveau/nouveau_abi16.h
index 1f5e243c0c75..11c8c4a80079 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.h
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.h
@@ -26,7 +26,7 @@ struct nouveau_abi16_chan {
struct nouveau_bo *ntfy;
struct nouveau_vma *ntfy_vma;
struct nvkm_mm  heap;
-   struct nouveau_sched sched;
+   struct nouveau_sched *sched;
 };
 
 struct nouveau_abi16 {
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 6f6c31a9937b..a947e1d5f309 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -201,7 +201,8 @@ nouveau_cli_fini(struct nouveau_cli *cli)
	WARN_ON(!list_empty(&cli->worker));
 
usif_client_fini(cli);
-   nouveau_sched_fini(&cli->sched);
+   if (cli->sched)
+   nouveau_sched_destroy(&cli->sched);
if (uvmm)
nouveau_uvmm_fini(uvmm);
nouveau_vmm_fini(>svm);
@@ -311,7 +312,7 @@ nouveau_cli_init(struct nouveau_drm *drm, const char *sname,
	cli->mem = &mems[ret];
 
/* Don't pass in the (shared) sched_wq in order to let
-* nouveau_sched_init() create a dedicated one for VM_BIND jobs.
+* nouveau_sched_create() create a dedicated one for VM_BIND jobs.
 *
 * This is required to ensure that for VM_BIND jobs free_job() work and
 * run_job() work can always run concurrently and hence, free_job() work
@@ -320,7 +321,7 @@ nouveau_cli_init(struct nouveau_drm *drm, const char *sname,
 * locks which indirectly or directly are held for allocations
 * elsewhere.
 */
-   ret = nouveau_sched_init(&cli->sched, drm, NULL, 1);
+   ret = nouveau_sched_create(&cli->sched, drm, NULL, 1);
if (ret)
goto done;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 8a6d94c8b163..e239c6bf4afa 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -98,7 +98,7 @@ struct nouveau_cli {
bool disabled;
} uvmm;
 
-   struct nouveau_sched sched;
+   struct nouveau_sched *sched;

Re: [PATCH v3 1/2] drm: enable (most) W=1 warnings by default across the subsystem

2024-02-01 Thread Danilo Krummrich



On 2/1/24 15:33, Jani Nikula wrote:

At least the i915 and amd drivers enable a bunch more compiler warnings
than the kernel defaults.

Extend most of the W=1 warnings to the entire drm subsystem by
default. Use the copy-pasted warnings from scripts/Makefile.extrawarn
with s/KBUILD_CFLAGS/subdir-ccflags-y/ to make it easier to compare and
keep up with them in the future.

This is similar to the approach currently used in i915.

Some of the -Wextra warnings do need to be disabled, just like in
Makefile.extrawarn, but take care to not disable them for W=2 or W=3
builds, depending on the warning.

There are too many -Wformat-truncation warnings to cleanly fix up front;
leave that warning disabled for now.

v3:
- Drop -Wmissing-declarations (already enabled by default)
- Drop -Wmissing-prototypes (already enabled by default)

v2:
- Drop -Wformat-truncation (too many warnings)
- Drop -Wstringop-overflow (already enabled by default)

Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: Alex Deucher 
Cc: Christian König 
Cc: Pan, Xinhui 
Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Sean Paul 
Cc: Marijn Suijten 
Cc: Hamza Mahfooz 
Acked-by: Javier Martinez Canillas 
Acked-by: Thomas Zimmermann 
Acked-by: Sui Jingfeng 
Signed-off-by: Jani Nikula 


Acked-by: Danilo Krummrich 


---
  drivers/gpu/drm/Makefile | 25 +
  1 file changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 104b42df2e95..04722a5dfb78 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -5,6 +5,31 @@
  
  CFLAGS-$(CONFIG_DRM_USE_DYNAMIC_DEBUG)	+= -DDYNAMIC_DEBUG_MODULE
  
+# Unconditionally enable W=1 warnings locally
+# --- begin copy-paste W=1 warnings from scripts/Makefile.extrawarn
+subdir-ccflags-y += -Wextra -Wunused -Wno-unused-parameter
+subdir-ccflags-y += $(call cc-option, -Wrestrict)
+subdir-ccflags-y += -Wmissing-format-attribute
+subdir-ccflags-y += -Wold-style-definition
+subdir-ccflags-y += -Wmissing-include-dirs
+subdir-ccflags-y += $(call cc-option, -Wunused-but-set-variable)
+subdir-ccflags-y += $(call cc-option, -Wunused-const-variable)
+subdir-ccflags-y += $(call cc-option, -Wpacked-not-aligned)
+subdir-ccflags-y += $(call cc-option, -Wformat-overflow)
+# FIXME: fix -Wformat-truncation warnings and uncomment
+#subdir-ccflags-y += $(call cc-option, -Wformat-truncation)
+subdir-ccflags-y += $(call cc-option, -Wstringop-truncation)
+# The following turn off the warnings enabled by -Wextra
+ifeq ($(findstring 2, $(KBUILD_EXTRA_WARN)),)
+subdir-ccflags-y += -Wno-missing-field-initializers
+subdir-ccflags-y += -Wno-type-limits
+subdir-ccflags-y += -Wno-shift-negative-value
+endif
+ifeq ($(findstring 3, $(KBUILD_EXTRA_WARN)),)
+subdir-ccflags-y += -Wno-sign-compare
+endif
+# --- end copy-paste
+
  drm-y := \
drm_aperture.o \
drm_atomic.o \




Re: [PATCH 2/6] drm/nouveau/svm: remove unused but set variables

2024-01-31 Thread Danilo Krummrich

On 1/10/24 18:39, Jani Nikula wrote:

Fix the W=1 warning -Wunused-but-set-variable.

Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: nouv...@lists.freedesktop.org
Signed-off-by: Jani Nikula 


Reviewed-by: Danilo Krummrich 


---
  drivers/gpu/drm/nouveau/nouveau_svm.c | 10 +++---
  1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index cc03e0c22ff3..4d1008915499 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -112,7 +112,7 @@ nouveau_svmm_bind(struct drm_device *dev, void *data,
  {
struct nouveau_cli *cli = nouveau_cli(file_priv);
struct drm_nouveau_svm_bind *args = data;
-   unsigned target, cmd, priority;
+   unsigned target, cmd;
unsigned long addr, end;
struct mm_struct *mm;
  
@@ -136,9 +136,6 @@ nouveau_svmm_bind(struct drm_device *dev, void *data,

return -EINVAL;
}
  
-	priority = args->header >> NOUVEAU_SVM_BIND_PRIORITY_SHIFT;
-   priority &= NOUVEAU_SVM_BIND_PRIORITY_MASK;
-
/* FIXME support CPU target ie all target value < GPU_VRAM */
target = args->header >> NOUVEAU_SVM_BIND_TARGET_SHIFT;
target &= NOUVEAU_SVM_BIND_TARGET_MASK;
@@ -926,15 +923,14 @@ nouveau_pfns_map(struct nouveau_svmm *svmm, struct 
mm_struct *mm,
 unsigned long addr, u64 *pfns, unsigned long npages)
  {
struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns);
-   int ret;
  
  	args->p.addr = addr;

args->p.size = npages << PAGE_SHIFT;
  
  	mutex_lock(&svmm->mutex);
  
-	ret = nvif_object_ioctl(&svmm->vmm->vmm.object, args,
-   struct_size(args, p.phys, npages), NULL);
+   nvif_object_ioctl(&svmm->vmm->vmm.object, args,
+ struct_size(args, p.phys, npages), NULL);
  
  	mutex_unlock(&svmm->mutex);

  }




Re: [PATCH 1/6] drm/nouveau/acr/ga102: remove unused but set variable

2024-01-31 Thread Danilo Krummrich

On 1/10/24 18:39, Jani Nikula wrote:

Fix the W=1 warning -Wunused-but-set-variable.

Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: nouv...@lists.freedesktop.org
Signed-off-by: Jani Nikula 


Reviewed-by: Danilo Krummrich 


---
  drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
index f36a359d4531..bd104a030243 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
@@ -218,7 +218,7 @@ nvkm_acr_lsfw_load_sig_image_desc_v2(struct nvkm_subdev 
*subdev,
const struct firmware *hsbl;
const struct nvfw_ls_hsbl_bin_hdr *hdr;
const struct nvfw_ls_hsbl_hdr *hshdr;
-   u32 loc, sig, cnt, *meta;
+   u32 sig, cnt, *meta;
  
  		ret = nvkm_firmware_load_name(subdev, path, "hs_bl_sig", ver, &hsbl);

if (ret)
@@ -227,7 +227,6 @@ nvkm_acr_lsfw_load_sig_image_desc_v2(struct nvkm_subdev 
*subdev,
hdr = nvfw_ls_hsbl_bin_hdr(subdev, hsbl->data);
hshdr = nvfw_ls_hsbl_hdr(subdev, hsbl->data + 
hdr->header_offset);
meta = (u32 *)(hsbl->data + hshdr->meta_data_offset);
-   loc = *(u32 *)(hsbl->data + hshdr->patch_loc);
sig = *(u32 *)(hsbl->data + hshdr->patch_sig);
cnt = *(u32 *)(hsbl->data + hshdr->num_sig);
  




Re: Making drm_gpuvm work across gpu devices

2024-01-25 Thread Danilo Krummrich

On 1/24/24 04:57, Zeng, Oak wrote:

Thanks a lot Danilo.

Maybe I wasn't clear enough. In the solution I proposed, each device still has
separate vm/page tables. Each device still needs to manage the mappings, page
table flags, etc. It is just that in the SVM use case, all devices share one
drm_gpuvm instance. As I understand it, drm_gpuvm's main function is the va
range split and merging. I don't see why it doesn't work across gpu devices.


I'm pretty sure it does work. You can indeed use GPUVM for tracking mappings 
using
the split and merge feature only, ignoring all other features it provides. 
However,
I don't think it's a good idea to have a single GPUVM instance to track the 
memory
mappings of different devices with different page tables, different object life 
times,
etc.



But I read more about drm_gpuvm. Its split/merge function takes a
drm_gem_object parameter, see drm_gpuvm_sm_map_ops_create and drm_gpuvm_sm_map.
Actually the whole drm_gpuvm is designed for BO-centric drivers; for example, it
has a drm_gpuvm_bo concept to keep track of the 1 BO : N gpuva mapping. The whole
purpose of leveraging drm_gpuvm is to re-use the va split/merge functions for
SVM. But in our SVM implementation, there is no buffer object at all. So I
don't think our SVM code can leverage drm_gpuvm.


Those are all optional features. As mentioned above, you can use GPUVM for
tracking mappings using the split and merge feature only. The drm_gem_object
parameter in drm_gpuvm_sm_map_ops_create() can simply be NULL. Afaik, Xe
already does that for userptr stuff. But again, I don't think it's a good idea
to track the memory mappings of multiple independent physical devices and
driver instances in a single place, whether you use GPUVM or a custom
implementation.

- Danilo
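
To make the split/merge-only usage concrete, a hedged sketch (pseudo-driver
code; locking and the actual page-table updates are omitted):

	struct drm_gpuva_ops *ops;
	struct drm_gpuva_op *op;

	/* With a NULL req_obj no drm_gpuvm_bo / extobj tracking is involved;
	 * only the VA split/merge steps are generated. */
	ops = drm_gpuvm_sm_map_ops_create(gpuvm, req_addr, req_range,
					  NULL, 0);
	if (IS_ERR(ops))
		return PTR_ERR(ops);

	drm_gpuva_for_each_op(op, ops) {
		/* Apply op->map / op->remap / op->unmap to the device
		 * page tables. */
	}

	drm_gpuva_ops_free(gpuvm, ops);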



I will give up this approach, unless Matt or Brian can see a way.

A few replies inline. @Welty, Brian: I had more thoughts inline on one of
your original questions.


-Original Message-
From: Danilo Krummrich 
Sent: Tuesday, January 23, 2024 6:57 PM
To: Zeng, Oak ; Christian König
; Dave Airlie ; Daniel Vetter
; Felix Kuehling 
Cc: Welty, Brian ; dri-devel@lists.freedesktop.org; 
intel-
x...@lists.freedesktop.org; Bommu, Krishnaiah ;
Ghimiray, Himal Prasad ;
thomas.hellst...@linux.intel.com; Vishwanathapura, Niranjana
; Brost, Matthew
; Gupta, saurabhg 
Subject: Re: Making drm_gpuvm work across gpu devices

Hi Oak,

On 1/23/24 20:37, Zeng, Oak wrote:

Thanks Christian. I have some comments inline below.

Danilo, can you also take a look and give your feedback? Thanks.


I agree with everything Christian already wrote. Except for the KFD parts, which
I'm simply not familiar with, I had exactly the same thoughts after reading your
initial mail.

Please find some more comments below.




-Original Message-
From: Christian König 
Sent: Tuesday, January 23, 2024 6:13 AM
To: Zeng, Oak ; Danilo Krummrich ;
Dave Airlie ; Daniel Vetter 
Cc: Welty, Brian ; dri-devel@lists.freedesktop.org;

intel-

x...@lists.freedesktop.org; Bommu, Krishnaiah

;

Ghimiray, Himal Prasad ;
thomas.hellst...@linux.intel.com; Vishwanathapura, Niranjana
; Brost, Matthew

Subject: Re: Making drm_gpuvm work across gpu devices

Hi Oak,

Am 23.01.24 um 04:21 schrieb Zeng, Oak:

Hi Danilo and all,

During the work of Intel's SVM code, we came up the idea of making

drm_gpuvm to work across multiple gpu devices. See some discussion here:
https://lore.kernel.org/dri-


devel/PH7PR11MB70049E7E6A2F40BF6282ECC292742@PH7PR11MB7004.namprd

11.prod.outlook.com/


The reason we try to do this is, for a SVM (shared virtual memory across cpu

program and all gpu program on all gpu devices) process, the address space

has

to be across all gpu devices. So if we make drm_gpuvm to work across devices,
then our SVM code can leverage drm_gpuvm as well.


At a first look, it seems feasible because drm_gpuvm doesn't really use the

drm_device *drm pointer a lot. This param is used only for printing/warning.

So I

think maybe we can delete this drm field from drm_gpuvm.


This way, on a multiple gpu device system, for one process, we can have only

one drm_gpuvm instance, instead of multiple drm_gpuvm instances (one for
each gpu device).


What do you think?


Well from the GPUVM side I don't think it would make much difference if
we have the drm device or not.

But the experience we had with the KFD I think I should mention that we
should absolutely *not* deal with multiple devices at the same time in
the UAPI or VM objects inside the driver.

The background is that all the APIs inside the Linux kernel are built
around the idea that they work with only one device at a time. This
accounts for both low level APIs like the DMA API as well as pretty high
level things like for example file system address space etc...


Yes, most APIs are per-device based.

One exception I know is actually the kfd SVM API. If you look at the svm_ioctl
function, it is per-process based. Each

Re: Making drm_gpuvm work across gpu devices

2024-01-23 Thread Danilo Krummrich

Hi Oak,

On 1/23/24 20:37, Zeng, Oak wrote:

Thanks Christian. I have some comments inline below.

Danilo, can you also take a look and give your feedback? Thanks.


I agree with everything Christian already wrote. Except for the KFD parts, which
I'm simply not familiar with, I had exactly the same thoughts after reading your
initial mail.

Please find some more comments below.




-Original Message-
From: Christian König 
Sent: Tuesday, January 23, 2024 6:13 AM
To: Zeng, Oak ; Danilo Krummrich ;
Dave Airlie ; Daniel Vetter 
Cc: Welty, Brian ; dri-devel@lists.freedesktop.org; 
intel-
x...@lists.freedesktop.org; Bommu, Krishnaiah ;
Ghimiray, Himal Prasad ;
thomas.hellst...@linux.intel.com; Vishwanathapura, Niranjana
; Brost, Matthew

Subject: Re: Making drm_gpuvm work across gpu devices

Hi Oak,

Am 23.01.24 um 04:21 schrieb Zeng, Oak:

Hi Danilo and all,

During the work on Intel's SVM code, we came up with the idea of making
drm_gpuvm work across multiple gpu devices. See some discussion here:
https://lore.kernel.org/dri-
devel/PH7PR11MB70049E7E6A2F40BF6282ECC292742@PH7PR11MB7004.namprd
11.prod.outlook.com/


The reason we try to do this is that, for an SVM (shared virtual memory across
the cpu program and all gpu programs on all gpu devices) process, the address
space has to span all gpu devices. So if we make drm_gpuvm work across devices,
then our SVM code can leverage drm_gpuvm as well.


At first look, it seems feasible because drm_gpuvm doesn't really use the
drm_device *drm pointer a lot. This param is used only for printing/warning. So
I think maybe we can delete this drm field from drm_gpuvm.


This way, on a multiple gpu device system, for one process, we can have only

one drm_gpuvm instance, instead of multiple drm_gpuvm instances (one for
each gpu device).


What do you think?


Well from the GPUVM side I don't think it would make much difference if
we have the drm device or not.

But the experience we had with the KFD I think I should mention that we
should absolutely *not* deal with multiple devices at the same time in
the UAPI or VM objects inside the driver.

The background is that all the APIs inside the Linux kernel are built
around the idea that they work with only one device at a time. This
accounts for both low level APIs like the DMA API as well as pretty high
level things like for example file system address space etc...


Yes, most APIs are per-device based.

One exception I know is actually the kfd SVM API. If you look at the svm_ioctl
function, it is per-process based. Each kfd_process represents a process across
N gpu devices. Cc Felix.

Need to say, kfd SVM represents a shared virtual address space across the CPU and
all GPU devices on the system. This is by definition of SVM (shared virtual
memory). This is very different from our legacy gpu *device* drivers, which work
for only one device (i.e., if you want one device to access another device's
memory, you will have to use dma-buf export/import etc).

We have the same design requirement for SVM. For anyone who wants to implement
the SVM concept, this is a hard requirement. Since drm now has the drm_gpuvm
concept, which strictly speaking is designed for one device, I want to see
whether we can extend drm_gpuvm to make it work for both a single device (as
used in xe) and multiple devices (as will be used in the SVM code). That is why
I brought up this topic.



So when you have multiple GPUs you either have an inseparable cluster of
them, in which case you would also only have one drm_device. Or you have
separate drm_devices, which also results in separate drm render nodes and
separate virtual address spaces and also eventually separate IOMMU
domains, which gives you separate dma_addresses for the same page and so
separate GPUVM page tables.


I am thinking we can still let each device have its separate drm_device/render
node/iommu domains/gpu page table, just as we have today. I do not plan to
change this picture.

But the virtual address space will support two modes of operation:
1. one drm_gpuvm per device. This is when svm is not in the picture.
2. all devices in the process share one single drm_gpuvm, when svm is in the
picture. In the xe driver design, we have to support a mixed use of the legacy
mode (such as gem_create and vm_bind) and svm (such as malloc'ed memory for gpu
submission). So whenever SVM is in the picture, we want one single process
address space across all devices. Drm_gpuvm doesn't need to be aware of those
two operation modes; it is the driver's responsibility to use the right mode.

For example, in mode #1, a driver's vm structure (such as xe_vm) can inherit 
from drm_gpuvm. In mode #2, a driver's svm structure (xe_svm in this series: 
https://lore.kernel.org/dri-devel/20240117221223.18540-1-oak.z...@intel.com/) 
can inherit from drm_gpuvm while each xe_vm (still a per-device based struct) 
will just have a pointer to the drm_gpuvm structure. This way when svm is in 
play, we build a 1 process:1

Re: [PATCH][next] drm/nouveau/fifo/gk104: remove redundant variable ret

2024-01-22 Thread Danilo Krummrich

On 1/16/24 13:31, Dan Carpenter wrote:

On Tue, Jan 16, 2024 at 11:16:09AM +, Colin Ian King wrote:

The variable ret is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so ret can be
removed.

Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]

Signed-off-by: Colin Ian King 
---
  drivers/gpu/drm/nouveau/nvif/fifo.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvif/fifo.c 
b/drivers/gpu/drm/nouveau/nvif/fifo.c
index a463289962b2..e96de14ce87e 100644
--- a/drivers/gpu/drm/nouveau/nvif/fifo.c
+++ b/drivers/gpu/drm/nouveau/nvif/fifo.c
@@ -73,9 +73,9 @@ u64
  nvif_fifo_runlist(struct nvif_device *device, u64 engine)
  {
u64 runm = 0;
-   int ret, i;
+   int i;
  
-	if ((ret = nvif_fifo_runlists(device)))
+   if (nvif_fifo_runlists(device))
return runm;


Could we return a literal zero here?  Otherwise, I'm surprised this
doesn't trigger a static checker warning.


Why do you think so? Conditionally, runm is used later on as well. I don't
think the checker should complain about keeping the value single source.

If you agree, want to offer your RB?

- Danilo
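
For reference, a condensed sketch of nvif_fifo_runlist() with the change
applied (the loop internals are assumptions; only the quoted hunk is from
the patch). It shows how returning runm keeps both exit paths on a single
source for the return value:

	u64
	nvif_fifo_runlist(struct nvif_device *device, u64 engine)
	{
		u64 runm = 0;
		int i;

		if (nvif_fifo_runlists(device))
			return runm;	/* runm is still 0 here */

		for (i = 0; i < device->runlists; i++) {
			if (device->runlist[i].engines & engine)
				runm |= BIT_ULL(i);
		}

		return runm;
	}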



regards,
dan carpenter





Re: [PATCH] drm/exec, drm/gpuvm: Prefer u32 over uint32_t

2024-01-19 Thread Danilo Krummrich

On 1/19/24 10:05, Thomas Hellström wrote:

The relatively recently introduced drm/exec utility was using uint32_t
in its interface, which was then also carried over to drm/gpuvm.

Prefer u32 in new code and update drm/exec and drm/gpuvm accordingly.

Cc: Christian König 
Cc: Danilo Krummrich 
Signed-off-by: Thomas Hellström 


Reviewed-by: Danilo Krummrich 


---
  drivers/gpu/drm/drm_exec.c | 2 +-
  include/drm/drm_exec.h | 4 ++--
  include/drm/drm_gpuvm.h| 2 +-
  3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c
index 5d2809de4517..20e59d88218d 100644
--- a/drivers/gpu/drm/drm_exec.c
+++ b/drivers/gpu/drm/drm_exec.c
@@ -72,7 +72,7 @@ static void drm_exec_unlock_all(struct drm_exec *exec)
   *
   * Initialize the object and make sure that we can track locked objects.
   */
-void drm_exec_init(struct drm_exec *exec, uint32_t flags)
+void drm_exec_init(struct drm_exec *exec, u32 flags)
  {
exec->flags = flags;
exec->objects = kmalloc(PAGE_SIZE, GFP_KERNEL);
diff --git a/include/drm/drm_exec.h b/include/drm/drm_exec.h
index b5bf0b6da791..187c3ec44606 100644
--- a/include/drm/drm_exec.h
+++ b/include/drm/drm_exec.h
@@ -18,7 +18,7 @@ struct drm_exec {
/**
 * @flags: Flags to control locking behavior
 */
-   uint32_tflags;
+   u32 flags;
  
  	/**

 * @ticket: WW ticket used for acquiring locks
@@ -135,7 +135,7 @@ static inline bool drm_exec_is_contended(struct drm_exec 
*exec)
return !!exec->contended;
  }
  
-void drm_exec_init(struct drm_exec *exec, uint32_t flags);

+void drm_exec_init(struct drm_exec *exec, u32 flags);
  void drm_exec_fini(struct drm_exec *exec);
  bool drm_exec_cleanup(struct drm_exec *exec);
  int drm_exec_lock_obj(struct drm_exec *exec, struct drm_gem_object *obj);
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index 48311e6d664c..554046321d24 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -514,7 +514,7 @@ struct drm_gpuvm_exec {
/**
 * @flags: the flags for the struct drm_exec
 */
-   uint32_t flags;
+   u32 flags;
  
  	/**

 * @vm: the &drm_gpuvm to lock its DMA reservations




Re: [PATCH linux-next v2] drm/nouveau/disp: switch to use kmemdup() helper

2024-01-10 Thread Danilo Krummrich

On 1/9/24 07:24, yang.gua...@zte.com.cn wrote:

From: Chen Haonan 

Use kmemdup() helper instead of open-coding to
simplify the code.

Signed-off-by: Chen Haonan 
Reviewed-by: Yang Guang 


Applied to drm-misc-next, thanks!


---
  drivers/gpu/drm/nouveau/nvif/outp.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvif/outp.c 
b/drivers/gpu/drm/nouveau/nvif/outp.c
index 5d3190c05250..6daeb7f0b09b 100644
--- a/drivers/gpu/drm/nouveau/nvif/outp.c
+++ b/drivers/gpu/drm/nouveau/nvif/outp.c
@@ -452,13 +452,12 @@ nvif_outp_edid_get(struct nvif_outp *outp, u8 **pedid)
if (ret)
goto done;

-   *pedid = kmalloc(args->size, GFP_KERNEL);
+   *pedid = kmemdup(args->data, args->size, GFP_KERNEL);
if (!*pedid) {
ret = -ENOMEM;
goto done;
}

-   memcpy(*pedid, args->data, args->size);
ret = args->size;
  done:
kfree(args);




Re: [PATCH 1/3] drm/nouveau: include drm/drm_edid.h only where needed

2024-01-09 Thread Danilo Krummrich

On 1/9/24 10:59, Jani Nikula wrote:

On Mon, 08 Jan 2024, Danilo Krummrich  wrote:

On 1/4/24 21:16, Jani Nikula wrote:

Including drm_edid.h from nouveau_connector.h causes the rebuild of 15
files when drm_edid.h is modified, while there are only a few files that
actually need to include drm_edid.h.

Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: nouv...@lists.freedesktop.org
Signed-off-by: Jani Nikula 


Reviewed-by: Danilo Krummrich 


Are you going to pick this up via the nouveau tree, or shall I apply it
to drm-misc-next?


We don't maintain a separate tree, hence feel free to apply it to drm-misc-next.

- Danilo



BR,
Jani.




---
   drivers/gpu/drm/nouveau/dispnv50/head.c | 1 +
   drivers/gpu/drm/nouveau/nouveau_connector.h | 2 +-
   2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c 
b/drivers/gpu/drm/nouveau/dispnv50/head.c
index 5f490fbf1877..83355dbc15ee 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.c
@@ -32,6 +32,7 @@
   
   #include 

   #include 
+#include <drm/drm_edid.h>
   #include 
   #include "nouveau_connector.h"
   
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.h b/drivers/gpu/drm/nouveau/nouveau_connector.h

index a2df4918340c..0608cabed058 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.h
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.h
@@ -35,7 +35,6 @@
   
   #include 

   #include 
-#include <drm/drm_edid.h>
   #include 
   #include 
   
@@ -44,6 +43,7 @@
   
   struct nvkm_i2c_port;

   struct dcb_output;
+struct edid;
   
   #ifdef CONFIG_DRM_NOUVEAU_BACKLIGHT

   struct nouveau_backlight {








Re: [PATCH -next] drm/nouveau: uapi: fix kerneldoc warnings

2024-01-08 Thread Danilo Krummrich

On 12/25/23 07:51, Vegard Nossum wrote:

As of commit b77fdd6a48e6 ("scripts/kernel-doc: restore warning for
Excess struct/union"), we see the following warnings when running 'make
htmldocs':

   ./include/uapi/drm/nouveau_drm.h:292: warning: Excess struct member 
'DRM_NOUVEAU_VM_BIND_OP_MAP' description in 'drm_nouveau_vm_bind_op'
   ./include/uapi/drm/nouveau_drm.h:292: warning: Excess struct member 
'DRM_NOUVEAU_VM_BIND_OP_UNMAP' description in 'drm_nouveau_vm_bind_op'
   ./include/uapi/drm/nouveau_drm.h:292: warning: Excess struct member 
'DRM_NOUVEAU_VM_BIND_SPARSE' description in 'drm_nouveau_vm_bind_op'
   ./include/uapi/drm/nouveau_drm.h:336: warning: Excess struct member 
'DRM_NOUVEAU_VM_BIND_RUN_ASYNC' description in 'drm_nouveau_vm_bind'

The problem is that these values are #define constants, but had kerneldoc
comments attached to them as if they were actual struct members.

There are a number of ways we could fix this, but I chose to draw
inspiration from include/uapi/drm/i915_drm.h, which pulls them into the
corresponding kerneldoc comment for the struct member that they are
intended to be used with.

To keep the diff readable, there are a number of things I _didn't_ do in
this patch, but which we should also consider:

- This is pretty good documentation, but it ends up in gpu/driver-uapi,
   which is part of subsystem-apis/ when it really ought to display under
   userspace-api/ (the "Linux kernel user-space API guide" book of the
   documentation).


I agree, it indeed looks like this would make sense, same goes for
gpu/drm-uapi.rst.

@Jani, Sima: Was this intentional? Or can we change it?



- More generally, we might want a warning if include/uapi/ files are
   kerneldoc'd outside userspace-api/.

- I'd consider it cleaner if the #defines appeared between the kerneldoc
   for the member and the member itself (which is something other DRM-
   related UAPI docs do).

- The %IDENTIFIER kerneldoc syntax is intended for "constants", and is
   more appropriate in this context than ``IDENTIFIER`` or &IDENTIFIER.
   The DRM docs aren't very consistent on this.
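
As a hedged illustration of the %CONSTANT style the last point refers to
(member and constant invented for the example):

	/**
	 * @flags: the flags for this example ioctl
	 *
	 * Supported values:
	 *
	 * %EXAMPLE_FLAG_ASYNC - run the operation asynchronously. The '%'
	 * prefix marks a constant, whereas @name refers to a struct member
	 * and ``name`` is a plain literal.
	 */
	__u32 flags;
#define EXAMPLE_FLAG_ASYNC (1 << 0)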

Cc: Randy Dunlap 
Cc: Jonathan Corbet 
Signed-off-by: Vegard Nossum 


Applied to drm-misc-next, thanks!


---
  include/uapi/drm/nouveau_drm.h | 56 --
  1 file changed, 27 insertions(+), 29 deletions(-)

diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h
index 0bade1592f34..c95ef8a4d94a 100644
--- a/include/uapi/drm/nouveau_drm.h
+++ b/include/uapi/drm/nouveau_drm.h
@@ -238,34 +238,32 @@ struct drm_nouveau_vm_init {
  struct drm_nouveau_vm_bind_op {
/**
 * @op: the operation type
+*
+* Supported values:
+*
+* %DRM_NOUVEAU_VM_BIND_OP_MAP - Map a GEM object to the GPU's VA
+* space. Optionally, the &DRM_NOUVEAU_VM_BIND_SPARSE flag can be
+* passed to instruct the kernel to create sparse mappings for the
+* given range.
+*
+* %DRM_NOUVEAU_VM_BIND_OP_UNMAP - Unmap an existing mapping in the
+* GPU's VA space. If the region the mapping is located in is a
+* sparse region, new sparse mappings are created where the unmapped
+* (memory backed) mapping was mapped previously. To remove a sparse
+* region the &DRM_NOUVEAU_VM_BIND_SPARSE must be set.
 */
__u32 op;
-/**
- * @DRM_NOUVEAU_VM_BIND_OP_MAP:
- *
- * Map a GEM object to the GPU's VA space. Optionally, the
- * &DRM_NOUVEAU_VM_BIND_SPARSE flag can be passed to instruct the kernel to
- * create sparse mappings for the given range.
- */
  #define DRM_NOUVEAU_VM_BIND_OP_MAP 0x0
-/**
- * @DRM_NOUVEAU_VM_BIND_OP_UNMAP:
- *
- * Unmap an existing mapping in the GPU's VA space. If the region the mapping
- * is located in is a sparse region, new sparse mappings are created where the
- * unmapped (memory backed) mapping was mapped previously. To remove a sparse
- * region the &DRM_NOUVEAU_VM_BIND_SPARSE must be set.
- */
  #define DRM_NOUVEAU_VM_BIND_OP_UNMAP 0x1
/**
 * @flags: the flags for a &drm_nouveau_vm_bind_op
+*
+* Supported values:
+*
+* %DRM_NOUVEAU_VM_BIND_SPARSE - Indicates that an allocated VA
+* space region should be sparse.
 */
__u32 flags;
-/**
- * @DRM_NOUVEAU_VM_BIND_SPARSE:
- *
- * Indicates that an allocated VA space region should be sparse.
- */
  #define DRM_NOUVEAU_VM_BIND_SPARSE (1 << 8)
/**
 * @handle: the handle of the DRM GEM object to map
@@ -301,17 +299,17 @@ struct drm_nouveau_vm_bind {
__u32 op_count;
/**
 * @flags: the flags for a &drm_nouveau_vm_bind ioctl
+*
+* Supported values:
+*
+* %DRM_NOUVEAU_VM_BIND_RUN_ASYNC - Indicates that the given VM_BIND
+* operation should be executed asynchronously by the kernel.
+*
+* If this flag is not supplied the kernel executes the associated
+* operations synchronously and doesn't accept any &drm_nouveau_sync objects.

Re: [PATCH 1/3] drm/nouveau: include drm/drm_edid.h only where needed

2024-01-08 Thread Danilo Krummrich

On 1/4/24 21:16, Jani Nikula wrote:

Including drm_edid.h from nouveau_connector.h causes the rebuild of 15
files when drm_edid.h is modified, while there are only a few files that
actually need to include drm_edid.h.

Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: nouv...@lists.freedesktop.org
Signed-off-by: Jani Nikula 


Reviewed-by: Danilo Krummrich 


---
  drivers/gpu/drm/nouveau/dispnv50/head.c | 1 +
  drivers/gpu/drm/nouveau/nouveau_connector.h | 2 +-
  2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c 
b/drivers/gpu/drm/nouveau/dispnv50/head.c
index 5f490fbf1877..83355dbc15ee 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.c
@@ -32,6 +32,7 @@
  
  #include 

  #include 
+#include <drm/drm_edid.h>
  #include 
  #include "nouveau_connector.h"
  
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.h b/drivers/gpu/drm/nouveau/nouveau_connector.h

index a2df4918340c..0608cabed058 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.h
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.h
@@ -35,7 +35,6 @@
  
  #include 

  #include 
-#include <drm/drm_edid.h>
  #include 
  #include 
  
@@ -44,6 +43,7 @@
  
  struct nvkm_i2c_port;

  struct dcb_output;
+struct edid;
  
  #ifdef CONFIG_DRM_NOUVEAU_BACKLIGHT

  struct nouveau_backlight {




Re: [PATCH 1/4] drm/nouveau/disp: don't misuse kernel-doc comments

2024-01-08 Thread Danilo Krummrich

On 1/1/24 00:36, Randy Dunlap wrote:

Change kernel-doc "/**" comments to common "/*" comments to prevent
kernel-doc warnings:

crtc.c:453: warning: This comment starts with '/**', but isn't a kernel-doc 
comment. Refer Documentation/doc-guide/kernel-doc.rst
  * Sets up registers for the given mode/adjusted_mode pair.
crtc.c:453: warning: missing initial short description on line:
  * Sets up registers for the given mode/adjusted_mode pair.
crtc.c:629: warning: This comment starts with '/**', but isn't a kernel-doc 
comment. Refer Documentation/doc-guide/kernel-doc.rst
  * Sets up registers for the given mode/adjusted_mode pair.
crtc.c:629: warning: missing initial short description on line:
  * Sets up registers for the given mode/adjusted_mode pair.

Signed-off-by: Randy Dunlap 
Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: nouv...@lists.freedesktop.org
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 


Series applied to drm-misc-next, thanks!


---
  drivers/gpu/drm/nouveau/dispnv04/crtc.c |4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff -- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c 
b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -449,7 +449,7 @@ nv_crtc_mode_set_vga(struct drm_crtc *cr
regp->Attribute[NV_CIO_AR_CSEL_INDEX] = 0x00;
  }
  
-/**

+/*
   * Sets up registers for the given mode/adjusted_mode pair.
   *
   * The clocks, CRTCs and outputs attached to this CRTC must be off.
@@ -625,7 +625,7 @@ nv_crtc_swap_fbs(struct drm_crtc *crtc,
return ret;
  }
  
-/**

+/*
   * Sets up registers for the given mode/adjusted_mode pair.
   *
   * The clocks, CRTCs and outputs attached to this CRTC must be off.





Re: [PATCH] drm/nouveau/bios/init: drop kernel-doc notation

2024-01-08 Thread Danilo Krummrich

On 12/16/23 21:11, Randy Dunlap wrote:

The "/**" comments in this file are not kernel-doc comments. They are
used on static functions which can have kernel-doc comments, but that
is not the primary focus of kernel-doc comments.
Since these comments are incomplete for kernel-doc notation, remove
the kernel-doc "/**" markers and make them common comments.

This prevents scripts/kernel-doc from issuing 68 warnings:

init.c:584: warning: Function parameter or member 'init' not described in 
'init_reserved'

and 67 warnings like this one:
init.c:611: warning: expecting prototype for INIT_DONE(). Prototype was for 
init_done() instead

Signed-off-by: Randy Dunlap 
Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: dri-devel@lists.freedesktop.org
Cc: nouv...@lists.freedesktop.org
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 


Applied to drm-misc-next, thanks!


---
  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c |  136 +++---
  1 file changed, 68 insertions(+), 68 deletions(-)

diff -- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
@@ -575,7 +575,7 @@ init_tmds_reg(struct nvbios_init *init,
   * init opcode handlers
   
*/
  
-/**

+/*
   * init_reserved - stub for various unknown/unused single-byte opcodes
   *
   */
@@ -602,7 +602,7 @@ init_reserved(struct nvbios_init *init)
init->offset += length;
  }
  
-/**

+/*
   * INIT_DONE - opcode 0x71
   *
   */
@@ -613,7 +613,7 @@ init_done(struct nvbios_init *init)
	init->offset = 0x0000;
  }
  
-/**

+/*
   * INIT_IO_RESTRICT_PROG - opcode 0x32
   *
   */
@@ -650,7 +650,7 @@ init_io_restrict_prog(struct nvbios_init
trace("}]\n");
  }
  
-/**

+/*
   * INIT_REPEAT - opcode 0x33
   *
   */
@@ -676,7 +676,7 @@ init_repeat(struct nvbios_init *init)
init->repeat = repeat;
  }
  
-/**

+/*
   * INIT_IO_RESTRICT_PLL - opcode 0x34
   *
   */
@@ -716,7 +716,7 @@ init_io_restrict_pll(struct nvbios_init
trace("}]\n");
  }
  
-/**

+/*
   * INIT_END_REPEAT - opcode 0x36
   *
   */
@@ -732,7 +732,7 @@ init_end_repeat(struct nvbios_init *init
}
  }
  
-/**

+/*
   * INIT_COPY - opcode 0x37
   *
   */
@@ -759,7 +759,7 @@ init_copy(struct nvbios_init *init)
init_wrvgai(init, port, index, data);
  }
  
-/**

+/*
   * INIT_NOT - opcode 0x38
   *
   */
@@ -771,7 +771,7 @@ init_not(struct nvbios_init *init)
init_exec_inv(init);
  }
  
-/**

+/*
   * INIT_IO_FLAG_CONDITION - opcode 0x39
   *
   */
@@ -788,7 +788,7 @@ init_io_flag_condition(struct nvbios_ini
init_exec_set(init, false);
  }
  
-/**

+/*
   * INIT_GENERIC_CONDITION - opcode 0x3a
   *
   */
@@ -840,7 +840,7 @@ init_generic_condition(struct nvbios_ini
}
  }
  
-/**

+/*
   * INIT_IO_MASK_OR - opcode 0x3b
   *
   */
@@ -859,7 +859,7 @@ init_io_mask_or(struct nvbios_init *init
init_wrvgai(init, 0x03d4, index, data &= ~(1 << or));
  }
  
-/**

+/*
   * INIT_IO_OR - opcode 0x3c
   *
   */
@@ -878,7 +878,7 @@ init_io_or(struct nvbios_init *init)
init_wrvgai(init, 0x03d4, index, data | (1 << or));
  }
  
-/**

+/*
   * INIT_ANDN_REG - opcode 0x47
   *
   */
@@ -895,7 +895,7 @@ init_andn_reg(struct nvbios_init *init)
init_mask(init, reg, mask, 0);
  }
  
-/**

+/*
   * INIT_OR_REG - opcode 0x48
   *
   */
@@ -912,7 +912,7 @@ init_or_reg(struct nvbios_init *init)
init_mask(init, reg, 0, mask);
  }
  
-/**

+/*
   * INIT_INDEX_ADDRESS_LATCHED - opcode 0x49
   *
   */
@@ -942,7 +942,7 @@ init_idx_addr_latched(struct nvbios_init
}
  }
  
-/**

+/*
   * INIT_IO_RESTRICT_PLL2 - opcode 0x4a
   *
   */
@@ -977,7 +977,7 @@ init_io_restrict_pll2(struct nvbios_init
trace("}]\n");
  }
  
-/**

+/*
   * INIT_PLL2 - opcode 0x4b
   *
   */
@@ -994,7 +994,7 @@ init_pll2(struct nvbios_init *init)
init_prog_pll(init, reg, freq);
  }
  
-/**

+/*
   * INIT_I2C_BYTE - opcode 0x4c
   *
   */
@@ -1025,7 +1025,7 @@ init_i2c_byte(struct nvbios_init *init)
}
  }
  
-/**

+/*
   * INIT_ZM_I2C_BYTE - opcode 0x4d
   *
   */
@@ -1051,7 +1051,7 @@ init_zm_i2c_byte(struct nvbios_init *ini
}
  }
  
-/**

+/*
   * INIT_ZM_I2C - opcode 0x4e
   *
   */
@@ -1085,7 +1085,7 @@ init_zm_i2c(struct nvbios_init *init)
}
  }
  
-/**

+/*
   * INIT_TMDS - opcode 0x4f
   *
   */
@@ -1111,7 +1111,7 @@ init_tmds(struct nvbios_init *init)
init_wr32(init, reg + 0, addr);
  }
  
-/**

+/*
   * INIT_ZM_TMDS_GROUP - opcode 0x50
   *
   */
@@ -1138,7 +1138,7 @@ init_zm_tmds_group(struct nvbios_init *i
}
  }
  
-/**

+/*
   * INIT_CR_INDEX_ADDRESS_LATCHED - opcode 0x51
   *
   */
@@ -1168,7 +1168,7 @@ init_cr_idx_

Re: [PATCH] drm/nouveau/fifo: remove duplicated including

2024-01-08 Thread Danilo Krummrich

Hi Wang,

there is another patch [1] to fix this, which already made it upstream.

- Danilo

[1] 
https://patchwork.freedesktop.org/patch/msgid/20231122004926.84933-1-yang@linux.alibaba.com

On 12/15/23 11:02, Wang Jinchao wrote:

rm second including of chid.h

Signed-off-by: Wang Jinchao 
---
  drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c 
b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c
index 87a62d4ff4bd..7d4716dcd512 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c
@@ -24,7 +24,6 @@
  #include "chan.h"
  #include "chid.h"
  #include "cgrp.h"
-#include "chid.h"
  #include "runl.h"
  #include "priv.h"
  




Re: [PATCH linux-next] drm/nouveau/disp: switch to use kmemdup() helper

2024-01-08 Thread Danilo Krummrich

Hi Yang,

On 12/14/23 13:03, yang.gua...@zte.com.cn wrote:

From: Yang Guang 

Use kmemdup() helper instead of open-coding to
simplify the code.

Signed-off-by: Chen Haonan 


Please add your "Signed-off-by" tag to this patch. If the above is intended to
indicate that Chen was involved in creating this patch (e.g. as co-author,
reporter, etc.), please use the corresponding tag instead. See also [1].

Once this is fixed, I'll apply the patch.

- Danilo

[1] 
https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin


---
  drivers/gpu/drm/nouveau/nvif/outp.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvif/outp.c 
b/drivers/gpu/drm/nouveau/nvif/outp.c
index 5d3190c05250..6daeb7f0b09b 100644
--- a/drivers/gpu/drm/nouveau/nvif/outp.c
+++ b/drivers/gpu/drm/nouveau/nvif/outp.c
@@ -452,13 +452,12 @@ nvif_outp_edid_get(struct nvif_outp *outp, u8 **pedid)
if (ret)
goto done;

-   *pedid = kmalloc(args->size, GFP_KERNEL);
+   *pedid = kmemdup(args->data, args->size, GFP_KERNEL);
if (!*pedid) {
ret = -ENOMEM;
goto done;
}

-   memcpy(*pedid, args->data, args->size);
ret = args->size;
  done:
kfree(args);




Re: [PATCH drm-misc-next v2 0/2] PowerVR VM fixes

2023-12-05 Thread Danilo Krummrich
On Tue, Dec 05, 2023 at 04:35:00PM +0100, Maxime Ripard wrote:
> Hi,
> 
> On Wed, Nov 29, 2023 at 11:07:59PM +0100, Danilo Krummrich wrote:
> > Some major GPUVM changes landed just before v8 of the PowerVR series. Since 
> > v8
> > went in rather quickly (review process was finished otherwise) I haven't 
> > had the
> > chance to review the subsequent code changes.
> > 
> > Hence, this series with a few fixes in this context. Plus a minor GPUVM 
> > patch to
> > make the drm_gpuvm_prepare_* helpers useful for PowerVR.
> 
> This doesn't apply on drm-misc-next anymore, could you rebase and
> resend?

I already applied the two patches to drm-misc-next.

- Danilo

> 
> Thanks!
> Maxime




Re: [PATCH] drm/gpuvm: Let drm_gpuvm_bo_put() report when the vm_bo object is destroyed

2023-12-04 Thread Danilo Krummrich

On 12/4/23 16:14, Boris Brezillon wrote:

Some users need to release resources attached to the vm_bo object when
it's destroyed. In Panthor's case, we need to release the pin ref so
BO pages can be returned to the system when all GPU mappings are gone.

This could be done through a custom drm_gpuvm::vm_bo_free() hook, but
this has all sort of locking implications that would force us to expose
a drm_gem_shmem_unpin_locked() helper, not to mention the fact that
having a ::vm_bo_free() implementation without a ::vm_bo_alloc() one
seems odd. So let's keep things simple, and extend drm_gpuvm_bo_put()
to report when the object is destroyed.

Signed-off-by: Boris Brezillon 


Reviewed-by: Danilo Krummrich 
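
As a hedged usage sketch of the new return value (the helper is invented;
it assumes the caller holds its own reference on the GEM object):

	static void example_vm_bo_put_unpin(struct drm_gpuvm_bo *vm_bo)
	{
		struct drm_gem_object *obj = vm_bo->obj;

		if (drm_gpuvm_bo_put(vm_bo)) {
			/* The vm_bo was destroyed: all GPU mappings are
			 * gone, so the pin ref taken at map time can be
			 * dropped and the pages returned to the system. */
			drm_gem_shmem_unpin(to_drm_gem_shmem_obj(obj));
		}
	}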


---
  drivers/gpu/drm/drm_gpuvm.c | 8 ++--
  include/drm/drm_gpuvm.h | 2 +-
  2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 54f5e8851de5..ae13e2d63637 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1502,14 +1502,18 @@ drm_gpuvm_bo_destroy(struct kref *kref)
   * hold the dma-resv or driver specific GEM gpuva lock.
   *
   * This function may only be called from non-atomic context.
+ *
+ * Returns: true if vm_bo was destroyed, false otherwise.
   */
-void
+bool
  drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
  {
might_sleep();
  
  	if (vm_bo)

-   kref_put(&vm_bo->kref, drm_gpuvm_bo_destroy);
+   return !!kref_put(&vm_bo->kref, drm_gpuvm_bo_destroy);
+
+   return false;
  }
  EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
  
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h

index f94fec9a8517..7cc41a7d86d5 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -738,7 +738,7 @@ drm_gpuvm_bo_get(struct drm_gpuvm_bo *vm_bo)
return vm_bo;
  }
  
-void drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo);

+bool drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo);
  
  struct drm_gpuvm_bo *

  drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,




Re: [PATCH] drm/nouveau: Removes unnecessary args check in nouveau_uvmm_sm_prepare

2023-11-29 Thread Danilo Krummrich

On 11/16/23 21:52, Yuran Pereira wrote:

Checking `args` after calling `op_map_prepare` is unnecessary, since
if `op_map_prepare` were called with NULL args, it would lead
to a NULL pointer dereference, thus never hitting that check.

Hence this check can be removed, and a note added to remind users of
this function to ensure that args != NULL when calling this function
for a map operation, as suggested by Danilo [1].

[1] https://lore.kernel.org/lkml/6a1ebcef-bade-45a0-9bd9-c05f0226e...@redhat.com

Suggested-by: Danilo Krummrich 
Signed-off-by: Yuran Pereira 


Applied to drm-misc-next.


---
  drivers/gpu/drm/nouveau/nouveau_uvmm.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c 
b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index 5cf892c50f43..c8c3f1b1b604 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -604,6 +604,10 @@ op_unmap_prepare(struct drm_gpuva_op_unmap *u)
drm_gpuva_unmap(u);
  }
  
+/*

+ * Note: @args should not be NULL when calling for
+ * a map operation.
+ */
  static int
  nouveau_uvmm_sm_prepare(struct nouveau_uvmm *uvmm,
struct nouveau_uvma_prealloc *new,
@@ -624,7 +628,7 @@ nouveau_uvmm_sm_prepare(struct nouveau_uvmm *uvmm,
if (ret)
goto unwind;
  
-			if (args && vmm_get_range) {

+   if (vmm_get_range) {
ret = nouveau_uvmm_vmm_get(uvmm, vmm_get_start,
   vmm_get_range);
if (ret) {




Re: [PATCH] driver: gpu: Fix warning directly dereferencing a rcu pointer

2023-11-29 Thread Danilo Krummrich

Hi Abhinav,

Thanks for sending this follow-up patch.

On 11/26/23 15:57, Abhinav Singh wrote:

Fix a sparse warning with the message
"warning: dereference of noderef expression". In this context it means we
are dereferencing a __rcu tagged pointer directly.

We should not directly dereference an __rcu tagged pointer. To get a normal
(non __rcu tagged) pointer from a __rcu tagged pointer we use the
function unrcu_pointer(...). The non __rcu tagged pointer can then be
dereferenced just like a normal pointer.
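
For context, a minimal sketch of the sparse annotation involved (type names
invented; not part of the patch):

	struct example_fence {
		struct example_channel __rcu *channel;
	};

	static void example_emit(struct example_fence *fence)
	{
		/* fence->channel->chan would make sparse warn:
		 * "dereference of noderef expression". */
		struct example_channel *chan = unrcu_pointer(fence->channel);

		/* unrcu_pointer() only strips the __rcu address space; it is
		 * correct solely when the caller already guarantees the
		 * pointee cannot go away concurrently. */
		(void)chan;	/* ... use chan as a plain pointer ... */
	}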


Can you please add a brief explanation why unrcu_pointer() is fine here?



I tested with qemu with this command
qemu-system-x86_64 \
-m 2G \
-smp 2 \
-kernel bzImage \
-append "console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
-drive file=bullseye.img,format=raw \
-net user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:10021-:22 \
-net nic,model=e1000 \
-enable-kvm \
-nographic \
-pidfile vm.pid \
2>&1 | tee vm.log
with lockdep enabled.


How is that relevant for this patch?

- Danilo



Signed-off-by: Abhinav Singh 
---
  drivers/gpu/drm/nouveau/nv10_fence.c | 2 +-
  drivers/gpu/drm/nouveau/nv84_fence.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nv10_fence.c 
b/drivers/gpu/drm/nouveau/nv10_fence.c
index c6a0db5b9e21..845b64c079ed 100644
--- a/drivers/gpu/drm/nouveau/nv10_fence.c
+++ b/drivers/gpu/drm/nouveau/nv10_fence.c
@@ -32,7 +32,7 @@
  int
  nv10_fence_emit(struct nouveau_fence *fence)
  {
-   struct nvif_push *push = fence->channel->chan.push;
+   struct nvif_push *push = unrcu_pointer(fence->channel)->chan.push;
int ret = PUSH_WAIT(push, 2);
if (ret == 0) {
PUSH_MTHD(push, NV06E, SET_REFERENCE, fence->base.seqno);
diff --git a/drivers/gpu/drm/nouveau/nv84_fence.c 
b/drivers/gpu/drm/nouveau/nv84_fence.c
index 812b8c62eeba..d42e72e23dec 100644
--- a/drivers/gpu/drm/nouveau/nv84_fence.c
+++ b/drivers/gpu/drm/nouveau/nv84_fence.c
@@ -85,7 +85,7 @@ nv84_fence_chid(struct nouveau_channel *chan)
  static int
  nv84_fence_emit(struct nouveau_fence *fence)
  {
-   struct nouveau_channel *chan = fence->channel;
+   struct nouveau_channel *chan = unrcu_pointer(fence->channel);
struct nv84_fence_chan *fctx = chan->fence;
u64 addr = fctx->vma->addr + nv84_fence_chid(chan) * 16;
  




[PATCH drm-misc-next v2 2/2] drm/imagination: vm: make use of GPUVM's drm_exec helper

2023-11-29 Thread Danilo Krummrich
Make use of GPUVM's drm_exec helper functions preventing direct access
to GPUVM internal data structures, such as the external object list.

This is especially important to ensure following the locking rules
around the GPUVM external object list.

Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code")
Signed-off-by: Danilo Krummrich 
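
Condensed from the diff below, the resulting locking pattern looks like
this (error handling trimmed):

	struct drm_gpuvm_exec vm_exec = {
		.vm = &vm_ctx->gpuvm_mgr,
		.flags = DRM_EXEC_INTERRUPTIBLE_WAIT |
			 DRM_EXEC_IGNORE_DUPLICATES,
		.extra = {
			.fn = pvr_vm_lock_extra, /* locks the mapped GEM, if any */
			.priv = &bind_op,
		},
	};

	err = drm_gpuvm_exec_lock(&vm_exec); /* r_obj + extobjs + extra.fn */
	if (!err) {
		err = pvr_vm_bind_op_exec(&bind_op);
		drm_gpuvm_exec_unlock(&vm_exec);
	}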
---
 drivers/gpu/drm/imagination/pvr_vm.c | 91 +++-
 1 file changed, 36 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/imagination/pvr_vm.c 
b/drivers/gpu/drm/imagination/pvr_vm.c
index e0d74d9a6190..c6ab1581d509 100644
--- a/drivers/gpu/drm/imagination/pvr_vm.c
+++ b/drivers/gpu/drm/imagination/pvr_vm.c
@@ -333,48 +333,6 @@ pvr_vm_bind_op_unmap_init(struct pvr_vm_bind_op *bind_op,
return err;
 }
 
-static int
-pvr_vm_bind_op_lock_resvs(struct drm_exec *exec, struct pvr_vm_bind_op 
*bind_op)
-{
-   drm_exec_until_all_locked(exec) {
-   struct drm_gem_object *r_obj = &bind_op->vm_ctx->dummy_gem;
-   struct drm_gpuvm *gpuvm = &bind_op->vm_ctx->gpuvm_mgr;
-   struct pvr_gem_object *pvr_obj = bind_op->pvr_obj;
-   struct drm_gpuvm_bo *gpuvm_bo;
-
-   /* Acquire lock on the vm_context's reserve object. */
-   int err = drm_exec_lock_obj(exec, r_obj);
-
-   drm_exec_retry_on_contention(exec);
-   if (err)
-   return err;
-
-   /* Acquire lock on all BOs in the context. */
-   list_for_each_entry(gpuvm_bo, &gpuvm->extobj.list,
-   list.entry.extobj) {
-   err = drm_exec_lock_obj(exec, gpuvm_bo->obj);
-
-   drm_exec_retry_on_contention(exec);
-   if (err)
-   return err;
-   }
-
-   /* Unmap operations don't have an object to lock. */
-   if (!pvr_obj)
-   break;
-
-   /* Acquire lock on the GEM being mapped. */
-   err = drm_exec_lock_obj(exec,
-   gem_from_pvr_gem(bind_op->pvr_obj));
-
-   drm_exec_retry_on_contention(exec);
-   if (err)
-   return err;
-   }
-
-   return 0;
-}
-
 /**
  * pvr_vm_gpuva_map() - Insert a mapping into a memory context.
  * @op: gpuva op containing the remap details.
@@ -731,6 +689,20 @@ void pvr_destroy_vm_contexts_for_file(struct pvr_file 
*pvr_file)
}
 }
 
+static int
+pvr_vm_lock_extra(struct drm_gpuvm_exec *vm_exec)
+{
+   struct pvr_vm_bind_op *bind_op = vm_exec->extra.priv;
+   struct pvr_gem_object *pvr_obj = bind_op->pvr_obj;
+
+   /* Unmap operations don't have an object to lock. */
+   if (!pvr_obj)
+   return 0;
+
+   /* Acquire lock on the GEM being mapped. */
+   return drm_exec_lock_obj(&vm_exec->exec, gem_from_pvr_gem(pvr_obj));
+}
+
 /**
  * pvr_vm_map() - Map a section of physical memory into a section of
  * device-virtual memory.
@@ -758,7 +730,15 @@ pvr_vm_map(struct pvr_vm_context *vm_ctx, struct 
pvr_gem_object *pvr_obj,
   u64 pvr_obj_offset, u64 device_addr, u64 size)
 {
struct pvr_vm_bind_op bind_op = {0};
-   struct drm_exec exec;
+   struct drm_gpuvm_exec vm_exec = {
+   .vm = &vm_ctx->gpuvm_mgr,
+   .flags = DRM_EXEC_INTERRUPTIBLE_WAIT |
+DRM_EXEC_IGNORE_DUPLICATES,
+   .extra = {
+   .fn = pvr_vm_lock_extra,
+   .priv = &bind_op,
+   },
+   };
 
	int err = pvr_vm_bind_op_map_init(&bind_op, vm_ctx, pvr_obj,
  pvr_obj_offset, device_addr,
@@ -767,18 +747,15 @@ pvr_vm_map(struct pvr_vm_context *vm_ctx, struct 
pvr_gem_object *pvr_obj,
if (err)
return err;
 
-   drm_exec_init(&exec,
- DRM_EXEC_INTERRUPTIBLE_WAIT | DRM_EXEC_IGNORE_DUPLICATES);
-
pvr_gem_object_get(pvr_obj);
 
-   err = pvr_vm_bind_op_lock_resvs(&exec, &bind_op);
+   err = drm_gpuvm_exec_lock(&vm_exec);
if (err)
goto err_cleanup;
 
	err = pvr_vm_bind_op_exec(&bind_op);
 
-   drm_exec_fini(&exec);
+   drm_gpuvm_exec_unlock(&vm_exec);
 
 err_cleanup:
	pvr_vm_bind_op_fini(&bind_op);
@@ -804,24 +781,28 @@ int
 pvr_vm_unmap(struct pvr_vm_context *vm_ctx, u64 device_addr, u64 size)
 {
struct pvr_vm_bind_op bind_op = {0};
-   struct drm_exec exec;
+   struct drm_gpuvm_exec vm_exec = {
+   .vm = &vm_ctx->gpuvm_mgr,
+   .flags = DRM_EXEC_INTERRUPTIBLE_WAIT |
+DRM_EXEC_IGNORE_DUPLICATES,
+   .extra = {
+   .fn = pvr_vm_lock_extra,
+   .priv = &bind_op,
+   },
+   };
 
	int err = pvr_vm_bind_op_unmap_init(&bind_op, vm_ctx, device_addr,

[PATCH drm-misc-next v2 1/2] drm/gpuvm: fall back to drm_exec_lock_obj()

2023-11-29 Thread Danilo Krummrich
Fall back to drm_exec_lock_obj() if num_fences is zero for the
drm_gpuvm_prepare_* function family.

Otherwise dma_resv_reserve_fences() would actually allocate slots even
though num_fences is zero.

Cc: Christian König 
Signed-off-by: Danilo Krummrich 
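
As a hedged usage sketch, a driver that only needs the dma-resv locks, and
no fence slots, can now pass zero:

	struct drm_exec exec;
	int ret;

	drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
			     DRM_EXEC_IGNORE_DUPLICATES);
	drm_exec_until_all_locked(&exec) {
		/* num_fences == 0: the helpers fall back to
		 * drm_exec_lock_obj(), no fence slots are allocated. */
		ret = drm_gpuvm_prepare_vm(gpuvm, &exec, 0);
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;

		ret = drm_gpuvm_prepare_objects(gpuvm, &exec, 0);
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;
	}
	drm_exec_fini(&exec);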
---
 drivers/gpu/drm/drm_gpuvm.c | 43 -
 include/drm/drm_gpuvm.h | 23 +++-
 2 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 54f5e8851de5..07a6676bc4f9 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1085,6 +1085,37 @@ drm_gpuvm_put(struct drm_gpuvm *gpuvm)
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_put);
 
+static int
+exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
+unsigned int num_fences)
+{
+   return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) :
+   drm_exec_lock_obj(exec, obj);
+}
+
+/**
+ * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
+ * @gpuvm: the &drm_gpuvm
+ * @exec: the &drm_exec context
+ * @num_fences: the amount of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object; if
+ * @num_fences is zero drm_exec_lock_obj() is called instead.
+ *
+ * Using this function directly, it is the drivers responsibility to call
+ * drm_exec_init() and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
+struct drm_exec *exec,
+unsigned int num_fences)
+{
+   return exec_prepare_obj(exec, gpuvm->r_obj, num_fences);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_vm);
+
 static int
 __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
struct drm_exec *exec,
@@ -1095,7 +1126,7 @@ __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
int ret = 0;
 
	for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) {
-   ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
+   ret = exec_prepare_obj(exec, vm_bo->obj, num_fences);
if (ret)
break;
}
@@ -1116,7 +1147,7 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm,
 
drm_gpuvm_resv_assert_held(gpuvm);
	list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) {
-   ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
+   ret = exec_prepare_obj(exec, vm_bo->obj, num_fences);
if (ret)
break;
 
@@ -1134,7 +1165,8 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm,
 * @num_fences: the amount of &dma_fences to reserve
 *
 * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
- * &drm_gpuvm contains mappings of.
+ * &drm_gpuvm contains mappings of; if @num_fences is zero drm_exec_lock_obj()
+ * is called instead.
  *
  * Using this function directly, it is the drivers responsibility to call
  * drm_exec_init() and drm_exec_fini() accordingly.
@@ -1171,7 +1203,8 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
 * @num_fences: the amount of &dma_fences to reserve
 *
 * Calls drm_exec_prepare_obj() for all &drm_gem_objects mapped between @addr
- * and @addr + @range.
+ * and @addr + @range; if @num_fences is zero drm_exec_lock_obj() is called
+ * instead.
  *
  * Returns: 0 on success, negative error code on failure.
  */
@@ -1186,7 +1219,7 @@ drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct 
drm_exec *exec,
drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
struct drm_gem_object *obj = va->gem.obj;
 
-   ret = drm_exec_prepare_obj(exec, obj, num_fences);
+   ret = exec_prepare_obj(exec, obj, num_fences);
if (ret)
return ret;
}
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index f94fec9a8517..b3f82ec7fb17 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -544,26 +544,9 @@ struct drm_gpuvm_exec {
} extra;
 };
 
-/**
- * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
- * @gpuvm: the &drm_gpuvm
- * @exec: the &drm_exec context
- * @num_fences: the amount of &dma_fences to reserve
- *
- * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
- *
- * Using this function directly, it is the drivers responsibility to call
- * drm_exec_init() and drm_exec_fini() accordingly.
- *
- * Returns: 0 on success, negative error code on failure.
- */
-static inline int
-drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
-struct drm_exec *exec,
-unsigned int num_fences)
-{
-   return drm_exec_prepare_obj(exec, gpuvm->r_obj, num_fences);
-}
+int drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
+struct drm_exec *exec,
+unsigned int num_fences);
 
 int drm_gpuvm_prepare_objects

[PATCH drm-misc-next v2 0/2] PowerVR VM fixes

2023-11-29 Thread Danilo Krummrich
Hi,

Some major GPUVM changes landed just before v8 of the PowerVR series. Since v8
went in rather quickly (review process was finished otherwise) I haven't had the
chance to review the subsequent code changes.

Hence, this series with a few fixes in this context. Plus a minor GPUVM patch to
make the drm_gpuvm_prepare_* helpers useful for PowerVR.

- Danilo

Changes in V2
=
- GPUVM: update function DOC comment to indicate the passing zero fences to
  drm_gpuvm_prepare_* functions results into drm_exec_lock_obj() calls rather
  than drm_exec_prepare_obj() calls.
- pvr/vm: use drm_gpuvm_exec wrappers
- drop 3 patches which were applied already

Danilo Krummrich (2):
  drm/gpuvm: fall back to drm_exec_lock_obj()
  drm/imagination: vm: make use of GPUVM's drm_exec helper

 drivers/gpu/drm/drm_gpuvm.c  | 43 +++--
 drivers/gpu/drm/imagination/pvr_vm.c | 91 +++-
 include/drm/drm_gpuvm.h  | 23 +--
 3 files changed, 77 insertions(+), 80 deletions(-)


base-commit: 83dc1029dcf50b5b849b26679a1b3f860b85d79c
-- 
2.43.0



Re: [Nouveau] [PATCH][next] nouveau/gsp: replace zero-length array with flex-array member and use __counted_by

2023-11-28 Thread Danilo Krummrich

On 11/29/23 02:06, Gustavo A. R. Silva wrote:



On 11/28/23 19:01, Danilo Krummrich wrote:

On 11/16/23 20:55, Timur Tabi wrote:

On Thu, 2023-11-16 at 20:45 +0100, Danilo Krummrich wrote:

As I already mentioned for Timur's patch [2], I'd prefer to get a fix
upstream
(meaning [1] in this case). Of course, that's probably more up to Timur to
tell
if this will work out.


Don't count on it.


I see. Well, I think it's fine. Once we implement a decent abstraction we likely
don't need those header files in the kernel anymore.

@Gustavo, if you agree I will discard the indentation change when applying the
patch to keep the diff as small as possible.


No problem.


Applied to drm-misc-fixes.



Thanks
--
Gustavo
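
For context, a hedged sketch of the conversion the patch title describes
(field and entry types are assumptions; the real layout comes from the
GSP-RM headers):

	/* Before: zero-length array, no compiler-visible bound. */
	struct packed_registry_table {
		u32 size;
		u32 numEntries;
		PACKED_REGISTRY_ENTRY entries[0];
	};

	/* After: flexible-array member annotated with __counted_by(), so
	 * FORTIFY_SOURCE/UBSAN bounds checking knows its length. */
	struct packed_registry_table {
		u32 size;
		u32 numEntries;
		PACKED_REGISTRY_ENTRY entries[] __counted_by(numEntries);
	};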






Re: [PATCH] nouveau/gsp/r535: remove a stray unlock in r535_gsp_rpc_send()

2023-11-28 Thread Danilo Krummrich

On 11/27/23 13:56, Dan Carpenter wrote:

This unlock doesn't belong here and it leads to a double unlock in
the caller, r535_gsp_rpc_push().

Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM")
Signed-off-by: Dan Carpenter 


Good catch - applied to drm-misc-fixes.

- Danilo


---
  drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index dc44f5c7833f..818e5c73b7a6 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -365,10 +365,8 @@ r535_gsp_rpc_send(struct nvkm_gsp *gsp, void *argv, bool 
wait, u32 repc)
}
  
  	ret = r535_gsp_cmdq_push(gsp, rpc);

-   if (ret) {
-   mutex_unlock(&gsp->cmdq.mutex);
+   if (ret)
return ERR_PTR(ret);
-   }
  
  	if (wait) {

msg = r535_gsp_msg_recv(gsp, fn, repc);
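
For context, the locking discipline the fix restores looks roughly like the
following; a hedged sketch, the caller body is illustrative rather than the
actual r535_gsp_rpc_push() implementation:

	void *caller(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
	{
		void *repv;

		mutex_lock(&gsp->cmdq.mutex);
		repv = r535_gsp_rpc_send(gsp, argv, wait, repc);
		/*
		 * With the stray unlock removed, an error return from
		 * r535_gsp_rpc_send() leaves the mutex held, so this single
		 * unlock can no longer turn into a double unlock.
		 */
		mutex_unlock(&gsp->cmdq.mutex);

		return repv;
	}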




Re: [Nouveau] [PATCH][next] nouveau/gsp: replace zero-length array with flex-array member and use __counted_by

2023-11-28 Thread Danilo Krummrich

On 11/16/23 20:55, Timur Tabi wrote:

On Thu, 2023-11-16 at 20:45 +0100, Danilo Krummrich wrote:

As I already mentioned for Timur's patch [2], I'd prefer to get a fix
upstream
(meaning [1] in this case). Of course, that's probably more up to Timur to
tell
if this will work out.


Don't count on it.


I see. Well, I think it's fine. Once we implement a decent abstraction we likely
don't need those header files in the kernel anymore.

@Gustavo, if you agree I will discard the indentation change when applying the
patch to keep the diff as small as possible.

- Danilo



Even if I did change [0] to [], I'm not going to be able to add the
"__counted_by(numEntries);" because that's just not something that our build
system uses.

And even then, I would need to change all [0] to [].

You're not going to be able to use RM's header files as-is anyway in the
long term.  If we changed the layout of PACKED_REGISTRY_TABLE, we're not
going to create a PACKED_REGISTRY_TABLE2 and keep both around.  We're just
going to change PACKED_REGISTRY_TABLE and pretend the previous version never
existed.  You will then have to manually copy the new struct to your header
files and maintain two versions yourself.
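
For reference, a hedged sketch of the transformation under discussion; the
struct layouts below are illustrative, not the actual RM header:

	struct packed_registry_entry { u32 type, data, length; };

	/* Before: zero-length array, no bounds information at all. */
	struct packed_registry_table_old {
		u32 size;
		u32 numEntries;
		struct packed_registry_entry entries[0];
	};

	/* After: C99 flexible array member, annotated so that
	 * FORTIFY_SOURCE and UBSAN_BOUNDS can check accesses against
	 * numEntries at runtime. */
	struct packed_registry_table_new {
		u32 size;
		u32 numEntries;
		struct packed_registry_entry entries[] __counted_by(numEntries);
	};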







Re: [EXTERNAL] [PATCH drm-misc-next 5/5] drm/imagination: vm: make use of GPUVM's drm_exec helper

2023-11-28 Thread Danilo Krummrich

On 11/28/23 11:47, Donald Robson wrote:

Hi Danilo,

Apologies - I guess I should have submitted a patch to handle zero fences in 
your
locking functions with the final patch series.

On Sat, 2023-11-25 at 00:36 +0100, Danilo Krummrich wrote:

*** CAUTION: This email originates from a source not known to Imagination 
Technologies. Think before you click a link or open an attachment ***

Make use of GPUVM's drm_exec helper functions preventing direct access
to GPUVM internal data structures, such as the external object list.

This is especially important to ensure following the locking rules
around the GPUVM external object list.

Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code")
Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/imagination/pvr_vm.c | 16 +---
  1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/imagination/pvr_vm.c 
b/drivers/gpu/drm/imagination/pvr_vm.c
index e0d74d9a6190..3f7888f5cc53 100644
--- a/drivers/gpu/drm/imagination/pvr_vm.c
+++ b/drivers/gpu/drm/imagination/pvr_vm.c
@@ -337,27 +337,21 @@ static int
  pvr_vm_bind_op_lock_resvs(struct drm_exec *exec, struct pvr_vm_bind_op 
*bind_op)
  {
drm_exec_until_all_locked(exec) {
-   struct drm_gem_object *r_obj = &bind_op->vm_ctx->dummy_gem;
struct drm_gpuvm *gpuvm = &bind_op->vm_ctx->gpuvm_mgr;
struct pvr_gem_object *pvr_obj = bind_op->pvr_obj;
-   struct drm_gpuvm_bo *gpuvm_bo;
  
  		/* Acquire lock on the vm_context's reserve object. */

-   int err = drm_exec_lock_obj(exec, r_obj);
+   int err = drm_gpuvm_prepare_vm(gpuvm, exec, 0);
  
  		drm_exec_retry_on_contention(exec);

if (err)
return err;
  
  		/* Acquire lock on all BOs in the context. */

-   list_for_each_entry(gpuvm_bo, &gpuvm->extobj.list,
-   list.entry.extobj) {
-   err = drm_exec_lock_obj(exec, gpuvm_bo->obj);
-
-   drm_exec_retry_on_contention(exec);
-   if (err)
-   return err;
-   }
+   err = drm_gpuvm_prepare_objects(gpuvm, exec, 0);
+   drm_exec_retry_on_contention(exec);
+   if (err)
+   return err;


Before I discovered the problem when not reserving fences, I was trying to use
drm_gpuvm_exec_lock() with vm_exec->extra.fn() for the part below.  Is there
a reason not to do that now?


No, that works - gonna change that.

- Danilo
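
For reference, a hedged sketch of what that could look like;
pvr_vm_bind_lock_extra() is a hypothetical callback name and error
handling is condensed:

	static int pvr_vm_bind_lock_extra(struct drm_gpuvm_exec *vm_exec)
	{
		struct pvr_vm_bind_op *bind_op = vm_exec->extra.priv;
		struct pvr_gem_object *pvr_obj = bind_op->pvr_obj;

		/* Unmap operations don't have an object to lock. */
		if (!pvr_obj)
			return 0;

		return drm_exec_lock_obj(&vm_exec->exec,
					 gem_from_pvr_gem(pvr_obj));
	}

	struct drm_gpuvm_exec vm_exec = {
		.vm = &bind_op->vm_ctx->gpuvm_mgr,
		.flags = DRM_EXEC_INTERRUPTIBLE_WAIT,
		.extra = {
			.fn = pvr_vm_bind_lock_extra,
			.priv = bind_op,
		},
	};

	/* Locks the VM's dummy GEM object, all extobjs and the extra
	 * object in one drm_exec loop, retrying on contention internally. */
	err = drm_gpuvm_exec_lock(&vm_exec);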



Many thanks,
Donald

  
  		/* Unmap operations don't have an object to lock. */

if (!pvr_obj)




Re: [PATCH v5] Documentation/gpu: VM_BIND locking document

2023-11-27 Thread Danilo Krummrich

On 11/22/23 08:49, Thomas Hellström wrote:


On 11/21/23 14:56, Boris Brezillon wrote:

On Tue, 21 Nov 2023 11:40:46 +0100
Thomas Hellström  wrote:


Add the first version of the VM_BIND locking document which is
intended to be part of the xe driver upstreaming agreement.

The document describes and discusses the locking used during exec-
functions, eviction and for userptr gpu-vmas. The intention is to use the
same nomenclature as drm-vm-bind-async.rst.

v2:
- s/gvm/gpu_vm/g (Rodrigo Vivi)
- Clarify the userptr seqlock with a pointer to mm/mmu_notifier.c
   (Rodrigo Vivi)
- Adjust commit message accordingly.
- Add SPDX license header.

v3:
- Large update to align with the drm_gpuvm manager locking
- Add "Efficient userptr gpu_vma exec function iteration" section
- Add "Locking at bind- and unbind time" section.

v4:
- Fix tabs vs space errors by untabifying (Rodrigo Vivi)
- Minor style fixes and typos (Rodrigo Vivi)
- Clarify situations where stale GPU mappings are occurring and how
   access through these mappings is blocked. (Rodrigo Vivi)
- Insert into the toctree in implementation_guidelines.rst

v5:
- Add a section about recoverable page-faults.
- Use local references to other documentation where possible
   (Bagas Sanjaya)
- General documentation fixes and typos (Danilo Krummrich and
   Boris Brezillon)
- Improve the documentation around locks that need to be grabbed from the
   dma-fence critical section (Boris Brezillon)
- Add more references to the DRM GPUVM helpers (Danilo Krummrich and
   Boris Brezillon)
- Update the rfc/xe.rst document.

Cc: Rodrigo Vivi 
Signed-off-by: Thomas Hellström 

Still have a few comments (see below), and in general, I find some
sentences very long, which tends to confuse me (always trying to
figure out what the main point is, what the pronouns are referring
to, etc.). Anyway, I think it's better to have something
imperfect than nothing at all, so here is my

Reviewed-by: Boris Brezillon 

Feel free to add it even if you decide to ignore my comments.


Thanks for reviewing, Boris!

I'll make a final version incorporating much of the comments and suggestions, 
much appreciated.

I still think, though, that in principle the cross-referencing between gpuvm and
this document should be the other way around, or it should all sit in gpuvm. In
any case this is an initial version, and as more comments and suggestions land,
we can certainly update.


I think if we agree that GPUVM should be the common component that we recommend
drivers to use, we should refer to GPUVM whenever possible. However, I'm not
sure whether we'd want to dedicate *all* documentation to GPUVM. Since the topic
is rather complex, I can see that it might be beneficial to do both: discuss it
from a more abstract point of view and document the corresponding common
component.

Reviewed-by: Danilo Krummrich 



Thanks,

Thomas





Re: [PATCH drm-misc-next 4/5] drm/gpuvm: fall back to drm_exec_lock_obj()

2023-11-27 Thread Danilo Krummrich

Hi,

On 11/27/23 14:52, Christian König wrote:

Am 25.11.23 um 00:40 schrieb Danilo Krummrich:

Hi Christian,

do you think it makes sense to handle this in drm_exec_prepare_obj() or
even dma_resv_reserve_fences()?


IIRC for drm_exec the discussion has gone a bit back and forth between handling 
0 and having a separate function which doesn't allocate fences.

We ended up with the solution using separate calls since you either know that
you don't need fences (because, for example, you only need to look something up)
or you need fences and eventually calculate the number you need dynamically;
if you then end up with 0, it's a bug.


I don't have a strong opinion on that. Though, in GPUVM I'd probably still need 
some wrapper like

+exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
+		 unsigned int num_fences)
+{
+	return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) :
+			    drm_exec_lock_obj(exec, obj);
+}

to prevent either duplicate code or rather unnecessarily complicated abstractions.

And I'm not sure it's super nice that drm_gpuvm_prepare_objects() allows zero 
fences, whereas drm_exec_prepare_obj() does not.



So to sum it up, the conclusion was that it's more defensive to complain about 0
fences to reserve (which reminds me that dma_resv_reserve_fences() still needs
to get a warning for 0 fences as well).


What's the logic you'd want to apply there? WARN() but still allocate at least
4 slots, or WARN() and return without doing anything?

- Danilo
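
For clarity, hedged sketches of the two options as a guard at the top of
dma_resv_reserve_fences(); neither is actual upstream code:

	/* Option 1: warn, but stay permissive and reserve a minimum. */
	if (WARN_ON(num_fences == 0))
		num_fences = 1;

	/* Option 2: warn and reject the request outright. */
	if (WARN_ON(num_fences == 0))
		return -EINVAL;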



Regards,
Christian.



- Danilo

On 11/25/23 00:36, Danilo Krummrich wrote:

Fall back to drm_exec_lock_obj() if num_fences is zero for the
drm_gpuvm_prepare_* function family.

Otherwise dma_resv_reserve_fences() would actually allocate slots even
though num_fences is zero.

Cc: Christian König 
Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/drm_gpuvm.c | 36 +---
  include/drm/drm_gpuvm.h | 23 +++
  2 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 54f5e8851de5..d1d1c2379e44 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1085,6 +1085,36 @@ drm_gpuvm_put(struct drm_gpuvm *gpuvm)
  }
  EXPORT_SYMBOL_GPL(drm_gpuvm_put);
  +static int
+exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
+ unsigned int num_fences)
+{
+    return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) :
+    drm_exec_lock_obj(exec, obj);
+}
+
+/**
+ * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
+ * @gpuvm: the &drm_gpuvm
+ * @exec: the &drm_exec context
+ * @num_fences: the amount of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
+ *
+ * Using this function directly, it is the drivers responsibility to call
+ * drm_exec_init() and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
+ struct drm_exec *exec,
+ unsigned int num_fences)
+{
+    return exec_prepare_obj(exec, gpuvm->r_obj, num_fences);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_vm);
+
  static int
  __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
  struct drm_exec *exec,
@@ -1095,7 +1125,7 @@ __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
  int ret = 0;
    for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) {
-    ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
+    ret = exec_prepare_obj(exec, vm_bo->obj, num_fences);
  if (ret)
  break;
  }
@@ -1116,7 +1146,7 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm,
    drm_gpuvm_resv_assert_held(gpuvm);
  list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) {
-    ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
+    ret = exec_prepare_obj(exec, vm_bo->obj, num_fences);
  if (ret)
  break;
  @@ -1186,7 +1216,7 @@ drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct 
drm_exec *exec,
  drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
  struct drm_gem_object *obj = va->gem.obj;
  -    ret = drm_exec_prepare_obj(exec, obj, num_fences);
+    ret = exec_prepare_obj(exec, obj, num_fences);
  if (ret)
  return ret;
  }
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index f94fec9a8517..b3f82ec7fb17 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -544,26 +544,9 @@ struct drm_gpuvm_exec {
  } extra;
  };
  -/**
- * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
- * @gpuvm: the _gpuvm
- * @exec: the _exec context
- * @num_fences: the amount of _fences to reserve
- *
- * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object.
- *
- * Using this function directly, it is the drivers r

Re: [PATCH drm-misc-next 4/5] drm/gpuvm: fall back to drm_exec_lock_obj()

2023-11-24 Thread Danilo Krummrich

Hi Christian,

do you think it makes sense to handle this in drm_exec_prepare_obj() or
even dma_resv_reserve_fences()?

- Danilo

On 11/25/23 00:36, Danilo Krummrich wrote:

Fall back to drm_exec_lock_obj() if num_fences is zero for the
drm_gpuvm_prepare_* function family.

Otherwise dma_resv_reserve_fences() would actually allocate slots even
though num_fences is zero.

Cc: Christian König 
Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/drm_gpuvm.c | 36 +---
  include/drm/drm_gpuvm.h | 23 +++
  2 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 54f5e8851de5..d1d1c2379e44 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1085,6 +1085,36 @@ drm_gpuvm_put(struct drm_gpuvm *gpuvm)
  }
  EXPORT_SYMBOL_GPL(drm_gpuvm_put);
  
+static int

+exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
+unsigned int num_fences)
+{
+   return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) :
+   drm_exec_lock_obj(exec, obj);
+}
+
+/**
+ * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
+ * @gpuvm: the &drm_gpuvm
+ * @exec: the &drm_exec context
+ * @num_fences: the amount of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
+ *
+ * Using this function directly, it is the drivers responsibility to call
+ * drm_exec_init() and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
+struct drm_exec *exec,
+unsigned int num_fences)
+{
+   return exec_prepare_obj(exec, gpuvm->r_obj, num_fences);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_vm);
+
  static int
  __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
struct drm_exec *exec,
@@ -1095,7 +1125,7 @@ __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
int ret = 0;
  
  	for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) {

-   ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
+   ret = exec_prepare_obj(exec, vm_bo->obj, num_fences);
if (ret)
break;
}
@@ -1116,7 +1146,7 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm,
  
  	drm_gpuvm_resv_assert_held(gpuvm);

list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) {
-   ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
+   ret = exec_prepare_obj(exec, vm_bo->obj, num_fences);
if (ret)
break;
  
@@ -1186,7 +1216,7 @@ drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec,

drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
struct drm_gem_object *obj = va->gem.obj;
  
-		ret = drm_exec_prepare_obj(exec, obj, num_fences);

+   ret = exec_prepare_obj(exec, obj, num_fences);
if (ret)
return ret;
}
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index f94fec9a8517..b3f82ec7fb17 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -544,26 +544,9 @@ struct drm_gpuvm_exec {
} extra;
  };
  
-/**

- * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
- * @gpuvm: the &drm_gpuvm
- * @exec: the &drm_exec context
- * @num_fences: the amount of &dma_fences to reserve
- *
- * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
- *
- * Using this function directly, it is the drivers responsibility to call
- * drm_exec_init() and drm_exec_fini() accordingly.
- *
- * Returns: 0 on success, negative error code on failure.
- */
-static inline int
-drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
-struct drm_exec *exec,
-unsigned int num_fences)
-{
-   return drm_exec_prepare_obj(exec, gpuvm->r_obj, num_fences);
-}
+int drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
+struct drm_exec *exec,
+unsigned int num_fences);
  
  int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,

  struct drm_exec *exec,




[PATCH drm-misc-next 5/5] drm/imagination: vm: make use of GPUVM's drm_exec helper

2023-11-24 Thread Danilo Krummrich
Make use of GPUVM's drm_exec helper functions preventing direct access
to GPUVM internal data structures, such as the external object list.

This is especially important to ensure following the locking rules
around the GPUVM external object list.

Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code")
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/imagination/pvr_vm.c | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/imagination/pvr_vm.c 
b/drivers/gpu/drm/imagination/pvr_vm.c
index e0d74d9a6190..3f7888f5cc53 100644
--- a/drivers/gpu/drm/imagination/pvr_vm.c
+++ b/drivers/gpu/drm/imagination/pvr_vm.c
@@ -337,27 +337,21 @@ static int
 pvr_vm_bind_op_lock_resvs(struct drm_exec *exec, struct pvr_vm_bind_op 
*bind_op)
 {
drm_exec_until_all_locked(exec) {
-   struct drm_gem_object *r_obj = &bind_op->vm_ctx->dummy_gem;
struct drm_gpuvm *gpuvm = &bind_op->vm_ctx->gpuvm_mgr;
struct pvr_gem_object *pvr_obj = bind_op->pvr_obj;
-   struct drm_gpuvm_bo *gpuvm_bo;
 
/* Acquire lock on the vm_context's reserve object. */
-   int err = drm_exec_lock_obj(exec, r_obj);
+   int err = drm_gpuvm_prepare_vm(gpuvm, exec, 0);
 
drm_exec_retry_on_contention(exec);
if (err)
return err;
 
/* Acquire lock on all BOs in the context. */
-   list_for_each_entry(gpuvm_bo, &gpuvm->extobj.list,
-   list.entry.extobj) {
-   err = drm_exec_lock_obj(exec, gpuvm_bo->obj);
-
-   drm_exec_retry_on_contention(exec);
-   if (err)
-   return err;
-   }
+   err = drm_gpuvm_prepare_objects(gpuvm, exec, 0);
+   drm_exec_retry_on_contention(exec);
+   if (err)
+   return err;
 
/* Unmap operations don't have an object to lock. */
if (!pvr_obj)
-- 
2.42.0



[PATCH drm-misc-next 4/5] drm/gpuvm: fall back to drm_exec_lock_obj()

2023-11-24 Thread Danilo Krummrich
Fall back to drm_exec_lock_obj() if num_fences is zero for the
drm_gpuvm_prepare_* function family.

Otherwise dma_resv_reserve_fences() would actually allocate slots even
though num_fences is zero.

Cc: Christian König 
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/drm_gpuvm.c | 36 +---
 include/drm/drm_gpuvm.h | 23 +++
 2 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 54f5e8851de5..d1d1c2379e44 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1085,6 +1085,36 @@ drm_gpuvm_put(struct drm_gpuvm *gpuvm)
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_put);
 
+static int
+exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
+unsigned int num_fences)
+{
+   return num_fences ? drm_exec_prepare_obj(exec, obj, num_fences) :
+   drm_exec_lock_obj(exec, obj);
+}
+
+/**
+ * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
+ * @gpuvm: the _gpuvm
+ * @exec: the _exec context
+ * @num_fences: the amount of _fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for the GPUVMs dummy _gem_object.
+ *
+ * Using this function directly, it is the drivers responsibility to call
+ * drm_exec_init() and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
+struct drm_exec *exec,
+unsigned int num_fences)
+{
+   return exec_prepare_obj(exec, gpuvm->r_obj, num_fences);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_vm);
+
 static int
 __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
struct drm_exec *exec,
@@ -1095,7 +1125,7 @@ __drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
int ret = 0;
 
	for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) {
-   ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
+   ret = exec_prepare_obj(exec, vm_bo->obj, num_fences);
if (ret)
break;
}
@@ -1116,7 +1146,7 @@ drm_gpuvm_prepare_objects_locked(struct drm_gpuvm *gpuvm,
 
drm_gpuvm_resv_assert_held(gpuvm);
	list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) {
-   ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
+   ret = exec_prepare_obj(exec, vm_bo->obj, num_fences);
if (ret)
break;
 
@@ -1186,7 +1216,7 @@ drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct 
drm_exec *exec,
drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
struct drm_gem_object *obj = va->gem.obj;
 
-   ret = drm_exec_prepare_obj(exec, obj, num_fences);
+   ret = exec_prepare_obj(exec, obj, num_fences);
if (ret)
return ret;
}
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index f94fec9a8517..b3f82ec7fb17 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -544,26 +544,9 @@ struct drm_gpuvm_exec {
} extra;
 };
 
-/**
- * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
- * @gpuvm: the &drm_gpuvm
- * @exec: the &drm_exec context
- * @num_fences: the amount of &dma_fences to reserve
- *
- * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
- *
- * Using this function directly, it is the drivers responsibility to call
- * drm_exec_init() and drm_exec_fini() accordingly.
- *
- * Returns: 0 on success, negative error code on failure.
- */
-static inline int
-drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
-struct drm_exec *exec,
-unsigned int num_fences)
-{
-   return drm_exec_prepare_obj(exec, gpuvm->r_obj, num_fences);
-}
+int drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
+struct drm_exec *exec,
+unsigned int num_fences);
 
 int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
  struct drm_exec *exec,
-- 
2.42.0



[PATCH drm-misc-next 3/5] drm/imagination: vm: fix drm_gpuvm reference count

2023-11-24 Thread Danilo Krummrich
The driver-specific reference count indicates whether the VM should be
torn down, whereas GPUVM's reference count indicates whether the VM
structure can finally be freed.

Hence, free the VM structure in pvr_gpuvm_free() and drop the last
GPUVM reference after tearing down the VM. Generally, this prevents
lifetime issues such as the VM being freed while drm_gpuvm_bo
structures still hold references to the VM.
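
In other words (a hedged summary of the resulting lifetime rule, not code
from the patch):

	pvr_vm_context_release()   /* driver refcount hits zero */
		-> unmap, destroy MMU context, dummy GEM, lock
		-> drm_gpuvm_put()   /* drop the GPUVM reference last */

	pvr_gpuvm_free()           /* GPUVM refcount hits zero */
		-> kfree(vm_ctx)     /* only now free the embedding struct */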

Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code")
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/imagination/pvr_vm.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/imagination/pvr_vm.c 
b/drivers/gpu/drm/imagination/pvr_vm.c
index 1e89092c3dcc..e0d74d9a6190 100644
--- a/drivers/gpu/drm/imagination/pvr_vm.c
+++ b/drivers/gpu/drm/imagination/pvr_vm.c
@@ -64,6 +64,12 @@ struct pvr_vm_context {
struct drm_gem_object dummy_gem;
 };
 
+static inline
+struct pvr_vm_context *to_pvr_vm_context(struct drm_gpuvm *gpuvm)
+{
+   return container_of(gpuvm, struct pvr_vm_context, gpuvm_mgr);
+}
+
 struct pvr_vm_context *pvr_vm_context_get(struct pvr_vm_context *vm_ctx)
 {
if (vm_ctx)
@@ -535,7 +541,7 @@ pvr_device_addr_and_size_are_valid(struct pvr_vm_context 
*vm_ctx,
 
 void pvr_gpuvm_free(struct drm_gpuvm *gpuvm)
 {
-
+   kfree(to_pvr_vm_context(gpuvm));
 }
 
 static const struct drm_gpuvm_ops pvr_vm_gpuva_ops = {
@@ -655,12 +661,11 @@ pvr_vm_context_release(struct kref *ref_count)
WARN_ON(pvr_vm_unmap(vm_ctx, vm_ctx->gpuvm_mgr.mm_start,
 vm_ctx->gpuvm_mgr.mm_range));
 
-   drm_gpuvm_put(&vm_ctx->gpuvm_mgr);
pvr_mmu_context_destroy(vm_ctx->mmu_ctx);
drm_gem_private_object_fini(&vm_ctx->dummy_gem);
mutex_destroy(&vm_ctx->lock);
 
-   kfree(vm_ctx);
+   drm_gpuvm_put(&vm_ctx->gpuvm_mgr);
 }
 
 /**
-- 
2.42.0



[PATCH drm-misc-next 0/5] PowerVR VM fixes

2023-11-24 Thread Danilo Krummrich
Hi,

Some major GPUVM changes landed just before v8 of the PowerVR series. Since v8
went in rather quickly (the review process was otherwise finished), I haven't had the
chance to review the subsequent code changes.

Hence, this series with a few fixes in this context. Plus a minor GPUVM patch to
make the drm_gpuvm_prepare_* helpers useful for PowerVR.

- Danilo


Danilo Krummrich (5):
  drm/imagination: vm: prevent duplicate drm_gpuvm_bo instances
  drm/imagination: vm: check for drm_gpuvm_range_valid()
  drm/imagination: vm: fix drm_gpuvm reference count
  drm/gpuvm: fall back to drm_exec_lock_obj()
  drm/imagination: vm: make use of GPUVM's drm_exec helper

 drivers/gpu/drm/drm_gpuvm.c  | 36 --
 drivers/gpu/drm/imagination/pvr_vm.c | 46 +++-
 drivers/gpu/drm/imagination/pvr_vm.h |  3 +-
 include/drm/drm_gpuvm.h  | 23 ++
 4 files changed, 63 insertions(+), 45 deletions(-)


base-commit: 46990918f35c1bf6e367cf8e0423e7344fec9fcb
-- 
2.42.0



[PATCH drm-misc-next 2/5] drm/imagination: vm: check for drm_gpuvm_range_valid()

2023-11-24 Thread Danilo Krummrich
Extend pvr_device_addr_and_size_are_valid() by the corresponding GPUVM
sanity checks. This includes a previously missing overflow check for
the base address and size of the requested mapping.
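
The overflow check itself boils down to something like the following; a
hedged sketch modeled on the GPUVM helper, not a copy of it:

	#include <linux/overflow.h>

	/* Reject ranges whose end wraps around the 64-bit address space. */
	static bool range_overflows(u64 addr, u64 range)
	{
		u64 end;

		return check_add_overflow(addr, range, &end);
	}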

Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code")
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/imagination/pvr_vm.c | 9 ++---
 drivers/gpu/drm/imagination/pvr_vm.h | 3 ++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/imagination/pvr_vm.c 
b/drivers/gpu/drm/imagination/pvr_vm.c
index 09d481c575b0..1e89092c3dcc 100644
--- a/drivers/gpu/drm/imagination/pvr_vm.c
+++ b/drivers/gpu/drm/imagination/pvr_vm.c
@@ -239,7 +239,7 @@ pvr_vm_bind_op_map_init(struct pvr_vm_bind_op *bind_op,
return -EINVAL;
}
 
-   if (!pvr_device_addr_and_size_are_valid(device_addr, size) ||
+   if (!pvr_device_addr_and_size_are_valid(vm_ctx, device_addr, size) ||
offset & ~PAGE_MASK || size & ~PAGE_MASK ||
offset >= pvr_obj_size || offset_plus_size > pvr_obj_size)
return -EINVAL;
@@ -295,7 +295,7 @@ pvr_vm_bind_op_unmap_init(struct pvr_vm_bind_op *bind_op,
 {
int err;
 
-   if (!pvr_device_addr_and_size_are_valid(device_addr, size))
+   if (!pvr_device_addr_and_size_are_valid(vm_ctx, device_addr, size))
return -EINVAL;
 
bind_op->type = PVR_VM_BIND_TYPE_UNMAP;
@@ -505,6 +505,7 @@ pvr_device_addr_is_valid(u64 device_addr)
 /**
  * pvr_device_addr_and_size_are_valid() - Tests whether a device-virtual
  * address and associated size are both valid.
+ * @vm_ctx: Target VM context.
  * @device_addr: Virtual device address to test.
  * @size: Size of the range based at @device_addr to test.
  *
@@ -523,9 +524,11 @@ pvr_device_addr_is_valid(u64 device_addr)
  *  * %false otherwise.
  */
 bool
-pvr_device_addr_and_size_are_valid(u64 device_addr, u64 size)
+pvr_device_addr_and_size_are_valid(struct pvr_vm_context *vm_ctx,
+  u64 device_addr, u64 size)
 {
return pvr_device_addr_is_valid(device_addr) &&
+  drm_gpuvm_range_valid(&vm_ctx->gpuvm_mgr, device_addr, size) &&
   size != 0 && (size & ~PVR_DEVICE_PAGE_MASK) == 0 &&
   (device_addr + size <= PVR_PAGE_TABLE_ADDR_SPACE_SIZE);
 }
diff --git a/drivers/gpu/drm/imagination/pvr_vm.h 
b/drivers/gpu/drm/imagination/pvr_vm.h
index cf8b97553dc8..f2a6463f2b05 100644
--- a/drivers/gpu/drm/imagination/pvr_vm.h
+++ b/drivers/gpu/drm/imagination/pvr_vm.h
@@ -29,7 +29,8 @@ struct drm_exec;
 /* Functions defined in pvr_vm.c */
 
 bool pvr_device_addr_is_valid(u64 device_addr);
-bool pvr_device_addr_and_size_are_valid(u64 device_addr, u64 size);
+bool pvr_device_addr_and_size_are_valid(struct pvr_vm_context *vm_ctx,
+   u64 device_addr, u64 size);
 
 struct pvr_vm_context *pvr_vm_create_context(struct pvr_device *pvr_dev,
 bool is_userspace_context);
-- 
2.42.0



[PATCH drm-misc-next 1/5] drm/imagination: vm: prevent duplicate drm_gpuvm_bo instances

2023-11-24 Thread Danilo Krummrich
Use drm_gpuvm_bo_obtain() instead of drm_gpuvm_bo_create(). The latter
should only be used in conjunction with drm_gpuvm_bo_obtain_prealloc().

drm_gpuvm_bo_obtain() re-uses existing instances of a given VM and BO
combination, whereas drm_gpuvm_bo_create() would always create a new
instance of struct drm_gpuvm_bo and hence leave us with duplicates.
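
A hedged sketch of the difference, assuming the GEM object's dma-resv
lock is held as drm_gpuvm_bo_obtain() requires:

	struct drm_gpuvm_bo *a, *b, *c;

	a = drm_gpuvm_bo_obtain(gpuvm, obj);
	b = drm_gpuvm_bo_obtain(gpuvm, obj);
	/* a == b: the existing VM/BO combination is found and its
	 * reference count is bumped instead of creating a duplicate. */

	c = drm_gpuvm_bo_create(gpuvm, obj);
	/* c is always a fresh, not yet linked instance, i.e. a duplicate
	 * if the VM/BO combination above already exists. */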

Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code")
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/imagination/pvr_vm.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/imagination/pvr_vm.c 
b/drivers/gpu/drm/imagination/pvr_vm.c
index 3ad1366294b9..09d481c575b0 100644
--- a/drivers/gpu/drm/imagination/pvr_vm.c
+++ b/drivers/gpu/drm/imagination/pvr_vm.c
@@ -224,6 +224,7 @@ pvr_vm_bind_op_map_init(struct pvr_vm_bind_op *bind_op,
struct pvr_gem_object *pvr_obj, u64 offset,
u64 device_addr, u64 size)
 {
+   struct drm_gem_object *obj = gem_from_pvr_gem(pvr_obj);
const bool is_user = vm_ctx == vm_ctx->pvr_dev->kernel_vm_ctx;
const u64 pvr_obj_size = pvr_gem_object_size(pvr_obj);
struct sg_table *sgt;
@@ -245,10 +246,11 @@ pvr_vm_bind_op_map_init(struct pvr_vm_bind_op *bind_op,
 
bind_op->type = PVR_VM_BIND_TYPE_MAP;
 
-   bind_op->gpuvm_bo = drm_gpuvm_bo_create(&vm_ctx->gpuvm_mgr,
-   gem_from_pvr_gem(pvr_obj));
-   if (!bind_op->gpuvm_bo)
-   return -ENOMEM;
+   dma_resv_lock(obj->resv, NULL);
+   bind_op->gpuvm_bo = drm_gpuvm_bo_obtain(&vm_ctx->gpuvm_mgr, obj);
+   dma_resv_unlock(obj->resv);
+   if (IS_ERR(bind_op->gpuvm_bo))
+   return PTR_ERR(bind_op->gpuvm_bo);
 
bind_op->new_va = kzalloc(sizeof(*bind_op->new_va), GFP_KERNEL);
bind_op->prev_va = kzalloc(sizeof(*bind_op->prev_va), GFP_KERNEL);
-- 
2.42.0



Re: [PATCH v3] driver: gpu: Fixing warning directly dereferencing a rcu pointer

2023-11-20 Thread Danilo Krummrich

On 11/13/23 20:13, Abhinav Singh wrote:

This patch fixes a sparse warning with the message
"warning: dereference of noderef expression". In this context it means we
are dereferencing a __rcu tagged pointer directly.

We should not directly dereference an RCU pointer. To get a normal
(non-__rcu tagged) pointer from a __rcu tagged pointer, we use
unrcu_pointer(...). The resulting pointer can then be dereferenced
just like a normal pointer.
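
A self-contained hedged sketch of the pattern; the type and field names
are illustrative:

	#include <linux/rcupdate.h>

	struct chan { int id; };

	struct fence_ctx {
		struct chan __rcu *channel;
	};

	static struct chan *fence_chan(struct fence_ctx *f)
	{
		/*
		 * unrcu_pointer() strips the __rcu annotation without an
		 * rcu_dereference() and hence without requiring an RCU
		 * read-side critical section; this is valid when the
		 * caller otherwise guarantees the pointer is stable.
		 */
		return unrcu_pointer(f->channel);
	}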

I tested with qemu with this command
qemu-system-x86_64 \
-m 2G \
-smp 2 \
-kernel bzImage \
-append "console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
-drive file=bullseye.img,format=raw \
-net user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:10021-:22 \
-net nic,model=e1000 \
-enable-kvm \
-nographic \
-pidfile vm.pid \
2>&1 | tee vm.log
with lockdep enabled.

Signed-off-by: Abhinav Singh 


Applied, thanks!

There are a few more such occurrences. [1][2] Do you plan to fix them as well?

[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nv10_fence.c#L35
[2] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/nouveau/nv84_fence.c#L88


---
v1 -> v2 : Replaced the rcu_dereference(...) with unrcu_pointer(...) and
also removed the rcu locking and unlocking function call.
v2 -> v3 : Changed the description of the patch to match it with the actual
   implementation.

  drivers/gpu/drm/nouveau/nv04_fence.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nv04_fence.c 
b/drivers/gpu/drm/nouveau/nv04_fence.c
index 5b71a5a5cd85..cdbc75e3d1f6 100644
--- a/drivers/gpu/drm/nouveau/nv04_fence.c
+++ b/drivers/gpu/drm/nouveau/nv04_fence.c
@@ -39,7 +39,7 @@ struct nv04_fence_priv {
  static int
  nv04_fence_emit(struct nouveau_fence *fence)
  {
-   struct nvif_push *push = fence->channel->chan.push;
+   struct nvif_push *push = unrcu_pointer(fence->channel)->chan.push;
int ret = PUSH_WAIT(push, 2);
if (ret == 0) {
PUSH_NVSQ(push, NV_SW, 0x0150, fence->base.seqno);




Re: [PATCH 1/2] drm/scheduler: improve GPU scheduler documentation v2

2023-11-16 Thread Danilo Krummrich
Hi Christian,

Thanks for sending an update of this patch!

On Thu, Nov 16, 2023 at 03:15:46PM +0100, Christian König wrote:
> Start to improve the scheduler document. Especially document the
> lifetime of each of the objects as well as the restrictions around
> DMA-fence handling and userspace compatibility.
> 
> v2: Some improvements suggested by Danilo, add section about error
> handling.
> 
> Signed-off-by: Christian König 
> ---
>  Documentation/gpu/drm-mm.rst   |  36 +
>  drivers/gpu/drm/scheduler/sched_main.c | 174 +
>  2 files changed, 188 insertions(+), 22 deletions(-)
> 
> diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> index acc5901ac840..112463fa9f3a 100644
> --- a/Documentation/gpu/drm-mm.rst
> +++ b/Documentation/gpu/drm-mm.rst
> @@ -552,12 +552,48 @@ Overview
>  .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
> :doc: Overview
>  
> +Job Object
> +--
> +
> +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
> +   :doc: Job Object
> +
> +Entity Object
> +-
> +
> +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
> +   :doc: Entity Object
> +
> +Hardware Fence Object
> +-
> +
> +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
> +   :doc: Hardware Fence Object
> +
> +Scheduler Fence Object
> +--
> +
> +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
> +   :doc: Scheduler Fence Object
> +
> +Scheduler and Run Queue Objects
> +---
> +
> +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
> +   :doc: Scheduler and Run Queue Objects
> +
>  Flow Control
>  
>  
>  .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
> :doc: Flow Control
>  
> +Error and Timeout handling
> +--
> +
> +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
> +   :doc: Error and Timeout handling
> +
>  Scheduler Function References
>  -
>  
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 044a8c4875ba..026123497b0e 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -24,28 +24,122 @@
>  /**
>   * DOC: Overview
>   *
> - * The GPU scheduler provides entities which allow userspace to push jobs
> - * into software queues which are then scheduled on a hardware run queue.
> - * The software queues have a priority among them. The scheduler selects the 
> entities
> - * from the run queue using a FIFO. The scheduler provides dependency 
> handling
> - * features among jobs. The driver is supposed to provide callback functions 
> for
> - * backend operations to the scheduler like submitting a job to hardware run 
> queue,
> - * returning the dependencies of a job etc.
> - *
> - * The organisation of the scheduler is the following:
> - *
> - * 1. Each hw run queue has one scheduler
> - * 2. Each scheduler has multiple run queues with different priorities
> - *(e.g., HIGH_HW,HIGH_SW, KERNEL, NORMAL)
> - * 3. Each scheduler run queue has a queue of entities to schedule
> - * 4. Entities themselves maintain a queue of jobs that will be scheduled on
> - *the hardware.
> - *
> - * The jobs in a entity are always scheduled in the order that they were 
> pushed.
> - *
> - * Note that once a job was taken from the entities queue and pushed to the
> - * hardware, i.e. the pending queue, the entity must not be referenced 
> anymore
> - * through the jobs entity pointer.
> + * The GPU scheduler implements some logic to decide which command submission
> + * to push next to the hardware. Another major use case of the GPU scheduler
> + * is to enforce correct driver behavior around those command submissions.
> + * Because of this it's also used by drivers which don't need the actual
> + * scheduling functionality.

This reads a bit like we're already right in the middle of the documentation.
I'd propose to start with something like "The DRM GPU scheduler serves as a
common component intended to help drivers to manage command submissions." And
then mention the different solutions the scheduler provides, e.g. ordering of
command submissions, dma-fence handling in the context of command submissions,
etc.

Also, I think it would be good to give a rough overview of the topology of the
scheduler's components. And since you already mention that the component is
"also used by drivers which don't need the actual scheduling functionality", I'd
also mention the special case of a single entity and a single run-queue per
scheduler.
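
For instance, a hedged sketch of that 1:1 case; it assumes a dedicated
scheduler set up elsewhere (the chan->sched field is illustrative):

	struct drm_gpu_scheduler *sched_list[] = { &chan->sched };
	struct drm_sched_entity entity;
	int ret;

	/* One entity feeding one dedicated scheduler: no real scheduling
	 * decision is left to the DRM GPU scheduler itself. */
	ret = drm_sched_entity_init(&entity, DRM_SCHED_PRIORITY_NORMAL,
				    sched_list, ARRAY_SIZE(sched_list),
				    NULL);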

> + *
> + * All callbacks the driver needs to implement are restricted by DMA-fence
> + * signaling rules to guarantee deadlock free forward progress. This 
> especially
> + * means that for normal operation no memory can be allocated in a callback.
> + * All memory which is needed for pushing the job to the hardware must be
> + * allocated before 
