Re: [PATCH 0/9] drm: Annotate structs with __counted_by

2023-10-05 Thread Kees Cook
On Thu, Oct 05, 2023 at 11:42:38AM +0200, Christian König wrote:
> On 02.10.23 20:22, Kees Cook wrote:
> > On Mon, Oct 02, 2023 at 08:11:41PM +0200, Christian König wrote:
> > > On 02.10.23 20:08, Kees Cook wrote:
> > > > On Mon, Oct 02, 2023 at 08:01:57PM +0200, Christian König wrote:
> > > > > On 02.10.23 18:53, Kees Cook wrote:
> > > > > > On Mon, Oct 02, 2023 at 11:06:19AM -0400, Alex Deucher wrote:
> > > > > > > On Mon, Oct 2, 2023 at 5:20 AM Christian König wrote:
> > > > > > > > On 29.09.23 21:33, Kees Cook wrote:
> > > > > > > > > On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote:
> > > > > > > > > > This is a batch of patches touching drm to prepare for the coming
> > > > > > > > > > implementation by GCC and Clang of the __counted_by attribute. Flexible
> > > > > > > > > > array members annotated with __counted_by can have their accesses
> > > > > > > > > > bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing)
> > > > > > > > > > and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).
> > > > > > > > > > 
> > > > > > > > > > As found with Coccinelle[1], add __counted_by to structs that would
> > > > > > > > > > benefit from the annotation.
> > > > > > > > > > 
> > > > > > > > > > [...]
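For context, the annotation being discussed looks like this on a flexible array
member (a minimal sketch; hypothetical struct, not one from the series):

	struct example_table {
		u32 count;
		u32 entries[] __counted_by(count);
	};

With the attribute in place, CONFIG_UBSAN_BOUNDS can trap entries[i] accesses
where i >= count, and CONFIG_FORTIFY_SOURCE can bound memcpy()-family writes
into entries.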
> > > > > > > > > Since this got Acks, I figure I should carry it in my tree. Let me
> > > > > > > > > know if this should go via drm instead.
> > > > > > > > > 
> > > > > > > > > Applied to for-next/hardening, thanks!
> > > > > > > > > 
> > > > > > > > > [1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
> > > > > > > > >   https://git.kernel.org/kees/c/a6046ac659d6
> > > > > > > > STOP! In a follow-up discussion Alex and I figured out that this
> > > > > > > > won't work.
> > > > > > I'm so confused; from the discussion I saw that Alex said both
> > > > > > instances were false positives?
> > > > > > 
> > > > > > > > The value in the structure is byte swapped based on some firmware
> > > > > > > > endianness which does not necessarily match the CPU endianness.
> > > > > > > SMU10 is APU only so the endianness of the SMU firmware and the CPU
> > > > > > > will always match.
> > > > > > Which I think is what is being said here?
> > > > > > 
> > > > > > > > Please revert that one from going upstream if it's already on its way.
> > > > > > > > 
> > > > > > > > And because of those reasons I strongly think that patches like this
> > > > > > > > should go through the DRM tree :)
> > > > > > Sure, that's fine -- please let me know. It was Acked by others, etc.
> > > > > > Who should carry these patches?
> > > > > Probably best if the relevant maintainers pick them up individually.
> > > > > 
> > > > > Some of those structures are filled in by firmware/hardware and only the
> > > > > maintainers can judge if that value actually matches what the compiler
> > > > > needs.
> > > > > 
> > > > > We have cases where individual bits are used as flags or when the size is
> > > > > byte swapped etc...
> > > > > 
> > > > > Even Alex and I couldn't immediately say how and where that field is
> > > > > actually used and had to dig that up. That's where the confusion came from.
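For illustration, the failure mode being described is roughly this (hypothetical
struct, not one of the nine patches):

	struct fw_table {
		__le32 count;	/* written by firmware, in firmware endianness */
		u32 entries[];	/* NOT safe to annotate __counted_by(count) */
	};

	static u32 fw_table_entry(const struct fw_table *t, u32 i)
	{
		u32 n = le32_to_cpu(t->count);	/* must swap before trusting */

		return i < n ? t->entries[i] : 0;
	}

On a big-endian CPU the raw count field holds a byte-swapped value, so a
compiler-inserted bounds check against it would compare indices against the
wrong number of elements.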
> > > > Okay, I've dropped them all from my tree. Several had Acks/Reviews, so
> > > > hopefully those can get picked up for the DRM tree?
> > > I will pick those up to go through drm-misc-next.
> > > 
> > > Going to ping maintainers once more when I'm not sure if stuff is
> > > correct or not.
> > Sounds great; thanks!
> 
> I wasn't 100% sure for the VC4 patch, but pushed the whole set to
> drm-misc-next anyway.
> 
> This also means that the patches are now auto merged into the drm-tip
> integration branch and should any build or unit test go boom we should
> notice immediately and can revert it pretty easily.

Thanks very much; I'll keep an eye out for any reports.

-- 
Kees Cook



Re: [Nouveau] [PATCH drm-misc-next v5 4/6] drm/gpuvm: track/lock/validate external/evicted objects

2023-10-05 Thread Thomas Hellström



On 9/28/23 21:16, Danilo Krummrich wrote:

Currently the DRM GPUVM offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVM represent
a basis for GPU-VM implementations. In this context, this patch aims
at generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
shared with other GPU-VMs).

3) Provide functions to efficiently lock the dma-resv locks of all GEM
objects the GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.

Big thanks to Boris Brezillon for his help figuring out locking for drivers
updating the GPU VA space within the fence signalling path.

Suggested-by: Matthew Brost 
Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/drm_gpuvm.c | 642 ++++++++++++++++++++++++++++++++++++++++
  include/drm/drm_gpuvm.h     | 240 ++++++++++++++
  2 files changed, 882 insertions(+)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 27100423154b..770bb3d68d1f 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -82,6 +82,21 @@
   * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
   * particular combination. If not existent a new instance is created and linked
   * to the &drm_gem_object.
+ *
+ * &drm_gpuvm_bo structures, since unique for a given &drm_gpuvm, are also used
+ * as entry for the &drm_gpuvm's lists of external and evicted objects. Those
+ * lists are maintained in order to accelerate locking of dma-resv locks and
+ * validation of evicted objects bound in a &drm_gpuvm. For instance, all
+ * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be locked by calling
+ * drm_gpuvm_exec_lock(). Once locked drivers can call drm_gpuvm_validate() in
+ * order to validate all evicted &drm_gem_objects. It is also possible to lock
+ * additional &drm_gem_objects by providing the corresponding parameters to
+ * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making
+ * use of helper functions such as drm_gpuvm_prepare_range() or
+ * drm_gpuvm_prepare_objects().
+ *
+ * Every bound &drm_gem_object is treated as external object when its &dma_resv
+ * structure is different than the &drm_gpuvm's common &dma_resv structure.
   */
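As a rough usage sketch of the lock-then-validate flow described above (helper
names appear in this series; exact signatures may differ from the final
version, gpuvm comes from the driver context, and error paths are trimmed):

	struct drm_gpuvm_exec vm_exec = {
		.vm = gpuvm,
		.num_fences = 1,
	};
	int ret;

	ret = drm_gpuvm_exec_lock(&vm_exec);	/* lock all dma-resvs of the VM */
	if (ret)
		return ret;

	ret = drm_gpuvm_validate(gpuvm, &vm_exec.exec);	/* revalidate evicted BOs */
	if (!ret)
		submit_job();	/* hypothetical driver step */

	drm_gpuvm_exec_unlock(&vm_exec);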
  
  /**

@@ -429,6 +444,20 @@
   * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
   * &drm_gem_object must be able to observe previous creations and destructions
   * of &drm_gpuvm_bos in order to keep instances unique.
+ *
+ * The &drm_gpuvm's lists for keeping track of external and evicted objects are
+ * protected against concurrent insertion / removal and iteration internally.
+ *
+ * However, drivers still need to protect concurrent calls to functions
+ * iterating those lists, namely drm_gpuvm_prepare_objects() and
+ * drm_gpuvm_validate().
+ *
+ * Alternatively, drivers can set the &DRM_GPUVM_RESV_PROTECTED flag to indicate
+ * that the corresponding &dma_resv locks are held in order to protect the
+ * lists. If &DRM_GPUVM_RESV_PROTECTED is set, internal locking is disabled and
+ * the corresponding lockdep checks are enabled. This is an optimization for
+ * drivers which are capable of taking the corresponding &dma_resv locks and
+ * hence do not require internal locking.
   */
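A sketch of the opt-in described above (the flag name is from this series; the
drm_gpuvm_init() argument list is illustrative rather than the exact v5
signature):

	/* The driver guarantees it holds the relevant dma-resv locks around
	 * list iteration, so GPUVM skips its internal list locks and enables
	 * lockdep asserts on the resv locks instead.
	 */
	drm_gpuvm_init(&vm->gpuvm, "example-vm", DRM_GPUVM_RESV_PROTECTED,
		       drm, root_obj, 0, 1ULL << 48, 0, 0, &my_gpuvm_ops);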
  
  /**

@@ -641,6 +670,195 @@
   *}
   */
  
+/**
+ * get_next_vm_bo_from_list() - get the next vm_bo element
+ * @__gpuvm: The GPU VM
+ * @__list_name: The name of the list we're iterating on
+ * @__local_list: A pointer to the local list used to store already iterated items
+ * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
+ *
+ * This helper is here to provide lockless list iteration. Lockless as in, the
+ * iterator releases the lock immediately after picking the first element from
+ * the list, so list insertion and deletion can happen concurrently.
+ *
+ * Elements popped from the original list are kept in a local list, so removal
+ * and is_empty checks can still happen while we're iterating the list.
+ */
+#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
+	({										\
+		struct drm_gpuvm_bo *__vm_bo = NULL;					\
+											\
+		drm_gpuvm_bo_put(__prev_vm_bo);						\
+											\
+
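The iteration pattern the macro enables looks roughly like this (a sketch; the
validate step follows the series' evicted-objects use case, gpuvm/ops/exec come
from the driver context, and error handling is elided):

	struct drm_gpuvm_bo *vm_bo = NULL;
	LIST_HEAD(local);
	int ret;

	/* Pop one element under the list lock, run the body unlocked, repeat;
	 * concurrent insertion/removal on the original list stays safe.
	 */
	while ((vm_bo = get_next_vm_bo_from_list(gpuvm, evict, &local, vm_bo))) {
		ret = ops->vm_bo_validate(vm_bo, exec);
		if (ret)
			break;
	}
	/* Already-visited elements sit on the local list and get spliced
	 * back onto the original list afterwards.
	 */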

Re: [Nouveau] [PATCH drm-misc-next v5 3/6] drm/gpuvm: add an abstraction for a VM / BO combination

2023-10-05 Thread Thomas Hellström

Hi,

On 9/28/23 21:16, Danilo Krummrich wrote:

This patch adds an abstraction layer between the drm_gpuva mappings of

NIT: imperative: s/This patch adds/Add/

a particular drm_gem_object and this GEM object itself. The abstraction
represents a combination of a drm_gem_object and drm_gpuvm. The
drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
representing this abstraction), while each drm_gpuvm_bo contains a list of
mappings of this GEM object.

This has multiple advantages:

1) We can use the drm_gpuvm_bo structure to attach it to various lists
of the drm_gpuvm. This is useful for tracking external and evicted
objects per VM, which is introduced in subsequent patches.

2) Finding mappings of a certain drm_gem_object mapped in a certain
drm_gpuvm becomes much cheaper.

3) Drivers can derive and extend the structure to easily represent
driver specific states of a BO for a certain GPUVM.

The idea of this abstraction was taken from amdgpu, hence the credit for
this idea goes to the developers of amdgpu.
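Schematically, the relationships described above look like this (field names
are illustrative, not the exact ones from the patch):

	struct drm_gpuvm_bo {
		struct drm_gpuvm	*vm;		/* the VM side of the pair */
		struct drm_gem_object	*obj;		/* the BO side of the pair */
		struct kref		kref;		/* instances are refcounted */
		struct list_head	gpuva_list;	/* drm_gpuvas mapping obj in vm */
		struct list_head	gem_entry;	/* entry in obj's vm_bo list */
	};

A driver can embed and extend this structure for per-VM BO state, which is what
advantage 3) above refers to.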

Cc: Christian König 
Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/drm_gpuvm.c| 334 +
  drivers/gpu/drm/nouveau/nouveau_uvmm.c |  64 +++--
  include/drm/drm_gem.h  |  32 +--
  include/drm/drm_gpuvm.h| 177 -
  4 files changed, 523 insertions(+), 84 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 6368dfdbe9dd..27100423154b 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -70,6 +70,18 @@
   * &drm_gem_object, such as the &drm_gem_object containing the root page table,
   * but it can also be a 'dummy' object, which can be allocated with
   * drm_gpuvm_root_object_alloc().
+ *
+ * In order to connect a struct drm_gpuva to its backing &drm_gem_object each
+ * &drm_gem_object maintains a list of &drm_gpuvm_bo structures, and each
+ * &drm_gpuvm_bo contains a list of &drm_gpuva structures.
+ *
+ * A &drm_gpuvm_bo is an abstraction that represents a combination of a
+ * &drm_gpuvm and a &drm_gem_object. Every such combination should be unique.
+ * This is ensured by the API through drm_gpuvm_bo_obtain() and
+ * drm_gpuvm_bo_obtain_prealloc() which first look into the corresponding
+ * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
+ * particular combination. If not existent a new instance is created and linked
+ * to the &drm_gem_object.
   */
  
  /**

@@ -395,21 +407,28 @@
  /**
   * DOC: Locking
   *
- * Generally, the GPU VA manager does not take care of locking itself, it is
- * the drivers responsibility to take care about locking. Drivers might want to
- * protect the following operations: inserting, removing and iterating
- * &drm_gpuva objects as well as generating all kinds of operations, such as
- * split / merge or prefetch.
- *
- * The GPU VA manager also does not take care of the locking of the backing
- * &drm_gem_object buffers GPU VA lists by itself; drivers are responsible to
- * enforce mutual exclusion using either the GEMs dma_resv lock or alternatively
- * a driver specific external lock. For the latter see also
- * drm_gem_gpuva_set_lock().
- *
- * However, the GPU VA manager contains lockdep checks to ensure callers of its
- * API hold the corresponding lock whenever the &drm_gem_objects GPU VA list is
- * accessed by functions such as drm_gpuva_link() or drm_gpuva_unlink().
+ * In terms of managing &drm_gpuva entries DRM GPUVM does not take care of
+ * locking itself, it is the drivers responsibility to take care about locking.
+ * Drivers might want to protect the following operations: inserting, removing
+ * and iterating &drm_gpuva objects as well as generating all kinds of
+ * operations, such as split / merge or prefetch.
+ *
+ * DRM GPUVM also does not take care of the locking of the backing
+ * &drm_gem_object buffers GPU VA lists and &drm_gpuvm_bo abstractions by
+ * itself; drivers are responsible to enforce mutual exclusion using either the
+ * GEMs dma_resv lock or alternatively a driver specific external lock. For the
+ * latter see also drm_gem_gpuva_set_lock().
+ *
+ * However, DRM GPUVM contains lockdep checks to ensure callers of its API hold
+ * the corresponding lock whenever the &drm_gem_objects GPU VA list is accessed
+ * by functions such as drm_gpuva_link() or drm_gpuva_unlink(), but also
+ * drm_gpuvm_bo_obtain() and drm_gpuvm_bo_put().
+ *
+ * The latter is required since on creation and destruction of a &drm_gpuvm_bo
+ * the &drm_gpuvm_bo is attached / removed from the &drm_gem_objects gpuva list.
+ * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
+ * &drm_gem_object must be able to observe previous creations and destructions
+ * of &drm_gpuvm_bos in order to keep instances unique.
   */
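For the external-lock variant mentioned above, usage would look roughly like
this (drm_gem_gpuva_set_lock() is named in the text; vm->gpuva_lock is a
hypothetical driver mutex):

	/* Tell GPUVM which lock protects this GEM's gpuva list, so the
	 * lockdep checks assert against it instead of the dma-resv lock.
	 */
	drm_gem_gpuva_set_lock(obj, &vm->gpuva_lock);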
  
  /**

@@ -439,6 +458,7 @@
   *	{
   *		struct drm_gpuva_ops *ops;
   *		struct drm_gpuva_op *op
+ *		struct drm_gpuvm_bo *vm_bo;
   *
   *		driver_lock_va_space();
   *		ops = 


Re: [Nouveau] [PATCH drm-misc-next v5 0/6] [RFC] DRM GPUVM features

2023-10-05 Thread Thomas Hellström

Hi, Danilo

On 9/28/23 21:16, Danilo Krummrich wrote:

Currently GPUVM offers common infrastructure to track GPU VA allocations
and mappings, generically connect GPU VA mappings to their backing
buffers and perform more complex mapping operations on the GPU VA space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make GPUVM represent the
basis of a VM implementation. In this context, this patch series aims at
generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
shared with other GPU-VMs).

3) Provide functions to efficiently lock the dma-resv locks of all GEM
objects the GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.

The implementation introduces struct drm_gpuvm_bo, which serves as an
abstraction combining a struct drm_gpuvm and struct drm_gem_object, similar to
what amdgpu does with struct amdgpu_bo_vm. While this adds a bit of complexity,
it improves the efficiency of tracking external and evicted GEM objects.

This patch series is also available at [1].

[1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/gpuvm-next

Changes in V2:
==
   - rename 'drm_gpuva_manager' -> 'drm_gpuvm' which generally leads to more
 consistent naming
   - properly separate commits (introduce common dma-resv, drm_gpuvm_bo
 abstraction, etc.)
   - remove maple tree for tracking external objects, use a list of
     drm_gpuvm_bos per drm_gpuvm instead
   - rework dma-resv locking helpers (Thomas)
   - add a locking helper for a given range of the VA space (Christian)
   - make the GPUVA manager buildable as module, rather than drm_exec
 builtin (Christian)

Changes in V3:
==
   - rename missing function and files (Boris)
   - warn if vm_obj->obj != obj in drm_gpuva_link() (Boris)
   - don't expose drm_gpuvm_bo_destroy() (Boris)
   - unlink VM_BO from GEM in drm_gpuvm_bo_destroy() rather than
 drm_gpuva_unlink() and link within drm_gpuvm_bo_obtain() to keep
 drm_gpuvm_bo instances unique
   - add internal locking to external and evicted object lists to support
     drivers updating the VA space from within the fence signalling critical
     path (Boris)
   - unlink external objects and evicted objects from the GPUVM's list in
 drm_gpuvm_bo_destroy()
   - add more documentation and fix some kernel doc issues

Changes in V4:
==
   - add a drm_gpuvm_resv() helper (Boris)
   - add a drm_gpuvm::local_list field (Boris)
   - remove drm_gpuvm_bo_get_unless_zero() helper (Boris)
   - fix missing NULL assignment in get_next_vm_bo_from_list() (Boris)
   - keep a drm_gem_object reference on potential vm_bo destroy (alternatively
     we could free the vm_bo and drop the vm_bo's drm_gem_object reference
     through async work)
   - introduce DRM_GPUVM_RESV_PROTECTED flag to indicate external locking
     through the corresponding dma-resv locks to optimize for drivers already
     holding them when needed; add the corresponding lock_assert_held() calls
     (Thomas)
   - make drm_gpuvm_bo_evict() per vm_bo and add a drm_gpuvm_bo_gem_evict()
     helper (Thomas)
   - pass a drm_gpuvm_bo in drm_gpuvm_ops::vm_bo_validate() (Thomas)
   - documentation fixes

Changes in V5:
==
   - use a root drm_gem_object provided by the driver as a base for the VM's
     common dma-resv (Christian)
   - provide a helper to allocate a "dummy" root GEM object in case a driver
     specific root GEM object isn't available (see the sketch after this
     changelog)
   - add a dedicated patch for nouveau to make use of the GPUVM's shared
     dma-resv
   - improve documentation (Boris)
   - the following patches are removed from the series, since they already
     landed in drm-misc-next
     - f72c2db47080 ("drm/gpuvm: rename struct drm_gpuva_manager to struct
       drm_gpuvm")
     - fe7acaa727e1 ("drm/gpuvm: allow building as module")
     - 78f54469b871 ("drm/nouveau: uvmm: rename 'umgr' to 'base'")
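A sketch of the "dummy" root object item above (drm_gpuvm_root_object_alloc()
is named in the series' documentation; the surrounding code is illustrative):

	/* No natural root BO (e.g. a page-table BO) to borrow a dma-resv
	 * from? Allocate a placeholder GEM object; its resv becomes the
	 * VM-common one.
	 */
	struct drm_gem_object *root_obj = drm_gpuvm_root_object_alloc(drm);

	if (!root_obj)
		return -ENOMEM;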

Danilo Krummrich (6):
   drm/gpuvm: add common dma-resv per struct drm_gpuvm
   drm/gpuvm: add drm_gpuvm_flags to drm_gpuvm
   drm/gpuvm: add an abstraction for a VM / BO combination
   drm/gpuvm: track/lock/validate external/evicted objects
   drm/nouveau: make use of the GPUVM's shared dma-resv
   drm/nouveau: use GPUVM common infrastructure

  drivers/gpu/drm/drm_gpuvm.c             | 1036 ++++++++++++++++++++++++---
  drivers/gpu/drm/nouveau/nouveau_bo.c    |   15 +-
  drivers/gpu/drm/nouveau/nouveau_bo.h    |    5 +
  drivers/gpu/drm/nouveau/nouveau_exec.c  |   52 +-
  drivers/gpu/drm/nouveau/nouveau_exec.h  |    4 -
  drivers/gpu/drm/nouveau/nouveau_gem.c   |   10 +-
  drivers/gpu/drm/nouveau/nouveau_sched.h |    4 +-
  drivers/gpu/drm/nouveau/nouveau_uvmm.c