Re: [PATCH v4 03/17] drm/imagination/uapi: Add PowerVR driver UAPI

2023-08-15 Thread Frank Binns
Hi Michel,

Thank you for the feedback (comments below).

On Tue, 2023-08-08 at 15:46 +0200, Michel Dänzer wrote:
> On 7/14/23 16:25, Sarah Walker wrote:
> > +/**
> > + * DOC: PowerVR IOCTL CREATE_BO interface
> > + */
> > +
> > +/**
> > + * DOC: Flags for CREATE_BO
> > + *
> > + * The  drm_pvr_ioctl_create_bo_args.flags field is 64 bits wide 
> > and consists
> > + * of three groups of flags: creation, device mapping and CPU mapping.
> > + *
> > + * We use "device" to refer to the GPU here because of the ambiguity 
> > between
> > + * CPU and GPU in some fonts.
> > + *
> > + * Creation options
> > + *These use the prefix ``DRM_PVR_BO_CREATE_``.
> > + *
> > + *:ZEROED: Require the allocated buffer to be zeroed before returning. 
> > Note
> > + *  that this is an active operation, and is never zero cost. Unless 
> > it is
> > + *  explicitly required, this option should not be set.
> 
> Making this optional is kind of problematic from a security standpoint 
> (information leak, at least if the memory was previously used by a different 
> process). See e.g. the discussion starting at 
> https://gitlab.freedesktop.org/mesa/mesa/-/issues/9189#note_1972986 .
> 
> AFAICT the approach I suggested there (Clear freed memory in the background, 
> and make it available for allocation again only once the clear has finished) 
> isn't really possible with gem_shmem in its current state though. There seems 
> to be ongoing work to do something like that for __GFP_ZERO in general, maybe 
> gem_shmem could take advantage of that when it lands. I'm afraid this series 
> can't depend on that though.
> 

Good point! We'll remove this flag and always zero the memory at allocation
time. We can then look to do something better in the future once the driver is
merged.

> 
> > +/**
> > + * DOC: PowerVR IOCTL VM_MAP and VM_UNMAP interfaces
> > + *
> > + * The VM UAPI allows userspace to create buffer object mappings in GPU 
> > virtual address space.
> > + *
> > + * The client is responsible for managing GPU address space. It should 
> > allocate mappings within
> > + * the heaps returned by %DRM_PVR_DEV_QUERY_HEAP_INFO_GET.
> > + *
> > + * %DRM_IOCTL_PVR_VM_MAP creates a new mapping. The client provides the 
> > target virtual address for
> > + * the mapping. Size and offset within the mapped buffer object can be 
> > specified, so the client can
> > + * partially map a buffer.
> > + *
> > + * %DRM_IOCTL_PVR_VM_UNMAP removes a mapping. The entire mapping will be 
> > removed from GPU address
> > + * space. For this reason only the start address is provided by the client.
> > + */
> 
> FWIW, the amdgpu driver uses a single ioctl for VM map & unmap (plus two 
> additional operations for partial residency). Maybe this would make sense for 
> the PowerVR driver as well, in particular if it might support partial 
> residency in the future.
> 
> (amdgpu also uses similar multiplexer ioctls for other things such as context 
> create/destroy/...)
> 
> Just an idea, feel free to ignore.
> 

We originally had a single ioctl for VM map/unmap, which was inspired by amdgpu,
but we were told to split this into separate ioctls:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15243#note_1283760

> 
> > +/**
> > + * DOC: Flags for SUBMIT_JOB ioctl geometry command.
> > + *
> > + * .. c:macro:: DRM_PVR_SUBMIT_JOB_GEOM_CMD_FIRST
> > + *
> > + *Indicates if this the first command to be issued for a render.
> > + *
> > + * .. c:macro:: DRM_PVR_SUBMIT_JOB_GEOM_CMD_LAST
> 
> Does user space really need to pass in the FIRST/LAST flags, can't the kernel 
> driver determine this implicitly? What happens if user space sets these 
> incorrectly?
> 

I don't think there's a way for the kernel driver to determine when the LAST
flag should be set. The obvious time to do this would be when it first
encounters a fragment job and there are geometry jobs proceeding it. However,
there is a least one case (transform feedback) when we want to submit a geometry
job without a fragment job and we need to set the LAST flag. I can't think of an
obvious way to detect this in the kernel driver.

If the flags aren't set correctly then it can cause the hardware to lockup.

> 
> > + * .. c:macro:: DRM_PVR_SUBMIT_JOB_FRAG_CMD_PREVENT_CDM_OVERLAP
> > + *
> > + *Disallow compute overlapped with this render.
> 
> Does this affect only compute from the same context, or also from other 
> contexts?
> 

Fragment and compute jobs are submitted on separate contexts. This flag is set
on fragment jobs and will prevent the job in question overlapping with compute
jobs across all compute contexts. This is necessary in some cases to avoid
hardware lockups on those GPUs that support compute overlapping with other job
types.

> (Similar question for DRM_PVR_SUBMIT_JOB_COMPUTE_CMD_PREVENT_ALL_OVERLAP)
> 

Likewise, this will prevent compute jobs with this flag set from overlapping
with all other jobs across all contexts. Again, this is to avoid hardware

Re: [PATCH v4 03/17] drm/imagination/uapi: Add PowerVR driver UAPI

2023-08-08 Thread Michel Dänzer
On 7/14/23 16:25, Sarah Walker wrote:
> 
> +/**
> + * DOC: PowerVR IOCTL CREATE_BO interface
> + */
> +
> +/**
> + * DOC: Flags for CREATE_BO
> + *
> + * The  drm_pvr_ioctl_create_bo_args.flags field is 64 bits wide and 
> consists
> + * of three groups of flags: creation, device mapping and CPU mapping.
> + *
> + * We use "device" to refer to the GPU here because of the ambiguity between
> + * CPU and GPU in some fonts.
> + *
> + * Creation options
> + *These use the prefix ``DRM_PVR_BO_CREATE_``.
> + *
> + *:ZEROED: Require the allocated buffer to be zeroed before returning. 
> Note
> + *  that this is an active operation, and is never zero cost. Unless it 
> is
> + *  explicitly required, this option should not be set.

Making this optional is kind of problematic from a security standpoint 
(information leak, at least if the memory was previously used by a different 
process). See e.g. the discussion starting at 
https://gitlab.freedesktop.org/mesa/mesa/-/issues/9189#note_1972986 .

AFAICT the approach I suggested there (Clear freed memory in the background, 
and make it available for allocation again only once the clear has finished) 
isn't really possible with gem_shmem in its current state though. There seems 
to be ongoing work to do something like that for __GFP_ZERO in general, maybe 
gem_shmem could take advantage of that when it lands. I'm afraid this series 
can't depend on that though.


> +/**
> + * DOC: PowerVR IOCTL VM_MAP and VM_UNMAP interfaces
> + *
> + * The VM UAPI allows userspace to create buffer object mappings in GPU 
> virtual address space.
> + *
> + * The client is responsible for managing GPU address space. It should 
> allocate mappings within
> + * the heaps returned by %DRM_PVR_DEV_QUERY_HEAP_INFO_GET.
> + *
> + * %DRM_IOCTL_PVR_VM_MAP creates a new mapping. The client provides the 
> target virtual address for
> + * the mapping. Size and offset within the mapped buffer object can be 
> specified, so the client can
> + * partially map a buffer.
> + *
> + * %DRM_IOCTL_PVR_VM_UNMAP removes a mapping. The entire mapping will be 
> removed from GPU address
> + * space. For this reason only the start address is provided by the client.
> + */

FWIW, the amdgpu driver uses a single ioctl for VM map & unmap (plus two 
additional operations for partial residency). Maybe this would make sense for 
the PowerVR driver as well, in particular if it might support partial residency 
in the future.

(amdgpu also uses similar multiplexer ioctls for other things such as context 
create/destroy/...)

Just an idea, feel free to ignore.


> +/**
> + * DOC: Flags for SUBMIT_JOB ioctl geometry command.
> + *
> + * .. c:macro:: DRM_PVR_SUBMIT_JOB_GEOM_CMD_FIRST
> + *
> + *Indicates if this the first command to be issued for a render.
> + *
> + * .. c:macro:: DRM_PVR_SUBMIT_JOB_GEOM_CMD_LAST

Does user space really need to pass in the FIRST/LAST flags, can't the kernel 
driver determine this implicitly? What happens if user space sets these 
incorrectly?


> + * .. c:macro:: DRM_PVR_SUBMIT_JOB_FRAG_CMD_PREVENT_CDM_OVERLAP
> + *
> + *Disallow compute overlapped with this render.

Does this affect only compute from the same context, or also from other 
contexts?

(Similar question for DRM_PVR_SUBMIT_JOB_COMPUTE_CMD_PREVENT_ALL_OVERLAP)


P.S. I mostly just skimmed the other patches of the series, but my impression 
is that the patches and code are cleanly structured and well-documented.

-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



[PATCH v4 03/17] drm/imagination/uapi: Add PowerVR driver UAPI

2023-07-14 Thread Sarah Walker
Add the UAPI implementation for the PowerVR driver.

Signed-off-by: Sarah Walker 
---
 MAINTAINERS|1 +
 include/uapi/drm/pvr_drm.h | 1304 
 2 files changed, 1305 insertions(+)
 create mode 100644 include/uapi/drm/pvr_drm.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 0763388b31ef..4c3bcc68f152 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10105,6 +10105,7 @@ M:  Sarah Walker 
 M: Donald Robson 
 S: Supported
 F: Documentation/devicetree/bindings/gpu/img,powervr.yaml
+F: include/uapi/drm/pvr_drm.h
 
 IMON SOUNDGRAPH USB IR RECEIVER
 M: Sean Young 
diff --git a/include/uapi/drm/pvr_drm.h b/include/uapi/drm/pvr_drm.h
new file mode 100644
index ..58533659a0c8
--- /dev/null
+++ b/include/uapi/drm/pvr_drm.h
@@ -0,0 +1,1304 @@
+/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */
+/* Copyright (c) 2022 Imagination Technologies Ltd. */
+
+#ifndef PVR_DRM_UAPI_H
+#define PVR_DRM_UAPI_H
+
+#include "drm.h"
+
+#include 
+#include 
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+/**
+ * DOC: PowerVR UAPI
+ *
+ * The PowerVR IOCTL argument structs have a few limitations in place, in
+ * addition to the standard kernel restrictions:
+ *
+ *  - All members must be type-aligned.
+ *  - The overall struct must be padded to 64-bit alignment.
+ *  - Explicit padding is almost always required. This takes the form of
+ *``_padding_[x]`` members of sufficient size to pad to the next 
power-of-two
+ *alignment, where [x] is the offset into the struct in hexadecimal. Arrays
+ *are never used for alignment. Padding fields must be zeroed; this is
+ *always checked.
+ *  - Unions may only appear as the last member of a struct.
+ *  - Individual union members may grow in the future. The space between the
+ *end of a union member and the end of its containing union is considered
+ *"implicit padding" and must be zeroed. This is always checked.
+ *
+ * In addition to the IOCTL argument structs, the PowerVR UAPI makes use of
+ * DEV_QUERY argument structs. These are used to fetch information about the
+ * device and runtime. These structs are subject to the same rules set out
+ * above.
+ */
+
+/**
+ * struct drm_pvr_obj_array - Container used to pass arrays of objects
+ *
+ * It is not unusual to have to extend objects to pass new parameters, and the 
DRM
+ * ioctl infrastructure is supporting that by padding ioctl arguments with 
zeros
+ * when the data passed by userspace is smaller than the struct defined in the
+ * drm_ioctl_desc, thus keeping things backward compatible. This type is just
+ * applying the same concepts to indirect objects passed through arrays 
referenced
+ * from the main ioctl arguments structure: the stride basically defines the 
size
+ * of the object passed by userspace, which allows the kernel driver to pad 
with
+ * zeros when it's smaller than the size of the object it expects.
+ *
+ * Use ``DRM_PVR_OBJ_ARRAY()`` to fill object array fields, unless you
+ * have a very good reason not to.
+ */
+struct drm_pvr_obj_array {
+   /** @stride: Stride of object struct. Used for versioning. */
+   __u32 stride;
+
+   /** @count: Number of objects in the array. */
+   __u32 count;
+
+   /** @array: User pointer to an array of objects. */
+   __u64 array;
+};
+
+/**
+ * DRM_PVR_OBJ_ARRAY() - Helper macro for filling  drm_pvr_obj_array.
+ * @cnt: Number of elements pointed to py @ptr.
+ * @ptr: Pointer to start of a C array.
+ *
+ * Return: Literal of type  drm_pvr_obj_array.
+ */
+#define DRM_PVR_OBJ_ARRAY(cnt, ptr) \
+   { .stride = sizeof((ptr)[0]), .count = (cnt), .array = 
(__u64)(uintptr_t)(ptr) }
+
+/**
+ * DOC: PowerVR IOCTL interface
+ */
+
+/**
+ * PVR_IOCTL() - Build a PowerVR IOCTL number
+ * @_ioctl: An incrementing id for this IOCTL. Added to %DRM_COMMAND_BASE.
+ * @_mode: Must be one of %DRM_IOR, %DRM_IOW or %DRM_IOWR.
+ * @_data: The type of the args struct passed by this IOCTL.
+ *
+ * The struct referred to by @_data must have a ``drm_pvr_ioctl_`` prefix and 
an
+ * ``_args suffix``. They are therefore omitted from @_data.
+ *
+ * This should only be used to build the constants described below; it should
+ * never be used to call an IOCTL directly.
+ *
+ * Return: An IOCTL number to be passed to ioctl() from userspace.
+ */
+#define PVR_IOCTL(_ioctl, _mode, _data) \
+   _mode(DRM_COMMAND_BASE + (_ioctl), struct drm_pvr_ioctl_##_data##_args)
+
+#define DRM_IOCTL_PVR_DEV_QUERY PVR_IOCTL(0x00, DRM_IOWR, dev_query)
+#define DRM_IOCTL_PVR_CREATE_BO PVR_IOCTL(0x01, DRM_IOWR, create_bo)
+#define DRM_IOCTL_PVR_GET_BO_MMAP_OFFSET PVR_IOCTL(0x02, DRM_IOWR, 
get_bo_mmap_offset)
+#define DRM_IOCTL_PVR_CREATE_VM_CONTEXT PVR_IOCTL(0x03, DRM_IOWR, 
create_vm_context)
+#define DRM_IOCTL_PVR_DESTROY_VM_CONTEXT PVR_IOCTL(0x04, DRM_IOW, 
destroy_vm_context)
+#define DRM_IOCTL_PVR_VM_MAP PVR_IOCTL(0x05, DRM_IOW, vm_map)
+#define