Re: [Nouveau] [PATCH drm-misc-next 2/2] drm/nouveau: enable dynamic job-flow control

2023-11-21 Thread Dave Airlie
On Tue, 14 Nov 2023 at 10:27, Danilo Krummrich  wrote:
>
> Make use of the scheduler's credit limit and scheduler job's credit
> count to account for the actual size of a job, such that we fill up the
> ring efficiently.

For the two:

Reviewed-by: Dave Airlie 

>
> Signed-off-by: Danilo Krummrich 
> ---
>  drivers/gpu/drm/nouveau/nouveau_abi16.c | 3 ++-
>  drivers/gpu/drm/nouveau/nouveau_drm.c   | 2 +-
>  drivers/gpu/drm/nouveau/nouveau_exec.c  | 4 +++-
>  drivers/gpu/drm/nouveau/nouveau_sched.c | 9 -
>  drivers/gpu/drm/nouveau/nouveau_sched.h | 3 ++-
>  drivers/gpu/drm/nouveau/nouveau_uvmm.c  | 4 +++-
>  6 files changed, 15 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c 
> b/drivers/gpu/drm/nouveau/nouveau_abi16.c
> index f8e59cfb1d34..207945700c94 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
> @@ -316,7 +316,8 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS)
> if (ret)
> goto done;
>
> -   ret = nouveau_sched_init(>sched, drm, drm->sched_wq);
> +   ret = nouveau_sched_init(>sched, drm, drm->sched_wq,
> +chan->chan->dma.ib_max);
> if (ret)
> goto done;
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
> b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 7e5f19153829..6f6c31a9937b 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -320,7 +320,7 @@ nouveau_cli_init(struct nouveau_drm *drm, const char 
> *sname,
>  * locks which indirectly or directly are held for allocations
>  * elsewhere.
>  */
> -   ret = nouveau_sched_init(>sched, drm, NULL);
> +   ret = nouveau_sched_init(>sched, drm, NULL, 1);
> if (ret)
> goto done;
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.c 
> b/drivers/gpu/drm/nouveau/nouveau_exec.c
> index 388831c53551..63c344f4f78e 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_exec.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_exec.c
> @@ -258,10 +258,12 @@ nouveau_exec_job_init(struct nouveau_exec_job **pjob,
> }
> }
>
> +   args.file_priv = __args->file_priv;
> job->chan = __args->chan;
>
> args.sched = __args->sched;
> -   args.file_priv = __args->file_priv;
> +   /* Plus one to account for the HW fence. */
> +   args.credits = job->push.count + 1;
>
> args.in_sync.count = __args->in_sync.count;
> args.in_sync.s = __args->in_sync.s;
> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c 
> b/drivers/gpu/drm/nouveau/nouveau_sched.c
> index faabd662b165..6406d6659361 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> @@ -12,7 +12,6 @@
>  #include "nouveau_abi16.h"
>  #include "nouveau_sched.h"
>
> -#define NOUVEAU_SCHED_HW_SUBMISSIONS   1
>  #define NOUVEAU_SCHED_JOB_TIMEOUT_MS   1
>
>  /* Starts at 0, since the DRM scheduler interprets those parameters as 
> (initial)
> @@ -85,10 +84,10 @@ nouveau_job_init(struct nouveau_job *job,
> ret = -ENOMEM;
> goto err_free_objs;
> }
> -
> }
>
> -   ret = drm_sched_job_init(>base, >entity, 1, NULL);
> +   ret = drm_sched_job_init(>base, >entity,
> +args->credits, NULL);
> if (ret)
> goto err_free_chains;
>
> @@ -396,7 +395,7 @@ static const struct drm_sched_backend_ops 
> nouveau_sched_ops = {
>
>  int
>  nouveau_sched_init(struct nouveau_sched *sched, struct nouveau_drm *drm,
> -  struct workqueue_struct *wq)
> +  struct workqueue_struct *wq, u32 credit_limit)
>  {
> struct drm_gpu_scheduler *drm_sched = >base;
> struct drm_sched_entity *entity = >entity;
> @@ -414,7 +413,7 @@ nouveau_sched_init(struct nouveau_sched *sched, struct 
> nouveau_drm *drm,
>
> ret = drm_sched_init(drm_sched, _sched_ops, wq,
>  NOUVEAU_SCHED_PRIORITY_COUNT,
> -NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
> +credit_limit, 0, job_hang_limit,
>  NULL, NULL, "nouveau_sched", drm->dev->dev);
> if (ret)
> goto fail_wq;
> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.h 
> b/drivers/gpu/drm/nouveau/nouveau_sched.h
> index 026f33d9b70c..7ba8ffec135a 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_sched.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.h
> @@ -27,6 +27,7 @@ enum nouveau_job_state {
>  struct nouveau_job_args {
> struct drm_file *file_priv;
> struct nouveau_sched *sched;
> +   u32 credits;
>
> enum dma_resv_usage resv_usage;
> bool sync;
> @@ -112,7 +113,7 @@ struct nouveau_sched {
>  };
>
>  int nouveau_sched_init(struct nouveau_sched *sched, 

Re: [Nouveau] [PATCH drm-misc-next] drm/nouveau: use GPUVM common infrastructure

2023-11-21 Thread Dave Airlie
On Tue, 14 Nov 2023 at 08:12, Danilo Krummrich  wrote:
>
> GPUVM provides common infrastructure to track external and evicted GEM
> objects as well as locking and validation helpers.
>
> Especially external and evicted object tracking is a huge improvement
> compared to the current brute force approach of iterating all mappings
> in order to lock and validate the GPUVM's GEM objects. Hence, make us of
> it.
>
> Signed-off-by: Danilo Krummrich 

Reviewed-by: Dave Airlie 

> ---
> Originally, this patch was part of [1]. However, while applying the series,
> I noticed that this patch needs another iteration, hence I held it back to
> rework it and send it separately.
>
> Changes since detached from [1]:
> 
> - Don't use drm_gpuvm_exec_lock() since it would, unnecessarily, lock all the
>   backing buffer's dma-resv locks. Instead, lock only the GEMs affected by the
>   requested mapping operations, directly or indirectly through map, remap or
>   unmap.
> - Validate backing buffers in drm_exec loop directly.
>
> [1] 
> https://lore.kernel.org/dri-devel/20231108001259.15123-1-d...@redhat.com/T/#u
> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c|   4 +-
>  drivers/gpu/drm/nouveau/nouveau_exec.c  |  57 +++---
>  drivers/gpu/drm/nouveau/nouveau_exec.h  |   4 -
>  drivers/gpu/drm/nouveau/nouveau_sched.c |   9 +-
>  drivers/gpu/drm/nouveau/nouveau_sched.h |   7 +-
>  drivers/gpu/drm/nouveau/nouveau_uvmm.c  | 134 +---
>  6 files changed, 100 insertions(+), 115 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
> b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 7afad86da64b..b7dda486a7ea 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -1061,17 +1061,18 @@ nouveau_bo_move(struct ttm_buffer_object *bo, bool 
> evict,
>  {
> struct nouveau_drm *drm = nouveau_bdev(bo->bdev);
> struct nouveau_bo *nvbo = nouveau_bo(bo);
> +   struct drm_gem_object *obj = >base;
> struct ttm_resource *old_reg = bo->resource;
> struct nouveau_drm_tile *new_tile = NULL;
> int ret = 0;
>
> -
> if (new_reg->mem_type == TTM_PL_TT) {
> ret = nouveau_ttm_tt_bind(bo->bdev, bo->ttm, new_reg);
> if (ret)
> return ret;
> }
>
> +   drm_gpuvm_bo_gem_evict(obj, evict);
> nouveau_bo_move_ntfy(bo, new_reg);
> ret = ttm_bo_wait_ctx(bo, ctx);
> if (ret)
> @@ -1136,6 +1137,7 @@ nouveau_bo_move(struct ttm_buffer_object *bo, bool 
> evict,
>  out_ntfy:
> if (ret) {
> nouveau_bo_move_ntfy(bo, bo->resource);
> +   drm_gpuvm_bo_gem_evict(obj, !evict);
> }
> return ret;
>  }
> diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.c 
> b/drivers/gpu/drm/nouveau/nouveau_exec.c
> index bf6c12f4342a..9d9835fb5970 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_exec.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_exec.c
> @@ -1,7 +1,5 @@
>  // SPDX-License-Identifier: MIT
>
> -#include 
> -
>  #include "nouveau_drv.h"
>  #include "nouveau_gem.h"
>  #include "nouveau_mem.h"
> @@ -86,14 +84,12 @@
>   */
>
>  static int
> -nouveau_exec_job_submit(struct nouveau_job *job)
> +nouveau_exec_job_submit(struct nouveau_job *job,
> +   struct drm_gpuvm_exec *vme)
>  {
> struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
> struct nouveau_cli *cli = job->cli;
> struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(cli);
> -   struct drm_exec *exec = >exec;
> -   struct drm_gem_object *obj;
> -   unsigned long index;
> int ret;
>
> /* Create a new fence, but do not emit yet. */
> @@ -102,52 +98,29 @@ nouveau_exec_job_submit(struct nouveau_job *job)
> return ret;
>
> nouveau_uvmm_lock(uvmm);
> -   drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
> -   DRM_EXEC_IGNORE_DUPLICATES);
> -   drm_exec_until_all_locked(exec) {
> -   struct drm_gpuva *va;
> -
> -   drm_gpuvm_for_each_va(va, >base) {
> -   if (unlikely(va == >base.kernel_alloc_node))
> -   continue;
> -
> -   ret = drm_exec_prepare_obj(exec, va->gem.obj, 1);
> -   drm_exec_retry_on_contention(exec);
> -   if (ret)
> -   goto err_uvmm_unlock;
> -   }
> +   ret = drm_gpuvm_exec_lock(vme);
> +   if (ret) {
> +   nouveau_uvmm_unlock(uvmm);
> +   return ret;
> }
> nouveau_uvmm_unlock(uvmm);
>
> -   drm_exec_for_each_locked_object(exec, index, obj) {
> -   struct nouveau_bo *nvbo = nouveau_gem_object(obj);
> -
> -   ret = nouveau_bo_validate(nvbo, true, false);
> -   if (ret)
> -   goto err_exec_fini;
> +   ret = 

Re: [Nouveau] [PATCH] nouveau/gsp: document some aspects of GSP-RM

2023-11-21 Thread David Airlie
On Wed, Nov 22, 2023 at 9:53 AM Timur Tabi  wrote:
>
> Document a few aspects of communication with GSP-RM.  These comments
> are derived from notes made during early development of GSP-RM
> support in Nouveau, but were not included in the initial patch set.
>
> Signed-off-by: Timur Tabi 
> ---
>  .../common/shared/msgq/inc/msgq/msgq_priv.h   | 79 +++--
>  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c| 86 ++-
>  2 files changed, 154 insertions(+), 11 deletions(-)
>
> diff --git 
> a/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/shared/msgq/inc/msgq/msgq_priv.h
>  
> b/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/shared/msgq/inc/msgq/msgq_priv.h
> index 5a2f273d95c8..1e94bf087b23 100644
> --- 
> a/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/shared/msgq/inc/msgq/msgq_priv.h
> +++ 
> b/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/shared/msgq/inc/msgq/msgq_priv.h
> @@ -26,21 +26,82 @@
>   * DEALINGS IN THE SOFTWARE.
>   */
>
> +#define GSP_MESSAGE_COMMAND_QUEUE_SIZE 0x4
> +#define GSP_MESSAGE_STATUS_QUEUE_SIZE  0x4
> +
> +/**
> + * msgqTxHeader -- TX queue data structure
> + * @version: the version of this structure, must be 0
> + * @size: the size of the entire queue, including this header
> + * @msgSize: the padded size of queue element, must be power-of-2, 16 is
> + * minimum

I don't think this is true anymore, I think 4k is the minimum size
since one of the 535 series.


> + * @msgCount: the number of elements in this queue
> + * @writePtr: head index of this queue
> + * @flags: 1 = swap the RX pointers
> + * @rxHdrOff: offset of readPtr in this structure
> + * @entryOff: offset of beginning of queue (msgqRxHeader), relative to
> + *  beginning of this structure
> + *
> + * The command queue is a queue of RPCs that are sent from the driver to the
> + * GSP.  The status queue is a queue of messages/responses from the GSP to 
> the
> + * driver.  Although the driver allocates memory for both queues (inside the
> + * same memory block), the command queue is owned by the driver and the 
> status
> + * queue is owned by the GSP.  In addition, the two headers must not share 
> the
> + * same 4K page.
> + *
> + * Unfortunately, depsite the fact that the queue size is a field in this

^ typo

> + * structure, the GSP has a hard-coded expectation of the sizes.  So the
> + * command queue size must be GSP_MESSAGE_COMMAND_QUEUE_SIZE and the status
> + * queue size must be GSP_MESSAGE_STATUS_QUEUE_SIZE.
> + *
> + * Each queue is prefixed with this data structure.  The idea is that a queue
> + * and its header are written to only by their owner.  That is, only the
> + * driver writes to the command queue and command queue header, and only the
> + * GSP writes to the status (receive) queue and its header.
> + *
> + * This is enforced by the concept of "swapping" the RX pointers.  This is
> + * why the 'flags' field must be set to 1.  'rxHdrOff' is how the GSP knows
> + * where the where the tail pointer of its status queue.
> + *
> + * When the driver writes a new RPC to the command queue, it updates 
> writePtr.
> + * When it reads a new message from the status queue, it updates readPtr.  In
> + * this way, the GSP knows when a new command is in the queue (it polls
> + * writePtr) and it knows how much free space is in the status queue (it
> + * checks readPtr).  The driver never cares about how much free space is in
> + * the status queue.
> + *
> + * As usual, producers write to the head pointer, and consumers read from the
> + * tail pointer.  When head == tail, the queue is empty.
> + *
> + * So to summarize:
> + * command.writePtr = head of command queue
> + * command.readPtr = tail of status queue
> + * status.writePtr = head of status queue
> + * status.readPtr = tail of command queue
> + */
>  typedef struct
>  {
> -NvU32 version;   // queue version
> -NvU32 size;  // bytes, page aligned
> -NvU32 msgSize;   // entry size, bytes, must be power-of-2, 16 is minimum
> -NvU32 msgCount;  // number of entries in queue
> -NvU32 writePtr;  // message id of next slot
> -NvU32 flags; // if set it means "i want to swap RX"
> -NvU32 rxHdrOff;  // Offset of msgqRxHeader from start of backing store.
> -NvU32 entryOff;  // Offset of entries from start of backing store.
> +   NvU32 version;
> +   NvU32 size;
> +   NvU32 msgSize;
> +   NvU32 msgCount;
> +   NvU32 writePtr;
> +   NvU32 flags;
> +   NvU32 rxHdrOff;
> +   NvU32 entryOff;
>  } msgqTxHeader;
>
> +/**
> + * msgqRxHeader - RX queue data structure
> + * @readPtr: tail index of the other queue
> + *
> + * Although this is a separate struct, it could easily be merged into
> + * msgqTxHeader.  msgqTxHeader.rxHdrOff is simply the offset of readPtr
> + * from the beginning of msgqTxHeader.
> + */
>  typedef struct
>  {
> -NvU32 readPtr; // message id of last message read
> +   NvU32 readPtr;
>  } 

[Nouveau] Fwd: Kernel 6.6.1 hangs on "loading initial ramdisk"

2023-11-21 Thread Bagas Sanjaya
Hi,

I notice a regression report on Bugzilla [1]. Quoting from it:

> After upgrading from 6.5.9 to 6.6.1 on my Dell Latitude E6420 (Intel 
> i5-2520M) with EndeavourOS, the boot process would hang at "loading initial 
> ramdisk". The issue is present on the 6.6.1 release of both Linux and 
> Linux-zen, but not the 6.5.9 release, which makes me think this is somehow 
> upstream in the kernel, rather than to do with packaging. My current 
> workaround is using the Linux LTS kernel.
> 
> I have been unable to consistently reproduce this bug. Between 50 and 30 
> percent of the time, the "loading initial ramdisk" will display, the disk 
> activity indicator will turn off briefly and then resume blinking, and then 
> the kernel boots as expected. The other 50 to 70 percent of the time, the 
> boot stops at "loading initial ramdisk" and the disk activity indicator turns 
> off, and does not resume blinking. The disk activity light is constantly 
> flashing during normal system operation, so I know it's not secretly booting 
> but not updating the display. I haven't been able to replicate this issue in 
> QEMU. I have seen similar bugs that have been solved by disabling IOMMU, but 
> this has not had any effect. Neither has disabling graphics drivers and 
> modesetting. I have been able to reproduce it while using Nouveau, so I don't 
> believe it has to do with Nvidia's proprietary drivers.
> 
> Examining dmesg and journalctl, there doesn't appear to be ANY logs from the 
> failed boots. I don't believe the kernel even is started on these failed 
> boots. Enabling GRUB debug messages 
> (linux,loader,init,fs,device,disk,partition) shows that the hang occurs after 
> GRUB attempts to start the loaded image- it's able to load the image into 
> memory, but the boot stalls after "Starting image" with a hex address 
> (presumably the start addr of the kernel).  
> 
> I've been trying to compile the kernel myself to see if I can solve the 
> issue, or at least aid in reproduceability, but this is not easy or fast to 
> do on a 2012 i5 processor. I'll update if I can successfully recompile the 
> kernel and if it yields any information.  
> 
> Please let me know if I should provide any additional information. This is my 
> first time filing a bug here.

See Bugzilla for the full thread and attached grub output.

Anyway, I'm adding this regression to regzbot:

#regzbot introduced: v6.5..v6.6 
https://bugzilla.kernel.org/show_bug.cgi?id=218173
#regzbot title: initramfs loading hang on nouveau system (Dell Latitude E6420)

Thanks.

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218173

-- 
An old man doll... just what I always wanted! - Clara


[Nouveau] [PATCH] nouveau/gsp: document some aspects of GSP-RM

2023-11-21 Thread Timur Tabi
Document a few aspects of communication with GSP-RM.  These comments
are derived from notes made during early development of GSP-RM
support in Nouveau, but were not included in the initial patch set.

Signed-off-by: Timur Tabi 
---
 .../common/shared/msgq/inc/msgq/msgq_priv.h   | 79 +++--
 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c| 86 ++-
 2 files changed, 154 insertions(+), 11 deletions(-)

diff --git 
a/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/shared/msgq/inc/msgq/msgq_priv.h
 
b/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/shared/msgq/inc/msgq/msgq_priv.h
index 5a2f273d95c8..1e94bf087b23 100644
--- 
a/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/shared/msgq/inc/msgq/msgq_priv.h
+++ 
b/drivers/gpu/drm/nouveau/include/nvrm/535.113.01/common/shared/msgq/inc/msgq/msgq_priv.h
@@ -26,21 +26,82 @@
  * DEALINGS IN THE SOFTWARE.
  */
 
+#define GSP_MESSAGE_COMMAND_QUEUE_SIZE 0x4
+#define GSP_MESSAGE_STATUS_QUEUE_SIZE  0x4
+
+/**
+ * msgqTxHeader -- TX queue data structure
+ * @version: the version of this structure, must be 0
+ * @size: the size of the entire queue, including this header
+ * @msgSize: the padded size of queue element, must be power-of-2, 16 is
+ * minimum
+ * @msgCount: the number of elements in this queue
+ * @writePtr: head index of this queue
+ * @flags: 1 = swap the RX pointers
+ * @rxHdrOff: offset of readPtr in this structure
+ * @entryOff: offset of beginning of queue (msgqRxHeader), relative to
+ *  beginning of this structure
+ *
+ * The command queue is a queue of RPCs that are sent from the driver to the
+ * GSP.  The status queue is a queue of messages/responses from the GSP to the
+ * driver.  Although the driver allocates memory for both queues (inside the
+ * same memory block), the command queue is owned by the driver and the status
+ * queue is owned by the GSP.  In addition, the two headers must not share the
+ * same 4K page.
+ *
+ * Unfortunately, depsite the fact that the queue size is a field in this
+ * structure, the GSP has a hard-coded expectation of the sizes.  So the
+ * command queue size must be GSP_MESSAGE_COMMAND_QUEUE_SIZE and the status
+ * queue size must be GSP_MESSAGE_STATUS_QUEUE_SIZE.
+ *
+ * Each queue is prefixed with this data structure.  The idea is that a queue
+ * and its header are written to only by their owner.  That is, only the
+ * driver writes to the command queue and command queue header, and only the
+ * GSP writes to the status (receive) queue and its header.
+ *
+ * This is enforced by the concept of "swapping" the RX pointers.  This is
+ * why the 'flags' field must be set to 1.  'rxHdrOff' is how the GSP knows
+ * where the where the tail pointer of its status queue.
+ *
+ * When the driver writes a new RPC to the command queue, it updates writePtr.
+ * When it reads a new message from the status queue, it updates readPtr.  In
+ * this way, the GSP knows when a new command is in the queue (it polls
+ * writePtr) and it knows how much free space is in the status queue (it
+ * checks readPtr).  The driver never cares about how much free space is in
+ * the status queue.
+ *
+ * As usual, producers write to the head pointer, and consumers read from the
+ * tail pointer.  When head == tail, the queue is empty.
+ *
+ * So to summarize:
+ * command.writePtr = head of command queue
+ * command.readPtr = tail of status queue
+ * status.writePtr = head of status queue
+ * status.readPtr = tail of command queue
+ */
 typedef struct
 {
-NvU32 version;   // queue version
-NvU32 size;  // bytes, page aligned
-NvU32 msgSize;   // entry size, bytes, must be power-of-2, 16 is minimum
-NvU32 msgCount;  // number of entries in queue
-NvU32 writePtr;  // message id of next slot
-NvU32 flags; // if set it means "i want to swap RX"
-NvU32 rxHdrOff;  // Offset of msgqRxHeader from start of backing store.
-NvU32 entryOff;  // Offset of entries from start of backing store.
+   NvU32 version;
+   NvU32 size;
+   NvU32 msgSize;
+   NvU32 msgCount;
+   NvU32 writePtr;
+   NvU32 flags;
+   NvU32 rxHdrOff;
+   NvU32 entryOff;
 } msgqTxHeader;
 
+/**
+ * msgqRxHeader - RX queue data structure
+ * @readPtr: tail index of the other queue
+ *
+ * Although this is a separate struct, it could easily be merged into
+ * msgqTxHeader.  msgqTxHeader.rxHdrOff is simply the offset of readPtr
+ * from the beginning of msgqTxHeader.
+ */
 typedef struct
 {
-NvU32 readPtr; // message id of last message read
+   NvU32 readPtr;
 } msgqRxHeader;
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index dc44f5c7833f..265c0a413ea8 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1379,6 +1379,13 @@ r535_gsp_msg_post_event(void *priv, u32 fn, void *repv, 
u32 repc)
return 0;
 }
 

Re: [Nouveau] [PATCH] nouveau/gsp: fix getting max channel id for GSP

2023-11-21 Thread Dave Airlie
Self NAK, this isn't sufficient to fix events.

On Mon, 20 Nov 2023 at 12:07, Dave Airlie  wrote:
>
> From: Dave Airlie 
>
> The fence code uses the total number of channel ids to allocate a
> bunch of memory for fencing. This is probably not the best way to
> do this, but it's hard to fix right now.
>
> The GSP code realises it can fit 8 channels into a USERD
> page, so it claims it can support 256 channels max, but it then
> allocates channel ids sparsely (0, 8, 16 etc).
>
> This just exposes the multiplier to userspace so the fence code
> gets things right, however I think this might all need more thought.
>
> Link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/274
> Signed-off-by: Dave Airlie 
> ---
>  drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.c | 7 ++-
>  drivers/gpu/drm/nouveau/nvkm/engine/fifo/priv.h | 2 ++
>  drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c | 7 +++
>  3 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.c 
> b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.c
> index 22443fe4a39f..8e36cdd0e5fb 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.c
> @@ -178,7 +178,12 @@ nvkm_fifo_info(struct nvkm_engine *engine, u64 mthd, u64 
> *data)
> return ret;
>
> switch (mthd) {
> -   case NV_DEVICE_HOST_CHANNELS: *data = fifo->chid ? fifo->chid->nr : 
> 0; return 0;
> +   case NV_DEVICE_HOST_CHANNELS:
> +   if (fifo->func->chid_total)
> +   *data = fifo->func->chid_total(fifo);
> +   else
> +   *data = fifo->chid ? fifo->chid->nr : 0;
> +   return 0;
> case NV_DEVICE_HOST_RUNLISTS:
> *data = 0;
> nvkm_runl_foreach(runl, fifo)
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/priv.h 
> b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/priv.h
> index a0f3277605a5..c21e982b03a5 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/priv.h
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/priv.h
> @@ -17,6 +17,8 @@ struct nvkm_fifo_func {
>
> int (*chid_nr)(struct nvkm_fifo *);
> int (*chid_ctor)(struct nvkm_fifo *, int nr);
> +
> +   int (*chid_total)(struct nvkm_fifo *);
> int (*runq_nr)(struct nvkm_fifo *);
> int (*runl_ctor)(struct nvkm_fifo *);
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c 
> b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c
> index b374d72fd1c1..1e9c0b113cb5 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c
> @@ -641,6 +641,12 @@ r535_fifo_dtor(struct nvkm_fifo *fifo)
> kfree(fifo->func);
>  }
>
> +static int
> +r535_fifo_chid_total(struct nvkm_fifo *fifo)
> +{
> +   return fifo->chid->nr * CHID_PER_USERD;
> +}
> +
>  int
>  r535_fifo_new(const struct nvkm_fifo_func *hw, struct nvkm_device *device,
>   enum nvkm_subdev_type type, int inst, struct nvkm_fifo **pfifo)
> @@ -652,6 +658,7 @@ r535_fifo_new(const struct nvkm_fifo_func *hw, struct 
> nvkm_device *device,
>
> rm->dtor = r535_fifo_dtor;
> rm->runl_ctor = r535_fifo_runl_ctor;
> +   rm->chid_total = r535_fifo_chid_total;
> rm->runl = _runl;
> rm->cgrp = hw->cgrp;
> rm->cgrp.func = _cgrp;
> --
> 2.42.0
>


Re: [Nouveau] [REGRESSION]: nouveau: Asynchronous wait on fence

2023-11-21 Thread Owen T. Heisler

On 11/21/23 09:16, Linux regression tracking (Thorsten Leemhuis) wrote:

On 15.11.23 07:19, Owen T. Heisler wrote:

On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:

On 28.10.23 04:46, Owen T. Heisler wrote:

#regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
#regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180

## Problem

1. Connect external display to DVI port on dock and run X with both
     displays in use.
2. Wait hours or days.
3. Suddenly the secondary Nvidia-connected display turns off and X stops
     responding to keyboard/mouse input. In *some* cases it is
possible to
     switch to a virtual TTY with Ctrl+Alt+Fn and log in there.



Here is a decoded kernel log from an
untainted kernel:

https://gitlab.freedesktop.org/drm/nouveau/uploads/c120faf09da46f9c74006df9f1d14442/async-wait-on-fence-180.log



Maybe one of the nouveau developer can take a quick look at
d386a4b54607cf and suggest a simple way to revert it in latest mainline.
Maybe just removing the main chunk of code that is added is all that it
takes.


I was able to resolve the revert conflict; it was indeed trivial though 
I did not realize it initially. I am currently testing v6.6 with the 
culprit commit reverted. I need to test for at least a full week (ending 
11-23) before I can assume it fixes the problem.


After that I can try the latest v6.7-rc as you suggested.

I have updated the bug description at
.

Thanks again,
Owen

--
Owen T. Heisler



[Nouveau] [PATCH] nouveau/gsp: allocate enough space for all channel ids.

2023-11-21 Thread Dave Airlie
From: Dave Airlie 

This probably isn't the ideal fix, but we ended up using chids
sparsely, and lots of things rely on indexing into the full range,
so just allocate the full range up front.

The GSP code fixes 8 channels into a userd page, but we end up using
a single userd page per channel so end up sparsely using the range.

Fixes a few crashes seen with multiple channels.

Link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/277
Signed-off-by: Dave Airlie 
---
 drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c
index 3adbb05ff587..d088e636edc3 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/r535.c
@@ -539,7 +539,7 @@ r535_fifo_runl_ctor(struct nvkm_fifo *fifo)
struct nvkm_runl *runl;
struct nvkm_engn *engn;
u32 cgids = 2048;
-   u32 chids = 2048 / CHID_PER_USERD;
+   u32 chids = 2048;
int ret;
NV2080_CTRL_FIFO_GET_DEVICE_INFO_TABLE_PARAMS *ctrl;
 
-- 
2.42.0



Re: [Nouveau] [REGRESSION]: nouveau: Asynchronous wait on fence

2023-11-21 Thread Linux regression tracking (Thorsten Leemhuis)
On 15.11.23 07:19, Owen T. Heisler wrote:
> On 10/31/23 04:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 28.10.23 04:46, Owen T. Heisler wrote:
>>> #regzbot introduced: d386a4b54607cf6f76e23815c2c9a3abc1d66882
>>> #regzbot link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/180
>>>
>>> ## Problem
>>>
>>> 1. Connect external display to DVI port on dock and run X with both
>>>     displays in use.
>>> 2. Wait hours or days.
>>> 3. Suddenly the secondary Nvidia-connected display turns off and X stops
>>>     responding to keyboard/mouse input. In *some* cases it is
>>> possible to
>>>     switch to a virtual TTY with Ctrl+Alt+Fn and log in there.
> 
>> You thus might want to check if the problem occurs with 6.6 -- and
>> ideally also check if reverting the culprit there fixes things for you.
> 
> The problem also occurs with v6.6.

You meanwhile might want to give 6.7-rc as well on the off chance that
it improves things, even if that is unlikely.

> Here is a decoded kernel log from an
> untainted kernel:
> 
> https://gitlab.freedesktop.org/drm/nouveau/uploads/c120faf09da46f9c74006df9f1d14442/async-wait-on-fence-180.log
> 
> The culprit commit does not revert cleanly on v6.6. I have not yet
> attempted to resolve the conflicts.
> 
> I have also updated the bug description at
> .

Maybe one of the nouveau developer can take a quick look at
d386a4b54607cf and suggest a simple way to revert it in latest mainline.
Maybe just removing the main chunk of code that is added is all that it
takes.

Ciao, Thorsten


Re: [Nouveau] [PATCH V2] drm/modes: Fix divide error in drm_mode_debug_printmodeline

2023-11-21 Thread Jani Nikula
On Mon, 20 Nov 2023, Ville Syrjälä  wrote:
> On Mon, Nov 20, 2023 at 10:41:18PM +0800, Edward Adam Davis wrote:
>> [Syz Log]
>> divide error:  [#1] PREEMPT SMP KASAN
>> CPU: 0 PID: 5068 Comm: syz-executor357 Not tainted 
>> 6.6.0-syzkaller-16039-gac347a0655db #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
>> Google 10/09/2023
>> RIP: 0010:drm_mode_vrefresh drivers/gpu/drm/drm_modes.c:1303 [inline]
>> RIP: 0010:drm_mode_debug_printmodeline+0x118/0x4e0 
>> drivers/gpu/drm/drm_modes.c:60
>> Code: 00 41 0f b7 07 66 83 f8 02 b9 01 00 00 00 0f 43 c8 0f b7 c1 0f af e8 
>> 44 89 f0 48 69 c8 e8 03 00 00 89 e8 d1 e8 48 01 c8 31 d2 <48> f7 f5 49 89 c6 
>> eb 0c e8 fb 07 66 fc eb 05 e8 f4 07 66 fc 48 89
>> RSP: 0018:c9000391f8d0 EFLAGS: 00010246
>> RAX: 0001f400 RBX: 888025045000 RCX: 0001f400
>> RDX:  RSI: 8000 RDI: 888025045018
>> RBP:  R08: 8528b9af R09: 
>> R10: c9000391f8a0 R11: f52000723f17 R12: 0080
>> R13: dc00 R14: 0080 R15: 888025045016
>> FS:  56932380() GS:8880b980() knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2: 005fdeb8 CR3: 7fcff000 CR4: 003506f0
>> DR0:  DR1:  DR2: 
>> DR3:  DR6: fffe0ff0 DR7: 0400
>> Call Trace:
>>  
>>  drm_mode_setcrtc+0x83b/0x1880 drivers/gpu/drm/drm_crtc.c:794
>>  drm_ioctl_kernel+0x362/0x500 drivers/gpu/drm/drm_ioctl.c:792
>>  drm_ioctl+0x636/0xb00 drivers/gpu/drm/drm_ioctl.c:895
>>  vfs_ioctl fs/ioctl.c:51 [inline]
>>  __do_sys_ioctl fs/ioctl.c:871 [inline]
>>  __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:857
>>  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
>>  do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
>>  entry_SYSCALL_64_after_hwframe+0x63/0x6b
>> 
>> [Analysis]
>> When calculating den in drm_mode_vrefresh(), if the vscan value is too 
>> large, 
>> there is a probability of unsigned integer overflow.
>> 
>> [Fix]
>> Before multiplying by vscan, first check if their product will overflow. 
>> If overflow occurs, return 0 and exit the subsequent process.
>> 
>> Reported-and-tested-by: syzbot+2e93e6fb36e6fdc56...@syzkaller.appspotmail.com
>> Fixes: ea40d7857d52 ("drm/vkms: fbdev emulation support")
>> Signed-off-by: Edward Adam Davis 
>> ---
>>  drivers/gpu/drm/drm_modes.c | 7 +--
>>  1 file changed, 5 insertions(+), 2 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/drm_modes.c b/drivers/gpu/drm/drm_modes.c
>> index ac9a406250c5..60739d861da2 100644
>> --- a/drivers/gpu/drm/drm_modes.c
>> +++ b/drivers/gpu/drm/drm_modes.c
>> @@ -36,6 +36,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include 
>>  #include 
>> @@ -1297,8 +1298,10 @@ int drm_mode_vrefresh(const struct drm_display_mode 
>> *mode)
>>  num *= 2;
>>  if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
>>  den *= 2;
>> -if (mode->vscan > 1)
>> -den *= mode->vscan;
>> +if (mode->vscan > 1) {
>> +if (unlikely(check_mul_overflow(den, mode->vscan, )))
>> +return 0;
>> +}
>
> I can't see any driver that actually supports vscan>1. Only
> nouveau has some code for it, but doesn't look like it does
> anything sensible. All other drivers for sure should be
> rejecting vscan>1 outright. Which driver is this?
>
> Is there an actual usecase where nouveau needs this (and does
> it even work?) or could we just rip out the whole thing and
> reject vscan>1 globally?

I thought the whole thing seemed familiar [1].

BR,
Jani.



[1] https://lore.kernel.org/r/20230802174746.2256-1-astraj...@yahoo.com


>
>>  
>>  return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(num, 1000), den);
>>  }
>> -- 
>> 2.25.1

-- 
Jani Nikula, Intel


Re: [Nouveau] [PATCH v3 00/16] drm: Convert to platform remove callback returning void

2023-11-21 Thread Thomas Zimmermann

Hi

Am 20.11.23 um 13:05 schrieb Uwe Kleine-König:

[Dropped a few people from To that resulted in bounces before.]

On Thu, Nov 02, 2023 at 05:56:41PM +0100, Uwe Kleine-König wrote:

Hello,

this series converts all platform drivers below drivers/gpu/drm to use
.remove_new(). It starts with a fix for a problem that potentially might
crash the kernel that I stumbled over while implementing the conversion.

Some of the conversion patches following this fix were already send in
earlier series:


https://lore.kernel.org/dri-devel/20230801110239.831099-1-u.kleine-koe...@pengutronix.de

https://lore.kernel.org/dri-devel/20230318190804.234610-1-u.kleine-koe...@pengutronix.de

and three patches (bridge/tpd12s015, exynos + tilcdc) are new. Parts of
the above series were picked up, the patches resend here are not.


Apart from a Reviewed-by: by Toni Valkeinen for patch #16 and Inki Dae
who wrote to have taken patch #8 (but that didn't appear in neither next
nor drm-misc-next yet).

Also in v2 they didn't result in euphoric replies.

Can someone who cares about drm as a whole please care for this series
apply it?


Except for patches 8 and 16, I've pushed this patchset into drm-misc-next.

Best regards
Thomas



Best regards
Uwe
  


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)


OpenPGP_signature.asc
Description: OpenPGP digital signature