On 28-Jan-26 5:29 PM, Lancelot SIX wrote:


On 27/01/2026 05:44, Lazar, Lijo wrote:


On 24-Jan-26 2:21 AM, Alex Deucher wrote:
On Thu, Jan 22, 2026 at 5:52 AM Lijo Lazar <[email protected]> wrote:

Add cwsr parameters to userqueue ioctl. User should pass the GPU virtual
address for save/restore buffer, and size allocated. They are supported
only for user compute queues.

Signed-off-by: Lijo Lazar <[email protected]>
---
  drivers/gpu/drm/amd/amdgpu/mes_userqueue.c | 13 +++++++++----
  include/uapi/drm/amdgpu_drm.h              | 16 ++++++++++++++++
  2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c
index 7ad8297eb0d8..2765317f04df 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c
@@ -343,16 +343,21 @@ static int mes_userq_mqd_create(struct amdgpu_usermode_queue *queue,

                 if (amdgpu_cwsr_is_enabled(adev)) {
                         cwsr_params.ctx_save_area_address =
-                               userq_props->ctx_save_area_addr;
-                       cwsr_params.cwsr_sz = userq_props->ctx_save_area_size;
-                       cwsr_params.ctl_stack_sz = userq_props->ctl_stack_size;
-
+                               compute_mqd->ctx_save_area_va;
+                       cwsr_params.cwsr_sz = compute_mqd->ctx_save_area_size;
+                       cwsr_params.ctl_stack_sz = compute_mqd->ctl_stack_size;
                         r = amdgpu_userq_input_cwsr_params_validate(
                                 queue, &cwsr_params);
                         if (r) {
                                 kfree(compute_mqd);
                                 goto free_mqd;
                         }
+                       userq_props->ctx_save_area_addr =
+                               compute_mqd->ctx_save_area_va;
+                       userq_props->ctx_save_area_size =
+                               compute_mqd->ctx_save_area_size;
+                       userq_props->ctl_stack_size =
+                               compute_mqd->ctl_stack_size;
                 }

                 kfree(compute_mqd);
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index c178b8e0bd3f..b7a858365174 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -460,6 +460,22 @@ struct drm_amdgpu_userq_mqd_compute_gfx11 {
          * to get the size.
          */
         __u64   eop_va;
+       /**
+        * @ctx_save_area_va: Virtual address of the GPU memory for save/restore buffer.
+        * This must be from a separate GPU object, and use AMDGPU_INFO IOCTL
+        * to get the size. This includes control stack, wave context and debugger memory.
+        */
+       __u64 ctx_save_area_va;
+       /**
+        * @ctx_save_area_size:  Total size (in bytes) allocated for save/restore buffer.
+        * Use AMDGPU_INFO IOCTL to get the size.
+        */
+       __u32 ctx_save_area_size;
+       /**
+        * @ctl_stack_size: Size (in bytes) of control stack region in the save/restore buffer.
+        * Use AMDGPU_INFO IOCTL to get the size.
+        */
+       __u32 ctl_stack_size;

Does it matter where the ctl_stack is within the save area?


This is the legacy way. Probably, this can be avoided. Adding David and Lancelot.

Hi David/Lancelot,

Do you have the background of userspace passing back control stack size?

https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c#L260

Can driver assume that context save area takes care of everything and assume that user allotted as per the right control stack size?

Thanks,
Lijo

Hi,

As far as ROCr is concerned, the control stack is just an element that contributes to the size that needs to be allocated for the CWSR area. I do not expect ROCr needs to know anything about it if it can query the driver for the minimum size the CWSR allocation should be.

If userspace processes are interested in accessing the control stack (like the debugger for example), the way to access it and know its current size is by reading the CWSR area header maintained by the driver.  See "struct kfd_context_save_area_header", which contains the effective size (of valid data).  This struct is at the beginning of the cwsr area (ctx_save_area_va), and contains everything needed to effectively decode CWSR.

Does that answer your question?
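
As an illustration of the header-based approach described above, here is a minimal userspace sketch in C. The struct is a local mirror of "struct kfd_context_save_area_header" as it is believed to appear in include/uapi/linux/kfd_ioctl.h; the field layout is an assumption and should be checked against the actual kernel headers. The helper name cwsr_control_stack_size() is hypothetical and not part of any driver or ROCr API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of struct kfd_context_save_area_header (assumed layout;
 * verify against include/uapi/linux/kfd_ioctl.h before relying on it).
 */
struct cwsr_header_sketch {
	struct {
		uint32_t control_stack_offset;
		uint32_t wave_state_offset;
		uint32_t control_stack_size;
		uint32_t wave_state_size;
	} wave_state;
	uint32_t debug_offset;
	uint32_t debug_size;
	uint64_t err_payload_addr;
	uint32_t err_event_id;
	uint32_t reserved1;
};

/*
 * Hypothetical helper: read the effective control stack size from the
 * header at the start of a mapped CWSR area (ctx_save_area_va).
 * Returns 0 on success, -1 if the inputs or the header look invalid.
 */
static int cwsr_control_stack_size(const void *save_area,
				   size_t save_area_size,
				   uint32_t *size_out)
{
	struct cwsr_header_sketch hdr;

	if (!save_area || !size_out || save_area_size < sizeof(hdr))
		return -1;

	/* Copy out the header instead of casting, to avoid alignment issues. */
	memcpy(&hdr, save_area, sizeof(hdr));

	/* A control stack larger than the whole area cannot be valid. */
	if (hdr.wave_state.control_stack_size > save_area_size)
		return -1;

	*size_out = hdr.wave_state.control_stack_size;
	return 0;
}
```

The sketch only reads the size field and deliberately does not interpret the offset fields, since their exact semantics are not spelled out in this thread.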


Thanks, that clarifies. Control stack size is expected to be passed to mqd. I think driver can use the size it calculated as long as user has allocated the minimum size required for the whole save area. Will remove this from input parameter.
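
A minimal sketch of that idea, in plain C rather than actual driver code: the user passes only the total save area size, the check rejects anything below the driver's minimum, and the control stack size comes from the driver's own calculation. The struct cwsr_sizes_sketch and the function cwsr_pick_ctl_stack_size() are hypothetical names used for illustration only; they are not from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical container for the sizes the driver computes for a GPU. */
struct cwsr_sizes_sketch {
	uint32_t min_save_area_size; /* minimum total save/restore allocation */
	uint32_t ctl_stack_size;     /* control stack portion of that total */
};

/*
 * Accept the user's allocation only if it covers the driver's minimum,
 * then report the driver-computed control stack size for the MQD.
 * The user-supplied control stack size is no longer needed.
 */
static int cwsr_pick_ctl_stack_size(const struct cwsr_sizes_sketch *drv,
				    uint32_t user_save_area_size,
				    uint32_t *ctl_stack_size_out)
{
	if (!drv || !ctl_stack_size_out)
		return -EINVAL;

	if (user_save_area_size < drv->min_save_area_size)
		return -EINVAL;

	*ctl_stack_size_out = drv->ctl_stack_size;
	return 0;
}
```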

Thanks for the pointer to save area header. The interface to query the used size is missing.

Thanks,
Lijo

Best,
Lancelot.

cc Jonathan.


Alex

  };

  /* userq signal/wait ioctl */
--
2.49.0



