On 2025-12-05 14:46, Felix Kuehling wrote:
On 2025-12-04 17:51, Philip Yang wrote:
On 2025-12-03 12:55, Kuehling, Felix wrote:
On 2025-12-01 09:28, Philip Yang wrote:
To reduce queue switch latency further, move the MQD to the VRAM domain
and add the AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS flag so that the pages are
allocated contiguously from one buddy block.
Why does it need to be contiguous? In the next patch you're mapping
it in the GART anyway.
Without the AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS flag, amdgpu_bo_gpu_offset
triggers this warning:
WARN_ON_ONCE(bo->tbo.resource->mem_type == TTM_PL_VRAM &&
!(bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS));
This makes sense because we pass the FB aperture address to CP, so the
pages should be contiguous.
That's right, if you query the VRAM offset of the BO, you're assuming
that it's contiguous. If you're mapping it in the GART, you wouldn't
use that VRAM offset, you'd use the GART address instead. Since you
don't want to use the mapping of the BO in the FB aperture (because
that gives you the wrong MTYPE), you should not need to use
amdgpu_bo_gpu_offset at all.
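
For readers following along, the condition behind that warning can be modeled in a few lines of standalone C. This is only a sketch under stated assumptions: the toy_ names, the enum, and the flag value are hypothetical stand-ins, not the kernel definitions, and the function only mirrors the boolean condition quoted earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the kernel's types. */
enum toy_mem_type { TOY_PL_SYSTEM, TOY_PL_TT, TOY_PL_VRAM };

/* Assumed flag bit; the real flag value is not taken from the source. */
#define TOY_CREATE_VRAM_CONTIGUOUS (1u << 0)

struct toy_bo {
	enum toy_mem_type mem_type;
	uint32_t flags;
};

/*
 * Mirrors the quoted condition: querying a single offset for a BO that
 * lives in VRAM but was not created contiguous is flagged, because such
 * a BO may span several blocks and has no single meaningful offset.
 */
static bool toy_offset_query_would_warn(const struct toy_bo *bo)
{
	return bo->mem_type == TOY_PL_VRAM &&
	       !(bo->flags & TOY_CREATE_VRAM_CONTIGUOUS);
}
```

In this model, a BO placed in the GTT domain never trips the check, which matches the point that the check concerns the VRAM offset and not a GART address.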
With this patch, the MQD moves to VRAM and amdgpu_bo_gpu_offset is passed
to CP. This works with the FB aperture MTYPE, so the contiguous flag is
needed.
With the GART mapping in the next patch, yes, we can remove the
contiguous flag. Because the MQD is small and pinned, a contiguous
allocation also makes the GART mapping simpler to update.
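
A rough illustration of why a single contiguous block keeps the mapping update simple: every page address follows from the base alone, so one loop suffices and there is no list of blocks to walk. This is a sketch, not the driver's code; toy_fill_page_addrs and TOY_PAGE_SIZE are hypothetical names, and the 4 KiB page size is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the illustration. */
#define TOY_PAGE_SIZE 4096u

/*
 * For one contiguous block starting at 'base', the address of page i is
 * simply base + i * page size. Writes 'npages' addresses into 'out'.
 */
static void toy_fill_page_addrs(uint64_t base, size_t npages, uint64_t *out)
{
	for (size_t i = 0; i < npages; i++)
		out[i] = base + (uint64_t)i * TOY_PAGE_SIZE;
}
```

With a non-contiguous allocation, the same job would need to iterate over each block and track a separate base per block.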
Regards,
Philip
Regards,
Felix
Regards,
Philip
Regards,
Felix
Signed-off-by: Philip Yang <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 3 ++-
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 9cd1660b8f60..c11e37915365 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -329,7 +329,8 @@ int amdgpu_amdkfd_alloc_gtt_mem(struct amdgpu_device *adev, size_t size,
bp.size = size;
bp.byte_align = PAGE_SIZE;
bp.domain = domain;
- bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC;
+ bp.flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
+ AMDGPU_GEM_CREATE_CPU_GTT_USWC;
bp.type = ttm_bo_type_kernel;
bp.resv = NULL;
bp.bo_ptr_size = sizeof(struct amdgpu_bo);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index a489d43d5f64..c6945c842267 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -139,7 +139,7 @@ static struct kfd_mem_obj *allocate_mqd(struct kfd_node *node,
(ALIGN(q->ctl_stack_size, PAGE_SIZE) +
ALIGN(sizeof(struct v9_mqd), PAGE_SIZE)) *
NUM_XCC(node->xcc_mask),
- AMDGPU_GEM_DOMAIN_GTT,
+ AMDGPU_GEM_DOMAIN_VRAM,
&(mqd_mem_obj->gtt_mem),
&(mqd_mem_obj->gpu_addr),
(void *)&(mqd_mem_obj->cpu_ptr), true);