[RFC PATCH] drm/amdkfd: Run restore_workers on freezable WQs
Make restore workers freezable so we don't have to explicitly flush them in suspend and GPU reset code paths, and we don't accidentally try to restore BOs while the GPU is suspended. Not having to flush restore_work also helps avoid lock/fence dependencies in the GPU reset case where we're not allowed to wait for fences. This is an RFC and request for testing. I have not tested this myself yet. My notes below: Restore work won't be frozen during GPU reset. Does it matter? Queues will stay evicted until resume in any case. But restore_work may be in trouble if it gets run in the middle of a GPU reset. In that case, if anything fails, it will just reschedule itself, so should be fine as long as it doesn't interfere with the reset itself (respects any mechanisms in place to prevent HW access during the reset). What HW access does restore_work perform: - migrating buffers: uses GPU scheduler, should be suspended during reset - TLB flushes in userptr restore worker: not called directly, relies on HWS to flush TLBs on VMID attachment - TLB flushes in SVM restore worker: Does TLB flush in the mapping code - Resuming user mode queues: should not happen while GPU reset keeps queue eviction counter elevated Ergo: Except for the SVM case, it's OK to not flush restore work before GPU resets. I'll need to rethink the interaction of SVM restore_work with GPU resets. What about cancelling p->eviction_work? Eviction work would normally be needed to signal eviction fences, but we're doing that explicitly in suspend_all_processes. Does eviction_work wait for fences anywhere? Yes, indirectly by flushing restore_work. So we should not try to cancel it during a GPU reset. Problem: accessing p->ef concurrently in evict_process_worker and suspend_all_processes. Need a spinlock to use and update it safely. Problem: What if evict_process_worker gets stuck in flushing restore_work? We can skip all of that if p->ef is NULL (which it is during the reset). Even if it gets stuck, it's no problem if the reset doesn't depend on it. It should get unstuck after the reset. Signed-off-by: Felix Kuehling --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 9 ++-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + drivers/gpu/drm/amd/amdkfd/kfd_process.c | 49 +-- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 +- 4 files changed, 44 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index 54f31a420229..89e632257663 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -1644,7 +1644,8 @@ int amdgpu_amdkfd_criu_resume(void *p) goto out_unlock; } WRITE_ONCE(pinfo->block_mmu_notifications, false); - schedule_delayed_work(>restore_userptr_work, 0); + queue_delayed_work(system_freezable_wq, + >restore_userptr_work, 0); out_unlock: mutex_unlock(>lock); @@ -2458,7 +2459,8 @@ int amdgpu_amdkfd_evict_userptr(struct mmu_interval_notifier *mni, KFD_QUEUE_EVICTION_TRIGGER_USERPTR); if (r) pr_err("Failed to quiesce KFD\n"); - schedule_delayed_work(_info->restore_userptr_work, + queue_delayed_work(system_freezable_wq, + _info->restore_userptr_work, msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS)); } mutex_unlock(_info->notifier_lock); @@ -2793,7 +2795,8 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct work_struct *work) /* If validation failed, reschedule another attempt */ if (evicted_bos) { - schedule_delayed_work(_info->restore_userptr_work, + queue_delayed_work(system_freezable_wq, + _info->restore_userptr_work, msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS)); kfd_smi_event_queue_restore_rescheduled(mm); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 9cc32f577e38..cf017d027fee 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -919,6 +919,7 @@ struct kfd_process { * during restore */ struct dma_fence *ef; + spinlock_t ef_lock; /* Work items for evicting and restoring BOs */ struct delayed_work eviction_work; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index fbf053001af9..a07cba58ec5e 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -664,7 +664,8 @@ int kfd_process_create_wq(void) if (!kfd_process_wq) kfd_process_wq = alloc_workqueue("kfd_process_wq", 0, 0); if (!kfd_restore_wq) -
Re: drm/amd: Fix UBSAN array-index-out-of-bounds for Powerplay headers
On 10/27/2023 15:41, Alex Deucher wrote: For pptable structs that use flexible array sizes, use flexible arrays. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926 Signed-off-by: Alex Deucher Reviewed-by: Mario Limonciello --- .../drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 4 ++-- .../amd/pm/powerplay/hwmgr/vega10_pptable.h | 24 +-- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h index 9fcad69a9f34..2cf2a7b12623 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h @@ -367,7 +367,7 @@ typedef struct _ATOM_Tonga_VCE_State_Record { typedef struct _ATOM_Tonga_VCE_State_Table { UCHAR ucRevId; UCHAR ucNumEntries; - ATOM_Tonga_VCE_State_Record entries[1]; + ATOM_Tonga_VCE_State_Record entries[]; } ATOM_Tonga_VCE_State_Table; typedef struct _ATOM_Tonga_PowerTune_Table { @@ -481,7 +481,7 @@ typedef struct _ATOM_Tonga_Hard_Limit_Record { typedef struct _ATOM_Tonga_Hard_Limit_Table { UCHAR ucRevId; UCHAR ucNumEntries; - ATOM_Tonga_Hard_Limit_Record entries[1]; + ATOM_Tonga_Hard_Limit_Record entries[]; } ATOM_Tonga_Hard_Limit_Table; typedef struct _ATOM_Tonga_GPIO_Table { diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h index 8b0590b834cc..de2926df5ed7 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h @@ -129,7 +129,7 @@ typedef struct _ATOM_Vega10_State { typedef struct _ATOM_Vega10_State_Array { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Vega10_State states[1]; /* Dynamically allocate entries. */ + ATOM_Vega10_State states[]; /* Dynamically allocate entries. */ } ATOM_Vega10_State_Array; typedef struct _ATOM_Vega10_CLK_Dependency_Record { @@ -169,37 +169,37 @@ typedef struct _ATOM_Vega10_GFXCLK_Dependency_Table { typedef struct _ATOM_Vega10_MCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ -ATOM_Vega10_MCLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ +ATOM_Vega10_MCLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_MCLK_Dependency_Table; typedef struct _ATOM_Vega10_SOCCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ -ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ +ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_SOCCLK_Dependency_Table; typedef struct _ATOM_Vega10_DCEFCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ -ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ +ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_DCEFCLK_Dependency_Table; typedef struct _ATOM_Vega10_PIXCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ + ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_PIXCLK_Dependency_Table; typedef struct _ATOM_Vega10_DISPCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries.*/ - ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ + ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_DISPCLK_Dependency_Table; typedef struct _ATOM_Vega10_PHYCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ + ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_PHYCLK_Dependency_Table; typedef struct _ATOM_Vega10_MM_Dependency_Record { @@ -213,7 +213,7 @@ typedef struct _ATOM_Vega10_MM_Dependency_Record { typedef struct _ATOM_Vega10_MM_Dependency_Table { UCHAR ucRevId; UCHAR
[PATCH] drm/amd: Fix UBSAN array-index-out-of-bounds for Powerplay headers
For pptable structs that use flexible array sizes, use flexible arrays. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926 Signed-off-by: Alex Deucher --- .../drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 4 ++-- .../amd/pm/powerplay/hwmgr/vega10_pptable.h | 24 +-- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h index 9fcad69a9f34..2cf2a7b12623 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h @@ -367,7 +367,7 @@ typedef struct _ATOM_Tonga_VCE_State_Record { typedef struct _ATOM_Tonga_VCE_State_Table { UCHAR ucRevId; UCHAR ucNumEntries; - ATOM_Tonga_VCE_State_Record entries[1]; + ATOM_Tonga_VCE_State_Record entries[]; } ATOM_Tonga_VCE_State_Table; typedef struct _ATOM_Tonga_PowerTune_Table { @@ -481,7 +481,7 @@ typedef struct _ATOM_Tonga_Hard_Limit_Record { typedef struct _ATOM_Tonga_Hard_Limit_Table { UCHAR ucRevId; UCHAR ucNumEntries; - ATOM_Tonga_Hard_Limit_Record entries[1]; + ATOM_Tonga_Hard_Limit_Record entries[]; } ATOM_Tonga_Hard_Limit_Table; typedef struct _ATOM_Tonga_GPIO_Table { diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h index 8b0590b834cc..de2926df5ed7 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h @@ -129,7 +129,7 @@ typedef struct _ATOM_Vega10_State { typedef struct _ATOM_Vega10_State_Array { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Vega10_State states[1]; /* Dynamically allocate entries. */ + ATOM_Vega10_State states[]; /* Dynamically allocate entries. */ } ATOM_Vega10_State_Array; typedef struct _ATOM_Vega10_CLK_Dependency_Record { @@ -169,37 +169,37 @@ typedef struct _ATOM_Vega10_GFXCLK_Dependency_Table { typedef struct _ATOM_Vega10_MCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ -ATOM_Vega10_MCLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ +ATOM_Vega10_MCLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_MCLK_Dependency_Table; typedef struct _ATOM_Vega10_SOCCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ -ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ +ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_SOCCLK_Dependency_Table; typedef struct _ATOM_Vega10_DCEFCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ -ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ +ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_DCEFCLK_Dependency_Table; typedef struct _ATOM_Vega10_PIXCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ + ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_PIXCLK_Dependency_Table; typedef struct _ATOM_Vega10_DISPCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries.*/ - ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ + ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_DISPCLK_Dependency_Table; typedef struct _ATOM_Vega10_PHYCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically allocate entries. */ + ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically allocate entries. */ } ATOM_Vega10_PHYCLK_Dependency_Table; typedef struct _ATOM_Vega10_MM_Dependency_Record { @@ -213,7 +213,7 @@ typedef struct _ATOM_Vega10_MM_Dependency_Record { typedef struct _ATOM_Vega10_MM_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries */ - ATOM_Vega10_MM_Dependency_Record entries[1];
[PATCH] drm/amdgpu: Change WREG32_RLC to WREG32_SOC15_RLC where inst != 0 (v2)
W/RREG32_RLC is hardedcoded to use instance 0. W/RREG32_SOC15_RLC should be used instead when inst != 0. v2: rebase Signed-off-by: Victor Lu --- .../drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c | 38 -- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 40 +-- drivers/gpu/drm/amd/amdgpu/soc15_common.h | 2 +- 3 files changed, 37 insertions(+), 43 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c index 80309d39737a..f6598b9e4faa 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c @@ -306,8 +306,7 @@ static int kgd_gfx_v9_4_3_hqd_load(struct amdgpu_device *adev, void *mqd, /* Activate doorbell logic before triggering WPTR poll. */ data = REG_SET_FIELD(m->cp_hqd_pq_doorbell_control, CP_HQD_PQ_DOORBELL_CONTROL, DOORBELL_EN, 1); - WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_PQ_DOORBELL_CONTROL), - data); + WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_PQ_DOORBELL_CONTROL, data); if (wptr) { /* Don't read wptr with get_user because the user @@ -336,27 +335,24 @@ static int kgd_gfx_v9_4_3_hqd_load(struct amdgpu_device *adev, void *mqd, guessed_wptr += m->cp_hqd_pq_wptr_lo & ~(queue_size - 1); guessed_wptr += (uint64_t)m->cp_hqd_pq_wptr_hi << 32; - WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_PQ_WPTR_LO), - lower_32_bits(guessed_wptr)); - WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_PQ_WPTR_HI), - upper_32_bits(guessed_wptr)); - WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_PQ_WPTR_POLL_ADDR), - lower_32_bits((uintptr_t)wptr)); - WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), - regCP_HQD_PQ_WPTR_POLL_ADDR_HI), + WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_PQ_WPTR_LO, + lower_32_bits(guessed_wptr)); + WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_PQ_WPTR_HI, + upper_32_bits(guessed_wptr)); + WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_PQ_WPTR_POLL_ADDR, + lower_32_bits((uintptr_t)wptr)); + WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_PQ_WPTR_POLL_ADDR_HI, upper_32_bits((uintptr_t)wptr)); - WREG32(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_PQ_WPTR_POLL_CNTL1), - (uint32_t)kgd_gfx_v9_get_queue_mask(adev, pipe_id, - queue_id)); + WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_PQ_WPTR_POLL_CNTL1, + (uint32_t)kgd_gfx_v9_get_queue_mask(adev, pipe_id, queue_id)); } /* Start the EOP fetcher */ - WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_EOP_RPTR), - REG_SET_FIELD(m->cp_hqd_eop_rptr, -CP_HQD_EOP_RPTR, INIT_FETCHER, 1)); + WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_EOP_RPTR, + REG_SET_FIELD(m->cp_hqd_eop_rptr, CP_HQD_EOP_RPTR, INIT_FETCHER, 1)); data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1); - WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_ACTIVE), data); + WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_ACTIVE, data); kgd_gfx_v9_release_queue(adev, inst); @@ -494,15 +490,15 @@ static uint32_t kgd_gfx_v9_4_3_set_address_watch( VALID, 1); - WREG32_RLC((SOC15_REG_OFFSET(GC, GET_INST(GC, inst), + WREG32_XCC((SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regTCP_WATCH0_ADDR_H) + (watch_id * TCP_WATCH_STRIDE)), - watch_address_high); + watch_address_high, inst); - WREG32_RLC((SOC15_REG_OFFSET(GC, GET_INST(GC, inst), + WREG32_XCC((SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regTCP_WATCH0_ADDR_L) + (watch_id * TCP_WATCH_STRIDE)), - watch_address_low); + watch_address_low, inst); return watch_address_cntl; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c index 9285789b3a42..00fbc0f44c92 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c @@ -91,8 +91,8 @@ void kgd_gfx_v9_program_sh_mem_settings(struct amdgpu_device *adev, uint32_t vmi { kgd_gfx_v9_lock_srbm(adev, 0, 0, 0, vmid, inst); -
[PATCH] drm/amdgpu: Add xcc_inst param to amdgpu_virt_kiq_reg_write_reg_wait (v3)
amdgpu_virt_kiq_reg_write_reg_wait is hardcoded to use MEC engine 0. Add xcc_inst as a parameter to allow it to use different MEC engines. v3: use first xcc for MMHUB in gmc_v9_0_flush_gpu_tlb v2: rebase Signed-off-by: Victor Lu --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 5 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 3 ++- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 26 ++-- 5 files changed, 22 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index 7a084fbfd33c..82762c61d3ec 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -73,9 +73,10 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev) void amdgpu_virt_kiq_reg_write_reg_wait(struct amdgpu_device *adev, uint32_t reg0, uint32_t reg1, - uint32_t ref, uint32_t mask) + uint32_t ref, uint32_t mask, + uint32_t xcc_inst) { - struct amdgpu_kiq *kiq = >gfx.kiq[0]; + struct amdgpu_kiq *kiq = >gfx.kiq[xcc_inst]; struct amdgpu_ring *ring = >ring; signed long r, cnt = 0; unsigned long flags; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h index 03c0e38b8aea..5c64258eb728 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h @@ -334,7 +334,8 @@ bool amdgpu_virt_mmio_blocked(struct amdgpu_device *adev); void amdgpu_virt_init_setting(struct amdgpu_device *adev); void amdgpu_virt_kiq_reg_write_reg_wait(struct amdgpu_device *adev, uint32_t reg0, uint32_t rreg1, - uint32_t ref, uint32_t mask); + uint32_t ref, uint32_t mask, + uint32_t xcc_id); int amdgpu_virt_request_full_gpu(struct amdgpu_device *adev, bool init); int amdgpu_virt_release_full_gpu(struct amdgpu_device *adev, bool init); int amdgpu_virt_reset_gpu(struct amdgpu_device *adev); diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index d8a4fddab9c1..173237e99882 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -268,7 +268,7 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, if (adev->gfx.kiq[0].ring.sched.ready && !adev->enable_mes && (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) { amdgpu_virt_kiq_reg_write_reg_wait(adev, req, ack, inv_req, - 1 << vmid); + 1 << vmid, 0); return; } diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c index 19eaada35ede..2e4abb356e38 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c @@ -228,7 +228,7 @@ static void gmc_v11_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, if ((adev->gfx.kiq[0].ring.sched.ready || adev->mes.ring.sched.ready) && (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) { amdgpu_virt_kiq_reg_write_reg_wait(adev, req, ack, inv_req, - 1 << vmid); + 1 << vmid, 0); return; } diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 0ab9c554da78..32cc3645f02b 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -817,7 +817,7 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, uint32_t vmhub, uint32_t flush_type) { bool use_semaphore = gmc_v9_0_use_invalidate_semaphore(adev, vmhub); - u32 j, inv_req, tmp, sem, req, ack; + u32 j, inv_req, tmp, sem, req, ack, inst; const unsigned int eng = 17; struct amdgpu_vmhub *hub; @@ -832,13 +832,17 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, /* This is necessary for a HW workaround under SRIOV as well * as GFXOFF under bare metal */ - if (adev->gfx.kiq[0].ring.sched.ready && + if (vmhub >= AMDGPU_MMHUB0(0)) + inst = 0; + else + inst = vmhub; + if (adev->gfx.kiq[inst].ring.sched.ready && (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) { uint32_t req = hub->vm_inv_eng0_req + hub->eng_distance * eng; uint32_t ack = hub->vm_inv_eng0_ack + hub->eng_distance * eng;
[PATCH] drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v4)
amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0. Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC and amdgpu_device_xcc_wreg/rreg to to use the new xcc_id parameter. Using amdgpu_sriov_runtime to determine whether to access via kiq or RLC is sufficient for now. v4: avoid using amdgpu_sriov_w/rreg v3: use W/RREG32_XCC to handle non-kiq case v2: define amdgpu_device_xcc_wreg/rreg instead of changing parameters of amdgpu_device_wreg/rreg Signed-off-by: Victor Lu --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 13 ++- .../drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c | 2 +- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 91 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 8 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 4 + drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 8 +- 9 files changed, 118 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 43c579f5a95e..e8dc75a3ff44 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1162,11 +1162,18 @@ uint32_t amdgpu_device_rreg(struct amdgpu_device *adev, uint32_t reg, uint32_t acc_flags); u32 amdgpu_device_indirect_rreg_ext(struct amdgpu_device *adev, u64 reg_addr); +uint32_t amdgpu_device_xcc_rreg(struct amdgpu_device *adev, + uint32_t reg, uint32_t acc_flags, + uint32_t xcc_id); void amdgpu_device_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v, uint32_t acc_flags); void amdgpu_device_indirect_wreg_ext(struct amdgpu_device *adev, u64 reg_addr, u32 reg_data); +void amdgpu_device_xcc_wreg(struct amdgpu_device *adev, + uint32_t reg, uint32_t v, + uint32_t acc_flags, + uint32_t xcc_id); void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev, uint32_t reg, uint32_t v, uint32_t xcc_id); void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t value); @@ -1207,8 +1214,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev); #define RREG32_NO_KIQ(reg) amdgpu_device_rreg(adev, (reg), AMDGPU_REGS_NO_KIQ) #define WREG32_NO_KIQ(reg, v) amdgpu_device_wreg(adev, (reg), (v), AMDGPU_REGS_NO_KIQ) -#define RREG32_KIQ(reg) amdgpu_kiq_rreg(adev, (reg)) -#define WREG32_KIQ(reg, v) amdgpu_kiq_wreg(adev, (reg), (v)) +#define RREG32_KIQ(reg) amdgpu_kiq_rreg(adev, (reg), 0) +#define WREG32_KIQ(reg, v) amdgpu_kiq_wreg(adev, (reg), (v), 0) #define RREG8(reg) amdgpu_mm_rreg8(adev, (reg)) #define WREG8(reg, v) amdgpu_mm_wreg8(adev, (reg), (v)) @@ -1218,6 +1225,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev); #define WREG32(reg, v) amdgpu_device_wreg(adev, (reg), (v), 0) #define REG_SET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK) #define REG_GET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK) +#define RREG32_XCC(reg, inst) amdgpu_device_xcc_rreg(adev, (reg), 0, inst) +#define WREG32_XCC(reg, v, inst) amdgpu_device_xcc_wreg(adev, (reg), (v), 0, inst) #define RREG32_PCIE(reg) adev->pcie_rreg(adev, (reg)) #define WREG32_PCIE(reg, v) adev->pcie_wreg(adev, (reg), (v)) #define RREG32_PCIE_PORT(reg) adev->pciep_rreg(adev, (reg)) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c index 490c8f5ddb60..80309d39737a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c @@ -300,7 +300,7 @@ static int kgd_gfx_v9_4_3_hqd_load(struct amdgpu_device *adev, void *mqd, hqd_end = SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_AQL_DISPATCH_ID_HI); for (reg = hqd_base; reg <= hqd_end; reg++) - WREG32_RLC(reg, mqd_hqd[reg - hqd_base]); + WREG32_XCC(reg, mqd_hqd[reg - hqd_base], inst); /* Activate doorbell logic before triggering WPTR poll. */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c index 51011e8ee90d..9285789b3a42 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c @@ -239,7 +239,7 @@ int kgd_gfx_v9_hqd_load(struct amdgpu_device *adev, void *mqd, for (reg = hqd_base; reg <= SOC15_REG_OFFSET(GC, GET_INST(GC, inst), mmCP_HQD_PQ_WPTR_HI); reg++) - WREG32_RLC(reg, mqd_hqd[reg - hqd_base]); + WREG32_XCC(reg, mqd_hqd[reg - hqd_base], inst); /* Activate doorbell logic before triggering WPTR poll. */ diff --git
[PATCH] drm/amdgpu: Add xcc instance parameter to *REG32_SOC15_IP_NO_KIQ (v3)
The WREG32/RREG32_SOC15_IP_NO_KIQ call is using XCC0's RLCG interface when programming other XCCs. Add xcc instance parameter to them. v3: xcc not needed for MMMHUB v2: rebase Signed-off-by: Victor Lu --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 16 drivers/gpu/drm/amd/amdgpu/soc15_common.h | 6 +++--- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 3a1050344b59..0ab9c554da78 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -856,9 +856,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, for (j = 0; j < adev->usec_timeout; j++) { /* a read return value of 1 means semaphore acquire */ if (vmhub >= AMDGPU_MMHUB0(0)) - tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, sem); + tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0); else - tmp = RREG32_SOC15_IP_NO_KIQ(GC, sem); + tmp = RREG32_SOC15_IP_NO_KIQ(GC, sem, vmhub); if (tmp & 0x1) break; udelay(1); @@ -869,9 +869,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, } if (vmhub >= AMDGPU_MMHUB0(0)) - WREG32_SOC15_IP_NO_KIQ(MMHUB, req, inv_req); + WREG32_SOC15_IP_NO_KIQ(MMHUB, req, inv_req, 0); else - WREG32_SOC15_IP_NO_KIQ(GC, req, inv_req); + WREG32_SOC15_IP_NO_KIQ(GC, req, inv_req, vmhub); /* * Issue a dummy read to wait for the ACK register to @@ -884,9 +884,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, for (j = 0; j < adev->usec_timeout; j++) { if (vmhub >= AMDGPU_MMHUB0(0)) - tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, ack); + tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, ack, 0); else - tmp = RREG32_SOC15_IP_NO_KIQ(GC, ack); + tmp = RREG32_SOC15_IP_NO_KIQ(GC, ack, vmhub); if (tmp & (1 << vmid)) break; udelay(1); @@ -899,9 +899,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, * write with 0 means semaphore release */ if (vmhub >= AMDGPU_MMHUB0(0)) - WREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0); + WREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0, 0); else - WREG32_SOC15_IP_NO_KIQ(GC, sem, 0); + WREG32_SOC15_IP_NO_KIQ(GC, sem, 0, vmhub); } spin_unlock(>gmc.invalidate_lock); diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h b/drivers/gpu/drm/amd/amdgpu/soc15_common.h index da683afa0222..c75e9cd5c98b 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h +++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h @@ -69,7 +69,7 @@ #define RREG32_SOC15_IP(ip, reg) __RREG32_SOC15_RLC__(reg, 0, ip##_HWIP, 0) -#define RREG32_SOC15_IP_NO_KIQ(ip, reg) __RREG32_SOC15_RLC__(reg, AMDGPU_REGS_NO_KIQ, ip##_HWIP, 0) +#define RREG32_SOC15_IP_NO_KIQ(ip, reg, inst) __RREG32_SOC15_RLC__(reg, AMDGPU_REGS_NO_KIQ, ip##_HWIP, inst) #define RREG32_SOC15_NO_KIQ(ip, inst, reg) \ __RREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg, \ @@ -86,8 +86,8 @@ #define WREG32_SOC15_IP(ip, reg, value) \ __WREG32_SOC15_RLC__(reg, value, 0, ip##_HWIP, 0) -#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value) \ -__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP, 0) +#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value, inst) \ +__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP, inst) #define WREG32_SOC15_NO_KIQ(ip, inst, reg, value) \ __WREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg, \ -- 2.34.1
Re: [PATCHv2 1/2] drm/amdkfd: Populate cache info for GFX 9.4.3
On 2023-10-27 15:04, Mukul Joshi wrote: GFX 9.4.3 uses a new version of the GC info table which contains the cache info. This patch adds a new function to populate the cache info from IP discovery for GFX 9.4.3. Signed-off-by: Mukul Joshi --- v1->v2: - Separate out the original patch into 2 patches. drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 66 ++- 1 file changed, 65 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c index 0e792a8496d6..cd8e459201f1 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c @@ -1404,6 +1404,66 @@ static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev, return i; } +static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev, + struct kfd_gpu_cache_info *pcache_info) +{ + struct amdgpu_device *adev = kdev->adev; + int i = 0; + + /* TCP L1 Cache per CU */ + if (adev->gfx.config.gc_tcp_size_per_cu) { + pcache_info[i].cache_size = adev->gfx.config.gc_tcp_size_per_cu; + pcache_info[i].cache_level = 1; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_DATA_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = 1; + i++; + } + /* Scalar L1 Instruction Cache per SQC */ + if (adev->gfx.config.gc_l1_instruction_cache_size_per_sqc) { + pcache_info[i].cache_size = + adev->gfx.config.gc_l1_instruction_cache_size_per_sqc; + pcache_info[i].cache_level = 1; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_INST_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_cu_per_sqc; + i++; + } + /* Scalar L1 Data Cache per SQC */ + if (adev->gfx.config.gc_l1_data_cache_size_per_sqc) { + pcache_info[i].cache_size = adev->gfx.config.gc_l1_data_cache_size_per_sqc; + pcache_info[i].cache_level = 1; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_DATA_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_cu_per_sqc; + i++; + } + /* L2 Data Cache per GPU (Total Tex Cache) */ + if (adev->gfx.config.gc_tcc_size) { + pcache_info[i].cache_size = adev->gfx.config.gc_tcc_size; + pcache_info[i].cache_level = 2; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_DATA_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh; + i++; + } + /* L3 Data Cache per GPU */ + if (adev->gmc.mall_size) { + pcache_info[i].cache_size = adev->gmc.mall_size / 1024; Is this /1024 a unit conversion? What are the units for L1/L2 caches compared to L3 caches? When we report the sizes in the topology, they should be in the same units for all cache levels, I believe. Given that L3 is likely the largest, I'm a bit suspicious of this conversion. Other than that, the series is Reviewed-by: Felix Kuehling + pcache_info[i].cache_level = 3; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_DATA_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh; + i++; + } + return i; +} + int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pcache_info) { int num_of_cache_types = 0; @@ -1461,10 +1521,14 @@ int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pc num_of_cache_types = ARRAY_SIZE(vega20_cache_info); break; case IP_VERSION(9, 4, 2): - case IP_VERSION(9, 4, 3): *pcache_info = aldebaran_cache_info; num_of_cache_types = ARRAY_SIZE(aldebaran_cache_info); break; + case IP_VERSION(9, 4, 3): + num_of_cache_types = + kfd_fill_gpu_cache_info_from_gfx_config_v2(kdev->kfd, + *pcache_info); +
[pull] amdgpu, amdkfd drm-next-6.7
Hi Dave, Sima, Fixes for 6.7. The following changes since commit 5258dfd4a6adb5f45f046b0dd2e73c680f880d9d: usb: typec: altmodes/displayport: fixup drm internal api change vs new user. (2023-10-27 07:55:41 +1000) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-next-6.7-2023-10-27 for you to fetch changes up to dd3dd9829bf9a4ecd55482050745efdd9f7f97fc: drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo (2023-10-27 14:23:01 -0400) amd-drm-next-6.7-2023-10-27: amdgpu: - RAS fixes - Seamless boot fixes - NBIO 7.7 fix - SMU 14.0 fixes - GC 11.5 fixes - DML2 fixes - ASPM fixes - VPE fixes - Misc code cleanups - SRIOV fixes - Add some missing copyright notices - DCN 3.5 fixes - FAMS fixes - Backlight fix - S/G display fix - fdinfo cleanups - EXT_COHERENT fixes for APU and NUMA systems amdkfd: - Misc fixes - Misc code cleanups - SVM fixes Agustin Gutierrez (1): drm/amd/display: Remove power sequencing check Alex Hung (2): drm/amd/display: Revert "drm/amd/display: allow edp updates for virtual signal" drm/amd/display: Set emulated sink type to HDMI accordingly. Alvin Lee (1): drm/amd/display: Update FAMS sequence for DCN30 & DCN32 Aric Cyr (1): drm/amd/display: 3.2.256 Aurabindo Pillai (1): drm/amd/display: add interface to query SubVP status Candice Li (2): drm/amdgpu: Identify data parity error corrected in replay mode drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table David Francis (1): drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems Fangzhi Zuo (1): drm/amd/display: Fix MST Multi-Stream Not Lighting Up on dcn35 George Shen (1): drm/amd/display: Update SDP VSC colorimetry from DP test automation request Hamza Mahfooz (1): drm/amd/display: fix S/G display enablement Hugo Hu (1): drm/amd/display: reprogram det size while seamless boot Ilya Bakoulin (1): drm/amd/display: Fix shaper using bad LUT params Iswara Nagulendran (1): drm/amd/display: Read before writing Backlight Mode Set Register James Zhu (1): drm/amdxcp: fix amdxcp unloads incompletely Jesse Zhang (1): drm/amdkfd: Fix shift out-of-bounds issue Jiadong Zhu (2): drm/amd/pm: drop unneeded dpm features disablement for SMU 14.0.0 drm/amdgpu: add tmz support for GC IP v11.5.0 Kenneth Feng (1): drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded Lang Yu (1): drm/amdgpu/vpe: correct queue stop programing Li Ma (2): drm/amdgpu: modify if condition in nbio_v7_7.c drm/amd/amdgpu: fix the GPU power print error in pm info Lijo Lazar (4): drm/amdgpu: Add API to get full IP version drm/amdgpu: Use discovery table's subrevision drm/amdgpu: Add a read to GFX v9.4.3 ring test drm/amdgpu: Use pcie domain of xcc acpi objects Lin.Cao (2): drm/amdgpu remove restriction of sriov max_pfn on Vega10 drm/amd: check num of link levels when update pcie param Ma Jun (1): drm/amd/pm: Fix the return value in default case Mario Limonciello (4): drm/amd: Disable ASPM for VI w/ all Intel systems drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supported drm/amd: Move AMD_IS_APU check for ASPM into top level function drm/amd: Explicitly disable ASPM when dynamic switching disabled Michael Strauss (1): drm/amd/display: Disable SYMCLK32_SE RCO on DCN314 Mukul Joshi (1): drm/amdgpu: Fix typo in IP discovery parsing Nicholas Kazlauskas (2): drm/amd/display: Revert "Improve x86 and dmub ips handshake" drm/amd/display: Fix IPS handshake for idle optimizations Philip Yang (2): Revert "drm/amdkfd:remove unused code" Revert "drm/amdkfd: Use partial migrations in GPU page faults" Qu Huang (1): drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL Rob Clark (1): drm/amdgpu: Remove duplicate fdinfo fields Rodrigo Siqueira (5): drm/amd/display: Set the DML2 attribute to false in all DCNs older than version 3.5 drm/amd/display: Fix DMUB errors introduced by DML2 drm/amd/display: Correct enum typo drm/amd/display: Add prefix to amdgpu crtc functions drm/amd/display: Add prefix for plane functions Samson Tam (2): drm/amd/display: fix num_ways overflow error drm/amd/display: add null check for invalid opps Srinivasan Shanmugam (1): drm/amdkfd: Address 'remap_list' not described in 'svm_range_add' Stylon Wang (3): drm/amd/display: Add missing copyright notice in DMUB drm/amd/display: Fix copyright notice in DML2 code drm/amd/display: Fix copyright notice in DC code Sung Joon Kim (2): drm/amd/display: Add a check for idle power optimization
[PATCHv2 2/2] drm/amdkfd: Update cache info for GFX 9.4.3
Update cache info reporting based on compute and memory partitioning modes. Signed-off-by: Mukul Joshi --- v1->v2: - Separate into a separate patch. - Simplify the if condition to reduce indentation and make it logically more clear. drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index 4e530791507e..dc7c8312e8c7 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c @@ -1602,10 +1602,13 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext, unsigned int cu_sibling_map_mask; int first_active_cu; int i, j, k, xcc, start, end; + int num_xcc = NUM_XCC(knode->xcc_mask); struct kfd_cache_properties *pcache = NULL; + enum amdgpu_memory_partition mode; + struct amdgpu_device *adev = knode->adev; start = ffs(knode->xcc_mask) - 1; - end = start + NUM_XCC(knode->xcc_mask); + end = start + num_xcc; cu_sibling_map_mask = cu_info->bitmap[start][0][0]; cu_sibling_map_mask &= ((1 << pcache_info[cache_type].num_cu_shared) - 1); @@ -1624,7 +1627,18 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext, pcache->processor_id_low = cu_processor_id + (first_active_cu - 1); pcache->cache_level = pcache_info[cache_type].cache_level; - pcache->cache_size = pcache_info[cache_type].cache_size; + + if (KFD_GC_VERSION(knode) == IP_VERSION(9, 4, 3)) + mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev); + else + mode = UNKNOWN_MEMORY_PARTITION_MODE; + + if (pcache->cache_level == 2) + pcache->cache_size = pcache_info[cache_type].cache_size * num_xcc; + else if (mode) + pcache->cache_size = pcache_info[cache_type].cache_size / mode; + else + pcache->cache_size = pcache_info[cache_type].cache_size; if (pcache_info[cache_type].flags & CRAT_CACHE_FLAGS_DATA_CACHE) pcache->cache_type |= HSA_CACHE_TYPE_DATA; -- 2.35.1
[PATCHv2 1/2] drm/amdkfd: Populate cache info for GFX 9.4.3
GFX 9.4.3 uses a new version of the GC info table which contains the cache info. This patch adds a new function to populate the cache info from IP discovery for GFX 9.4.3. Signed-off-by: Mukul Joshi --- v1->v2: - Separate out the original patch into 2 patches. drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 66 ++- 1 file changed, 65 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c index 0e792a8496d6..cd8e459201f1 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c @@ -1404,6 +1404,66 @@ static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev, return i; } +static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev, + struct kfd_gpu_cache_info *pcache_info) +{ + struct amdgpu_device *adev = kdev->adev; + int i = 0; + + /* TCP L1 Cache per CU */ + if (adev->gfx.config.gc_tcp_size_per_cu) { + pcache_info[i].cache_size = adev->gfx.config.gc_tcp_size_per_cu; + pcache_info[i].cache_level = 1; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_DATA_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = 1; + i++; + } + /* Scalar L1 Instruction Cache per SQC */ + if (adev->gfx.config.gc_l1_instruction_cache_size_per_sqc) { + pcache_info[i].cache_size = + adev->gfx.config.gc_l1_instruction_cache_size_per_sqc; + pcache_info[i].cache_level = 1; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_INST_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_cu_per_sqc; + i++; + } + /* Scalar L1 Data Cache per SQC */ + if (adev->gfx.config.gc_l1_data_cache_size_per_sqc) { + pcache_info[i].cache_size = adev->gfx.config.gc_l1_data_cache_size_per_sqc; + pcache_info[i].cache_level = 1; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_DATA_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_cu_per_sqc; + i++; + } + /* L2 Data Cache per GPU (Total Tex Cache) */ + if (adev->gfx.config.gc_tcc_size) { + pcache_info[i].cache_size = adev->gfx.config.gc_tcc_size; + pcache_info[i].cache_level = 2; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_DATA_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh; + i++; + } + /* L3 Data Cache per GPU */ + if (adev->gmc.mall_size) { + pcache_info[i].cache_size = adev->gmc.mall_size / 1024; + pcache_info[i].cache_level = 3; + pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED | + CRAT_CACHE_FLAGS_DATA_CACHE | + CRAT_CACHE_FLAGS_SIMD_CACHE); + pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh; + i++; + } + return i; +} + int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pcache_info) { int num_of_cache_types = 0; @@ -1461,10 +1521,14 @@ int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pc num_of_cache_types = ARRAY_SIZE(vega20_cache_info); break; case IP_VERSION(9, 4, 2): - case IP_VERSION(9, 4, 3): *pcache_info = aldebaran_cache_info; num_of_cache_types = ARRAY_SIZE(aldebaran_cache_info); break; + case IP_VERSION(9, 4, 3): + num_of_cache_types = + kfd_fill_gpu_cache_info_from_gfx_config_v2(kdev->kfd, + *pcache_info); + break; case IP_VERSION(9, 1, 0): case IP_VERSION(9, 2, 2): *pcache_info = raven_cache_info; -- 2.35.1
[PATCH] drm/radeon: replace 1-element arrays with flexible-array members
Reported by coccinelle, the following patch will move the following 1 element arrays to flexible arrays. drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5545:32-48: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5461:34-44: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4447:30-40: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4236:30-41: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7044:24-37: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7054:24-37: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7095:28-45: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7553:8-17: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7559:8-17: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:3896:27-37: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5443:16-25: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5454:34-43: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4603:21-32: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:6299:32-44: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4628:32-46: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:6285:29-39: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4296:30-36: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4756:28-36: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4064:22-35: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7327:9-24: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7332:32-53: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:6030:8-17: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7362:26-41: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7369:29-44: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7349:24-32: WARNING use flexible-array member instead
RE: [PATCH] drm/radeon: replace 1-element arrays with flexible-array members
[Public] > -Original Message- > From: José Pekkarinen > Sent: Friday, October 27, 2023 12:59 PM > To: Deucher, Alexander ; Koenig, Christian > ; Pan, Xinhui ; > sk...@linuxfoundation.org > Cc: José Pekkarinen ; airl...@gmail.com; > dan...@ffwll.ch; amd-gfx@lists.freedesktop.org; dri- > de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; linux-kernel- > ment...@lists.linuxfoundation.org > Subject: [PATCH] drm/radeon: replace 1-element arrays with flexible-array > members > > Reported by coccinelle, the following patch will move the following 1 element > arrays to flexible arrays. > > drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:5545:32-48: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:5461:34-44: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4447:30-40: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4236:30-41: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7044:24-37: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7054:24-37: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7095:28-45: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7553:8-17: WARNING use flexible-array > member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7559:8-17: WARNING use flexible-array > member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:3896:27-37: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:5443:16-25: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:5454:34-43: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4603:21-32: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:6299:32-44: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4628:32-46: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:6285:29-39: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4296:30-36: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4756:28-36: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4064:22-35: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7327:9-24: WARNING use flexible-array > member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7332:32-53: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) >
[PATCH 6/7] drm/exec: Pass in initial # of objects
From: Rob Clark In cases where the # is known ahead of time, it is silly to do the table resize dance. Signed-off-by: Rob Clark --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 4 ++-- drivers/gpu/drm/drm_exec.c | 15 --- drivers/gpu/drm/nouveau/nouveau_exec.c | 2 +- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 2 +- include/drm/drm_exec.h | 2 +- 8 files changed, 22 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index efdb1c48f431..d27ca8f61929 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -65,7 +65,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, } amdgpu_sync_create(>sync); - drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT); + drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); return 0; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c index 720011019741..796fa6f1420b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c @@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct drm_exec exec; int r; - drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT); + drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); drm_exec_until_all_locked() { r = amdgpu_vm_lock_pd(vm, , 0); if (likely(!r)) @@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct drm_exec exec; int r; - drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT); + drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); drm_exec_until_all_locked() { r = amdgpu_vm_lock_pd(vm, , 0); if (likely(!r)) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index ca4d2d430e28..16f1715148ad 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj, struct drm_exec exec; long r; - drm_exec_init(, DRM_EXEC_IGNORE_DUPLICATES); + drm_exec_init(, DRM_EXEC_IGNORE_DUPLICATES, 0); drm_exec_until_all_locked() { r = drm_exec_prepare_obj(, >tbo.base, 1); drm_exec_retry_on_contention(); @@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data, } drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT | - DRM_EXEC_IGNORE_DUPLICATES); + DRM_EXEC_IGNORE_DUPLICATES, 0); drm_exec_until_all_locked() { if (gobj) { r = drm_exec_lock_obj(, gobj); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c index b6015157763a..3c351941701e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c @@ -1105,7 +1105,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev, amdgpu_sync_create(); - drm_exec_init(, 0); + drm_exec_init(, 0, 0); drm_exec_until_all_locked() { r = drm_exec_lock_obj(, _data->meta_data_obj->tbo.base); @@ -1176,7 +1176,7 @@ int amdgpu_mes_ctx_unmap_meta_data(struct amdgpu_device *adev, struct drm_exec exec; long r; - drm_exec_init(, 0); + drm_exec_init(, 0, 0); drm_exec_until_all_locked() { r = drm_exec_lock_obj(, _data->meta_data_obj->tbo.base); diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c index 5d2809de4517..27d11c20d148 100644 --- a/drivers/gpu/drm/drm_exec.c +++ b/drivers/gpu/drm/drm_exec.c @@ -69,16 +69,25 @@ static void drm_exec_unlock_all(struct drm_exec *exec) * drm_exec_init - initialize a drm_exec object * @exec: the drm_exec object to initialize * @flags: controls locking behavior, see DRM_EXEC_* defines + * @nr: the initial # of objects * * Initialize the object and make sure that we can track locked objects. + * + * If nr is non-zero then it is used as the initial objects table size. + * In either case, the table will grow (be re-allocated) on demand. */ -void drm_exec_init(struct drm_exec *exec, uint32_t flags) +void drm_exec_init(struct drm_exec *exec, uint32_t flags, unsigned nr) { + size_t sz = PAGE_SIZE; + + if (nr) + sz = (size_t)nr * sizeof(void *); + exec->flags = flags; - exec->objects = kmalloc(PAGE_SIZE, GFP_KERNEL); + exec->objects = kmalloc(sz, GFP_KERNEL);
[PATCH 0/7] drm/msm/gem: drm_exec conversion
From: Rob Clark Simplify the exec path (removing a legacy optimization) and convert to drm_exec. One drm_exec patch to allow passing in the expected # of GEM objects to avoid re-allocation. I'd be a bit happier if I could avoid the extra objects table allocation in drm_exec in the first place, but wasn't really happy with any of the things I tried to get rid of that. Rob Clark (7): drm/msm/gem: Remove "valid" tracking drm/msm/gem: Remove submit_unlock_unpin_bo() drm/msm/gem: Don't queue job to sched in error cases drm/msm/gem: Split out submit_unpin_objects() helper drm/msm/gem: Cleanup submit_cleanup_bo() drm/exec: Pass in initial # of objects drm/msm/gem: Convert to drm_exec drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 4 +- drivers/gpu/drm/drm_exec.c | 15 +- drivers/gpu/drm/msm/Kconfig | 1 + drivers/gpu/drm/msm/msm_gem.h | 13 +- drivers/gpu/drm/msm/msm_gem_submit.c| 197 ++-- drivers/gpu/drm/msm/msm_ringbuffer.c| 3 +- drivers/gpu/drm/nouveau/nouveau_exec.c | 2 +- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 2 +- include/drm/drm_exec.h | 2 +- 12 files changed, 79 insertions(+), 170 deletions(-) -- 2.41.0
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
On 10/27/23 11:55, Lakha, Bhawanpreet wrote: [AMD Official Use Only - General] There was a consensus to use memset instead of {0}. I remember making changes related to that previously. Hm, seems like it's used rather consistently in the DM and in DC though. Bhawan *From:* Mahfooz, Hamza *Sent:* October 27, 2023 11:53 AM *To:* Yuran Pereira ; airl...@gmail.com *Cc:* Li, Sun peng (Leo) ; Lakha, Bhawanpreet ; Pan, Xinhui ; Siqueira, Rodrigo ; linux-ker...@vger.kernel.org ; amd-gfx@lists.freedesktop.org ; dri-de...@lists.freedesktop.org ; Deucher, Alexander ; Koenig, Christian ; linux-kernel-ment...@lists.linuxfoundation.org *Subject:* Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay On 10/26/23 17:25, Yuran Pereira wrote: Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holding junk values which could possibly lead to undefined behaviour ``` ... pr_config.replay_enable_option |= pr_enable_option_static_screen; ... if (!pr_config.replay_timing_sync_supported) pr_config.replay_enable_option &= ~pr_enable_option_general_ui; ... ``` This patch initializes `pr_config` after its declaration to ensure that it doesn't contain junk data, and prevent any undefined behaviour Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") Signed-off-by: Yuran Pereira --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c index 32d3086c4cb7..40526507f50b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c @@ -23,6 +23,7 @@ * */ +#include #include "amdgpu_dm_replay.h" #include "dc.h" #include "dm_helpers.h" @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct amdgpu_dm_connector *ac struct replay_config pr_config; I would prefer setting pr_config = {0}; union replay_debug_flags *debug_flags = NULL; + memset(_config, 0, sizeof(pr_config)); + // For eDP, if Replay is supported, return true to skip checks if (link->replay_settings.config.replay_supported) return true; -- Hamza -- Hamza
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
[AMD Official Use Only - General] There was a consensus to use memset instead of {0}. I remember making changes related to that previously. Bhawan From: Mahfooz, Hamza Sent: October 27, 2023 11:53 AM To: Yuran Pereira ; airl...@gmail.com Cc: Li, Sun peng (Leo) ; Lakha, Bhawanpreet ; Pan, Xinhui ; Siqueira, Rodrigo ; linux-ker...@vger.kernel.org ; amd-gfx@lists.freedesktop.org ; dri-de...@lists.freedesktop.org ; Deucher, Alexander ; Koenig, Christian ; linux-kernel-ment...@lists.linuxfoundation.org Subject: Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay On 10/26/23 17:25, Yuran Pereira wrote: > Since `pr_config` is not initialized after its declaration, the > following operations with `replay_enable_option` may be performed > when `replay_enable_option` is holding junk values which could > possibly lead to undefined behaviour > > ``` > ... > pr_config.replay_enable_option |= pr_enable_option_static_screen; > ... > > if (!pr_config.replay_timing_sync_supported) > pr_config.replay_enable_option &= ~pr_enable_option_general_ui; > ... > ``` > > This patch initializes `pr_config` after its declaration to ensure that > it doesn't contain junk data, and prevent any undefined behaviour > > Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") > Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") > Signed-off-by: Yuran Pereira > --- > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > index 32d3086c4cb7..40526507f50b 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > @@ -23,6 +23,7 @@ >* >*/ > > +#include > #include "amdgpu_dm_replay.h" > #include "dc.h" > #include "dm_helpers.h" > @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct > amdgpu_dm_connector *ac >struct replay_config pr_config; I would prefer setting pr_config = {0}; >union replay_debug_flags *debug_flags = NULL; > > + memset(_config, 0, sizeof(pr_config)); > + >// For eDP, if Replay is supported, return true to skip checks >if (link->replay_settings.config.replay_supported) >return true; -- Hamza
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
Also, please write the tagline in present tense. On 10/27/23 11:53, Hamza Mahfooz wrote: On 10/26/23 17:25, Yuran Pereira wrote: Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holding junk values which could possibly lead to undefined behaviour ``` ... pr_config.replay_enable_option |= pr_enable_option_static_screen; ... if (!pr_config.replay_timing_sync_supported) pr_config.replay_enable_option &= ~pr_enable_option_general_ui; ... ``` This patch initializes `pr_config` after its declaration to ensure that it doesn't contain junk data, and prevent any undefined behaviour Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") Signed-off-by: Yuran Pereira --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c index 32d3086c4cb7..40526507f50b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c @@ -23,6 +23,7 @@ * */ +#include #include "amdgpu_dm_replay.h" #include "dc.h" #include "dm_helpers.h" @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct amdgpu_dm_connector *ac struct replay_config pr_config; I would prefer setting pr_config = {0}; union replay_debug_flags *debug_flags = NULL; + memset(_config, 0, sizeof(pr_config)); + // For eDP, if Replay is supported, return true to skip checks if (link->replay_settings.config.replay_supported) return true; -- Hamza
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
On 10/26/23 17:25, Yuran Pereira wrote: Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holding junk values which could possibly lead to undefined behaviour ``` ... pr_config.replay_enable_option |= pr_enable_option_static_screen; ... if (!pr_config.replay_timing_sync_supported) pr_config.replay_enable_option &= ~pr_enable_option_general_ui; ... ``` This patch initializes `pr_config` after its declaration to ensure that it doesn't contain junk data, and prevent any undefined behaviour Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") Signed-off-by: Yuran Pereira --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c index 32d3086c4cb7..40526507f50b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c @@ -23,6 +23,7 @@ * */ +#include #include "amdgpu_dm_replay.h" #include "dc.h" #include "dm_helpers.h" @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct amdgpu_dm_connector *ac struct replay_config pr_config; I would prefer setting pr_config = {0}; union replay_debug_flags *debug_flags = NULL; + memset(_config, 0, sizeof(pr_config)); + // For eDP, if Replay is supported, return true to skip checks if (link->replay_settings.config.replay_supported) return true; -- Hamza
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
[AMD Official Use Only - General] Thanks, Reviewed-by: Bhawanpreet Lakha From: Yuran Pereira Sent: October 26, 2023 5:25 PM To: airl...@gmail.com Cc: Yuran Pereira ; Wentland, Harry ; Li, Sun peng (Leo) ; Siqueira, Rodrigo ; Deucher, Alexander ; Koenig, Christian ; Pan, Xinhui ; dan...@ffwll.ch ; Lakha, Bhawanpreet ; amd-gfx@lists.freedesktop.org ; dri-de...@lists.freedesktop.org ; linux-ker...@vger.kernel.org ; linux-kernel-ment...@lists.linuxfoundation.org Subject: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holding junk values which could possibly lead to undefined behaviour ``` ... pr_config.replay_enable_option |= pr_enable_option_static_screen; ... if (!pr_config.replay_timing_sync_supported) pr_config.replay_enable_option &= ~pr_enable_option_general_ui; ... ``` This patch initializes `pr_config` after its declaration to ensure that it doesn't contain junk data, and prevent any undefined behaviour Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") Signed-off-by: Yuran Pereira --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c index 32d3086c4cb7..40526507f50b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c @@ -23,6 +23,7 @@ * */ +#include #include "amdgpu_dm_replay.h" #include "dc.h" #include "dm_helpers.h" @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct amdgpu_dm_connector *ac struct replay_config pr_config; union replay_debug_flags *debug_flags = NULL; + memset(_config, 0, sizeof(pr_config)); + // For eDP, if Replay is supported, return true to skip checks if (link->replay_settings.config.replay_supported) return true; -- 2.25.1
[PATCH 3/3] drm/amdgpu: add a retry for IP discovery init
AMD dGPUs have integrated FW that runs as soon as the device gets power and initializes the board (determines the amount of memory, provides configuration details to the driver, etc.). For direct PCIe attached cards this happens as soon as power is applied and normally completes well before the OS has even started loading. However, with hotpluggable ports like USB4, the driver needs to wait for this to complete before initializing the device. This normally takes 60-100ms, but could take longer on some older boards periodically due to memory training. Retry for up to a second. In the non-hotplug case, there should be no change in behavior and this should complete on the first try. v2: adjust test criteria v3: adjust checks for the masks, only enable on removable devices v4: skip bif_fb_en check Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 23 +-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index 5f9d75900bfa..9ca4d89352d4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -99,6 +99,7 @@ MODULE_FIRMWARE(FIRMWARE_IP_DISCOVERY); #define mmRCC_CONFIG_MEMSIZE 0xde3 +#define mmMP0_SMN_C2PMSG_330x16061 #define mmMM_INDEX 0x0 #define mmMM_INDEX_HI 0x6 #define mmMM_DATA 0x1 @@ -239,8 +240,26 @@ static int amdgpu_discovery_read_binary_from_sysmem(struct amdgpu_device *adev, static int amdgpu_discovery_read_binary_from_mem(struct amdgpu_device *adev, uint8_t *binary) { - uint64_t vram_size = (uint64_t)RREG32(mmRCC_CONFIG_MEMSIZE) << 20; - int ret = 0; + uint64_t vram_size; + u32 msg; + int i, ret = 0; + + /* It can take up to a second for IFWI init to complete on some dGPUs, +* but generally it should be in the 60-100ms range. Normally this starts +* as soon as the device gets power so by the time the OS loads this has long +* completed. However, when a card is hotplugged via e.g., USB4, we need to +* wait for this to complete. Once the C2PMSG is updated, we can +* continue. +*/ + if (dev_is_removable(>pdev->dev)) { + for (i = 0; i < 1000; i++) { + msg = RREG32(mmMP0_SMN_C2PMSG_33); + if (msg & 0x8000) + break; + msleep(1); + } + } + vram_size = (uint64_t)RREG32(mmRCC_CONFIG_MEMSIZE) << 20; if (vram_size) { uint64_t pos = vram_size - DISCOVERY_TMR_OFFSET; -- 2.41.0
[PATCH 1/3] drm/amdgpu: don't use ATRM for external devices
The ATRM ACPI method is for fetching the dGPU vbios rom image on laptops and all-in-one systems. It should not be used for external add in cards. If the dGPU is thunderbolt connected, don't try ATRM. v2: pci_is_thunderbolt_attached only works for Intel. Use pdev->external_facing instead. v3: dev_is_removable() seems to be what we want Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c index 38ccec913f00..f3a09ecb7699 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c @@ -29,6 +29,7 @@ #include "amdgpu.h" #include "atom.h" +#include #include #include #include @@ -287,6 +288,10 @@ static bool amdgpu_atrm_get_bios(struct amdgpu_device *adev) if (adev->flags & AMD_IS_APU) return false; + /* ATRM is for on-platform devices only */ + if (dev_is_removable(>pdev->dev)) + return false; + while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != NULL) { dhandle = ACPI_HANDLE(>dev); if (!dhandle) -- 2.41.0
[PATCH 2/3] drm/amdgpu: don't use pci_is_thunderbolt_attached()
It's only valid on Intel systems with the Intel VSEC. Use dev_is_removable() instead. This should do the right thing regardless of the platform. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 5 +++-- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 2381de831271..5c90080e93ba 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -41,6 +41,7 @@ #include #include #include +#include #include #include #include @@ -2223,7 +2224,6 @@ static int amdgpu_device_parse_gpu_info_fw(struct amdgpu_device *adev) */ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev) { - struct drm_device *dev = adev_to_drm(adev); struct pci_dev *parent; int i, r; bool total; @@ -2294,7 +2294,7 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev) (amdgpu_is_atpx_hybrid() || amdgpu_has_atpx_dgpu_power_cntl()) && ((adev->flags & AMD_IS_APU) == 0) && - !pci_is_thunderbolt_attached(to_pci_dev(dev->dev))) + !dev_is_removable(>pdev->dev)) adev->flags |= AMD_IS_PX; if (!(adev->flags & AMD_IS_APU)) { @@ -4138,7 +4138,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, px = amdgpu_device_supports_px(ddev); - if (px || (!pci_is_thunderbolt_attached(adev->pdev) && + if (px || (!dev_is_removable(>pdev->dev) && apple_gmux_detect(NULL, NULL))) vga_switcheroo_register_client(adev->pdev, _switcheroo_ops, px); @@ -4288,7 +4288,7 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev) px = amdgpu_device_supports_px(adev_to_drm(adev)); - if (px || (!pci_is_thunderbolt_attached(adev->pdev) && + if (px || (!dev_is_removable(>pdev->dev) && apple_gmux_detect(NULL, NULL))) vga_switcheroo_unregister_client(adev->pdev); diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c index e523627cfe25..df218d5ca775 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c @@ -28,6 +28,7 @@ #include "nbio/nbio_2_3_offset.h" #include "nbio/nbio_2_3_sh_mask.h" #include +#include #include #define smnPCIE_CONFIG_CNTL0x11180044 @@ -361,7 +362,7 @@ static void nbio_v2_3_enable_aspm(struct amdgpu_device *adev, data |= NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L0S_INACTIVITY__SHIFT; - if (pci_is_thunderbolt_attached(adev->pdev)) + if (dev_is_removable(>pdev->dev)) data |= NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT; else data |= NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT; @@ -480,7 +481,7 @@ static void nbio_v2_3_program_aspm(struct amdgpu_device *adev) def = data = RREG32_PCIE(smnPCIE_LC_CNTL); data |= NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L0S_INACTIVITY__SHIFT; - if (pci_is_thunderbolt_attached(adev->pdev)) + if (dev_is_removable(>pdev->dev)) data |= NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT; else data |= NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT; -- 2.41.0
Re: [PATCH 2/2] drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo
Applied the series. Thanks! Alex On Thu, Oct 26, 2023 at 6:43 PM Umio Yasuno wrote: > > Remove unused variables from amdgpu_show_fdinfo > > Signed-off-by: Umio Yasuno > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 6 -- > 1 file changed, 6 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c > index e9b5d1903..b960ca7ba 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c > @@ -55,21 +55,15 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = { > > void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file) > { > - struct amdgpu_device *adev = drm_to_adev(file->minor->dev); > struct amdgpu_fpriv *fpriv = file->driver_priv; > struct amdgpu_vm *vm = >vm; > > struct amdgpu_mem_stats stats; > ktime_t usage[AMDGPU_HW_IP_NUM]; > - uint32_t bus, dev, fn, domain; > unsigned int hw_ip; > int ret; > > memset(, 0, sizeof(stats)); > - bus = adev->pdev->bus->number; > - domain = pci_domain_nr(adev->pdev->bus); > - dev = PCI_SLOT(adev->pdev->devfn); > - fn = PCI_FUNC(adev->pdev->devfn); > > ret = amdgpu_bo_reserve(vm->root.bo, false); > if (ret) > -- > 2.42.0 > >
RE: [PATCH 2/2] drm/amdgpu: add RAS reset/query operations for XGMI v6_4
[AMD Official Use Only - General] Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tao Zhou Sent: Friday, October 27, 2023 19:33 To: amd-gfx@lists.freedesktop.org Cc: Zhou1, Tao Subject: [PATCH 2/2] drm/amdgpu: add RAS reset/query operations for XGMI v6_4 Reset/query RAS error status and count. v2: use XGMI IP version instead of WAFL version. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 46 ++-- 1 file changed, 43 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c index 2b7dc490ba6b..0533f873001b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c @@ -103,6 +103,16 @@ static const int walf_pcs_err_noncorrectable_mask_reg_aldebaran[] = { smnPCS_GOPX1_PCS_ERROR_NONCORRECTABLE_MASK + 0x10 }; +static const int xgmi3x16_pcs_err_status_reg_v6_4[] = { + smnPCS_XGMI3X16_PCS_ERROR_STATUS, + smnPCS_XGMI3X16_PCS_ERROR_STATUS + 0x10 }; + +static const int xgmi3x16_pcs_err_noncorrectable_mask_reg_v6_4[] = { + smnPCS_XGMI3X16_PCS_ERROR_NONCORRECTABLE_MASK, + smnPCS_XGMI3X16_PCS_ERROR_NONCORRECTABLE_MASK + 0x10 }; + static const struct amdgpu_pcs_ras_field xgmi_pcs_ras_fields[] = { {"XGMI PCS DataLossErr", SOC15_REG_FIELD(XGMI0_PCS_GOPX16_PCS_ERROR_STATUS, DataLossErr)}, @@ -958,6 +968,16 @@ static void amdgpu_xgmi_reset_ras_error_count(struct amdgpu_device *adev) default: break; } + + switch (amdgpu_ip_version(adev, XGMI_HWIP, 0)) { + case IP_VERSION(6, 4, 0): + for (i = 0; i < ARRAY_SIZE(xgmi3x16_pcs_err_status_reg_v6_4); i++) + pcs_clear_status(adev, + xgmi3x16_pcs_err_status_reg_v6_4[i]); + break; + default: + break; + } } static int amdgpu_xgmi_query_pcs_error_status(struct amdgpu_device *adev, @@ -975,7 +995,9 @@ static int amdgpu_xgmi_query_pcs_error_status(struct amdgpu_device *adev, if (is_xgmi_pcs) { if (amdgpu_ip_version(adev, XGMI_HWIP, 0) == - IP_VERSION(6, 1, 0)) { + IP_VERSION(6, 1, 0) || + amdgpu_ip_version(adev, XGMI_HWIP, 0) == + IP_VERSION(6, 4, 0)) { pcs_ras_fields = _pcs_ras_fields[0]; field_array_size = ARRAY_SIZE(xgmi3x16_pcs_ras_fields); } else { @@ -1013,7 +1035,7 @@ static void amdgpu_xgmi_query_ras_error_count(struct amdgpu_device *adev, void *ras_error_status) { struct ras_err_data *err_data = (struct ras_err_data *)ras_error_status; - int i; + int i, supported = 1; uint32_t data, mask_data = 0; uint32_t ue_cnt = 0, ce_cnt = 0; @@ -1077,7 +1099,25 @@ static void amdgpu_xgmi_query_ras_error_count(struct amdgpu_device *adev, } break; default: - dev_warn(adev->dev, "XGMI RAS error query not supported"); + supported = 0; + break; + } + + switch (amdgpu_ip_version(adev, XGMI_HWIP, 0)) { + case IP_VERSION(6, 4, 0): + /* check xgmi3x16 pcs error */ + for (i = 0; i < ARRAY_SIZE(xgmi3x16_pcs_err_status_reg_v6_4); i++) { + data = RREG32_PCIE(xgmi3x16_pcs_err_status_reg_v6_4[i]); + mask_data = + RREG32_PCIE(xgmi3x16_pcs_err_noncorrectable_mask_reg_v6_4[i]); + if (data) + amdgpu_xgmi_query_pcs_error_status(adev, data, + mask_data, _cnt, _cnt, true, true); + } + break; + default: + if (!supported) + dev_warn(adev->dev, "XGMI RAS error query not supported"); break; } -- 2.35.1
Re: [PATCH] drm/amdgpu: add unmap latency when gfx11 set kiq resources
[Public] Acked-by: Alex Deucher From: Tong Liu01 Sent: Thursday, October 26, 2023 11:41 PM To: amd-gfx@lists.freedesktop.org Cc: Evan Quan ; Chen, Horace ; Tuikov, Luben ; Koenig, Christian ; Deucher, Alexander ; Xiao, Jack ; Zhang, Hawking ; Liu, Monk ; Xu, Feifei ; Chang, HaiJun ; Liu01, Tong (Esther) Subject: [PATCH] drm/amdgpu: add unmap latency when gfx11 set kiq resources [why] If driver does not set unmap latency for KIQ, the default value of KIQ unmap latency is zero. When do unmap queue, KIQ will return that almost immediately after receiving unmap command. So, the queue status will be saved to MQD incorrectly or lost in some chance. [how] Set unmap latency when do kiq set resources. The unmap latency is set to be 1 second that is synchronized with Windows driver. Signed-off-by: Tong Liu01 --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index fd22943685f7..7aef7a3a340f 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c @@ -155,6 +155,7 @@ static void gfx11_kiq_set_resources(struct amdgpu_ring *kiq_ring, uint64_t queue { amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_SET_RESOURCES, 6)); amdgpu_ring_write(kiq_ring, PACKET3_SET_RESOURCES_VMID_MASK(0) | + PACKET3_SET_RESOURCES_UNMAP_LATENTY(0xa) | /* unmap_latency: 0xa (~ 1s) */ PACKET3_SET_RESOURCES_QUEUE_TYPE(0)); /* vmid_mask:0 queue_type:0 (KIQ) */ amdgpu_ring_write(kiq_ring, lower_32_bits(queue_mask)); /* queue mask lo */ amdgpu_ring_write(kiq_ring, upper_32_bits(queue_mask)); /* queue mask hi */ -- 2.34.1
[PATCH 2/2] drm/amdgpu: add RAS reset/query operations for XGMI v6_4
Reset/query RAS error status and count. v2: use XGMI IP version instead of WAFL version. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 46 ++-- 1 file changed, 43 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c index 2b7dc490ba6b..0533f873001b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c @@ -103,6 +103,16 @@ static const int walf_pcs_err_noncorrectable_mask_reg_aldebaran[] = { smnPCS_GOPX1_PCS_ERROR_NONCORRECTABLE_MASK + 0x10 }; +static const int xgmi3x16_pcs_err_status_reg_v6_4[] = { + smnPCS_XGMI3X16_PCS_ERROR_STATUS, + smnPCS_XGMI3X16_PCS_ERROR_STATUS + 0x10 +}; + +static const int xgmi3x16_pcs_err_noncorrectable_mask_reg_v6_4[] = { + smnPCS_XGMI3X16_PCS_ERROR_NONCORRECTABLE_MASK, + smnPCS_XGMI3X16_PCS_ERROR_NONCORRECTABLE_MASK + 0x10 +}; + static const struct amdgpu_pcs_ras_field xgmi_pcs_ras_fields[] = { {"XGMI PCS DataLossErr", SOC15_REG_FIELD(XGMI0_PCS_GOPX16_PCS_ERROR_STATUS, DataLossErr)}, @@ -958,6 +968,16 @@ static void amdgpu_xgmi_reset_ras_error_count(struct amdgpu_device *adev) default: break; } + + switch (amdgpu_ip_version(adev, XGMI_HWIP, 0)) { + case IP_VERSION(6, 4, 0): + for (i = 0; i < ARRAY_SIZE(xgmi3x16_pcs_err_status_reg_v6_4); i++) + pcs_clear_status(adev, + xgmi3x16_pcs_err_status_reg_v6_4[i]); + break; + default: + break; + } } static int amdgpu_xgmi_query_pcs_error_status(struct amdgpu_device *adev, @@ -975,7 +995,9 @@ static int amdgpu_xgmi_query_pcs_error_status(struct amdgpu_device *adev, if (is_xgmi_pcs) { if (amdgpu_ip_version(adev, XGMI_HWIP, 0) == - IP_VERSION(6, 1, 0)) { + IP_VERSION(6, 1, 0) || + amdgpu_ip_version(adev, XGMI_HWIP, 0) == + IP_VERSION(6, 4, 0)) { pcs_ras_fields = _pcs_ras_fields[0]; field_array_size = ARRAY_SIZE(xgmi3x16_pcs_ras_fields); } else { @@ -1013,7 +1035,7 @@ static void amdgpu_xgmi_query_ras_error_count(struct amdgpu_device *adev, void *ras_error_status) { struct ras_err_data *err_data = (struct ras_err_data *)ras_error_status; - int i; + int i, supported = 1; uint32_t data, mask_data = 0; uint32_t ue_cnt = 0, ce_cnt = 0; @@ -1077,7 +1099,25 @@ static void amdgpu_xgmi_query_ras_error_count(struct amdgpu_device *adev, } break; default: - dev_warn(adev->dev, "XGMI RAS error query not supported"); + supported = 0; + break; + } + + switch (amdgpu_ip_version(adev, XGMI_HWIP, 0)) { + case IP_VERSION(6, 4, 0): + /* check xgmi3x16 pcs error */ + for (i = 0; i < ARRAY_SIZE(xgmi3x16_pcs_err_status_reg_v6_4); i++) { + data = RREG32_PCIE(xgmi3x16_pcs_err_status_reg_v6_4[i]); + mask_data = + RREG32_PCIE(xgmi3x16_pcs_err_noncorrectable_mask_reg_v6_4[i]); + if (data) + amdgpu_xgmi_query_pcs_error_status(adev, data, + mask_data, _cnt, _cnt, true, true); + } + break; + default: + if (!supported) + dev_warn(adev->dev, "XGMI RAS error query not supported"); break; } -- 2.35.1
[PATCH 1/2] drm/amdgpu: set XGMI IP version manually for v6_4
The version can't be queried from discovery table. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index 0b711bac2092..d22f22d706e0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -2464,6 +2464,9 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev) if (amdgpu_ip_version(adev, XGMI_HWIP, 0) == IP_VERSION(4, 8, 0)) adev->gmc.xgmi.supported = true; + if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3)) + adev->ip_versions[XGMI_HWIP][0] = IP_VERSION(6, 4, 0); + /* set NBIO version */ switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) { case IP_VERSION(6, 1, 0): -- 2.35.1
RE: [PATCH] drm/amdgpu: Drop deferred error in uncorrectable error check
[AMD Official Use Only - General] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Candice Li Sent: Friday, October 27, 2023 18:30 To: amd-gfx@lists.freedesktop.org Cc: Li, Candice Subject: [PATCH] drm/amdgpu: Drop deferred error in uncorrectable error check Drop checking deferred error which can be handled by poison consumption. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c index 743d2f68b09020..770b4b4e313838 100644 --- a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c @@ -91,8 +91,7 @@ static void umc_v12_0_reset_error_count(struct amdgpu_device *adev) static bool umc_v12_0_is_uncorrectable_error(uint64_t mc_umc_status) { return ((REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1) && - (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Deferred) == 1 || - REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, PCC) == 1 || + (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, PCC) == 1 +|| REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, UC) == 1 || REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, TCC) == 1)); } -- 2.25.1
[PATCH] drm/amdgpu: Drop deferred error in uncorrectable error check
Drop checking deferred error which can be handled by poison consumption. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c index 743d2f68b09020..770b4b4e313838 100644 --- a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c @@ -91,8 +91,7 @@ static void umc_v12_0_reset_error_count(struct amdgpu_device *adev) static bool umc_v12_0_is_uncorrectable_error(uint64_t mc_umc_status) { return ((REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1) && - (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Deferred) == 1 || - REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, PCC) == 1 || + (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, PCC) == 1 || REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, UC) == 1 || REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, TCC) == 1)); } -- 2.25.1
[PATCH] drm/amdgpu: fix check order ras->in_recovery is earlier than ras feature
Checking ras->in_recovery is earlier than ras feature that causes the below null pointer issue. So update the check order to fix it. BUG: kernel NULL pointer dereference, address: 00e8 RIP: 0010:amdgpu_ras_reset_error_count+0xf6/0x190 [amdgpu] Call Trace: ? show_regs+0x72/0x90 ? __die+0x25/0x80 ? page_fault_oops+0x79/0x190 ? do_user_addr_fault+0x30c/0x640 ? __wake_up_klogd.part.0+0x40/0x70 ? exc_page_fault+0x81/0x1b0 ? asm_exc_page_fault+0x27/0x30 ? amdgpu_ras_reset_error_count+0xf6/0x190 [amdgpu] ? __pfx_gmc_v9_0_late_init+0x10/0x10 [amdgpu] gmc_v9_0_late_init+0x97/0xe0 [amdgpu] Fixes: be5c7eb10406 ("drm/amdgpu: bypass RAS error reset in some conditions") Signed-off-by: Bob Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 303fbb6a48b6..3af50754800d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -1229,15 +1229,15 @@ int amdgpu_ras_reset_error_count(struct amdgpu_device *adev, return -EOPNOTSUPP; } + if (!amdgpu_ras_is_supported(adev, block) || + !amdgpu_ras_get_mca_debug_mode(adev)) + return -EOPNOTSUPP; + /* skip ras error reset in gpu reset */ if ((amdgpu_in_reset(adev) || atomic_read(>in_recovery)) && mca_funcs && mca_funcs->mca_set_debug_mode) return -EOPNOTSUPP; - if (!amdgpu_ras_is_supported(adev, block) || - !amdgpu_ras_get_mca_debug_mode(adev)) - return -EOPNOTSUPP; - if (block_obj->hw_ops->reset_ras_error_count) block_obj->hw_ops->reset_ras_error_count(adev); -- 2.34.1
Re: [PATCH 3/5] drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v3)
On 10/26/2023 2:22 AM, Victor Lu wrote: amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0. Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC and amdgpu_device_xcc_wreg/rreg to to use the new xcc_id parameter. v3: use W/RREG32_XCC to handle non-kiq case v2: define amdgpu_device_xcc_wreg/rreg instead of changing parameters of amdgpu_device_wreg/rreg Signed-off-by: Victor Lu --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 13 ++- .../drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c | 2 +- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 84 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 8 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 4 +- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 8 +- 8 files changed, 107 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index a2e8c2b60857..09989ebb5da3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1168,11 +1168,18 @@ uint32_t amdgpu_device_rreg(struct amdgpu_device *adev, uint32_t reg, uint32_t acc_flags); u32 amdgpu_device_indirect_rreg_ext(struct amdgpu_device *adev, u64 reg_addr); +uint32_t amdgpu_device_xcc_rreg(struct amdgpu_device *adev, + uint32_t reg, uint32_t acc_flags, + uint32_t xcc_id); void amdgpu_device_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v, uint32_t acc_flags); void amdgpu_device_indirect_wreg_ext(struct amdgpu_device *adev, u64 reg_addr, u32 reg_data); +void amdgpu_device_xcc_wreg(struct amdgpu_device *adev, + uint32_t reg, uint32_t v, + uint32_t acc_flags, + uint32_t xcc_id); void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev, uint32_t reg, uint32_t v, uint32_t xcc_id); void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t value); @@ -1213,8 +1220,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev); #define RREG32_NO_KIQ(reg) amdgpu_device_rreg(adev, (reg), AMDGPU_REGS_NO_KIQ) #define WREG32_NO_KIQ(reg, v) amdgpu_device_wreg(adev, (reg), (v), AMDGPU_REGS_NO_KIQ) -#define RREG32_KIQ(reg) amdgpu_kiq_rreg(adev, (reg)) -#define WREG32_KIQ(reg, v) amdgpu_kiq_wreg(adev, (reg), (v)) +#define RREG32_KIQ(reg) amdgpu_kiq_rreg(adev, (reg), 0) +#define WREG32_KIQ(reg, v) amdgpu_kiq_wreg(adev, (reg), (v), 0) #define RREG8(reg) amdgpu_mm_rreg8(adev, (reg)) #define WREG8(reg, v) amdgpu_mm_wreg8(adev, (reg), (v)) @@ -1224,6 +1231,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev); #define WREG32(reg, v) amdgpu_device_wreg(adev, (reg), (v), 0) #define REG_SET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK) #define REG_GET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK) +#define RREG32_XCC(reg, flag, inst) amdgpu_device_xcc_rreg(adev, (reg), flag, inst) +#define WREG32_XCC(reg, v, flag, inst) amdgpu_device_xcc_wreg(adev, (reg), (v), flag, inst) #define RREG32_PCIE(reg) adev->pcie_rreg(adev, (reg)) #define WREG32_PCIE(reg, v) adev->pcie_wreg(adev, (reg), (v)) #define RREG32_PCIE_PORT(reg) adev->pciep_rreg(adev, (reg)) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c index 490c8f5ddb60..c94df54e2657 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c @@ -300,7 +300,7 @@ static int kgd_gfx_v9_4_3_hqd_load(struct amdgpu_device *adev, void *mqd, hqd_end = SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_AQL_DISPATCH_ID_HI); for (reg = hqd_base; reg <= hqd_end; reg++) - WREG32_RLC(reg, mqd_hqd[reg - hqd_base]); + WREG32_XCC(reg, mqd_hqd[reg - hqd_base], AMDGPU_REGS_RLC, inst); /* Activate doorbell logic before triggering WPTR poll. */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c index 51011e8ee90d..47c8c334c779 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c @@ -239,7 +239,7 @@ int kgd_gfx_v9_hqd_load(struct amdgpu_device *adev, void *mqd, for (reg = hqd_base; reg <= SOC15_REG_OFFSET(GC, GET_INST(GC, inst), mmCP_HQD_PQ_WPTR_HI); reg++) - WREG32_RLC(reg, mqd_hqd[reg - hqd_base]); + WREG32_XCC(reg, mqd_hqd[reg - hqd_base], AMDGPU_REGS_RLC, inst); /* Activate doorbell logic before triggering WPTR poll. */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
Re: [PATCH 2/5] drm/amdgpu: Add xcc instance parameter to *REG32_SOC15_IP_NO_KIQ (v2)
On 10/26/2023 2:22 AM, Victor Lu wrote: The WREG32/RREG32_SOC15_IP_NO_KIQ call is using XCC0's RLCG interface when programming other XCCs. Add xcc instance parameter to them. v2: rebase Signed-off-by: Victor Lu --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 16 drivers/gpu/drm/amd/amdgpu/soc15_common.h | 6 +++--- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index fee3141bb607..b7b1b04b66cb 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -856,9 +856,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, for (j = 0; j < adev->usec_timeout; j++) { /* a read return value of 1 means semaphore acquire */ if (vmhub >= AMDGPU_MMHUB0(0)) - tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, sem); + tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, sem, vmhub - AMDGPU_MMHUB(0)); The parameter expects xcc id, but this doesn't return that. Only a max of 4 MMHUBs are there. To get corresponding xcc, it needs more calculation. If xcc doesn't matter for MMHUB registers, use the first xcc. Thanks, Lijo else - tmp = RREG32_SOC15_IP_NO_KIQ(GC, sem); + tmp = RREG32_SOC15_IP_NO_KIQ(GC, sem, vmhub); if (tmp & 0x1) break; udelay(1); @@ -869,9 +869,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, } if (vmhub >= AMDGPU_MMHUB0(0)) - WREG32_SOC15_IP_NO_KIQ(MMHUB, req, inv_req); + WREG32_SOC15_IP_NO_KIQ(MMHUB, req, inv_req, vmhub - AMDGPU_MMHUB(0)); else - WREG32_SOC15_IP_NO_KIQ(GC, req, inv_req); + WREG32_SOC15_IP_NO_KIQ(GC, req, inv_req, vmhub); /* * Issue a dummy read to wait for the ACK register to @@ -884,9 +884,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, for (j = 0; j < adev->usec_timeout; j++) { if (vmhub >= AMDGPU_MMHUB0(0)) - tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, ack); + tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, ack, vmhub - AMDGPU_MMHUB(0)); else - tmp = RREG32_SOC15_IP_NO_KIQ(GC, ack); + tmp = RREG32_SOC15_IP_NO_KIQ(GC, ack, vmhub); if (tmp & (1 << vmid)) break; udelay(1); @@ -899,9 +899,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, * write with 0 means semaphore release */ if (vmhub >= AMDGPU_MMHUB0(0)) - WREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0); + WREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0, vmhub - AMDGPU_MMHUB(0)); else - WREG32_SOC15_IP_NO_KIQ(GC, sem, 0); + WREG32_SOC15_IP_NO_KIQ(GC, sem, 0, vmhub); } spin_unlock(>gmc.invalidate_lock); diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h b/drivers/gpu/drm/amd/amdgpu/soc15_common.h index da683afa0222..c75e9cd5c98b 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h +++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h @@ -69,7 +69,7 @@ #define RREG32_SOC15_IP(ip, reg) __RREG32_SOC15_RLC__(reg, 0, ip##_HWIP, 0) -#define RREG32_SOC15_IP_NO_KIQ(ip, reg) __RREG32_SOC15_RLC__(reg, AMDGPU_REGS_NO_KIQ, ip##_HWIP, 0) +#define RREG32_SOC15_IP_NO_KIQ(ip, reg, inst) __RREG32_SOC15_RLC__(reg, AMDGPU_REGS_NO_KIQ, ip##_HWIP, inst) #define RREG32_SOC15_NO_KIQ(ip, inst, reg) \ __RREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg, \ @@ -86,8 +86,8 @@ #define WREG32_SOC15_IP(ip, reg, value) \ __WREG32_SOC15_RLC__(reg, value, 0, ip##_HWIP, 0) -#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value) \ -__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP, 0) +#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value, inst) \ +__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP, inst) #define WREG32_SOC15_NO_KIQ(ip, inst, reg, value) \ __WREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg, \
[PATCH] drm/amdgpu set doorbell range when gpu recovery in sriov environment
GFX doorbell range should be set after flr otherwise the GFX doorbell range will overlap with MEC. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index d9ccacd06fba..9074ced34277 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -6467,6 +6467,12 @@ static int gfx_v10_0_gfx_init_queue(struct amdgpu_ring *ring) if (adev->gfx.me.mqd_backup[mqd_idx]) memcpy(adev->gfx.me.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); } else { + if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev)) { + mutex_lock(>srbm_mutex); + if (ring->doorbell_index == adev->doorbell_index.gfx_ring0 << 1) + gfx_v10_0_cp_gfx_set_doorbell(adev, ring); + mutex_unlock(>srbm_mutex); + } /* restore mqd with the backup copy */ if (adev->gfx.me.mqd_backup[mqd_idx]) memcpy(mqd, adev->gfx.me.mqd_backup[mqd_idx], sizeof(*mqd)); -- 2.25.1
Re: [PATCH] drm/amdgpu/gfx10,11: use memcpy_to/fromio for MQDs
Am 26.10.23 um 20:56 schrieb Alex Deucher: Since they were moved to VRAM, we need to use the IO variants of memcpy. Fixes: 1cfb4d612127 ("drm/amdgpu: put MQDs in VRAM") Signed-off-by: Alex Deucher Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 12 ++-- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 12 ++-- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 9032d7a24d7c..306252cd67fd 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -6457,11 +6457,11 @@ static int gfx_v10_0_gfx_init_queue(struct amdgpu_ring *ring) nv_grbm_select(adev, 0, 0, 0, 0); mutex_unlock(>srbm_mutex); if (adev->gfx.me.mqd_backup[mqd_idx]) - memcpy(adev->gfx.me.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); + memcpy_fromio(adev->gfx.me.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); } else { /* restore mqd with the backup copy */ if (adev->gfx.me.mqd_backup[mqd_idx]) - memcpy(mqd, adev->gfx.me.mqd_backup[mqd_idx], sizeof(*mqd)); + memcpy_toio(mqd, adev->gfx.me.mqd_backup[mqd_idx], sizeof(*mqd)); /* reset the ring */ ring->wptr = 0; *ring->wptr_cpu_addr = 0; @@ -6735,7 +6735,7 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring *ring) if (amdgpu_in_reset(adev)) { /* for GPU_RESET case */ /* reset MQD to a clean status */ if (adev->gfx.kiq[0].mqd_backup) - memcpy(mqd, adev->gfx.kiq[0].mqd_backup, sizeof(*mqd)); + memcpy_toio(mqd, adev->gfx.kiq[0].mqd_backup, sizeof(*mqd)); /* reset ring buffer */ ring->wptr = 0; @@ -6758,7 +6758,7 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring *ring) mutex_unlock(>srbm_mutex); if (adev->gfx.kiq[0].mqd_backup) - memcpy(adev->gfx.kiq[0].mqd_backup, mqd, sizeof(*mqd)); + memcpy_fromio(adev->gfx.kiq[0].mqd_backup, mqd, sizeof(*mqd)); } return 0; @@ -6779,11 +6779,11 @@ static int gfx_v10_0_kcq_init_queue(struct amdgpu_ring *ring) mutex_unlock(>srbm_mutex); if (adev->gfx.mec.mqd_backup[mqd_idx]) - memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); + memcpy_fromio(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); } else { /* restore MQD to a clean status */ if (adev->gfx.mec.mqd_backup[mqd_idx]) - memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(*mqd)); + memcpy_toio(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(*mqd)); /* reset ring buffer */ ring->wptr = 0; atomic64_set((atomic64_t *)ring->wptr_cpu_addr, 0); diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index 762d7a19f1be..43d066bc5245 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c @@ -3684,11 +3684,11 @@ static int gfx_v11_0_gfx_init_queue(struct amdgpu_ring *ring) soc21_grbm_select(adev, 0, 0, 0, 0); mutex_unlock(>srbm_mutex); if (adev->gfx.me.mqd_backup[mqd_idx]) - memcpy(adev->gfx.me.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); + memcpy_fromio(adev->gfx.me.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); } else { /* restore mqd with the backup copy */ if (adev->gfx.me.mqd_backup[mqd_idx]) - memcpy(mqd, adev->gfx.me.mqd_backup[mqd_idx], sizeof(*mqd)); + memcpy_toio(mqd, adev->gfx.me.mqd_backup[mqd_idx], sizeof(*mqd)); /* reset the ring */ ring->wptr = 0; *ring->wptr_cpu_addr = 0; @@ -3977,7 +3977,7 @@ static int gfx_v11_0_kiq_init_queue(struct amdgpu_ring *ring) if (amdgpu_in_reset(adev)) { /* for GPU_RESET case */ /* reset MQD to a clean status */ if (adev->gfx.kiq[0].mqd_backup) - memcpy(mqd, adev->gfx.kiq[0].mqd_backup, sizeof(*mqd)); + memcpy_toio(mqd, adev->gfx.kiq[0].mqd_backup, sizeof(*mqd)); /* reset ring buffer */ ring->wptr = 0; @@ -4000,7 +4000,7 @@ static int gfx_v11_0_kiq_init_queue(struct amdgpu_ring *ring) mutex_unlock(>srbm_mutex); if (adev->gfx.kiq[0].mqd_backup) - memcpy(adev->gfx.kiq[0].mqd_backup, mqd, sizeof(*mqd)); + memcpy_fromio(adev->gfx.kiq[0].mqd_backup, mqd, sizeof(*mqd)); } return 0;
RE: [PATCH] drm/amdgpu: use mode-2 reset for RAS poison consumption
[AMD Official Use Only - General] Reviewed-by: Stanley.Yang Regards, Stanley > -Original Message- > From: amd-gfx On Behalf Of Tao > Zhou > Sent: Friday, October 27, 2023 12:04 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhou1, Tao > Subject: [PATCH] drm/amdgpu: use mode-2 reset for RAS poison > consumption > > Switch from mode-1 reset to mode-2 for poison consumption. > > Signed-off-by: Tao Zhou > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c > index f74347cc087a..d65e21914d8c 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c > @@ -166,8 +166,12 @@ static int amdgpu_umc_do_page_retirement(struct > amdgpu_device *adev, > } > } > > - if (reset) > + if (reset) { > + /* use mode-2 reset for poison consumption */ > + if (!entry) > + con->gpu_reset_flags |= > AMDGPU_RAS_GPU_RESET_MODE2_RESET; > amdgpu_ras_reset_gpu(adev); > + } > } > > kfree(err_data->err_addr); > -- > 2.35.1
Re: [PATCH] MAINTAINERS: Update the GPU Scheduler email
Am 26.10.23 um 21:32 schrieb Alex Deucher: On Thu, Oct 26, 2023 at 1:45 PM Luben Tuikov wrote: Update the GPU Scheduler maintainer email. Cc: Alex Deucher Cc: Christian König Cc: Daniel Vetter Cc: Dave Airlie Cc: AMD Graphics Cc: Direct Rendering Infrastructure - Development Signed-off-by: Luben Tuikov Acked-by: Alex Deucher Acked-by: Christian König --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 4452508bc1b040..f13e476ed8038b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7153,7 +7153,7 @@ F:Documentation/devicetree/bindings/display/xlnx/ F: drivers/gpu/drm/xlnx/ DRM GPU SCHEDULER -M: Luben Tuikov +M: Luben Tuikov L: dri-de...@lists.freedesktop.org S: Maintained T: git git://anongit.freedesktop.org/drm/drm-misc base-commit: 56e449603f0ac580700621a356d35d5716a62ce5 -- 2.42.0
Re: [PATCH 1/2] drm/amdgpu: Remove duplicate fdinfo fields
Am 27.10.23 um 00:05 schrieb Umio Yasuno: From: Rob Clark Some of the fields that are handled by drm_show_fdinfo() crept back in when rebasing the patch. Remove them again. Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper") Signed-off-by: Rob Clark Reviewed-by: Co-developed-by: Umio Yasuno Signed-off-by: Umio Yasuno Reviewed-by: Christian König for the series. --- This thread has been inactive for nearly 5 months, so I re-created the patch. drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c index 6038b5021..e9b5d1903 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c @@ -87,9 +87,6 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file) */ drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid); - drm_printf(p, "drm-driver:\t%s\n", file->minor->dev->driver->name); - drm_printf(p, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn); - drm_printf(p, "drm-client-id:\t%llu\n", vm->immediate.fence_context); drm_printf(p, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL); drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL); drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL);