[RFC PATCH] drm/amdkfd: Run restore_workers on freezable WQs

2023-10-27 Thread Felix Kuehling
Make restore workers freezable so we don't have to explicitly flush them
in suspend and GPU reset code paths, and we don't accidentally try to
restore BOs while the GPU is suspended. Not having to flush restore_work
also helps avoid lock/fence dependencies in the GPU reset case where we're
not allowed to wait for fences.

This is an RFC and a request for testing. I have not tested this myself yet.
My notes are below:

Restore work won't be frozen during GPU reset. Does it matter? Queues will
stay evicted until resume in any case. But restore_work may be in trouble
if it gets run in the middle of a GPU reset. In that case, if anything
fails, it will just reschedule itself, so should be fine as long as it
doesn't interfere with the reset itself (respects any mechanisms in place
to prevent HW access during the reset).

What HW access does restore_work perform:
- migrating buffers: uses GPU scheduler, should be suspended during reset
- TLB flushes in userptr restore worker: not called directly, relies on
  HWS to flush TLBs on VMID attachment
- TLB flushes in SVM restore worker: Does TLB flush in the mapping code
- Resuming user mode queues: should not happen while GPU reset keeps queue
  eviction counter elevated
Ergo: Except for the SVM case, it's OK to not flush restore work before
GPU resets. I'll need to rethink the interaction of SVM restore_work with
GPU resets.

What about cancelling p->eviction_work? Eviction work would normally be
needed to signal eviction fences, but we're doing that explicitly in
suspend_all_processes. Does eviction_work wait for fences anywhere? Yes,
indirectly by flushing restore_work. So we should not try to cancel it
during a GPU reset.

Problem: accessing p->ef concurrently in evict_process_worker and
suspend_all_processes. Need a spinlock to use and update it safely.
Problem: What if evict_process_worker gets stuck in flushing restore_work?
We can skip all of that if p->ef is NULL (which it is during the reset).
Even if it gets stuck, it's no problem if the reset doesn't depend on it.
It should get unstuck after the reset.

Signed-off-by: Felix Kuehling 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  9 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 49 +--
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  4 +-
 4 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 54f31a420229..89e632257663 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1644,7 +1644,8 @@ int amdgpu_amdkfd_criu_resume(void *p)
goto out_unlock;
}
WRITE_ONCE(pinfo->block_mmu_notifications, false);
-   schedule_delayed_work(&pinfo->restore_userptr_work, 0);
+   queue_delayed_work(system_freezable_wq,
+  &pinfo->restore_userptr_work, 0);
 
 out_unlock:
mutex_unlock(&pinfo->lock);
@@ -2458,7 +2459,8 @@ int amdgpu_amdkfd_evict_userptr(struct 
mmu_interval_notifier *mni,
   KFD_QUEUE_EVICTION_TRIGGER_USERPTR);
if (r)
pr_err("Failed to quiesce KFD\n");
-   schedule_delayed_work(&process_info->restore_userptr_work,
+   queue_delayed_work(system_freezable_wq,
+   &process_info->restore_userptr_work,
msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));
}
mutex_unlock(&process_info->notifier_lock);
@@ -2793,7 +2795,8 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct 
work_struct *work)
 
/* If validation failed, reschedule another attempt */
if (evicted_bos) {
-   schedule_delayed_work(&process_info->restore_userptr_work,
+   queue_delayed_work(system_freezable_wq,
+   &process_info->restore_userptr_work,
msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));
 
kfd_smi_event_queue_restore_rescheduled(mm);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 9cc32f577e38..cf017d027fee 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -919,6 +919,7 @@ struct kfd_process {
 * during restore
 */
struct dma_fence *ef;
+   spinlock_t ef_lock;
 
/* Work items for evicting and restoring BOs */
struct delayed_work eviction_work;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index fbf053001af9..a07cba58ec5e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -664,7 +664,8 @@ int kfd_process_create_wq(void)
if (!kfd_process_wq)
kfd_process_wq = alloc_workqueue("kfd_process_wq", 0, 0);
if (!kfd_restore_wq)
-   

Re: drm/amd: Fix UBSAN array-index-out-of-bounds for Powerplay headers

2023-10-27 Thread Mario Limonciello

On 10/27/2023 15:41, Alex Deucher wrote:

For pptable structs that use flexible array sizes, use flexible arrays.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926
Signed-off-by: Alex Deucher 


Reviewed-by: Mario Limonciello 


---
  .../drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h |  4 ++--
  .../amd/pm/powerplay/hwmgr/vega10_pptable.h   | 24 +--
  2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
index 9fcad69a9f34..2cf2a7b12623 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
@@ -367,7 +367,7 @@ typedef struct _ATOM_Tonga_VCE_State_Record {
  typedef struct _ATOM_Tonga_VCE_State_Table {
UCHAR ucRevId;
UCHAR ucNumEntries;
-   ATOM_Tonga_VCE_State_Record entries[1];
+   ATOM_Tonga_VCE_State_Record entries[];
  } ATOM_Tonga_VCE_State_Table;
  
  typedef struct _ATOM_Tonga_PowerTune_Table {

@@ -481,7 +481,7 @@ typedef struct _ATOM_Tonga_Hard_Limit_Record {
  typedef struct _ATOM_Tonga_Hard_Limit_Table {
UCHAR ucRevId;
UCHAR ucNumEntries;
-   ATOM_Tonga_Hard_Limit_Record entries[1];
+   ATOM_Tonga_Hard_Limit_Record entries[];
  } ATOM_Tonga_Hard_Limit_Table;
  
  typedef struct _ATOM_Tonga_GPIO_Table {

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
index 8b0590b834cc..de2926df5ed7 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
@@ -129,7 +129,7 @@ typedef struct _ATOM_Vega10_State {
  typedef struct _ATOM_Vega10_State_Array {
UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries. */
-   ATOM_Vega10_State states[1]; /* Dynamically 
allocate entries. */
+   ATOM_Vega10_State states[]; /* Dynamically 
allocate entries. */
  } ATOM_Vega10_State_Array;
  
  typedef struct _ATOM_Vega10_CLK_Dependency_Record {

@@ -169,37 +169,37 @@ typedef struct _ATOM_Vega10_GFXCLK_Dependency_Table {
  typedef struct _ATOM_Vega10_MCLK_Dependency_Table {
  UCHAR ucRevId;
  UCHAR ucNumEntries; /* Number of 
entries. */
-ATOM_Vega10_MCLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+ATOM_Vega10_MCLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
  } ATOM_Vega10_MCLK_Dependency_Table;
  
  typedef struct _ATOM_Vega10_SOCCLK_Dependency_Table {

  UCHAR ucRevId;
  UCHAR ucNumEntries; /* Number of 
entries. */
-ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
  } ATOM_Vega10_SOCCLK_Dependency_Table;
  
  typedef struct _ATOM_Vega10_DCEFCLK_Dependency_Table {

  UCHAR ucRevId;
  UCHAR ucNumEntries; /* Number of 
entries. */
-ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
  } ATOM_Vega10_DCEFCLK_Dependency_Table;
  
  typedef struct _ATOM_Vega10_PIXCLK_Dependency_Table {

UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries. */
-   ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+   ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
  } ATOM_Vega10_PIXCLK_Dependency_Table;
  
  typedef struct _ATOM_Vega10_DISPCLK_Dependency_Table {

UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries.*/
-   ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+   ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
  } ATOM_Vega10_DISPCLK_Dependency_Table;
  
  typedef struct _ATOM_Vega10_PHYCLK_Dependency_Table {

UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries. */
-   ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+   ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
  } ATOM_Vega10_PHYCLK_Dependency_Table;
  
  typedef struct _ATOM_Vega10_MM_Dependency_Record {

@@ -213,7 +213,7 @@ typedef struct _ATOM_Vega10_MM_Dependency_Record {
  typedef struct _ATOM_Vega10_MM_Dependency_Table {
UCHAR ucRevId;
UCHAR 

[PATCH] drm/amd: Fix UBSAN array-index-out-of-bounds for Powerplay headers

2023-10-27 Thread Alex Deucher
For pptable structs that use flexible array sizes, use flexible arrays.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926
Signed-off-by: Alex Deucher 
---
 .../drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h |  4 ++--
 .../amd/pm/powerplay/hwmgr/vega10_pptable.h   | 24 +--
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
index 9fcad69a9f34..2cf2a7b12623 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h
@@ -367,7 +367,7 @@ typedef struct _ATOM_Tonga_VCE_State_Record {
 typedef struct _ATOM_Tonga_VCE_State_Table {
UCHAR ucRevId;
UCHAR ucNumEntries;
-   ATOM_Tonga_VCE_State_Record entries[1];
+   ATOM_Tonga_VCE_State_Record entries[];
 } ATOM_Tonga_VCE_State_Table;
 
 typedef struct _ATOM_Tonga_PowerTune_Table {
@@ -481,7 +481,7 @@ typedef struct _ATOM_Tonga_Hard_Limit_Record {
 typedef struct _ATOM_Tonga_Hard_Limit_Table {
UCHAR ucRevId;
UCHAR ucNumEntries;
-   ATOM_Tonga_Hard_Limit_Record entries[1];
+   ATOM_Tonga_Hard_Limit_Record entries[];
 } ATOM_Tonga_Hard_Limit_Table;
 
 typedef struct _ATOM_Tonga_GPIO_Table {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
index 8b0590b834cc..de2926df5ed7 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
@@ -129,7 +129,7 @@ typedef struct _ATOM_Vega10_State {
 typedef struct _ATOM_Vega10_State_Array {
UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries. */
-   ATOM_Vega10_State states[1]; /* Dynamically 
allocate entries. */
+   ATOM_Vega10_State states[]; /* Dynamically 
allocate entries. */
 } ATOM_Vega10_State_Array;
 
 typedef struct _ATOM_Vega10_CLK_Dependency_Record {
@@ -169,37 +169,37 @@ typedef struct _ATOM_Vega10_GFXCLK_Dependency_Table {
 typedef struct _ATOM_Vega10_MCLK_Dependency_Table {
 UCHAR ucRevId;
 UCHAR ucNumEntries; /* Number of 
entries. */
-ATOM_Vega10_MCLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+ATOM_Vega10_MCLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
 } ATOM_Vega10_MCLK_Dependency_Table;
 
 typedef struct _ATOM_Vega10_SOCCLK_Dependency_Table {
 UCHAR ucRevId;
 UCHAR ucNumEntries; /* Number of 
entries. */
-ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
 } ATOM_Vega10_SOCCLK_Dependency_Table;
 
 typedef struct _ATOM_Vega10_DCEFCLK_Dependency_Table {
 UCHAR ucRevId;
 UCHAR ucNumEntries; /* Number of 
entries. */
-ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
 } ATOM_Vega10_DCEFCLK_Dependency_Table;
 
 typedef struct _ATOM_Vega10_PIXCLK_Dependency_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries. */
-   ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+   ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
 } ATOM_Vega10_PIXCLK_Dependency_Table;
 
 typedef struct _ATOM_Vega10_DISPCLK_Dependency_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries.*/
-   ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+   ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
 } ATOM_Vega10_DISPCLK_Dependency_Table;
 
 typedef struct _ATOM_Vega10_PHYCLK_Dependency_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries. */
-   ATOM_Vega10_CLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+   ATOM_Vega10_CLK_Dependency_Record entries[];/* Dynamically 
allocate entries. */
 } ATOM_Vega10_PHYCLK_Dependency_Table;
 
 typedef struct _ATOM_Vega10_MM_Dependency_Record {
@@ -213,7 +213,7 @@ typedef struct _ATOM_Vega10_MM_Dependency_Record {
 typedef struct _ATOM_Vega10_MM_Dependency_Table {
UCHAR ucRevId;
UCHAR ucNumEntries; /* Number 
of entries */
-   ATOM_Vega10_MM_Dependency_Record entries[1];   

[PATCH] drm/amdgpu: Change WREG32_RLC to WREG32_SOC15_RLC where inst != 0 (v2)

2023-10-27 Thread Victor Lu
W/RREG32_RLC is hardcoded to use instance 0. W/RREG32_SOC15_RLC
should be used instead when inst != 0.

v2: rebase

Signed-off-by: Victor Lu 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c   | 38 --
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 40 +--
 drivers/gpu/drm/amd/amdgpu/soc15_common.h |  2 +-
 3 files changed, 37 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
index 80309d39737a..f6598b9e4faa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
@@ -306,8 +306,7 @@ static int kgd_gfx_v9_4_3_hqd_load(struct amdgpu_device 
*adev, void *mqd,
/* Activate doorbell logic before triggering WPTR poll. */
data = REG_SET_FIELD(m->cp_hqd_pq_doorbell_control,
 CP_HQD_PQ_DOORBELL_CONTROL, DOORBELL_EN, 1);
-   WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), 
regCP_HQD_PQ_DOORBELL_CONTROL),
-   data);
+   WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_PQ_DOORBELL_CONTROL, 
data);
 
if (wptr) {
/* Don't read wptr with get_user because the user
@@ -336,27 +335,24 @@ static int kgd_gfx_v9_4_3_hqd_load(struct amdgpu_device 
*adev, void *mqd,
guessed_wptr += m->cp_hqd_pq_wptr_lo & ~(queue_size - 1);
guessed_wptr += (uint64_t)m->cp_hqd_pq_wptr_hi << 32;
 
-   WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), 
regCP_HQD_PQ_WPTR_LO),
-  lower_32_bits(guessed_wptr));
-   WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), 
regCP_HQD_PQ_WPTR_HI),
-  upper_32_bits(guessed_wptr));
-   WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), 
regCP_HQD_PQ_WPTR_POLL_ADDR),
-  lower_32_bits((uintptr_t)wptr));
-   WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst),
-   regCP_HQD_PQ_WPTR_POLL_ADDR_HI),
+   WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_PQ_WPTR_LO,
+   lower_32_bits(guessed_wptr));
+   WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_PQ_WPTR_HI,
+   upper_32_bits(guessed_wptr));
+   WREG32_SOC15_RLC(GC, GET_INST(GC, inst), 
regCP_HQD_PQ_WPTR_POLL_ADDR,
+   lower_32_bits((uintptr_t)wptr));
+   WREG32_SOC15_RLC(GC, GET_INST(GC, inst), 
regCP_HQD_PQ_WPTR_POLL_ADDR_HI,
upper_32_bits((uintptr_t)wptr));
-   WREG32(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), 
regCP_PQ_WPTR_POLL_CNTL1),
-  (uint32_t)kgd_gfx_v9_get_queue_mask(adev, pipe_id,
-  queue_id));
+   WREG32_SOC15_RLC(GC, GET_INST(GC, inst), 
regCP_PQ_WPTR_POLL_CNTL1,
+   (uint32_t)kgd_gfx_v9_get_queue_mask(adev, pipe_id, 
queue_id));
}
 
/* Start the EOP fetcher */
-   WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_EOP_RPTR),
-  REG_SET_FIELD(m->cp_hqd_eop_rptr,
-CP_HQD_EOP_RPTR, INIT_FETCHER, 1));
+   WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_EOP_RPTR,
+  REG_SET_FIELD(m->cp_hqd_eop_rptr, CP_HQD_EOP_RPTR, INIT_FETCHER, 
1));
 
data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
-   WREG32_RLC(SOC15_REG_OFFSET(GC, GET_INST(GC, inst), regCP_HQD_ACTIVE), 
data);
+   WREG32_SOC15_RLC(GC, GET_INST(GC, inst), regCP_HQD_ACTIVE, data);
 
kgd_gfx_v9_release_queue(adev, inst);
 
@@ -494,15 +490,15 @@ static uint32_t kgd_gfx_v9_4_3_set_address_watch(
VALID,
1);
 
-   WREG32_RLC((SOC15_REG_OFFSET(GC, GET_INST(GC, inst),
+   WREG32_XCC((SOC15_REG_OFFSET(GC, GET_INST(GC, inst),
regTCP_WATCH0_ADDR_H) +
(watch_id * TCP_WATCH_STRIDE)),
-   watch_address_high);
+   watch_address_high, inst);
 
-   WREG32_RLC((SOC15_REG_OFFSET(GC, GET_INST(GC, inst),
+   WREG32_XCC((SOC15_REG_OFFSET(GC, GET_INST(GC, inst),
regTCP_WATCH0_ADDR_L) +
(watch_id * TCP_WATCH_STRIDE)),
-   watch_address_low);
+   watch_address_low, inst);
 
return watch_address_cntl;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index 9285789b3a42..00fbc0f44c92 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -91,8 +91,8 @@ void kgd_gfx_v9_program_sh_mem_settings(struct amdgpu_device 
*adev, uint32_t vmi
 {
kgd_gfx_v9_lock_srbm(adev, 0, 0, 0, vmid, inst);
 
-   

[PATCH] drm/amdgpu: Add xcc_inst param to amdgpu_virt_kiq_reg_write_reg_wait (v3)

2023-10-27 Thread Victor Lu
amdgpu_virt_kiq_reg_write_reg_wait is hardcoded to use MEC engine 0.
Add xcc_inst as a parameter to allow it to use different MEC engines.

v3: use first xcc for MMHUB in gmc_v9_0_flush_gpu_tlb

v2: rebase

Signed-off-by: Victor Lu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c |  5 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h |  3 ++-
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 26 ++--
 5 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 7a084fbfd33c..82762c61d3ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -73,9 +73,10 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
 
 void amdgpu_virt_kiq_reg_write_reg_wait(struct amdgpu_device *adev,
uint32_t reg0, uint32_t reg1,
-   uint32_t ref, uint32_t mask)
+   uint32_t ref, uint32_t mask,
+   uint32_t xcc_inst)
 {
-   struct amdgpu_kiq *kiq = &adev->gfx.kiq[0];
+   struct amdgpu_kiq *kiq = &adev->gfx.kiq[xcc_inst];
struct amdgpu_ring *ring = >ring;
signed long r, cnt = 0;
unsigned long flags;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index 03c0e38b8aea..5c64258eb728 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -334,7 +334,8 @@ bool amdgpu_virt_mmio_blocked(struct amdgpu_device *adev);
 void amdgpu_virt_init_setting(struct amdgpu_device *adev);
 void amdgpu_virt_kiq_reg_write_reg_wait(struct amdgpu_device *adev,
uint32_t reg0, uint32_t rreg1,
-   uint32_t ref, uint32_t mask);
+   uint32_t ref, uint32_t mask,
+   uint32_t xcc_id);
 int amdgpu_virt_request_full_gpu(struct amdgpu_device *adev, bool init);
 int amdgpu_virt_release_full_gpu(struct amdgpu_device *adev, bool init);
 int amdgpu_virt_reset_gpu(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index d8a4fddab9c1..173237e99882 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -268,7 +268,7 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
if (adev->gfx.kiq[0].ring.sched.ready && !adev->enable_mes &&
(amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
amdgpu_virt_kiq_reg_write_reg_wait(adev, req, ack, inv_req,
-   1 << vmid);
+   1 << vmid, 0);
return;
}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
index 19eaada35ede..2e4abb356e38 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
@@ -228,7 +228,7 @@ static void gmc_v11_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
if ((adev->gfx.kiq[0].ring.sched.ready || adev->mes.ring.sched.ready) &&
(amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
amdgpu_virt_kiq_reg_write_reg_wait(adev, req, ack, inv_req,
-   1 << vmid);
+   1 << vmid, 0);
return;
}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 0ab9c554da78..32cc3645f02b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -817,7 +817,7 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
uint32_t vmhub, uint32_t flush_type)
 {
bool use_semaphore = gmc_v9_0_use_invalidate_semaphore(adev, vmhub);
-   u32 j, inv_req, tmp, sem, req, ack;
+   u32 j, inv_req, tmp, sem, req, ack, inst;
const unsigned int eng = 17;
struct amdgpu_vmhub *hub;
 
@@ -832,13 +832,17 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
/* This is necessary for a HW workaround under SRIOV as well
 * as GFXOFF under bare metal
 */
-   if (adev->gfx.kiq[0].ring.sched.ready &&
+   if (vmhub >= AMDGPU_MMHUB0(0))
+   inst = 0;
+   else
+   inst = vmhub;
+   if (adev->gfx.kiq[inst].ring.sched.ready &&
(amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
uint32_t req = hub->vm_inv_eng0_req + hub->eng_distance * eng;
uint32_t ack = hub->vm_inv_eng0_ack + hub->eng_distance * eng;
 

[PATCH] drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v4)

2023-10-27 Thread Victor Lu
amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0.

Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC
and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter.

Using amdgpu_sriov_runtime to determine whether to access via kiq or
RLC is sufficient for now.

v4: avoid using amdgpu_sriov_w/rreg

v3: use W/RREG32_XCC to handle non-kiq case

v2: define amdgpu_device_xcc_wreg/rreg instead of changing parameters
of amdgpu_device_wreg/rreg

Signed-off-by: Victor Lu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   | 13 ++-
 .../drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c   |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 91 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   |  8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h  |  4 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c   |  8 +-
 9 files changed, 118 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 43c579f5a95e..e8dc75a3ff44 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1162,11 +1162,18 @@ uint32_t amdgpu_device_rreg(struct amdgpu_device *adev,
uint32_t reg, uint32_t acc_flags);
 u32 amdgpu_device_indirect_rreg_ext(struct amdgpu_device *adev,
u64 reg_addr);
+uint32_t amdgpu_device_xcc_rreg(struct amdgpu_device *adev,
+   uint32_t reg, uint32_t acc_flags,
+   uint32_t xcc_id);
 void amdgpu_device_wreg(struct amdgpu_device *adev,
uint32_t reg, uint32_t v,
uint32_t acc_flags);
 void amdgpu_device_indirect_wreg_ext(struct amdgpu_device *adev,
 u64 reg_addr, u32 reg_data);
+void amdgpu_device_xcc_wreg(struct amdgpu_device *adev,
+   uint32_t reg, uint32_t v,
+   uint32_t acc_flags,
+   uint32_t xcc_id);
 void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
 uint32_t reg, uint32_t v, uint32_t xcc_id);
 void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t 
value);
@@ -1207,8 +1214,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev);
 #define RREG32_NO_KIQ(reg) amdgpu_device_rreg(adev, (reg), AMDGPU_REGS_NO_KIQ)
 #define WREG32_NO_KIQ(reg, v) amdgpu_device_wreg(adev, (reg), (v), 
AMDGPU_REGS_NO_KIQ)

-#define RREG32_KIQ(reg) amdgpu_kiq_rreg(adev, (reg))
-#define WREG32_KIQ(reg, v) amdgpu_kiq_wreg(adev, (reg), (v))
+#define RREG32_KIQ(reg) amdgpu_kiq_rreg(adev, (reg), 0)
+#define WREG32_KIQ(reg, v) amdgpu_kiq_wreg(adev, (reg), (v), 0)

 #define RREG8(reg) amdgpu_mm_rreg8(adev, (reg))
 #define WREG8(reg, v) amdgpu_mm_wreg8(adev, (reg), (v))
@@ -1218,6 +1225,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev);
 #define WREG32(reg, v) amdgpu_device_wreg(adev, (reg), (v), 0)
 #define REG_SET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK)
 #define REG_GET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK)
+#define RREG32_XCC(reg, inst) amdgpu_device_xcc_rreg(adev, (reg), 0, inst)
+#define WREG32_XCC(reg, v, inst) amdgpu_device_xcc_wreg(adev, (reg), (v), 0, 
inst)
 #define RREG32_PCIE(reg) adev->pcie_rreg(adev, (reg))
 #define WREG32_PCIE(reg, v) adev->pcie_wreg(adev, (reg), (v))
 #define RREG32_PCIE_PORT(reg) adev->pciep_rreg(adev, (reg))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
index 490c8f5ddb60..80309d39737a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
@@ -300,7 +300,7 @@ static int kgd_gfx_v9_4_3_hqd_load(struct amdgpu_device 
*adev, void *mqd,
hqd_end = SOC15_REG_OFFSET(GC, GET_INST(GC, inst), 
regCP_HQD_AQL_DISPATCH_ID_HI);

for (reg = hqd_base; reg <= hqd_end; reg++)
-   WREG32_RLC(reg, mqd_hqd[reg - hqd_base]);
+   WREG32_XCC(reg, mqd_hqd[reg - hqd_base], inst);


/* Activate doorbell logic before triggering WPTR poll. */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index 51011e8ee90d..9285789b3a42 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -239,7 +239,7 @@ int kgd_gfx_v9_hqd_load(struct amdgpu_device *adev, void 
*mqd,

for (reg = hqd_base;
 reg <= SOC15_REG_OFFSET(GC, GET_INST(GC, inst), 
mmCP_HQD_PQ_WPTR_HI); reg++)
-   WREG32_RLC(reg, mqd_hqd[reg - hqd_base]);
+   WREG32_XCC(reg, mqd_hqd[reg - hqd_base], inst);


/* Activate doorbell logic before triggering WPTR poll. */
diff --git 

[PATCH] drm/amdgpu: Add xcc instance parameter to *REG32_SOC15_IP_NO_KIQ (v3)

2023-10-27 Thread Victor Lu
The WREG32/RREG32_SOC15_IP_NO_KIQ calls use XCC0's RLCG interface even
when programming other XCCs.

Add xcc instance parameter to them.

v3: xcc not needed for MMHUB

v2: rebase

Signed-off-by: Victor Lu 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 16 
 drivers/gpu/drm/amd/amdgpu/soc15_common.h |  6 +++---
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 3a1050344b59..0ab9c554da78 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -856,9 +856,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
for (j = 0; j < adev->usec_timeout; j++) {
/* a read return value of 1 means semaphore acquire */
if (vmhub >= AMDGPU_MMHUB0(0))
-   tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, sem);
+   tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0);
else
-   tmp = RREG32_SOC15_IP_NO_KIQ(GC, sem);
+   tmp = RREG32_SOC15_IP_NO_KIQ(GC, sem, vmhub);
if (tmp & 0x1)
break;
udelay(1);
@@ -869,9 +869,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
}
 
if (vmhub >= AMDGPU_MMHUB0(0))
-   WREG32_SOC15_IP_NO_KIQ(MMHUB, req, inv_req);
+   WREG32_SOC15_IP_NO_KIQ(MMHUB, req, inv_req, 0);
else
-   WREG32_SOC15_IP_NO_KIQ(GC, req, inv_req);
+   WREG32_SOC15_IP_NO_KIQ(GC, req, inv_req, vmhub);
 
/*
 * Issue a dummy read to wait for the ACK register to
@@ -884,9 +884,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
 
for (j = 0; j < adev->usec_timeout; j++) {
if (vmhub >= AMDGPU_MMHUB0(0))
-   tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, ack);
+   tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, ack, 0);
else
-   tmp = RREG32_SOC15_IP_NO_KIQ(GC, ack);
+   tmp = RREG32_SOC15_IP_NO_KIQ(GC, ack, vmhub);
if (tmp & (1 << vmid))
break;
udelay(1);
@@ -899,9 +899,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
 * write with 0 means semaphore release
 */
if (vmhub >= AMDGPU_MMHUB0(0))
-   WREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0);
+   WREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0, 0);
else
-   WREG32_SOC15_IP_NO_KIQ(GC, sem, 0);
+   WREG32_SOC15_IP_NO_KIQ(GC, sem, 0, vmhub);
}
 
spin_unlock(&adev->gmc.invalidate_lock);
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h 
b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
index da683afa0222..c75e9cd5c98b 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h
+++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
@@ -69,7 +69,7 @@
 
 #define RREG32_SOC15_IP(ip, reg) __RREG32_SOC15_RLC__(reg, 0, ip##_HWIP, 0)
 
-#define RREG32_SOC15_IP_NO_KIQ(ip, reg) __RREG32_SOC15_RLC__(reg, 
AMDGPU_REGS_NO_KIQ, ip##_HWIP, 0)
+#define RREG32_SOC15_IP_NO_KIQ(ip, reg, inst) __RREG32_SOC15_RLC__(reg, 
AMDGPU_REGS_NO_KIQ, ip##_HWIP, inst)
 
 #define RREG32_SOC15_NO_KIQ(ip, inst, reg) \
__RREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] 
+ reg, \
@@ -86,8 +86,8 @@
 #define WREG32_SOC15_IP(ip, reg, value) \
 __WREG32_SOC15_RLC__(reg, value, 0, ip##_HWIP, 0)
 
-#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value) \
-__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP, 0)
+#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value, inst) \
+__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP, inst)
 
 #define WREG32_SOC15_NO_KIQ(ip, inst, reg, value) \
__WREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] 
+ reg, \
-- 
2.34.1



Re: [PATCHv2 1/2] drm/amdkfd: Populate cache info for GFX 9.4.3

2023-10-27 Thread Felix Kuehling

On 2023-10-27 15:04, Mukul Joshi wrote:

GFX 9.4.3 uses a new version of the GC info table which
contains the cache info. This patch adds a new function
to populate the cache info from IP discovery for GFX 9.4.3.

Signed-off-by: Mukul Joshi 
---
v1->v2:
- Separate out the original patch into 2 patches.

  drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 66 ++-
  1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 0e792a8496d6..cd8e459201f1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1404,6 +1404,66 @@ static int 
kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev,
return i;
  }
  
+static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev,

+  struct kfd_gpu_cache_info 
*pcache_info)
+{
+   struct amdgpu_device *adev = kdev->adev;
+   int i = 0;
+
+   /* TCP L1 Cache per CU */
+   if (adev->gfx.config.gc_tcp_size_per_cu) {
+   pcache_info[i].cache_size = adev->gfx.config.gc_tcp_size_per_cu;
+   pcache_info[i].cache_level = 1;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_DATA_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = 1;
+   i++;
+   }
+   /* Scalar L1 Instruction Cache per SQC */
+   if (adev->gfx.config.gc_l1_instruction_cache_size_per_sqc) {
+   pcache_info[i].cache_size =
+   adev->gfx.config.gc_l1_instruction_cache_size_per_sqc;
+   pcache_info[i].cache_level = 1;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_INST_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = 
adev->gfx.config.gc_num_cu_per_sqc;
+   i++;
+   }
+   /* Scalar L1 Data Cache per SQC */
+   if (adev->gfx.config.gc_l1_data_cache_size_per_sqc) {
+   pcache_info[i].cache_size = adev->gfx.config.gc_l1_data_cache_size_per_sqc;
+   pcache_info[i].cache_level = 1;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_DATA_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_cu_per_sqc;
+   i++;
+   }
+   /* L2 Data Cache per GPU (Total Tex Cache) */
+   if (adev->gfx.config.gc_tcc_size) {
+   pcache_info[i].cache_size = adev->gfx.config.gc_tcc_size;
+   pcache_info[i].cache_level = 2;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_DATA_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh;
+   i++;
+   }
+   /* L3 Data Cache per GPU */
+   if (adev->gmc.mall_size) {
+   pcache_info[i].cache_size = adev->gmc.mall_size / 1024;


Is this /1024 a unit conversion? What are the units for L1/L2 caches 
compared to L3 caches?


When we report the sizes in the topology, they should be in the same 
units for all cache levels, I believe. Given that L3 is likely the 
largest, I'm a bit suspicious of this conversion.
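For what it's worth, the unit mismatch being questioned here can be sketched in a few lines. This assumes (as the /1024 in the patch suggests) that gmc.mall_size is kept in bytes elsewhere in the driver, while the gfx-config cache sizes are already in KB; the helper name is illustrative, not an actual driver function:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: if L1/L2 sizes from IP discovery are in KB but the
 * MALL (L3) size is tracked in bytes, the L3 value must be divided by 1024
 * before all levels are reported in the same unit in the topology. */
static uint64_t l3_cache_size_kb(uint64_t mall_size_bytes)
{
	/* Normalize the L3 size to KB to match the other cache levels. */
	return mall_size_bytes / 1024;
}
```

If mall_size were already in KB, this division would under-report the largest cache by three orders of magnitude, which is exactly the concern raised above.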


Other than that, the series is

Reviewed-by: Felix Kuehling 



+   pcache_info[i].cache_level = 3;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_DATA_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh;
+   i++;
+   }
+   return i;
+}
+
  int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pcache_info)
  {
int num_of_cache_types = 0;
@@ -1461,10 +1521,14 @@ int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pc
num_of_cache_types = ARRAY_SIZE(vega20_cache_info);
break;
case IP_VERSION(9, 4, 2):
-   case IP_VERSION(9, 4, 3):
*pcache_info = aldebaran_cache_info;
num_of_cache_types = ARRAY_SIZE(aldebaran_cache_info);
break;
+   case IP_VERSION(9, 4, 3):
+   num_of_cache_types =
+   kfd_fill_gpu_cache_info_from_gfx_config_v2(kdev->kfd, *pcache_info);
+   break;

[pull] amdgpu, amdkfd drm-next-6.7

2023-10-27 Thread Alex Deucher
Hi Dave, Sima,

Fixes for 6.7.

The following changes since commit 5258dfd4a6adb5f45f046b0dd2e73c680f880d9d:

  usb: typec: altmodes/displayport: fixup drm internal api change vs new user. 
(2023-10-27 07:55:41 +1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-next-6.7-2023-10-27

for you to fetch changes up to dd3dd9829bf9a4ecd55482050745efdd9f7f97fc:

  drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo (2023-10-27 
14:23:01 -0400)


amd-drm-next-6.7-2023-10-27:

amdgpu:
- RAS fixes
- Seamless boot fixes
- NBIO 7.7 fix
- SMU 14.0 fixes
- GC 11.5 fixes
- DML2 fixes
- ASPM fixes
- VPE fixes
- Misc code cleanups
- SRIOV fixes
- Add some missing copyright notices
- DCN 3.5 fixes
- FAMS fixes
- Backlight fix
- S/G display fix
- fdinfo cleanups
- EXT_COHERENT fixes for APU and NUMA systems

amdkfd:
- Misc fixes
- Misc code cleanups
- SVM fixes


Agustin Gutierrez (1):
  drm/amd/display: Remove power sequencing check

Alex Hung (2):
  drm/amd/display: Revert "drm/amd/display: allow edp updates for virtual 
signal"
  drm/amd/display: Set emulated sink type to HDMI accordingly.

Alvin Lee (1):
  drm/amd/display: Update FAMS sequence for DCN30 & DCN32

Aric Cyr (1):
  drm/amd/display: 3.2.256

Aurabindo Pillai (1):
  drm/amd/display: add interface to query SubVP status

Candice Li (2):
  drm/amdgpu: Identify data parity error corrected in replay mode
  drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table

David Francis (1):
  drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems

Fangzhi Zuo (1):
  drm/amd/display: Fix MST Multi-Stream Not Lighting Up on dcn35

George Shen (1):
  drm/amd/display: Update SDP VSC colorimetry from DP test automation 
request

Hamza Mahfooz (1):
  drm/amd/display: fix S/G display enablement

Hugo Hu (1):
  drm/amd/display: reprogram det size while seamless boot

Ilya Bakoulin (1):
  drm/amd/display: Fix shaper using bad LUT params

Iswara Nagulendran (1):
  drm/amd/display: Read before writing Backlight Mode Set Register

James Zhu (1):
  drm/amdxcp: fix amdxcp unloads incompletely

Jesse Zhang (1):
  drm/amdkfd: Fix shift out-of-bounds issue

Jiadong Zhu (2):
  drm/amd/pm: drop unneeded dpm features disablement for SMU 14.0.0
  drm/amdgpu: add tmz support for GC IP v11.5.0

Kenneth Feng (1):
  drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded

Lang Yu (1):
  drm/amdgpu/vpe: correct queue stop programing

Li Ma (2):
  drm/amdgpu: modify if condition in nbio_v7_7.c
  drm/amd/amdgpu: fix the GPU power print error in pm info

Lijo Lazar (4):
  drm/amdgpu: Add API to get full IP version
  drm/amdgpu: Use discovery table's subrevision
  drm/amdgpu: Add a read to GFX v9.4.3 ring test
  drm/amdgpu: Use pcie domain of xcc acpi objects

Lin.Cao (2):
  drm/amdgpu remove restriction of sriov max_pfn on Vega10
  drm/amd: check num of link levels when update pcie param

Ma Jun (1):
  drm/amd/pm: Fix the return value in default case

Mario Limonciello (4):
  drm/amd: Disable ASPM for VI w/ all Intel systems
  drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not 
supported
  drm/amd: Move AMD_IS_APU check for ASPM into top level function
  drm/amd: Explicitly disable ASPM when dynamic switching disabled

Michael Strauss (1):
  drm/amd/display: Disable SYMCLK32_SE RCO on DCN314

Mukul Joshi (1):
  drm/amdgpu: Fix typo in IP discovery parsing

Nicholas Kazlauskas (2):
  drm/amd/display: Revert "Improve x86 and dmub ips handshake"
  drm/amd/display: Fix IPS handshake for idle optimizations

Philip Yang (2):
  Revert "drm/amdkfd:remove unused code"
  Revert "drm/amdkfd: Use partial migrations in GPU page faults"

Qu Huang (1):
  drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL

Rob Clark (1):
  drm/amdgpu: Remove duplicate fdinfo fields

Rodrigo Siqueira (5):
  drm/amd/display: Set the DML2 attribute to false in all DCNs older than 
version 3.5
  drm/amd/display: Fix DMUB errors introduced by DML2
  drm/amd/display: Correct enum typo
  drm/amd/display: Add prefix to amdgpu crtc functions
  drm/amd/display: Add prefix for plane functions

Samson Tam (2):
  drm/amd/display: fix num_ways overflow error
  drm/amd/display: add null check for invalid opps

Srinivasan Shanmugam (1):
  drm/amdkfd: Address 'remap_list' not described in 'svm_range_add'

Stylon Wang (3):
  drm/amd/display: Add missing copyright notice in DMUB
  drm/amd/display: Fix copyright notice in DML2 code
  drm/amd/display: Fix copyright notice in DC code

Sung Joon Kim (2):
  drm/amd/display: Add a check for idle power optimization
  

[PATCHv2 2/2] drm/amdkfd: Update cache info for GFX 9.4.3

2023-10-27 Thread Mukul Joshi
Update cache info reporting based on compute and
memory partitioning modes.

Signed-off-by: Mukul Joshi 
---
v1->v2:
- Split this change out into a separate patch.
- Simplify the if condition to reduce indentation and make it
  logically clearer.

 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 4e530791507e..dc7c8312e8c7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1602,10 +1602,13 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
unsigned int cu_sibling_map_mask;
int first_active_cu;
int i, j, k, xcc, start, end;
+   int num_xcc = NUM_XCC(knode->xcc_mask);
struct kfd_cache_properties *pcache = NULL;
+   enum amdgpu_memory_partition mode;
+   struct amdgpu_device *adev = knode->adev;
 
start = ffs(knode->xcc_mask) - 1;
-   end = start + NUM_XCC(knode->xcc_mask);
+   end = start + num_xcc;
cu_sibling_map_mask = cu_info->bitmap[start][0][0];
cu_sibling_map_mask &=
((1 << pcache_info[cache_type].num_cu_shared) - 1);
@@ -1624,7 +1627,18 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
pcache->processor_id_low = cu_processor_id
+ (first_active_cu - 1);
pcache->cache_level = pcache_info[cache_type].cache_level;
-   pcache->cache_size = pcache_info[cache_type].cache_size;
+
+   if (KFD_GC_VERSION(knode) == IP_VERSION(9, 4, 3))
+   mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev);
+   else
+   mode = UNKNOWN_MEMORY_PARTITION_MODE;
+
+   if (pcache->cache_level == 2)
+   pcache->cache_size = pcache_info[cache_type].cache_size * num_xcc;
+   else if (mode)
+   pcache->cache_size = pcache_info[cache_type].cache_size / mode;
+   else
+   pcache->cache_size = pcache_info[cache_type].cache_size;
 
if (pcache_info[cache_type].flags & CRAT_CACHE_FLAGS_DATA_CACHE)
pcache->cache_type |= HSA_CACHE_TYPE_DATA;
-- 
2.35.1
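The partition-aware sizing rule in the hunk above can be sketched as a standalone function. This is a simplified model, assuming `mode` acts as a plain divisor (0 meaning unknown/unpartitioned), which is how the patch uses it: the L2 is shared across XCCs so its reported size scales with num_xcc, while other levels are split across memory partitions:

```c
#include <assert.h>

/* Sketch of the reported-size logic from fill_in_l2_l3_pcache(), under the
 * assumption that `mode` is the memory-partition divisor (0 = unknown).
 * L2 scales with the number of XCCs; other cache levels are divided by the
 * partition count; with no partitioning info the base size is reported. */
static unsigned reported_cache_size(unsigned base_size, unsigned cache_level,
				    unsigned num_xcc, unsigned mode)
{
	if (cache_level == 2)
		return base_size * num_xcc;
	if (mode)
		return base_size / mode;
	return base_size;
}
```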



[PATCHv2 1/2] drm/amdkfd: Populate cache info for GFX 9.4.3

2023-10-27 Thread Mukul Joshi
GFX 9.4.3 uses a new version of the GC info table which
contains the cache info. This patch adds a new function
to populate the cache info from IP discovery for GFX 9.4.3.

Signed-off-by: Mukul Joshi 
---
v1->v2:
- Separate out the original patch into 2 patches.

 drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 66 ++-
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 0e792a8496d6..cd8e459201f1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1404,6 +1404,66 @@ static int kfd_fill_gpu_cache_info_from_gfx_config(struct kfd_dev *kdev,
return i;
 }
 
+static int kfd_fill_gpu_cache_info_from_gfx_config_v2(struct kfd_dev *kdev,
+  struct kfd_gpu_cache_info *pcache_info)
+{
+   struct amdgpu_device *adev = kdev->adev;
+   int i = 0;
+
+   /* TCP L1 Cache per CU */
+   if (adev->gfx.config.gc_tcp_size_per_cu) {
+   pcache_info[i].cache_size = adev->gfx.config.gc_tcp_size_per_cu;
+   pcache_info[i].cache_level = 1;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_DATA_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = 1;
+   i++;
+   }
+   /* Scalar L1 Instruction Cache per SQC */
+   if (adev->gfx.config.gc_l1_instruction_cache_size_per_sqc) {
+   pcache_info[i].cache_size =
+   adev->gfx.config.gc_l1_instruction_cache_size_per_sqc;
+   pcache_info[i].cache_level = 1;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_INST_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_cu_per_sqc;
+   i++;
+   }
+   /* Scalar L1 Data Cache per SQC */
+   if (adev->gfx.config.gc_l1_data_cache_size_per_sqc) {
+   pcache_info[i].cache_size = adev->gfx.config.gc_l1_data_cache_size_per_sqc;
+   pcache_info[i].cache_level = 1;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_DATA_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = adev->gfx.config.gc_num_cu_per_sqc;
+   i++;
+   }
+   /* L2 Data Cache per GPU (Total Tex Cache) */
+   if (adev->gfx.config.gc_tcc_size) {
+   pcache_info[i].cache_size = adev->gfx.config.gc_tcc_size;
+   pcache_info[i].cache_level = 2;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_DATA_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh;
+   i++;
+   }
+   /* L3 Data Cache per GPU */
+   if (adev->gmc.mall_size) {
+   pcache_info[i].cache_size = adev->gmc.mall_size / 1024;
+   pcache_info[i].cache_level = 3;
+   pcache_info[i].flags = (CRAT_CACHE_FLAGS_ENABLED |
+   CRAT_CACHE_FLAGS_DATA_CACHE |
+   CRAT_CACHE_FLAGS_SIMD_CACHE);
+   pcache_info[i].num_cu_shared = adev->gfx.config.max_cu_per_sh;
+   i++;
+   }
+   return i;
+}
+
 int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pcache_info)
 {
int num_of_cache_types = 0;
@@ -1461,10 +1521,14 @@ int kfd_get_gpu_cache_info(struct kfd_node *kdev, struct kfd_gpu_cache_info **pc
num_of_cache_types = ARRAY_SIZE(vega20_cache_info);
break;
case IP_VERSION(9, 4, 2):
-   case IP_VERSION(9, 4, 3):
*pcache_info = aldebaran_cache_info;
num_of_cache_types = ARRAY_SIZE(aldebaran_cache_info);
break;
+   case IP_VERSION(9, 4, 3):
+   num_of_cache_types =
+   kfd_fill_gpu_cache_info_from_gfx_config_v2(kdev->kfd, *pcache_info);
+   break;
case IP_VERSION(9, 1, 0):
case IP_VERSION(9, 2, 2):
*pcache_info = raven_cache_info;
-- 
2.35.1
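The shape of kfd_fill_gpu_cache_info_from_gfx_config_v2() above — append an entry per cache level only when IP discovery reported a non-zero size, and return the count — can be sketched in isolation. Struct and parameter names here are invented for illustration; only the control flow mirrors the patch:

```c
#include <assert.h>

/* Simplified model of the conditional cache-info enumeration: each level is
 * recorded only if discovery reported a non-zero size, and the number of
 * cache types filled in is returned to the caller. */
struct cache_entry { unsigned size_kb; unsigned level; };

static int fill_cache_info(struct cache_entry *out,
			   unsigned tcp_l1, unsigned sqc_i1,
			   unsigned tcc_l2, unsigned mall_l3)
{
	int i = 0;

	if (tcp_l1)  { out[i].size_kb = tcp_l1;  out[i].level = 1; i++; }
	if (sqc_i1)  { out[i].size_kb = sqc_i1;  out[i].level = 1; i++; }
	if (tcc_l2)  { out[i].size_kb = tcc_l2;  out[i].level = 2; i++; }
	if (mall_l3) { out[i].size_kb = mall_l3; out[i].level = 3; i++; }
	return i;	/* number of cache types discovered */
}
```

This is why kfd_get_gpu_cache_info() can assign the return value directly to num_of_cache_types instead of using a fixed ARRAY_SIZE() as the older static tables do.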



[PATCH] drm/radeon: replace 1-element arrays with flexible-array members

2023-10-27 Thread José Pekkarinen
Reported by coccinelle, this patch converts the following one-element
arrays to flexible-array members.

drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:5545:32-48: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:5461:34-44: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4447:30-40: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4236:30-41: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7044:24-37: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7054:24-37: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7095:28-45: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7553:8-17: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7559:8-17: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:3896:27-37: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:5443:16-25: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:5454:34-43: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4603:21-32: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:6299:32-44: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4628:32-46: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:6285:29-39: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4296:30-36: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4756:28-36: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4064:22-35: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7327:9-24: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7332:32-53: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:6030:8-17: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7362:26-41: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7369:29-44: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7349:24-32: WARNING use flexible-array member 
instead 
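The conversion the warnings call for looks like the following before/after sketch. The struct and field names are invented for illustration (the real ones live in atombios.h); the point is that a flexible-array member makes the allocation size explicit and lets the compiler and FORTIFY checks bound accesses:

```c
#include <assert.h>
#include <stdlib.h>

/* Deprecated style: a 1-element array used as a variable-length tail. */
struct old_tbl {
	unsigned short usCount;
	unsigned int entries[1];	/* one-element array (deprecated) */
};

/* Preferred style: a C99 flexible-array member. */
struct new_tbl {
	unsigned short usCount;
	unsigned int entries[];		/* flexible-array member */
};

/* With a flexible-array member the trailing storage is sized explicitly at
 * allocation time rather than hidden behind a fake 1-element array. */
static struct new_tbl *alloc_tbl(unsigned short count)
{
	struct new_tbl *t = malloc(sizeof(*t) + count * sizeof(t->entries[0]));

	if (t)
		t->usCount = count;
	return t;
}
```

Note that sizeof(struct new_tbl) no longer includes a phantom first element, which is why such conversions must audit every sizeof-based allocation in the same patch.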

RE: [PATCH] drm/radeon: replace 1-element arrays with flexible-array members

2023-10-27 Thread Deucher, Alexander
[Public]

> -Original Message-
> From: José Pekkarinen 
> Sent: Friday, October 27, 2023 12:59 PM
> To: Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> sk...@linuxfoundation.org
> Cc: José Pekkarinen ; airl...@gmail.com;
> dan...@ffwll.ch; amd-gfx@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; linux-kernel-
> ment...@lists.linuxfoundation.org
> Subject: [PATCH] drm/radeon: replace 1-element arrays with flexible-array
> members
>
> Reported by coccinelle, the following patch will move the following 1 element
> arrays to flexible arrays.
>
> drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:5545:32-48: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:5461:34-44: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4447:30-40: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4236:30-41: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7044:24-37: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7054:24-37: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7095:28-45: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7553:8-17: WARNING use flexible-array
> member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7559:8-17: WARNING use flexible-array
> member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:3896:27-37: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:5443:16-25: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:5454:34-43: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4603:21-32: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:6299:32-44: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4628:32-46: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:6285:29-39: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4296:30-36: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4756:28-36: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4064:22-35: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7327:9-24: WARNING use flexible-array
> member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7332:32-53: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> 

[PATCH 6/7] drm/exec: Pass in initial # of objects

2023-10-27 Thread Rob Clark
From: Rob Clark 

In cases where the # is known ahead of time, it is silly to do the table
resize dance.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |  4 ++--
 drivers/gpu/drm/drm_exec.c  | 15 ---
 drivers/gpu/drm/nouveau/nouveau_exec.c  |  2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c  |  2 +-
 include/drm/drm_exec.h  |  2 +-
 8 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index efdb1c48f431..d27ca8f61929 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -65,7 +65,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p,
}
 
amdgpu_sync_create(&p->sync);
-   drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&p->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 720011019741..796fa6f1420b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
struct drm_exec exec;
int r;
 
-   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked() {
r = amdgpu_vm_lock_pd(vm, &exec, 0);
if (likely(!r))
@@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
struct drm_exec exec;
int r;
 
-   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked() {
r = amdgpu_vm_lock_pd(vm, &exec, 0);
if (likely(!r))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index ca4d2d430e28..16f1715148ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj,
struct drm_exec exec;
long r;
 
-   drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES);
+   drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, 0);
drm_exec_until_all_locked() {
r = drm_exec_prepare_obj(&exec, &bo->tbo.base, 1);
drm_exec_retry_on_contention();
@@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
}
 
drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
- DRM_EXEC_IGNORE_DUPLICATES);
+ DRM_EXEC_IGNORE_DUPLICATES, 0);
drm_exec_until_all_locked() {
if (gobj) {
r = drm_exec_lock_obj(&exec, gobj);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index b6015157763a..3c351941701e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -1105,7 +1105,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
 
amdgpu_sync_create(&sync);
 
-   drm_exec_init(&exec, 0);
+   drm_exec_init(&exec, 0, 0);
drm_exec_until_all_locked() {
r = drm_exec_lock_obj(&exec,
  &ctx_data->meta_data_obj->tbo.base);
@@ -1176,7 +1176,7 @@ int amdgpu_mes_ctx_unmap_meta_data(struct amdgpu_device *adev,
struct drm_exec exec;
long r;
 
-   drm_exec_init(&exec, 0);
+   drm_exec_init(&exec, 0, 0);
drm_exec_until_all_locked() {
r = drm_exec_lock_obj(&exec,
  &ctx_data->meta_data_obj->tbo.base);
diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c
index 5d2809de4517..27d11c20d148 100644
--- a/drivers/gpu/drm/drm_exec.c
+++ b/drivers/gpu/drm/drm_exec.c
@@ -69,16 +69,25 @@ static void drm_exec_unlock_all(struct drm_exec *exec)
  * drm_exec_init - initialize a drm_exec object
  * @exec: the drm_exec object to initialize
  * @flags: controls locking behavior, see DRM_EXEC_* defines
+ * @nr: the initial # of objects
  *
  * Initialize the object and make sure that we can track locked objects.
+ *
+ * If nr is non-zero then it is used as the initial objects table size.
+ * In either case, the table will grow (be re-allocated) on demand.
  */
-void drm_exec_init(struct drm_exec *exec, uint32_t flags)
+void drm_exec_init(struct drm_exec *exec, uint32_t flags, unsigned nr)
 {
+   size_t sz = PAGE_SIZE;
+
+   if (nr)
+   sz = (size_t)nr * sizeof(void *);
+
exec->flags = flags;
-   exec->objects = kmalloc(PAGE_SIZE, GFP_KERNEL);
+   exec->objects = kmalloc(sz, GFP_KERNEL);
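The sizing rule this hunk introduces is small enough to model on its own. A sketch under the assumption that the table holds `void *` entries and that the fallback allocation remains one page (PAGE_SIZE fixed at 4096 here purely for illustration):

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Sketch of the new drm_exec_init() sizing: a caller-supplied non-zero
 * object count sets the initial table allocation; otherwise fall back to
 * one page as before. The table still grows on demand either way. */
static size_t exec_initial_table_size(unsigned nr)
{
	return nr ? (size_t)nr * sizeof(void *) : (size_t)PAGE_SIZE;
}
```

Callers that already know their object count (e.g. a submit with N BOs) can thereby avoid the krealloc "resize dance" the cover letter mentions.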
 

[PATCH 0/7] drm/msm/gem: drm_exec conversion

2023-10-27 Thread Rob Clark
From: Rob Clark 

Simplify the exec path (removing a legacy optimization) and convert to
drm_exec.  One drm_exec patch to allow passing in the expected # of GEM
objects to avoid re-allocation.

I'd be a bit happier if I could avoid the extra objects table allocation
in drm_exec in the first place, but wasn't really happy with any of the
things I tried to get rid of that.

Rob Clark (7):
  drm/msm/gem: Remove "valid" tracking
  drm/msm/gem: Remove submit_unlock_unpin_bo()
  drm/msm/gem: Don't queue job to sched in error cases
  drm/msm/gem: Split out submit_unpin_objects() helper
  drm/msm/gem: Cleanup submit_cleanup_bo()
  drm/exec: Pass in initial # of objects
  drm/msm/gem: Convert to drm_exec

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |   4 +-
 drivers/gpu/drm/drm_exec.c  |  15 +-
 drivers/gpu/drm/msm/Kconfig |   1 +
 drivers/gpu/drm/msm/msm_gem.h   |  13 +-
 drivers/gpu/drm/msm/msm_gem_submit.c| 197 ++--
 drivers/gpu/drm/msm/msm_ringbuffer.c|   3 +-
 drivers/gpu/drm/nouveau/nouveau_exec.c  |   2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c  |   2 +-
 include/drm/drm_exec.h  |   2 +-
 12 files changed, 79 insertions(+), 170 deletions(-)

-- 
2.41.0



Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Hamza Mahfooz

On 10/27/23 11:55, Lakha, Bhawanpreet wrote:

[AMD Official Use Only - General]



There was a consensus to use memset instead of {0}. I remember making 
changes related to that previously.


Hm, seems like it's used rather consistently in the DM and in DC
though.



Bhawan


*From:* Mahfooz, Hamza 
*Sent:* October 27, 2023 11:53 AM
*To:* Yuran Pereira ; airl...@gmail.com 

*Cc:* Li, Sun peng (Leo) ; Lakha, Bhawanpreet 
; Pan, Xinhui ; Siqueira, 
Rodrigo ; linux-ker...@vger.kernel.org 
; amd-gfx@lists.freedesktop.org 
; dri-de...@lists.freedesktop.org 
; Deucher, Alexander 
; Koenig, Christian 
; 
linux-kernel-ment...@lists.linuxfoundation.org 

*Subject:* Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in 
amdgpu_dm_setup_replay

On 10/26/23 17:25, Yuran Pereira wrote:

Since `pr_config` is not initialized after its declaration, the
following operations with `replay_enable_option` may be performed
when `replay_enable_option` is holding junk values which could
possibly lead to undefined behaviour

```
  ...
  pr_config.replay_enable_option |= pr_enable_option_static_screen;
  ...

  if (!pr_config.replay_timing_sync_supported)
  pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
  ...
```

This patch initializes `pr_config` after its declaration to ensure that
it doesn't contain junk data, and prevent any undefined behaviour

Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
Signed-off-by: Yuran Pereira 
---
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
index 32d3086c4cb7..40526507f50b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
@@ -23,6 +23,7 @@
    *
    */
  
+#include 

   #include "amdgpu_dm_replay.h"
   #include "dc.h"
   #include "dm_helpers.h"
@@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct amdgpu_dm_connector *ac
    struct replay_config pr_config;


I would prefer setting pr_config = {0};


    union replay_debug_flags *debug_flags = NULL;
  
+ memset(&pr_config, 0, sizeof(pr_config));

+
    // For eDP, if Replay is supported, return true to skip checks
    if (link->replay_settings.config.replay_supported)
    return true;

--
Hamza


--
Hamza
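The two zeroing styles being debated in this thread behave the same for the members themselves; the difference is mostly stylistic (memset also guarantees padding bytes are zeroed). A minimal sketch with an invented stand-in for struct replay_config:

```c
#include <string.h>

/* Stand-in for struct replay_config; fields invented for illustration. */
struct replay_config_sketch {
	unsigned replay_enable_option;
	int replay_timing_sync_supported;
};

/* Style 1: designated zero-initializer, zeroes every member. */
static struct replay_config_sketch zero_by_initializer(void)
{
	struct replay_config_sketch c = {0};
	return c;
}

/* Style 2: memset, zeroes the whole object including padding bytes. */
static struct replay_config_sketch zero_by_memset(void)
{
	struct replay_config_sketch c;

	memset(&c, 0, sizeof(c));
	return c;
}
```

Either form fixes the Coverity finding; the disagreement here is purely about which convention the DM/DC code base has standardized on.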



Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Lakha, Bhawanpreet
[AMD Official Use Only - General]


There was a consensus to use memset instead of {0}. I remember making changes 
related to that previously.

Bhawan


From: Mahfooz, Hamza 
Sent: October 27, 2023 11:53 AM
To: Yuran Pereira ; airl...@gmail.com 

Cc: Li, Sun peng (Leo) ; Lakha, Bhawanpreet 
; Pan, Xinhui ; Siqueira, 
Rodrigo ; linux-ker...@vger.kernel.org 
; amd-gfx@lists.freedesktop.org 
; dri-de...@lists.freedesktop.org 
; Deucher, Alexander 
; Koenig, Christian ; 
linux-kernel-ment...@lists.linuxfoundation.org 

Subject: Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in 
amdgpu_dm_setup_replay

On 10/26/23 17:25, Yuran Pereira wrote:
> Since `pr_config` is not initialized after its declaration, the
> following operations with `replay_enable_option` may be performed
> when `replay_enable_option` is holding junk values which could
> possibly lead to undefined behaviour
>
> ```
>  ...
>  pr_config.replay_enable_option |= pr_enable_option_static_screen;
>  ...
>
>  if (!pr_config.replay_timing_sync_supported)
>  pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
>  ...
> ```
>
> This patch initializes `pr_config` after its declaration to ensure that
> it doesn't contain junk data, and prevent any undefined behaviour
>
> Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
> Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
> Signed-off-by: Yuran Pereira 
> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
> index 32d3086c4cb7..40526507f50b 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
> @@ -23,6 +23,7 @@
>*
>*/
>
> +#include 
>   #include "amdgpu_dm_replay.h"
>   #include "dc.h"
>   #include "dm_helpers.h"
> @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct 
> amdgpu_dm_connector *ac
>struct replay_config pr_config;

I would prefer setting pr_config = {0};

>union replay_debug_flags *debug_flags = NULL;
>
> +	memset(&pr_config, 0, sizeof(pr_config));
> +
>// For eDP, if Replay is supported, return true to skip checks
>if (link->replay_settings.config.replay_supported)
>return true;
--
Hamza



Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Hamza Mahfooz

Also, please write the tagline in present tense.
On 10/27/23 11:53, Hamza Mahfooz wrote:

On 10/26/23 17:25, Yuran Pereira wrote:

Since `pr_config` is not initialized after its declaration, the
following operations with `replay_enable_option` may be performed
when `replay_enable_option` is holding junk values which could
possibly lead to undefined behaviour

```
 ...
 pr_config.replay_enable_option |= pr_enable_option_static_screen;
 ...

 if (!pr_config.replay_timing_sync_supported)
 pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
 ...
```

This patch initializes `pr_config` after its declaration to ensure that
it doesn't contain junk data, and prevent any undefined behaviour

Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
Signed-off-by: Yuran Pereira 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c

index 32d3086c4cb7..40526507f50b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
@@ -23,6 +23,7 @@
   *
   */
+#include 
  #include "amdgpu_dm_replay.h"
  #include "dc.h"
  #include "dm_helpers.h"
@@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, 
struct amdgpu_dm_connector *ac

  struct replay_config pr_config;


I would prefer setting pr_config = {0};


  union replay_debug_flags *debug_flags = NULL;
+	memset(&pr_config, 0, sizeof(pr_config));
+
  // For eDP, if Replay is supported, return true to skip checks
  if (link->replay_settings.config.replay_supported)
  return true;

--
Hamza



Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Hamza Mahfooz

On 10/26/23 17:25, Yuran Pereira wrote:

Since `pr_config` is not initialized after its declaration, the
following operations with `replay_enable_option` may be performed
when `replay_enable_option` is holding junk values which could
possibly lead to undefined behaviour

```
 ...
 pr_config.replay_enable_option |= pr_enable_option_static_screen;
 ...

 if (!pr_config.replay_timing_sync_supported)
 pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
 ...
```

This patch initializes `pr_config` after its declaration to ensure that
it doesn't contain junk data, and prevent any undefined behaviour

Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
Signed-off-by: Yuran Pereira 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
index 32d3086c4cb7..40526507f50b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
@@ -23,6 +23,7 @@
   *
   */
  
+#include 

  #include "amdgpu_dm_replay.h"
  #include "dc.h"
  #include "dm_helpers.h"
@@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct 
amdgpu_dm_connector *ac
struct replay_config pr_config;


I would prefer setting pr_config = {0};


union replay_debug_flags *debug_flags = NULL;
  
+	memset(&pr_config, 0, sizeof(pr_config));

+
// For eDP, if Replay is supported, return true to skip checks
if (link->replay_settings.config.replay_supported)
return true;

--
Hamza



Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Lakha, Bhawanpreet
[AMD Official Use Only - General]

Thanks,

Reviewed-by: Bhawanpreet Lakha 


From: Yuran Pereira 
Sent: October 26, 2023 5:25 PM
To: airl...@gmail.com 
Cc: Yuran Pereira ; Wentland, Harry 
; Li, Sun peng (Leo) ; Siqueira, 
Rodrigo ; Deucher, Alexander 
; Koenig, Christian ; Pan, 
Xinhui ; dan...@ffwll.ch ; Lakha, 
Bhawanpreet ; amd-gfx@lists.freedesktop.org 
; dri-de...@lists.freedesktop.org 
; linux-ker...@vger.kernel.org 
; linux-kernel-ment...@lists.linuxfoundation.org 

Subject: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in 
amdgpu_dm_setup_replay

Since `pr_config` is not initialized after its declaration, the
following operations with `replay_enable_option` may be performed
when `replay_enable_option` is holding junk values which could
possibly lead to undefined behaviour

```
...
pr_config.replay_enable_option |= pr_enable_option_static_screen;
...

if (!pr_config.replay_timing_sync_supported)
pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
...
```

This patch initializes `pr_config` after its declaration to ensure that
it doesn't contain junk data, and to prevent undefined behaviour

Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
Signed-off-by: Yuran Pereira 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
index 32d3086c4cb7..40526507f50b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
@@ -23,6 +23,7 @@
  *
  */

+#include 
 #include "amdgpu_dm_replay.h"
 #include "dc.h"
 #include "dm_helpers.h"
@@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct 
amdgpu_dm_connector *ac
 struct replay_config pr_config;
 union replay_debug_flags *debug_flags = NULL;

+	memset(&pr_config, 0, sizeof(pr_config));
+
 // For eDP, if Replay is supported, return true to skip checks
 if (link->replay_settings.config.replay_supported)
 return true;
--
2.25.1



[PATCH 3/3] drm/amdgpu: add a retry for IP discovery init

2023-10-27 Thread Alex Deucher
AMD dGPUs have integrated FW that runs as soon as the
device gets power and initializes the board (determines
the amount of memory, provides configuration details to
the driver, etc.).  For direct PCIe attached cards this
happens as soon as power is applied and normally completes
well before the OS has even started loading.  However, with
hotpluggable ports like USB4, the driver needs to wait for
this to complete before initializing the device.

This normally takes 60-100ms, but could take longer on
some older boards periodically due to memory training.

Retry for up to a second.  In the non-hotplug case, there
should be no change in behavior and this should complete
on the first try.

v2: adjust test criteria
v3: adjust checks for the masks, only enable on removable devices
v4: skip bif_fb_en check

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 5f9d75900bfa..9ca4d89352d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -99,6 +99,7 @@
 MODULE_FIRMWARE(FIRMWARE_IP_DISCOVERY);
 
 #define mmRCC_CONFIG_MEMSIZE   0xde3
#define mmMP0_SMN_C2PMSG_33	0x16061
 #define mmMM_INDEX 0x0
 #define mmMM_INDEX_HI  0x6
 #define mmMM_DATA  0x1
@@ -239,8 +240,26 @@ static int amdgpu_discovery_read_binary_from_sysmem(struct 
amdgpu_device *adev,
 static int amdgpu_discovery_read_binary_from_mem(struct amdgpu_device *adev,
 uint8_t *binary)
 {
-   uint64_t vram_size = (uint64_t)RREG32(mmRCC_CONFIG_MEMSIZE) << 20;
-   int ret = 0;
+   uint64_t vram_size;
+   u32 msg;
+   int i, ret = 0;
+
+   /* It can take up to a second for IFWI init to complete on some dGPUs,
+    * but generally it should be in the 60-100ms range.  Normally this
+    * starts as soon as the device gets power so by the time the OS loads
+    * this has long completed.  However, when a card is hotplugged via
+    * e.g., USB4, we need to wait for this to complete.  Once the C2PMSG
+    * is updated, we can continue.
+    */
+   if (dev_is_removable(&adev->pdev->dev)) {
+   for (i = 0; i < 1000; i++) {
+   msg = RREG32(mmMP0_SMN_C2PMSG_33);
+   if (msg & 0x8000)
+   break;
+   msleep(1);
+   }
+   }
+   vram_size = (uint64_t)RREG32(mmRCC_CONFIG_MEMSIZE) << 20;
 
if (vram_size) {
uint64_t pos = vram_size - DISCOVERY_TMR_OFFSET;
-- 
2.41.0



[PATCH 1/3] drm/amdgpu: don't use ATRM for external devices

2023-10-27 Thread Alex Deucher
The ATRM ACPI method is for fetching the dGPU vbios rom
image on laptops and all-in-one systems.  It should not be
used for external add in cards.  If the dGPU is thunderbolt
connected, don't try ATRM.

v2: pci_is_thunderbolt_attached only works for Intel.  Use
pdev->external_facing instead.
v3: dev_is_removable() seems to be what we want

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
index 38ccec913f00..f3a09ecb7699 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
@@ -29,6 +29,7 @@
 #include "amdgpu.h"
 #include "atom.h"
 
+#include 
 #include 
 #include 
 #include 
@@ -287,6 +288,10 @@ static bool amdgpu_atrm_get_bios(struct amdgpu_device 
*adev)
if (adev->flags & AMD_IS_APU)
return false;
 
+   /* ATRM is for on-platform devices only */
+   if (dev_is_removable(&adev->pdev->dev))
+   return false;
+
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != 
NULL) {
dhandle = ACPI_HANDLE(&pdev->dev);
if (!dhandle)
-- 
2.41.0



[PATCH 2/3] drm/amdgpu: don't use pci_is_thunderbolt_attached()

2023-10-27 Thread Alex Deucher
It's only valid on Intel systems with the Intel VSEC.
Use dev_is_removable() instead.  This should do the right
thing regardless of the platform.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 
 drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 5 +++--
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2381de831271..5c90080e93ba 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2223,7 +2224,6 @@ static int amdgpu_device_parse_gpu_info_fw(struct 
amdgpu_device *adev)
  */
 static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
 {
-   struct drm_device *dev = adev_to_drm(adev);
struct pci_dev *parent;
int i, r;
bool total;
@@ -2294,7 +2294,7 @@ static int amdgpu_device_ip_early_init(struct 
amdgpu_device *adev)
(amdgpu_is_atpx_hybrid() ||
 amdgpu_has_atpx_dgpu_power_cntl()) &&
((adev->flags & AMD_IS_APU) == 0) &&
-   !pci_is_thunderbolt_attached(to_pci_dev(dev->dev)))
+   !dev_is_removable(&adev->pdev->dev))
adev->flags |= AMD_IS_PX;
 
if (!(adev->flags & AMD_IS_APU)) {
@@ -4138,7 +4138,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 
px = amdgpu_device_supports_px(ddev);
 
-   if (px || (!pci_is_thunderbolt_attached(adev->pdev) &&
+   if (px || (!dev_is_removable(&adev->pdev->dev) &&
apple_gmux_detect(NULL, NULL)))
vga_switcheroo_register_client(adev->pdev,
   &amdgpu_switcheroo_ops, px);
@@ -4288,7 +4288,7 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
 
px = amdgpu_device_supports_px(adev_to_drm(adev));
 
-   if (px || (!pci_is_thunderbolt_attached(adev->pdev) &&
+   if (px || (!dev_is_removable(&adev->pdev->dev) &&
apple_gmux_detect(NULL, NULL)))
vga_switcheroo_unregister_client(adev->pdev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
index e523627cfe25..df218d5ca775 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
@@ -28,6 +28,7 @@
 #include "nbio/nbio_2_3_offset.h"
 #include "nbio/nbio_2_3_sh_mask.h"
 #include 
+#include 
 #include 
 
 #define smnPCIE_CONFIG_CNTL0x11180044
@@ -361,7 +362,7 @@ static void nbio_v2_3_enable_aspm(struct amdgpu_device 
*adev,
 
data |= NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT << 
PCIE_LC_CNTL__LC_L0S_INACTIVITY__SHIFT;
 
-   if (pci_is_thunderbolt_attached(adev->pdev))
+   if (dev_is_removable(&adev->pdev->dev))
data |= NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT  << 
PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
else
data |= NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT << 
PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
@@ -480,7 +481,7 @@ static void nbio_v2_3_program_aspm(struct amdgpu_device 
*adev)
 
def = data = RREG32_PCIE(smnPCIE_LC_CNTL);
data |= NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT << 
PCIE_LC_CNTL__LC_L0S_INACTIVITY__SHIFT;
-   if (pci_is_thunderbolt_attached(adev->pdev))
+   if (dev_is_removable(&adev->pdev->dev))
data |= NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT  << 
PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
else
data |= NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT << 
PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
-- 
2.41.0



Re: [PATCH 2/2] drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo

2023-10-27 Thread Alex Deucher
Applied the series.  Thanks!

Alex

On Thu, Oct 26, 2023 at 6:43 PM Umio Yasuno
 wrote:
>
> Remove unused variables from amdgpu_show_fdinfo
>
> Signed-off-by: Umio Yasuno 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 6 --
>  1 file changed, 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> index e9b5d1903..b960ca7ba 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> @@ -55,21 +55,15 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
>
>  void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
>  {
> -   struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
> struct amdgpu_fpriv *fpriv = file->driver_priv;
> struct amdgpu_vm *vm = &fpriv->vm;
>
> struct amdgpu_mem_stats stats;
> ktime_t usage[AMDGPU_HW_IP_NUM];
> -   uint32_t bus, dev, fn, domain;
> unsigned int hw_ip;
> int ret;
>
> memset(&stats, 0, sizeof(stats));
> -   bus = adev->pdev->bus->number;
> -   domain = pci_domain_nr(adev->pdev->bus);
> -   dev = PCI_SLOT(adev->pdev->devfn);
> -   fn = PCI_FUNC(adev->pdev->devfn);
>
> ret = amdgpu_bo_reserve(vm->root.bo, false);
> if (ret)
> --
> 2.42.0
>
>


RE: [PATCH 2/2] drm/amdgpu: add RAS reset/query operations for XGMI v6_4

2023-10-27 Thread Zhang, Hawking
[AMD Official Use Only - General]

Series is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: amd-gfx  On Behalf Of Tao Zhou
Sent: Friday, October 27, 2023 19:33
To: amd-gfx@lists.freedesktop.org
Cc: Zhou1, Tao 
Subject: [PATCH 2/2] drm/amdgpu: add RAS reset/query operations for XGMI v6_4

Reset/query RAS error status and count.

v2: use XGMI IP version instead of WAFL version.

Signed-off-by: Tao Zhou 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 46 ++--
 1 file changed, 43 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 2b7dc490ba6b..0533f873001b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -103,6 +103,16 @@ static const int 
walf_pcs_err_noncorrectable_mask_reg_aldebaran[] = {
smnPCS_GOPX1_PCS_ERROR_NONCORRECTABLE_MASK + 0x10  };

+static const int xgmi3x16_pcs_err_status_reg_v6_4[] = {
+   smnPCS_XGMI3X16_PCS_ERROR_STATUS,
+   smnPCS_XGMI3X16_PCS_ERROR_STATUS + 0x10 };
+
+static const int xgmi3x16_pcs_err_noncorrectable_mask_reg_v6_4[] = {
+   smnPCS_XGMI3X16_PCS_ERROR_NONCORRECTABLE_MASK,
+   smnPCS_XGMI3X16_PCS_ERROR_NONCORRECTABLE_MASK + 0x10 };
+
 static const struct amdgpu_pcs_ras_field xgmi_pcs_ras_fields[] = {
{"XGMI PCS DataLossErr",
 SOC15_REG_FIELD(XGMI0_PCS_GOPX16_PCS_ERROR_STATUS, DataLossErr)},
@@ -958,6 +968,16 @@ static void amdgpu_xgmi_reset_ras_error_count(struct amdgpu_device *adev)
default:
break;
}
+
+   switch (amdgpu_ip_version(adev, XGMI_HWIP, 0)) {
+   case IP_VERSION(6, 4, 0):
+   for (i = 0; i < ARRAY_SIZE(xgmi3x16_pcs_err_status_reg_v6_4); 
i++)
+   pcs_clear_status(adev,
+   xgmi3x16_pcs_err_status_reg_v6_4[i]);
+   break;
+   default:
+   break;
+   }
 }

 static int amdgpu_xgmi_query_pcs_error_status(struct amdgpu_device *adev,
@@ -975,7 +995,9 @@ static int amdgpu_xgmi_query_pcs_error_status(struct amdgpu_device *adev,

if (is_xgmi_pcs) {
if (amdgpu_ip_version(adev, XGMI_HWIP, 0) ==
-   IP_VERSION(6, 1, 0)) {
+   IP_VERSION(6, 1, 0) ||
+   amdgpu_ip_version(adev, XGMI_HWIP, 0) ==
+   IP_VERSION(6, 4, 0)) {
pcs_ras_fields = &xgmi3x16_pcs_ras_fields[0];
field_array_size = ARRAY_SIZE(xgmi3x16_pcs_ras_fields);
} else {
@@ -1013,7 +1035,7 @@ static void amdgpu_xgmi_query_ras_error_count(struct 
amdgpu_device *adev,
 void *ras_error_status)
 {
struct ras_err_data *err_data = (struct ras_err_data *)ras_error_status;
-   int i;
+   int i, supported = 1;
uint32_t data, mask_data = 0;
uint32_t ue_cnt = 0, ce_cnt = 0;

@@ -1077,7 +1099,25 @@ static void amdgpu_xgmi_query_ras_error_count(struct 
amdgpu_device *adev,
}
break;
default:
-   dev_warn(adev->dev, "XGMI RAS error query not supported");
+   supported = 0;
+   break;
+   }
+
+   switch (amdgpu_ip_version(adev, XGMI_HWIP, 0)) {
+   case IP_VERSION(6, 4, 0):
+   /* check xgmi3x16 pcs error */
+   for (i = 0; i < ARRAY_SIZE(xgmi3x16_pcs_err_status_reg_v6_4); 
i++) {
+   data = RREG32_PCIE(xgmi3x16_pcs_err_status_reg_v6_4[i]);
+   mask_data =
+   
RREG32_PCIE(xgmi3x16_pcs_err_noncorrectable_mask_reg_v6_4[i]);
+   if (data)
+			amdgpu_xgmi_query_pcs_error_status(adev, data,
+					mask_data, &ue_cnt, &ce_cnt, true, true);
+   }
+   break;
+   default:
+   if (!supported)
+			dev_warn(adev->dev, "XGMI RAS error query not supported");
break;
}

--
2.35.1



Re: [PATCH] drm/amdgpu: add unmap latency when gfx11 set kiq resources

2023-10-27 Thread Deucher, Alexander
[Public]

Acked-by: Alex Deucher 

From: Tong Liu01 
Sent: Thursday, October 26, 2023 11:41 PM
To: amd-gfx@lists.freedesktop.org 
Cc: Evan Quan ; Chen, Horace ; Tuikov, 
Luben ; Koenig, Christian ; 
Deucher, Alexander ; Xiao, Jack ; 
Zhang, Hawking ; Liu, Monk ; Xu, 
Feifei ; Chang, HaiJun ; Liu01, Tong 
(Esther) 
Subject: [PATCH] drm/amdgpu: add unmap latency when gfx11 set kiq resources

[why]
If driver does not set unmap latency for KIQ, the default value of KIQ
unmap latency is zero. When do unmap queue, KIQ will return that almost
immediately after receiving the unmap command. As a result, the queue status
may be saved to the MQD incorrectly or lost in some cases.

[how]
Set unmap latency when do kiq set resources. The unmap latency is set to
be 1 second that is synchronized with Windows driver.

Signed-off-by: Tong Liu01 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index fd22943685f7..7aef7a3a340f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -155,6 +155,7 @@ static void gfx11_kiq_set_resources(struct amdgpu_ring 
*kiq_ring, uint64_t queue
 {
 amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_SET_RESOURCES, 6));
 amdgpu_ring_write(kiq_ring, PACKET3_SET_RESOURCES_VMID_MASK(0) |
+			  PACKET3_SET_RESOURCES_UNMAP_LATENTY(0xa) | /* unmap_latency: 0xa (~ 1s) */
			  PACKET3_SET_RESOURCES_QUEUE_TYPE(0));	/* vmid_mask:0 queue_type:0 (KIQ) */
 amdgpu_ring_write(kiq_ring, lower_32_bits(queue_mask));	/* queue mask lo */
 amdgpu_ring_write(kiq_ring, upper_32_bits(queue_mask));	/* queue mask hi */
--
2.34.1



[PATCH 2/2] drm/amdgpu: add RAS reset/query operations for XGMI v6_4

2023-10-27 Thread Tao Zhou
Reset/query RAS error status and count.

v2: use XGMI IP version instead of WAFL version.

Signed-off-by: Tao Zhou 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 46 ++--
 1 file changed, 43 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 2b7dc490ba6b..0533f873001b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -103,6 +103,16 @@ static const int 
walf_pcs_err_noncorrectable_mask_reg_aldebaran[] = {
smnPCS_GOPX1_PCS_ERROR_NONCORRECTABLE_MASK + 0x10
 };
 
+static const int xgmi3x16_pcs_err_status_reg_v6_4[] = {
+   smnPCS_XGMI3X16_PCS_ERROR_STATUS,
+   smnPCS_XGMI3X16_PCS_ERROR_STATUS + 0x10
+};
+
+static const int xgmi3x16_pcs_err_noncorrectable_mask_reg_v6_4[] = {
+   smnPCS_XGMI3X16_PCS_ERROR_NONCORRECTABLE_MASK,
+   smnPCS_XGMI3X16_PCS_ERROR_NONCORRECTABLE_MASK + 0x10
+};
+
 static const struct amdgpu_pcs_ras_field xgmi_pcs_ras_fields[] = {
{"XGMI PCS DataLossErr",
 SOC15_REG_FIELD(XGMI0_PCS_GOPX16_PCS_ERROR_STATUS, DataLossErr)},
@@ -958,6 +968,16 @@ static void amdgpu_xgmi_reset_ras_error_count(struct 
amdgpu_device *adev)
default:
break;
}
+
+   switch (amdgpu_ip_version(adev, XGMI_HWIP, 0)) {
+   case IP_VERSION(6, 4, 0):
+   for (i = 0; i < ARRAY_SIZE(xgmi3x16_pcs_err_status_reg_v6_4); 
i++)
+   pcs_clear_status(adev,
+   xgmi3x16_pcs_err_status_reg_v6_4[i]);
+   break;
+   default:
+   break;
+   }
 }
 
 static int amdgpu_xgmi_query_pcs_error_status(struct amdgpu_device *adev,
@@ -975,7 +995,9 @@ static int amdgpu_xgmi_query_pcs_error_status(struct 
amdgpu_device *adev,
 
if (is_xgmi_pcs) {
if (amdgpu_ip_version(adev, XGMI_HWIP, 0) ==
-   IP_VERSION(6, 1, 0)) {
+   IP_VERSION(6, 1, 0) ||
+   amdgpu_ip_version(adev, XGMI_HWIP, 0) ==
+   IP_VERSION(6, 4, 0)) {
pcs_ras_fields = &xgmi3x16_pcs_ras_fields[0];
field_array_size = ARRAY_SIZE(xgmi3x16_pcs_ras_fields);
} else {
@@ -1013,7 +1035,7 @@ static void amdgpu_xgmi_query_ras_error_count(struct 
amdgpu_device *adev,
 void *ras_error_status)
 {
struct ras_err_data *err_data = (struct ras_err_data *)ras_error_status;
-   int i;
+   int i, supported = 1;
uint32_t data, mask_data = 0;
uint32_t ue_cnt = 0, ce_cnt = 0;
 
@@ -1077,7 +1099,25 @@ static void amdgpu_xgmi_query_ras_error_count(struct 
amdgpu_device *adev,
}
break;
default:
-   dev_warn(adev->dev, "XGMI RAS error query not supported");
+   supported = 0;
+   break;
+   }
+
+   switch (amdgpu_ip_version(adev, XGMI_HWIP, 0)) {
+   case IP_VERSION(6, 4, 0):
+   /* check xgmi3x16 pcs error */
+   for (i = 0; i < ARRAY_SIZE(xgmi3x16_pcs_err_status_reg_v6_4); 
i++) {
+   data = RREG32_PCIE(xgmi3x16_pcs_err_status_reg_v6_4[i]);
+   mask_data =
+   
RREG32_PCIE(xgmi3x16_pcs_err_noncorrectable_mask_reg_v6_4[i]);
+   if (data)
+			amdgpu_xgmi_query_pcs_error_status(adev, data,
+					mask_data, &ue_cnt, &ce_cnt, true, true);
+   }
+   break;
+   default:
+   if (!supported)
+			dev_warn(adev->dev, "XGMI RAS error query not supported");
break;
}
 
-- 
2.35.1



[PATCH 1/2] drm/amdgpu: set XGMI IP version manually for v6_4

2023-10-27 Thread Tao Zhou
The version can't be queried from discovery table.

Signed-off-by: Tao Zhou 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 0b711bac2092..d22f22d706e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -2464,6 +2464,9 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device 
*adev)
if (amdgpu_ip_version(adev, XGMI_HWIP, 0) == IP_VERSION(4, 8, 0))
adev->gmc.xgmi.supported = true;
 
+   if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3))
+   adev->ip_versions[XGMI_HWIP][0] = IP_VERSION(6, 4, 0);
+
/* set NBIO version */
switch (amdgpu_ip_version(adev, NBIO_HWIP, 0)) {
case IP_VERSION(6, 1, 0):
-- 
2.35.1



RE: [PATCH] drm/amdgpu: Drop deferred error in uncorrectable error check

2023-10-27 Thread Zhang, Hawking
[AMD Official Use Only - General]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: amd-gfx  On Behalf Of Candice Li
Sent: Friday, October 27, 2023 18:30
To: amd-gfx@lists.freedesktop.org
Cc: Li, Candice 
Subject: [PATCH] drm/amdgpu: Drop deferred error in uncorrectable error check

Drop checking deferred error which can be handled by poison consumption.

Signed-off-by: Candice Li 
---
 drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c 
b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
index 743d2f68b09020..770b4b4e313838 100644
--- a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
@@ -91,8 +91,7 @@ static void umc_v12_0_reset_error_count(struct amdgpu_device *adev)
 static bool umc_v12_0_is_uncorrectable_error(uint64_t mc_umc_status)
 {
	return ((REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1) &&
-		(REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Deferred) == 1 ||
-		REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, PCC) == 1 ||
+		(REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, PCC) == 1 ||
		REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, UC) == 1 ||
		REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, TCC) == 1));
 }
--
2.25.1



[PATCH] drm/amdgpu: Drop deferred error in uncorrectable error check

2023-10-27 Thread Candice Li
Drop checking deferred error which can be handled by poison
consumption.

Signed-off-by: Candice Li 
---
 drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c 
b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
index 743d2f68b09020..770b4b4e313838 100644
--- a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
@@ -91,8 +91,7 @@ static void umc_v12_0_reset_error_count(struct amdgpu_device 
*adev)
 static bool umc_v12_0_is_uncorrectable_error(uint64_t mc_umc_status)
 {
return ((REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) 
== 1) &&
-   (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, 
Deferred) == 1 ||
-   REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, PCC) 
== 1 ||
+   (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, PCC) 
== 1 ||
REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, UC) 
== 1 ||
REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, TCC) 
== 1));
 }
-- 
2.25.1



[PATCH] drm/amdgpu: fix check order ras->in_recovery is earlier than ras feature

2023-10-27 Thread Bob Zhou
The ras->in_recovery check runs before the RAS feature check, which causes
the NULL pointer dereference below. Update the check order to fix it.

BUG: kernel NULL pointer dereference, address: 00e8
RIP: 0010:amdgpu_ras_reset_error_count+0xf6/0x190 [amdgpu]
Call Trace:
 
 ? show_regs+0x72/0x90
 ? __die+0x25/0x80
 ? page_fault_oops+0x79/0x190
 ? do_user_addr_fault+0x30c/0x640
 ? __wake_up_klogd.part.0+0x40/0x70
 ? exc_page_fault+0x81/0x1b0
 ? asm_exc_page_fault+0x27/0x30
 ? amdgpu_ras_reset_error_count+0xf6/0x190 [amdgpu]
 ? __pfx_gmc_v9_0_late_init+0x10/0x10 [amdgpu]
 gmc_v9_0_late_init+0x97/0xe0 [amdgpu]

Fixes: be5c7eb10406 ("drm/amdgpu: bypass RAS error reset in some conditions")

Signed-off-by: Bob Zhou 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 303fbb6a48b6..3af50754800d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1229,15 +1229,15 @@ int amdgpu_ras_reset_error_count(struct amdgpu_device 
*adev,
return -EOPNOTSUPP;
}
 
+   if (!amdgpu_ras_is_supported(adev, block) ||
+   !amdgpu_ras_get_mca_debug_mode(adev))
+   return -EOPNOTSUPP;
+
/* skip ras error reset in gpu reset */
if ((amdgpu_in_reset(adev) || atomic_read(&ras->in_recovery)) &&
mca_funcs && mca_funcs->mca_set_debug_mode)
return -EOPNOTSUPP;
 
-   if (!amdgpu_ras_is_supported(adev, block) ||
-   !amdgpu_ras_get_mca_debug_mode(adev))
-   return -EOPNOTSUPP;
-
if (block_obj->hw_ops->reset_ras_error_count)
block_obj->hw_ops->reset_ras_error_count(adev);
 
-- 
2.34.1



Re: [PATCH 3/5] drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v3)

2023-10-27 Thread Lazar, Lijo




On 10/26/2023 2:22 AM, Victor Lu wrote:

amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0.

Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC
and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter.

v3: use W/RREG32_XCC to handle non-kiq case

v2: define amdgpu_device_xcc_wreg/rreg instead of changing parameters
 of amdgpu_device_wreg/rreg

Signed-off-by: Victor Lu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h   | 13 ++-
  .../drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c   |  2 +-
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 84 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   |  8 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |  4 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  4 +-
  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c   |  8 +-
  8 files changed, 107 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index a2e8c2b60857..09989ebb5da3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1168,11 +1168,18 @@ uint32_t amdgpu_device_rreg(struct amdgpu_device *adev,
uint32_t reg, uint32_t acc_flags);
  u32 amdgpu_device_indirect_rreg_ext(struct amdgpu_device *adev,
u64 reg_addr);
+uint32_t amdgpu_device_xcc_rreg(struct amdgpu_device *adev,
+   uint32_t reg, uint32_t acc_flags,
+   uint32_t xcc_id);
  void amdgpu_device_wreg(struct amdgpu_device *adev,
uint32_t reg, uint32_t v,
uint32_t acc_flags);
  void amdgpu_device_indirect_wreg_ext(struct amdgpu_device *adev,
 u64 reg_addr, u32 reg_data);
+void amdgpu_device_xcc_wreg(struct amdgpu_device *adev,
+   uint32_t reg, uint32_t v,
+   uint32_t acc_flags,
+   uint32_t xcc_id);
  void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
 uint32_t reg, uint32_t v, uint32_t xcc_id);
  void amdgpu_mm_wreg8(struct amdgpu_device *adev, uint32_t offset, uint8_t 
value);
@@ -1213,8 +1220,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev);
  #define RREG32_NO_KIQ(reg) amdgpu_device_rreg(adev, (reg), AMDGPU_REGS_NO_KIQ)
  #define WREG32_NO_KIQ(reg, v) amdgpu_device_wreg(adev, (reg), (v), 
AMDGPU_REGS_NO_KIQ)
  
-#define RREG32_KIQ(reg) amdgpu_kiq_rreg(adev, (reg))
-#define WREG32_KIQ(reg, v) amdgpu_kiq_wreg(adev, (reg), (v))
+#define RREG32_KIQ(reg) amdgpu_kiq_rreg(adev, (reg), 0)
+#define WREG32_KIQ(reg, v) amdgpu_kiq_wreg(adev, (reg), (v), 0)
  
  #define RREG8(reg) amdgpu_mm_rreg8(adev, (reg))

  #define WREG8(reg, v) amdgpu_mm_wreg8(adev, (reg), (v))
@@ -1224,6 +1231,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev);
  #define WREG32(reg, v) amdgpu_device_wreg(adev, (reg), (v), 0)
  #define REG_SET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK)
  #define REG_GET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK)
+#define RREG32_XCC(reg, flag, inst) amdgpu_device_xcc_rreg(adev, (reg), flag, 
inst)
+#define WREG32_XCC(reg, v, flag, inst) amdgpu_device_xcc_wreg(adev, (reg), 
(v), flag, inst)
  #define RREG32_PCIE(reg) adev->pcie_rreg(adev, (reg))
  #define WREG32_PCIE(reg, v) adev->pcie_wreg(adev, (reg), (v))
  #define RREG32_PCIE_PORT(reg) adev->pciep_rreg(adev, (reg))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
index 490c8f5ddb60..c94df54e2657 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c
@@ -300,7 +300,7 @@ static int kgd_gfx_v9_4_3_hqd_load(struct amdgpu_device 
*adev, void *mqd,
hqd_end = SOC15_REG_OFFSET(GC, GET_INST(GC, inst), 
regCP_HQD_AQL_DISPATCH_ID_HI);
  
  	for (reg = hqd_base; reg <= hqd_end; reg++)

-   WREG32_RLC(reg, mqd_hqd[reg - hqd_base]);
+   WREG32_XCC(reg, mqd_hqd[reg - hqd_base], AMDGPU_REGS_RLC, inst);
  
  
  	/* Activate doorbell logic before triggering WPTR poll. */

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index 51011e8ee90d..47c8c334c779 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -239,7 +239,7 @@ int kgd_gfx_v9_hqd_load(struct amdgpu_device *adev, void 
*mqd,
  
  	for (reg = hqd_base;

 reg <= SOC15_REG_OFFSET(GC, GET_INST(GC, inst), 
mmCP_HQD_PQ_WPTR_HI); reg++)
-   WREG32_RLC(reg, mqd_hqd[reg - hqd_base]);
+   WREG32_XCC(reg, mqd_hqd[reg - hqd_base], AMDGPU_REGS_RLC, inst);
  
  
  	/* Activate doorbell logic before triggering WPTR poll. */

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 

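The patch above keeps the old single-instance accessors working by routing them through new xcc-aware variants with a hard-coded instance 0. A minimal userspace sketch of that wrapping pattern (all names and the register file below are hypothetical, not amdgpu code):

```c
#include <assert.h>
#include <stdint.h>

#define NUM_XCC 4
#define NUM_REGS 16

static uint32_t regfile[NUM_XCC][NUM_REGS]; /* stand-in for per-XCC MMIO */

static void xcc_wreg(uint32_t reg, uint32_t v, uint32_t xcc_id)
{
	regfile[xcc_id][reg] = v;
}

static uint32_t xcc_rreg(uint32_t reg, uint32_t xcc_id)
{
	return regfile[xcc_id][reg];
}

/* Legacy macros keep their old signature by defaulting to instance 0,
 * mirroring how RREG32_KIQ/WREG32_KIQ now pass a literal 0. */
#define WREG32_T(reg, v)	xcc_wreg((reg), (v), 0)
#define RREG32_T(reg)		xcc_rreg((reg), 0)
#define WREG32_XCC_T(reg, v, i)	xcc_wreg((reg), (v), (i))
#define RREG32_XCC_T(reg, i)	xcc_rreg((reg), (i))
```

Existing call sites stay untouched while per-instance callers (e.g. the HQD load loops above) can target a specific XCC.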
Re: [PATCH 2/5] drm/amdgpu: Add xcc instance parameter to *REG32_SOC15_IP_NO_KIQ (v2)

2023-10-27 Thread Lazar, Lijo




On 10/26/2023 2:22 AM, Victor Lu wrote:

The WREG32/RREG32_SOC15_IP_NO_KIQ call is using XCC0's RLCG interface
when programming other XCCs.

Add xcc instance parameter to them.

v2: rebase

Signed-off-by: Victor Lu 
---
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 16 
  drivers/gpu/drm/amd/amdgpu/soc15_common.h |  6 +++---
  2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index fee3141bb607..b7b1b04b66cb 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -856,9 +856,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
for (j = 0; j < adev->usec_timeout; j++) {
/* a read return value of 1 means semaphore acquire */
if (vmhub >= AMDGPU_MMHUB0(0))
-   tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, sem);
+   tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, sem, vmhub 
- AMDGPU_MMHUB0(0));


The parameter expects xcc id, but this doesn't return that. Only a max 
of 4 MMHUBs are there. To get corresponding xcc, it needs more 
calculation. If xcc doesn't matter for MMHUB registers, use the first xcc.


Thanks,
Lijo

else
-   tmp = RREG32_SOC15_IP_NO_KIQ(GC, sem);
+   tmp = RREG32_SOC15_IP_NO_KIQ(GC, sem, vmhub);
if (tmp & 0x1)
break;
udelay(1);
@@ -869,9 +869,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
}
  
  	if (vmhub >= AMDGPU_MMHUB0(0))

-   WREG32_SOC15_IP_NO_KIQ(MMHUB, req, inv_req);
+   WREG32_SOC15_IP_NO_KIQ(MMHUB, req, inv_req, vmhub - 
AMDGPU_MMHUB0(0));
else
-   WREG32_SOC15_IP_NO_KIQ(GC, req, inv_req);
+   WREG32_SOC15_IP_NO_KIQ(GC, req, inv_req, vmhub);
  
  	/*

 * Issue a dummy read to wait for the ACK register to
@@ -884,9 +884,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
  
  	for (j = 0; j < adev->usec_timeout; j++) {

if (vmhub >= AMDGPU_MMHUB0(0))
-   tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, ack);
+   tmp = RREG32_SOC15_IP_NO_KIQ(MMHUB, ack, vmhub - 
AMDGPU_MMHUB0(0));
else
-   tmp = RREG32_SOC15_IP_NO_KIQ(GC, ack);
+   tmp = RREG32_SOC15_IP_NO_KIQ(GC, ack, vmhub);
if (tmp & (1 << vmid))
break;
udelay(1);
@@ -899,9 +899,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
 * write with 0 means semaphore release
 */
if (vmhub >= AMDGPU_MMHUB0(0))
-   WREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0);
+   WREG32_SOC15_IP_NO_KIQ(MMHUB, sem, 0, vmhub - 
AMDGPU_MMHUB0(0));
else
-   WREG32_SOC15_IP_NO_KIQ(GC, sem, 0);
+   WREG32_SOC15_IP_NO_KIQ(GC, sem, 0, vmhub);
}
  
  	spin_unlock(&adev->gmc.invalidate_lock);

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h 
b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
index da683afa0222..c75e9cd5c98b 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h
+++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
@@ -69,7 +69,7 @@
  
  #define RREG32_SOC15_IP(ip, reg) __RREG32_SOC15_RLC__(reg, 0, ip##_HWIP, 0)
  
-#define RREG32_SOC15_IP_NO_KIQ(ip, reg) __RREG32_SOC15_RLC__(reg, AMDGPU_REGS_NO_KIQ, ip##_HWIP, 0)

+#define RREG32_SOC15_IP_NO_KIQ(ip, reg, inst) __RREG32_SOC15_RLC__(reg, 
AMDGPU_REGS_NO_KIQ, ip##_HWIP, inst)
  
  #define RREG32_SOC15_NO_KIQ(ip, inst, reg) \

__RREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] 
+ reg, \
@@ -86,8 +86,8 @@
  #define WREG32_SOC15_IP(ip, reg, value) \
 __WREG32_SOC15_RLC__(reg, value, 0, ip##_HWIP, 0)
  
-#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value) \

-__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP, 0)
+#define WREG32_SOC15_IP_NO_KIQ(ip, reg, value, inst) \
+__WREG32_SOC15_RLC__(reg, value, AMDGPU_REGS_NO_KIQ, ip##_HWIP, inst)
  
  #define WREG32_SOC15_NO_KIQ(ip, inst, reg, value) \

__WREG32_SOC15_RLC__(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] 
+ reg, \

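Lijo's point above is that `vmhub - AMDGPU_MMHUB0(0)` yields an MMHUB ordinal, not an xcc id, while the GC hub index does double as the xcc id. A toy sketch of the two index spaces and the reviewer's suggested fallback (constants here are made up; the real encodings live elsewhere in the driver):

```c
#include <assert.h>

/* Hypothetical layout: GC vmhubs first, MMHUB vmhubs after them. */
#define GFXHUB_START 0	/* GC hub i corresponds to XCC i */
#define MMHUB0_START 8	/* MMHUB hubs have no 1:1 XCC mapping */

static int vmhub_to_xcc(int vmhub)
{
	if (vmhub >= MMHUB0_START)
		return 0; /* per the review: just use the first XCC */
	return vmhub - GFXHUB_START; /* GC hub index doubles as xcc_id */
}
```

This encodes the reviewer's suggestion only; the actual driver mapping would need the extra calculation Lijo mentions.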

[PATCH] drm/amdgpu set doorbell range when gpu recovery in sriov environment

2023-10-27 Thread Lin . Cao
GFX doorbell range should be set after flr otherwise the GFX doorbell
range will overlap with MEC.

Signed-off-by: Lin.Cao 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index d9ccacd06fba..9074ced34277 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -6467,6 +6467,12 @@ static int gfx_v10_0_gfx_init_queue(struct amdgpu_ring 
*ring)
if (adev->gfx.me.mqd_backup[mqd_idx])
memcpy(adev->gfx.me.mqd_backup[mqd_idx], mqd, 
sizeof(*mqd));
} else {
+   if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev)) {
+   mutex_lock(&adev->srbm_mutex);
+   if (ring->doorbell_index == 
adev->doorbell_index.gfx_ring0 << 1)
+   gfx_v10_0_cp_gfx_set_doorbell(adev, ring);
+   mutex_unlock(&adev->srbm_mutex);
+   }
/* restore mqd with the backup copy */
if (adev->gfx.me.mqd_backup[mqd_idx])
memcpy(mqd, adev->gfx.me.mqd_backup[mqd_idx], 
sizeof(*mqd));
-- 
2.25.1
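The `<< 1` in the patch's doorbell comparison reflects that the assigned doorbell slot is tracked in 64-bit (qword) units while `ring->doorbell_index` is a 32-bit (dword) offset. A minimal sketch of that conversion, under that assumption (values are illustrative, not real ASIC assignments):

```c
#include <assert.h>
#include <stdint.h>

/* Compare a ring's dword doorbell offset against an assigned qword slot. */
static int is_gfx_ring0(uint32_t ring_doorbell_dw, uint32_t gfx_ring0_qw)
{
	return ring_doorbell_dw == (gfx_ring0_qw << 1);
}
```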



Re: [PATCH] drm/amdgpu/gfx10,11: use memcpy_to/fromio for MQDs

2023-10-27 Thread Christian König

On 26.10.23 at 20:56, Alex Deucher wrote:

Since they were moved to VRAM, we need to use the IO
variants of memcpy.

Fixes: 1cfb4d612127 ("drm/amdgpu: put MQDs in VRAM")
Signed-off-by: Alex Deucher 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 12 ++--
  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 12 ++--
  2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 9032d7a24d7c..306252cd67fd 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -6457,11 +6457,11 @@ static int gfx_v10_0_gfx_init_queue(struct amdgpu_ring 
*ring)
nv_grbm_select(adev, 0, 0, 0, 0);
  	mutex_unlock(&adev->srbm_mutex);
if (adev->gfx.me.mqd_backup[mqd_idx])
-   memcpy(adev->gfx.me.mqd_backup[mqd_idx], mqd, 
sizeof(*mqd));
+   memcpy_fromio(adev->gfx.me.mqd_backup[mqd_idx], mqd, 
sizeof(*mqd));
} else {
/* restore mqd with the backup copy */
if (adev->gfx.me.mqd_backup[mqd_idx])
-   memcpy(mqd, adev->gfx.me.mqd_backup[mqd_idx], 
sizeof(*mqd));
+   memcpy_toio(mqd, adev->gfx.me.mqd_backup[mqd_idx], 
sizeof(*mqd));
/* reset the ring */
ring->wptr = 0;
*ring->wptr_cpu_addr = 0;
@@ -6735,7 +6735,7 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring 
*ring)
if (amdgpu_in_reset(adev)) { /* for GPU_RESET case */
/* reset MQD to a clean status */
if (adev->gfx.kiq[0].mqd_backup)
-   memcpy(mqd, adev->gfx.kiq[0].mqd_backup, sizeof(*mqd));
+   memcpy_toio(mqd, adev->gfx.kiq[0].mqd_backup, 
sizeof(*mqd));
  
  		/* reset ring buffer */

ring->wptr = 0;
@@ -6758,7 +6758,7 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring 
*ring)
  	mutex_unlock(&adev->srbm_mutex);
  
  		if (adev->gfx.kiq[0].mqd_backup)

-   memcpy(adev->gfx.kiq[0].mqd_backup, mqd, sizeof(*mqd));
+   memcpy_fromio(adev->gfx.kiq[0].mqd_backup, mqd, 
sizeof(*mqd));
}
  
  	return 0;

@@ -6779,11 +6779,11 @@ static int gfx_v10_0_kcq_init_queue(struct amdgpu_ring 
*ring)
  	mutex_unlock(&adev->srbm_mutex);
  
  		if (adev->gfx.mec.mqd_backup[mqd_idx])

-   memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, 
sizeof(*mqd));
+   memcpy_fromio(adev->gfx.mec.mqd_backup[mqd_idx], mqd, 
sizeof(*mqd));
} else {
/* restore MQD to a clean status */
if (adev->gfx.mec.mqd_backup[mqd_idx])
-   memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], 
sizeof(*mqd));
+   memcpy_toio(mqd, adev->gfx.mec.mqd_backup[mqd_idx], 
sizeof(*mqd));
/* reset ring buffer */
ring->wptr = 0;
atomic64_set((atomic64_t *)ring->wptr_cpu_addr, 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 762d7a19f1be..43d066bc5245 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -3684,11 +3684,11 @@ static int gfx_v11_0_gfx_init_queue(struct amdgpu_ring 
*ring)
soc21_grbm_select(adev, 0, 0, 0, 0);
  	mutex_unlock(&adev->srbm_mutex);
if (adev->gfx.me.mqd_backup[mqd_idx])
-   memcpy(adev->gfx.me.mqd_backup[mqd_idx], mqd, 
sizeof(*mqd));
+   memcpy_fromio(adev->gfx.me.mqd_backup[mqd_idx], mqd, 
sizeof(*mqd));
} else {
/* restore mqd with the backup copy */
if (adev->gfx.me.mqd_backup[mqd_idx])
-   memcpy(mqd, adev->gfx.me.mqd_backup[mqd_idx], 
sizeof(*mqd));
+   memcpy_toio(mqd, adev->gfx.me.mqd_backup[mqd_idx], 
sizeof(*mqd));
/* reset the ring */
ring->wptr = 0;
*ring->wptr_cpu_addr = 0;
@@ -3977,7 +3977,7 @@ static int gfx_v11_0_kiq_init_queue(struct amdgpu_ring 
*ring)
if (amdgpu_in_reset(adev)) { /* for GPU_RESET case */
/* reset MQD to a clean status */
if (adev->gfx.kiq[0].mqd_backup)
-   memcpy(mqd, adev->gfx.kiq[0].mqd_backup, sizeof(*mqd));
+   memcpy_toio(mqd, adev->gfx.kiq[0].mqd_backup, 
sizeof(*mqd));
  
  		/* reset ring buffer */

ring->wptr = 0;
@@ -4000,7 +4000,7 @@ static int gfx_v11_0_kiq_init_queue(struct amdgpu_ring 
*ring)
  	mutex_unlock(&adev->srbm_mutex);
  
  		if (adev->gfx.kiq[0].mqd_backup)

-   memcpy(adev->gfx.kiq[0].mqd_backup, mqd, sizeof(*mqd));
+   memcpy_fromio(adev->gfx.kiq[0].mqd_backup, mqd, 
sizeof(*mqd));
}
  
  	return 0;

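The fix above matters because I/O memory (the MQDs now live in VRAM) must be accessed with explicit, aligned, fixed-width loads and stores; a plain memcpy may use access widths or orderings the bus cannot handle. A userspace toy that copies in 32-bit units the way memcpy_toio conceptually does (this is a sketch of the idea, not the kernel implementation):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy to "I/O" memory one aligned 32-bit store at a time. */
static void toy_memcpy_toio(volatile uint32_t *dst, const uint32_t *src,
			    size_t bytes)
{
	size_t i;

	for (i = 0; i < bytes / 4; i++)
		dst[i] = src[i]; /* fixed-width store per word */
}
```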

RE: [PATCH] drm/amdgpu: use mode-2 reset for RAS poison consumption

2023-10-27 Thread Yang, Stanley
[AMD Official Use Only - General]

Reviewed-by: Stanley.Yang 

Regards,
Stanley
> -Original Message-
> From: amd-gfx  On Behalf Of Tao
> Zhou
> Sent: Friday, October 27, 2023 12:04 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhou1, Tao 
> Subject: [PATCH] drm/amdgpu: use mode-2 reset for RAS poison
> consumption
>
> Switch from mode-1 reset to mode-2 for poison consumption.
>
> Signed-off-by: Tao Zhou 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
> index f74347cc087a..d65e21914d8c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
> @@ -166,8 +166,12 @@ static int amdgpu_umc_do_page_retirement(struct
> amdgpu_device *adev,
>   }
>   }
>
> - if (reset)
> + if (reset) {
> + /* use mode-2 reset for poison consumption */
> + if (!entry)
> + con->gpu_reset_flags |=
> AMDGPU_RAS_GPU_RESET_MODE2_RESET;
>   amdgpu_ras_reset_gpu(adev);
> + }
>   }
>
>   kfree(err_data->err_addr);
> --
> 2.35.1
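The patch's control flow is simply: OR a mode flag into a shared state word before requesting the reset, and let the reset path honor it. A tiny sketch of that flag handshake (flag values are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

#define RESET_MODE2_FLAG 0x2u

/* Reset path: use mode-2 when it was requested, else fall back to mode-1. */
static int pick_reset_mode(uint32_t gpu_reset_flags)
{
	return (gpu_reset_flags & RESET_MODE2_FLAG) ? 2 : 1;
}
```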



Re: [PATCH] MAINTAINERS: Update the GPU Scheduler email

2023-10-27 Thread Christian König

On 26.10.23 at 21:32, Alex Deucher wrote:

On Thu, Oct 26, 2023 at 1:45 PM Luben Tuikov  wrote:

Update the GPU Scheduler maintainer email.

Cc: Alex Deucher 
Cc: Christian König 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Cc: AMD Graphics 
Cc: Direct Rendering Infrastructure - Development 

Signed-off-by: Luben Tuikov 

Acked-by: Alex Deucher 


Acked-by: Christian König 




---
  MAINTAINERS | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4452508bc1b040..f13e476ed8038b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7153,7 +7153,7 @@ F:Documentation/devicetree/bindings/display/xlnx/
  F: drivers/gpu/drm/xlnx/

  DRM GPU SCHEDULER
-M: Luben Tuikov 
+M: Luben Tuikov 
  L: dri-de...@lists.freedesktop.org
  S: Maintained
  T: git git://anongit.freedesktop.org/drm/drm-misc

base-commit: 56e449603f0ac580700621a356d35d5716a62ce5
--
2.42.0





Re: [PATCH 1/2] drm/amdgpu: Remove duplicate fdinfo fields

2023-10-27 Thread Christian König

On 27.10.23 at 00:05, Umio Yasuno wrote:

From: Rob Clark 

Some of the fields that are handled by drm_show_fdinfo() crept back in
when rebasing the patch.  Remove them again.

Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper")
Signed-off-by: Rob Clark 
Reviewed-by: 
Co-developed-by: Umio Yasuno 
Signed-off-by: Umio Yasuno 


Reviewed-by: Christian König  for the series.


---

This thread has been inactive for nearly 5 months, so I re-created the patch.

  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 3 ---
  1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 6038b5021..e9b5d1903 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -87,9 +87,6 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct 
drm_file *file)
 */
  
  	drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);

-   drm_printf(p, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   drm_printf(p, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   drm_printf(p, "drm-client-id:\t%llu\n", vm->immediate.fence_context);
drm_printf(p, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL);
drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL);
drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL);