[PATCH v2] drm/amdgpu: set error query ready after all IPs late init

2020-04-21 Thread Dennis Li
If set error query ready in amdgpu_ras_late_init, which will
cause some IP blocks aren't initialized, but their error query
is ready.

v2: change the prefix of title to "drm/amdgpu" and remove
the unnecessary "{}".

Change-Id: I5087527261cb1b462afd82ad7592cf1ef73b15bd
Signed-off-by: Dennis Li 

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 423eed223aa5..e37e0982cd46 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2218,6 +2218,8 @@ static int amdgpu_device_ip_late_init(struct 
amdgpu_device *adev)
adev->ip_blocks[i].status.late_initialized = true;
}
 
+   amdgpu_ras_set_error_query_ready(adev, true);
+
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 68b82f7b0b80..8b14aee370c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1921,10 +1921,8 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
}
 
/* in resume phase, no need to create ras fs node */
-   if (adev->in_suspend || adev->in_gpu_reset) {
-   amdgpu_ras_set_error_query_ready(adev, true);
+   if (adev->in_suspend || adev->in_gpu_reset)
return 0;
-   }
 
if (ih_info->cb) {
r = amdgpu_ras_interrupt_add_handler(adev, ih_info);
@@ -1936,8 +1934,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
if (r)
goto sysfs;
 
-   amdgpu_ras_set_error_query_ready(adev, true);
-
return 0;
 cleanup:
amdgpu_ras_sysfs_remove(adev, ras_block);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amd/amdgpu: set error query ready after all IPs late init

2020-04-21 Thread Zhang, Hawking
[AMD Public Use]

if (adev->in_suspend || adev->in_gpu_reset) {
-   amdgpu_ras_set_error_query_ready(adev, true);
return 0;
}
Please also remove the "{}". With that fixed, the patch is

Reviewed-by: Hawking Zhang 

Regards,
Hawking

-Original Message-
From: Chen, Guchun  
Sent: Wednesday, April 22, 2020 12:48
To: Li, Dennis ; amd-gfx@lists.freedesktop.org; Deucher, 
Alexander ; Zhou1, Tao ; Zhang, 
Hawking 
Cc: Li, Dennis 
Subject: RE: [PATCH] drm/amd/amdgpu: set error query ready after all IPs late 
init

[AMD Public Use]

Need to modify prefix of commit tile to 'drm/amdgpu'.

With that fixed, the patch is: Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Dennis Li  
Sent: Wednesday, April 22, 2020 12:31 PM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
; Zhou1, Tao ; Zhang, Hawking 
; Chen, Guchun 
Cc: Li, Dennis 
Subject: [PATCH] drm/amd/amdgpu: set error query ready after all IPs late init

If set error query ready in amdgpu_ras_late_init, which will cause some IP 
blocks aren't initialized, but their error query is ready.

Change-Id: I5087527261cb1b462afd82ad7592cf1ef73b15bd
Signed-off-by: Dennis Li 

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
old mode 100644
new mode 100755
index 423eed223aa5..e37e0982cd46
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2218,6 +2218,8 @@ static int amdgpu_device_ip_late_init(struct 
amdgpu_device *adev)
adev->ip_blocks[i].status.late_initialized = true;
}
 
+   amdgpu_ras_set_error_query_ready(adev, true);
+
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 68b82f7b0b80..060866d372a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1922,7 +1922,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
 
/* in resume phase, no need to create ras fs node */
if (adev->in_suspend || adev->in_gpu_reset) {
-   amdgpu_ras_set_error_query_ready(adev, true);
return 0;
}
 
@@ -1936,8 +1935,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
if (r)
goto sysfs;
 
-   amdgpu_ras_set_error_query_ready(adev, true);
-
return 0;
 cleanup:
amdgpu_ras_sysfs_remove(adev, ras_block);
--
2.17.1
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amd/amdgpu: set error query ready after all IPs late init

2020-04-21 Thread Chen, Guchun
[AMD Public Use]

Need to modify prefix of commit tile to 'drm/amdgpu'.

With that fixed, the patch is: Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Dennis Li  
Sent: Wednesday, April 22, 2020 12:31 PM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
; Zhou1, Tao ; Zhang, Hawking 
; Chen, Guchun 
Cc: Li, Dennis 
Subject: [PATCH] drm/amd/amdgpu: set error query ready after all IPs late init

If set error query ready in amdgpu_ras_late_init, which will cause some IP 
blocks aren't initialized, but their error query is ready.

Change-Id: I5087527261cb1b462afd82ad7592cf1ef73b15bd
Signed-off-by: Dennis Li 

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
old mode 100644
new mode 100755
index 423eed223aa5..e37e0982cd46
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2218,6 +2218,8 @@ static int amdgpu_device_ip_late_init(struct 
amdgpu_device *adev)
adev->ip_blocks[i].status.late_initialized = true;
}
 
+   amdgpu_ras_set_error_query_ready(adev, true);
+
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 68b82f7b0b80..060866d372a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1922,7 +1922,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
 
/* in resume phase, no need to create ras fs node */
if (adev->in_suspend || adev->in_gpu_reset) {
-   amdgpu_ras_set_error_query_ready(adev, true);
return 0;
}
 
@@ -1936,8 +1935,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
if (r)
goto sysfs;
 
-   amdgpu_ras_set_error_query_ready(adev, true);
-
return 0;
 cleanup:
amdgpu_ras_sysfs_remove(adev, ras_block);
--
2.17.1
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: refine kiq access register

2020-04-21 Thread Yintian Tao
According to the current kiq access register method,
there will be race condition when using KIQ to read
register if multiple clients want to read at same time
just like the expample below:
1. client-A start to read REG-0 throguh KIQ
2. client-A poll the seqno-0
3. client-B start to read REG-1 through KIQ
4. client-B poll the seqno-1
5. the kiq complete these two read operation
6. client-A to read the register at the wb buffer and
   get REG-1 value

And if there are multiple clients to frequently write
registers through KIQ which may raise the KIQ ring buffer
overwritten problem.

Therefore, allocate fixed number wb slot for rreg use
and limit the submit number which depends on the kiq
ring_size in order to prevent the overwritten problem.

Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   7 +-
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  12 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   | 129 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  13 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|   8 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |   8 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  34 +++--
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|  12 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  12 +-
 12 files changed, 211 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 4e1d4cfe7a9f..4530e0de4257 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -526,7 +526,7 @@ static inline void amdgpu_set_ib_value(struct 
amdgpu_cs_parser *p,
 /*
  * Writeback
  */
-#define AMDGPU_MAX_WB 128  /* Reserve at most 128 WB slots for 
amdgpu-owned rings. */
+#define AMDGPU_MAX_WB 256  /* Reserve at most 256 WB slots for 
amdgpu-owned rings. */
 
 struct amdgpu_wb {
struct amdgpu_bo*wb_obj;
@@ -1028,6 +1028,11 @@ bool amdgpu_device_has_dc_support(struct amdgpu_device 
*adev);
 
 int emu_soc_asic_init(struct amdgpu_device *adev);
 
+int amdgpu_gfx_kiq_lock(struct amdgpu_kiq *kiq, bool read);
+void amdgpu_gfx_kiq_unlock(struct amdgpu_kiq *kiq);
+
+void amdgpu_gfx_kiq_consume(struct amdgpu_kiq *kiq, uint32_t *offs);
+void amdgpu_gfx_kiq_restore(struct amdgpu_kiq *kiq, uint32_t *offs);
 /*
  * Registers read & write functions.
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 691c89705bcd..034c9f416499 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -309,6 +309,7 @@ static int kgd_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
uint32_t doorbell_off)
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
+   struct amdgpu_kiq *kiq = >gfx.kiq;
struct amdgpu_ring *kiq_ring = >gfx.kiq.ring;
struct v10_compute_mqd *m;
uint32_t mec, pipe;
@@ -324,13 +325,19 @@ static int kgd_hiq_mqd_load(struct kgd_dev *kgd, void 
*mqd,
pr_debug("kfd: set HIQ, mec:%d, pipe:%d, queue:%d.\n",
 mec, pipe, queue_id);
 
-   spin_lock(>gfx.kiq.ring_lock);
+   r = amdgpu_gfx_kiq_lock(kiq, false);
+   if (r) {
+   pr_err("failed to lock kiq\n");
+   goto out_unlock;
+   }
+
r = amdgpu_ring_alloc(kiq_ring, 7);
if (r) {
pr_err("Failed to alloc KIQ (%d).\n", r);
goto out_unlock;
}
 
+   amdgpu_gfx_kiq_consume(kiq, NULL);
amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5));
amdgpu_ring_write(kiq_ring,
  PACKET3_MAP_QUEUES_QUEUE_SEL(0) | /* Queue_Sel */
@@ -350,8 +357,9 @@ static int kgd_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
amdgpu_ring_write(kiq_ring, m->cp_hqd_pq_wptr_poll_addr_hi);
amdgpu_ring_commit(kiq_ring);
 
+   amdgpu_gfx_kiq_restore(kiq, NULL);
 out_unlock:
-   spin_unlock(>gfx.kiq.ring_lock);
+   amdgpu_gfx_kiq_unlock(>gfx.kiq);
release_queue(kgd);
 
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index df841c2ac5e7..f243d9990ced 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -307,6 +307,7 @@ int kgd_gfx_v9_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
uint32_t doorbell_off)
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
+   struct amdgpu_kiq *kiq = >gfx.kiq;
struct amdgpu_ring *kiq_ring = >gfx.kiq.ring;
struct v9_mqd *m;
uint32_t mec, pipe;
@@ -322,13 +323,19 @@ int kgd_gfx_v9_hiq_mqd_load(struct kgd_dev 

[PATCH] drm/amd/amdgpu: set error query ready after all IPs late init

2020-04-21 Thread Dennis Li
If set error query ready in amdgpu_ras_late_init, which will
cause some IP blocks aren't initialized, but their error query
is ready.

Change-Id: I5087527261cb1b462afd82ad7592cf1ef73b15bd
Signed-off-by: Dennis Li 

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
old mode 100644
new mode 100755
index 423eed223aa5..e37e0982cd46
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2218,6 +2218,8 @@ static int amdgpu_device_ip_late_init(struct 
amdgpu_device *adev)
adev->ip_blocks[i].status.late_initialized = true;
}
 
+   amdgpu_ras_set_error_query_ready(adev, true);
+
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 68b82f7b0b80..060866d372a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1922,7 +1922,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
 
/* in resume phase, no need to create ras fs node */
if (adev->in_suspend || adev->in_gpu_reset) {
-   amdgpu_ras_set_error_query_ready(adev, true);
return 0;
}
 
@@ -1936,8 +1935,6 @@ int amdgpu_ras_late_init(struct amdgpu_device *adev,
if (r)
goto sysfs;
 
-   amdgpu_ras_set_error_query_ready(adev, true);
-
return 0;
 cleanup:
amdgpu_ras_sysfs_remove(adev, ras_block);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: change how we update mmRLC_SPM_MC_CNTL

2020-04-21 Thread Tao, Yintian
Acked-by: Yintian Tao 

-Original Message-
From: Christian König  
Sent: 2020年4月21日 22:23
To: Liu, Monk ; He, Jacob ; Tao, Yintian 
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: change how we update mmRLC_SPM_MC_CNTL

In pp_one_vf mode avoid the extra overhead and read/write the registers without 
the KIQ.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++---  
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 10 --  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 13 ++---
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..560ec1c29977 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,14 +7030,21 @@ static int gfx_v10_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
+   u32 reg, data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   reg = SOC15_REG_OFFSET(GC, 0, mmRLC_SPM_MC_CNTL);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(reg);
+   else
+   data = RREG32(reg);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   else
+   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v10_0_check_rlcg_range(struct amdgpu_device *adev, diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..a9bcc00f4348 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5615,12 +5615,18 @@ static void gfx_v8_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)  {
u32 data;
 
-   data = RREG32(mmRLC_SPM_VMID);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(mmRLC_SPM_VMID);
+   else
+   data = RREG32(mmRLC_SPM_VMID);
 
data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
 
-   WREG32(mmRLC_SPM_VMID, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_NO_KIQ(mmRLC_SPM_VMID, data);
+   else
+   WREG32(mmRLC_SPM_VMID, data);
 }
 
 static const struct amdgpu_rlc_funcs iceland_rlc_funcs = { diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..c7de10869c81 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,14 +4950,21 @@ static int gfx_v9_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
+   u32 reg, data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   reg = SOC15_REG_OFFSET(GC, 0, mmRLC_SPM_MC_CNTL);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(reg);
+   else
+   data = RREG32(reg);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   else
+   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v9_0_check_rlcg_range(struct amdgpu_device *adev,
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: put the audio codec into suspend state before gpu reset

2020-04-21 Thread Quan, Evan
Mode1 reset is also affected as I confirmed on navi10 unfortunately.
That is why the original design(switch to mode1 reset on audio suspended 
failure) over our previous discussions was not taken.
Anyway, I sent out a V2 patch to limit this for baco and mode1 reset only.

Regards,
Evan
-Original Message-
From: Alex Deucher  
Sent: Wednesday, April 22, 2020 4:18 AM
To: Quan, Evan 
Cc: amd-gfx list ; Deucher, Alexander 

Subject: Re: [PATCH] drm/amdgpu: put the audio codec into suspend state before 
gpu reset

On Tue, Apr 21, 2020 at 8:00 AM Evan Quan  wrote:
>
> At default, the autosuspend delay of audio controller is 3S. If the 
> gpu reset is triggered within 3S(after audio controller idle), the 
> audio controller may be unable into suspended state. Then the sudden 
> gpu reset will cause some audio errors. The change here is targeted to 
> resolve this.
>
> However if the audio controller is in use when the gpu reset 
> triggered, this change may be still not enough to put the audio 
> controller into suspend state. Under this case, the gpu reset will 
> still proceed but there will be a warning message printed("failed to 
> suspend display audio").
>
> Change-Id: I33d85e6fcad1882eb33f9cde8916d57be8d5a87a
> Signed-off-by: Evan Quan 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 60 
> ++
>  1 file changed, 60 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 2d4b78d96426..983e294d0300 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -69,6 +69,7 @@
>
>  #include 
>  #include 
> +#include 
>
>  MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> @@ -4146,6 +4147,49 @@ static void amdgpu_device_unlock_adev(struct 
> amdgpu_device *adev)
> mutex_unlock(>lock_reset);  }
>
> +static void amdgpu_device_resume_display_audio(struct amdgpu_device 
> +*adev) {
> +   struct pci_dev *p = NULL;
> +
> +   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
> +   adev->pdev->bus->number, 1);
> +   if (p) {
> +   pm_runtime_enable(&(p->dev));
> +   pm_runtime_resume(&(p->dev));
> +   }
> +}
> +
> +static int amdgpu_device_suspend_display_audio(struct amdgpu_device 
> +*adev) {
> +   struct pci_dev *p = NULL;
> +   unsigned long end_jiffies;
> +
> +   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
> +   adev->pdev->bus->number, 1);
> +   if (!p)
> +   return -ENODEV;
> +

With mode1 reset affect this or just BACO?  If it's just BACO, we should check 
the reset method here and return an error if not using BACO.

Alex


> +   /*
> +* 3S is the audio controller default autosuspend delay setting.
> +* 4S used here is guaranteed to cover that.
> +*/
> +   end_jiffies = msecs_to_jiffies(4000) + jiffies;
> +   while (!pm_runtime_status_suspended(&(p->dev))) {
> +   if (!pm_runtime_suspend(&(p->dev)))
> +   break;
> +
> +   if (time_after(jiffies, end_jiffies)) {
> +   dev_warn(adev->dev, "failed to suspend display 
> audio\n");
> +   /* TODO: abort the succeeding gpu reset? */
> +   return -ETIMEDOUT;
> +   }
> +   }
> +
> +   pm_runtime_disable(&(p->dev));
> +
> +   return 0;
> +}
> +
>  /**
>   * amdgpu_device_gpu_recover - reset the asic and recover scheduler
>   *
> @@ -4170,6 +4214,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
> bool use_baco =
> (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) ?
> true : false;
> +   bool audio_suspended = false;
>
> /*
>  * Flush RAM to disk so that after reboot @@ -4227,6 +4272,19 
> @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> return 0;
> }
>
> +   /*
> +* Try to put the audio codec into suspend state
> +* before gpu reset started.
> +*
> +* Due to the power domain of the graphics device
> +* is shared with AZ power domain. Without this,
> +* we may change the audio hardware from behind
> +* the audio driver's back. That will trigger
> +* some audio codec errors.
> +*/
> +   if (!amdgpu_device_suspend_display_audio(tmp_adev))
> +   audio_suspended = true;
> +
> amdgpu_ras_set_error_query_ready(tmp_adev, false);
>
> 
> cancel_delayed_work_sync(_adev->delayed_init_work);
> @@ -4339,6 +4397,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
> /*unlock kfd: SRIOV would do 

[PATCH] drm/amdgpu: put the audio codec into suspend state before gpu reset V2

2020-04-21 Thread Evan Quan
At default, the autosuspend delay of audio controller is 3S. If the
gpu reset is triggered within 3S(after audio controller idle),
the audio controller may be unable into suspended state. Then
the sudden gpu reset will cause some audio errors. The change
here is targeted to resolve this.

However if the audio controller is in use when the gpu reset
triggered, this change may be still not enough to put the
audio controller into suspend state. Under this case, the
gpu reset will still proceed but there will be a warning
message printed("failed to suspend display audio").

V2: limit this for BACO and mode1 reset only

Change-Id: I33d85e6fcad1882eb33f9cde8916d57be8d5a87a
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 70 ++
 1 file changed, 70 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2d4b78d96426..70f43b1aed78 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -69,6 +69,7 @@
 
 #include 
 #include 
+#include 
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -4146,6 +4147,59 @@ static void amdgpu_device_unlock_adev(struct 
amdgpu_device *adev)
mutex_unlock(>lock_reset);
 }
 
+static void amdgpu_device_resume_display_audio(struct amdgpu_device *adev)
+{
+   struct pci_dev *p = NULL;
+
+   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number, 1);
+   if (p) {
+   pm_runtime_enable(&(p->dev));
+   pm_runtime_resume(&(p->dev));
+   }
+}
+
+static int amdgpu_device_suspend_display_audio(struct amdgpu_device *adev)
+{
+   enum amd_reset_method reset_method;
+   struct pci_dev *p = NULL;
+   unsigned long end_jiffies;
+
+   /*
+* For now, only BACO and mode1 reset are confirmed
+* to suffer the audio issue without proper suspended.
+*/
+   reset_method = amdgpu_asic_reset_method(adev);
+   if ((reset_method != AMD_RESET_METHOD_BACO) &&
+(reset_method != AMD_RESET_METHOD_MODE1))
+   return -EINVAL;
+
+   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number, 1);
+   if (!p)
+   return -ENODEV;
+
+   /*
+* 3S is the audio controller default autosuspend delay setting.
+* 4S used here is guaranteed to cover that.
+*/
+   end_jiffies = msecs_to_jiffies(4000) + jiffies;
+   while (!pm_runtime_status_suspended(&(p->dev))) {
+   if (!pm_runtime_suspend(&(p->dev)))
+   break;
+
+   if (time_after(jiffies, end_jiffies)) {
+   dev_warn(adev->dev, "failed to suspend display 
audio\n");
+   /* TODO: abort the succeeding gpu reset? */
+   return -ETIMEDOUT;
+   }
+   }
+
+   pm_runtime_disable(&(p->dev));
+
+   return 0;
+}
+
 /**
  * amdgpu_device_gpu_recover - reset the asic and recover scheduler
  *
@@ -4170,6 +4224,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
bool use_baco =
(amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) ?
true : false;
+   bool audio_suspended = false;
 
/*
 * Flush RAM to disk so that after reboot
@@ -4227,6 +4282,19 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
return 0;
}
 
+   /*
+* Try to put the audio codec into suspend state
+* before gpu reset started.
+*
+* Due to the power domain of the graphics device
+* is shared with AZ power domain. Without this,
+* we may change the audio hardware from behind
+* the audio driver's back. That will trigger
+* some audio codec errors.
+*/
+   if (!amdgpu_device_suspend_display_audio(tmp_adev))
+   audio_suspended = true;
+
amdgpu_ras_set_error_query_ready(tmp_adev, false);
 
cancel_delayed_work_sync(_adev->delayed_init_work);
@@ -4339,6 +4407,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
/*unlock kfd: SRIOV would do it separately */
if (!(in_ras_intr && !use_baco) && !amdgpu_sriov_vf(tmp_adev))
amdgpu_amdkfd_post_reset(tmp_adev);
+   if (audio_suspended)
+   amdgpu_device_resume_display_audio(tmp_adev);
amdgpu_device_unlock_adev(tmp_adev);
}
 
-- 
2.26.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v5] drm/amdgpu: cleanup coding style in amdkfd a bit

2020-04-21 Thread Felix Kuehling


On 2020-04-21 21:46, Bernard Zhao wrote:

Make the code a bit more readable by using a common
error handling pattern.
With that done the patch is Reviewed-by: Christian König
.

Signed-off-by: Bernard Zhao 


Thanks. The patch is

Reviewed-by: Felix Kuehling 

I removed the history from the commit message, made Christian's 
Reviewed-by official and applied the patch.


Regards,
  Felix



Changes since V1:
*commit message improve
*code style refactoring

Changes since V2:
*code style adjust

Changes since V3:
*find the best way to merge unnecessary if/else check branch

Changes since V4:
*Improve the subject line and commit message

Link for V1:
*https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1226587%2Fdata=02%7C01%7CFelix.Kuehling%40amd.com%7C27509aa0455042beedc708d7e65efc22%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637231167951862118sdata=JPCxsDycw0JvZfym1akV7d%2B4oPzvzQ04Zl1rv3WN%2Fj0%3Dreserved=0
---
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 20 +--
  1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..acb612c53b9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -660,15 +660,15 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
  
  	ret = ttm_eu_reserve_buffers(>ticket, >list,

 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else {
-   pr_err("Failed to reserve buffers in ttm\n");
+   if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
+   return ret;
}
  
-	return ret;

+   ctx->reserved = true;
+   return 0;
  }
  
  /**

@@ -733,17 +733,15 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
  
  	ret = ttm_eu_reserve_buffers(>ticket, >list,

 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else
-   pr_err("Failed to reserve buffers in ttm.\n");
-
if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
+   return ret;
}
  
-	return ret;

+   ctx->reserved = true;
+   return 0;
  }
  
  /**

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH V3] amdgpu:optimization-- reduce no need mutex_lock area

2020-04-21 Thread Felix Kuehling
Thanks again for the patch. I'm going to apply this with some minor 
fixes. The headline should start with "drm/amdgpu:".  I'll also change 
the wording of the headline and commit message:


   drm/amdgpu: shrink critical section in
   amdgpu_amdkfd_gpuvm_free_memory_of_gpu

   Reduce the mem->lock`s protected code area, no need to protect pr_debug.
   This also simplifies error handling.


There is one more cosmetic change I'm going to make, see inline. I'll 
apply your patch with those updates if you're OK with that.



On 2020-04-21 2:48, Bernard Zhao wrote:

Maybe we could reduce the mutex_lock(>lock)`s protected code area,
and no need to protect pr_debug.

Signed-off-by: Bernard Zhao 

Changes since V1:
*commit message improve

Changes since V2:
*move comment along with the mutex_unlock

Link for V1:
*https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1226588%2Fdata=02%7C01%7CFelix.Kuehling%40amd.com%7Cc715f5d02aae40846b8f08d7e5c001fd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637230485152789682sdata=4UdZiWMAbW8eR1BS2%2F6qMvs5K6cHWy5c8I32ReQ4uz0%3Dreserved=0
Link for V2:
*https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1227907%2Fdata=02%7C01%7CFelix.Kuehling%40amd.com%7Cc715f5d02aae40846b8f08d7e5c001fd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637230485152799673sdata=Wt%2Bk7k4MtSX4zIDgmLEOB6ZKzfuqAd6GJZ3Creqf1aY%3Dreserved=0
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 327317c54f7c..f03d9843d723 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1285,21 +1285,21 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
struct bo_vm_reservation_context ctx;
struct ttm_validate_buffer *bo_list_entry;
int ret;
+   unsigned int mapped_to_gpu_memory;


I'll move this before the "int ret;" line. It makes the code more 
readable if long declarations come before short ones.


Regards,
  Felix


  
  	mutex_lock(>lock);

+   mapped_to_gpu_memory = mem->mapped_to_gpu_memory;
+   mutex_unlock(>lock);
+   /* lock is not needed after this, since mem is unused and will
+* be freed anyway
+*/
  
-	if (mem->mapped_to_gpu_memory > 0) {

+   if (mapped_to_gpu_memory > 0) {
pr_debug("BO VA 0x%llx size 0x%lx is still mapped.\n",
mem->va, bo_size);
-   mutex_unlock(>lock);
return -EBUSY;
}
  
-	mutex_unlock(>lock);

-   /* lock is not needed after this, since mem is unused and will
-* be freed anyway
-*/
-
/* No more MMU notifiers */
amdgpu_mn_unregister(mem->bo);
  
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: change how we update mmRLC_SPM_MC_CNTL

2020-04-21 Thread Liu, Monk
Reviewed-by: Monk Liu 

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Christian König  
Sent: Tuesday, April 21, 2020 10:23 PM
To: Liu, Monk ; He, Jacob ; Tao, Yintian 
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: change how we update mmRLC_SPM_MC_CNTL

In pp_one_vf mode avoid the extra overhead and read/write the registers without 
the KIQ.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++---  
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 10 --  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 13 ++---
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..560ec1c29977 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,14 +7030,21 @@ static int gfx_v10_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
+   u32 reg, data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   reg = SOC15_REG_OFFSET(GC, 0, mmRLC_SPM_MC_CNTL);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(reg);
+   else
+   data = RREG32(reg);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   else
+   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v10_0_check_rlcg_range(struct amdgpu_device *adev, diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..a9bcc00f4348 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5615,12 +5615,18 @@ static void gfx_v8_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)  {
u32 data;
 
-   data = RREG32(mmRLC_SPM_VMID);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(mmRLC_SPM_VMID);
+   else
+   data = RREG32(mmRLC_SPM_VMID);
 
data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
 
-   WREG32(mmRLC_SPM_VMID, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_NO_KIQ(mmRLC_SPM_VMID, data);
+   else
+   WREG32(mmRLC_SPM_VMID, data);
 }
 
 static const struct amdgpu_rlc_funcs iceland_rlc_funcs = { diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..c7de10869c81 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,14 +4950,21 @@ static int gfx_v9_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
+   u32 reg, data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   reg = SOC15_REG_OFFSET(GC, 0, mmRLC_SPM_MC_CNTL);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(reg);
+   else
+   data = RREG32(reg);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   else
+   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v9_0_check_rlcg_range(struct amdgpu_device *adev,
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdkfd: Track GPU memory utilization per process

2020-04-21 Thread Felix Kuehling
See comments inline.

Am 2020-04-19 um 9:58 p.m. schrieb Mukul Joshi:
> Track GPU memory usage on a per process basis and report it through
> sysfs.
>
> Signed-off-by: Mukul Joshi 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 12 ++
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  7 
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 51 ++--
>  3 files changed, 66 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index f8fa03a12add..91d4f45730ae 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -39,6 +39,7 @@
>  #include "kfd_device_queue_manager.h"
>  #include "kfd_dbgmgr.h"
>  #include "amdgpu_amdkfd.h"
> +#include "amdgpu_object.h"
>  
>  static long kfd_ioctl(struct file *, unsigned int, unsigned long);
>  static int kfd_open(struct inode *, struct file *);
> @@ -1322,6 +1323,9 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
> *filep,
>   goto err_free;
>   }
>  
> + /* Update the VRAM usage count */
> + pdd->vram_usage += args->size;
> +
>   mutex_unlock(>mutex);
>  
>   args->handle = MAKE_HANDLE(args->gpu_id, idr_handle);
> @@ -1351,6 +1355,7 @@ static int kfd_ioctl_free_memory_of_gpu(struct file 
> *filep,
>   void *mem;
>   struct kfd_dev *dev;
>   int ret;
> + uint64_t size = 0;
>  
>   dev = kfd_device_by_id(GET_GPU_ID(args->handle));
>   if (!dev)
> @@ -1372,6 +1377,11 @@ static int kfd_ioctl_free_memory_of_gpu(struct file 
> *filep,
>   goto err_unlock;
>   }
>  
> + if (((struct kgd_mem *)mem)->bo)
> + size = ((struct kgd_mem *)mem)->bo->tbo.mem.size;
> + else
> + pr_debug("BO is NULL\n");

Like Harish said, you'll need to find out whether the BO is in VRAM by
checking bo->preferred_domain == AMDGPU_GEM_DOMAIN_VRAM. Also, it would
be cleaner to implement the AMDGPU BO-specific details of this in
amdgpu_amdkfd_gpuvm_free_memory_of_gpu. I had suggested returning the
freed VRAM size as an output parameter of that function to KFD.

I realized that there is another problem we need to account for.
kfd_ioctl_free_memory_of_gpu is also used for unreferencing imported
memory (imported graphics buffers and IPC buffers). We don't count their
VRAM usage during import, so we shouldn't count it during free either.
For that we'd probably need to add a flag in struct kgd_mem to indicate
that it's imported, so we don't count its VRAM usage during free.

Alternatively, we could to count the VRAM usage during import. That way,
some memory might be counted multiple times (in each importing process).
But I think that's not unreasonable.

Regards,
  Felix


> +
>   ret = amdgpu_amdkfd_gpuvm_free_memory_of_gpu(dev->kgd,
>   (struct kgd_mem *)mem);
>  
> @@ -1382,6 +1392,8 @@ static int kfd_ioctl_free_memory_of_gpu(struct file 
> *filep,
>   kfd_process_device_remove_obj_handle(
>   pdd, GET_IDR_HANDLE(args->handle));
>  
> + pdd->vram_usage -= size;
> +
>  err_unlock:
>   mutex_unlock(>mutex);
>   return ret;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 43b888b311c7..e7918fc3cee5 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -616,6 +616,8 @@ enum kfd_pdd_bound {
>   PDD_BOUND_SUSPENDED,
>  };
>  
> +#define MAX_VRAM_FILENAME_LEN 11
> +
>  /* Data that is per-process-per device. */
>  struct kfd_process_device {
>   /*
> @@ -658,6 +660,11 @@ struct kfd_process_device {
>  
>   /* Is this process/pasid bound to this device? (amd_iommu_bind_pasid) */
>   enum kfd_pdd_bound bound;
> +
> + /* VRAM usage */
> + uint64_t vram_usage;
> + struct attribute attr_vram;
> + char vram_filename[MAX_VRAM_FILENAME_LEN];
>  };
>  
>  #define qpd_to_pdd(x) container_of(x, struct kfd_process_device, qpd)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index fe0cd49d4ea7..c7f1e5d89bd9 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -79,18 +79,22 @@ static struct kfd_procfs_tree procfs;
>  static ssize_t kfd_procfs_show(struct kobject *kobj, struct attribute *attr,
>  char *buffer)
>  {
> - int val = 0;
> -
>   if (strcmp(attr->name, "pasid") == 0) {
>   struct kfd_process *p = container_of(attr, struct kfd_process,
>attr_pasid);
> - val = p->pasid;
> +
> + return snprintf(buffer, PAGE_SIZE, "%d\n", p->pasid);
> + } else if (strncmp(attr->name, "vram_", 5) == 0) {
> + struct kfd_process_device *pdd = container_of(attr, struct 
> kfd_process_device,
> +   

Re: [PATCH 0/4] Some XGMI related gpu reset fixes and cleanups

2020-04-21 Thread Andrey Grodzovsky

Patch 1 Acked-by: Andrey Grodzovsky 

Patches 2-4 Reviewed-by: Andrey Grodzovsky 

Andrey

On 4/21/20 1:23 AM, Evan Quan wrote:

Patch 1 and 2 are the necessary fixes for XGMI setup. Since these
operations are needed for other devices from the same hive. That's
missing now.
Patch 3 are 4 are basically code cosmetic.

Evan Quan (4):
   drm/amdgpu: correct fbdev suspend on gpu reset
   drm/amdgpu: correct cancel_delayed_work_sync on gpu reset
   drm/amdgpu: optimize the gpu reset for XGMI setup V2
   drm/amdgpu: code cleanup around gpu reset

  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 92 --
  1 file changed, 31 insertions(+), 61 deletions(-)


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: put the audio codec into suspend state before gpu reset

2020-04-21 Thread Alex Deucher
On Tue, Apr 21, 2020 at 8:00 AM Evan Quan  wrote:
>
> At default, the autosuspend delay of audio controller is 3S. If the
> gpu reset is triggered within 3S(after audio controller idle),
> the audio controller may be unable into suspended state. Then
> the sudden gpu reset will cause some audio errors. The change
> here is targeted to resolve this.
>
> However if the audio controller is in use when the gpu reset
> triggered, this change may be still not enough to put the
> audio controller into suspend state. Under this case, the
> gpu reset will still proceed but there will be a warning
> message printed("failed to suspend display audio").
>
> Change-Id: I33d85e6fcad1882eb33f9cde8916d57be8d5a87a
> Signed-off-by: Evan Quan 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 60 ++
>  1 file changed, 60 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 2d4b78d96426..983e294d0300 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -69,6 +69,7 @@
>
>  #include 
>  #include 
> +#include 
>
>  MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
>  MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
> @@ -4146,6 +4147,49 @@ static void amdgpu_device_unlock_adev(struct 
> amdgpu_device *adev)
> mutex_unlock(>lock_reset);
>  }
>
> +static void amdgpu_device_resume_display_audio(struct amdgpu_device *adev)
> +{
> +   struct pci_dev *p = NULL;
> +
> +   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
> +   adev->pdev->bus->number, 1);
> +   if (p) {
> +   pm_runtime_enable(&(p->dev));
> +   pm_runtime_resume(&(p->dev));
> +   }
> +}
> +
> +static int amdgpu_device_suspend_display_audio(struct amdgpu_device *adev)
> +{
> +   struct pci_dev *p = NULL;
> +   unsigned long end_jiffies;
> +
> +   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
> +   adev->pdev->bus->number, 1);
> +   if (!p)
> +   return -ENODEV;
> +

With mode1 reset affect this or just BACO?  If it's just BACO, we
should check the reset method here and return an error if not using
BACO.

Alex


> +   /*
> +* 3S is the audio controller default autosuspend delay setting.
> +* 4S used here is guaranteed to cover that.
> +*/
> +   end_jiffies = msecs_to_jiffies(4000) + jiffies;
> +   while (!pm_runtime_status_suspended(&(p->dev))) {
> +   if (!pm_runtime_suspend(&(p->dev)))
> +   break;
> +
> +   if (time_after(jiffies, end_jiffies)) {
> +   dev_warn(adev->dev, "failed to suspend display 
> audio\n");
> +   /* TODO: abort the succeeding gpu reset? */
> +   return -ETIMEDOUT;
> +   }
> +   }
> +
> +   pm_runtime_disable(&(p->dev));
> +
> +   return 0;
> +}
> +
>  /**
>   * amdgpu_device_gpu_recover - reset the asic and recover scheduler
>   *
> @@ -4170,6 +4214,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
> bool use_baco =
> (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) ?
> true : false;
> +   bool audio_suspended = false;
>
> /*
>  * Flush RAM to disk so that after reboot
> @@ -4227,6 +4272,19 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
> return 0;
> }
>
> +   /*
> +* Try to put the audio codec into suspend state
> +* before gpu reset started.
> +*
> +* Due to the power domain of the graphics device
> +* is shared with AZ power domain. Without this,
> +* we may change the audio hardware from behind
> +* the audio driver's back. That will trigger
> +* some audio codec errors.
> +*/
> +   if (!amdgpu_device_suspend_display_audio(tmp_adev))
> +   audio_suspended = true;
> +
> amdgpu_ras_set_error_query_ready(tmp_adev, false);
>
> cancel_delayed_work_sync(_adev->delayed_init_work);
> @@ -4339,6 +4397,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
> /*unlock kfd: SRIOV would do it separately */
> if (!(in_ras_intr && !use_baco) && !amdgpu_sriov_vf(tmp_adev))
> amdgpu_amdkfd_post_reset(tmp_adev);
> +   if (audio_suspended)
> +   amdgpu_device_resume_display_audio(tmp_adev);
> amdgpu_device_unlock_adev(tmp_adev);
> }
>
> --
> 2.26.2
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re:Re: [PATCH] amdgpu: fixes memleak issue when init failed

2020-04-21 Thread 赵军奎

发件人:"Christian König" 
发送日期:2020-04-21 21:02:27
收件人:"赵军奎" 
抄送人:Alex Deucher ,"David (ChunMing) Zhou" 
,David Airlie ,Daniel Vetter 
,Tom St Denis ,Ori Messinger 
,Sam Ravnborg 
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org,opensource.ker...@vivo.com
主题:Re: [PATCH] amdgpu: fixes memleak issue when init failed>Am 21.04.20 um 
14:09 schrieb 赵军奎:
>> From: "Christian König" 
>> Date: 2020-04-21 19:22:49
>> To:  Bernard Zhao ,Alex Deucher 
>> ,"David (ChunMing) Zhou" 
>> ,David Airlie ,Daniel Vetter 
>> ,Tom St Denis ,Ori Messinger 
>> ,Sam Ravnborg 
>> ,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
>> Cc:  opensource.ker...@vivo.com
>> Subject: Re: [PATCH] amdgpu: fixes memleak issue when init failed>Am 
>> 21.04.20 um 13:17 schrieb Bernard Zhao:
 VRAM manager and DRM MM when init failed, there is no operaction
 to free kzalloc memory & remove device file.
 This will lead to memleak & cause stability issue.
>>> NAK, failure to create sysfs nodes are not critical.
>>>
>>> Christian.
>>>
>> OK, get it.
>> By the way, should i modify this patch to just handle  in error 
>> branch, or that it is also unnecessary?
>
>What you can do is to drop the "return ret" if anything with the sysfs 
>nodes goes wrong and instead print the error code.

Emmm, for this part, i am not sure, my modify first print the error, secone 
release not free memory, 
and last return error, make everything clear to the system.
I think it`s the same with what you mentioned, is there something that I 
misunderstood?

>
>It's really annoying that loading, unloading and loading the driver 
>again sometimes fails because we have a bug in the sysfs files cleanup.
>
>We certainly should fix those bugs as well, but they are just not 
>critical for correct driver functionality.
>
>Regards,
>Christian.


>>
>> Regards,
>> Bernard
>>
 Signed-off-by: Bernard Zhao 
 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 24 
1 file changed, 19 insertions(+), 5 deletions(-)

 diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
 b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
 index 82a3299e53c0..4c5fb153e6b4 100644
 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
 +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
 @@ -175,30 +175,44 @@ static int amdgpu_vram_mgr_init(struct 
 ttm_mem_type_manager *man,
ret = device_create_file(adev->dev, 
 _attr_mem_info_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file 
 mem_info_vram_total\n");
 -  return ret;
 +  goto VRAM_TOTAL_FAIL;
}
ret = device_create_file(adev->dev, 
 _attr_mem_info_vis_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file 
 mem_info_vis_vram_total\n");
 -  return ret;
 +  goto VIS_VRAM_TOTA_FAIL;
}
ret = device_create_file(adev->dev, 
 _attr_mem_info_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file 
 mem_info_vram_used\n");
 -  return ret;
 +  goto VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, 
 _attr_mem_info_vis_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file 
 mem_info_vis_vram_used\n");
 -  return ret;
 +  goto VIS_VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, 
 _attr_mem_info_vram_vendor);
if (ret) {
DRM_ERROR("Failed to create device file 
 mem_info_vram_vendor\n");
 -  return ret;
 +  goto VRAM_VERDOR_FAIL;
}

return 0;
 +
 +VRAM_VERDOR_FAIL:
 +  device_remove_file(adev->dev, _attr_mem_info_vis_vram_used);
 +VIS_VRAM_USED_FAIL:
 +  device_remove_file(adev->dev, _attr_mem_info_vram_used);
 +RVAM_USED_FAIL:
 +  device_remove_file(adev->dev, _attr_mem_info_vis_vram_total);
 +VIS_VRAM_TOTA_FAIL:
 +  device_remove_file(adev->dev, _attr_mem_info_vram_total);
 +VRAM_TOTAL_FAIL:
 +  kfree(mgr);
 +  man->priv = NULL;
 +
 +  return ret;
}

/**
>>
>


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] amdgpu: fixes memleak issue when init failed

2020-04-21 Thread Christian König

Am 21.04.20 um 15:39 schrieb 赵军奎:

发件人:"Christian König" 
发送日期:2020-04-21 21:02:27
收件人:"赵军奎" 
抄送人:Alex Deucher ,"David (ChunMing) Zhou" ,David Airlie 
,Daniel Vetter ,Tom St Denis ,Ori Messinger 
,Sam Ravnborg 
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org,opensource.ker...@vivo.com
主题:Re: [PATCH] amdgpu: fixes memleak issue when init failed>Am 21.04.20 um 
14:09 schrieb 赵军奎:

From: "Christian König" 
Date: 2020-04-21 19:22:49
To:  Bernard Zhao ,Alex Deucher ,"David (ChunMing) Zhou" 
,David Airlie ,Daniel Vetter ,Tom St Denis 
,Ori Messinger ,Sam Ravnborg 
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc:  opensource.ker...@vivo.com
Subject: Re: [PATCH] amdgpu: fixes memleak issue when init failed>Am 21.04.20 
um 13:17 schrieb Bernard Zhao:

VRAM manager and DRM MM when init failed, there is no operaction
to free kzalloc memory & remove device file.
This will lead to memleak & cause stability issue.

NAK, failure to create sysfs nodes are not critical.

Christian.


OK, get it.
By the way, should i modify this patch to just handle  in error 
branch, or that it is also unnecessary?

What you can do is to drop the "return ret" if anything with the sysfs
nodes goes wrong and instead print the error code.

Emmm, for this part, i am not sure, my modify first print the error, secone 
release not free memory,
and last return error, make everything clear to the system.
I think it`s the same with what you mentioned, is there something that I 
misunderstood?


Yes, maybe an example makes it more clear what to do here. Currently we 
print and error and return when something with the sysfs files goes wrong:


if (ret) {
    DRM_ERROR("Failed to create device file mem_info_vram_total\n");
    return ret;
}

But what we should do instead is just to print an error and continue and 
in the end return success status:


if (ret)
    DRM_ERROR("Failed to create device file mem_info_vram_total 
(%d)\n", r);


...
return 0;

Regards,
Christian.





It's really annoying that loading, unloading and loading the driver
again sometimes fails because we have a bug in the sysfs files cleanup.

We certainly should fix those bugs as well, but they are just not
critical for correct driver functionality.

Regards,
Christian.



Regards,
Bernard


Signed-off-by: Bernard Zhao 
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 24 
1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 82a3299e53c0..4c5fb153e6b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -175,30 +175,44 @@ static int amdgpu_vram_mgr_init(struct 
ttm_mem_type_manager *man,
ret = device_create_file(adev->dev, _attr_mem_info_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file mem_info_vram_total\n");
-   return ret;
+   goto VRAM_TOTAL_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vis_vram_total\n");
-   return ret;
+   goto VIS_VRAM_TOTA_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file mem_info_vram_used\n");
-   return ret;
+   goto VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vis_vram_used\n");
-   return ret;
+   goto VIS_VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vram_vendor);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vram_vendor\n");
-   return ret;
+   goto VRAM_VERDOR_FAIL;
}

	return 0;

+
+VRAM_VERDOR_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vis_vram_used);
+VIS_VRAM_USED_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vram_used);
+RVAM_USED_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vis_vram_total);
+VIS_VRAM_TOTA_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vram_total);
+VRAM_TOTAL_FAIL:
+   kfree(mgr);
+   man->priv = NULL;
+
+   return ret;
}

/**




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm: amdgpu: fix kernel-doc struct warning

2020-04-21 Thread Christian König

Am 21.04.20 um 16:33 schrieb Christian König:

Am 20.04.20 um 03:50 schrieb Randy Dunlap:

Fix a kernel-doc warning of missing struct field desription:

../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:92: warning: Function 
parameter or member 'vm' not described in 'amdgpu_vm_eviction_lock'


Can't we just document the function parameter instead? Should only be 
one IIRC.


On the other hand forget that, the format doesn't match a proper 
kernel-doc for a function anyway.


Reviewed-by: Christian König 



Thanks,
Christian.



Fixes: a269e44989f3 ("drm/amdgpu: Avoid reclaim fs while eviction lock")
Signed-off-by: Randy Dunlap 
Cc: Signed-off-by: Alex Sierra 
Cc: Felix Kuehling 
Cc: Christian König 
Cc: Alex Deucher 
Cc: David (ChunMing) Zhou 
Cc: amd-gfx@lists.freedesktop.org
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |    2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

--- lnx-57-rc2.orig/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ lnx-57-rc2/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -82,7 +82,7 @@ struct amdgpu_prt_cb {
  struct dma_fence_cb cb;
  };
  -/**
+/*
   * vm eviction_lock can be taken in MMU notifiers. Make sure no 
reclaim-FS

   * happens while holding this lock anywhere to prevent deadlocks when
   * an MMU notifier runs in reclaim-FS context.


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm: amdgpu: fix kernel-doc struct warning

2020-04-21 Thread Christian König

Am 20.04.20 um 03:50 schrieb Randy Dunlap:

Fix a kernel-doc warning of missing struct field desription:

../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:92: warning: Function parameter or 
member 'vm' not described in 'amdgpu_vm_eviction_lock'


Can't we just document the function parameter instead? Should only be 
one IIRC.


Thanks,
Christian.



Fixes: a269e44989f3 ("drm/amdgpu: Avoid reclaim fs while eviction lock")
Signed-off-by: Randy Dunlap 
Cc: Signed-off-by: Alex Sierra 
Cc: Felix Kuehling 
Cc: Christian König 
Cc: Alex Deucher 
Cc: David (ChunMing) Zhou 
Cc: amd-gfx@lists.freedesktop.org
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

--- lnx-57-rc2.orig/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ lnx-57-rc2/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -82,7 +82,7 @@ struct amdgpu_prt_cb {
struct dma_fence_cb cb;
  };
  
-/**

+/*
   * vm eviction_lock can be taken in MMU notifiers. Make sure no reclaim-FS
   * happens while holding this lock anywhere to prevent deadlocks when
   * an MMU notifier runs in reclaim-FS context.


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: change how we update mmRLC_SPM_MC_CNTL

2020-04-21 Thread Christian König
In pp_one_vf mode avoid the extra overhead and read/write the
registers without the KIQ.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 10 --
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 13 ++---
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..560ec1c29977 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,14 +7030,21 @@ static int gfx_v10_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)
 {
-   u32 data;
+   u32 reg, data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   reg = SOC15_REG_OFFSET(GC, 0, mmRLC_SPM_MC_CNTL);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(reg);
+   else
+   data = RREG32(reg);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   else
+   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v10_0_check_rlcg_range(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..a9bcc00f4348 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5615,12 +5615,18 @@ static void gfx_v8_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)
 {
u32 data;
 
-   data = RREG32(mmRLC_SPM_VMID);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(mmRLC_SPM_VMID);
+   else
+   data = RREG32(mmRLC_SPM_VMID);
 
data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
 
-   WREG32(mmRLC_SPM_VMID, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_NO_KIQ(mmRLC_SPM_VMID, data);
+   else
+   WREG32(mmRLC_SPM_VMID, data);
 }
 
 static const struct amdgpu_rlc_funcs iceland_rlc_funcs = {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..c7de10869c81 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,14 +4950,21 @@ static int gfx_v9_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)
 {
-   u32 data;
+   u32 reg, data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   reg = SOC15_REG_OFFSET(GC, 0, mmRLC_SPM_MC_CNTL);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(reg);
+   else
+   data = RREG32(reg);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   else
+   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v9_0_check_rlcg_range(struct amdgpu_device *adev,
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm: amd: display: fix kernel-doc struct warning

2020-04-21 Thread Harry Wentland
On 2020-04-19 9:50 p.m., Randy Dunlap wrote:
> Fix a kernel-doc warning of missing struct field desription:
> 
> ../drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:331: warning: Function 
> parameter or member 'hdcp_workqueue' not described in 'amdgpu_display_manager'
> 
> Fixes: 52704fcaf74b ("drm/amd/display: Initialize HDCP work queue")
> Signed-off-by: Randy Dunlap 
> Cc: Bhawanpreet Lakha 
> Cc: Harry Wentland 
> Cc: Alex Deucher 
> Cc: Leo Li 
> Cc: amd-gfx@lists.freedesktop.org

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |1 +
>  1 file changed, 1 insertion(+)
> 
> --- lnx-57-rc2.orig/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> +++ lnx-57-rc2/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> @@ -139,6 +139,7 @@ struct amdgpu_dm_backlight_caps {
>   * @backlight_link: Link on which to control backlight
>   * @backlight_caps: Capabilities of the backlight device
>   * @freesync_module: Module handling freesync calculations
> + * @hdcp_workqueue: workqueue for display manager interaction with HDCP 
> module
>   * @fw_dmcu: Reference to DMCU firmware
>   * @dmcu_fw_version: Version of the DMCU firmware
>   * @soc_bounding_box: SOC bounding box values provided by gpu_info FW
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm: amdgpu: fix kernel-doc struct warning

2020-04-21 Thread Harry Wentland
On 2020-04-19 9:50 p.m., Randy Dunlap wrote:
> Fix a kernel-doc warning of missing struct field desription:
> 
> ../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:92: warning: Function parameter or 
> member 'vm' not described in 'amdgpu_vm_eviction_lock'
> 
> Fixes: a269e44989f3 ("drm/amdgpu: Avoid reclaim fs while eviction lock")
> Signed-off-by: Randy Dunlap 
> Cc: Signed-off-by: Alex Sierra 
> Cc: Felix Kuehling 
> Cc: Christian König 
> Cc: Alex Deucher 
> Cc: David (ChunMing) Zhou 
> Cc: amd-gfx@lists.freedesktop.org


Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- lnx-57-rc2.orig/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ lnx-57-rc2/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -82,7 +82,7 @@ struct amdgpu_prt_cb {
>   struct dma_fence_cb cb;
>  };
>  
> -/**
> +/*
>   * vm eviction_lock can be taken in MMU notifiers. Make sure no reclaim-FS
>   * happens while holding this lock anywhere to prevent deadlocks when
>   * an MMU notifier runs in reclaim-FS context.
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Liu, Monk
> What are you talking about? The bits control what is used in the MC 
> interface, there is no increment or anything here.

My bad , it is  RLC_SPM_PERF_CNTR not RLC_SPM_PERF_COUNTER, I though it as 
COUNTER 


>>Agreed that sounds like a good idea to me as well no matter if we use RMW or 
>>just a write.

If we go NOKIQ way then we always use MMIO to update it,  no RMW package at all 
Besides RMW is not available in KCQ ring ... 
Looks we have no choice !

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Tao, Yintian  
Sent: Tuesday, April 21, 2020 9:46 PM
To: Koenig, Christian ; Liu, Monk ; 
He, Jacob ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: cleanup SPM VMID update

Hi  Christian


Great. Then can you modify the patch according to Monk's suggestion?
We need this patch for one important project.


Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian 
Sent: 2020年4月21日 21:38
To: Liu, Monk ; Tao, Yintian ; He, Jacob 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

> The problem is some fields are increased by hardware

What are you talking about? The bits control what is used in the MC interface, 
there is no increment or anything here.

> I think at least we should apply one change:  we use NO_KIQ for SRIOV 
> pp_one_vf_mode case to access this SPM register to avoid SRIOV KIQ 
> flood

Agreed that sounds like a good idea to me as well no matter if we use RMW or 
just a write.

Regards,
Christian.

Am 21.04.20 um 15:34 schrieb Liu, Monk:
> The problem is some fields are increased by hardware, and RLC simply 
> read its value, we cannot set those field together with VMID
>
> Christian, we should stop arguing on this small feature,  there is no way to 
> have a worse solution compared with current logic 
>
> I think at least we should apply one change:  we use NO_KIQ for SRIOV 
> pp_one_vf_mode case to access this SPM register to avoid SRIOV KIQ 
> flood
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Christian K?nig
> Sent: Tuesday, April 21, 2020 7:52 PM
> To: Liu, Monk ; Koenig, Christian 
> ; Tao, Yintian ; He, 
> Jacob ; amd-gfx@lists.freedesktop.org
> Cc: Gu, Frans 
> Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update
>
> Hi Monk,
>
> at least on Vega that should be fine. If the RLC should use anything else 
> than 0 here we should update that together with the VMID.
>
> Regards,
> Christian.
>
> Am 21.04.20 um 11:54 schrieb Liu, Monk:
> Could only be that the firmware updates the bits to something non 
> default, I'm going to double check that on a Vega10.
>> I think that will be a sure answer, otherwise why we need those field if we 
>> always write 0 to them and reader always expect 0 reading back from them ??
>>
>> Those fields are kind of performance counters
>>
>> _
>> Monk Liu|GPU Virtualization Team |AMD
>>
>>
>> -Original Message-
>> From: Christian König 
>> Sent: Tuesday, April 21, 2020 5:52 PM
>> To: Tao, Yintian ; Liu, Monk ; 
>> He, Jacob ; amd-gfx@lists.freedesktop.org
>> Cc: Gu, Frans 
>> Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update
>>
>> Am 21.04.20 um 11:45 schrieb Tao, Yintian:
>>> -Original Message-
>>> From: Christian König 
>>> Sent: 2020年4月21日 17:10
>>> To: Liu, Monk ; Tao, Yintian 
>>> ; He, Jacob ; 
>>> amd-gfx@lists.freedesktop.org
>>> Cc: Gu, Frans 
>>> Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update
>>>
>>> The RLC SPM configuration register contains the information how the memory 
>>> access is made (VMID, MTYPE, etc) which should always be consistent.
>>>
>>> So instead of a read modify write cycle of the VMID always update the whole 
>>> register.
>>>
>>> Signed-off-by: Christian König 
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
>>> drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
>>> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
>>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
>>> 4 files changed, 4 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> index 0a03e2ad5d95..2a6556371046 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> @@ -7030,12 +7030,7 @@ static int
>>> gfx_v10_0_update_gfx_clock_gating(struct amdgpu_device *adev,
>>> 
>>> static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, 
>>> unsigned vmid)  {
>>> -   u32 data;
>>> -
>>> -   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
>>> -
>>> -   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
>>> -   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
>>> RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
>>> +   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, 

RE: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Tao, Yintian
Hi  Christian


Great. Then can you modify the patch according to Monk's suggestion?
We need this patch for one important project.


Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian  
Sent: 2020年4月21日 21:38
To: Liu, Monk ; Tao, Yintian ; He, Jacob 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

> The problem is some fields are increased by hardware

What are you talking about? The bits control what is used in the MC interface, 
there is no increment or anything here.

> I think at least we should apply one change:  we use NO_KIQ for SRIOV 
> pp_one_vf_mode case to access this SPM register to avoid SRIOV KIQ 
> flood

Agreed that sounds like a good idea to me as well no matter if we use RMW or 
just a write.

Regards,
Christian.

Am 21.04.20 um 15:34 schrieb Liu, Monk:
> The problem is some fields are increased by hardware, and RLC simply 
> read its value, we cannot set those field together with VMID
>
> Christian, we should stop arguing on this small feature,  there is no way to 
> have a worse solution compared with current logic 
>
> I think at least we should apply one change:  we use NO_KIQ for SRIOV 
> pp_one_vf_mode case to access this SPM register to avoid SRIOV KIQ 
> flood
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Christian K?nig
> Sent: Tuesday, April 21, 2020 7:52 PM
> To: Liu, Monk ; Koenig, Christian 
> ; Tao, Yintian ; He, 
> Jacob ; amd-gfx@lists.freedesktop.org
> Cc: Gu, Frans 
> Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update
>
> Hi Monk,
>
> at least on Vega that should be fine. If the RLC should use anything else 
> than 0 here we should update that together with the VMID.
>
> Regards,
> Christian.
>
> Am 21.04.20 um 11:54 schrieb Liu, Monk:
> Could only be that the firmware updates the bits to something non 
> default, I'm going to double check that on a Vega10.
>> I think that will be a sure answer, otherwise why we need those field if we 
>> always write 0 to them and reader always expect 0 reading back from them ??
>>
>> Those fields are kind of performance counters
>>
>> _
>> Monk Liu|GPU Virtualization Team |AMD
>>
>>
>> -Original Message-
>> From: Christian König 
>> Sent: Tuesday, April 21, 2020 5:52 PM
>> To: Tao, Yintian ; Liu, Monk ; 
>> He, Jacob ; amd-gfx@lists.freedesktop.org
>> Cc: Gu, Frans 
>> Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update
>>
>> Am 21.04.20 um 11:45 schrieb Tao, Yintian:
>>> -Original Message-
>>> From: Christian König 
>>> Sent: 2020年4月21日 17:10
>>> To: Liu, Monk ; Tao, Yintian 
>>> ; He, Jacob ; 
>>> amd-gfx@lists.freedesktop.org
>>> Cc: Gu, Frans 
>>> Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update
>>>
>>> The RLC SPM configuration register contains the information how the memory 
>>> access is made (VMID, MTYPE, etc) which should always be consistent.
>>>
>>> So instead of a read modify write cycle of the VMID always update the whole 
>>> register.
>>>
>>> Signed-off-by: Christian König 
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
>>> drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
>>> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
>>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
>>> 4 files changed, 4 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> index 0a03e2ad5d95..2a6556371046 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> @@ -7030,12 +7030,7 @@ static int
>>> gfx_v10_0_update_gfx_clock_gating(struct amdgpu_device *adev,
>>> 
>>> static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, 
>>> unsigned vmid)  {
>>> -   u32 data;
>>> -
>>> -   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
>>> -
>>> -   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
>>> -   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
>>> RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
>>> +   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
>>> [yttao]: The orig_val is 0 which means except VMID field other reset fields 
>>> will be set to 0. Whether it is legal?
>> According to the register specification that is the default value for those 
>> bits on gfx9/10.
>>
>> Could only be that the firmware updates the bits to something non default, 
>> I'm going to double check that on a Vega10.
>>
>> Regards,
>> Christian.
>>
>>> 
>>> WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  } diff --git 
>>> a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>>> index b2f10e39eff1..a92486cd038f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>>> @@ -3570,12 +3570,7 @@ static int 

Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Christian König

The problem is some fields are increased by hardware


What are you talking about? The bits control what is used in the MC 
interface, there is no increment or anything here.



I think at least we should apply one change:  we use NO_KIQ for SRIOV 
pp_one_vf_mode case to access this SPM register to avoid SRIOV KIQ flood


Agreed that sounds like a good idea to me as well no matter if we use 
RMW or just a write.


Regards,
Christian.

Am 21.04.20 um 15:34 schrieb Liu, Monk:

The problem is some fields are increased by hardware, and RLC simply read its 
value, we cannot set those field together with VMID

Christian, we should stop arguing on this small feature,  there is no way to 
have a worse solution compared with current logic 

I think at least we should apply one change:  we use NO_KIQ for SRIOV 
pp_one_vf_mode case to access this SPM register to avoid SRIOV KIQ flood

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: amd-gfx  On Behalf Of Christian 
K?nig
Sent: Tuesday, April 21, 2020 7:52 PM
To: Liu, Monk ; Koenig, Christian ; Tao, 
Yintian ; He, Jacob ; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

Hi Monk,

at least on Vega that should be fine. If the RLC should use anything else than 
0 here we should update that together with the VMID.

Regards,
Christian.

Am 21.04.20 um 11:54 schrieb Liu, Monk:

Could only be that the firmware updates the bits to something non default, I'm 
going to double check that on a Vega10.

I think that will be a sure answer, otherwise why we need those field if we 
always write 0 to them and reader always expect 0 reading back from them ??

Those fields are kind of performance counters

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Christian König 
Sent: Tuesday, April 21, 2020 5:52 PM
To: Tao, Yintian ; Liu, Monk ;
He, Jacob ; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

Am 21.04.20 um 11:45 schrieb Tao, Yintian:

-Original Message-
From: Christian König 
Sent: 2020年4月21日 17:10
To: Liu, Monk ; Tao, Yintian ;
He, Jacob ; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update

The RLC SPM configuration register contains the information how the memory 
access is made (VMID, MTYPE, etc) which should always be consistent.

So instead of a read modify write cycle of the VMID always update the whole 
register.

Signed-off-by: Christian König 
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
4 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..2a6556371046 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,12 +7030,7 @@ static int
gfx_v10_0_update_gfx_clock_gating(struct amdgpu_device *adev,

static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
[yttao]: The orig_val is 0 which means except VMID field other reset fields 
will be set to 0. Whether it is legal?

According to the register specification that is the default value for those 
bits on gfx9/10.

Could only be that the firmware updates the bits to something non default, I'm 
going to double check that on a Vega10.

Regards,
Christian.


	WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  } diff --git

a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index b2f10e39eff1..a92486cd038f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3570,12 +3570,7 @@ static int gfx_v7_0_rlc_resume(struct
amdgpu_device *adev)

static void gfx_v7_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);

	WREG32(mmRLC_SPM_VMID, data);

}
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..44fdda68db98 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5613,12 +5613,7 @@ static void 

RE: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Liu, Monk
The problem is some fields are increased by hardware, and RLC simply read its 
value, we cannot set those field together with VMID 

Christian, we should stop arguing on this small feature,  there is no way to 
have a worse solution compared with current logic  

I think at least we should apply one change:  we use NO_KIQ for SRIOV 
pp_one_vf_mode case to access this SPM register to avoid SRIOV KIQ flood 

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: amd-gfx  On Behalf Of Christian 
K?nig
Sent: Tuesday, April 21, 2020 7:52 PM
To: Liu, Monk ; Koenig, Christian ; 
Tao, Yintian ; He, Jacob ; 
amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

Hi Monk,

at least on Vega that should be fine. If the RLC should use anything else than 
0 here we should update that together with the VMID.

Regards,
Christian.

Am 21.04.20 um 11:54 schrieb Liu, Monk:
 Could only be that the firmware updates the bits to something non default, 
 I'm going to double check that on a Vega10.
> I think that will be a sure answer, otherwise why we need those field if we 
> always write 0 to them and reader always expect 0 reading back from them ??
>
> Those fields are kind of performance counters
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: Christian König 
> Sent: Tuesday, April 21, 2020 5:52 PM
> To: Tao, Yintian ; Liu, Monk ; 
> He, Jacob ; amd-gfx@lists.freedesktop.org
> Cc: Gu, Frans 
> Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update
>
> Am 21.04.20 um 11:45 schrieb Tao, Yintian:
>> -Original Message-
>> From: Christian König 
>> Sent: 2020年4月21日 17:10
>> To: Liu, Monk ; Tao, Yintian ; 
>> He, Jacob ; amd-gfx@lists.freedesktop.org
>> Cc: Gu, Frans 
>> Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update
>>
>> The RLC SPM configuration register contains the information how the memory 
>> access is made (VMID, MTYPE, etc) which should always be consistent.
>>
>> So instead of a read modify write cycle of the VMID always update the whole 
>> register.
>>
>> Signed-off-by: Christian König 
>> ---
>>drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
>> drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
>> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
>>4 files changed, 4 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> index 0a03e2ad5d95..2a6556371046 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> @@ -7030,12 +7030,7 @@ static int
>> gfx_v10_0_update_gfx_clock_gating(struct amdgpu_device *adev,
>>
>>static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, 
>> unsigned vmid)  {
>> -u32 data;
>> -
>> -data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
>> -
>> -data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
>> -data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
>> RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
>> +u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
>> [yttao]: The orig_val is 0 which means except VMID field other reset fields 
>> will be set to 0. Whether it is legal?
> According to the register specification that is the default value for those 
> bits on gfx9/10.
>
> Could only be that the firmware updates the bits to something non default, 
> I'm going to double check that on a Vega10.
>
> Regards,
> Christian.
>
>>
>>  WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  } diff --git 
>> a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>> index b2f10e39eff1..a92486cd038f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
>> @@ -3570,12 +3570,7 @@ static int gfx_v7_0_rlc_resume(struct 
>> amdgpu_device *adev)
>>
>>static void gfx_v7_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
>> vmid)  {
>> -u32 data;
>> -
>> -data = RREG32(mmRLC_SPM_VMID);
>> -
>> -data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
>> -data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
>> RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
>> +u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
>>
>>  WREG32(mmRLC_SPM_VMID, data);
>>}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> index fc6c2f2bc76c..44fdda68db98 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
>> @@ -5613,12 +5613,7 @@ static void gfx_v8_0_unset_safe_mode(struct 
>> amdgpu_device *adev)
>>
>>static void gfx_v8_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
>> vmid)  {
>> -u32 data;
>> -
>> -data = RREG32(mmRLC_SPM_VMID);
>> -
>> -data &= 

Re:Re: [PATCH V3] amdgpu: remove unnecessary condition check

2020-04-21 Thread 赵军奎


From: "Christian König" 
Date: 2020-04-21 16:06:03
To:  1587180037-113840-1-git-send-email-bern...@vivo.com,Felix Kuehling 
,Alex Deucher ,"David 
(ChunMing) Zhou" ,David Airlie ,Daniel 
Vetter 
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc:  opensource.ker...@vivo.com,Bernard Zhao 
Subject: Re: [PATCH V3] amdgpu: remove unnecessary condition check>Am 21.04.20 
um 10:03 schrieb Bernard Zhao:
>> There is no need to if check again, maybe we could merge
>> into the above else branch.
>>
>> Signed-off-by: Bernard Zhao 
>>
>> Changes since V1:
>> *commit message improve
>> *code style refactoring
>>
>> Changes since V2:
>> *code style adjust
>>
>> Link for V1:
>> *https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1226587%2Fdata=02%7C01%7Cchristian.koenig%40amd.com%7C0b8fffafb715474289b208d7e5ca7f6c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637230530201280350sdata=Sewv5ESX%2B0B4DbFbE03uM5sifrEcmJllC8pt7J42I7M%3Dreserved=0
>> ---
>>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   | 18 +++---
>>   1 file changed, 7 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 9dff792c9290..5424bd921a7b 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -660,13 +660,12 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
>>   
>>  ret = ttm_eu_reserve_buffers(>ticket, >list,
>>   false, >duplicates);
>> -if (!ret)
>> -ctx->reserved = true;
>> -else {
>> -pr_err("Failed to reserve buffers in ttm\n");
>> +if (ret) {
>> +pr_err("Failed to reserve buffers in ttm.\n");
>>  kfree(ctx->vm_pd);
>>  ctx->vm_pd = NULL;
>> -}
>> +} else
>> +ctx->reserved = true;
>
>That is still not correct coding style. In general when one branch of an 
>if/else uses {} the other one should use it as well.
>
>But I agree with Felix that this change looks rather superfluous to me 
>as well.
>
>Regards,
>Christian.

About the code style, you are right, I checked the refers:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?id=90280eaa88ac1a9140dc759941123530d5545bb6#n191
The if and else should use the same style.
But i have to say there are so many code not follow the kernel code-style in 
amdgpu module.
And also the ./scripts/checkpatch.pl did not throw any warning or error.

If this change looks rather superfluous to all of you, should i change to the 
V1 change?
After all i don`t think there is any necessary to check "ret" again, merge the 

maybe better.
Original code:
static int reserve_bo_and_cond_vms(struct kgd_mem *mem,.
if (!ret)
ctx->reserved = true;
else
pr_err("Failed to reserve buffers in ttm.\n");

if (ret) {
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
}

BR//bernard

>>   
>>  return ret;
>>   }
>> @@ -733,15 +732,12 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
>>   
>>  ret = ttm_eu_reserve_buffers(>ticket, >list,
>>   false, >duplicates);
>> -if (!ret)
>> -ctx->reserved = true;
>> -else
>> -pr_err("Failed to reserve buffers in ttm.\n");
>> -
>>  if (ret) {
>> +pr_err("Failed to reserve buffers in ttm.\n");
>>  kfree(ctx->vm_pd);
>>  ctx->vm_pd = NULL;
>> -}
>> +} else
>> +ctx->reserved = true;
>>   
>>  return ret;
>>   }
>


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH V4] amdgpu: reduce no need mutex_lock area

2020-04-21 Thread Bernard Zhao
Maybe we could reduce the mutex_lock(>lock)`s protected code area,
and no need to protect pr_debug.

Signed-off-by: Bernard Zhao 

Changes since V1:
*commit message improve

Changes since V2:
*move comment along with the mutex_unlock

Changes since V3:
*lock protect the if check, there is some possibility of multi-threaded
 racing modify.

Link for V1:
*https://lore.kernel.org/patchwork/patch/1226588/
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 327317c54f7c..549bdb429883 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1289,9 +1289,9 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
mutex_lock(>lock);
 
if (mem->mapped_to_gpu_memory > 0) {
+   mutex_unlock(>lock);
pr_debug("BO VA 0x%llx size 0x%lx is still mapped.\n",
mem->va, bo_size);
-   mutex_unlock(>lock);
return -EBUSY;
}
 
-- 
2.26.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re:Re: [PATCH] amdgpu: fixes memleak issue when init failed

2020-04-21 Thread 赵军奎

From: "Christian König" 
Date: 2020-04-21 19:22:49
To:  Bernard Zhao ,Alex Deucher 
,"David (ChunMing) Zhou" ,David 
Airlie ,Daniel Vetter ,Tom St Denis 
,Ori Messinger ,Sam Ravnborg 
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc:  opensource.ker...@vivo.com
Subject: Re: [PATCH] amdgpu: fixes memleak issue when init failed>Am 21.04.20 
um 13:17 schrieb Bernard Zhao:
>> VRAM manager and DRM MM when init failed, there is no operaction
>> to free kzalloc memory & remove device file.
>> This will lead to memleak & cause stability issue.
>
>NAK, failure to create sysfs nodes are not critical.
>
>Christian.
>

OK, get it.
By the way, should i modify this patch to just handle  in error 
branch, or that it is also unnecessary?

Regards,
Bernard

>>
>> Signed-off-by: Bernard Zhao 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 24 
>>   1 file changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>> index 82a3299e53c0..4c5fb153e6b4 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>> @@ -175,30 +175,44 @@ static int amdgpu_vram_mgr_init(struct 
>> ttm_mem_type_manager *man,
>>  ret = device_create_file(adev->dev, _attr_mem_info_vram_total);
>>  if (ret) {
>>  DRM_ERROR("Failed to create device file mem_info_vram_total\n");
>> -return ret;
>> +goto VRAM_TOTAL_FAIL;
>>  }
>>  ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_total);
>>  if (ret) {
>>  DRM_ERROR("Failed to create device file 
>> mem_info_vis_vram_total\n");
>> -return ret;
>> +goto VIS_VRAM_TOTA_FAIL;
>>  }
>>  ret = device_create_file(adev->dev, _attr_mem_info_vram_used);
>>  if (ret) {
>>  DRM_ERROR("Failed to create device file mem_info_vram_used\n");
>> -return ret;
>> +goto VRAM_USED_FAIL;
>>  }
>>  ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_used);
>>  if (ret) {
>>  DRM_ERROR("Failed to create device file 
>> mem_info_vis_vram_used\n");
>> -return ret;
>> +goto VIS_VRAM_USED_FAIL;
>>  }
>>  ret = device_create_file(adev->dev, _attr_mem_info_vram_vendor);
>>  if (ret) {
>>  DRM_ERROR("Failed to create device file 
>> mem_info_vram_vendor\n");
>> -return ret;
>> +goto VRAM_VERDOR_FAIL;
>>  }
>>   
>>  return 0;
>> +
>> +VRAM_VERDOR_FAIL:
>> +device_remove_file(adev->dev, _attr_mem_info_vis_vram_used);
>> +VIS_VRAM_USED_FAIL:
>> +device_remove_file(adev->dev, _attr_mem_info_vram_used);
>> +RVAM_USED_FAIL:
>> +device_remove_file(adev->dev, _attr_mem_info_vis_vram_total);
>> +VIS_VRAM_TOTA_FAIL:
>> +device_remove_file(adev->dev, _attr_mem_info_vram_total);
>> +VRAM_TOTAL_FAIL:
>> +kfree(mgr);
>> +man->priv = NULL;
>> +
>> +return ret;
>>   }
>>   
>>   /**
>


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v4] amdgpu: remove unnecessary condition check

2020-04-21 Thread Bernard Zhao
There is no need to if check again, maybe we could merge
into the above else branch.

Signed-off-by: Bernard Zhao 

Changes since V1:
*commit message improve
*code style refactoring

Changes since V2:
*code style adjust

Changes since V3:
*find the best way to merge unnecessary if/else check branch

Link for V1:
*https://lore.kernel.org/patchwork/patch/1226587/
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 20 +--
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..acb612c53b9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -660,15 +660,15 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
 
ret = ttm_eu_reserve_buffers(>ticket, >list,
 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else {
-   pr_err("Failed to reserve buffers in ttm\n");
+   if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
+   return ret;
}
 
-   return ret;
+   ctx->reserved = true;
+   return 0;
 }
 
 /**
@@ -733,17 +733,15 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
 
ret = ttm_eu_reserve_buffers(>ticket, >list,
 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else
-   pr_err("Failed to reserve buffers in ttm.\n");
-
if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
+   return ret;
}
 
-   return ret;
+   ctx->reserved = true;
+   return 0;
 }
 
 /**
-- 
2.26.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[add Markus.Elfring in mail list ]Re:Re: [PATCH V4] amdgpu: reduce no need mutex_lock area

2020-04-21 Thread 赵军奎


From: "Christian König" 
Date: 2020-04-21 15:41:27
To:  1587181464-114215-1-git-send-email-bern...@vivo.com,Felix Kuehling 
,Alex Deucher ,"David 
(ChunMing) Zhou" ,David Airlie ,Daniel 
Vetter 
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc:  opensource.ker...@vivo.com,Bernard Zhao 
Subject: Re: [PATCH V4] amdgpu: reduce no need mutex_lock area>Am 21.04.20 um 
09:36 schrieb Bernard Zhao:
>> Maybe we could reduce the mutex_lock(>lock)`s protected code area,
>> and no need to protect pr_debug.
>
>Well that change looks rather superfluous to me.
>
>This is for freeing memory which by definition can only be done once and 
>so the should be exactly zero contention on the lock except in a case of 
>an error.
>
>Regards,
>Christian.
>
>>
>> Signed-off-by: Bernard Zhao 
>>
>> Changes since V1:
>> *commit message improve
>>
>> Changes since V2:
>> *move comment along with the mutex_unlock
>>
>> Changes since V3:
>> *lock protect the if check, there is some possibility of multi-threaded
>>   racing modify.
>>
>> Link for V1:
>> *https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1226588%2Fdata=02%7C01%7Cchristian.koenig%40amd.com%7C9f2b2080f4174c421e4f08d7e5c6b899%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637230513976133424sdata=6wezM%2F%2FjM5uuLblJejS9XFlE9DjWQ5zSt5PsqrfvCVo%3Dreserved=0
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 327317c54f7c..549bdb429883 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -1289,9 +1289,9 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
>>  mutex_lock(>lock);
>>   
>>  if (mem->mapped_to_gpu_memory > 0) {
>> +mutex_unlock(>lock);
>>  pr_debug("BO VA 0x%llx size 0x%lx is still mapped.\n",
>>  mem->va, bo_size);
>> -mutex_unlock(>lock);
>>  return -EBUSY;
>>  }
>>   
>

add Markus.Elfring in mail list 
Regards,
Bernard


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [V3] amdgpu: remove unnecessary condition check

2020-04-21 Thread Markus Elfring
>> But i have to say there are so many code not follow the kernel code-style in 
>> amdgpu module.
>> And also the ./scripts/checkpatch.pl did not throw any warning or error.
>
> That is unfortunately true, yes. But we try to push new code through the 
> usual code review and improve things as we go.
>
> On the other hand patches just to fix the coding style are usually seen as an 
> unnecessary interruption and just a waste of time.

Would you become interested in adjusting deviations from known programming 
guidelines
in an automatic way with the help of corresponding advanced development tools?

Regards,
Markus
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [V3] amdgpu: remove unnecessary condition check

2020-04-21 Thread Markus Elfring
>>> There is no need to if check again, maybe we could merge
>>> into the above else branch.

I find also this commit message still improvable (besides the mentioned
implementation details around coding style concerns).
How will corresponding review comments be taken better into account?

Regards,
Markus
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v2] drm/amdgpu: Return more error codes in amdgpu_connector_set_property()

2020-04-21 Thread Markus Elfring
> The "if(!encoder)" branch return the same value 0 of the success
> branch, maybe return -EINVAL is more better.

I suggest to improve the commit message.

* Are you still unsure about the next changes?
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=ae83d0b416db002fe95601e7f97f64b59514d936#n151

* Would you like to adjust the patch subject another bit?

* How do you think about to add the tag “Fixes”
  because of adjustments for the exception handling?


It can be nicer if all patch reviewers (including me) will be explicitly 
specified
as recipients for such messages, can't it?

Regards,
Markus
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [V3] amdgpu: remove unnecessary condition check

2020-04-21 Thread Markus Elfring
> But i have to say there are so many code not follow the kernel code-style in 
> amdgpu module.
> And also the ./scripts/checkpatch.pl did not throw any warning or error.

Will such information become more interesting for further evolution
in the affected software areas?

Regards,
Markus
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH V3] amdgpu: remove unnecessary condition check

2020-04-21 Thread Bernard Zhao
There is no need to if check again, maybe we could merge
into the above else branch.

Signed-off-by: Bernard Zhao 

Changes since V1:
*commit message improve
*code style refactoring

Changes since V2:
*code style adjust

Link for V1:
*https://lore.kernel.org/patchwork/patch/1226587/
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..5424bd921a7b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -660,13 +660,12 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
 
ret = ttm_eu_reserve_buffers(>ticket, >list,
 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else {
-   pr_err("Failed to reserve buffers in ttm\n");
+   if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
-   }
+   } else
+   ctx->reserved = true;
 
return ret;
 }
@@ -733,15 +732,12 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
 
ret = ttm_eu_reserve_buffers(>ticket, >list,
 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else
-   pr_err("Failed to reserve buffers in ttm.\n");
-
if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
-   }
+   } else
+   ctx->reserved = true;
 
return ret;
 }
-- 
2.26.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] amdgpu: fixes memleak issue when init failed

2020-04-21 Thread Bernard Zhao
VRAM manager and DRM MM when init failed, there is no operaction
to free kzalloc memory & remove device file.
This will lead to memleak & cause stability issue.

Signed-off-by: Bernard Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 24 
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 82a3299e53c0..4c5fb153e6b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -175,30 +175,44 @@ static int amdgpu_vram_mgr_init(struct 
ttm_mem_type_manager *man,
ret = device_create_file(adev->dev, _attr_mem_info_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file mem_info_vram_total\n");
-   return ret;
+   goto VRAM_TOTAL_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vis_vram_total\n");
-   return ret;
+   goto VIS_VRAM_TOTA_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file mem_info_vram_used\n");
-   return ret;
+   goto VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vis_vram_used\n");
-   return ret;
+   goto VIS_VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vram_vendor);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vram_vendor\n");
-   return ret;
+   goto VRAM_VERDOR_FAIL;
}
 
return 0;
+
+VRAM_VERDOR_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vis_vram_used);
+VIS_VRAM_USED_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vram_used);
+RVAM_USED_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vis_vram_total);
+VIS_VRAM_TOTA_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vram_total);
+VRAM_TOTAL_FAIL:
+   kfree(mgr);
+   man->priv = NULL;
+
+   return ret;
 }
 
 /**
-- 
2.26.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[add Markus.Elfring in mail list]Re:[PATCH v2] amdgpu: fixes error branch not return errno issue

2020-04-21 Thread 赵军奎


From: Bernard Zhao 
Date: 2020-04-21 10:07:50
To:  Alex Deucher ,"Christian König" 
,"David (ChunMing) Zhou" ,David 
Airlie ,Daniel Vetter ,Lyude Paul 
,Sam Ravnborg ,Bernard Zhao 
,"José Roberto de Souza" ,Andrzej 
Pietrasiewicz 
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc:  opensource.ker...@vivo.com
Subject: [PATCH v2] amdgpu: fixes error branch not return errno issue>The 
"if(!encoder)" branch return the same value 0 of the success
>branch, maybe return -EINVAL is more better.
>
>Signed-off-by: Bernard Zhao 
>
>---
>Changes since V1:
>* commit message improve
>---
> drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 14 +++---
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c 
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
>index f355d9a..1f8c6b4 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
>@@ -474,12 +474,12 @@ static int amdgpu_connector_set_property(struct 
>drm_connector *connector,
>   /* need to find digital encoder on connector */
>   encoder = amdgpu_connector_find_encoder(connector, 
> DRM_MODE_ENCODER_TMDS);
>   if (!encoder)
>-  return 0;
>+  return -EINVAL;
> 
>   amdgpu_encoder = to_amdgpu_encoder(encoder);
> 
>   if (!amdgpu_encoder->enc_priv)
>-  return 0;
>+  return -EINVAL;
> 
>   dig = amdgpu_encoder->enc_priv;
>   new_coherent_mode = val ? true : false;
>@@ -494,7 +494,7 @@ static int amdgpu_connector_set_property(struct 
>drm_connector *connector,
>   /* need to find digital encoder on connector */
>   encoder = amdgpu_connector_find_encoder(connector, 
> DRM_MODE_ENCODER_TMDS);
>   if (!encoder)
>-  return 0;
>+  return -EINVAL;
> 
>   amdgpu_encoder = to_amdgpu_encoder(encoder);
> 
>@@ -509,7 +509,7 @@ static int amdgpu_connector_set_property(struct 
>drm_connector *connector,
>   /* need to find digital encoder on connector */
>   encoder = amdgpu_connector_find_encoder(connector, 
> DRM_MODE_ENCODER_TMDS);
>   if (!encoder)
>-  return 0;
>+  return -EINVAL;
> 
>   amdgpu_encoder = to_amdgpu_encoder(encoder);
> 
>@@ -523,7 +523,7 @@ static int amdgpu_connector_set_property(struct 
>drm_connector *connector,
>   /* need to find digital encoder on connector */
>   encoder = amdgpu_connector_find_encoder(connector, 
> DRM_MODE_ENCODER_TMDS);
>   if (!encoder)
>-  return 0;
>+  return -EINVAL;
> 
>   amdgpu_encoder = to_amdgpu_encoder(encoder);
> 
>@@ -537,7 +537,7 @@ static int amdgpu_connector_set_property(struct 
>drm_connector *connector,
>   /* need to find digital encoder on connector */
>   encoder = amdgpu_connector_find_encoder(connector, 
> DRM_MODE_ENCODER_TMDS);
>   if (!encoder)
>-  return 0;
>+  return -EINVAL;
> 
>   amdgpu_encoder = to_amdgpu_encoder(encoder);
> 
>@@ -551,7 +551,7 @@ static int amdgpu_connector_set_property(struct 
>drm_connector *connector,
>   /* need to find digital encoder on connector */
>   encoder = amdgpu_connector_find_encoder(connector, 
> DRM_MODE_ENCODER_TMDS);
>   if (!encoder)
>-  return 0;
>+  return -EINVAL;
> 
>   amdgpu_encoder = to_amdgpu_encoder(encoder);
> 
>-- 
>2.7.4

>

add Markus.Elfring in mail list 
Regards,
Bernard

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH V2] drm/amdgpu: Remove an unnecessary condition check in reserve_bo_and_cond_vms()

2020-04-21 Thread Markus Elfring
> There is no need to if check again,

Thanks for this information.

* Should the function name be mentioned in this commit message?

* Would you like to adjust the patch subject another bit?


> maybe we could merge into the above else branch.

I suggest to reconsider this wording.
Are you still unsure about the next changes?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=ae83d0b416db002fe95601e7f97f64b59514d936#n151

How do you think about to add the tag “Fixes”?


It can be nicer if all patch reviewers (including me) will be explicitly 
specified
as recipients for such messages, can't it?

Regards,
Markus
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v4] amdgpu: remove unnecessary condition check

2020-04-21 Thread Christian König

Am 21.04.20 um 14:22 schrieb Bernard Zhao:

There is no need to if check again, maybe we could merge
into the above else branch.

Signed-off-by: Bernard Zhao 


You could improve the subject line and commit message a bit, e.g. 
something like:


[PATCH] drm/amdgpu: cleanup coding style in amdkfd a bit

Make the code a bit more readable by using a common error handling pattern.


With that done the patch is Reviewed-by: Christian König 
.




Changes since V1:
*commit message improve
*code style refactoring

Changes since V2:
*code style adjust

Changes since V3:
*find the best way to merge unnecessary if/else check branch

Link for V1:
*https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1226587%2Fdata=02%7C01%7Cchristian.koenig%40amd.com%7C6b521b697ea84cfd324308d7e5eeab7f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637230685564330098sdata=m2lhfgcINuB%2BT3%2BhhQjtROI3Zp38QDEilF9RTceg3HY%3Dreserved=0
---
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 20 +--
  1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..acb612c53b9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -660,15 +660,15 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
  
  	ret = ttm_eu_reserve_buffers(>ticket, >list,

 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else {
-   pr_err("Failed to reserve buffers in ttm\n");
+   if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
+   return ret;
}
  
-	return ret;

+   ctx->reserved = true;
+   return 0;
  }
  
  /**

@@ -733,17 +733,15 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
  
  	ret = ttm_eu_reserve_buffers(>ticket, >list,

 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else
-   pr_err("Failed to reserve buffers in ttm.\n");
-
if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
+   return ret;
}
  
-	return ret;

+   ctx->reserved = true;
+   return 0;
  }
  
  /**


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] amdgpu: fixes memleak issue when init failed

2020-04-21 Thread Christian König

Am 21.04.20 um 14:09 schrieb 赵军奎:

From: "Christian König" 
Date: 2020-04-21 19:22:49
To:  Bernard Zhao ,Alex Deucher ,"David (ChunMing) Zhou" 
,David Airlie ,Daniel Vetter ,Tom St Denis 
,Ori Messinger ,Sam Ravnborg 
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc:  opensource.ker...@vivo.com
Subject: Re: [PATCH] amdgpu: fixes memleak issue when init failed>Am 21.04.20 
um 13:17 schrieb Bernard Zhao:

VRAM manager and DRM MM when init failed, there is no operaction
to free kzalloc memory & remove device file.
This will lead to memleak & cause stability issue.

NAK, failure to create sysfs nodes are not critical.

Christian.


OK, get it.
By the way, should i modify this patch to just handle  in error 
branch, or that it is also unnecessary?


What you can do is to drop the "return ret" if anything with the sysfs 
nodes goes wrong and instead print the error code.


It's really annoying that loading, unloading and loading the driver 
again sometimes fails because we have a bug in the sysfs files cleanup.


We certainly should fix those bugs as well, but they are just not 
critical for correct driver functionality.


Regards,
Christian.



Regards,
Bernard


Signed-off-by: Bernard Zhao 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 24 
   1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 82a3299e53c0..4c5fb153e6b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -175,30 +175,44 @@ static int amdgpu_vram_mgr_init(struct 
ttm_mem_type_manager *man,
ret = device_create_file(adev->dev, _attr_mem_info_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file mem_info_vram_total\n");
-   return ret;
+   goto VRAM_TOTAL_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vis_vram_total\n");
-   return ret;
+   goto VIS_VRAM_TOTA_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file mem_info_vram_used\n");
-   return ret;
+   goto VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vis_vram_used\n");
-   return ret;
+   goto VIS_VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vram_vendor);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vram_vendor\n");
-   return ret;
+   goto VRAM_VERDOR_FAIL;
}
   
   	return 0;

+
+VRAM_VERDOR_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vis_vram_used);
+VIS_VRAM_USED_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vram_used);
+RVAM_USED_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vis_vram_total);
+VIS_VRAM_TOTA_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vram_total);
+VRAM_TOTAL_FAIL:
+   kfree(mgr);
+   man->priv = NULL;
+
+   return ret;
   }
   
   /**




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: put the audio codec into suspend state before gpu reset

2020-04-21 Thread Evan Quan
At default, the autosuspend delay of audio controller is 3S. If the
gpu reset is triggered within 3S(after audio controller idle),
the audio controller may be unable into suspended state. Then
the sudden gpu reset will cause some audio errors. The change
here is targeted to resolve this.

However if the audio controller is in use when the gpu reset
triggered, this change may be still not enough to put the
audio controller into suspend state. Under this case, the
gpu reset will still proceed but there will be a warning
message printed("failed to suspend display audio").

Change-Id: I33d85e6fcad1882eb33f9cde8916d57be8d5a87a
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 60 ++
 1 file changed, 60 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2d4b78d96426..983e294d0300 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -69,6 +69,7 @@
 
 #include 
 #include 
+#include 
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -4146,6 +4147,49 @@ static void amdgpu_device_unlock_adev(struct 
amdgpu_device *adev)
mutex_unlock(>lock_reset);
 }
 
+static void amdgpu_device_resume_display_audio(struct amdgpu_device *adev)
+{
+   struct pci_dev *p = NULL;
+
+   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number, 1);
+   if (p) {
+   pm_runtime_enable(&(p->dev));
+   pm_runtime_resume(&(p->dev));
+   }
+}
+
+static int amdgpu_device_suspend_display_audio(struct amdgpu_device *adev)
+{
+   struct pci_dev *p = NULL;
+   unsigned long end_jiffies;
+
+   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number, 1);
+   if (!p)
+   return -ENODEV;
+
+   /*
+* 3S is the audio controller default autosuspend delay setting.
+* 4S used here is guaranteed to cover that.
+*/
+   end_jiffies = msecs_to_jiffies(4000) + jiffies;
+   while (!pm_runtime_status_suspended(&(p->dev))) {
+   if (!pm_runtime_suspend(&(p->dev)))
+   break;
+
+   if (time_after(jiffies, end_jiffies)) {
+   dev_warn(adev->dev, "failed to suspend display 
audio\n");
+   /* TODO: abort the succeeding gpu reset? */
+   return -ETIMEDOUT;
+   }
+   }
+
+   pm_runtime_disable(&(p->dev));
+
+   return 0;
+}
+
 /**
  * amdgpu_device_gpu_recover - reset the asic and recover scheduler
  *
@@ -4170,6 +4214,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
bool use_baco =
(amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) ?
true : false;
+   bool audio_suspended = false;
 
/*
 * Flush RAM to disk so that after reboot
@@ -4227,6 +4272,19 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
return 0;
}
 
+   /*
+* Try to put the audio codec into suspend state
+* before gpu reset started.
+*
+* Due to the power domain of the graphics device
+* is shared with AZ power domain. Without this,
+* we may change the audio hardware from behind
+* the audio driver's back. That will trigger
+* some audio codec errors.
+*/
+   if (!amdgpu_device_suspend_display_audio(tmp_adev))
+   audio_suspended = true;
+
amdgpu_ras_set_error_query_ready(tmp_adev, false);
 
cancel_delayed_work_sync(_adev->delayed_init_work);
@@ -4339,6 +4397,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
/*unlock kfd: SRIOV would do it separately */
if (!(in_ras_intr && !use_baco) && !amdgpu_sriov_vf(tmp_adev))
amdgpu_amdkfd_post_reset(tmp_adev);
+   if (audio_suspended)
+   amdgpu_device_resume_display_audio(tmp_adev);
amdgpu_device_unlock_adev(tmp_adev);
}
 
-- 
2.26.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Christian König

Hi Monk,

at least on Vega that should be fine. If the RLC should use anything 
else than 0 here we should update that together with the VMID.


Regards,
Christian.

Am 21.04.20 um 11:54 schrieb Liu, Monk:

Could only be that the firmware updates the bits to something non default, I'm 
going to double check that on a Vega10.

I think that will be a sure answer, otherwise why we need those field if we 
always write 0 to them and reader always expect 0 reading back from them ??

Those fields are kind of performance counters

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Christian König 
Sent: Tuesday, April 21, 2020 5:52 PM
To: Tao, Yintian ; Liu, Monk ; He, Jacob 
; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

Am 21.04.20 um 11:45 schrieb Tao, Yintian:

-Original Message-
From: Christian König 
Sent: 2020年4月21日 17:10
To: Liu, Monk ; Tao, Yintian ;
He, Jacob ; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update

The RLC SPM configuration register contains the information how the memory 
access is made (VMID, MTYPE, etc) which should always be consistent.

So instead of a read modify write cycle of the VMID always update the whole 
register.

Signed-off-by: Christian König 
---
   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
   4 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..2a6556371046 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,12 +7030,7 @@ static int
gfx_v10_0_update_gfx_clock_gating(struct amdgpu_device *adev,
   
   static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
[yttao]: The orig_val is 0 which means except VMID field other reset fields 
will be set to 0. Whether it is legal?

According to the register specification that is the default value for those 
bits on gfx9/10.

Could only be that the firmware updates the bits to something non default, I'm 
going to double check that on a Vega10.

Regards,
Christian.

   
   	WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  } diff --git

a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index b2f10e39eff1..a92486cd038f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3570,12 +3570,7 @@ static int gfx_v7_0_rlc_resume(struct
amdgpu_device *adev)
   
   static void gfx_v7_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
   
   	WREG32(mmRLC_SPM_VMID, data);

   }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..44fdda68db98 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5613,12 +5613,7 @@ static void gfx_v8_0_unset_safe_mode(struct
amdgpu_device *adev)
   
   static void gfx_v8_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
   
   	WREG32(mmRLC_SPM_VMID, data);

   }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..b36fbf991313 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,12 +4950,7 @@ static int
gfx_v9_0_update_gfx_clock_gating(struct amdgpu_device *adev,
   
   static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
   
   	WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  }

--
2.17.1



Re: [PATCH] amdgpu: fixes memleak issue when init failed

2020-04-21 Thread Christian König

Am 21.04.20 um 13:17 schrieb Bernard Zhao:

VRAM manager and DRM MM when init failed, there is no operaction
to free kzalloc memory & remove device file.
This will lead to memleak & cause stability issue.


NAK, failure to create sysfs nodes are not critical.

Christian.



Signed-off-by: Bernard Zhao 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 24 
  1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 82a3299e53c0..4c5fb153e6b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -175,30 +175,44 @@ static int amdgpu_vram_mgr_init(struct 
ttm_mem_type_manager *man,
ret = device_create_file(adev->dev, _attr_mem_info_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file mem_info_vram_total\n");
-   return ret;
+   goto VRAM_TOTAL_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_total);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vis_vram_total\n");
-   return ret;
+   goto VIS_VRAM_TOTA_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file mem_info_vram_used\n");
-   return ret;
+   goto VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vis_vram_used);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vis_vram_used\n");
-   return ret;
+   goto VIS_VRAM_USED_FAIL;
}
ret = device_create_file(adev->dev, _attr_mem_info_vram_vendor);
if (ret) {
DRM_ERROR("Failed to create device file 
mem_info_vram_vendor\n");
-   return ret;
+   goto VRAM_VERDOR_FAIL;
}
  
  	return 0;

+
+VRAM_VERDOR_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vis_vram_used);
+VIS_VRAM_USED_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vram_used);
+RVAM_USED_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vis_vram_total);
+VIS_VRAM_TOTA_FAIL:
+   device_remove_file(adev->dev, _attr_mem_info_vram_total);
+VRAM_TOTAL_FAIL:
+   kfree(mgr);
+   man->priv = NULL;
+
+   return ret;
  }
  
  /**


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Liu, Monk
>>> Could only be that the firmware updates the bits to something non default, 
>>> I'm going to double check that on a Vega10.

I think that will be a sure answer, otherwise why we need those field if we 
always write 0 to them and reader always expect 0 reading back from them ??

Those fields are kind of performance counters 

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Christian König  
Sent: Tuesday, April 21, 2020 5:52 PM
To: Tao, Yintian ; Liu, Monk ; He, Jacob 
; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

Am 21.04.20 um 11:45 schrieb Tao, Yintian:
>
> -Original Message-
> From: Christian König 
> Sent: 2020年4月21日 17:10
> To: Liu, Monk ; Tao, Yintian ; 
> He, Jacob ; amd-gfx@lists.freedesktop.org
> Cc: Gu, Frans 
> Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update
>
> The RLC SPM configuration register contains the information how the memory 
> access is made (VMID, MTYPE, etc) which should always be consistent.
>
> So instead of a read modify write cycle of the VMID always update the whole 
> register.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
> drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
>   4 files changed, 4 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 0a03e2ad5d95..2a6556371046 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -7030,12 +7030,7 @@ static int 
> gfx_v10_0_update_gfx_clock_gating(struct amdgpu_device *adev,
>   
>   static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
> vmid)  {
> - u32 data;
> -
> - data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
> -
> - data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
> - data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
> RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
> + u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
> [yttao]: The orig_val is 0 which means except VMID field other reset fields 
> will be set to 0. Whether it is legal?

According to the register specification that is the default value for those 
bits on gfx9/10.

Could only be that the firmware updates the bits to something non default, I'm 
going to double check that on a Vega10.

Regards,
Christian.

>   
>   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  } diff --git 
> a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
> index b2f10e39eff1..a92486cd038f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
> @@ -3570,12 +3570,7 @@ static int gfx_v7_0_rlc_resume(struct 
> amdgpu_device *adev)
>   
>   static void gfx_v7_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
> vmid)  {
> - u32 data;
> -
> - data = RREG32(mmRLC_SPM_VMID);
> -
> - data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
> - data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
> RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
> + u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
>   
>   WREG32(mmRLC_SPM_VMID, data);
>   }
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> index fc6c2f2bc76c..44fdda68db98 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> @@ -5613,12 +5613,7 @@ static void gfx_v8_0_unset_safe_mode(struct 
> amdgpu_device *adev)
>   
>   static void gfx_v8_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
> vmid)  {
> - u32 data;
> -
> - data = RREG32(mmRLC_SPM_VMID);
> -
> - data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
> - data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
> RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
> + u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
>   
>   WREG32(mmRLC_SPM_VMID, data);
>   }
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> index 54eded9a6ac5..b36fbf991313 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> @@ -4950,12 +4950,7 @@ static int 
> gfx_v9_0_update_gfx_clock_gating(struct amdgpu_device *adev,
>   
>   static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
> vmid)  {
> - u32 data;
> -
> - data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
> -
> - data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
> - data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
> RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
> + u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
>   
>   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  }
> --
> 2.17.1
>


Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Christian König

Am 21.04.20 um 11:45 schrieb Tao, Yintian:


-Original Message-
From: Christian König 
Sent: 2020年4月21日 17:10
To: Liu, Monk ; Tao, Yintian ; He, Jacob 
; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update

The RLC SPM configuration register contains the information how the memory 
access is made (VMID, MTYPE, etc) which should always be consistent.

So instead of a read modify write cycle of the VMID always update the whole 
register.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
  4 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..2a6556371046 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,12 +7030,7 @@ static int gfx_v10_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
  
  static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
[yttao]: The orig_val is 0 which means except VMID field other reset fields 
will be set to 0. Whether it is legal?


According to the register specification that is the default value for 
those bits on gfx9/10.


Could only be that the firmware updates the bits to something non 
default, I'm going to double check that on a Vega10.


Regards,
Christian.

  
  	WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  } diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c

index b2f10e39eff1..a92486cd038f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3570,12 +3570,7 @@ static int gfx_v7_0_rlc_resume(struct amdgpu_device 
*adev)
  
  static void gfx_v7_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
  
  	WREG32(mmRLC_SPM_VMID, data);

  }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..44fdda68db98 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5613,12 +5613,7 @@ static void gfx_v8_0_unset_safe_mode(struct 
amdgpu_device *adev)
  
  static void gfx_v8_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
  
  	WREG32(mmRLC_SPM_VMID, data);

  }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..b36fbf991313 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,12 +4950,7 @@ static int gfx_v9_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
  
  static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)  {

-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
  
  	WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  }

--
2.17.1



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Liu, Monk
Christian

Many fields looks not like going to be still  value at all, e.g.:

RLC_SPM_PERF_CNTR   5   0x0 PERF_CNTR that is used by RLC for 
memory transactions

By your change you always set above filed to 0,  is it right ? I really doubt 
it 

Beside: to make SRIOV VF less painful please use NOKIQ version read/write if 
"one vf mode" is detected :

amdgpu_sriov_is_pp_one_vf(adev)


_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Christian König  
Sent: Tuesday, April 21, 2020 5:10 PM
To: Liu, Monk ; Tao, Yintian ; He, Jacob 
; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update

The RLC SPM configuration register contains the information how the memory 
access is made (VMID, MTYPE, etc) which should always be consistent.

So instead of a read modify write cycle of the VMID always update the whole 
register.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
 4 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..2a6556371046 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,12 +7030,7 @@ static int gfx_v10_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
 
WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  } diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index b2f10e39eff1..a92486cd038f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3570,12 +3570,7 @@ static int gfx_v7_0_rlc_resume(struct amdgpu_device 
*adev)
 
 static void gfx_v7_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
 
WREG32(mmRLC_SPM_VMID, data);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..44fdda68db98 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5613,12 +5613,7 @@ static void gfx_v8_0_unset_safe_mode(struct 
amdgpu_device *adev)
 
 static void gfx_v8_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
 
WREG32(mmRLC_SPM_VMID, data);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..b36fbf991313 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,12 +4950,7 @@ static int gfx_v9_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
 
WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  }
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH V3] amdgpu: remove unnecessary condition check

2020-04-21 Thread Christian König

Am 21.04.20 um 10:44 schrieb 赵军奎:

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..5424bd921a7b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -660,13 +660,12 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
   
   	ret = ttm_eu_reserve_buffers(>ticket, >list,

 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else {
-   pr_err("Failed to reserve buffers in ttm\n");
+   if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
-   }
+   } else
+   ctx->reserved = true;

That is still not correct coding style. In general when one branch of an
if/else uses {} the other one should use it as well.

But I agree with Felix that this change looks rather superfluous to me
as well.

Regards,
Christian.

About the code style, you are right, I checked the refers:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2FDocumentation%2Fprocess%2Fcoding-style.rst%3Fid%3D90280eaa88ac1a9140dc759941123530d5545bb6%23n191data=02%7C01%7Cchristian.koenig%40amd.com%7C3640a853a31e4e98f66d08d7e5d05300%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637230555230917911sdata=Ihum8tjQAw%2Bk67KTq%2B3zBBLHL1bSHBhBhSyG0jWeMcE%3Dreserved=0
The if and else should use the same style.
But i have to say there are so many code not follow the kernel code-style in 
amdgpu module.
And also the ./scripts/checkpatch.pl did not throw any warning or error.


That is unfortunately true, yes. But we try to push new code through the 
usual code review and improve things as we go.


On the other hand patches just to fix the coding style are usually seen 
as an unnecessary interruption and just a waste of time.


But in this particular case you could argue that the original code is 
not easily readable and you try to improve that.



If this change looks rather superfluous to all of you, should i change to the 
V1 change?
After all i don`t think there is any necessary to check "ret" again, merge the 

maybe better.
Original code:
static int reserve_bo_and_cond_vms(struct kgd_mem *mem,.
if (!ret)
ctx->reserved = true;
else
pr_err("Failed to reserve buffers in ttm.\n");

if (ret) {
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
}


In general I suggest to use the following pattern for this error 
handling and avoid the else branch altogether:


if (ret) {
    pr_err("Failed to reserve buffers in ttm.\n");
    kfree(ctx->vm_pd);
    ctx->vm_pd = NULL;
    return ret;
}

ctx->reserved = true;
return 0;

When things become more complex good practice is to insert a "goto 
error_cleanup"; instead of in place cleanup.


Regards,
Christian.



BR//bernard


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Tao, Yintian


-Original Message-
From: Christian König  
Sent: 2020年4月21日 17:10
To: Liu, Monk ; Tao, Yintian ; He, Jacob 
; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update

The RLC SPM configuration register contains the information how the memory 
access is made (VMID, MTYPE, etc) which should always be consistent.

So instead of a read modify write cycle of the VMID always update the whole 
register.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
 4 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..2a6556371046 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,12 +7030,7 @@ static int gfx_v10_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
[yttao]: The orig_val is 0 which means except VMID field other reset fields 
will be set to 0. Whether it is legal?
 
WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  } diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index b2f10e39eff1..a92486cd038f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3570,12 +3570,7 @@ static int gfx_v7_0_rlc_resume(struct amdgpu_device 
*adev)
 
 static void gfx_v7_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
 
WREG32(mmRLC_SPM_VMID, data);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..44fdda68db98 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5613,12 +5613,7 @@ static void gfx_v8_0_unset_safe_mode(struct 
amdgpu_device *adev)
 
 static void gfx_v8_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
 
WREG32(mmRLC_SPM_VMID, data);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..b36fbf991313 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,12 +4950,7 @@ static int gfx_v9_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
 
WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  }
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Christian König
The RLC SPM configuration register contains the information how the memory
access is made (VMID, MTYPE, etc) which should always be consistent.

So instead of a read modify write cycle of the VMID always update
the whole register.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
 4 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..2a6556371046 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,12 +7030,7 @@ static int gfx_v10_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)
 {
-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
 
WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index b2f10e39eff1..a92486cd038f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3570,12 +3570,7 @@ static int gfx_v7_0_rlc_resume(struct amdgpu_device 
*adev)
 
 static void gfx_v7_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)
 {
-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
 
WREG32(mmRLC_SPM_VMID, data);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..44fdda68db98 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5613,12 +5613,7 @@ static void gfx_v8_0_unset_safe_mode(struct 
amdgpu_device *adev)
 
 static void gfx_v8_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)
 {
-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
 
WREG32(mmRLC_SPM_VMID, data);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..b36fbf991313 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,12 +4950,7 @@ static int gfx_v9_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)
 {
-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
 
WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: only update spm vmid at first time

2020-04-21 Thread Yintian Tao
Original idea is from Monk which only update spm vmid
at first time which can release the frequent r/w register
burden under virtualization.

v2: set spm_vmid_updated to false when job timedout

Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 8 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  | 2 ++
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 35c381ec0423..3bf59dfef05d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -50,6 +50,9 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
DRM_ERROR("Process information: process %s pid %d thread %s pid %d\n",
  ti.process_name, ti.tgid, ti.task_name, ti.pid);
 
+   if (job->vm)
+   job->vm->spm_vmid_updated = false;
+
if (amdgpu_device_should_recover_gpu(ring->adev)) {
amdgpu_device_gpu_recover(ring->adev, job);
} else {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index accbb34ea670..636a6d23fd96 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1080,11 +1080,14 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,
struct dma_fence *fence = NULL;
bool pasid_mapping_needed = false;
unsigned patch_offset = 0;
-   bool update_spm_vmid_needed = (job->vm && 
(job->vm->reserved_vmid[vmhub] != NULL));
+   bool update_spm_vmid_needed = (job->vm && 
(job->vm->reserved_vmid[vmhub] != NULL) &&
+  !job->vm->spm_vmid_updated);
int r;
 
-   if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid)
+   if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid) {
adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
+   job->vm->spm_vmid_updated = true;
+   }
 
if (amdgpu_vmid_had_gpu_reset(adev, id)) {
gds_switch_needed = true;
@@ -2797,6 +2800,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
int r, i;
 
vm->va = RB_ROOT_CACHED;
+   vm->spm_vmid_updated = false;
for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
vm->reserved_vmid[i] = NULL;
INIT_LIST_HEAD(>evicted);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index ea771d84bf2b..02409e0ecf2c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -319,6 +319,8 @@ struct amdgpu_vm {
boolbulk_moveable;
/* Flag to indicate if VM is used for compute */
boolis_compute_context;
+   /* flag to represent whether spm vmid has been updated */
+   boolspm_vmid_updated;
 };
 
 struct amdgpu_vm_manager {
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH V3] amdgpu: remove unnecessary condition check

2020-04-21 Thread Christian König

Am 21.04.20 um 10:03 schrieb Bernard Zhao:

There is no need to if check again, maybe we could merge
into the above else branch.

Signed-off-by: Bernard Zhao 

Changes since V1:
*commit message improve
*code style refactoring

Changes since V2:
*code style adjust

Link for V1:
*https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1226587%2Fdata=02%7C01%7Cchristian.koenig%40amd.com%7C0b8fffafb715474289b208d7e5ca7f6c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637230530201280350sdata=Sewv5ESX%2B0B4DbFbE03uM5sifrEcmJllC8pt7J42I7M%3Dreserved=0
---
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   | 18 +++---
  1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..5424bd921a7b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -660,13 +660,12 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
  
  	ret = ttm_eu_reserve_buffers(>ticket, >list,

 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else {
-   pr_err("Failed to reserve buffers in ttm\n");
+   if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
-   }
+   } else
+   ctx->reserved = true;


That is still not correct coding style. In general when one branch of an 
if/else uses {} the other one should use it as well.


But I agree with Felix that this change looks rather superfluous to me 
as well.


Regards,
Christian.

  
  	return ret;

  }
@@ -733,15 +732,12 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
  
  	ret = ttm_eu_reserve_buffers(>ticket, >list,

 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else
-   pr_err("Failed to reserve buffers in ttm.\n");
-
if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
-   }
+   } else
+   ctx->reserved = true;
  
  	return ret;

  }


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH V4] amdgpu: reduce no need mutex_lock area

2020-04-21 Thread Christian König

Am 21.04.20 um 09:36 schrieb Bernard Zhao:

Maybe we could reduce the mutex_lock(>lock)`s protected code area,
and no need to protect pr_debug.


Well that change looks rather superfluous to me.

This is for freeing memory which by definition can only be done once and 
so the should be exactly zero contention on the lock except in a case of 
an error.


Regards,
Christian.



Signed-off-by: Bernard Zhao 

Changes since V1:
*commit message improve

Changes since V2:
*move comment along with the mutex_unlock

Changes since V3:
*lock protect the if check, there is some possibility of multi-threaded
  racing modify.

Link for V1:
*https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1226588%2Fdata=02%7C01%7Cchristian.koenig%40amd.com%7C9f2b2080f4174c421e4f08d7e5c6b899%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637230513976133424sdata=6wezM%2F%2FjM5uuLblJejS9XFlE9DjWQ5zSt5PsqrfvCVo%3Dreserved=0
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 327317c54f7c..549bdb429883 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1289,9 +1289,9 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
mutex_lock(>lock);
  
  	if (mem->mapped_to_gpu_memory > 0) {

+   mutex_unlock(>lock);
pr_debug("BO VA 0x%llx size 0x%lx is still mapped.\n",
mem->va, bo_size);
-   mutex_unlock(>lock);
return -EBUSY;
}
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH V2] amdgpu: remove unnecessary condition check

2020-04-21 Thread Christian König

Am 21.04.20 um 04:41 schrieb Bernard Zhao:

There is no need to if check again, maybe we could merge
into the above else branch.

Signed-off-by: Bernard Zhao 

---
Changes since V1:
*commit message improve
*code style refactoring

Link for V1:
* 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1226587%2Fdata=02%7C01%7Cchristian.koenig%40amd.com%7C50bb3a13f28b4e5d787508d7e59d9903%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637230337359422169sdata=LXUJgHOxfwSpacdW6suiI00z8egbRC3z3za0H3XtNV4%3Dreserved=0
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +---
  1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..a64eeb07bec4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -660,13 +660,15 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
  
  	ret = ttm_eu_reserve_buffers(>ticket, >list,

 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else {
+
+   if (ret) {
pr_err("Failed to reserve buffers in ttm\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
}
+   else {
+   ctx->reserved = true;
+   }
  
  	return ret;

  }
@@ -733,15 +735,15 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
  
  	ret = ttm_eu_reserve_buffers(>ticket, >list,

 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else
-   pr_err("Failed to reserve buffers in ttm.\n");
  
  	if (ret) {

+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
}
+   else {


Please use "} else {" here.

Christian.


+   ctx->reserved = true;
+   }
  
  	return ret;

  }


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH -next] drm/amdgpu: remove set but not used variable 'priority'

2020-04-21 Thread Christian König

Am 21.04.20 um 04:41 schrieb YueHaibing:

drivers/gpu/drm/amd/amdgpu/amdgpu_job.c: In function amdgpu_job_submit:
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:148:26: warning: variable priority set 
but not used [-Wunused-but-set-variable]

commit 33abcb1f5a17 ("drm/amdgpu: set compute queue priority at mqd_init")
left behind this, remove it.

Signed-off-by: YueHaibing 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 --
  1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 35c381ec0423..47207188c569 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -145,7 +145,6 @@ void amdgpu_job_free(struct amdgpu_job *job)
  int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
  void *owner, struct dma_fence **f)
  {
-   enum drm_sched_priority priority;
int r;
  
  	if (!f)

@@ -157,7 +156,6 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
  
  	*f = dma_fence_get(>base.s_fence->finished);

amdgpu_job_free_resources(job);
-   priority = job->base.s_priority;
drm_sched_entity_push_job(>base, entity);
  
  	return 0;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH V2] amdgpu: optimization-- reduce noneed mutex_lock area

2020-04-21 Thread Bernard Zhao
Maybe we could reduce the mutex_lock(>lock)`s protected code area,
and noneed to protect pr_debug.

Signed-off-by: Bernard Zhao 

Changes since V1:
*commit message improve

Link for V1:
*https://lore.kernel.org/patchwork/patch/1226588/
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 327317c54f7c..3c3769e57174 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1285,17 +1285,18 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
struct bo_vm_reservation_context ctx;
struct ttm_validate_buffer *bo_list_entry;
int ret;
+   unsigned int mapped_to_gpu_memory;
 
mutex_lock(>lock);
+   mapped_to_gpu_memory = mem->mapped_to_gpu_memory;
+   mutex_unlock(>lock);
 
-   if (mem->mapped_to_gpu_memory > 0) {
+   if (mapped_to_gpu_memory > 0) {
pr_debug("BO VA 0x%llx size 0x%lx is still mapped.\n",
mem->va, bo_size);
-   mutex_unlock(>lock);
return -EBUSY;
}
 
-   mutex_unlock(>lock);
/* lock is not needed after this, since mem is unused and will
 * be freed anyway
 */
-- 
2.26.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2] amdgpu: fixes error branch not return errno issue

2020-04-21 Thread Bernard Zhao
The "if(!encoder)" branch return the same value 0 of the success
branch, maybe return -EINVAL is more better.

Signed-off-by: Bernard Zhao 

---
Changes since V1:
* commit message improve
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
index f355d9a..1f8c6b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
@@ -474,12 +474,12 @@ static int amdgpu_connector_set_property(struct 
drm_connector *connector,
/* need to find digital encoder on connector */
encoder = amdgpu_connector_find_encoder(connector, 
DRM_MODE_ENCODER_TMDS);
if (!encoder)
-   return 0;
+   return -EINVAL;
 
amdgpu_encoder = to_amdgpu_encoder(encoder);
 
if (!amdgpu_encoder->enc_priv)
-   return 0;
+   return -EINVAL;
 
dig = amdgpu_encoder->enc_priv;
new_coherent_mode = val ? true : false;
@@ -494,7 +494,7 @@ static int amdgpu_connector_set_property(struct 
drm_connector *connector,
/* need to find digital encoder on connector */
encoder = amdgpu_connector_find_encoder(connector, 
DRM_MODE_ENCODER_TMDS);
if (!encoder)
-   return 0;
+   return -EINVAL;
 
amdgpu_encoder = to_amdgpu_encoder(encoder);
 
@@ -509,7 +509,7 @@ static int amdgpu_connector_set_property(struct 
drm_connector *connector,
/* need to find digital encoder on connector */
encoder = amdgpu_connector_find_encoder(connector, 
DRM_MODE_ENCODER_TMDS);
if (!encoder)
-   return 0;
+   return -EINVAL;
 
amdgpu_encoder = to_amdgpu_encoder(encoder);
 
@@ -523,7 +523,7 @@ static int amdgpu_connector_set_property(struct 
drm_connector *connector,
/* need to find digital encoder on connector */
encoder = amdgpu_connector_find_encoder(connector, 
DRM_MODE_ENCODER_TMDS);
if (!encoder)
-   return 0;
+   return -EINVAL;
 
amdgpu_encoder = to_amdgpu_encoder(encoder);
 
@@ -537,7 +537,7 @@ static int amdgpu_connector_set_property(struct 
drm_connector *connector,
/* need to find digital encoder on connector */
encoder = amdgpu_connector_find_encoder(connector, 
DRM_MODE_ENCODER_TMDS);
if (!encoder)
-   return 0;
+   return -EINVAL;
 
amdgpu_encoder = to_amdgpu_encoder(encoder);
 
@@ -551,7 +551,7 @@ static int amdgpu_connector_set_property(struct 
drm_connector *connector,
/* need to find digital encoder on connector */
encoder = amdgpu_connector_find_encoder(connector, 
DRM_MODE_ENCODER_TMDS);
if (!encoder)
-   return 0;
+   return -EINVAL;
 
amdgpu_encoder = to_amdgpu_encoder(encoder);
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH V2] amdgpu: remove unnecessary condition check

2020-04-21 Thread Bernard Zhao
There is no need to if check again, maybe we could merge
into the above else branch.

Signed-off-by: Bernard Zhao 

---
Changes since V1:
*commit message improve
*code style refactoring

Link for V1:
* https://lore.kernel.org/patchwork/patch/1226587/
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..a64eeb07bec4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -660,13 +660,15 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
 
ret = ttm_eu_reserve_buffers(>ticket, >list,
 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else {
+
+   if (ret) {
pr_err("Failed to reserve buffers in ttm\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
}
+   else {
+   ctx->reserved = true;
+   }
 
return ret;
 }
@@ -733,15 +735,15 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
 
ret = ttm_eu_reserve_buffers(>ticket, >list,
 false, >duplicates);
-   if (!ret)
-   ctx->reserved = true;
-   else
-   pr_err("Failed to reserve buffers in ttm.\n");
 
if (ret) {
+   pr_err("Failed to reserve buffers in ttm.\n");
kfree(ctx->vm_pd);
ctx->vm_pd = NULL;
}
+   else {
+   ctx->reserved = true;
+   }
 
return ret;
 }
-- 
2.26.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH V3] amdgpu:optimization-- reduce no need mutex_lock area

2020-04-21 Thread Bernard Zhao
Maybe we could reduce the mutex_lock(>lock)`s protected code area,
and no need to protect pr_debug.

Signed-off-by: Bernard Zhao 

Changes since V1:
*commit message improve

Changes since V2:
*move comment along with the mutex_unlock

Link for V1:
*https://lore.kernel.org/patchwork/patch/1226588/
Link for V2:
*https://lore.kernel.org/patchwork/patch/1227907/
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 327317c54f7c..f03d9843d723 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1285,21 +1285,21 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
struct bo_vm_reservation_context ctx;
struct ttm_validate_buffer *bo_list_entry;
int ret;
+   unsigned int mapped_to_gpu_memory;
 
mutex_lock(>lock);
+   mapped_to_gpu_memory = mem->mapped_to_gpu_memory;
+   mutex_unlock(>lock);
+   /* lock is not needed after this, since mem is unused and will
+* be freed anyway
+*/
 
-   if (mem->mapped_to_gpu_memory > 0) {
+   if (mapped_to_gpu_memory > 0) {
pr_debug("BO VA 0x%llx size 0x%lx is still mapped.\n",
mem->va, bo_size);
-   mutex_unlock(>lock);
return -EBUSY;
}
 
-   mutex_unlock(>lock);
-   /* lock is not needed after this, since mem is unused and will
-* be freed anyway
-*/
-
/* No more MMU notifiers */
amdgpu_mn_unregister(mem->bo);
 
-- 
2.26.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re:Re: [PATCH V2] amdgpu: remove unnecessary condition check

2020-04-21 Thread 赵军奎

From: Felix Kuehling 
Date: 2020-04-21 12:24:19
To:  1587180037-113840-1-git-send-email-bern...@vivo.com,Alex Deucher 
,"Christian König" ,"David 
(ChunMing) Zhou" ,David Airlie ,Daniel 
Vetter 
,amd-gfx@lists.freedesktop.org,dri-de...@lists.freedesktop.org,linux-ker...@vger.kernel.org
Cc:  opensource.ker...@vivo.com,Bernard Zhao 
Subject: Re: [PATCH V2] amdgpu: remove unnecessary condition check>Hi Bernard,
>
>Please see comments inline.
>
>Am 2020-04-20 um 10:41 p.m. schrieb Bernard Zhao:
>> There is no need to if check again, maybe we could merge
>> into the above else branch.
>>
>> Signed-off-by: Bernard Zhao 
>>
>> ---
>> Changes since V1:
>> *commit message improve
>> *code style refactoring
>>
>> Link for V1:
>> * https://lore.kernel.org/patchwork/patch/1226587/
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +---
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 9dff792c9290..a64eeb07bec4 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -660,13 +660,15 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
>>  
>>  ret = ttm_eu_reserve_buffers(>ticket, >list,
>>   false, >duplicates);
>> -if (!ret)
>> -ctx->reserved = true;
>> -else {
>> +
>> +if (ret) {
>>  pr_err("Failed to reserve buffers in ttm\n");
>>  kfree(ctx->vm_pd);
>>  ctx->vm_pd = NULL;
>>  }
>> +else {
>> +ctx->reserved = true;
>> +}
>
>Here you're just reversing the if and else branches. This change looks
>completely superfluous to me.
>
>You're also breaking coding style conventions. The "else" should be on
>the same line as the closing brace "}". I'm pretty sure checkpatch.pl
>will complain about this.
>

In this file, only these two functions are  
format. 
So in V2, after improve the commit  info, 
I refer to the following code style suggestions and modify it to  format
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?id=90280eaa88ac1a9140dc759941123530d5545bb6#n191
(refer from Markus Elfring`s suggestion).

>>  
>>  return ret;
>>  }
>> @@ -733,15 +735,15 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
>>  
>>  ret = ttm_eu_reserve_buffers(>ticket, >list,
>>   false, >duplicates);
>> -if (!ret)
>> -ctx->reserved = true;
>> -else
>> -pr_err("Failed to reserve buffers in ttm.\n");
>>  
>>  if (ret) {
>> +pr_err("Failed to reserve buffers in ttm.\n");
>>  kfree(ctx->vm_pd);
>>  ctx->vm_pd = NULL;
>>  }
>> +else {
>> +ctx->reserved = true;
>> +}
>
>Same as above regarding coding style.
>
>To minimize unnecessary code changes, you can merge the "if (ret) ..."
>code into the else-branch of the previous if.
>
>Regards,
>  Felix
>
>
>>  
>>  return ret;
>>  }


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH -next] drm/amdgpu: remove set but not used variable 'priority'

2020-04-21 Thread YueHaibing
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c: In function amdgpu_job_submit:
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:148:26: warning: variable priority set 
but not used [-Wunused-but-set-variable]

commit 33abcb1f5a17 ("drm/amdgpu: set compute queue priority at mqd_init")
left behind this, remove it.

Signed-off-by: YueHaibing 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 35c381ec0423..47207188c569 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -145,7 +145,6 @@ void amdgpu_job_free(struct amdgpu_job *job)
 int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
  void *owner, struct dma_fence **f)
 {
-   enum drm_sched_priority priority;
int r;
 
if (!f)
@@ -157,7 +156,6 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
 
*f = dma_fence_get(>base.s_fence->finished);
amdgpu_job_free_resources(job);
-   priority = job->base.s_priority;
drm_sched_entity_push_job(>base, entity);
 
return 0;
-- 
2.17.1


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 0/8] psp code clean up

2020-04-21 Thread Clements, John
[AMD Public Use]

The series is: Reviewed-by: John Clements 

-Original Message-
From: Chen, Guchun  
Sent: Monday, April 20, 2020 8:56 PM
To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org; 
Deucher, Alexander ; Clements, John 

Subject: RE: [PATCH 0/8] psp code clean up

[AMD Public Use]

The series is: Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Zhang, Hawking  
Sent: Monday, April 20, 2020 6:17 PM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
; Clements, John ; Chen, 
Guchun 
Cc: Zhang, Hawking 
Subject: [PATCH 0/8] psp code clean up

the seires make cleanup in current software psp support, including:
1). retire unnecessary ip callback function like support_vmr_ring, 
check_fw_loading.
2). remove unnecessary tOS version check 3). create common helper functions to 
avoid duplicate code per IP generation

Hawking Zhang (8):
  drm/amdgpu: retire support_vmr_ring interface
  drm/amdgpu: remove unnecessary tOS version check
  drm/amdgpu: retire unused check_fw_loading status
  drm/amdgpu: add helper function to init asd ucode
  drm/amdgpu: switch to helper function to init asd ucode
  drm/amdgpu: add helper function to init sos ucode
  drm/amdgpu: switch to helper function to init sos ucode
  drm/amdgpu: retire legacy vega10 sos version check

 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 130   
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h |  14 +-  
drivers/gpu/drm/amd/amdgpu/psp_v10_0.c  | 141 +  
drivers/gpu/drm/amd/amdgpu/psp_v11_0.c  | 235 ++--  
drivers/gpu/drm/amd/amdgpu/psp_v12_0.c  | 172 +
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c   | 262 +++-
 6 files changed, 149 insertions(+), 805 deletions(-)

--
2.7.4
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx