RE: [PATCH] drm/amdgpu: turn back rlcg write for gfx_v10

2020-05-12 Thread Tao, Yintian
Ping...

-Original Message-
From: Yintian Tao  
Sent: 2020年5月12日 18:17
To: Deucher, Alexander ; Liu, Monk 
; Liu, Shaoyun 
Cc: amd-gfx@lists.freedesktop.org; Tao, Yintian 
Subject: [PATCH] drm/amdgpu: turn back rlcg write for gfx_v10

There is no need to use amdgpu_mm_wreg_mmio_rlc() during initialization
because this interface is only designed for the debugfs case, to access
registers that are only permitted through RLCG at run time. Therefore, turn
back to the rlcg write for gfx_v10.
If we do not turn it back, amdgpu fails to load:
[   54.904333] amdgpu: SMU driver if version not matched
[   54.904393] amdgpu: SMU is initialized successfully!
[   54.905971] [drm] kiq ring mec 2 pipe 1 q 0
[   55.115416] amdgpu :00:06.0: [drm:amdgpu_ring_test_helper [amdgpu]] 
*ERROR* ring gfx_0.0.0 test failed (-110)
[   55.118877] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block 
 failed -110
[   55.126587] amdgpu :00:06.0: amdgpu_device_ip_init failed
[   55.133466] amdgpu :00:06.0: Fatal error during GPU init

Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 449408cfd018..bd5dd4f64311 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4577,13 +4577,11 @@ static int gfx_v10_0_init_csb(struct amdgpu_device 
*adev)
adev->gfx.rlc.funcs->get_csb_buffer(adev, adev->gfx.rlc.cs_ptr);
 
/* csib */
-   /* amdgpu_mm_wreg_mmio_rlc will fall back to mmio if doesn't support 
rlcg_write */
-   amdgpu_mm_wreg_mmio_rlc(adev, SOC15_REG_OFFSET(GC, 0, 
mmRLC_CSIB_ADDR_HI),
-adev->gfx.rlc.clear_state_gpu_addr >> 32, 0);
-   amdgpu_mm_wreg_mmio_rlc(adev, SOC15_REG_OFFSET(GC, 0, 
mmRLC_CSIB_ADDR_LO),
-adev->gfx.rlc.clear_state_gpu_addr & 
0xfffc, 0);
-   amdgpu_mm_wreg_mmio_rlc(adev, SOC15_REG_OFFSET(GC, 0, 
mmRLC_CSIB_LENGTH),
-adev->gfx.rlc.clear_state_size, 0);
+   WREG32_SOC15_RLC(GC, 0, mmRLC_CSIB_ADDR_HI,
+adev->gfx.rlc.clear_state_gpu_addr >> 32);
+   WREG32_SOC15_RLC(GC, 0, mmRLC_CSIB_ADDR_LO,
+adev->gfx.rlc.clear_state_gpu_addr & 0xfffc);
+   WREG32_SOC15_RLC(GC, 0, mmRLC_CSIB_LENGTH, 
+adev->gfx.rlc.clear_state_size);
 
return 0;
 }
@@ -5192,7 +5190,7 @@ static int gfx_v10_0_cp_gfx_enable(struct amdgpu_device 
*adev, bool enable)
tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, ME_HALT, enable ? 0 : 1);
tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, PFP_HALT, enable ? 0 : 1);
tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, CE_HALT, enable ? 0 : 1);
-   amdgpu_mm_wreg_mmio_rlc(adev, SOC15_REG_OFFSET(GC, 0, mmCP_ME_CNTL), 
tmp, 0);
+   WREG32_SOC15_RLC(GC, 0, mmCP_ME_CNTL, tmp);
 
for (i = 0; i < adev->usec_timeout; i++) {
if (RREG32_SOC15(GC, 0, mmCP_STAT) == 0)
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
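
For context on the two write paths mentioned above, here is a simplified sketch
(not the actual amdgpu implementation; all example_* names are hypothetical):
amdgpu_mm_wreg_mmio_rlc() is the debugfs/run-time helper that prefers an
RLCG-backed write when one is available and falls back to plain MMIO, whereas
init-time code can use WREG32_SOC15_RLC() directly, as the patch does.

/* Simplified illustration only -- not the real driver code. */
static void example_wreg_mmio_rlc(struct amdgpu_device *adev, u32 reg, u32 value)
{
	/* hypothetical check: RLCG write path available and reg in its range */
	if (example_rlcg_write_supported(adev, reg))
		example_rlcg_wreg(adev, reg, value);	/* route the write through RLC */
	else
		WREG32(reg, value);			/* plain MMIO write */
}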


RE: [PATCH] drm/amdgpu: protect ring overrun

2020-04-23 Thread Tao, Yintian
Hi  Christian

Thanks. I will remove the initialization of r.

Best Regards
Yintian Tao
-Original Message-
From: Christian König  
Sent: 2020年4月23日 20:22
To: Tao, Yintian ; Koenig, Christian 
; Liu, Monk ; Liu, Shaoyun 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: protect ring overrun

Am 23.04.20 um 11:06 schrieb Yintian Tao:
> Wait for the oldest sequence on the ring to be signaled in order to 
> make sure there will be no command overrun.
>
> v2: fix coding style and remove abs operation

One nit pick below, with that fixed the patch is Reviewed-by: Christian König 


>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 10 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   | 22 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  8 +++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|  1 -
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |  1 -
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 14 +++---
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|  8 +++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  8 +++-
>   9 files changed, 61 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 7531527067df..397bd5fa77cb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -192,14 +192,22 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct 
> dma_fence **f,
>* Used For polling fence.
>* Returns 0 on success, -ENOMEM on failure.
>*/
> -int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s)
> +int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s,
> +   uint32_t timeout)
>   {
>   uint32_t seq;
> + signed long r = 0;

Please drop the initialization of r here. That is usually seen as rather bad 
style because it prevents the compiler from raising a warning when this really 
isn't initialized.

Regards,
Christian.

>   
>   if (!s)
>   return -EINVAL;
>   
>   seq = ++ring->fence_drv.sync_seq;
> + r = amdgpu_fence_wait_polling(ring,
> +   seq - ring->fence_drv.num_fences_mask,
> +   timeout);
> + if (r < 1)
> + return -ETIMEDOUT;
> +
>   amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
>  seq, 0);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a721b0e0ff69..0103acc57474 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -675,13 +675,15 @@ uint32_t amdgpu_kiq_rreg(struct amdgpu_device 
> *adev, uint32_t reg)
>   
>   spin_lock_irqsave(&kiq->ring_lock, flags);
>   if (amdgpu_device_wb_get(adev, &reg_val_offs)) {
> - spin_unlock_irqrestore(&kiq->ring_lock, flags);
>   pr_err("critical bug! too many kiq readers\n");
> - goto failed_kiq_read;
> + goto failed_unlock;
>   }
>   amdgpu_ring_alloc(ring, 32);
>   amdgpu_ring_emit_rreg(ring, reg, reg_val_offs);
> - amdgpu_fence_emit_polling(ring, &seq);
> + r = amdgpu_fence_emit_polling(ring, &seq, MAX_KIQ_REG_WAIT);
> + if (r)
> + goto failed_undo;
> +
>   amdgpu_ring_commit(ring);
>   spin_unlock_irqrestore(&kiq->ring_lock, flags);
>   
> @@ -712,7 +714,13 @@ uint32_t amdgpu_kiq_rreg(struct amdgpu_device *adev, 
> uint32_t reg)
>   amdgpu_device_wb_free(adev, reg_val_offs);
>   return value;
>   
> +failed_undo:
> + amdgpu_ring_undo(ring);
> +failed_unlock:
> + spin_unlock_irqrestore(&kiq->ring_lock, flags);
>   failed_kiq_read:
> + if (reg_val_offs)
> + amdgpu_device_wb_free(adev, reg_val_offs);
>   pr_err("failed to read reg:%x\n", reg);
>   return ~0;
>   }
> @@ -730,7 +738,10 @@ void amdgpu_kiq_wreg(struct amdgpu_device *adev, 
> uint32_t reg, uint32_t v)
>   spin_lock_irqsave(>ring_lock, flags);
>   amdgpu_ring_alloc(ring, 32);
>   amdgpu_ring_emit_wreg(ring, reg, v);
> - amdgpu_fence_emit_polling(ring, &seq);
> + r = amdgpu_fence_emit_polling(ring, &seq, MAX_KIQ_REG_WAIT);
> + if (r)
> + goto failed_undo;
> +
>   amdgpu_ring_commit(ring);
>   spin_unlock_irqrestore(&kiq->ring_lock, flags);
>   
> @@ -759,6 +770,9 @@ void amdgpu_kiq_wreg(struct amdgpu_device *adev, 
> uint32_t reg, uint32_t v)
>   
>   return;
>   
> +fa
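
The core of the change under review is the wait added to
amdgpu_fence_emit_polling(): before emitting a new polling fence, wait for the
oldest fence the driver can still track, so the ring can never be overrun. A
condensed restatement of the hunk above, with illustrative numbers in the
comment (a sketch, not the final patch):

/* Sketch: with num_fences_mask == 7 (8 fences tracked), emitting seq 10
 * first waits for seq 3; if that never signals within the timeout the
 * caller gets -ETIMEDOUT and must undo its ring allocation. */
static int example_emit_polling(struct amdgpu_ring *ring, uint32_t *s,
				uint32_t timeout)
{
	uint32_t seq = ++ring->fence_drv.sync_seq;

	/* never allow more unsignaled fences than the fence driver tracks */
	if (amdgpu_fence_wait_polling(ring,
				      seq - ring->fence_drv.num_fences_mask,
				      timeout) < 1)
		return -ETIMEDOUT;

	amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr, seq, 0);
	*s = seq;
	return 0;
}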

RE: [PATCH 2/2] drm/amdgpu: drop the unused local variable

2020-04-23 Thread Tao, Yintian
Hi  Hawking

Can you also help remove the same local variable kiq from 
gfx_v10_0_ring_emit_rreg? Thanks in advance.
After that, Reviewed-by: Yintian Tao 



Best Regards
Yintian Tao
-Original Message-
From: amd-gfx  On Behalf Of Hawking Zhang
Sent: 2020年4月23日 17:02
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking 
Subject: [PATCH 2/2] drm/amdgpu: drop the unused local variable

local variable kiq won't be used in function gfx_v8_0_ring_emit_rreg

Change-Id: I6229987c8ce43ff0d55e1fae15ede9cb0827f76d
Signed-off-by: Hawking Zhang 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 8dc8e90..9644614 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -6395,7 +6395,6 @@ static void gfx_v8_0_ring_emit_rreg(struct amdgpu_ring 
*ring, uint32_t reg,
uint32_t reg_val_offs)
 {
struct amdgpu_device *adev = ring->adev;
-   struct amdgpu_kiq *kiq = >gfx.kiq;
 
amdgpu_ring_write(ring, PACKET3(PACKET3_COPY_DATA, 4));
amdgpu_ring_write(ring, 0 | /* src: register*/
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/8] drm/amdgpu: ignore TA ucode for SRIOV

2020-04-23 Thread Tao, Yintian
Series is Acked-by: Yintian Tao 

-Original Message-
From: amd-gfx  On Behalf Of Monk Liu
Sent: 2020年4月23日 15:02
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk 
Subject: [PATCH 1/8] drm/amdgpu: ignore TA ucode for SRIOV

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index 0afd610..b4b0242 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -194,6 +194,8 @@ static int psp_v11_0_init_microcode(struct psp_context *psp)
case CHIP_NAVI10:
case CHIP_NAVI14:
case CHIP_NAVI12:
+   if (amdgpu_sriov_vf(adev))
+   break;
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_ta.bin", 
chip_name);
err = request_firmware(>psp.ta_fw, fw_name, adev->dev);
if (err) {
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: protect kiq overrun

2020-04-22 Thread Tao, Yintian
Hi  Christian


OK, I got it. The real max number that can be submitted to the kiq ring buffer is 1024.
If we use the num_fences_mask value then the max submission number will be 
reduced to 512; do you think that is OK?


Best Regards
Yintian Tao
-Original Message-
From: Koenig, Christian  
Sent: 2020年4月23日 2:43
To: Tao, Yintian ; Liu, Monk ; Kuehling, 
Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: protect kiq overrun

Am 22.04.20 um 16:50 schrieb Yintian Tao:
> Wait for the oldest sequence on the kiq ring to be signaled in order 
> to make sure there will be no kiq overrun.
>
> v2: remove the unused variable and correct
>  kiq max_sub_num value

First of all this should probably be added to the fence handling code and not 
the kiq code.

Then you are kind of duplicating some of the functionality we have in the ring 
handling here. Probably better to avoid this, see
amdgpu_fence_driver_init_ring() as well. That's also why I suggested to use the 
num_fences_mask value.

Regards,
Christian.

>
> Signed-off-by: Yintian Tao 
> ---
>   .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  6 
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  6 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   | 30 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |  3 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  6 
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  6 
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|  7 +
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  7 +
>   8 files changed, 71 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
> index 691c89705bcd..fac8b9713dfc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
> @@ -325,6 +325,12 @@ static int kgd_hiq_mqd_load(struct kgd_dev *kgd, void 
> *mqd,
>mec, pipe, queue_id);
>   
>   spin_lock(&adev->gfx.kiq.ring_lock);
> + r = amdgpu_gfx_kiq_is_avail(&adev->gfx.kiq);
> + if (r) {
> + pr_err("critical bug! too many kiq submission\n");
> + goto out_unlock;
> + }
> +
>   r = amdgpu_ring_alloc(kiq_ring, 7);
>   if (r) {
>   pr_err("Failed to alloc KIQ (%d).\n", r); diff --git 
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> index df841c2ac5e7..fd42c126510f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> @@ -323,6 +323,12 @@ int kgd_gfx_v9_hiq_mqd_load(struct kgd_dev *kgd, void 
> *mqd,
>mec, pipe, queue_id);
>   
>   spin_lock(&adev->gfx.kiq.ring_lock);
> + r = amdgpu_gfx_kiq_is_avail(&adev->gfx.kiq);
> + if (r) {
> + pr_err("critical bug! too many kiq submissions\n");
> + goto out_unlock;
> + }
> +
>   r = amdgpu_ring_alloc(kiq_ring, 7);
>   if (r) {
>   pr_err("Failed to alloc KIQ (%d).\n", r); diff --git 
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a721b0e0ff69..84e66c45df37 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -321,6 +321,9 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
>AMDGPU_RING_PRIO_DEFAULT);
>   if (r)
>   dev_warn(adev->dev, "(%d) failed to init kiq ring\n", r);
> + else
> + kiq->max_sub_num = (ring->ring_size / 4) /
> + (ring->funcs->align_mask + 1);
>   
>   return r;
>   }
> @@ -663,6 +666,21 @@ int amdgpu_gfx_cp_ecc_error_irq(struct amdgpu_device 
> *adev,
>   return 0;
>   }
>   
> +int amdgpu_gfx_kiq_is_avail(struct amdgpu_kiq *kiq) {
> + uint32_t seq = 0;
> + signed long r = 0;
> +
> + seq = abs(kiq->ring.fence_drv.sync_seq - kiq->max_sub_num);
> + if (seq > kiq->max_sub_num) {
> + r = amdgpu_fence_wait_polling(&kiq->ring, seq,
> +   MAX_KIQ_REG_WAIT);
> + return r < 1 ? -ETIME : 0;
> + }
> +
> + return 0;
> +}
> +
>   uint32_t amdgpu_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>   {
>   signed long r, cnt = 0;
> @@ -674,6 +692,12 @@ uint32_t amdgpu_kiq_rreg(struct amdgpu_device *adev, 
> uint32_t reg)
>   BUG_ON(!ring->funcs->emit_rreg);
>   
>   spin_lock_irqsave(&kiq->ring_lock, flags);
> + r = amdgpu
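
To summarize the mechanism being reviewed here: the KIQ ring holds
ring_size / 4 dwords, and every submission is padded to align_mask + 1 dwords,
so max_sub_num bounds the number of submissions that can be in flight; before
allocating ring space, the caller waits until the oldest of those submissions
has signaled. A condensed sketch of that check (simplified from the v2 hunks
above; the real patch also guards the start-up case where fewer than
max_sub_num fences have ever been emitted):

/* Illustrative numbers only: with an assumed 1024-dword ring and a 64-dword
 * submission alignment, at most 1024 / 64 = 16 submissions fit at once. */
static int example_kiq_is_avail(struct amdgpu_kiq *kiq)
{
	uint32_t oldest = kiq->ring.fence_drv.sync_seq - kiq->max_sub_num;
	signed long r;

	/* wait for the oldest possible in-flight submission to complete */
	r = amdgpu_fence_wait_polling(&kiq->ring, oldest, MAX_KIQ_REG_WAIT);
	return r < 1 ? -ETIME : 0;
}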

RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

2020-04-22 Thread Tao, Yintian
Hi Shaoyun


Yes, you are right. It is the rare corner case.


Best Regards
Yintian Tao

-Original Message-
From: Liu, Shaoyun  
Sent: 2020年4月22日 23:51
To: Tao, Yintian ; Koenig, Christian 
; Liu, Monk ; Kuehling, Felix 

Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

[AMD Official Use Only - Internal Distribution Only]

OK, I see: the submission itself is signaled, so the ring space for this 
submission will be re-used by another submission, but the CPU still has not read 
the output value yet. 

Thanks
Shaoyun.liu

-Original Message-
From: Tao, Yintian 
Sent: Wednesday, April 22, 2020 11:47 AM
To: Tao, Yintian ; Liu, Shaoyun ; 
Koenig, Christian ; Liu, Monk ; 
Kuehling, Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Add more 

Especially for the multi-VF environment, we have to wait through msleep() 
instead of udelay(),
because the max wait time is 15 VFs * 6 ms (world switch) = 90 ms.


-Original Message-
From: amd-gfx  On Behalf Of Tao, Yintian
Sent: 2020年4月22日 23:43
To: Liu, Shaoyun ; Koenig, Christian 
; Liu, Monk ; Kuehling, Felix 

Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Hi  Shaoyun



No, the second patch can't solve this rare case because only Slot-D is signaled 
and Slot-A can still be overwritten.
The second patch assumes that once the sequence is signaled, the Slot-A buffer can be freed.

if you store the output value in each ring buffer itself, each kiq operation 
will be atomic and self contained.
[yttao]: If we want to really make the kiq operation atomic then we have to 
do the things below:
spin_lock_irqsave(&kiq->ring_lock, flags); .
Fulfill the command buffer
.
if (r < 1 && (adev->in_gpu_reset || in_interrupt()))
goto failed_kiq_write;

might_sleep();
while (r < 1 && cnt++ < MAX_KIQ_REG_TRY) {

msleep(MAX_KIQ_REG_BAILOUT_INTERVAL); /* here will break atomic 
context and we need to directly use udelay */
r = amdgpu_fence_wait_polling(ring, seq, MAX_KIQ_REG_WAIT);
}
spin_unlock_irqrestore(&kiq->ring_lock, flags);


Best Regards
Yintian Tao
-Original Message-
From: Liu, Shaoyun 
Sent: 2020年4月22日 23:35
To: Tao, Yintian ; Koenig, Christian 
; Liu, Monk ; Kuehling, Felix 

Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

[AMD Official Use Only - Internal Distribution Only]

This is the issue you try to solve  with your second patch (protect kiq 
overrun) . For current  patch , if you store  the output value in each ring 
buffer itself , each kiq operation will be atomic and self contain . 

Shaoyun.liu

-----Original Message-
From: Tao, Yintian 
Sent: Wednesday, April 22, 2020 11:00 AM
To: Koenig, Christian ; Liu, Shaoyun 
; Liu, Monk ; Kuehling, Felix 

Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Hi  Shaoyun


There is one rare corner case which will raise a problem when using the ring 
buffer to store the value.

Assume there are only four slots in total in the KIQ ring buffer.

These four slots are filled with commands to read registers: Slot-A 
Slot-B Slot-C Slot-D.

They are all waiting for their sequence fences to be signaled. Now there is one 
new register-write command to be submitted:

1. Slot-A is still in msleep and has not read its register yet.
2. Slot-B is still in msleep and has not read its register yet.
3. Slot-C is still in msleep and has not read its register yet.
4. Slot-D happens to find its sequence signaled, and here the new write command 
will overwrite the Slot-A contents.


Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian 
Sent: 2020年4月22日 22:52
To: Liu, Shaoyun ; Tao, Yintian ; 
Liu, Monk ; Kuehling, Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Hi Shaoyun,

the ring buffer is usually filled with command and not read results.

Allocating extra space would only work if we use the special NOP command and 
that is way more complicated and fragile than just using the wb functions which 
where made for this stuff.

Regards,
Christian.

Am 22.04.20 um 16:48 schrieb Liu, Shaoyun:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi ,Yintian & Christian
> I still don't understand why we need this complicated  change here . Why can 
> not just allocate few more extra space in the ring for each read  and use the 
> space to store the output value  ?
>
> Regards
> Shaoyun.liu
> 
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Christian König
> Sent: Wednesday, April 22, 2020 8:42 AM
> To: Tao, Yintian ; Liu, Monk ; 
> Kuehling, Felix 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: request reg_val_offs each kiq

RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

2020-04-22 Thread Tao, Yintian
Add more 

Especially for the multi-VF environment, we have to wait through msleep() 
instead of udelay(),
because the max wait time is 15 VFs * 6 ms (world switch) = 90 ms.


-Original Message-
From: amd-gfx  On Behalf Of Tao, Yintian
Sent: 2020年4月22日 23:43
To: Liu, Shaoyun ; Koenig, Christian 
; Liu, Monk ; Kuehling, Felix 

Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Hi  Shaoyun



No, the second patch can't solve this rare case because only Slot-D is signaled 
and Slot-A can still be overwritten.
The second patch assumes that once the sequence is signaled, the Slot-A buffer can be freed.

if you store the output value in each ring buffer itself, each kiq operation 
will be atomic and self contained.
[yttao]: If we want to really make the kiq operation atomic then we have to 
do the things below:
spin_lock_irqsave(&kiq->ring_lock, flags); .
Fulfill the command buffer
.
if (r < 1 && (adev->in_gpu_reset || in_interrupt()))
goto failed_kiq_write;

might_sleep();
while (r < 1 && cnt++ < MAX_KIQ_REG_TRY) {

msleep(MAX_KIQ_REG_BAILOUT_INTERVAL); /* here will break atomic 
context and we need to directly use udelay */
r = amdgpu_fence_wait_polling(ring, seq, MAX_KIQ_REG_WAIT);
}
spin_unlock_irqrestore(&kiq->ring_lock, flags);


Best Regards
Yintian Tao
-Original Message-
From: Liu, Shaoyun 
Sent: 2020年4月22日 23:35
To: Tao, Yintian ; Koenig, Christian 
; Liu, Monk ; Kuehling, Felix 

Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

[AMD Official Use Only - Internal Distribution Only]

This is the issue you try to solve  with your second patch (protect kiq 
overrun) . For current  patch , if you store  the output value in each ring 
buffer itself , each kiq operation will be atomic and self contain . 

Shaoyun.liu

-----Original Message-
From: Tao, Yintian 
Sent: Wednesday, April 22, 2020 11:00 AM
To: Koenig, Christian ; Liu, Shaoyun 
; Liu, Monk ; Kuehling, Felix 

Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Hi  Shaoyun


There is one rare corner case which will raise a problem when using the ring 
buffer to store the value.

Assume there are only four slots in total in the KIQ ring buffer.

These four slots are filled with commands to read registers: Slot-A 
Slot-B Slot-C Slot-D.

They are all waiting for their sequence fences to be signaled. Now there is one 
new register-write command to be submitted:

1. Slot-A is still in msleep and has not read its register yet.
2. Slot-B is still in msleep and has not read its register yet.
3. Slot-C is still in msleep and has not read its register yet.
4. Slot-D happens to find its sequence signaled, and here the new write command 
will overwrite the Slot-A contents.


Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian 
Sent: 2020年4月22日 22:52
To: Liu, Shaoyun ; Tao, Yintian ; 
Liu, Monk ; Kuehling, Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Hi Shaoyun,

the ring buffer is usually filled with command and not read results.

Allocating extra space would only work if we use the special NOP command and 
that is way more complicated and fragile than just using the wb functions which 
where made for this stuff.

Regards,
Christian.

Am 22.04.20 um 16:48 schrieb Liu, Shaoyun:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi ,Yintian & Christian
> I still don't understand why we need this complicated  change here . Why can 
> not just allocate few more extra space in the ring for each read  and use the 
> space to store the output value  ?
>
> Regards
> Shaoyun.liu
> 
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Christian König
> Sent: Wednesday, April 22, 2020 8:42 AM
> To: Tao, Yintian ; Liu, Monk ; 
> Kuehling, Felix 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: request reg_val_offs each kiq read 
> reg
>
> Am 22.04.20 um 14:36 schrieb Yintian Tao:
>> According to the current kiq read register method, there will be race 
>> condition when using KIQ to read register if multiple clients want to 
>> read at same time just like the example below:
>> 1. client-A start to read REG-0 through KIQ 2. client-A poll the
>> seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll 
>> the seqno-1 5. the kiq complete these two read operation 6. client-A 
>> to read the register at the wb buffer and
>>  get REG-1 value
>>
>> Therefore, use amdgpu_device_wb_get() to request reg_val_offs for 
>> each kiq read register.
>>
>> v2: fix the error remove
>>
>> Signed-off-by: Yintian Tao 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu.h   

RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

2020-04-22 Thread Tao, Yintian
Hi  Shaoyun



No, the second patch can't solve this rare case because only Slot-D is signaled 
and Slot-A can still be overwritten.
The second patch assumes that once the sequence is signaled, the Slot-A buffer can be freed.

if you store the output value in each ring buffer itself, each kiq operation 
will be atomic and self contained.
[yttao]: If we want to really make the kiq operation atomic then we have to 
do the things below:
spin_lock_irqsave(&kiq->ring_lock, flags);
.
Fulfill the command buffer
.
if (r < 1 && (adev->in_gpu_reset || in_interrupt()))
goto failed_kiq_write;

might_sleep();
while (r < 1 && cnt++ < MAX_KIQ_REG_TRY) {

msleep(MAX_KIQ_REG_BAILOUT_INTERVAL); /* here will break atomic 
context and we need to directly use udelay */
r = amdgpu_fence_wait_polling(ring, seq, MAX_KIQ_REG_WAIT);
}
spin_unlock_irqrestore(&kiq->ring_lock, flags);


Best Regards
Yintian Tao 
-Original Message-
From: Liu, Shaoyun  
Sent: 2020年4月22日 23:35
To: Tao, Yintian ; Koenig, Christian 
; Liu, Monk ; Kuehling, Felix 

Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

[AMD Official Use Only - Internal Distribution Only]

This is the issue you try to solve  with your second patch (protect kiq 
overrun) . For current  patch , if you store  the output value in each ring 
buffer itself , each kiq operation will be atomic and self contain . 

Shaoyun.liu

-----Original Message-
From: Tao, Yintian 
Sent: Wednesday, April 22, 2020 11:00 AM
To: Koenig, Christian ; Liu, Shaoyun 
; Liu, Monk ; Kuehling, Felix 

Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Hi  Shaoyun


There is one rare corner case which will raise a problem when using the ring 
buffer to store the value.

Assume there are only four slots in total in the KIQ ring buffer.

These four slots are filled with commands to read registers: Slot-A 
Slot-B Slot-C Slot-D.

They are all waiting for their sequence fences to be signaled. Now there is one 
new register-write command to be submitted:

1. Slot-A is still in msleep and has not read its register yet.
2. Slot-B is still in msleep and has not read its register yet.
3. Slot-C is still in msleep and has not read its register yet.
4. Slot-D happens to find its sequence signaled, and here the new write command 
will overwrite the Slot-A contents.


Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian 
Sent: 2020年4月22日 22:52
To: Liu, Shaoyun ; Tao, Yintian ; 
Liu, Monk ; Kuehling, Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Hi Shaoyun,

the ring buffer is usually filled with command and not read results.

Allocating extra space would only work if we use the special NOP command and 
that is way more complicated and fragile than just using the wb functions which 
where made for this stuff.

Regards,
Christian.

Am 22.04.20 um 16:48 schrieb Liu, Shaoyun:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi ,Yintian & Christian
> I still don't understand why we need this complicated  change here . Why can 
> not just allocate few more extra space in the ring for each read  and use the 
> space to store the output value  ?
>
> Regards
> Shaoyun.liu
> 
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Christian König
> Sent: Wednesday, April 22, 2020 8:42 AM
> To: Tao, Yintian ; Liu, Monk ; 
> Kuehling, Felix 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: request reg_val_offs each kiq read 
> reg
>
> Am 22.04.20 um 14:36 schrieb Yintian Tao:
>> According to the current kiq read register method, there will be race 
>> condition when using KIQ to read register if multiple clients want to 
>> read at same time just like the example below:
>> 1. client-A start to read REG-0 through KIQ 2. client-A poll the
>> seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll 
>> the seqno-1 5. the kiq complete these two read operation 6. client-A 
>> to read the register at the wb buffer and
>>  get REG-1 value
>>
>> Therefore, use amdgpu_device_wb_get() to request reg_val_offs for 
>> each kiq read register.
>>
>> v2: fix the error remove
>>
>> Signed-off-by: Yintian Tao 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  2 +-
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  | 19 ++---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |  1 -
>>drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  5 +++--
>>drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   |  7 +++---
>>drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c|  7 +++---
>>drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 27 
>>

RE: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

2020-04-22 Thread Tao, Yintian
Hi  Shaoyun


There is one rare corner case which will raise a problem when using the ring 
buffer to store the value.

Assume there are only four slots in total in the KIQ ring buffer.

These four slots are filled with commands to read registers: Slot-A 
Slot-B Slot-C Slot-D.

They are all waiting for their sequence fences to be signaled. Now there is one 
new register-write command to be submitted:

1. Slot-A is still in msleep and has not read its register yet.
2. Slot-B is still in msleep and has not read its register yet.
3. Slot-C is still in msleep and has not read its register yet.
4. Slot-D happens to find its sequence signaled, and here the new write command 
will overwrite the Slot-A contents.


Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian  
Sent: 2020年4月22日 22:52
To: Liu, Shaoyun ; Tao, Yintian ; 
Liu, Monk ; Kuehling, Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: request reg_val_offs each kiq read reg

Hi Shaoyun,

the ring buffer is usually filled with command and not read results.

Allocating extra space would only work if we use the special NOP command and 
that is way more complicated and fragile than just using the wb functions which 
where made for this stuff.

Regards,
Christian.

Am 22.04.20 um 16:48 schrieb Liu, Shaoyun:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi ,Yintian & Christian
> I still don't understand why we need this complicated  change here . Why can 
> not just allocate few more extra space in the ring for each read  and use the 
> space to store the output value  ?
>
> Regards
> Shaoyun.liu
> 
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Christian König
> Sent: Wednesday, April 22, 2020 8:42 AM
> To: Tao, Yintian ; Liu, Monk ; 
> Kuehling, Felix 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: request reg_val_offs each kiq read 
> reg
>
> Am 22.04.20 um 14:36 schrieb Yintian Tao:
>> According to the current kiq read register method, there will be race 
>> condition when using KIQ to read register if multiple clients want to 
>> read at same time just like the example below:
>> 1. client-A start to read REG-0 through KIQ 2. client-A poll the
>> seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll 
>> the seqno-1 5. the kiq complete these two read operation 6. client-A 
>> to read the register at the wb buffer and
>>  get REG-1 value
>>
>> Therefore, use amdgpu_device_wb_get() to request reg_val_offs for 
>> each kiq read register.
>>
>> v2: fix the error remove
>>
>> Signed-off-by: Yintian Tao 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  2 +-
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  | 19 ++---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |  1 -
>>drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  5 +++--
>>drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   |  7 +++---
>>drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c|  7 +++---
>>drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 27 
>>7 files changed, 41 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 4e1d4cfe7a9f..7ee5a4da398a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -526,7 +526,7 @@ static inline void amdgpu_set_ib_value(struct 
>> amdgpu_cs_parser *p,
>>/*
>> * Writeback
>> */
>> -#define AMDGPU_MAX_WB 128   /* Reserve at most 128 WB slots for 
>> amdgpu-owned rings. */
>> +#define AMDGPU_MAX_WB 256   /* Reserve at most 256 WB slots for 
>> amdgpu-owned rings. */
>>
>>struct amdgpu_wb {
>>  struct amdgpu_bo*wb_obj;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index ea576b4260a4..d5a59d7c48d6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -304,10 +304,6 @@ int amdgpu_gfx_kiq_init_ring(struct 
>> amdgpu_device *adev,
>>
>>  spin_lock_init(>ring_lock);
>>
>> -r = amdgpu_device_wb_get(adev, &kiq->reg_val_offs);
>> -if (r)
>> -return r;
>> -
>>  ring->adev = NULL;
>>  ring->ring_obj = NULL;
>>  ring->use_doorbell = true;
>> @@ -331,7 +327,6 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device 
>> *adev,
>>
>>void amdgpu_gfx_kiq_free_ring(struct amdgpu_ring *ring)
>>{
>> -amdgpu_device_wb_free(ring->adev, ring->adev->gfx.kiq.reg_val_offs);
>>  amdgpu_rin
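
Pulling the thread together, here is a trimmed sketch of the read path after
the per-read writeback change (condensed from the diffs quoted in this and the
"protect ring overrun" thread; the msleep retry loop and comments are omitted):
each read grabs its own wb slot, so two concurrent readers can no longer hand
each other the wrong value, and the error paths undo the ring allocation and
free the slot.

static uint32_t example_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
{
	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
	struct amdgpu_ring *ring = &kiq->ring;
	uint32_t seq, reg_val_offs = 0, value = ~0;
	unsigned long flags;

	spin_lock_irqsave(&kiq->ring_lock, flags);
	if (amdgpu_device_wb_get(adev, &reg_val_offs))	/* one wb slot per read */
		goto failed_unlock;
	amdgpu_ring_alloc(ring, 32);
	amdgpu_ring_emit_rreg(ring, reg, reg_val_offs);
	if (amdgpu_fence_emit_polling(ring, &seq, MAX_KIQ_REG_WAIT))
		goto failed_undo;
	amdgpu_ring_commit(ring);
	spin_unlock_irqrestore(&kiq->ring_lock, flags);

	if (amdgpu_fence_wait_polling(ring, seq, MAX_KIQ_REG_WAIT) >= 1)
		value = adev->wb.wb[reg_val_offs];	/* result written back here */
	amdgpu_device_wb_free(adev, reg_val_offs);
	return value;

failed_undo:
	amdgpu_ring_undo(ring);
failed_unlock:
	spin_unlock_irqrestore(&kiq->ring_lock, flags);
	if (reg_val_offs)
		amdgpu_device_wb_free(adev, reg_val_offs);
	return ~0;
}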

RE: [PATCH] drm/amdgpu: refine kiq access register

2020-04-22 Thread Tao, Yintian
Hi  Christian

Thanks, I got it. I will send another patch for the KIQ overrun problem

Best Regards
Yintian Tao
-Original Message-
From: Koenig, Christian  
Sent: 2020年4月22日 20:33
To: Tao, Yintian ; Liu, Monk ; Kuehling, 
Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq access register

Am 22.04.20 um 14:20 schrieb Tao, Yintian:
> Hi  Christian
>
>
> Please see inline commetns.
> -Original Message-
> From: Koenig, Christian 
> Sent: 2020年4月22日 19:57
> To: Tao, Yintian ; Liu, Monk ; 
> Kuehling, Felix 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>
> Am 22.04.20 um 13:49 schrieb Tao, Yintian:
>> Hi  Christian
>>
>>
>> Can you help answer the questions below? Thanks in advance.
>> -Original Message-----
>> From: Koenig, Christian 
>> Sent: 2020年4月22日 19:03
>> To: Tao, Yintian ; Liu, Monk ; 
>> Kuehling, Felix 
>> Cc: amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>>
>> Am 22.04.20 um 11:29 schrieb Yintian Tao:
>>> According to the current kiq access register method, there will be 
>>> race condition when using KIQ to read register if multiple clients 
>>> want to read at same time just like the example below:
>>> 1. client-A start to read REG-0 through KIQ 2. client-A poll the
>>> seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll 
>>> the seqno-1 5. the kiq complete these two read operation 6. client-A 
>>> to read the register at the wb buffer and
>>>   get REG-1 value
>>>
>>> And if there are multiple clients to frequently write registers 
>>> through KIQ which may raise the KIQ ring buffer overwritten problem.
>>>
>>> Therefore, allocate fixed number wb slot for rreg use and limit the 
>>> submit number which depends on the kiq ring_size in order to prevent 
>>> the overwritten problem.
>>>
>>> v2: directly use amdgpu_device_wb_get() for each read instead
>>>of to reserve fixde number slot.
>>>if there is no enough kiq ring buffer or rreg slot then
>>>directly print error log and return instead of busy waiting
>> I would split that into three patches. One for each problem we have here:
>>
>> 1. Fix kgd_hiq_mqd_load() and maybe other occasions to use 
>> spin_lock_irqsave().
>> [yttao]: Do you mean that we need to use spin_lock_irqsave for the functions 
>> just like kgd_hiq_mqd_load()?
> Yes, I strongly think so.
>
> See when you have one spin lock you either need always need to lock it with 
> irqs disabled or never.
>
> In other words we always need to either use spin_lock() or 
> spin_lock_irqsave(), but never mix them with the same lock.
>
> The only exception to this rule is when you take multiple locks, e.g.
> you can do:
>
> spin_lock_irqsave(&a, flags);
> spin_lock(&b);
> spin_lock(&c);
> ...
> spin_unlock_irqrestore(&a, flags);
>
> Here you don't need to use spin_lock_irqsave for b and c. But we rarely have 
> that case in the code.
> [yttao]: thanks , I got it. I will submit another patch for it.
>
>> 2. Prevent the overrung of the KIQ. Please drop the approach with the 
>> atomic here. Instead just add a amdgpu_fence_wait_polling() into
>> amdgpu_fence_emit_polling() as I discussed with Monk.
>> [yttao]: Sorry, I can't get your original idea for the 
>> amdgpu_fence_wait_polling(). Can you give more details about it? Thanks in 
>> advance.
>>
>> "That is actually only a problem because the KIQ uses polling waits.
>>
>> See amdgpu_fence_emit() waits for the oldest possible fence to be signaled 
>> before emitting a new one.
>>
>> I suggest that we do the same in amdgpu_fence_emit_polling(). A one liner 
>> like the following should be enough:
>>
>> amdgpu_fence_wait_polling(ring, seq - ring->fence_drv.num_fences_mask, 
>> timeout);"
>> [yttao]: there is no usage of num_fences_mask at kiq fence polling, the 
>> num_fences_mask is only effective at dma_fence architecture.
>> If I understand correctly, do you want the prototype code below? 
>> If the prototype code is wrong, can you help give one sample? Thanks in 
>> advance.
>>
>> int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s) {
>>   uint32_t seq;
>>
>>   if (!s)
>>   return -EINVAL;
>> +amdgpu_fence_wait_polling(ring, seq, timeout);
>>   seq = ++ring->fence_drv.sync_

RE: [PATCH] drm/amdgpu: refine kiq access register

2020-04-22 Thread Tao, Yintian
Hi  Christian


Please see inline comments.
-Original Message-
From: Koenig, Christian  
Sent: 2020年4月22日 19:57
To: Tao, Yintian ; Liu, Monk ; Kuehling, 
Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq access register

Am 22.04.20 um 13:49 schrieb Tao, Yintian:
> Hi  Christian
>
>
> Can you help answer the questions below? Thanks in advance.
> -Original Message-
> From: Koenig, Christian 
> Sent: 2020年4月22日 19:03
> To: Tao, Yintian ; Liu, Monk ; 
> Kuehling, Felix 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>
> Am 22.04.20 um 11:29 schrieb Yintian Tao:
>> According to the current kiq access register method, there will be 
>> race condition when using KIQ to read register if multiple clients 
>> want to read at same time just like the example below:
>> 1. client-A start to read REG-0 through KIQ 2. client-A poll 
>> seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll 
>> the seqno-1 5. the kiq complete these two read operation 6. client-A 
>> to read the register at the wb buffer and
>>  get REG-1 value
>>
>> And if there are multiple clients to frequently write registers 
>> through KIQ which may raise the KIQ ring buffer overwritten problem.
>>
>> Therefore, allocate fixed number wb slot for rreg use and limit the 
>> submit number which depends on the kiq ring_size in order to prevent 
>> the overwritten problem.
>>
>> v2: directly use amdgpu_device_wb_get() for each read instead
>>   of to reserve fixde number slot.
>>   if there is no enough kiq ring buffer or rreg slot then
>>   directly print error log and return instead of busy waiting
> I would split that into three patches. One for each problem we have here:
>
> 1. Fix kgd_hiq_mqd_load() and maybe other occasions to use 
> spin_lock_irqsave().
> [yttao]: Do you mean that we need to use spin_lock_irqsave for the functions 
> just like kgd_hiq_mqd_load()?

Yes, I strongly think so.

See when you have one spin lock you either need always need to lock it with 
irqs disabled or never.

In other words we always need to either use spin_lock() or spin_lock_irqsave(), 
but never mix them with the same lock.

The only exception to this rule is when you take multiple locks, e.g. 
you can do:

spin_lock_irqsave(&a, flags);
spin_lock(&b);
spin_lock(&c);
...
spin_unlock_irqrestore(&a, flags);

Here you don't need to use spin_lock_irqsave for b and c. But we rarely have 
that case in the code.
[yttao]: thanks , I got it. I will submit another patch for it.

> 2. Prevent the overrung of the KIQ. Please drop the approach with the 
> atomic here. Instead just add a amdgpu_fence_wait_polling() into
> amdgpu_fence_emit_polling() as I discussed with Monk.
> [yttao]: Sorry, I can't get your original idea for the 
> amdgpu_fence_wait_polling(). Can you give more details about it? Thanks in 
> advance.
>
> "That is actually only a problem because the KIQ uses polling waits.
>
> See amdgpu_fence_emit() waits for the oldest possible fence to be signaled 
> before emitting a new one.
>
> I suggest that we do the same in amdgpu_fence_emit_polling(). A one liner 
> like the following should be enough:
>
> amdgpu_fence_wait_polling(ring, seq - ring->fence_drv.num_fences_mask, 
> timeout);"
> [yttao]: there is no usage of num_fences_mask at kiq fence polling, the 
> num_fences_mask is only effective at dma_fence architecture.
> If I understand correctly, do you want the prototype code below? 
> If the prototype code is wrong, can you help give one sample? Thanks in advance.
>
> int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s) {
>  uint32_t seq;
>
>  if (!s)
>  return -EINVAL;
> + amdgpu_fence_wait_polling(ring, seq, timeout);
>  seq = ++ring->fence_drv.sync_seq;

Your understanding sounds more or less correct. The code should look something 
like this:

seq = ++ring->fence_drv.sync_seq;
amdgpu_fence_wait_polling(ring, seq -
number_of_allowed_submissions_to_the_kiq, timeout);
[yttao]: do we need to wait up front, just like below? Otherwise, 
amdgpu_ring_emit_wreg may overwrite the KIQ ring buffer.
+   amdgpu_fence_wait_polling(ring, seq - 
number_of_allowed_submissions_to_the_kiq, timeout);
spin_lock_irqsave(&kiq->ring_lock, flags);
amdgpu_ring_alloc(ring, 32);
amdgpu_ring_emit_wreg(ring, reg, v);
amdgpu_fence_emit_polling(ring, &seq); /* wait */
amdgpu_ring_commit(ring);
spin_unlock_irqrestore(&kiq->ring_lock, flags);

I just used num_fences_mask as number_of_allowed_submissions_to_the_ki
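
The locking rule Christian describes above can be restated with a small,
generic kernel sketch (not amdgpu-specific code): a given spinlock must either
always be taken with interrupts disabled or never, because an interrupt handler
that contends for the same lock could otherwise deadlock against the holder it
preempted; only additional locks nested inside the outer irqsave section may
use the plain variants.

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(lock_a);	/* also taken from interrupt context */
static DEFINE_SPINLOCK(lock_b);	/* only ever taken while holding lock_a */

static void example_consistent_locking(void)
{
	unsigned long flags;

	/* lock_a is shared with IRQ context, so IRQs must be disabled here;
	 * never mix spin_lock() and spin_lock_irqsave() on the same lock. */
	spin_lock_irqsave(&lock_a, flags);

	/* nested lock: interrupts are already off, plain spin_lock() is fine */
	spin_lock(&lock_b);

	/* ... critical section ... */

	spin_unlock(&lock_b);
	spin_unlock_irqrestore(&lock_a, flags);
}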

RE: [PATCH] drm/amdgpu: refine kiq access register

2020-04-22 Thread Tao, Yintian
Hi  Christian


Can you help answer the questions below? Thanks in advance.
-Original Message-
From: Koenig, Christian  
Sent: 2020年4月22日 19:03
To: Tao, Yintian ; Liu, Monk ; Kuehling, 
Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq access register

Am 22.04.20 um 11:29 schrieb Yintian Tao:
> According to the current kiq access register method, there will be 
> race condition when using KIQ to read register if multiple clients 
> want to read at same time just like the example below:
> 1. client-A start to read REG-0 through KIQ 2. client-A poll the 
> seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll 
> the seqno-1 5. the kiq complete these two read operation 6. client-A 
> to read the register at the wb buffer and
> get REG-1 value
>
> And if there are multiple clients to frequently write registers 
> through KIQ which may raise the KIQ ring buffer overwritten problem.
>
> Therefore, allocate fixed number wb slot for rreg use and limit the 
> submit number which depends on the kiq ring_size in order to prevent 
> the overwritten problem.
>
> v2: directly use amdgpu_device_wb_get() for each read instead
>  of to reserve fixde number slot.
>  if there is no enough kiq ring buffer or rreg slot then
>  directly print error log and return instead of busy waiting

I would split that into three patches. One for each problem we have here:

1. Fix kgd_hiq_mqd_load() and maybe other occasions to use spin_lock_irqsave().
[yttao]: Do you mean that we need to use spin_lock_irqsave for the functions 
just like kgd_hiq_mqd_load()?

2. Prevent the overrung of the KIQ. Please drop the approach with the atomic 
here. Instead just add a amdgpu_fence_wait_polling() into
amdgpu_fence_emit_polling() as I discussed with Monk.
[yttao]: Sorry, I can't get your original idea for the 
amdgpu_fence_wait_polling(). Can you give more details about it? Thanks in 
advance.

"That is actually only a problem because the KIQ uses polling waits.

See amdgpu_fence_emit() waits for the oldest possible fence to be signaled 
before emitting a new one.

I suggest that we do the same in amdgpu_fence_emit_polling(). A one liner like 
the following should be enough:

amdgpu_fence_wait_polling(ring, seq - ring->fence_drv.num_fences_mask, 
timeout);"
[yttao]: there is no usage of num_fences_mask in kiq fence polling; the 
num_fences_mask is only effective in the dma_fence architecture.
If I understand correctly, do you want the prototype code below? 
If the prototype code is wrong, can you help give one sample? Thanks in advance.

int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s) 
{
uint32_t seq;

if (!s)
return -EINVAL;
+   amdgpu_fence_wait_polling(ring, seq, timeout); 
seq = ++ring->fence_drv.sync_seq;
amdgpu_ring_emit_fence(ring, ring->fence_drv.gpu_addr,
                       seq, 0); 

*s = seq;

return 0;
}




3. Use amdgpu_device_wb_get() each time we need to submit a read.
[yttao]: yes, I will do it.

Regards,
Christian.

>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  8 +-
>   .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 13 ++-
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 13 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   | 83 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |  3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |  5 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  | 13 ++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|  8 +-
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |  8 +-
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 35 +---
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c| 13 ++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 13 ++-
>   12 files changed, 167 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 4e1d4cfe7a9f..1157c1a0b888 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -526,7 +526,7 @@ static inline void amdgpu_set_ib_value(struct 
> amdgpu_cs_parser *p,
>   /*
>* Writeback
>*/
> -#define AMDGPU_MAX_WB 128/* Reserve at most 128 WB slots for 
> amdgpu-owned rings. */
> +#define AMDGPU_MAX_WB 256/* Reserve at most 256 WB slots for 
> amdgpu-owned rings. */
>   
>   struct amdgpu_wb {
>   struct amdgpu_bo*wb_obj;
> @@ -1028,6 +1028,12 @@ bool amdgpu_device_has_dc_support(struct 
> amdgpu_device *adev);
>   
>   int emu_soc_asic_init(struct amdgpu_device *adev);
>   
> +int amdgpu_gfx_kiq_lock(struct amdgpu_kiq *kiq, bool read,
> +   

RE: [PATCH] drm/amdgpu: refine kiq access register

2020-04-22 Thread Tao, Yintian
Hi  Christian

Please see inline comments.

-Original Message-
From: Koenig, Christian  
Sent: 2020年4月22日 16:23
To: Tao, Yintian ; Liu, Monk ; Liu, 
Shaoyun ; Kuehling, Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq access register

Am 22.04.20 um 10:06 schrieb Tao, Yintian:
> Hi  Christian
>
> Please see inline comments
>
> -Original Message-
> From: Koenig, Christian 
> Sent: 2020年4月22日 15:54
> To: Tao, Yintian ; Liu, Monk ; 
> Liu, Shaoyun ; Kuehling, Felix 
> 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>
> Am 22.04.20 um 09:49 schrieb Tao, Yintian:
>> Hi Christian
>>
>>
>> Please see inline comments.
>> -Original Message-
>> From: Christian König 
>> Sent: 2020年4月22日 15:40
>> To: Tao, Yintian ; Koenig, Christian 
>> ; Liu, Monk ; Liu, 
>> Shaoyun ; Kuehling, Felix 
>> 
>> Cc: amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>>
>> Am 22.04.20 um 09:35 schrieb Tao, Yintian:
>>> Hi  Christian
>>>
>>>
>>>> BUG_ON(in_interrupt());
>>> That won't work like this. The KIQ is also used in interrupt context in the 
>>> driver, that's why we used spin_lock_irqsave().
>>> [yttao]: According to the current drm-next code, I have not find where to 
>>> access register through KIQ.
>>> And you need to wait for the free kiq ring buffer space if 
>>> there is no free kiq ring buffer, here, you wait at interrupt context is 
>>> illegal.
>> Waiting in atomic context is illegal as well, but we don't have much other 
>> choice.
>> [yttao]: no, there is no sleep in atomic context at my patch.
> I'm not talking about a sleeping, but busy waiting.
>
>> We just need to make sure that waiting never happens by making the buffers 
>> large enough and if it still happens print and error.
>> [yttao]: this is not the good choice because KMD need to protect it instead 
>> of hoping user not frequently invoke KIQ acess.
> The only other choice we have is busy waiting, e.g. loop until we get a free 
> slot.
> [yttao]: Yes, now my patch uses msleep() for busy waiting. Or do you mean we need 
> to use udelay()? If we use udelay(), it will be a nightmare under multi-VF.
>   Because it is assumed that there are 16VF within world-switch 
> 6ms, the bad situation is that one VF will udelay(16*6ms = 96ms) to get one 
> free slot.

You can't use msleep() here since sleeping in atomic or interrupt context is 
forbidden.

The trick is that in atomic context the CPU can't switch to a different 
process, so we have a very limited number of concurrent KIQ reads which can 
happen.

With a MAX_WB of 256 we can easily have 128 CPUs and don't run into problems.
[yttao]: fine, this is a good idea. But it seems that in the current drm-next code, KIQ 
access still uses msleep() to wait for the fence, which is not correct according to 
your comments.
I think we need to submit another patch to add one more condition, 
"in_atomic()", to prevent it, but this function cannot know about held spinlocks 
in non-preemptible kernels.

Regards,
Christian.

>
>
> Regards,
> Christian.
>
>>> And I would either say that we should use the trick with the NOP to reserve 
>>> space on the ring buffer or call amdgpu_device_wb_get() for each read. 
>>> amdgpu_device_wb_get() also uses find_first_zero_bit() and should work 
>>> equally well.
>>> [yttao]: sorry, can you give me more details about how to use NOP to 
>>> reserve space? I will use amdgpu_device_wb_get() for the read operation.
>> We could use the NOP PM4 command as Felix suggested, this command has 
>> a
>> header+length and says that the next X dw should be ignore on the 
>> ring
>> buffer.
>>
>> But I think using amdgpu_device_wb_get() is better anyway.
>> [yttao]: yes, I agreed with amdgpu_device_wb_get() method because it 
>> will fix prevent potential read race condition but NOP method will 
>> not prevent it
>>
>> Regards,
>> Christian.
>>
>>>
>>> -Original Message-
>>> From: Koenig, Christian 
>>> Sent: 2020年4月22日 15:23
>>> To: Tao, Yintian ; Liu, Monk 
>>> ; Liu, Shaoyun ; Kuehling, 
>>> Felix 
>>> Cc: amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>>>
>>>> BUG_ON(in_interrupt());
>>> That won't work like this. The KIQ is also used in interrupt context in the 
>>> driver, that's why we u
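
The arithmetic behind the msleep-versus-udelay concern in this thread, written
out with the figures quoted above (used purely for illustration): while the
other VFs own the GPU, this VF's KIQ makes no progress, so a busy-wait would
have to spin for every other VF's world-switch slice in the worst case.

/* Illustration only, using the numbers quoted in this thread. */
#define EXAMPLE_OTHER_VFS		15	/* other VFs sharing the GPU   */
#define EXAMPLE_WORLD_SWITCH_MS		6	/* per-VF world-switch slice   */
#define EXAMPLE_WORST_WAIT_MS	(EXAMPLE_OTHER_VFS * EXAMPLE_WORLD_SWITCH_MS)
					/* = 90 ms, far too long to udelay() */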

RE: [PATCH] drm/amdgpu: refine kiq access register

2020-04-22 Thread Tao, Yintian
Hi  Christian

Please see inline comments

-Original Message-
From: Koenig, Christian  
Sent: 2020年4月22日 15:54
To: Tao, Yintian ; Liu, Monk ; Liu, 
Shaoyun ; Kuehling, Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq access register

Am 22.04.20 um 09:49 schrieb Tao, Yintian:
> Hi Christian
>
>
> Please see inline comments.
> -Original Message-
> From: Christian König 
> Sent: 2020年4月22日 15:40
> To: Tao, Yintian ; Koenig, Christian 
> ; Liu, Monk ; Liu, Shaoyun 
> ; Kuehling, Felix 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>
> Am 22.04.20 um 09:35 schrieb Tao, Yintian:
>> Hi  Christian
>>
>>
>>> BUG_ON(in_interrupt());
>> That won't work like this. The KIQ is also used in interrupt context in the 
>> driver, that's why we used spin_lock_irqsave().
>> [yttao]: According to the current drm-next code, I have not find where to 
>> access register through KIQ.
>>  And you need to wait for the free kiq ring buffer space if 
>> there is no free kiq ring buffer, here, you wait at interrupt context is 
>> illegal.
> Waiting in atomic context is illegal as well, but we don't have much other 
> choice.
> [yttao]: no, there is no sleep in atomic context at my patch.

I'm not talking about a sleeping, but busy waiting.

> We just need to make sure that waiting never happens by making the buffers 
> large enough and if it still happens print and error.
> [yttao]: this is not the good choice because KMD need to protect it instead 
> of hoping user not frequently invoke KIQ acess.

The only other choice we have is busy waiting, e.g. loop until we get a free 
slot.
[yttao]: Yes, now my patch uses msleep() for busy waiting. Or do you mean we need to 
use udelay()? If we use udelay(), it will be a nightmare under multi-VF.
Because it is assumed that there are 16VF within world-switch 
6ms, the bad situation is that one VF will udelay(16*6ms = 96ms) to get one 
free slot.


Regards,
Christian.

>
>> And I would either say that we should use the trick with the NOP to reserve 
>> space on the ring buffer or call amdgpu_device_wb_get() for each read. 
>> amdgpu_device_wb_get() also uses find_first_zero_bit() and should work 
>> equally well.
>> [yttao]: sorry, can you give me more details about how to use NOP to reserve 
>> space? I will use amdgpu_device_wb_get() for the read operation.
> We could use the NOP PM4 command as Felix suggested, this command has 
> a
> header+length and says that the next X dw should be ignore on the ring
> buffer.
>
> But I think using amdgpu_device_wb_get() is better anyway.
> [yttao]: yes, I agreed with amdgpu_device_wb_get() method because it 
> will fix prevent potential read race condition but NOP method will not 
> prevent it
>
> Regards,
> Christian.
>
>>
>>
>> -Original Message-
>> From: Koenig, Christian 
>> Sent: 2020年4月22日 15:23
>> To: Tao, Yintian ; Liu, Monk ; 
>> Liu, Shaoyun ; Kuehling, Felix 
>> 
>> Cc: amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>>
>>> BUG_ON(in_interrupt());
>> That won't work like this. The KIQ is also used in interrupt context in the 
>> driver, that's why we used spin_lock_irqsave().
>>
>> And I would either say that we should use the trick with the NOP to reserve 
>> space on the ring buffer or call amdgpu_device_wb_get() for each read. 
>> amdgpu_device_wb_get() also uses find_first_zero_bit() and should work 
>> equally well.
>>
>> You also don't need to worry to much about overflowing the wb area.
>> Since we run in an atomic context we can have at most the number of CPU in 
>> the system + interrupt context here.
>>
>> Regards,
>> Christian.
>>
>> Am 22.04.20 um 09:11 schrieb Tao, Yintian:
>>> Add Felix and Shaoyun
>>>
>>> -Original Message-
>>> From: Yintian Tao 
>>> Sent: 2020年4月22日 12:42
>>> To: Koenig, Christian ; Liu, Monk 
>>> 
>>> Cc: amd-gfx@lists.freedesktop.org; Tao, Yintian 
>>> 
>>> Subject: [PATCH] drm/amdgpu: refine kiq access register
>>>
>>> According to the current kiq access register method, there will be race 
>>> condition when using KIQ to read register if multiple clients want to read 
>>> at same time just like the expample below:
>>> 1. client-A start to read REG-0 throguh KIQ 2. client-A poll the seqno-0 3. 
>>> client-B start to read REG-1 through KIQ 4. client-B poll the seqno-1 5. 
>>> the kiq complete these
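
For reference, the poll-and-retry pattern this discussion keeps returning to,
condensed into a sketch from the prototype quoted earlier in the thread (not a
verbatim copy of the driver; the example_* wrapper is hypothetical): poll once,
bail out immediately where sleeping is forbidden (GPU reset or interrupt
context), and otherwise retry with msleep() between polls.

static int example_kiq_wait(struct amdgpu_device *adev,
			    struct amdgpu_ring *ring, uint32_t seq)
{
	uint32_t cnt = 0;
	signed long r;

	r = amdgpu_fence_wait_polling(ring, seq, MAX_KIQ_REG_WAIT);

	/* cannot sleep while in GPU reset or in interrupt context */
	if (r < 1 && (adev->in_gpu_reset || in_interrupt()))
		return -ETIME;

	might_sleep();
	while (r < 1 && cnt++ < MAX_KIQ_REG_TRY) {
		msleep(MAX_KIQ_REG_BAILOUT_INTERVAL);
		r = amdgpu_fence_wait_polling(ring, seq, MAX_KIQ_REG_WAIT);
	}

	return r < 1 ? -ETIME : 0;
}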

RE: [PATCH] drm/amdgpu: refine kiq access register

2020-04-22 Thread Tao, Yintian
Hi Christian


Please see inline comments.
-Original Message-
From: Christian König  
Sent: 2020年4月22日 15:40
To: Tao, Yintian ; Koenig, Christian 
; Liu, Monk ; Liu, Shaoyun 
; Kuehling, Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq access register

Am 22.04.20 um 09:35 schrieb Tao, Yintian:
> Hi  Christian
>
>
>> BUG_ON(in_interrupt());
> That won't work like this. The KIQ is also used in interrupt context in the 
> driver, that's why we used spin_lock_irqsave().
> [yttao]: According to the current drm-next code, I have not find where to 
> access register through KIQ.
>   And you need to wait for the free kiq ring buffer space if 
> there is no free kiq ring buffer, here, you wait at interrupt context is 
> illegal.

Waiting in atomic context is illegal as well, but we don't have much other 
choice.
[yttao]: no, there is no sleep in atomic context in my patch.

We just need to make sure that waiting never happens by making the buffers 
large enough and if it still happens print and error.
[yttao]: this is not the good choice because KMD need to protect it instead of 
hoping user not frequently invoke KIQ acess.

> And I would either say that we should use the trick with the NOP to reserve 
> space on the ring buffer or call amdgpu_device_wb_get() for each read. 
> amdgpu_device_wb_get() also uses find_first_zero_bit() and should work 
> equally well.
> [yttao]: sorry, can you give me more details about how to use NOP to reserve 
> space? I will use amdgpu_device_wb_get() for the read operation.

We could use the NOP PM4 command as Felix suggested; this command has a 
header+length and says that the next X dw should be ignored on the ring 
buffer.
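
A rough, untested sketch of that NOP reservation (names and the three-argument
emit_rreg follow the patch in this thread, where reg_val_offs is a dword index
into the ring itself; the PM4 rule assumed here is that a type-3 header is
followed by COUNT+1 body dwords, so double check the spec before relying on it):

static uint32_t kiq_rreg_nop_reserved(struct amdgpu_device *adev, uint32_t reg)
{
	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
	struct amdgpu_ring *ring = &kiq->ring;
	uint32_t seq, reg_val_offs;
	unsigned long flags;

	spin_lock_irqsave(&kiq->ring_lock, flags);
	amdgpu_ring_alloc(ring, 32);

	/* NOP with a one-dword body: the CP skips it, so no later packet
	 * can land on the reserved dword */
	amdgpu_ring_write(ring, PACKET3(PACKET3_NOP, 0));
	reg_val_offs = ring->wptr & ring->buf_mask;	/* the reserved dword */
	amdgpu_ring_write(ring, 0);			/* placeholder the CP overwrites */

	amdgpu_ring_emit_rreg(ring, reg, reg_val_offs);	/* copy the register into it */
	amdgpu_fence_emit_polling(ring, &seq);
	amdgpu_ring_commit(ring);
	spin_unlock_irqrestore(&kiq->ring_lock, flags);

	/* wait for seq exactly as amdgpu_kiq_rreg() already does, then: */
	return ring->ring[reg_val_offs];
}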

But I think using amdgpu_device_wb_get() is better anyway.
[yttao]: yes, I agree with the amdgpu_device_wb_get() method because it will 
prevent the potential read race condition, but the NOP method will not prevent it

Regards,
Christian.

>
>
>
> -Original Message-----
> From: Koenig, Christian 
> Sent: 2020年4月22日 15:23
> To: Tao, Yintian ; Liu, Monk ; Liu, 
> Shaoyun ; Kuehling, Felix 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: refine kiq access register
>
>> BUG_ON(in_interrupt());
> That won't work like this. The KIQ is also used in interrupt context in the 
> driver, that's why we used spin_lock_irqsave().
>
> And I would either say that we should use the trick with the NOP to reserve 
> space on the ring buffer or call amdgpu_device_wb_get() for each read. 
> amdgpu_device_wb_get() also uses find_first_zero_bit() and should work 
> equally well.
>
> You also don't need to worry to much about overflowing the wb area.
> Since we run in an atomic context we can have at most the number of CPU in 
> the system + interrupt context here.
>
> Regards,
> Christian.
>
> Am 22.04.20 um 09:11 schrieb Tao, Yintian:
>> Add Felix and Shaoyun
>>
>> -Original Message-
>> From: Yintian Tao 
>> Sent: 2020年4月22日 12:42
>> To: Koenig, Christian ; Liu, Monk
>> 
>> Cc: amd-gfx@lists.freedesktop.org; Tao, Yintian 
>> Subject: [PATCH] drm/amdgpu: refine kiq access register
>>
>> According to the current kiq access register method, there will be race 
>> condition when using KIQ to read register if multiple clients want to read 
>> at same time just like the example below:
>> 1. client-A start to read REG-0 through KIQ 2. client-A poll the seqno-0 3. 
>> client-B start to read REG-1 through KIQ 4. client-B poll the seqno-1 5. the 
>> kiq complete these two read operation 6. client-A to read the register at 
>> the wb buffer and
>>  get REG-1 value
>>
>> And if there are multiple clients to frequently write registers through KIQ 
>> which may raise the KIQ ring buffer overwritten problem.
>>
>> Therefore, allocate fixed number wb slot for rreg use and limit the submit 
>> number which depends on the kiq ring_size in order to prevent the 
>> overwritten problem.
>>
>> Signed-off-by: Yintian Tao 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   7 +-
>>.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  12 +-
>>.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  12 +-
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   | 129 --
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |   6 +-
>>drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |   6 +-
>>drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  13 +-
>>drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|   8 +-
>>drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |   8 +-
>>drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  34 +++--
>

RE: [PATCH] drm/amdgpu: refine kiq access register

2020-04-22 Thread Tao, Yintian
Hi  Christian


> BUG_ON(in_interrupt());
That won't work like this. The KIQ is also used in interrupt context in the 
driver, that's why we used spin_lock_irqsave().
[yttao]: According to the current drm-next code, I have not found where 
registers are accessed through KIQ in interrupt context.
And you need to wait for free kiq ring buffer space if 
there is no free kiq ring buffer; waiting here in interrupt context is 
illegal.

And I would either say that we should use the trick with the NOP to reserve 
space on the ring buffer or call amdgpu_device_wb_get() for each read. 
amdgpu_device_wb_get() also uses find_first_zero_bit() and should work equally 
well.
[yttao]: sorry, can you give me more details about how to use NOP to reserve 
space? I will use amdgpu_device_wb_get() for the read operation.
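
For reference, a minimal sketch of the wb-slot-per-read variant (again assuming
the three-argument emit_rreg from this series; the retry loop from
amdgpu_kiq_rreg() is trimmed to a single wait, so treat this as an illustration
rather than final code):

static uint32_t kiq_rreg_wb_slot(struct amdgpu_device *adev, uint32_t reg)
{
	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
	struct amdgpu_ring *ring = &kiq->ring;
	uint32_t seq, reg_val_offs, value = ~0;
	unsigned long flags;

	/* one write-back slot per read, freed once the value is consumed */
	if (amdgpu_device_wb_get(adev, &reg_val_offs))
		return value;

	spin_lock_irqsave(&kiq->ring_lock, flags);
	amdgpu_ring_alloc(ring, 32);
	amdgpu_ring_emit_rreg(ring, reg, reg_val_offs);
	amdgpu_fence_emit_polling(ring, &seq);
	amdgpu_ring_commit(ring);
	spin_unlock_irqrestore(&kiq->ring_lock, flags);

	/* amdgpu_fence_wait_polling() returns 0 on timeout */
	if (amdgpu_fence_wait_polling(ring, seq, MAX_KIQ_REG_WAIT))
		value = adev->wb.wb[reg_val_offs];

	amdgpu_device_wb_free(adev, reg_val_offs);
	return value;
}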



-Original Message-
From: Koenig, Christian  
Sent: 2020年4月22日 15:23
To: Tao, Yintian ; Liu, Monk ; Liu, 
Shaoyun ; Kuehling, Felix 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq access register

> BUG_ON(in_interrupt());
That won't work like this. The KIQ is also used in interrupt context in the 
driver, that's why we used spin_lock_irqsave().

And I would either say that we should use the trick with the NOP to reserve 
space on the ring buffer or call amdgpu_device_wb_get() for each read. 
amdgpu_device_wb_get() also uses find_first_zero_bit() and should work equally 
well.

You also don't need to worry too much about overflowing the wb area. 
Since we run in an atomic context we can have at most the number of CPUs in the 
system + interrupt context here.

Regards,
Christian.

Am 22.04.20 um 09:11 schrieb Tao, Yintian:
> Add Felix and Shaoyun
>
> -Original Message-
> From: Yintian Tao 
> Sent: 2020年4月22日 12:42
> To: Koenig, Christian ; Liu, Monk 
> 
> Cc: amd-gfx@lists.freedesktop.org; Tao, Yintian 
> Subject: [PATCH] drm/amdgpu: refine kiq access register
>
> According to the current kiq access register method, there will be race 
> condition when using KIQ to read register if multiple clients want to read at 
> same time just like the example below:
> 1. client-A start to read REG-0 through KIQ 2. client-A poll the seqno-0 3. 
> client-B start to read REG-1 through KIQ 4. client-B poll the seqno-1 5. the 
> kiq complete these two read operation 6. client-A to read the register at the 
> wb buffer and
> get REG-1 value
>
> And if there are multiple clients to frequently write registers through KIQ 
> which may raise the KIQ ring buffer overwritten problem.
>
> Therefore, allocate fixed number wb slot for rreg use and limit the submit 
> number which depends on the kiq ring_size in order to prevent the overwritten 
> problem.
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   7 +-
>   .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  12 +-
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  12 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   | 129 --
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |   6 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |   6 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  13 +-
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|   8 +-
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |   8 +-
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  34 +++--
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|  12 +-
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  12 +-
>   12 files changed, 211 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 4e1d4cfe7a9f..4530e0de4257 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -526,7 +526,7 @@ static inline void amdgpu_set_ib_value(struct 
> amdgpu_cs_parser *p,
>   /*
>* Writeback
>*/
> -#define AMDGPU_MAX_WB 128/* Reserve at most 128 WB slots for 
> amdgpu-owned rings. */
> +#define AMDGPU_MAX_WB 256/* Reserve at most 256 WB slots for 
> amdgpu-owned rings. */
>   
>   struct amdgpu_wb {
>   struct amdgpu_bo*wb_obj;
> @@ -1028,6 +1028,11 @@ bool amdgpu_device_has_dc_support(struct 
> amdgpu_device *adev);
>   
>   int emu_soc_asic_init(struct amdgpu_device *adev);
>   
> +int amdgpu_gfx_kiq_lock(struct amdgpu_kiq *kiq, bool read); void 
> +amdgpu_gfx_kiq_unlock(struct amdgpu_kiq *kiq);
> +
> +void amdgpu_gfx_kiq_consume(struct amdgpu_kiq *kiq, uint32_t *offs); 
> +void amdgpu_gfx_kiq_restore(struct amdgpu_kiq *kiq, uint32_t *offs);
>   /*
>* Registers read & write functions.
>*/
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
> index 

RE: [PATCH] drm/amdgpu: refine kiq access register

2020-04-22 Thread Tao, Yintian
Add Felix and Shaoyun

-Original Message-
From: Yintian Tao  
Sent: 2020年4月22日 12:42
To: Koenig, Christian ; Liu, Monk 
Cc: amd-gfx@lists.freedesktop.org; Tao, Yintian 
Subject: [PATCH] drm/amdgpu: refine kiq access register

According to the current kiq access register method, there will be a race 
condition when using KIQ to read registers if multiple clients want to read at 
the same time, just like the example below:
1. client-A starts to read REG-0 through KIQ
2. client-A polls seqno-0
3. client-B starts to read REG-1 through KIQ
4. client-B polls seqno-1
5. the kiq completes these two read operations
6. client-A reads the register at the wb buffer and
   gets the REG-1 value

And if multiple clients frequently write registers through KIQ, the KIQ ring 
buffer may be overwritten.

Therefore, allocate a fixed number of wb slots for rreg use and limit the 
number of submissions, based on the kiq ring_size, in order to prevent the 
ring from being overwritten.

Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   7 +-
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  12 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   | 129 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  13 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|   8 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |   8 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |  34 +++--
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|  12 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  12 +-
 12 files changed, 211 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 4e1d4cfe7a9f..4530e0de4257 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -526,7 +526,7 @@ static inline void amdgpu_set_ib_value(struct 
amdgpu_cs_parser *p,
 /*
  * Writeback
  */
-#define AMDGPU_MAX_WB 128  /* Reserve at most 128 WB slots for 
amdgpu-owned rings. */
+#define AMDGPU_MAX_WB 256  /* Reserve at most 256 WB slots for 
amdgpu-owned rings. */
 
 struct amdgpu_wb {
struct amdgpu_bo*wb_obj;
@@ -1028,6 +1028,11 @@ bool amdgpu_device_has_dc_support(struct amdgpu_device 
*adev);
 
 int emu_soc_asic_init(struct amdgpu_device *adev);
 
+int amdgpu_gfx_kiq_lock(struct amdgpu_kiq *kiq, bool read); void 
+amdgpu_gfx_kiq_unlock(struct amdgpu_kiq *kiq);
+
+void amdgpu_gfx_kiq_consume(struct amdgpu_kiq *kiq, uint32_t *offs); 
+void amdgpu_gfx_kiq_restore(struct amdgpu_kiq *kiq, uint32_t *offs);
 /*
  * Registers read & write functions.
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 691c89705bcd..034c9f416499 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -309,6 +309,7 @@ static int kgd_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
uint32_t doorbell_off)
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
+   struct amdgpu_kiq *kiq = &adev->gfx.kiq;
struct amdgpu_ring *kiq_ring = &adev->gfx.kiq.ring;
struct v10_compute_mqd *m;
uint32_t mec, pipe;
@@ -324,13 +325,19 @@ static int kgd_hiq_mqd_load(struct kgd_dev *kgd, void 
*mqd,
pr_debug("kfd: set HIQ, mec:%d, pipe:%d, queue:%d.\n",
 mec, pipe, queue_id);
 
-   spin_lock(&adev->gfx.kiq.ring_lock);
+   r = amdgpu_gfx_kiq_lock(kiq, false);
+   if (r) {
+   pr_err("failed to lock kiq\n");
+   goto out_unlock;
+   }
+
r = amdgpu_ring_alloc(kiq_ring, 7);
if (r) {
pr_err("Failed to alloc KIQ (%d).\n", r);
goto out_unlock;
}
 
+   amdgpu_gfx_kiq_consume(kiq, NULL);
amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5));
amdgpu_ring_write(kiq_ring,
  PACKET3_MAP_QUEUES_QUEUE_SEL(0) | /* Queue_Sel */ @@ 
-350,8 +357,9 @@ static int kgd_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
amdgpu_ring_write(kiq_ring, m->cp_hqd_pq_wptr_poll_addr_hi);
amdgpu_ring_commit(kiq_ring);
 
+   amdgpu_gfx_kiq_restore(kiq, NULL);
 out_unlock:
-   spin_unlock(&adev->gfx.kiq.ring_lock);
+   amdgpu_gfx_kiq_unlock(&adev->gfx.kiq);
release_queue(kgd);
 
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index df841c2ac5e7..f243d9990ced 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -307,6 +307,7 @@ int kgd_gfx_v9_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
uint32_t

RE: [PATCH] drm/amdgpu: change how we update mmRLC_SPM_MC_CNTL

2020-04-21 Thread Tao, Yintian
Acked-by: Yintian Tao 

-Original Message-
From: Christian König  
Sent: 2020年4月21日 22:23
To: Liu, Monk ; He, Jacob ; Tao, Yintian 
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: change how we update mmRLC_SPM_MC_CNTL

In pp_one_vf mode avoid the extra overhead and read/write the registers without 
the KIQ.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++---  
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 10 --  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 13 ++---
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..560ec1c29977 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,14 +7030,21 @@ static int gfx_v10_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
+   u32 reg, data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   reg = SOC15_REG_OFFSET(GC, 0, mmRLC_SPM_MC_CNTL);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(reg);
+   else
+   data = RREG32(reg);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   else
+   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v10_0_check_rlcg_range(struct amdgpu_device *adev, diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..a9bcc00f4348 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5615,12 +5615,18 @@ static void gfx_v8_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)  {
u32 data;
 
-   data = RREG32(mmRLC_SPM_VMID);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(mmRLC_SPM_VMID);
+   else
+   data = RREG32(mmRLC_SPM_VMID);
 
data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
 
-   WREG32(mmRLC_SPM_VMID, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_NO_KIQ(mmRLC_SPM_VMID, data);
+   else
+   WREG32(mmRLC_SPM_VMID, data);
 }
 
 static const struct amdgpu_rlc_funcs iceland_rlc_funcs = { diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..c7de10869c81 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,14 +4950,21 @@ static int gfx_v9_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
+   u32 reg, data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   reg = SOC15_REG_OFFSET(GC, 0, mmRLC_SPM_MC_CNTL);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   data = RREG32_NO_KIQ(reg);
+   else
+   data = RREG32(reg);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   if (amdgpu_sriov_is_pp_one_vf(adev))
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   else
+   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v9_0_check_rlcg_range(struct amdgpu_device *adev,
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Tao, Yintian
Hi  Christian


Great. Then can you modify the patch according to Monk's suggestion?
We need this patch for one important project.


Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian  
Sent: 2020年4月21日 21:38
To: Liu, Monk ; Tao, Yintian ; He, Jacob 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update

> The problem is some fields are increased by hardware

What are you talking about? The bits control what is used in the MC interface, 
there is no increment or anything here.

> I think at least we should apply one change:  we use NO_KIQ for SRIOV 
> pp_one_vf_mode case to access this SPM register to avoid SRIOV KIQ 
> flood

Agreed that sounds like a good idea to me as well no matter if we use RMW or 
just a write.

Regards,
Christian.

Am 21.04.20 um 15:34 schrieb Liu, Monk:
> The problem is some fields are increased by hardware, and RLC simply 
> read its value, we cannot set those field together with VMID
>
> Christian, we should stop arguing on this small feature,  there is no way to 
> have a worse solution compared with current logic 
>
> I think at least we should apply one change:  we use NO_KIQ for SRIOV 
> pp_one_vf_mode case to access this SPM register to avoid SRIOV KIQ 
> flood
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Christian König
> Sent: Tuesday, April 21, 2020 7:52 PM
> To: Liu, Monk ; Koenig, Christian 
> ; Tao, Yintian ; He, 
> Jacob ; amd-gfx@lists.freedesktop.org
> Cc: Gu, Frans 
> Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update
>
> Hi Monk,
>
> at least on Vega that should be fine. If the RLC should use anything else 
> than 0 here we should update that together with the VMID.
>
> Regards,
> Christian.
>
> Am 21.04.20 um 11:54 schrieb Liu, Monk:
>>>>> Could only be that the firmware updates the bits to something non 
>>>>> default, I'm going to double check that on a Vega10.
>> I think that will be a sure answer, otherwise why we need those field if we 
>> always write 0 to them and reader always expect 0 reading back from them ??
>>
>> Those fields are kind of performance counters
>>
>> _________
>> Monk Liu|GPU Virtualization Team |AMD
>>
>>
>> -Original Message-
>> From: Christian König 
>> Sent: Tuesday, April 21, 2020 5:52 PM
>> To: Tao, Yintian ; Liu, Monk ; 
>> He, Jacob ; amd-gfx@lists.freedesktop.org
>> Cc: Gu, Frans 
>> Subject: Re: [PATCH] drm/amdgpu: cleanup SPM VMID update
>>
>> Am 21.04.20 um 11:45 schrieb Tao, Yintian:
>>> -Original Message-
>>> From: Christian König 
>>> Sent: 2020年4月21日 17:10
>>> To: Liu, Monk ; Tao, Yintian 
>>> ; He, Jacob ; 
>>> amd-gfx@lists.freedesktop.org
>>> Cc: Gu, Frans 
>>> Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update
>>>
>>> The RLC SPM configuration register contains the information how the memory 
>>> access is made (VMID, MTYPE, etc) which should always be consistent.
>>>
>>> So instead of a read modify write cycle of the VMID always update the whole 
>>> register.
>>>
>>> Signed-off-by: Christian König 
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
>>> drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
>>> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
>>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
>>> 4 files changed, 4 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> index 0a03e2ad5d95..2a6556371046 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> @@ -7030,12 +7030,7 @@ static int
>>> gfx_v10_0_update_gfx_clock_gating(struct amdgpu_device *adev,
>>> 
>>> static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, 
>>> unsigned vmid)  {
>>> -   u32 data;
>>> -
>>> -   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
>>> -
>>> -   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
>>> -   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
>>> RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
>>> +   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
>>> [yttao]: The orig_val is 0 which means except VMID field other reset fields 
>>> will be set to 0. Whether it is legal?

RE: [PATCH] drm/amdgpu: cleanup SPM VMID update

2020-04-21 Thread Tao, Yintian


-Original Message-
From: Christian König  
Sent: 2020年4月21日 17:10
To: Liu, Monk ; Tao, Yintian ; He, Jacob 
; amd-gfx@lists.freedesktop.org
Cc: Gu, Frans 
Subject: [PATCH] drm/amdgpu: cleanup SPM VMID update

The RLC SPM configuration register contains the information how the memory 
access is made (VMID, MTYPE, etc) which should always be consistent.

So instead of a read modify write cycle of the VMID always update the whole 
register.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 7 +--  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 7 +--
 4 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..2a6556371046 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7030,12 +7030,7 @@ static int gfx_v10_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
[yttao]: The orig_val is 0, which means that except for the VMID field the other 
fields will be set to 0. Is that legal?
 
WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  } diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index b2f10e39eff1..a92486cd038f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3570,12 +3570,7 @@ static int gfx_v7_0_rlc_resume(struct amdgpu_device 
*adev)
 
 static void gfx_v7_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
 
WREG32(mmRLC_SPM_VMID, data);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index fc6c2f2bc76c..44fdda68db98 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5613,12 +5613,7 @@ static void gfx_v8_0_unset_safe_mode(struct 
amdgpu_device *adev)
 
 static void gfx_v8_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32(mmRLC_SPM_VMID);
-
-   data &= ~RLC_SPM_VMID__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_VMID__RLC_SPM_VMID_MASK) << 
RLC_SPM_VMID__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_VMID, RLC_SPM_VMID, vmid);
 
WREG32(mmRLC_SPM_VMID, data);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 54eded9a6ac5..b36fbf991313 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,12 +4950,7 @@ static int gfx_v9_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
 
 static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)  {
-   u32 data;
-
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
-
-   data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
-   data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
+   u32 data = REG_SET_FIELD(0, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
 
WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  }
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: why we need to do infinite RLC_SPM register setting during VM flush

2020-04-20 Thread Tao, Yintian
Hi  Christian


Yes, because only pp_one_vf mode can run RGP. And according to Jacob’s comments,
the UMD will enable this feature only when running the RGP benchmark; otherwise 
the UMD will not enable it.

Therefore, multi-VF will never enter this case.


Best Regards
Yintian Tao

From: Koenig, Christian 
Sent: 2020年4月20日 20:42
To: Tao, Yintian ; Liu, Monk ; He, Jacob 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: why we need to do infinite RLC_SPM register setting during VM flush

Monk needs to answer this, but I don't think that this will work.

This explanation even sounds like only one VF can use the feature at the same 
time, is that correct?

Regards,
Christian.

Am 20.04.20 um 14:08 schrieb Tao, Yintian:
Hi  Monk, Christian


According to the discussion with Jacob offline, UMD will only enable SPM 
feature when testing RGP.
And under virtualization, only pp_one_vf mode can test RGP.
Therefore, can we directly use MMIO to READ/WRITE the RLC_SPM_MC_CNTL 
register?


Best Regards
Yintian Tao

From: amd-gfx 
<mailto:amd-gfx-boun...@lists.freedesktop.org>
 On Behalf Of Liu, Monk
Sent: 2020年4月20日 16:33
To: Koenig, Christian 
<mailto:christian.koe...@amd.com>; He, Jacob 
<mailto:jacob...@amd.com>
Cc: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: why we need to do infinite RLC_SPM register setting during VM flush

Christian

What we want to do is like:
Read the reg value from RLC_SPM_MC_CNTL into tmp
Set bits 3:0 of tmp to the VMID
Write tmp back to RLC_SPM_MC_CNTL

I didn’t find any PM4 packet on GFX9/10 that can achieve the above goal ….
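
For reference, the read-modify-write described above is what the existing MMIO
path already does; a small sketch using the helpers quoted elsewhere in this
thread (under SR-IOV these RREG32/WREG32 accesses are exactly the KIQ traffic
being discussed):

static void update_spm_vmid_rmw(struct amdgpu_device *adev, unsigned vmid)
{
	u32 data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);

	/* clear RLC_SPM_VMID (bits 3:0) and insert the new vmid */
	data = REG_SET_FIELD(data, RLC_SPM_MC_CNTL, RLC_SPM_VMID, vmid);
	WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
}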


_
Monk Liu|GPU Virtualization Team |AMD

From: Christian König 
mailto:ckoenig.leichtzumer...@gmail.com>>
Sent: Monday, April 20, 2020 4:03 PM
To: Liu, Monk mailto:monk@amd.com>>; He, Jacob 
mailto:jacob...@amd.com>>; Koenig, Christian 
mailto:christian.koe...@amd.com>>
Cc: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: why we need to do infinite RLC_SPM register setting during VM flush

I would also prefer to update the SPM VMID register using PM4 packets instead 
of the current handling.

Regards,
Christian.

Am 20.04.20 um 09:50 schrieb Liu, Monk:
I just try to explain what I want to do here, no real patch formalized yet

_
Monk Liu|GPU Virtualization Team |AMD

From: He, Jacob <mailto:jacob...@amd.com>
Sent: Monday, April 20, 2020 3:45 PM
To: Liu, Monk <mailto:monk@amd.com>; Koenig, Christian 
<mailto:christian.koe...@amd.com>
Cc: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: why we need to do infinite RLC_SPM register setting during VM flush


[AMD Official Use Only - Internal Distribution Only]

Do you miss a file which adds spm_updated to vm structure?

From: Liu, Monk mailto:monk@amd.com>>
Sent: Monday, April 20, 2020 3:32 PM
To: He, Jacob mailto:jacob...@amd.com>>; Koenig, Christian 
mailto:christian.koe...@amd.com>>
Cc: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> 
mailto:amd-gfx@lists.freedesktop.org>>
Subject: why we need to do infinite RLC_SPM register setting during VM flush


Hi Jaco & Christian



As titled , check below patch:



commit 10790a09ea584cc832353a5c2a481012e5e31a13

Author: Jacob He mailto:jacob...@amd.com>>

Date:   Fri Feb 28 20:24:41 2020 +0800



drm/amdgpu: Update SPM_VMID with the job's vmid when application reserves 
the vmid



SPM access the video memory according to SPM_VMID. It should be updated

with the job's vmid right before the job is scheduled. SPM_VMID is a

global resource



Change-Id: Id3881908960398f87e7c95026a54ff83ff826700

Signed-off-by: Jacob He mailto:jacob...@amd.com>>

Reviewed-by: Christian König 
mailto:christian.koe...@amd.com>>



diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 6e6fc8c..ba2236a 100644

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

@@ -1056,8 +1056,12 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,

struct dma_fence *fence = NULL;

bool pasid_mapping_needed = false;

unsigned patch_offset = 0;

+   bool update_spm_vmid_needed = (job->vm && 
(job->vm->reserved_vmid[vmhub] != NULL));

int r;



+   if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid)

+   adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);

+

if (amdgpu_vmid_had_gpu_reset(adev, id)) {

gds_switch_needed = true;

vm_flush_needed = true;



this update_spm_vmid() looks like complete overkill to me, we only need to do 
it once for its VM …

RE: [PATCH] drm/amdgpu: update spm register through mmio

2020-04-20 Thread Tao, Yintian
Add Monk

-Original Message-
From: Yintian Tao  
Sent: 2020年4月20日 20:37
To: Koenig, Christian ; monk@amd.co
Cc: amd-gfx@lists.freedesktop.org; Tao, Yintian 
Subject: [PATCH] drm/amdgpu: update spm register through mmio

According to the UMD design, only performance analysis benchmarks such as 
RGP, GPA and so on need to update the spm register; other use cases will not use 
this feature.
Therefore, we can directly access the spm register through mmio.

Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 4 +++-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 ++--
 drivers/gpu/drm/amd/amdgpu/soc15_common.h | 3 +++
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index accbb34ea670..820f560adc33 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1083,7 +1083,9 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,
bool update_spm_vmid_needed = (job->vm && 
(job->vm->reserved_vmid[vmhub] != NULL));
int r;
 
-   if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid)
+   if ((!amdgpu_sriov_vf(adev) || amdgpu_passthrough(adev) ||
+amdgpu_sriov_is_pp_one_vf(adev)) &&
+   update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid)
adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
 
if (amdgpu_vmid_had_gpu_reset(adev, id)) { diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0a03e2ad5d95..bfb873f023c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7032,12 +7032,12 @@ static void gfx_v10_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)  {
u32 data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   data = RREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v10_0_check_rlcg_range(struct amdgpu_device *adev, diff --git 
a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 84fcf842316d..514efc4fe269 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4950,12 +4950,12 @@ static void gfx_v9_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)  {
u32 data;
 
-   data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
+   data = RREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL);
 
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
+   WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
 static bool gfx_v9_0_check_rlcg_range(struct amdgpu_device *adev, diff --git 
a/drivers/gpu/drm/amd/amdgpu/soc15_common.h 
b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
index c893c645a4b2..56d02aa690a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h
+++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
@@ -35,6 +35,9 @@
 #define RREG32_SOC15(ip, inst, reg) \
RREG32(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg)
 
+#define RREG32_SOC15_NO_KIQ(ip, inst, reg) \
+   RREG32_NO_KIQ(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg)
+
 #define RREG32_SOC15_OFFSET(ip, inst, reg, offset) \
RREG32((adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg) + 
offset)
 
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: why we need to do infinite RLC_SPM register setting during VM flush

2020-04-20 Thread Tao, Yintian
Hi  Monk, Christian


According to the discussion with Jacob offline, UMD will only enable SPM 
feature when testing RGP.
And under virtualization, only pp_one_vf mode can test RGP.
Therefore, can we directly use MMIO to READ/WRITE the RLC_SPM_MC_CNTL 
register?


Best Regards
Yintian Tao

From: amd-gfx  On Behalf Of Liu, Monk
Sent: 2020年4月20日 16:33
To: Koenig, Christian ; He, Jacob 
Cc: amd-gfx@lists.freedesktop.org
Subject: RE: why we need to do infinite RLC_SPM register setting during VM flush

Christian

What we want to do is like:
Read the reg value from RLC_SPM_MC_CNTL into tmp
Set bits 3:0 of tmp to the VMID
Write tmp back to RLC_SPM_MC_CNTL

I didn’t find any PM4 packet on GFX9/10 that can achieve the above goal ….


_
Monk Liu|GPU Virtualization Team |AMD

From: Christian König 
mailto:ckoenig.leichtzumer...@gmail.com>>
Sent: Monday, April 20, 2020 4:03 PM
To: Liu, Monk mailto:monk@amd.com>>; He, Jacob 
mailto:jacob...@amd.com>>; Koenig, Christian 
mailto:christian.koe...@amd.com>>
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: why we need to do infinite RLC_SPM register setting during VM flush

I would also prefer to update the SPM VMID register using PM4 packets instead 
of the current handling.

Regards,
Christian.

Am 20.04.20 um 09:50 schrieb Liu, Monk:
I just try to explain what I want to do here, no real patch formalized yet

_
Monk Liu|GPU Virtualization Team |AMD

From: He, Jacob 
Sent: Monday, April 20, 2020 3:45 PM
To: Liu, Monk ; Koenig, Christian 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: why we need to do infinite RLC_SPM register setting during VM flush


[AMD Official Use Only - Internal Distribution Only]

Do you miss a file which adds spm_updated to vm structure?

From: Liu, Monk mailto:monk@amd.com>>
Sent: Monday, April 20, 2020 3:32 PM
To: He, Jacob mailto:jacob...@amd.com>>; Koenig, Christian 
mailto:christian.koe...@amd.com>>
Cc: amd-gfx@lists.freedesktop.org 
mailto:amd-gfx@lists.freedesktop.org>>
Subject: why we need to do infinite RLC_SPM register setting during VM flush


Hi Jaco & Christian



As titled , check below patch:



commit 10790a09ea584cc832353a5c2a481012e5e31a13

Author: Jacob He mailto:jacob...@amd.com>>

Date:   Fri Feb 28 20:24:41 2020 +0800



drm/amdgpu: Update SPM_VMID with the job's vmid when application reserves 
the vmid



SPM access the video memory according to SPM_VMID. It should be updated

with the job's vmid right before the job is scheduled. SPM_VMID is a

global resource



Change-Id: Id3881908960398f87e7c95026a54ff83ff826700

Signed-off-by: Jacob He mailto:jacob...@amd.com>>

Reviewed-by: Christian König 
mailto:christian.koe...@amd.com>>



diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 6e6fc8c..ba2236a 100644

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

@@ -1056,8 +1056,12 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,

struct dma_fence *fence = NULL;

bool pasid_mapping_needed = false;

unsigned patch_offset = 0;

+   bool update_spm_vmid_needed = (job->vm && 
(job->vm->reserved_vmid[vmhub] != NULL));

int r;



+   if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid)

+   adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);

+

if (amdgpu_vmid_had_gpu_reset(adev, id)) {

gds_switch_needed = true;

vm_flush_needed = true;



this update_spm_vmid() looks like complete overkill to me, we only need to do 
it once for its VM …



in SRIOV the register reading/writing for update_spm_vmid() is now carried out by 
KIQ, thus there is too much burden on the KIQ for such unnecessary jobs …



I want to change it to only do it once per VM, like:



diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 6e6fc8c..ba2236a 100644

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

@@ -1056,8 +1056,12 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,

struct dma_fence *fence = NULL;

   bool pasid_mapping_needed = false;

unsigned patch_offset = 0;

+   bool update_spm_vmid_needed = (job->vm && 
(job->vm->reserved_vmid[vmhub] != NULL));

int r;



+   if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid &&  
!vm->spm_updated) {

+   adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);

+   vm->spm_updated = true;

+   }



if (amdgpu_vmid_had_gpu_reset(adev, id)) {

 

RE: [PATCH] drm/amdgpu: refine kiq read register

2020-04-20 Thread Tao, Yintian
Hi  Christian


This patch has not been merged because it is still under discussion among you, 
Monk and Felix.

Instead of this crude hack please let us just allocate a fixed number of write 
back slots and use them round robin. Then we can make sure that we don't have 
more than a fixed number of reads in flight at the same time as well using the 
fence values.
[yttao]: Yes, the fixed number of write back slots can also fix problem-1 
which Monk described, but it still can't fix problem-2. It seems the 
fixed-number solution can also fix one potential bug raised by msleep() when kiq 
reads a register.
Because currently there is no protection mechanism for KIQ ring 
submission. Now, there are 5 submitters which can write the kiq ring 
buffer without any limitation (a rough sketch of one possible guard follows the 
list below):
1. kiq read/write register
2. amdgpu_vm_flush
3. invalidate tlb
4. kfd hiq_mqd_load
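
A hypothetical sketch of such a guard (MAX_KIQ_INFLIGHT and the helper name are
made up, not existing defines; the idea is only to bound how many requests can
sit on the KIQ ring at once):

#define MAX_KIQ_INFLIGHT	64	/* assumed budget, not an existing define */

/* before emitting request 'next_seq', make sure the request that is
 * MAX_KIQ_INFLIGHT behind it has signalled, so its ring space is free again */
static int kiq_wait_for_room(struct amdgpu_ring *ring, uint32_t next_seq)
{
	if (next_seq <= MAX_KIQ_INFLIGHT)
		return 0;

	/* amdgpu_fence_wait_polling() returns 0 on timeout */
	if (!amdgpu_fence_wait_polling(ring, next_seq - MAX_KIQ_INFLIGHT,
				       MAX_KIQ_REG_WAIT))
		return -ETIME;

	return 0;
}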


Hi  Felix

I have one question about the function kgd_gfx_v9_hiq_mqd_load(). I see it directly 
writes contents into the kiq ring and does not wait for the fence. Do you know how KFD 
knows the hiq_mqd_load is complete? Thanks in advance.



Best Regards
Yintian Tao
-Original Message-
From: Koenig, Christian  
Sent: 2020年4月20日 15:19
To: Liu, Monk ; Kuehling, Felix ; 
Tao, Yintian 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq read register

Hi Monk,

> Can we first get the first problem done ?

Please absolutely not! See, the problem introduced here is quite a bit worse than the 
actual fix.

Previously we ended up with an invalid value in a concurrent register read; now 
the KIQ overwrites its own commands and most likely causes a hang or the 
hardware to execute something random.

Instead of this crude hack please let us just allocate a fixed number of write 
back slots and use them round robin. Then we can make sure that we don't have 
more than a fixed number of reads in flight at the same time as well using the 
fence values.
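
One possible reading of that suggestion, with made-up names (each slot remembers
the fence of its last user and is only handed out again once that fence has
signalled, so at most KIQ_RREG_SLOTS reads are ever in flight):

#define KIQ_RREG_SLOTS	8	/* assumed pool size */

struct kiq_rreg_slot {
	uint32_t wb_offs;	/* filled via amdgpu_device_wb_get() at init time */
	uint32_t fence_seq;	/* fence of the last read that used this slot */
};

static struct kiq_rreg_slot kiq_slots[KIQ_RREG_SLOTS];
static unsigned int kiq_slot_head;

static struct kiq_rreg_slot *kiq_get_rreg_slot(struct amdgpu_ring *ring)
{
	struct kiq_rreg_slot *slot = &kiq_slots[kiq_slot_head++ % KIQ_RREG_SLOTS];

	/* reuse the slot only after its previous read has completed
	 * (amdgpu_fence_wait_polling() returns 0 on timeout) */
	if (!amdgpu_fence_wait_polling(ring, slot->fence_seq, MAX_KIQ_REG_WAIT))
		return NULL;

	return slot;
}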

This should fix both problems at the same time and not introduce another 
potential problematic hack.

If this patch is already committed please revert it immediately.

Regards,
Christian.

Am 20.04.20 um 08:20 schrieb Liu, Monk:
> Christian
>
>>>> Well I was under the assumption that this is actually what is done here.
> If that is not the case the patch is a rather clear NAK.
> <<<
>
> There are two kinds of problems in the current KIQ reading reg, Yintian tend 
> to fix one of them but not all ...
>
> The first problem is :
> During the sleep of the first KIQ reading, another KIQ reading is initiated and 
> the read-back register value flushes the first readback value, thus the first 
> reading will get the wrong result.
> This is the issue Yintian's patch addresses, by putting the readback 
> value not in a shared WB but in a chunk of DWs of the command submission
>
> The second problem is:
> Since we don't utilize the GPU scheduler for KIQ submission, if the KIQ is 
> busy with some commands then those unfinished commands may be overwritten by a 
> new command submission, and that's not the problem Yintian's patch tends to 
> address. Felix pointed it out, which is fine, and we can use another patch to 
> address it; I'm also planning and scoping it.
>
> The optional way is:
> 1) We use the GPU scheduler to manage KIQ activity, and all jobs are 
> submitted to KIQ through an IB, thus no overwrite will happen
> 2) we still skip the gpu scheduler but always use an IB to put jobs on the KIQ, 
> thus each JOB will occupy a fixed space/DW of the RB, so we can avoid 
> overwriting unfinished commands
>
> We can discuss the second problem later
>
> Can we first get the first problem done ? thanks
>
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: Christian König 
> Sent: Monday, April 20, 2020 1:03 AM
> To: Kuehling, Felix ; Tao, Yintian 
> ; Liu, Monk 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: refine kiq read register
>
> Am 17.04.20 um 17:39 schrieb Felix Kuehling:
>> Am 2020-04-17 um 2:53 a.m. schrieb Yintian Tao:
>>> According to the current kiq read register method, there will be 
>>> race condition when using KIQ to read register if multiple clients 
>>> want to read at same time just like the example below:
>>> 1. client-A start to read REG-0 through KIQ 2. client-A poll the
>>> seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll 
>>> the seqno-1 5. the kiq complete these two read operation 6. client-A 
>>> to read the register at the wb buffer and
>>>  get REG-1 value
>>>
>>> Therefore, directly make kiq write the register value at the ring 
>>> buffer then there will be no race condition for the wb buffer.
>>>
>>> v2: supply the read_clock and move the reg_val_offs back

RE: [PATCH] drm/amdgpu: refine kiq read register

2020-04-19 Thread Tao, Yintian
Hi Felix

Many thanks for your review. I have modified it according to your comments and 
suggestion.

Best Regards
Yintian Tao

-Original Message-
From: Kuehling, Felix  
Sent: 2020年4月17日 23:39
To: Tao, Yintian ; Liu, Monk 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq read register

Am 2020-04-17 um 2:53 a.m. schrieb Yintian Tao:
> According to the current kiq read register method, there will be race 
> condition when using KIQ to read register if multiple clients want to 
> read at same time just like the example below:
> 1. client-A start to read REG-0 through KIQ 2. client-A poll the 
> seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll 
> the seqno-1 5. the kiq complete these two read operation 6. client-A 
> to read the register at the wb buffer and
>get REG-1 value
>
> Therefore, directly make kiq write the register value at the ring 
> buffer then there will be no race condition for the wb buffer.
>
> v2: supply the read_clock and move the reg_val_offs back
>
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  | 11 --  
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |  1 -  
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  5 +++--
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 14 +---
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 14 +---
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 28 
>  6 files changed, 33 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index ea576b4260a4..4e1c0239e561 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -304,10 +304,6 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device 
> *adev,
>  
>   spin_lock_init(&kiq->ring_lock);
>  
> - r = amdgpu_device_wb_get(adev, &kiq->reg_val_offs);
> - if (r)
> - return r;
> -
>   ring->adev = NULL;
>   ring->ring_obj = NULL;
>   ring->use_doorbell = true;
> @@ -331,7 +327,6 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device 
> *adev,
>  
>  void amdgpu_gfx_kiq_free_ring(struct amdgpu_ring *ring)  {
> - amdgpu_device_wb_free(ring->adev, ring->adev->gfx.kiq.reg_val_offs);
>   amdgpu_ring_fini(ring);
>  }
>  
> @@ -675,12 +670,14 @@ uint32_t amdgpu_kiq_rreg(struct amdgpu_device *adev, 
> uint32_t reg)
>   uint32_t seq;
>   struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>   struct amdgpu_ring *ring = &kiq->ring;
> + uint64_t reg_val_offs = 0;
>  
>   BUG_ON(!ring->funcs->emit_rreg);
>  
>   spin_lock_irqsave(&kiq->ring_lock, flags);
>   amdgpu_ring_alloc(ring, 32);
> - amdgpu_ring_emit_rreg(ring, reg);
> + reg_val_offs = (ring->wptr & ring->buf_mask) + 30;

I think that should be (ring->wptr + 30) & ring->buf_mask. Otherwise the 
reg_val_offset can be past the end of the ring.

But that still leaves a problem if another command is submitted to the KIQ 
before you read the returned reg_val from the ring. Your reg_val can be 
overwritten by the new command and you get the wrong result. Or the command can 
be overwritten with the reg_val, which will most likely hang the CP.

You could allocate space on the KIQ ring with a NOP command to prevent that 
space from being overwritten by other commands.
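
Spelling that out (assuming a power-of-two ring where buf_mask == ring_size - 1):

/* (ring->wptr & ring->buf_mask) + 30 can exceed buf_mask near the end of the
 * ring, so the offset has to be masked after the addition instead: */
static uint64_t kiq_rreg_slot_offset(struct amdgpu_ring *ring)
{
	return (ring->wptr + 30) & ring->buf_mask;	/* wraps together with the ring */
}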

Regards,
  Felix


> + amdgpu_ring_emit_rreg(ring, reg, reg_val_offs);
>   amdgpu_fence_emit_polling(ring, &seq);
>   amdgpu_ring_commit(ring);
>   spin_unlock_irqrestore(&kiq->ring_lock, flags); @@ -707,7 +704,7 @@ 
> uint32_t amdgpu_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
>   if (cnt > MAX_KIQ_REG_TRY)
>   goto failed_kiq_read;
>  
> - return adev->wb.wb[kiq->reg_val_offs];
> + return ring->ring[reg_val_offs];
>  
>  failed_kiq_read:
>   pr_err("failed to read reg:%x\n", reg); diff --git 
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> index 634746829024..ee698f0246d8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> @@ -103,7 +103,6 @@ struct amdgpu_kiq {
>   struct amdgpu_ring  ring;
>   struct amdgpu_irq_src   irq;
>   const struct kiq_pm4_funcs *pmf;
> - uint32_treg_val_offs;
>  };
>  
>  /*
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index f61664ee4940..a3d88f2aa9f4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -181,7 +181,8 @@ struct amdgpu_ri

RE: [PATCH] drm/amdgpu: refine kiq read register

2020-04-17 Thread Tao, Yintian
Hi  Christian 


Can you help give more details about how this spm trace works?
After reviewing the gfx_v9_0_update_spm_vmid function, I am somewhat confused.


For example:
Assume there are two gfx jobs which can be submitted to the gfx ring. 
When the second gfx job is submitted, the vmid of the first gfx job written to 
mmRLC_SPM_MC_CNTL may be overwritten by the second gfx job's vmid.
I am not sure whether that will raise a problem.


Best Regards
Yintian Tao

-Original Message-
From: Liu, Monk  
Sent: 2020年4月17日 17:40
To: Koenig, Christian ; Tao, Yintian 
; Kuehling, Felix ; Deucher, 
Alexander ; Zhang, Hawking ; 
Ming, Davis ; Jiang, Jerry (SW) 
Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: refine kiq read register

Hi Christian

mmRLC_SPM_MC_CNTL

this register is an RLC register; to my understanding it is a PF-shared 
register, and I did an experiment that proved it:
1) write abc to it in PF
2) read it from VF, it shows abc
3) write ff to it in VF, read it, it is still abc

So this register with current policy (L1) is a VF read, PF write register, and 
this register is physically shared among PF/VF 

We should not even try to write it in VF side, no matter CPU or KIQ (KIQ write 
within VF role will also be blocked by the L1 policy)

From what I can see so far: we need to drop this feature for SRIOV, or we need 
to change Policy 

+@Ming, Davis and @Jiang, Jerry (SW) for awareness

DRM-NEXT kernel branch has a new feature to massively use KIQ to read/write 
this register " mmRLC_SPM_MC_CNTL" which is a PF w/r bug VF R only register.
We need to figure out what should we do on it 

I will talk to UMD guys later (they initiated this feature in our kernel driver 
) _
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Koenig, Christian 
Sent: Friday, April 17, 2020 5:14 PM
To: Liu, Monk ; Tao, Yintian ; Kuehling, 
Felix ; Deucher, Alexander ; 
Zhang, Hawking 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: refine kiq read register

> Dynamic alloc each time doing KIQ reg read is a overkill to me
Yeah, that is a rather good argument.

> Now  we do KIQ read and write *every time* we do amdgpu_vm_flush 
> (omg... what's this  ??)

That is updating the VMID used for the SPM trace. And yes, this 
read/modify/write is most likely not a good idea; we should rather just write 
the value we want to have, or not use the KIQ here.

Most likely the latter, because IIRC this is a per VF register.

Christian.

Am 17.04.20 um 11:06 schrieb Liu, Monk:
> Christian
>
> See we wanted to map the ring buffers read only and USWC for some time.
> That would result in either not working driver or rather crappy performance.
> <<
>
> For KIQ the ring buffer wouldn't be read only ... should be cacheable 
> type
>
> Dynamic alloc each time doing KIQ reg read is a overkill to me, leverage ring 
> buffer is a high efficient way.
>
> Besides looks now the KIQ register reading is really massive, check this code:
>
> 4949 static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, 
> unsigned vmid)
> 4950 {
> 4951 u32 data;
> 4952
> 4953 data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
> 4954
> 4955 data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
> 4956 data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
> RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
> 4957
> 4958 WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
> 4959 }
>
> Now  we do KIQ read and write *every time* we do amdgpu_vm_flush 
> (omg... what's this  ??)
>
>
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: Koenig, Christian 
> Sent: Friday, April 17, 2020 4:59 PM
> To: Liu, Monk ; Tao, Yintian ; 
> Kuehling, Felix ; Deucher, Alexander 
> ; Zhang, Hawking 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: refine kiq read register
>
> Looks like a rather important bug fix to me, but I'm not sure if writing the 
> value into the ring buffer is a good idea.
>
> See we wanted to map the ring buffers read only and USWC for some time.
> That would result in either not working driver or rather crappy performance.
>
> Can't we just call amdgpu_device_wb_get() in amdgpu_device_wb_get() instead 
> and allocate the wb address dynamically?
>
> Regards,
> Christian.
>
> Am 17.04.20 um 09:01 schrieb Liu, Monk:
>> The change Looks good with me, you can put my RB to your patch .
>>
>> Since this patch impact on general logic (not SRIOV only) I would 
>> like you wait a little longer for @Kuehling, Felix and @Deucher, 
>> Alexander and @Koenig, Christian  @Zhang, Hawking
>>
>> If any of them gave you a RB I think we can go this way
>>
>

RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-13 Thread Tao, Yintian
Hi  Monk

Thanks for your review. I have changed the code according to your comments. 
Please help review it again.

Best Regards
Yintian Tao

-Original Message-
From: Liu, Monk  
Sent: 2020年4月11日 16:59
To: Tao, Yintian ; Alex Deucher 
Cc: Deucher, Alexander ; Koenig, Christian 
; amd-gfx list 
Subject: RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

Hi Yintian

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index c0f9a651dc06..4f9780aabf5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -152,11 +152,17 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
struct file *f,
if (r < 0)
return r;
 
+   if (!amdgpu_virt_can_access_debugfs(adev))
+   return -EINVAL;
+   else
+   amdgpu_virt_enable_access_debugfs(adev);

Your patch looks like it simply bails out if you are not under the "debug" condition, but 
that looks weird: you are forbidding KIQ from doing the debugfs access under the 
non-debug condition, which is overkill to me.

The ideal logic is:
1) When we are under "tdr_debug" mode, we allow debugfs to be handled by MMIO 
access, just like bare-metal
2) When we are not under "tdr_debug" mode (e.g. no tdr triggered) we shall 
still allow debugfs to work, but all the register access can go through the KIQ.

It looks like you are dropping 2) totally ... 

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-----
From: amd-gfx  On Behalf Of Tao, Yintian
Sent: Thursday, April 9, 2020 11:23 PM
To: Alex Deucher 
Cc: Deucher, Alexander ; Koenig, Christian 
; amd-gfx list 
Subject: RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

Hi  Alex

Many thanks for your review.



-Original Message-
From: Alex Deucher 
Sent: 2020年4月9日 23:21
To: Tao, Yintian 
Cc: Koenig, Christian ; Deucher, Alexander 
; amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

On Thu, Apr 9, 2020 at 10:54 AM Yintian Tao  wrote:
>
> Under bare metal, there is no more else to take care of the GPU 
> register access through MMIO.
> Under Virtualization, to access GPU register is implemented through 
> KIQ during run-time due to world-switch.
>
> Therefore, under SR-IOV user can only access debugfs to r/w GPU 
> registers when meets all three conditions below.
> - amdgpu_gpu_recovery=0
> - TDR happened
> - in_gpu_reset=0
>
> v2: merge amdgpu_virt_can_access_debugfs() into
> amdgpu_virt_enable_access_debugfs()
>
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 73 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 26 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
>  4 files changed, 108 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index c0f9a651dc06..1a4894fa3693 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -152,11 +152,16 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> if (use_bank) {
> if ((sh_bank != 0x && sh_bank >= 
> adev->gfx.config.max_sh_per_se) ||
> (se_bank != 0x && se_bank >= 
> adev->gfx.config.max_shader_engines)) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return -EINVAL;
> }
> mutex_lock(>grbm_idx_mutex);
> @@ -207,6 +212,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -255,6 +261,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -263,6 +273,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_re

RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-09 Thread Tao, Yintian
Hi  Alex

Many thanks for your review.



-Original Message-
From: Alex Deucher  
Sent: 2020年4月9日 23:21
To: Tao, Yintian 
Cc: Koenig, Christian ; Deucher, Alexander 
; amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

On Thu, Apr 9, 2020 at 10:54 AM Yintian Tao  wrote:
>
> Under bare metal, there is no more else to take care of the GPU 
> register access through MMIO.
> Under Virtualization, to access GPU register is implemented through 
> KIQ during run-time due to world-switch.
>
> Therefore, under SR-IOV user can only access debugfs to r/w GPU 
> registers when meets all three conditions below.
> - amdgpu_gpu_recovery=0
> - TDR happened
> - in_gpu_reset=0
>
> v2: merge amdgpu_virt_can_access_debugfs() into
> amdgpu_virt_enable_access_debugfs()
>
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 73 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 26 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
>  4 files changed, 108 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index c0f9a651dc06..1a4894fa3693 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -152,11 +152,16 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> if (use_bank) {
> if ((sh_bank != 0x && sh_bank >= 
> adev->gfx.config.max_sh_per_se) ||
> (se_bank != 0x && se_bank >= 
> adev->gfx.config.max_shader_engines)) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return -EINVAL;
> }
> mutex_lock(>grbm_idx_mutex);
> @@ -207,6 +212,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -255,6 +261,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -263,6 +273,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> if (r) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r;
> }
>
> @@ -275,6 +286,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return result;
>  }
>
> @@ -304,6 +316,10 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct 
> file *f, const char __user
> if (r < 0)
> return r;
>
> +   r = amdgpu_virt_enable_access_debugfs(adev);
> +   if (r < 0)
> +   return r;
> +
> while (size) {
> uint32_t value;
>
> @@ -311,6 +327,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
> *f, const char __user
> if (r) {
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r;
> }
>
> @@ -325,6 +342,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
> *f, const char __user
> pm_runtime_mark_last_busy(adev->ddev->dev);
> pm_runtime_put_autosuspend(adev->ddev->dev);
>
> +   amdgpu_virt_disable_access_debugfs(adev);
> return r
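To make the rule above concrete, here is a rough user-space sketch of the
gating idea. The structure, field names and flags are invented for
illustration; this is not the real amdgpu_virt_enable_access_debugfs()
implementation, only a model of the three-condition check it performs.

/*
 * Minimal model of the debugfs gating rule discussed in this thread.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

struct dev_model {
	bool sriov_vf;       /* running as an SR-IOV virtual function        */
	int  gpu_recovery;   /* modeled amdgpu_gpu_recovery module parameter */
	bool tdr_happened;   /* a timeout/reset event was observed           */
	bool in_gpu_reset;   /* a reset is currently in progress             */
};

/* Return 0 when debugfs register access may proceed, -EPERM otherwise. */
static int enable_access_debugfs(struct dev_model *d)
{
	if (!d->sriov_vf)
		return 0;               /* bare metal: always allowed */
	if (d->gpu_recovery == 0 && d->tdr_happened && !d->in_gpu_reset)
		return 0;               /* VF: only in this narrow window */
	return -EPERM;
}

static void disable_access_debugfs(struct dev_model *d)
{
	(void)d;                        /* nothing to undo in this model */
}

int main(void)
{
	struct dev_model d = { .sriov_vf = true, .gpu_recovery = 0,
			       .tdr_happened = true, .in_gpu_reset = false };
	int r = enable_access_debugfs(&d);

	if (r) {
		printf("debugfs register access rejected: %d\n", r);
		return 1;
	}
	printf("debugfs register access allowed\n");
	/* ... perform the register read/write here ... */
	disable_access_debugfs(&d);
	return 0;
}

The point of the enable/disable pair is simply that every debugfs register
path brackets its access with this check, which is what the patch does around
each read and write handler.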

RE: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

2020-04-09 Thread Tao, Yintian
Hi  Christian


Many thanks for your review. I will submit one new patch according to your 
suggestion.


Best Regards
Yintian Tao
-Original Message-
From: Koenig, Christian  
Sent: April 9, 2020 20:42
To: Tao, Yintian ; Deucher, Alexander 
; Deng, Emily 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: restrict debugfs register access under SR-IOV

On 09.04.20 at 08:01, Yintian Tao wrote:
> Under bare metal, nothing else takes care of GPU register access, so the
> driver accesses registers directly through MMIO.
> Under virtualization, GPU register access is implemented through the
> KIQ at run-time because of world switches.
>
> Therefore, under SR-IOV the user can only access debugfs to read/write GPU
> registers when all three conditions below are met.
> - amdgpu_gpu_recovery=0
> - TDR happened
> - in_gpu_reset=0
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 83 -
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  7 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 23 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h|  7 ++
>   4 files changed, 114 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index c0f9a651dc06..4f9780aabf5a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -152,11 +152,17 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
>   if (r < 0)
>   return r;
>   
> + if (!amdgpu_virt_can_access_debugfs(adev))
> + return -EINVAL;
> + else
> + amdgpu_virt_enable_access_debugfs(adev);
> +

It would be better to merge these two functions together.

E.g. that amdgpu_virt_enable_access_debugfs() returns an error if we can't 
allow this.

And -EINVAL is maybe not the right thing here, since this is not caused by an 
invalid value.

Maybe use -EPERM instead.

Regards,
Christian.

>   if (use_bank) {
>   if ((sh_bank != 0x && sh_bank >= 
> adev->gfx.config.max_sh_per_se) ||
>   (se_bank != 0x && se_bank >= 
> adev->gfx.config.max_shader_engines)) {
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
> + amdgpu_virt_disable_access_debugfs(adev);
>   return -EINVAL;
>   }
>   mutex_lock(>grbm_idx_mutex);
> @@ -207,6 +213,7 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
> struct file *f,
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
>   
> + amdgpu_virt_disable_access_debugfs(adev);
>   return result;
>   }
>   
> @@ -255,6 +262,11 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
>   if (r < 0)
>   return r;
>   
> + if (!amdgpu_virt_can_access_debugfs(adev))
> + return -EINVAL;
> + else
> + amdgpu_virt_enable_access_debugfs(adev);
> +
>   while (size) {
>   uint32_t value;
>   
> @@ -263,6 +275,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
>   if (r) {
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
> + amdgpu_virt_disable_access_debugfs(adev);
>   return r;
>   }
>   
> @@ -275,6 +288,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
> *f, char __user *buf,
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
>   
> + amdgpu_virt_disable_access_debugfs(adev);
>   return result;
>   }
>   
> @@ -304,6 +318,11 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct 
> file *f, const char __user
>   if (r < 0)
>   return r;
>   
> + if (!amdgpu_virt_can_access_debugfs(adev))
> + return -EINVAL;
> + else
> + amdgpu_virt_enable_access_debugfs(adev);
> +
>   while (size) {
>   uint32_t value;
>   
> @@ -311,6 +330,7 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
> *f, const char __user
>   if (r) {
>   pm_runtime_mark_last_busy(adev->ddev->dev);
>   pm_runtime_put_autosuspend(adev->ddev->dev);
> + amdgpu_virt_disable_acc

RE: [PATCH] drm/amdgpu: skip access sdma_v5_0 registers under SRIOV

2020-03-30 Thread Tao, Yintian
Hi  Emily

Many thanks

-Original Message-
From: Deng, Emily  
Sent: March 30, 2020 19:57
To: Tao, Yintian ; Koenig, Christian 
; Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org; Tao, Yintian 
Subject: RE: [PATCH] drm/amdgpu: skip access sdma_v5_0 registers under SRIOV

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Emily Deng 

Best wishes
Emily Deng
>-Original Message-
>From: amd-gfx  On Behalf Of 
>Yintian Tao
>Sent: Monday, March 30, 2020 4:50 PM
>To: Koenig, Christian ; Deucher, Alexander 
>
>Cc: amd-gfx@lists.freedesktop.org; Tao, Yintian 
>Subject: [PATCH] drm/amdgpu: skip access sdma_v5_0 registers under 
>SRIOV
>
>Due to the new L1.0b0c011b policy, many SDMA registers are blocked,
>which raises violation warnings. There are 6 register pairs in total
>that need to be skipped during driver init and de-init:
>mmSDMA0/1_CNTL
>mmSDMA0/1_F32_CNTL
>mmSDMA0/1_UTCL1_PAGE
>mmSDMA0/1_UTCL1_CNTL
>mmSDMA0/1_CHICKEN_BITS,
>mmSDMA0/1_SEM_WAIT_FAIL_TIMER_CNTL
>
>Signed-off-by: Yintian Tao 
>Change-Id: I9d5087582ceb5f629d37bf856533d00c179e6de3
>---
> drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 110 +
> 1 file changed, 75 insertions(+), 35 deletions(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
>b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
>index b3c30616d6b4..d7c0269059b0 100644
>--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
>+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
>@@ -88,6 +88,29 @@ static const struct soc15_reg_golden 
>golden_settings_sdma_5[] = {
>   SOC15_REG_GOLDEN_VALUE(GC, 0, mmSDMA1_UTCL1_PAGE, 0x00ff, 
>0x000c5c00)  };
>
>+static const struct soc15_reg_golden golden_settings_sdma_5_sriov[] = {
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_GFX_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_PAGE_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_RLC0_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_RLC1_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_RLC2_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_RLC3_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_RLC4_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_RLC5_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_RLC6_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_RLC7_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_GFX_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_PAGE_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_RLC0_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_RLC1_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_RLC2_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_RLC3_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_RLC4_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_RLC5_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_RLC6_RB_WPTR_POLL_CNTL, 0xfff7, 0x00403000),
>+  SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_RLC7_RB_WPTR_POLL_CNTL,
>+0xfff7, 0x00403000), };
>+
> static const struct soc15_reg_golden golden_settings_sdma_nv10[] = {
>   SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA0_RLC3_RB_WPTR_POLL_CNTL, 0xfff0, 0x00403000),
>   SOC15_REG_GOLDEN_VALUE(GC, 0,
>mmSDMA1_RLC3_RB_WPTR_POLL_CNTL, 0xfff0, 0x00403000), @@ -141,9
>+164,14 @@ static void sdma_v5_0_init_golden_registers(struct
>amdgpu_device *adev)
>   (const
>u32)ARRAY_SIZE(golden_settings_sdma_nv14));
>   break;
>   case CHIP_NAVI12:
>-  soc15_program_register_sequence(adev,
>-  golden_settings_sdma_5,
>-  (const
>u32)ARRAY_SIZE(golden_settings_sdma_5));
>+  if (amdgpu_sriov_vf(adev))
>+  soc15_program_register_sequence(adev,
>+
>   golden_settings_sdma_5_sriov,
>+  (const
>u32)ARRAY_SIZE(golden_settings_sdma_5_sriov));
>+  else
>+ 
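For readers unfamiliar with the golden-settings tables used above: applying
such a table is just a masked read-modify-write per register entry. A
simplified stand-alone sketch follows; the table contents, field names and
the exact mask convention are illustrative only, not the real
soc15_program_register_sequence() helper.

/*
 * Toy model of applying a "golden settings" table: for each entry,
 * keep the unmasked bits of the current value and OR in the new bits.
 * An array stands in for the MMIO register file.
 */
#include <stdint.h>
#include <stdio.h>

struct reg_golden {
	uint32_t offset;  /* register index          */
	uint32_t mask;    /* bits owned by the entry */
	uint32_t value;   /* bits to program         */
};

static uint32_t regs[16];   /* fake register file */

static void program_register_sequence(const struct reg_golden *tbl, int n)
{
	for (int i = 0; i < n; i++) {
		uint32_t v = regs[tbl[i].offset];

		v = (v & ~tbl[i].mask) | (tbl[i].value & tbl[i].mask);
		regs[tbl[i].offset] = v;
	}
}

int main(void)
{
	/* Made-up entries in the spirit of SOC15_REG_GOLDEN_VALUE() rows. */
	static const struct reg_golden table[] = {
		{ 3, 0xfff0fff7u, 0x00403000u },
		{ 7, 0xfff0fff7u, 0x00403000u },
	};

	regs[3] = 0xdeadbeefu;
	program_register_sequence(table, 2);
	printf("reg[3] = 0x%08x\n", (unsigned)regs[3]);
	return 0;
}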

RE: [PATCH] drm/amdgpu: hold the reference of finished fence

2020-03-23 Thread Tao, Yintian
Hi  Christian

The faulting fence is the finished fence, not the hw ring fence. Please see
the call trace below.
[  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
[  732.939364]  dma_fence_signal+0x29/0x50  ===>drm sched finished fence
[  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
[  732.941910]  dma_fence_signal_locked+0x85/0x100
[  732.942692]  dma_fence_signal+0x29/0x50  > hw fence
[  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
[  732.944393]  sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
[  732.945398]  amdgpu_irq_dispatch+0xaf/0x1d0 [amdgpu]
[  732.946317]  amdgpu_ih_process+0x8c/0x110 [amdgpu]
[  732.947206]  amdgpu_irq_handler+0x24/0xa0 [amdgpu]

Best Regards
Yintian Tao
-Original Message-
From: Koenig, Christian  
Sent: March 23, 2020 20:06
To: Tao, Yintian ; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: hold the reference of finished fence

I've just double-checked and your analysis actually can't be correct.

When we call dma_fence_signal() in amdgpu_fence_process() we still have a 
reference to the fence.

See the code here:
>     r = dma_fence_signal(fence);
>     if (!r)
>     DMA_FENCE_TRACE(fence, "signaled from irq 
> context\n");
>     else
>     BUG();
>
>     dma_fence_put(fence);

So I'm not sure how you ran into the crash in the first place, this is most 
likely something else.

Regards,
Christian.

On 23.03.20 at 12:49, Yintian Tao wrote:
> There is one corner case in dma_fence_signal_locked which will
> raise a NULL pointer problem, as shown below.
> ->dma_fence_signal
>  ->dma_fence_signal_locked
>   ->test_and_set_bit
> here dma_fence_release is triggered because the fence refcount drops to zero.
>
> ->dma_fence_put
>  ->dma_fence_release
>   ->drm_sched_fence_release_scheduled
>   ->call_rcu
> here the union field "cb_list" of the finished fence is set to NULL because
> struct rcu_head contains two pointers, the same layout as struct list_head
> cb_list
>
> Therefore, hold a reference to the finished fence in amdgpu_job_run
> to prevent the NULL pointer dereference during dma_fence_signal
>
> [  732.912867] BUG: kernel NULL pointer dereference, address: 
> 0008 [  732.914815] #PF: supervisor write access in kernel 
> mode [  732.915731] #PF: error_code(0x0002) - not-present page [  
> 732.916621] PGD 0 P4D 0 [  732.917072] Oops: 0002 [#1] SMP PTI
> [  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G   OE 
> 5.4.0-rc7 #1
> [  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014 [  
> 732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
> [  732.938569] Call Trace:
> [  732.939003]  
> [  732.939364]  dma_fence_signal+0x29/0x50 [  732.940036]  
> drm_sched_fence_finished+0x12/0x20 [gpu_sched] [  732.940996]  
> drm_sched_process_job+0x34/0xa0 [gpu_sched] [  732.941910]  
> dma_fence_signal_locked+0x85/0x100
> [  732.942692]  dma_fence_signal+0x29/0x50 [  732.943457]  
> amdgpu_fence_process+0x99/0x120 [amdgpu] [  732.944393]  
> sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |  3 +++
>   3 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 7531527067df..03573eff660a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -52,7 +52,7 @@
>   
>   struct amdgpu_fence {
>   struct dma_fence base;
> -
> + struct dma_fence *finished;
>   /* RB, DMA, etc. */
>   struct amdgpu_ring  *ring;
>   };
> @@ -149,6 +149,7 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, 
> struct dma_fence **f,
>   
>   seq = ++ring->fence_drv.sync_seq;
>   fence->ring = ring;
> + fence->finished = NULL;
>   dma_fence_init(>base, _fence_ops,
>  >fence_drv.lock,
>  adev->fence_context + ring->idx, @@ -182,6 +183,21 @@ 
> int 
> amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f,
>   return 0;
>   }
>   
> +void amdgpu_fence_get_finished(struct dma_fence *base,
> +struct dma_fence *finished) {
> + struct amdgpu_fence *afence = to_amdgpu_fence(base);
> +
> + afence-
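A toy illustration of the refcount hazard discussed in this thread, and of
why taking an extra reference across the signal call avoids it. This is
ordinary user-space C with invented names, not the dma_fence/drm_sched code.

/*
 * If the last reference to an object can be dropped from inside the
 * "signal" path, any code that keeps using the object afterwards touches
 * freed memory.  Holding an extra reference across the call avoids that.
 */
#include <stdio.h>
#include <stdlib.h>

struct obj {
	int refcount;
	int signaled;
};

static struct obj *obj_get(struct obj *o) { o->refcount++; return o; }

static void obj_put(struct obj *o)
{
	if (--o->refcount == 0) {
		printf("releasing object\n");
		free(o);
	}
}

/* A callback that, like the scheduler callback, drops its own reference. */
static void callback_drops_ref(struct obj *o)
{
	o->signaled = 1;
	obj_put(o);
}

static void signal_obj(struct obj *o)
{
	/* Hold a reference so the callback cannot free o under our feet. */
	obj_get(o);
	callback_drops_ref(o);
	printf("still safe to read: signaled=%d refcount=%d\n",
	       o->signaled, o->refcount);
	obj_put(o);
}

int main(void)
{
	struct obj *o = malloc(sizeof(*o));

	o->refcount = 1;   /* the callback owns this reference */
	o->signaled = 0;
	signal_obj(o);
	return 0;
}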

RE: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-16 Thread Tao, Yintian
Hi  Xinhui



Sure, can you submit one patch for it? I want to test it on my local server. 
Thanks in advance.


Best Regards
Yintian Tao

From: Pan, Xinhui 
Sent: March 16, 2020 17:51
To: Koenig, Christian ; Tao, Yintian 

Cc: Deucher, Alexander ; Kuehling, Felix 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit


[AMD Official Use Only - Internal Distribution Only]

I still hit a page fault with option 1 while running the oclperf test.
Looks like we need to sync the fence after commit.

From: Tao, Yintian mailto:yintian@amd.com>>
Sent: Monday, March 16, 2020 4:15:01 PM
To: Pan, Xinhui mailto:xinhui@amd.com>>; Koenig, 
Christian mailto:christian.koe...@amd.com>>
Cc: Deucher, Alexander 
mailto:alexander.deuc...@amd.com>>; Kuehling, Felix 
mailto:felix.kuehl...@amd.com>>; Pan, Xinhui 
mailto:xinhui@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> 
mailto:amd-gfx@lists.freedesktop.org>>
Subject: RE: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

Hi Xinhui


I encountered the same problem (page fault) when testing the vk_example
benchmark. I used your first option, which fixes the problem. Can you help
submit a patch?


-   if (flags & AMDGPU_PTE_VALID) {
-   struct amdgpu_bo *root = vm->root.base.bo;
-   if (!dma_fence_is_signaled(vm->last_direct))
-   amdgpu_bo_fence(root, vm->last_direct, true);
+   if (!dma_fence_is_signaled(vm->last_direct))
+   amdgpu_bo_fence(root, vm->last_direct, true);

-   if (!dma_fence_is_signaled(vm->last_delayed))
-   amdgpu_bo_fence(root, vm->last_delayed, true);
-   }
+   if (!dma_fence_is_signaled(vm->last_delayed))
+   amdgpu_bo_fence(root, vm->last_delayed, true);


Best Regards
Yintian Tao

-Original Message-
From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 On Behalf Of Pan, Xinhui
Sent: March 14, 2020 21:07
To: Koenig, Christian 
mailto:christian.koe...@amd.com>>
Cc: Deucher, Alexander 
mailto:alexander.deuc...@amd.com>>; Kuehling, Felix 
mailto:felix.kuehl...@amd.com>>; Pan, Xinhui 
mailto:xinhui@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

hi, All
I think I found the root cause. here is what happened.

user: alloc/mapping memory
  kernel: validate memory and update the bo mapping, and update the 
page table
-> amdgpu_vm_bo_update_mapping
-> amdgpu_vm_update_ptes
-> amdgpu_vm_alloc_pts
-> amdgpu_vm_clear_bo // it 
will submit a job and we have a fence. BUT it is NOT added in resv.
user: free/unmapping memory
kernel: unmapping memory and update the page table
-> amdgpu_vm_bo_update_mapping
sync last_delay fence if flag & AMDGPU_PTE_VALID // of
course we did not sync it here, as this is unmapping.
-> amdgpu_vm_update_ptes
-> amdgpu_vm_free_pts // unref page 
table bo.

So from the sequence above, we know there is a race between bo releasing and bo
clearing.
The bo might have been released before the job runs.

we can fix it in several ways,
1) sync last_delay in both mapping and unmapping case.
 Chris, you just sync last_delay in mapping case, should it be ok to sync it 
also in unmapping case?

2) always add fence to resv after commit.
 this is done by patchset v4. And only need patch 1. no need to move unref bo 
after commit.

3) move unref bo after commit, and add the last delay fence to resv.
This is done by patchset V1.


any ideas?

thanks
xinhui

> On March 14, 2020 02:05, Koenig, Christian 
> mailto:christian.koe...@amd.com>> wrote:
>
> The page table is not updated and then freed. A higher level PDE is updated 
> and because of this the lower level page table is freed.
>
> Without this it could be that the memory backing the freed page table is 
> reused while the PDE is still pointing to it.
>
> Rather unlikely that this causes problems, but better safe than sorry.
>
> Regards,
> Christian.
>
> On 13.03.20 at 18:36, Felix Kuehling wrote:
>> This seems weird. This means that we update a page table, and then free it 
>> in the same amdgpu_vm_update_ptes call? That means the update is redundant. 
>> Can we eliminate the redundant PTE update if the page table is about to be 
>> freed anyway?
>>
>> Regards,
>>   Felix
>>
>> On 2020-03-13 12:09, xinhui pan wrote:
>>> Free page table bo before job 
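The fix options discussed above all boil down to one ordering rule: nothing a
pending job may still touch can be freed before the job has been committed
and fenced. A small stand-alone sketch of that "park now, free after commit"
pattern follows; the names are made up and plain free() stands in for
amdgpu_bo_unref(), so this is a model of the idea, not the amdgpu_vm code.

/*
 * Instead of unreferencing page-table BOs while walking the tree, park
 * them on a "zombie" list and release them only after the update has
 * been committed.
 */
#include <stdio.h>
#include <stdlib.h>

struct pt_bo {
	int id;
	struct pt_bo *next;
};

static struct pt_bo *zombies;

/* Called while building the update: defer the free instead of doing it now. */
static void defer_free(struct pt_bo *bo)
{
	bo->next = zombies;
	zombies = bo;
}

static void commit_update(void)
{
	printf("job submitted, fence added to the root reservation\n");
}

/* Only after the commit is it safe to really release the parked BOs. */
static void free_zombies(void)
{
	while (zombies) {
		struct pt_bo *bo = zombies;

		zombies = bo->next;
		printf("freeing pt bo %d after commit\n", bo->id);
		free(bo);
	}
}

int main(void)
{
	struct pt_bo *bo = malloc(sizeof(*bo));

	bo->id = 42;
	defer_free(bo);     /* during the page-table walk in the real driver */
	commit_update();    /* job submit / fence */
	free_zombies();     /* now the memory can no longer be touched */
	return 0;
}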

RE: [PATCH] drm/amdgpu: miss PRT case when bo update

2020-03-16 Thread Tao, Yintian
Hi Christian

Many thanks for your review

Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian  
Sent: March 16, 2020 17:38
To: Tao, Yintian ; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: miss PRT case when bo update

On 16.03.20 at 08:52, Yintian Tao wrote:
> Originally, only the PTE valid flag was taken into consideration.
> The PRT case was missed during bo update, which raises a problem.
> We need to add a condition for the PRT case.

Good catch, just one style nit pick below.

>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 73398831196f..7a3e4514a00c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1446,7 +1446,7 @@ static int amdgpu_vm_update_ptes(struct 
> amdgpu_vm_update_params *params,
>   uint64_t incr, entry_end, pe_start;
>   struct amdgpu_bo *pt;
>   
> - if (flags & AMDGPU_PTE_VALID) {
> + if (flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT)) {
>   /* make sure that the page tables covering the
>* address range are actually allocated
>*/
> @@ -1605,7 +1605,6 @@ static int amdgpu_vm_bo_update_mapping(struct 
> amdgpu_device *adev,
>   
>   if (flags & AMDGPU_PTE_VALID) {
>   struct amdgpu_bo *root = vm->root.base.bo;
> -
>   if (!dma_fence_is_signaled(vm->last_direct))

Please keep this empty line, it is required by the coding style guides.

With that fixed the patch is Reviewed-by: Christian König 
.

Regards,
Christian.

>   amdgpu_bo_fence(root, vm->last_direct, true);
>   
> @@ -1718,7 +1717,7 @@ static int amdgpu_vm_bo_split_mapping(struct 
> amdgpu_device *adev,
>   AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>   }
>   
> - } else if (flags & AMDGPU_PTE_VALID) {
> + } else if (flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT)) {
>   addr += bo_adev->vm_manager.vram_base_offset;
>   addr += pfn << PAGE_SHIFT;
>   }



RE: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-16 Thread Tao, Yintian
Hi Xinhui


I encountered the same problem (page fault) when testing the vk_example
benchmark. I used your first option, which fixes the problem. Can you help
submit a patch?


-   if (flags & AMDGPU_PTE_VALID) {
-   struct amdgpu_bo *root = vm->root.base.bo;
-   if (!dma_fence_is_signaled(vm->last_direct))
-   amdgpu_bo_fence(root, vm->last_direct, true);
+   if (!dma_fence_is_signaled(vm->last_direct))
+   amdgpu_bo_fence(root, vm->last_direct, true);
 
-   if (!dma_fence_is_signaled(vm->last_delayed))
-   amdgpu_bo_fence(root, vm->last_delayed, true);
-   }
+   if (!dma_fence_is_signaled(vm->last_delayed))
+   amdgpu_bo_fence(root, vm->last_delayed, true);


Best Regards
Yintian Tao

-Original Message-
From: amd-gfx  On Behalf Of Pan, Xinhui
Sent: March 14, 2020 21:07
To: Koenig, Christian 
Cc: Deucher, Alexander ; Kuehling, Felix 
; Pan, Xinhui ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

hi, All
I think I found the root cause. here is what happened.

user: alloc/mapping memory
kernel: validate memory and update the bo mapping, and update 
the page table
-> amdgpu_vm_bo_update_mapping
-> amdgpu_vm_update_ptes
-> amdgpu_vm_alloc_pts
-> amdgpu_vm_clear_bo // it 
will submit a job and we have a fence. BUT it is NOT added in resv.
user: free/unmapping memory
kernel: unmapping memory and update the page table
-> amdgpu_vm_bo_update_mapping
sync last_delay fence if flag & AMDGPU_PTE_VALID // of
course we did not sync it here, as this is unmapping.
-> amdgpu_vm_update_ptes
-> amdgpu_vm_free_pts // unref page 
table bo.

So from the sequence above, we know there is a race between bo releasing and bo
clearing.
The bo might have been released before the job runs.

we can fix it in several ways,
1) sync last_delay in both mapping and unmapping case.
 Chris, you just sync last_delay in mapping case, should it be ok to sync it 
also in unmapping case?

2) always add fence to resv after commit. 
 this is done by patchset v4. And only need patch 1. no need to move unref bo 
after commit.

3) move unref bo after commit, and add the last delay fence to resv. 
This is done by patchset V1. 


any ideas?

thanks
xinhui

> On March 14, 2020 02:05, Koenig, Christian  wrote:
> 
> The page table is not updated and then freed. A higher level PDE is updated 
> and because of this the lower level page table is freed.
> 
> Without this it could be that the memory backing the freed page table is 
> reused while the PDE is still pointing to it.
> 
> Rather unlikely that this causes problems, but better safe than sorry.
> 
> Regards,
> Christian.
> 
> On 13.03.20 at 18:36, Felix Kuehling wrote:
>> This seems weird. This means that we update a page table, and then free it 
>> in the same amdgpu_vm_update_ptes call? That means the update is redundant. 
>> Can we eliminate the redundant PTE update if the page table is about to be 
>> freed anyway?
>> 
>> Regards,
>>   Felix
>> 
>> On 2020-03-13 12:09, xinhui pan wrote:
>>> Freeing the page table bo before job submit is insane.
>>> We might touch invalid memory while the job is running.
>>> 
>>> we now have individualized bo resv during bo releasing.
>>> So any fences added to root PT bo is actually untested when a normal 
>>> PT bo is releasing.
>>> 
>>> We might hit gmc page fault or memory just got overwrited.
>>> 
>>> Cc: Christian König 
>>> Cc: Alex Deucher 
>>> Cc: Felix Kuehling 
>>> Signed-off-by: xinhui pan 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 24 +---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  3 +++
>>>   2 files changed, 24 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 73398831196f..346e2f753474 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -937,6 +937,21 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device 
>>> *adev,
>>>   return r;
>>>   }
>>>   +static void amdgpu_vm_free_zombie_bo(struct amdgpu_device *adev,
>>> +struct amdgpu_vm *vm)
>>> +{
>>> +struct amdgpu_vm_pt *entry;
>>> +
>>> +while (!list_empty(>zombies)) {
>>> +entry = list_first_entry(>zombies, struct amdgpu_vm_pt,
>>> +base.vm_status);
>>> +list_del(>base.vm_status);
>>> +
>>> +amdgpu_bo_unref(>base.bo->shadow);
>>> +amdgpu_bo_unref(>base.bo);
>>> +}
>>> +}
>>> +
>>>   /**
>>>* amdgpu_vm_free_table - fre one PD/PT
>>>*
>>> @@ -945,10 +960,9 @@ static int amdgpu_vm_alloc_pts(struct 

RE: [refactor RLCG wreg path 1/2] drm/amdgpu: refactor RLCG access path part 1

2020-03-11 Thread Tao, Yintian
Reviewed-by: Yintian Tao

-Original Message-
From: amd-gfx  On Behalf Of Monk Liu
Sent: March 11, 2020 13:58
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk 
Subject: [refactor RLCG wreg path 1/2] drm/amdgpu: refactor RLCG access path 
part 1

what changed:
1) provide a new implementation interface for the RLCG access path
2) put SQ_CMD/SQ_IND_INDEX/SQ_IND_DATA into the GFX9 RLCG path to align with
the SRIOV RLCG logic

background:
we want to clean up the code path for WREG32_RLC, to make it covered and
handled only by the amdgpu_mm_wreg() routine. This way we can let RLCG serve
register access even through UMR (via the debugfs interface). The current
implementation cannot achieve that goal because it has to be hardcoded
everywhere, while UMR only passes "offset" as a variable to the driver.

tested-by: Monk Liu 
tested-by: Zhou pengju 
Signed-off-by: Zhou pengju 
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h |   2 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  |  80 ++-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 177 +++-
 drivers/gpu/drm/amd/amdgpu/soc15.h  |   7 ++
 4 files changed, 264 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
index 52509c2..60bb3e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
@@ -127,6 +127,8 @@ struct amdgpu_rlc_funcs {
void (*reset)(struct amdgpu_device *adev);
void (*start)(struct amdgpu_device *adev);
void (*update_spm_vmid)(struct amdgpu_device *adev, unsigned vmid);
+   void (*rlcg_wreg)(struct amdgpu_device *adev, u32 offset, u32 v);
+   bool (*is_rlcg_access_range)(struct amdgpu_device *adev, uint32_t 
+reg);
 };
 
 struct amdgpu_rlc {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 82ef08d..3222cd3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -224,6 +224,56 @@ static const struct soc15_reg_golden 
golden_settings_gc_10_1_2[] =
SOC15_REG_GOLDEN_VALUE(GC, 0, mmUTCL1_CTRL, 0x, 0x0080)  };
 
+static const struct soc15_reg_rlcg rlcg_access_gc_10_0[] = {
+   {SOC15_REG_ENTRY(GC, 0, mmRLC_CSIB_ADDR_HI)},
+   {SOC15_REG_ENTRY(GC, 0, mmRLC_CSIB_ADDR_LO)},
+   {SOC15_REG_ENTRY(GC, 0, mmRLC_CSIB_LENGTH)},
+   {SOC15_REG_ENTRY(GC, 0, mmCP_ME_CNTL)}, };
+
+static void gfx_v10_rlcg_wreg(struct amdgpu_device *adev, u32 offset, 
+u32 v) {
+   static void *scratch_reg0;
+   static void *scratch_reg1;
+   static void *scratch_reg2;
+   static void *scratch_reg3;
+   static void *spare_int;
+   static uint32_t grbm_cntl;
+   static uint32_t grbm_idx;
+   uint32_t i = 0;
+   uint32_t retries = 5;
+
+   scratch_reg0 = adev->rmmio + 
(adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG0_BASE_IDX] + mmSCRATCH_REG0)*4;
+   scratch_reg1 = adev->rmmio + 
(adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG1)*4;
+   scratch_reg2 = adev->rmmio + 
(adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG2)*4;
+   scratch_reg3 = adev->rmmio + 
(adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG3)*4;
+   spare_int = adev->rmmio + 
+(adev->reg_offset[GC_HWIP][0][mmRLC_SPARE_INT_BASE_IDX] + 
+mmRLC_SPARE_INT)*4;
+
+   grbm_cntl = adev->reg_offset[GC_HWIP][0][mmGRBM_GFX_CNTL_BASE_IDX] + 
mmGRBM_GFX_CNTL;
+   grbm_idx = adev->reg_offset[GC_HWIP][0][mmGRBM_GFX_INDEX_BASE_IDX] + 
+mmGRBM_GFX_INDEX;
+
+   if (amdgpu_sriov_runtime(adev)) {
+   pr_err("shoudn't call rlcg write register during runtime\n");
+   return;
+   }
+
+   writel(v, scratch_reg0);
+   writel(offset | 0x8000, scratch_reg1);
+   writel(1, spare_int);
+   for (i = 0; i < retries; i++) {
+   u32 tmp;
+
+   tmp = readl(scratch_reg1);
+   if (!(tmp & 0x8000))
+   break;
+
+   udelay(10);
+   }
+
+   if (i >= retries)
+   pr_err("timeout: rlcg program reg:0x%05x failed !\n", offset); }
+
 static const struct soc15_reg_golden golden_settings_gc_10_1_nv14[] =  {
/* Pending on emulation bring up */
@@ -4247,6 +4297,32 @@ static void gfx_v10_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)
WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);  }
 
+static bool gfx_v10_0_check_rlcg_range(struct amdgpu_device *adev,
+   uint32_t offset,
+   struct soc15_reg_rlcg *entries, int 
arr_size) {
+   int i;
+   uint32_t reg;
+
+   for (i = 0; i < arr_size; i++) {
+   const struct soc15_reg_rlcg *entry;
+
+   entry = [i];
+   reg = 
adev->reg_offset[entry->hwip][entry->instance][entry->segment] + entry->reg;
+   if 
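A compact user-space model of the scratch-register handshake that the
gfx_v10_rlcg_wreg() function in the patch above performs. Plain variables
stand in for the SCRATCH_REG0/1 and RLC_SPARE_INT MMIO registers, and the
immediate ack stands in for the RLC firmware; the real implementation is the
quoted driver code, not this sketch.

/*
 * Write value + offset (with a request bit), ring the spare interrupt,
 * then poll until the request bit is cleared or retries run out.
 */
#include <stdint.h>
#include <stdio.h>

static volatile uint32_t scratch_val;   /* models SCRATCH_REG0  */
static volatile uint32_t scratch_cmd;   /* models SCRATCH_REG1  */
static volatile uint32_t spare_int;     /* models RLC_SPARE_INT */

/* In this model the "firmware" acks immediately; on hardware the RLC does. */
static void fake_rlc_ack(void)
{
	scratch_cmd &= ~0x80000000u;
}

static int rlcg_wreg_model(uint32_t offset, uint32_t value)
{
	int retries = 5;

	scratch_val = value;
	scratch_cmd = offset | 0x80000000u;  /* request bit set   */
	spare_int = 1;                       /* notify the RLC    */

	fake_rlc_ack();                      /* firmware stand-in */

	while (retries--) {
		if (!(scratch_cmd & 0x80000000u))
			return 0;            /* acked: register was written */
		/* real driver: udelay(10) between polls */
	}
	return -1;                           /* timeout */
}

int main(void)
{
	if (rlcg_wreg_model(0x1234, 0xdeadbeef))
		printf("timeout: rlcg write did not complete\n");
	else
		printf("rlcg write completed, value 0x%08x\n",
		       (unsigned)scratch_val);
	return 0;
}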

RE: [refactor RLCG wreg path 2/2] drm/amdgpu: refactor RLCG access path part 2

2020-03-11 Thread Tao, Yintian
Reviewed-by: Yintian Tao

-Original Message-
From: amd-gfx  On Behalf Of Monk Liu
Sent: March 11, 2020 13:58
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk 
Subject: [refactor RLCG wreg path 2/2] drm/amdgpu: refactor RLCG access path 
part 2

switch to new RLCG access path, and drop the legacy WREG32_RLC macros

tested-by: Monk Liu 
tested-by: Zhou pengju 
Signed-off-by: Zhou pengju 
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  30 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|   5 ++
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c|   8 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 104 +++---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c |   2 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c  |  28 +++---
 drivers/gpu/drm/amd/amdgpu/soc15.c|  11 +--
 drivers/gpu/drm/amd/amdgpu/soc15_common.h |  57 
 8 files changed, 93 insertions(+), 152 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index df841c2..a21f005 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -105,8 +105,8 @@ void kgd_gfx_v9_program_sh_mem_settings(struct kgd_dev 
*kgd, uint32_t vmid,
 
lock_srbm(kgd, 0, 0, 0, vmid);
 
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmSH_MEM_CONFIG), sh_mem_config);
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmSH_MEM_BASES), sh_mem_bases);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSH_MEM_CONFIG), sh_mem_config);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSH_MEM_BASES), sh_mem_bases);
/* APE1 no longer exists on GFX9 */
 
unlock_srbm(kgd);
@@ -242,13 +242,13 @@ int kgd_gfx_v9_hqd_load(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
 
for (reg = hqd_base;
 reg <= SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_HI); reg++)
-   WREG32_RLC(reg, mqd_hqd[reg - hqd_base]);
+   WREG32(reg, mqd_hqd[reg - hqd_base]);
 
 
/* Activate doorbell logic before triggering WPTR poll. */
data = REG_SET_FIELD(m->cp_hqd_pq_doorbell_control,
 CP_HQD_PQ_DOORBELL_CONTROL, DOORBELL_EN, 1);
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_DOORBELL_CONTROL), data);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_DOORBELL_CONTROL), data);
 
if (wptr) {
/* Don't read wptr with get_user because the user @@ -277,25 
+277,25 @@ int kgd_gfx_v9_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t 
pipe_id,
guessed_wptr += m->cp_hqd_pq_wptr_lo & ~(queue_size - 1);
guessed_wptr += (uint64_t)m->cp_hqd_pq_wptr_hi << 32;
 
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_LO),
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_LO),
   lower_32_bits(guessed_wptr));
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_HI),
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_HI),
   upper_32_bits(guessed_wptr));
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_POLL_ADDR),
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_POLL_ADDR),
   lower_32_bits((uintptr_t)wptr));
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, 
mmCP_HQD_PQ_WPTR_POLL_ADDR_HI),
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_POLL_ADDR_HI),
   upper_32_bits((uintptr_t)wptr));
WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_PQ_WPTR_POLL_CNTL1),
-  (uint32_t)get_queue_mask(adev, pipe_id, queue_id));
+  get_queue_mask(adev, pipe_id, queue_id));
}
 
/* Start the EOP fetcher */
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_EOP_RPTR),
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_EOP_RPTR),
   REG_SET_FIELD(m->cp_hqd_eop_rptr,
 CP_HQD_EOP_RPTR, INIT_FETCHER, 1));
 
data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_ACTIVE), data);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_ACTIVE), data);
 
release_queue(kgd);
 
@@ -547,7 +547,7 @@ int kgd_gfx_v9_hqd_destroy(struct kgd_dev *kgd, void *mqd,
acquire_queue(kgd, pipe_id, queue_id);
 
if (m->cp_hqd_vmid == 0)
-   WREG32_FIELD15_RLC(GC, 0, RLC_CP_SCHEDULERS, scheduler1, 0);
+   WREG32_FIELD15(GC, 0, RLC_CP_SCHEDULERS, scheduler1, 0);
 
switch (reset_type) {
case KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN:
@@ -561,7 +561,7 @@ int kgd_gfx_v9_hqd_destroy(struct kgd_dev *kgd, void *mqd,
break;
}
 
-   WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_DEQUEUE_REQUEST), type);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_DEQUEUE_REQUEST), type);
 

RE: [PATCH] drm/amdgpu: release drm_device after amdgpu_driver_unload_kms

2020-02-27 Thread Tao, Yintian
Many thanks. I will put it after pci_* functions.

-Original Message-
From: Christian König  
Sent: February 27, 2020 22:21
To: Tao, Yintian ; Koenig, Christian 
; Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: release drm_device after 
amdgpu_driver_unload_kms

On 27.02.20 at 12:58, Yintian Tao wrote:
> If we release the drm_device before amdgpu_driver_unload_kms, it will
> raise the error below. Therefore, drm_dev_put needs to be placed after
> amdgpu_driver_unload_kms.
> [   43.055736] Memory manager not clean during takedown.
> [   43.055777] WARNING: CPU: 1 PID: 2807 at 
> /build/linux-hwe-9KJ07q/linux-hwe-4.18.0/drivers/gpu/drm/drm_mm.c:913 
> drm_mm_takedown+0x24/0x30 [drm]
> [   43.055778] Modules linked in: amdgpu(OE-) amd_sched(OE) amdttm(OE) 
> amdkcl(OE) amd_iommu_v2 drm_kms_helper drm i2c_algo_bit fb_sys_fops 
> syscopyarea sysfillrect sysimgblt snd_hda_codec_generic nfit kvm_intel kvm 
> irqbypass crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec 
> snd_hda_core snd_hwdep snd_pcm ghash_clmulni_intel snd_seq_midi 
> snd_seq_midi_event pcbc snd_rawmidi snd_seq snd_seq_device aesni_intel 
> snd_timer joydev aes_x86_64 crypto_simd cryptd glue_helper snd soundcore 
> input_leds mac_hid serio_raw qemu_fw_cfg binfmt_misc sch_fq_codel nfsd 
> auth_rpcgss nfs_acl lockd grace sunrpc parport_pc ppdev lp parport ip_tables 
> x_tables autofs4 hid_generic floppy usbhid psmouse hid i2c_piix4 e1000 
> pata_acpi
> [   43.055819] CPU: 1 PID: 2807 Comm: modprobe Tainted: G   OE 
> 4.18.0-15-generic #16~18.04.1-Ubuntu
> [   43.055820] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> 1.12.0-1 04/01/2014
> [   43.055830] RIP: 0010:drm_mm_takedown+0x24/0x30 [drm]
> [   43.055831] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48 83 c7 38 
> 48 39 c7 75 02 f3 c3 55 48 c7 c7 38 33 80 c0 48 89 e5 e8 1c 41 ec d0 <0f> 0b 
> 5d c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
> [   43.055857] RSP: 0018:ae33c1393d28 EFLAGS: 00010286
> [   43.055859] RAX:  RBX: 9651b4a29800 RCX: 
> 0006
> [   43.055860] RDX: 0007 RSI: 0096 RDI: 
> 9651bfc964b0
> [   43.055861] RBP: ae33c1393d28 R08: 02a6 R09: 
> 0004
> [   43.055861] R10: ae33c1393d20 R11: 0001 R12: 
> 9651ba6cb000
> [   43.055863] R13: 9651b7f4 R14: c0de3a10 R15: 
> 9651ba5c6460
> [   43.055864] FS:  7f1d3c08d540() GS:9651bfc8() 
> knlGS:
> [   43.055865] CS:  0010 DS:  ES:  CR0: 80050033
> [   43.055866] CR2: 5630a5831640 CR3: 00012e274004 CR4: 
> 003606e0
> [   43.055870] DR0:  DR1:  DR2: 
> 
> [   43.055871] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [   43.055871] Call Trace:
> [   43.055885]  drm_vma_offset_manager_destroy+0x1b/0x30 [drm]
> [   43.055894]  drm_gem_destroy+0x19/0x40 [drm]
> [   43.055903]  drm_dev_fini+0x7f/0x90 [drm]
> [   43.055911]  drm_dev_release+0x2b/0x40 [drm]
> [   43.055919]  drm_dev_unplug+0x64/0x80 [drm]
> [   43.055994]  amdgpu_pci_remove+0x39/0x70 [amdgpu]
> [   43.055998]  pci_device_remove+0x3e/0xc0
> [   43.056001]  device_release_driver_internal+0x18a/0x260
> [   43.056003]  driver_detach+0x3f/0x80
> [   43.056004]  bus_remove_driver+0x59/0xd0
> [   43.056006]  driver_unregister+0x2c/0x40
> [   43.056008]  pci_unregister_driver+0x22/0xa0
> [   43.056087]  amdgpu_exit+0x15/0x57c [amdgpu]
> [   43.056090]  __x64_sys_delete_module+0x146/0x280
> [   43.056094]  do_syscall_64+0x5a/0x120
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 02d80b9dbfe1..01a1082b5cab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1138,8 +1138,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>   #endif
>   DRM_ERROR("Hotplug removal is not supported\n");
>   drm_dev_unplug(dev);
> - drm_dev_put(dev);
>   amdgpu_driver_unload_kms(dev);
> + drm_dev_put(dev);
>   pci_disable_device(pdev);
>   pci_set_drvdata(pdev, NULL);

Maybe even put this after the pci_* functions?

At least pci_set_drvdata() sounds like it needs to come before we release the 
structure.

Christian.

>   }



RE: [PATCH] drm/amdgpu: miss to remove pp_sclk file

2020-02-27 Thread Tao, Yintian
Hi  Christian

Thanks a lot for your review


Hi  Alex


Can you help to review it? Thanks in advance.


Best Regards
Yintian Tao
-Original Message-
From: Koenig, Christian  
Sent: February 27, 2020 22:18
To: Tao, Yintian ; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: miss to remove pp_sclk file

On 27.02.20 at 15:11, Yintian Tao wrote:
> Miss to remove pp_sclk file
>
> Signed-off-by: Yintian Tao 

Looks reasonable to me, but Alex can probably better judge.

Acked-by: Christian König 

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index 9deff8cc9723..a43fc1c8ffd0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -3471,6 +3471,7 @@ void amdgpu_pm_sysfs_fini(struct amdgpu_device *adev)
>   device_remove_file(adev->dev, _attr_pp_cur_state);
>   device_remove_file(adev->dev, _attr_pp_force_state);
>   device_remove_file(adev->dev, _attr_pp_table);
> + device_remove_file(adev->dev, _attr_pp_sclk);
>   
>   device_remove_file(adev->dev, _attr_pp_dpm_sclk);
>   device_remove_file(adev->dev, _attr_pp_dpm_mclk);



RE: [PATCH] drm/amdgpu: no need to clean debugfs at amdgpu

2020-02-27 Thread Tao, Yintian
Hi  Christian

Thanks for your suggestion. I will remove all cleanup/fini code as well.

Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian  
Sent: February 27, 2020 19:54
To: Tao, Yintian ; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: no need to clean debugfs at amdgpu

If we do this we should probably make nails with heads and remove the whole 
cleanup/fini code as well.

Christian.

On 27.02.20 at 12:50, Yintian Tao wrote:
> drm_minor_unregister will invoke drm_debugfs_cleanup to clean up all the
> child nodes under the primary minor node.
> We don't need to invoke amdgpu_debugfs_fini and
> amdgpu_debugfs_regs_cleanup to clean them up again.
> Otherwise, it will raise the NULL pointer dereference like below.
> [   45.046029] BUG: unable to handle kernel NULL pointer dereference at 
> 00a8
> [   45.047256] PGD 0 P4D 0
> [   45.047713] Oops: 0002 [#1] SMP PTI
> [   45.048198] CPU: 0 PID: 2796 Comm: modprobe Tainted: GW  OE 
> 4.18.0-15-generic #16~18.04.1-Ubuntu
> [   45.049538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> 1.12.0-1 04/01/2014
> [   45.050651] RIP: 0010:down_write+0x1f/0x40
> [   45.051194] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 
> e5 53 48 89 fb e8 ce d9 ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8  48 
> 0f c1 10 85 d2 74 05 e8 53 1c ff ff 65 48 8b 04 25 00 5c 01
> [   45.053702] RSP: 0018:ad8f4133fd40 EFLAGS: 00010246
> [   45.054384] RAX: 00a8 RBX: 00a8 RCX: 
> a011327dd814
> [   45.055349] RDX: 0001 RSI: 0001 RDI: 
> 00a8
> [   45.056346] RBP: ad8f4133fd48 R08:  R09: 
> c0690a00
> [   45.057326] R10: ad8f4133fd58 R11: 0001 R12: 
> a0113cff0300
> [   45.058266] R13: a0113c0a R14: c0c02a10 R15: 
> a0113e5c7860
> [   45.059221] FS:  7f60d46f9540() GS:a0113fc0() 
> knlGS:
> [   45.060809] CS:  0010 DS:  ES:  CR0: 80050033
> [   45.061826] CR2: 00a8 CR3: 000136250004 CR4: 
> 003606f0
> [   45.062913] DR0:  DR1:  DR2: 
> 
> [   45.064404] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [   45.065897] Call Trace:
> [   45.066426]  debugfs_remove+0x36/0xa0
> [   45.067131]  amdgpu_debugfs_ring_fini+0x15/0x20 [amdgpu]
> [   45.068019]  amdgpu_debugfs_fini+0x2c/0x50 [amdgpu]
> [   45.068756]  amdgpu_pci_remove+0x49/0x70 [amdgpu]
> [   45.069439]  pci_device_remove+0x3e/0xc0
> [   45.070037]  device_release_driver_internal+0x18a/0x260
> [   45.070842]  driver_detach+0x3f/0x80
> [   45.071325]  bus_remove_driver+0x59/0xd0
> [   45.071850]  driver_unregister+0x2c/0x40
> [   45.072377]  pci_unregister_driver+0x22/0xa0
> [   45.073043]  amdgpu_exit+0x15/0x57c [amdgpu]
> [   45.073683]  __x64_sys_delete_module+0x146/0x280
> [   45.074369]  do_syscall_64+0x5a/0x120
> [   45.074916]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 -
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 1 -
>   2 files changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 8ef8a49b9255..351096ab4301 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3237,7 +3237,6 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
>   adev->rmmio = NULL;
>   amdgpu_device_doorbell_fini(adev);
>   
> - amdgpu_debugfs_regs_cleanup(adev);
>   device_remove_file(adev->dev, _attr_pcie_replay_count);
>   if (adev->ucode_sysfs_en)
>   amdgpu_ucode_sysfs_fini(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 7cf5f597b90a..02d80b9dbfe1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1139,7 +1139,6 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>   DRM_ERROR("Hotplug removal is not supported\n");
>   drm_dev_unplug(dev);
>   drm_dev_put(dev);
> - amdgpu_debugfs_fini(adev);
>   amdgpu_driver_unload_kms(dev);
>   pci_disable_device(pdev);
>   pci_set_drvdata(pdev, NULL);



RE: [PATCH] drm/amdgpu/sriov: Tonga sriov also need load firmware with smu

2019-12-16 Thread Tao, Yintian
Reviewed-by: Yintian Tao

-Original Message-
From: amd-gfx  On Behalf Of Emily Deng
Sent: December 16, 2019 17:17
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily 
Subject: [PATCH] drm/amdgpu/sriov: Tonga sriov also need load firmware with smu

Fix Tonga sriov load driver fail issue.

Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 3 ++-
 drivers/gpu/drm/amd/powerplay/amd_powerplay.c | 3 ---
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 26d1a4c..52d3f66 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1818,7 +1818,8 @@ static int amdgpu_device_fw_loading(struct amdgpu_device 
*adev)
}
}
 
-   r = amdgpu_pm_load_smu_firmware(adev, _version);
+   if (!amdgpu_sriov_vf(adev) || adev->asic_type == CHIP_TONGA)
+   r = amdgpu_pm_load_smu_firmware(adev, _version);
 
return r;
 }
diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c 
b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
index 5087d6b..7293763 100644
--- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
+++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
@@ -275,9 +275,6 @@ static int pp_dpm_load_fw(void *handle)  {
struct pp_hwmgr *hwmgr = handle;
 
-   if (!hwmgr->not_vf)
-   return 0;
-
if (!hwmgr || !hwmgr->smumgr_funcs || !hwmgr->smumgr_funcs->start_smu)
return -EINVAL;
 
--
2.7.4



RE: [PATCH] drm/amd/powerplay: enable pp one vf mode for vega10

2019-12-10 Thread Tao, Yintian
Ping...

-Original Message-
From: Yintian Tao  
Sent: December 10, 2019 17:36
To: Deucher, Alexander ; Feng, Kenneth 

Cc: amd-gfx@lists.freedesktop.org; Tao, Yintian 
Subject: [PATCH] drm/amd/powerplay: enable pp one vf mode for vega10

Originally, due to restrictions from PSP and SMU, the VF had
to send a message to the hypervisor driver to handle a powerplay
change, which is complicated and redundant. Currently, SMU
and PSP can support the VF handling powerplay
changes directly by itself. Therefore, the old handshake code
between VF and PF for powerplay will be removed, and the VF
will use the new registers below to handshake with the SMU.
mmMP1_SMN_C2PMSG_101: register to handle SMU message
mmMP1_SMN_C2PMSG_102: register to handle SMU parameter
mmMP1_SMN_C2PMSG_103: register to handle SMU response
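As a rough illustration of that message/parameter/response handshake, here is
a user-space sketch. Plain variables replace the C2PMSG registers, and the
message ids and "OK" value are invented; this is not the real smu9 message
code, only a model of the write-then-poll pattern.

/*
 * One register carries the message id, one the parameter, and one the
 * response that the driver polls for.
 */
#include <stdint.h>
#include <stdio.h>

static uint32_t reg_msg;    /* models mmMP1_SMN_C2PMSG_101 */
static uint32_t reg_param;  /* models mmMP1_SMN_C2PMSG_102 */
static uint32_t reg_resp;   /* models mmMP1_SMN_C2PMSG_103 */

#define RESP_OK 1u

/* Stand-in for the SMU firmware acknowledging the request. */
static void fake_smu(void)
{
	if (reg_msg)
		reg_resp = RESP_OK;
}

static int smu_send_msg_with_param(uint32_t msg, uint32_t param)
{
	int timeout = 1000;

	reg_resp = 0;          /* clear the old response first */
	reg_param = param;
	reg_msg = msg;         /* writing the message kicks off the request */

	fake_smu();            /* on real hardware the SMU does this */

	while (timeout--) {
		if (reg_resp == RESP_OK)
			return 0;
		/* real driver: short delay between polls */
	}
	return -1;             /* SMU never answered */
}

int main(void)
{
	if (smu_send_msg_with_param(0x5, 0x100))   /* hypothetical ids */
		printf("SMU did not respond\n");
	else
		printf("SMU message acknowledged\n");
	return 0;
}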

v2: remove module parameter pp_one_vf
v3: fix the parens
v4: forbid vf to change smu feature

Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|  16 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |   4 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c| 235 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c  |  51 
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h  |  14 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c |  78 --
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h |   4 -
 drivers/gpu/drm/amd/amdgpu/soc15.c|   8 +-
 drivers/gpu/drm/amd/powerplay/amd_powerplay.c |   4 +-
 .../drm/amd/powerplay/hwmgr/hardwaremanager.c |  15 +-
 drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c   |  16 ++
 drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.c  |  30 +--
 .../drm/amd/powerplay/hwmgr/vega10_hwmgr.c| 162 
 drivers/gpu/drm/amd/powerplay/inc/hwmgr.h |   1 +
 .../drm/amd/powerplay/smumgr/smu9_smumgr.c|  56 -
 .../drm/amd/powerplay/smumgr/vega10_smumgr.c  |  14 ++
 16 files changed, 406 insertions(+), 302 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b9ca7e728d3e..465156a12d88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1880,6 +1880,9 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
}
}
 
+   if (amdgpu_sriov_vf(adev))
+   amdgpu_virt_init_data_exchange(adev);
+
r = amdgpu_ib_pool_init(adev);
if (r) {
dev_err(adev->dev, "IB initialization failed (%d).\n", r);
@@ -1921,11 +1924,8 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
amdgpu_amdkfd_device_init(adev);
 
 init_failed:
-   if (amdgpu_sriov_vf(adev)) {
-   if (!r)
-   amdgpu_virt_init_data_exchange(adev);
+   if (amdgpu_sriov_vf(adev))
amdgpu_virt_release_full_gpu(adev, true);
-   }
 
return r;
 }
@@ -2819,7 +2819,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
mutex_init(>virt.vf_errors.lock);
hash_init(adev->mn_hash);
mutex_init(>lock_reset);
-   mutex_init(>virt.dpm_mutex);
mutex_init(>psp.mutex);
 
r = amdgpu_device_check_arguments(adev);
@@ -3040,9 +3039,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 
amdgpu_fbdev_init(adev);
 
-   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev))
-   amdgpu_pm_virt_sysfs_init(adev);
-
r = amdgpu_pm_sysfs_init(adev);
if (r) {
adev->pm_sysfs_en = false;
@@ -3187,8 +3183,6 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
iounmap(adev->rmmio);
adev->rmmio = NULL;
amdgpu_device_doorbell_fini(adev);
-   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev))
-   amdgpu_pm_virt_sysfs_fini(adev);
 
amdgpu_debugfs_regs_cleanup(adev);
device_remove_file(adev->dev, _attr_pcie_replay_count);
@@ -3669,6 +3663,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device 
*adev,
if (r)
goto error;
 
+   amdgpu_virt_init_data_exchange(adev);
/* we need recover gart prior to run SMC/CP/SDMA resume */
amdgpu_gtt_mgr_recover(>mman.bdev.man[TTM_PL_TT]);
 
@@ -3686,7 +3681,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device 
*adev,
amdgpu_amdkfd_post_reset(adev);
 
 error:
-   amdgpu_virt_init_data_exchange(adev);
amdgpu_virt_release_full_gpu(adev, true);
if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
amdgpu_inc_vram_lost(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 5ec1415d1755..3a0ea9096498 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -703,10 +703,6 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void 
*data, struct drm_file
if 

RE: [PATCH] drm/amdgpu: not remove sysfs if not create sysfs

2019-11-29 Thread Tao, Yintian
Hi  Christian

Thanks a lot. I got it. Can I get your RB?

Best Regards
Yintian Tao

-Original Message-
From: Christian König  
Sent: November 29, 2019 17:30
To: Tao, Yintian ; Koenig, Christian 
; Das, Nirmoy ; 
amd-gfx@lists.freedesktop.org
Cc: Tuikov, Luben 
Subject: Re: [PATCH] drm/amdgpu: not remove sysfs if not create sysfs

On 29.11.19 at 10:25, Tao, Yintian wrote:
> Hi  Christian
>
> Do you mean we can remove sysfs_remove_group() for pm_sysfs and ucode_sysfs 
> at amdgpu_device_fini()?

At least I think so, the question is where this group is added?

If that is for some directory which is removed during driver unload then the 
group will be removed automatically as well.

> If so , I think the sysfs directories will not be removed automatically.
> When I remove sysfs_remove_group() at amdgpu_device_fini() and reload amdgpu, 
> then it will report the error below.

Ok in this case forget what I said. The group is added directly to the PCI
directory and that one is obviously not removed when the driver unloads.

Thanks,
Christian.

>
> [ 4192.025969] [drm] fb depth is 24
> [ 4192.025970] [drm]pitch is 7680
> [ 4192.026104] checking generic (f400 24) vs hw (6 
> 2) [ 4192.026182] amdgpu :00:07.0: fb1: amdgpudrmfb frame 
> buffer device [ 4192.043546] sysfs: cannot create duplicate filename 
> '/devices/pci:00/:00:07.0/fw_version'
> [ 4192.043549] CPU: 2 PID: 5423 Comm: modprobe Tainted: G   OE 
> 5.2.0-rc1 #1
> [ 4192.043550] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 4192.043551] Call Trace:
> [ 4192.043781]  dump_stack+0x63/0x85
> [ 4192.043862]  sysfs_warn_dup+0x5b/0x70 [ 4192.043864]  
> internal_create_group+0x36f/0x3a0 [ 4192.043878]  
> sysfs_create_group+0x13/0x20 [ 4192.043970]  
> amdgpu_ucode_sysfs_init+0x18/0x20 [amdgpu] [ 4192.044030]  
> amdgpu_device_init+0xe48/0x1b00 [amdgpu] [ 4192.044086]  
> amdgpu_driver_load_kms+0x5d/0x250 [amdgpu] [ 4192.044099]  
> drm_dev_register+0x12b/0x1c0 [drm]
>
>
> Best Regards
> Yintian Tao
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Christian König
> Sent: November 29, 2019 16:52
> To: Das, Nirmoy ; amd-gfx@lists.freedesktop.org
> Cc: Tuikov, Luben 
> Subject: Re: [PATCH] drm/amdgpu: not remove sysfs if not create sysfs
>
> Well what we do here actually looks like complete overkill to me.
>
> IIRC when the device is removed all subsequent sysfs directories are removed 
> automatically as well.
>
So calling sysfs_remove_group() is superfluous in the first place.
>
> Regards,
> Christian.
>
> Am 29.11.19 um 09:34 schrieb Nirmoy:
>> Luben, This should take care of the warnings that you get when a navi 
>> fw file is missing from initrd.
>>
>>
>> Regards,
>>
>> Nirmoy
>>
>> On 11/29/19 9:26 AM, Yintian Tao wrote:
>>> When load amdgpu failed before create pm_sysfs and ucode_sysfs, the 
>>> pm_sysfs and ucode_sysfs should not be removed.
>>> Otherwise, there will be warning call trace just like below.
>>> [   24.836386] [drm] VCE initialized successfully.
>>> [   24.841352] amdgpu :00:07.0: amdgpu_device_ip_init failed [ 
>>> 25.370383] amdgpu :00:07.0: Fatal error during GPU init [ 
>>> 25.889575] [drm] amdgpu: finishing device.
>>> [   26.069128] amdgpu :00:07.0: [drm:amdgpu_ring_test_helper 
>>> [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110) [   26.070110] 
>>> [drm:gfx_v9_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed [ 
>>> 26.200309] [TTM] Finalizing pool allocator [   26.200314] [TTM] 
>>> Finalizing DMA pool allocator [   26.200349] [TTM] Zone  kernel: 
>>> Used memory at exit: 0 KiB [   26.200351] [TTM] Zone   dma32: Used 
>>> memory at exit: 0 KiB [   26.200353] [drm] amdgpu: ttm finalized [ 
>>> 26.205329] [ cut here ] [   26.205330] sysfs 
>>> group 'fw_version' not found for kobject ':00:07.0'
>>> [   26.205347] WARNING: CPU: 0 PID: 1228 at fs/sysfs/group.c:256
>>> sysfs_remove_group+0x80/0x90
>>> [   26.205348] Modules linked in: amdgpu(OE+) gpu_sched(OE) ttm(OE)
>>> drm_kms_helper(OE) drm(OE) i2c_algo_bit fb_sys_fops syscopyarea 
>>> sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd 
>>> grace fscache binfmt_misc snd_hda_codec_generic ledtrig_audio 
>>> crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec 
>>> ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm snd_timer 
>>> input_leds snd joydev soundcore serio_raw pcspkr evbug aesni_intel
>>> aes_x86_64 crypto_simd cryptd mac_hid glue_helper sunrpc ip_tables 
>>> x_tabl

RE: [PATCH] drm/amdgpu: not remove sysfs if not create sysfs

2019-11-29 Thread Tao, Yintian
Hi  Christian

Do you mean we can remove sysfs_remove_group() for pm_sysfs and ucode_sysfs at 
amdgpu_device_fini()?
If so, I think the sysfs directories will not be removed automatically.
When I remove sysfs_remove_group() at amdgpu_device_fini() and reload amdgpu, 
then it will report the error below.

[ 4192.025969] [drm] fb depth is 24
[ 4192.025970] [drm]pitch is 7680
[ 4192.026104] checking generic (f400 24) vs hw (6 2)
[ 4192.026182] amdgpu :00:07.0: fb1: amdgpudrmfb frame buffer device
[ 4192.043546] sysfs: cannot create duplicate filename 
'/devices/pci:00/:00:07.0/fw_version'
[ 4192.043549] CPU: 2 PID: 5423 Comm: modprobe Tainted: G   OE 
5.2.0-rc1 #1
[ 4192.043550] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 4192.043551] Call Trace:
[ 4192.043781]  dump_stack+0x63/0x85
[ 4192.043862]  sysfs_warn_dup+0x5b/0x70
[ 4192.043864]  internal_create_group+0x36f/0x3a0
[ 4192.043878]  sysfs_create_group+0x13/0x20
[ 4192.043970]  amdgpu_ucode_sysfs_init+0x18/0x20 [amdgpu]
[ 4192.044030]  amdgpu_device_init+0xe48/0x1b00 [amdgpu]
[ 4192.044086]  amdgpu_driver_load_kms+0x5d/0x250 [amdgpu]
[ 4192.044099]  drm_dev_register+0x12b/0x1c0 [drm]


Best Regards
Yintian Tao
-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: November 29, 2019 16:52
To: Das, Nirmoy ; amd-gfx@lists.freedesktop.org
Cc: Tuikov, Luben 
Subject: Re: [PATCH] drm/amdgpu: not remove sysfs if not create sysfs

Well what we do here actually looks like complete overkill to me.

IIRC when the device is removed all subsequent sysfs directories are removed 
automatically as well.

So calling sysfs_remove_group() is superfluous in the first place.

Regards,
Christian.

On 29.11.19 at 09:34, Nirmoy wrote:
> Luben, This should take care of the warnings that you get when a navi 
> fw file is missing from initrd.
>
>
> Regards,
>
> Nirmoy
>
> On 11/29/19 9:26 AM, Yintian Tao wrote:
>> When amdgpu fails to load before creating pm_sysfs and ucode_sysfs, the 
>> pm_sysfs and ucode_sysfs groups should not be removed.
>> Otherwise, there will be a warning call trace like the one below.
>> [   24.836386] [drm] VCE initialized successfully.
>> [   24.841352] amdgpu :00:07.0: amdgpu_device_ip_init failed [   
>> 25.370383] amdgpu :00:07.0: Fatal error during GPU init [   
>> 25.889575] [drm] amdgpu: finishing device.
>> [   26.069128] amdgpu :00:07.0: [drm:amdgpu_ring_test_helper 
>> [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110) [   26.070110] 
>> [drm:gfx_v9_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed [   
>> 26.200309] [TTM] Finalizing pool allocator [   26.200314] [TTM] 
>> Finalizing DMA pool allocator [   26.200349] [TTM] Zone  kernel: Used 
>> memory at exit: 0 KiB [   26.200351] [TTM] Zone   dma32: Used memory 
>> at exit: 0 KiB [   26.200353] [drm] amdgpu: ttm finalized [   
>> 26.205329] [ cut here ] [   26.205330] sysfs 
>> group 'fw_version' not found for kobject ':00:07.0'
>> [   26.205347] WARNING: CPU: 0 PID: 1228 at fs/sysfs/group.c:256
>> sysfs_remove_group+0x80/0x90
>> [   26.205348] Modules linked in: amdgpu(OE+) gpu_sched(OE) ttm(OE)
>> drm_kms_helper(OE) drm(OE) i2c_algo_bit fb_sys_fops syscopyarea 
>> sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd 
>> grace fscache binfmt_misc snd_hda_codec_generic ledtrig_audio 
>> crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec 
>> ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm snd_timer 
>> input_leds snd joydev soundcore serio_raw pcspkr evbug aesni_intel
>> aes_x86_64 crypto_simd cryptd mac_hid glue_helper sunrpc ip_tables 
>> x_tables autofs4 8139too psmouse 8139cp mii i2c_piix4 pata_acpi 
>> floppy [   26.205369] CPU: 0 PID: 1228 Comm: modprobe Tainted: G OE 
>> 5.2.0-rc1 #1 [   26.205370] Hardware name: QEMU Standard PC (i440FX + 
>> PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [   26.205372] 
>> RIP: 0010:sysfs_remove_group+0x80/0x90 [   26.205374] Code: e8 35 b9 
>> ff ff 5b 41 5c 41 5d 5d c3 48 89 df e8
>> f6 b5 ff ff eb c6 49 8b 55 00 49 8b 34 24 48 c7 c7 48 7a 70 98 e8 60
>> 63 d3 ff <0f> 0b eb d7 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44
>> 00 00 55
>> [   26.205375] RSP: 0018:bee242b0b908 EFLAGS: 00010282 [   
>> 26.205376] RAX:  RBX:  RCX:
>> 0006
>> [   26.205377] RDX: 0007 RSI: 0092 RDI: 
>> 97ad6f817380
>> [   26.205377] RBP: bee242b0b920 R08: 98f520c4 R09: 
>> 02b3
>> [   26.205378] R10: bee242b0b8f8 R11: 02b3 R12: 
>> c0e58240
>> [   26.205379] R13: 97ad6d1fe0b0 R14: 97ad4db954c8 R15: 
>> 97ad4db7fff0
>> [   26.205380] FS:  7ff3d8a1c4c0() GS:97ad6f80()
>> knlGS:
>> [   26.205381] CS:  0010 DS:  ES:  CR0: 80050033 [   
>> 26.205381] CR2: 7f9b2ef1df04 CR3: 00042aab8001 CR4:
>> 003606f0
>> 

RE: [PATCH] drm/amdgpu: put cancel delayed work at first

2019-11-18 Thread Tao, Yintian
Hi  Christian

Thanks, I got it, I will use flush_delayed_work to replace canceling.

Best Regards
Yintian Tao
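
For reference, a minimal sketch of the resulting tear-down order, assuming the
usual adev->delayed_init_work and adev->shutdown fields; the function name is
illustrative and the rest of amdgpu_device_fini() is omitted:

static void example_fini_prologue(struct amdgpu_device *adev)
{
        /* let the delayed init work (and its IB tests) finish while
         * interrupts are still being serviced ... */
        flush_delayed_work(&adev->delayed_init_work);

        /* ... and only then tell the IRQ path that we are shutting down */
        adev->shutdown = true;
}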

-Original Message-
From: Koenig, Christian  
Sent: November 18, 2019 19:32
To: Tao, Yintian 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: put cancel delayed work at first

Good catch, but I would still prefer to use flush_delayed_work() instead of 
canceling it.

Regards,
Christian.

On 18.11.19 at 09:21, Yintian Tao wrote:
> There is one regression from 042f3d7b745cd76aa and one improvement 
> here.
> -regression:
> flush_delayed_work was placed after adev->shutdown = true, which makes 
> amdgpu_ih_process stop responding to the irq. As a result, all ib ring tests 
> fail, as shown below
>
> [drm] amdgpu: finishing device.
> [drm] Fence fallback timer expired on ring gfx [drm] Fence fallback 
> timer expired on ring comp_1.0.0 [drm] Fence fallback timer expired on 
> ring comp_1.1.0 [drm] Fence fallback timer expired on ring comp_1.2.0 
> [drm] Fence fallback timer expired on ring comp_1.3.0 [drm] Fence 
> fallback timer expired on ring comp_1.0.1 amdgpu :00:07.0: 
> [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 
> (-110).
> amdgpu :00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test 
> failed on comp_1.2.1 (-110).
> amdgpu :00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test 
> failed on comp_1.3.1 (-110).
> amdgpu :00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test 
> failed on sdma0 (-110).
> amdgpu :00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test 
> failed on sdma1 (-110).
> amdgpu :00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test 
> failed on uvd_enc_0.0 (-110).
> amdgpu :00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test 
> failed on vce0 (-110).
> [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test 
> failed (-110).
>
> -improvement:
> In fact, there is already a cancel_delayed_work_sync in this function, so 
> there is no need to invoke flush_delayed_work before 
> cancel_delayed_work_sync. Just put the cancel first
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 17be6389adf7..a2454c3efc65 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3109,10 +3109,9 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
>   int r;
>   
>   DRM_INFO("amdgpu: finishing device.\n");
> + cancel_delayed_work_sync(>delayed_init_work);
>   adev->shutdown = true;
>   
> - flush_delayed_work(>delayed_init_work);
> -
>   /* disable all interrupts */
>   amdgpu_irq_disable_all(adev);
>   if (adev->mode_info.mode_config_initialized){
> @@ -3130,7 +3129,6 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
>   adev->firmware.gpu_info_fw = NULL;
>   }
>   adev->accel_working = false;
> - cancel_delayed_work_sync(>delayed_init_work);
>   /* free i2c buses */
>   if (!amdgpu_device_has_dc_support(adev))
>   amdgpu_i2c_fini(adev);


RE: [PATCH] drm/amdgpu: register pm sysfs for sriov

2019-06-05 Thread Tao, Yintian
Hi  Alex

Many thanks for your review.

Best Regards
Yintian Tao

-Original Message-
From: Alex Deucher  
Sent: Thursday, June 06, 2019 10:28 AM
To: Tao, Yintian 
Cc: amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: register pm sysfs for sriov

On Wed, Jun 5, 2019 at 9:54 AM Yintian Tao  wrote:
>
> we need to register the pm sysfs for virt in order to support dpm level 
> modification because the smu ip block will not be added under SRIOV
>
> Signed-off-by: Yintian Tao 
> Change-Id: Ib0e13934c0c33da00f9d2add6be25a373c6fb957
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  6 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 61 
> --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.h |  2 +
>  3 files changed, 65 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index d00fd5d..9b9d387 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2695,6 +2695,9 @@ int amdgpu_device_init(struct amdgpu_device 
> *adev,
>
> amdgpu_fbdev_init(adev);
>
> +   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev))
> +   amdgpu_virt_pm_sysfs_init(adev);
> +
> r = amdgpu_pm_sysfs_init(adev);
> if (r)
> DRM_ERROR("registering pm debugfs failed (%d).\n", r); 
> @@ -2816,6 +2819,9 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
> iounmap(adev->rmmio);
> adev->rmmio = NULL;
> amdgpu_device_doorbell_fini(adev);
> +   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev))
> +   amdgpu_virt_pm_sysfs_fini(adev);
> +
> amdgpu_debugfs_regs_cleanup(adev);
> device_remove_file(adev->dev, _attr_pcie_replay_count);
> amdgpu_ucode_sysfs_fini(adev); diff --git 
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index a73e190..b6f16d45 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -269,8 +269,11 @@ static ssize_t 
> amdgpu_get_dpm_forced_performance_level(struct device *dev,
> struct amdgpu_device *adev = ddev->dev_private;
> enum amd_dpm_forced_level level = 0xff;
>
> -   if  ((adev->flags & AMD_IS_PX) &&
> -(ddev->switch_power_state != DRM_SWITCH_POWER_ON))
> +   if (amdgpu_sriov_vf(adev))
> +   return 0;
> +
> +   if ((adev->flags & AMD_IS_PX) &&
> +   (ddev->switch_power_state != DRM_SWITCH_POWER_ON))
> return snprintf(buf, PAGE_SIZE, "off\n");
>
> if (is_support_sw_smu(adev))
> @@ -308,9 +311,11 @@ static ssize_t 
> amdgpu_set_dpm_forced_performance_level(struct device *dev,
>  (ddev->switch_power_state != DRM_SWITCH_POWER_ON))
> return -EINVAL;
>
> -   if (is_support_sw_smu(adev))
> +   if (!amdgpu_sriov_vf(adev) && is_support_sw_smu(adev))
> current_level = smu_get_performance_level(>smu);
> -   else if (adev->powerplay.pp_funcs->get_performance_level)
> +   else if (!amdgpu_sriov_vf(adev) &&
> +adev->powerplay.pp_funcs &&
> +adev->powerplay.pp_funcs->get_performance_level)
> current_level = 
> amdgpu_dpm_get_performance_level(adev);

Wrap the entire existing block in if (!amdgpu_sriov_vf(adev)) rather than adding 
the check to each case.
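
A sketch of what that wrapped form could look like, reconstructed from the hunk
quoted above (current_level handling only):

        if (!amdgpu_sriov_vf(adev)) {
                if (is_support_sw_smu(adev))
                        current_level = smu_get_performance_level(&adev->smu);
                else if (adev->powerplay.pp_funcs &&
                         adev->powerplay.pp_funcs->get_performance_level)
                        current_level = amdgpu_dpm_get_performance_level(adev);
        }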

>
> if (strncmp("low", buf, strlen("low")) == 0) { @@ -907,6 
> +912,10 @@ static ssize_t amdgpu_get_pp_dpm_mclk(struct device *dev,
> struct drm_device *ddev = dev_get_drvdata(dev);
> struct amdgpu_device *adev = ddev->dev_private;
>
> +   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
> +   adev->virt.ops->get_pp_clk)
> +   return adev->virt.ops->get_pp_clk(adev,PP_MCLK,buf);
> +
> if (is_support_sw_smu(adev))
> return smu_print_clk_levels(>smu, PP_MCLK, buf);
> else if (adev->powerplay.pp_funcs->print_clock_levels)
> @@ -925,6 +934,9 @@ static ssize_t amdgpu_set_pp_dpm_mclk(struct device *dev,
> int ret;
> uint32_t mask = 0;
>
> +   if (amdgpu_sriov_vf(adev))
> +   return 0;
> +
> ret = amdgpu_read_mask(buf, count, );
> if (ret)
> return ret;
> @@ -965,6 +977,9 @@ static ssize_t amdgpu_set_pp_dpm_socclk(struct device 
> *dev,
> int ret;
> uint32_t mask = 0;
>
> +

RE: [PATCH] drm/amdgpu: register pm sysfs for sriov

2019-06-05 Thread Tao, Yintian
Hi  Alex

Could you help review this? Thanks in advance.


Best Regards
Yintian Tao

-Original Message-
From: amd-gfx  On Behalf Of Yintian Tao
Sent: Wednesday, June 05, 2019 10:09 PM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 

Cc: Tao, Yintian 
Subject: [PATCH] drm/amdgpu: register pm sysfs for sriov

we need to register the pm sysfs for virt in order to support dpm level 
modification because the smu ip block will not be added under SRIOV

Signed-off-by: Yintian Tao 
Change-Id: Ib0e13934c0c33da00f9d2add6be25a373c6fb957
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  6 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 61 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.h |  2 +
 3 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index d00fd5d..9b9d387 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2695,6 +2695,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 
amdgpu_fbdev_init(adev);
 
+   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev))
+   amdgpu_virt_pm_sysfs_init(adev);
+
r = amdgpu_pm_sysfs_init(adev);
if (r)
DRM_ERROR("registering pm debugfs failed (%d).\n", r); @@ 
-2816,6 +2819,9 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
iounmap(adev->rmmio);
adev->rmmio = NULL;
amdgpu_device_doorbell_fini(adev);
+   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev))
+   amdgpu_virt_pm_sysfs_fini(adev);
+
amdgpu_debugfs_regs_cleanup(adev);
device_remove_file(adev->dev, _attr_pcie_replay_count);
amdgpu_ucode_sysfs_fini(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index a73e190..93e5205 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -269,8 +269,11 @@ static ssize_t 
amdgpu_get_dpm_forced_performance_level(struct device *dev,
struct amdgpu_device *adev = ddev->dev_private;
enum amd_dpm_forced_level level = 0xff;
 
-   if  ((adev->flags & AMD_IS_PX) &&
-(ddev->switch_power_state != DRM_SWITCH_POWER_ON))
+   if (amdgpu_sriov_vf(adev))
+   return 0;
+
+   if ((adev->flags & AMD_IS_PX) &&
+   (ddev->switch_power_state != DRM_SWITCH_POWER_ON))
return snprintf(buf, PAGE_SIZE, "off\n");
 
if (is_support_sw_smu(adev))
@@ -308,9 +311,11 @@ static ssize_t 
amdgpu_set_dpm_forced_performance_level(struct device *dev,
 (ddev->switch_power_state != DRM_SWITCH_POWER_ON))
return -EINVAL;
 
-   if (is_support_sw_smu(adev))
+   if (!amdgpu_sriov_vf(adev) && is_support_sw_smu(adev))
current_level = smu_get_performance_level(>smu);
-   else if (adev->powerplay.pp_funcs->get_performance_level)
+   else if (!amdgpu_sriov_vf(adev) &&
+adev->powerplay.pp_funcs &&
+adev->powerplay.pp_funcs->get_performance_level)
current_level = amdgpu_dpm_get_performance_level(adev);
 
if (strncmp("low", buf, strlen("low")) == 0) { @@ -885,6 +890,9 @@ 
static ssize_t amdgpu_set_pp_dpm_sclk(struct device *dev,
int ret;
uint32_t mask = 0;
 
+   if (amdgpu_sriov_vf(adev))
+   return 0;
+
ret = amdgpu_read_mask(buf, count, );
if (ret)
return ret;
@@ -907,6 +915,10 @@ static ssize_t amdgpu_get_pp_dpm_mclk(struct device *dev,
struct drm_device *ddev = dev_get_drvdata(dev);
struct amdgpu_device *adev = ddev->dev_private;
 
+   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
+   adev->virt.ops->get_pp_clk)
+   return adev->virt.ops->get_pp_clk(adev,PP_MCLK,buf);
+
if (is_support_sw_smu(adev))
return smu_print_clk_levels(>smu, PP_MCLK, buf);
else if (adev->powerplay.pp_funcs->print_clock_levels)
@@ -925,6 +937,9 @@ static ssize_t amdgpu_set_pp_dpm_mclk(struct device *dev,
int ret;
uint32_t mask = 0;
 
+   if (amdgpu_sriov_vf(adev))
+   return 0;
+
ret = amdgpu_read_mask(buf, count, );
if (ret)
return ret;
@@ -2698,6 +2713,44 @@ void amdgpu_pm_print_power_states(struct amdgpu_device 
*adev)
 
 }
 
+int amdgpu_virt_pm_sysfs_init(struct amdgpu_device *adev) {
+   int ret = 0;
+
+   if (!(amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev)))
+   return ret;
+
+   ret = device_create_file(adev->dev, _attr_pp_dpm_sclk);
+   if (ret) {
+   DRM_ERROR("failed to create device file pp_d

RE: RE: [PATCH] drm/amdgpu: no need fbcon under sriov

2019-06-04 Thread Tao, Yintian
Yes, you are right, the error message is in the wrong place.


May I just remove this message?



Best Regards

Yintian Tao


From: Koenig, Christian
Sent: June 4, 2019 23:22:00
To: Tao, Yintian; amd-gfx@lists.freedesktop.org
Subject: Re: RE: [PATCH] drm/amdgpu: no need fbcon under sriov

On 04.06.19 at 17:16, Tao, Yintian wrote:

Hi  Christian


But when amdgpu driver is unloading, it will call this function.


And driver unloading is a legal case under SR-IOV.


Do you mean that PCIe device removal means unplugging the real device?

Yes, exactly and that is not supported.

Sounds like the error message is then on the wrong place.

Christian.



Best Regards

Yintian Tao


From: Christian König
Sent: June 4, 2019 21:57:37
To: Tao, Yintian; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: no need fbcon under sriov

On 04.06.19 at 15:43, Yintian Tao wrote:
> Under SR-IOV, there is no need to support fbcon.

NAK, that error message is not related to fbcon but means that PCIe
device removal is not supported.

Christian.

>
> Signed-off-by: Yintian Tao <mailto:yt...@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 1f38d6f..28d095b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1012,7 +1012,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>   {
>struct drm_device *dev = pci_get_drvdata(pdev);
>
> - DRM_ERROR("Device removal is currently not supported outside of 
> fbcon\n");
> + if (!amdgpu_sriov_vf(adev))
> + DRM_ERROR("Device removal is currently not supported outside of 
> fbcon\n");
>drm_dev_unplug(dev);
>drm_dev_put(dev);
>pci_disable_device(pdev);



RE: [PATCH] drm/amdgpu: no need fbcon under sriov

2019-06-04 Thread Tao, Yintian
Hi  Christian


But when amdgpu driver is unloading, it will call this function.


And driver unloading is a legal case under SR-IOV.


Do you mean that PCIe device removal means unplugging the real device?


Best Regards

Yintian Tao


From: Christian König
Sent: June 4, 2019 21:57:37
To: Tao, Yintian; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: no need fbcon under sriov

On 04.06.19 at 15:43, Yintian Tao wrote:
> Under SR-IOV, there is no need to support fbcon.

NAK, that error message is not related to fbcon but means that PCIe
device removal is not supported.

Christian.

>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 1f38d6f..28d095b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1012,7 +1012,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
>   {
>struct drm_device *dev = pci_get_drvdata(pdev);
>
> - DRM_ERROR("Device removal is currently not supported outside of 
> fbcon\n");
> + if (!amdgpu_sriov_vf(adev))
> + DRM_ERROR("Device removal is currently not supported outside of 
> fbcon\n");
>drm_dev_unplug(dev);
>drm_dev_put(dev);
>pci_disable_device(pdev);


RE: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov

2019-05-20 Thread Tao, Yintian
Hi  Alex

So sorry, I missed your patch because the Outlook web client didn't show it.

Your patch seems cleaner and better. Can you help submit it? Thanks in advance.

Reviewed-by: Yintian Tao



From: Deucher, Alexander 
Sent: Saturday, May 18, 2019 1:07 AM
To: Tao, Yintian ; Alex Deucher 
Cc: amd-gfx@lists.freedesktop.org; Koenig, Christian 
; Huang, Trigger 
Subject: Re: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov

Did you see the patch I attached?

Alex

From: Tao, Yintian
Sent: Friday, May 17, 2019 10:51 AM
To: Alex Deucher
Cc: amd-gfx@lists.freedesktop.org; Koenig, Christian; Deucher, Alexander; Huang, Trigger
Subject: RE: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov


Hi Alex





Many thanks for your review. I will merge these two patches into one and submit 
again.





Best Regards

Yintian Tao


From: Alex Deucher
Sent: May 17, 2019 22:34:30
To: Tao, Yintian
Cc: amd-gfx@lists.freedesktop.org; Koenig, Christian; Deucher, Alexander; Huang, Trigger
Subject: Re: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov


How about combining these two patches into one?  This seems cleaner.

Alex

On Thu, May 16, 2019 at 10:39 PM Tao, Yintian 
mailto:yintian@amd.com>> wrote:
>
> Ping...
>
> Hi Christian and Alex
>
>
> Can you help review this? Thanks in advance.
>
>
> Best Regards
> Yintian Tao
>
> -Original Message-
> From: Yintian Tao mailto:yt...@amd.com>>
> Sent: Thursday, May 16, 2019 8:03 PM
> To: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
> Cc: Tao, Yintian mailto:yintian@amd.com>>; Huang, 
> Trigger mailto:trigger.hu...@amd.com>>
> Subject: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov
>
> For Vega10 SR-IOV, vram_width can't be read from ATOM as on RAVEN, and the DF 
> related registers are not readable; hardcoding seems to be the only way to set 
> the correct vram_width
>
> Signed-off-by: Trigger Huang 
> mailto:trigger.hu...@amd.com>>
> Signed-off-by: Yintian Tao mailto:yt...@amd.com>>
> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index c221570..a417763 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -848,6 +848,13 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
> adev->gmc.vram_width = numchan * chansize;
> }
>
> +   /* For Vega10 SR-IOV, vram_width can't be read from ATOM as RAVEN,
> +* and DF related registers is not readable, seems hardcord is the
> +* only way to set the correct vram_width */
> +   if (amdgpu_sriov_vf(adev) && (adev->asic_type == CHIP_VEGA10)) {
> +   adev->gmc.vram_width = 2048;
> +   }
> +
> /* size in MB on si */
> adev->gmc.mc_vram_size =
> adev->nbio_funcs->get_memsize(adev) * 1024ULL * 1024ULL;
> --
> 2.7.4
>

RE: [PATCH] drm/amdgpu: no read DF register under SRIOV and set correct vram width

2019-05-20 Thread Tao, Yintian
Please ignore it. I miss the patch which Alex attached.

-Original Message-
From: amd-gfx  On Behalf Of Yintian Tao
Sent: Monday, May 20, 2019 5:21 PM
To: amd-gfx@lists.freedesktop.org
Cc: Huang, Trigger ; Liu, Monk ; Tao, 
Yintian 
Subject: [PATCH] drm/amdgpu: no read DF register under SRIOV and set correct 
vram width


PART1:
Under SRIOV, reading a DF register has a chance of causing an AER error on the 
host side, so just skip reading it.
PART2:
For Vega10 SR-IOV, vram_width can't be read from ATOM as on RAVEN, and the DF 
related registers are not readable; hardcoding seems to be the only way to set 
the correct vram_width.

Signed-off-by: Trigger Huang 
Signed-off-by: Monk Liu 
Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index c221570..b5bf9ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -837,7 +837,7 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)

if (amdgpu_emu_mode != 1)
adev->gmc.vram_width = amdgpu_atomfirmware_get_vram_width(adev);
-   if (!adev->gmc.vram_width) {
+   if (!adev->gmc.vram_width && !amdgpu_sriov_vf(adev)) {
/* hbm memory channel size */
if (adev->flags & AMD_IS_APU)
chansize = 64;
@@ -848,6 +848,13 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
adev->gmc.vram_width = numchan * chansize;
}

+   /* For Vega10 SR-IOV, vram_width can't be read from ATOM as RAVEN,
+* and DF related registers is not readable, seems hardcord is the
+* only way to set the correct vram_width */
+   if (amdgpu_sriov_vf(adev) && (adev->asic_type == CHIP_VEGA10)) {
+   adev->gmc.vram_width = 2048;
+   }
+
/* size in MB on si */
adev->gmc.mc_vram_size =
adev->nbio_funcs->get_memsize(adev) * 1024ULL * 1024ULL;
--
2.7.4


RE: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov

2019-05-17 Thread Tao, Yintian
Hi Alex



Many thanks for your review. I will merge these two patches into one and submit 
again.



Best Regards

Yintian Tao


From: Alex Deucher
Sent: May 17, 2019 22:34:30
To: Tao, Yintian
Cc: amd-gfx@lists.freedesktop.org; Koenig, Christian; Deucher, Alexander; Huang, Trigger
Subject: Re: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov


How about combining these two patches into one?  This seems cleaner.

Alex

On Thu, May 16, 2019 at 10:39 PM Tao, Yintian  wrote:
>
> Ping...
>
> Hi Christian and Alex
>
>
> Can you help review this? Thanks in advance.
>
>
> Best Regards
> Yintian Tao
>
> -Original Message-
> From: Yintian Tao 
> Sent: Thursday, May 16, 2019 8:03 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Tao, Yintian ; Huang, Trigger 
> Subject: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov
>
> For Vega10 SR-IOV, vram_width can't be read from ATOM as RAVEN, and DF 
> related registers is not readable, seems hardcord is the only way to set the 
> correct vram_width
>
> Signed-off-by: Trigger Huang 
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index c221570..a417763 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -848,6 +848,13 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
> adev->gmc.vram_width = numchan * chansize;
> }
>
> +   /* For Vega10 SR-IOV, vram_width can't be read from ATOM as RAVEN,
> +* and DF related registers is not readable, seems hardcord is the
> +* only way to set the correct vram_width */
> +   if (amdgpu_sriov_vf(adev) && (adev->asic_type == CHIP_VEGA10)) {
> +   adev->gmc.vram_width = 2048;
> +   }
> +
> /* size in MB on si */
> adev->gmc.mc_vram_size =
> adev->nbio_funcs->get_memsize(adev) * 1024ULL * 1024ULL;
> --
> 2.7.4
>

RE: RE: [PATCH] drm/amdgpu: skip fw pri bo alloc for SRIOV

2019-05-17 Thread Tao, Yintian
Hi Christian



Yes, of course. Thanks for your reminder.



Best Regards
Yintian Tao


From: Christian König
Sent: May 17, 2019 15:20:54
To: Tao, Yintian; Koenig, Christian
Cc: amd-gfx@lists.freedesktop.org; Liu, Monk
Subject: Re: RE: [PATCH] drm/amdgpu: skip fw pri bo alloc for SRIOV

Hi Yintian,

please add this as a code comment to the patch.

Christian.

On 17.05.19 at 09:17, Tao, Yintian wrote:

Hi  Christian


Many thanks for your review.


The background is that this BO is used to let the PSP load sos and sysdrv, but 
under SR-IOV, sos and sysdrv are loaded by the VBIOS or the hypervisor driver.


The reason the guest driver is not allowed to load it under SR-IOV is that it 
would not be safe.
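
As a rough sketch of the code comment being asked for, the allocation in
psp_load_fw() could be annotated along these lines (reconstructed from the hunk
quoted further down in this thread; the comment wording is only a suggestion):

        /* Under SR-IOV the sos/sysdrv images are loaded by the VBIOS or the
         * hypervisor driver, and letting the guest load them would not be
         * safe, so the PSP fw primary buffer is never used here. */
        if (!amdgpu_sriov_vf(psp->adev)) {
                ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
                                              AMDGPU_GEM_DOMAIN_GTT,
                                              &psp->fw_pri_bo,
                                              &psp->fw_pri_mc_addr,
                                              &psp->fw_pri_buf);
                if (ret)
                        goto failed;
        }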



Best Regards

Yintian Tao


From: Koenig, Christian
Sent: May 17, 2019 14:53:35
To: Tao, Yintian
Cc: Liu, Monk; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: skip fw pri bo alloc for SRIOV

Looks good to me now, but I don't know the technical background why this
BO is not needed under SRIOV.

So this patch is Acked-by: Christian König 
<mailto:christian.koe...@amd.com>.

Regards,
Christian.

On 17.05.19 at 04:41, Tao, Yintian wrote:
> Hi Christian
>
>
> I have modified it according to your suggestion. Can you help review this 
> again? Thanks in advance.
>
>
> Best Regards
> Yintian Tao
>
> -Original Message-
> From: Yintian Tao <mailto:yt...@amd.com>
> Sent: Thursday, May 16, 2019 7:54 PM
> To: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
> Cc: Tao, Yintian <mailto:yintian@amd.com>; Liu, Monk 
> <mailto:monk@amd.com>
> Subject: [PATCH] drm/amdgpu: skip fw pri bo alloc for SRIOV
>
> PSP fw primary buffer is not used under SRIOV.
> Therefore, we don't need to allocate memory for it.
>
> v2: remove superfluous check for amdgpu_bo_free_kernel().
>
> Signed-off-by: Yintian Tao <mailto:yt...@amd.com>
> Signed-off-by: Monk Liu <mailto:monk@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 17 ++---
>   1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index c567a55..af9835c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -905,13 +905,16 @@ static int psp_load_fw(struct amdgpu_device *adev)
>if (!psp->cmd)
>return -ENOMEM;
>
> - ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
> - AMDGPU_GEM_DOMAIN_GTT,
> - >fw_pri_bo,
> - >fw_pri_mc_addr,
> - >fw_pri_buf);
> - if (ret)
> - goto failed;
> + /* this fw pri bo is not used under SRIOV */
> + if (!amdgpu_sriov_vf(psp->adev)) {
> + ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
> +   AMDGPU_GEM_DOMAIN_GTT,
> +   >fw_pri_bo,
> +   >fw_pri_mc_addr,
> +   >fw_pri_buf);
> + if (ret)
> + goto failed;
> + }
>
>ret = amdgpu_bo_create_kernel(adev, PSP_FENCE_BUFFER_SIZE, PAGE_SIZE,
>AMDGPU_GEM_DOMAIN_VRAM,





RE: [PATCH] drm/amdgpu: skip fw pri bo alloc for SRIOV

2019-05-17 Thread Tao, Yintian
Hi  Christian


Many thanks for your review.


The background is that this BO is used to let the PSP load sos and sysdrv, but 
under SR-IOV, sos and sysdrv are loaded by the VBIOS or the hypervisor driver.


The reason the guest driver is not allowed to load it under SR-IOV is that it 
would not be safe.



Best Regards

Yintian Tao


From: Koenig, Christian
Sent: May 17, 2019 14:53:35
To: Tao, Yintian
Cc: Liu, Monk; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: skip fw pri bo alloc for SRIOV

Looks good to me now, but I don't know the technical background why this
BO is not needed under SRIOV.

So this patch is Acked-by: Christian König .

Regards,
Christian.

On 17.05.19 at 04:41, Tao, Yintian wrote:
> Hi Christian
>
>
> I have modified it according to your suggestion. Can you help review this 
> again? Thanks in advance.
>
>
> Best Regards
> Yintian Tao
>
> -Original Message-
> From: Yintian Tao 
> Sent: Thursday, May 16, 2019 7:54 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Tao, Yintian ; Liu, Monk 
> Subject: [PATCH] drm/amdgpu: skip fw pri bo alloc for SRIOV
>
> PSP fw primary buffer is not used under SRIOV.
> Therefore, we don't need to allocate memory for it.
>
> v2: remove superfluous check for amdgpu_bo_free_kernel().
>
> Signed-off-by: Yintian Tao 
> Signed-off-by: Monk Liu 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 17 ++---
>   1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index c567a55..af9835c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -905,13 +905,16 @@ static int psp_load_fw(struct amdgpu_device *adev)
>if (!psp->cmd)
>return -ENOMEM;
>
> - ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
> - AMDGPU_GEM_DOMAIN_GTT,
> - >fw_pri_bo,
> - >fw_pri_mc_addr,
> - >fw_pri_buf);
> - if (ret)
> - goto failed;
> + /* this fw pri bo is not used under SRIOV */
> + if (!amdgpu_sriov_vf(psp->adev)) {
> + ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
> +   AMDGPU_GEM_DOMAIN_GTT,
> +   >fw_pri_bo,
> +   >fw_pri_mc_addr,
> +   >fw_pri_buf);
> + if (ret)
> + goto failed;
> + }
>
>ret = amdgpu_bo_create_kernel(adev, PSP_FENCE_BUFFER_SIZE, PAGE_SIZE,
>AMDGPU_GEM_DOMAIN_VRAM,


RE: [PATCH] drm/amdgpu: skip fw pri bo alloc for SRIOV

2019-05-16 Thread Tao, Yintian
Hi Christian


I have modified it according to your suggestion. Can you help review this 
again? Thanks in advance.


Best Regards
Yintian Tao

-Original Message-
From: Yintian Tao  
Sent: Thursday, May 16, 2019 7:54 PM
To: amd-gfx@lists.freedesktop.org
Cc: Tao, Yintian ; Liu, Monk 
Subject: [PATCH] drm/amdgpu: skip fw pri bo alloc for SRIOV

PSP fw primary buffer is not used under SRIOV.
Therefore, we don't need to allocate memory for it.

v2: remove superfluous check for amdgpu_bo_free_kernel().

Signed-off-by: Yintian Tao 
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index c567a55..af9835c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -905,13 +905,16 @@ static int psp_load_fw(struct amdgpu_device *adev)
if (!psp->cmd)
return -ENOMEM;
 
-   ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
-   AMDGPU_GEM_DOMAIN_GTT,
-   >fw_pri_bo,
-   >fw_pri_mc_addr,
-   >fw_pri_buf);
-   if (ret)
-   goto failed;
+   /* this fw pri bo is not used under SRIOV */
+   if (!amdgpu_sriov_vf(psp->adev)) {
+   ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
+ AMDGPU_GEM_DOMAIN_GTT,
+ >fw_pri_bo,
+ >fw_pri_mc_addr,
+ >fw_pri_buf);
+   if (ret)
+   goto failed;
+   }
 
ret = amdgpu_bo_create_kernel(adev, PSP_FENCE_BUFFER_SIZE, PAGE_SIZE,
AMDGPU_GEM_DOMAIN_VRAM,
-- 
2.7.4


RE: [PATCH] drm/amdgpu: don't read DF register for SRIOV

2019-05-16 Thread Tao, Yintian
Ping...

Hi Christian and Alex


Can you help review this? Thanks in advance.


Best Regards
Yintian Tao

-Original Message-
From: amd-gfx  On Behalf Of Yintian Tao
Sent: Thursday, May 16, 2019 8:11 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk ; Tao, Yintian 
Subject: [PATCH] drm/amdgpu: don't read DF register for SRIOV


Under SRIOV, reading a DF register has a chance of causing an AER error on the 
host side, so just skip reading it.

Signed-off-by: Monk Liu 
Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index a417763..b5bf9ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -837,7 +837,7 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)

if (amdgpu_emu_mode != 1)
adev->gmc.vram_width = amdgpu_atomfirmware_get_vram_width(adev);
-   if (!adev->gmc.vram_width) {
+   if (!adev->gmc.vram_width && !amdgpu_sriov_vf(adev)) {
/* hbm memory channel size */
if (adev->flags & AMD_IS_APU)
chansize = 64;
--
2.7.4


RE: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov

2019-05-16 Thread Tao, Yintian
Ping...

Hi Christian and Alex


Can you help review this? Thanks in advance.


Best Regards
Yintian Tao

-Original Message-
From: Yintian Tao  
Sent: Thursday, May 16, 2019 8:03 PM
To: amd-gfx@lists.freedesktop.org
Cc: Tao, Yintian ; Huang, Trigger 
Subject: [PATCH] drm/amdgpu: set correct vram_width for vega10 under sriov

For Vega10 SR-IOV, vram_width can't be read from ATOM as on RAVEN, and the DF 
related registers are not readable; hardcoding seems to be the only way to set 
the correct vram_width

Signed-off-by: Trigger Huang 
Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index c221570..a417763 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -848,6 +848,13 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
adev->gmc.vram_width = numchan * chansize;
}
 
+   /* For Vega10 SR-IOV, vram_width can't be read from ATOM as RAVEN,
+* and DF related registers is not readable, seems hardcord is the
+* only way to set the correct vram_width */
+   if (amdgpu_sriov_vf(adev) && (adev->asic_type == CHIP_VEGA10)) {
+   adev->gmc.vram_width = 2048;
+   }
+
/* size in MB on si */
adev->gmc.mc_vram_size =
adev->nbio_funcs->get_memsize(adev) * 1024ULL * 1024ULL;
--
2.7.4


RE: [PATCH] drm/amdgpu: disable DRIVER_ATOMIC under SRIOV

2019-04-16 Thread Tao, Yintian
Ping...

Hi  Christian and Alex

Could you help review it? Thanks in advance.

Best Regards
Yintian Tao

-Original Message-
From: Yintian Tao  
Sent: Tuesday, April 16, 2019 2:09 PM
To: amd-gfx@lists.freedesktop.org
Cc: Tao, Yintian 
Subject: [PATCH] drm/amdgpu: disable DRIVER_ATOMIC under SRIOV

Under SRIOV, we need to disable DRIVER_ATOMIC.
Otherwise, it will trigger a WARN_ON at drm_universal_plane_init.

Change-Id: I96a78d6e45b3a67ab9b9534e7071ae5daacc0f4f
Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 7e7f9ed..7d484fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -36,6 +36,7 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
/* enable virtual display */
adev->mode_info.num_crtc = 1;
adev->enable_virtual_display = true;
+   adev->ddev->driver->driver_features &= ~DRIVER_ATOMIC;
adev->cg_flags = 0;
adev->pg_flags = 0;
 }
-- 
2.7.4


RE: [PATCH] drm/amdgpu: support dpm level modification under virtualization v3

2019-04-10 Thread Tao, Yintian
Hi Alex

Many thanks for your review.

Best Regards
Yintian Tao

-Original Message-
From: Alex Deucher  
Sent: Wednesday, April 10, 2019 11:32 PM
To: Tao, Yintian 
Cc: amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: support dpm level modification under 
virtualization v3

On Wed, Apr 10, 2019 at 10:25 AM Yintian Tao  wrote:
>
> Under vega10 virtualization, the smu ip block will not be added.
> Therefore, we need to add pp clk query and force dpm level functions at 
> amdgpu_virt_ops to support the feature.
>
> v2: add get_pp_clk existence check and use kzalloc to allocate buf
>
> v3: return -ENOMEM for allocation failure and correct the coding style
>
> Change-Id: I713419c57b854082f6f739f1d32a055c7115e620
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  4 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 15 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 49 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   | 11 +
>  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  | 78 
> ++
>  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h  |  6 +++
>  7 files changed, 164 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 3ff8899..bb0fd5a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2486,6 +2486,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> mutex_init(>virt.vf_errors.lock);
> hash_init(adev->mn_hash);
> mutex_init(>lock_reset);
> +   mutex_init(>virt.dpm_mutex);
>
> amdgpu_device_check_arguments(adev);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 6190495..29ec28f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -727,6 +727,10 @@ static int amdgpu_info_ioctl(struct drm_device *dev, 
> void *data, struct drm_file
> if (adev->pm.dpm_enabled) {
> dev_info.max_engine_clock = amdgpu_dpm_get_sclk(adev, 
> false) * 10;
> dev_info.max_memory_clock = 
> amdgpu_dpm_get_mclk(adev, false) * 10;
> +   } else if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
> +  adev->virt.ops->get_pp_clk) {
> +   dev_info.max_engine_clock = 
> amdgpu_virt_get_sclk(adev, false) * 10;
> +   dev_info.max_memory_clock = 
> + amdgpu_virt_get_mclk(adev, false) * 10;
> } else {
> dev_info.max_engine_clock = adev->clock.default_sclk 
> * 10;
> dev_info.max_memory_clock = 
> adev->clock.default_mclk * 10; diff --git 
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index 5540259..0162d1e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -380,6 +380,17 @@ static ssize_t 
> amdgpu_set_dpm_forced_performance_level(struct device *dev,
> goto fail;
> }
>
> +if (amdgpu_sriov_vf(adev)) {
> +if (amdgim_is_hwperf(adev) &&
> +adev->virt.ops->force_dpm_level) {
> +mutex_lock(>pm.mutex);
> +adev->virt.ops->force_dpm_level(adev, level);
> +mutex_unlock(>pm.mutex);
> +return count;
> +} else
> +return -EINVAL;

Coding style.  If any clause has parens, all should.  E.g., this should be:

} else {
return -EINVAL;
}

With that fixed, this patch is:
Reviewed-by: Alex Deucher 
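
For clarity, the whole hunk with that brace style applied would look roughly
like this (reconstructed from the diff quoted above):

        if (amdgpu_sriov_vf(adev)) {
                if (amdgim_is_hwperf(adev) &&
                    adev->virt.ops->force_dpm_level) {
                        mutex_lock(&adev->pm.mutex);
                        adev->virt.ops->force_dpm_level(adev, level);
                        mutex_unlock(&adev->pm.mutex);
                        return count;
                } else {
                        return -EINVAL;
                }
        }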

> +}
> +
> if (current_level == level)
> return count;
>
> @@ -843,6 +854,10 @@ static ssize_t amdgpu_get_pp_dpm_sclk(struct device *dev,
> struct drm_device *ddev = dev_get_drvdata(dev);
> struct amdgpu_device *adev = ddev->dev_private;
>
> +   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
> +   adev->virt.ops->get_pp_clk)
> +   return adev->virt.ops->get_pp_clk(adev, PP_SCLK, buf);
> +
> if (is_support_sw_smu(adev))
> return smu_print_clk_levels(>smu, PP_SCLK, buf);
> else if (adev->powerplay.pp_funcs->print_clock_levels)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> inde

RE: [PATCH] drm/amdgpu: support dpm level modification under virtualization v2

2019-04-10 Thread Tao, Yintian
Hi  Christian

Many thanks for your review. I will correct the patch in v3.

Best Regards
Yintian Tao

-Original Message-
From: Christian König  
Sent: Wednesday, April 10, 2019 10:02 PM
To: Tao, Yintian ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: support dpm level modification under 
virtualization v2

On 10.04.19 at 15:02, Yintian Tao wrote:
> Under vega10 virtualization, the smu ip block will not be added.
> Therefore, we need to add pp clk query and force dpm level functions at 
> amdgpu_virt_ops to support the feature.
>
> v2: add get_pp_clk existence check and use kzalloc to allocate buf
>
> Change-Id: I713419c57b854082f6f739f1d32a055c7115e620
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  4 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 15 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 49 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   | 11 +
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  | 78 
> ++
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h  |  6 +++
>   7 files changed, 164 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 3ff8899..bb0fd5a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2486,6 +2486,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   mutex_init(>virt.vf_errors.lock);
>   hash_init(adev->mn_hash);
>   mutex_init(>lock_reset);
> + mutex_init(>virt.dpm_mutex);
>   
>   amdgpu_device_check_arguments(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 6190495..29ec28f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -727,6 +727,10 @@ static int amdgpu_info_ioctl(struct drm_device *dev, 
> void *data, struct drm_file
>   if (adev->pm.dpm_enabled) {
>   dev_info.max_engine_clock = amdgpu_dpm_get_sclk(adev, 
> false) * 10;
>   dev_info.max_memory_clock = amdgpu_dpm_get_mclk(adev, 
> false) * 
> 10;
> + } else if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
> +adev->virt.ops->get_pp_clk) {
> + dev_info.max_engine_clock = amdgpu_virt_get_sclk(adev, 
> false) * 10;
> + dev_info.max_memory_clock = amdgpu_virt_get_mclk(adev, 
> false) * 
> +10;
>   } else {
>   dev_info.max_engine_clock = adev->clock.default_sclk * 
> 10;
>   dev_info.max_memory_clock = adev->clock.default_mclk * 
> 10; diff 
> --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index 5540259..0162d1e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -380,6 +380,17 @@ static ssize_t 
> amdgpu_set_dpm_forced_performance_level(struct device *dev,
>   goto fail;
>   }
>   
> +if (amdgpu_sriov_vf(adev)) {
> +if (amdgim_is_hwperf(adev) &&
> +adev->virt.ops->force_dpm_level) {
> +mutex_lock(>pm.mutex);
> +adev->virt.ops->force_dpm_level(adev, level);
> +mutex_unlock(>pm.mutex);
> +return count;
> +} else
> +return -EINVAL;
> +}
> +
>   if (current_level == level)
>   return count;
>   
> @@ -843,6 +854,10 @@ static ssize_t amdgpu_get_pp_dpm_sclk(struct device *dev,
>   struct drm_device *ddev = dev_get_drvdata(dev);
>   struct amdgpu_device *adev = ddev->dev_private;
>   
> + if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
> + adev->virt.ops->get_pp_clk)
> + return adev->virt.ops->get_pp_clk(adev, PP_SCLK, buf);
> +
>   if (is_support_sw_smu(adev))
>   return smu_print_clk_levels(>smu, PP_SCLK, buf);
>   else if (adev->powerplay.pp_funcs->print_clock_levels)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> index 462a04e..efdb6b7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> @@ -375,4 +375,53 @@ void amdgpu_virt_init_data_exchange(struct amdgpu_device 
> *adev)
>   }

RE: [PATCH] drm/amdgpu: support dpm level modification under virtualization

2019-04-10 Thread Tao, Yintian
Hi  Christian


Many thanks.

The reason is that the reserved buffer for pf2vf communication is allocated 
from visible VRAM, and the allocation granularity there is page size; see 
amdgpu_ttm_fw_reserve_vram_init.


Best Regards
Yintian Tao
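
A minimal sketch of the kzalloc approach under discussion, assuming a
hypothetical helper that copies the page-sized clock table received over the
pf2vf exchange region; the helper name and the fill step are illustrative:

static ssize_t example_get_pp_clk(struct amdgpu_device *adev, char *out)
{
        char *buf;
        ssize_t len;

        /* PAGE_SIZE matches the pf2vf exchange granularity; allocate it on
         * the heap instead of burning several KB of kernel stack */
        buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
        if (!buf)
                return -ENOMEM;

        /* ... fill buf with the clock levels sent by the hypervisor ... */

        len = snprintf(out, PAGE_SIZE, "%s", buf);
        kfree(buf);
        return len;
}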


-Original Message-
From: Koenig, Christian  
Sent: Wednesday, April 10, 2019 8:23 PM
To: Tao, Yintian ; Quan, Evan ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: support dpm level modification under 
virtualization

Hi Yintian,

Yeah, kzalloc would obviously work.

But why do you need such a large buffer in the first place?

Rule of thumb is that each function should not use more than 1KB of stack, 
otherwise the compiler will raise a warning.

Christian.

On 10.04.19 at 14:09, Tao, Yintian wrote:
> Hi  Christian
>
>
> Thanks for your review. May I use kzalloc to allocate the memory for the 
> buffer to avoid the stack problem you mentioned?
>
> The hypervisor driver will transfer the message through this page-sized 
> memory.
>
> Best Regards
> Yintian Tao
>
> -Original Message-
> From: Christian König 
> Sent: Wednesday, April 10, 2019 6:32 PM
> To: Quan, Evan ; Tao, Yintian 
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: support dpm level modification under 
> virtualization
>
> Am 10.04.19 um 11:58 schrieb Quan, Evan:
>>> -Original Message-
>>> From: amd-gfx  On Behalf Of 
>>> Yintian Tao
>>> Sent: 2019年4月9日 23:18
>>> To: amd-gfx@lists.freedesktop.org
>>> Cc: Tao, Yintian 
>>> Subject: [PATCH] drm/amdgpu: support dpm level modification under 
>>> virtualization
>>>
>>> Under vega10 virtualization, the smu ip block will not be added.
>>> Therefore, we need to add pp clk query and force dpm level functions at 
>>> amdgpu_virt_ops to support the feature.
>>>
>>> Change-Id: I713419c57b854082f6f739f1d32a055c7115e620
>>> Signed-off-by: Yintian Tao 
>>> ---
>>>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
>>>drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  3 ++
>>>drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 15 ++
>>>drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 33 +
>>>drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   | 11 +
>>>drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  | 78
>>> ++
>>>drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h  |  6 +++
>>>7 files changed, 147 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 3ff8899..bb0fd5a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2486,6 +2486,7 @@ int amdgpu_device_init(struct amdgpu_device 
>>> *adev,
>>> mutex_init(>virt.vf_errors.lock);
>>> hash_init(adev->mn_hash);
>>> mutex_init(>lock_reset);
>>> +   mutex_init(>virt.dpm_mutex);
>>>
>>> amdgpu_device_check_arguments(adev);
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> index 6190495..1353955 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> @@ -727,6 +727,9 @@ static int amdgpu_info_ioctl(struct drm_device 
>>> *dev, void *data, struct drm_file
>>> if (adev->pm.dpm_enabled) {
>>> dev_info.max_engine_clock =
>>> amdgpu_dpm_get_sclk(adev, false) * 10;
>>> dev_info.max_memory_clock =
>>> amdgpu_dpm_get_mclk(adev, false) * 10;
>>> +   } else if (amdgpu_sriov_vf(adev)) {
>>> +   dev_info.max_engine_clock =
>>> amdgpu_virt_get_sclk(adev, false) * 10;
>>> +   dev_info.max_memory_clock =
>>> amdgpu_virt_get_mclk(adev, false) * 10;
>>> } else {
>>> dev_info.max_engine_clock = adev-
>>>> clock.default_sclk * 10;
>>> dev_info.max_memory_clock = adev-
>>>> clock.default_mclk * 10; diff --git
>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
>>> index 5540259..0162d1e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
>>> @@ -380,6 +380,17 @@ static ssize_t
>>> amdgpu_set_dpm_forced_performance_level(struct device *dev,
>>>  

RE: [PATCH] drm/amdgpu: support dpm level modification under virtualization

2019-04-10 Thread Tao, Yintian
Hi  Christian


Thanks for your review. May I use kzalloc to allocate the memory for the 
buffer to avoid the stack problem you mentioned?

The hypervisor driver will transfer the message through this page-sized 
memory.

Best Regards
Yintian Tao

-Original Message-
From: Christian König  
Sent: Wednesday, April 10, 2019 6:32 PM
To: Quan, Evan ; Tao, Yintian ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: support dpm level modification under 
virtualization

On 10.04.19 at 11:58, Quan, Evan wrote:
>
>> -Original Message-
>> From: amd-gfx  On Behalf Of 
>> Yintian Tao
>> Sent: 2019年4月9日 23:18
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Tao, Yintian 
>> Subject: [PATCH] drm/amdgpu: support dpm level modification under 
>> virtualization
>>
>> Under vega10 virtualization, the smu ip block will not be added.
>> Therefore, we need to add pp clk query and force dpm level functions at 
>> amdgpu_virt_ops to support the feature.
>>
>> Change-Id: I713419c57b854082f6f739f1d32a055c7115e620
>> Signed-off-by: Yintian Tao 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  3 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 15 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 33 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   | 11 +
>>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  | 78
>> ++
>>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h  |  6 +++
>>   7 files changed, 147 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 3ff8899..bb0fd5a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2486,6 +2486,7 @@ int amdgpu_device_init(struct amdgpu_device 
>> *adev,
>>  mutex_init(>virt.vf_errors.lock);
>>  hash_init(adev->mn_hash);
>>  mutex_init(>lock_reset);
>> +mutex_init(>virt.dpm_mutex);
>>
>>  amdgpu_device_check_arguments(adev);
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> index 6190495..1353955 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> @@ -727,6 +727,9 @@ static int amdgpu_info_ioctl(struct drm_device 
>> *dev, void *data, struct drm_file
>>  if (adev->pm.dpm_enabled) {
>>  dev_info.max_engine_clock =
>> amdgpu_dpm_get_sclk(adev, false) * 10;
>>  dev_info.max_memory_clock =
>> amdgpu_dpm_get_mclk(adev, false) * 10;
>> +} else if (amdgpu_sriov_vf(adev)) {
>> +dev_info.max_engine_clock =
>> amdgpu_virt_get_sclk(adev, false) * 10;
>> +dev_info.max_memory_clock =
>> amdgpu_virt_get_mclk(adev, false) * 10;
>>  } else {
>>  dev_info.max_engine_clock = adev-
>>> clock.default_sclk * 10;
>>  dev_info.max_memory_clock = adev-
>>> clock.default_mclk * 10; diff --git
>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
>> index 5540259..0162d1e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
>> @@ -380,6 +380,17 @@ static ssize_t
>> amdgpu_set_dpm_forced_performance_level(struct device *dev,
>>  goto fail;
>>  }
>>
>> +if (amdgpu_sriov_vf(adev)) {
>> +if (amdgim_is_hwperf(adev) &&
>> +adev->virt.ops->force_dpm_level) {
>> +mutex_lock(>pm.mutex);
>> +adev->virt.ops->force_dpm_level(adev, level);
>> +mutex_unlock(>pm.mutex);
>> +return count;
>> +} else
>> +return -EINVAL;
>> +}
>> +
>>  if (current_level == level)
>>  return count;
>>
>> @@ -843,6 +854,10 @@ static ssize_t amdgpu_get_pp_dpm_sclk(struct 
>> device *dev,
>>  struct drm_device *ddev = dev_get_drvdata(dev);
>>  struct amdgpu_device *adev = ddev->dev_private;
>>
>> +if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
>> +adev->virt.ops->get_pp_clk)
>> +return adev->virt.ops->get_pp_clk(ade

RE: [PATCH] drm/amdgpu: support dpm level modification under virtualization

2019-04-10 Thread Tao, Yintian
Hi  Evan 

Many thanks for your review.

I will submit a v2 patch according to your comments.

Best Regards
Yintian Tao

-Original Message-
From: amd-gfx  On Behalf Of Quan, Evan
Sent: Wednesday, April 10, 2019 5:58 PM
To: Tao, Yintian ; amd-gfx@lists.freedesktop.org
Cc: Tao, Yintian 
Subject: RE: [PATCH] drm/amdgpu: support dpm level modification under 
virtualization



> -Original Message-
> From: amd-gfx  On Behalf Of 
> Yintian Tao
> Sent: 2019年4月9日 23:18
> To: amd-gfx@lists.freedesktop.org
> Cc: Tao, Yintian 
> Subject: [PATCH] drm/amdgpu: support dpm level modification under 
> virtualization
> 
> Under vega10 virtualization, the smu ip block will not be added.
> Therefore, we need to add pp clk query and force dpm level functions to
> amdgpu_virt_ops to support the feature.
> 
> Change-Id: I713419c57b854082f6f739f1d32a055c7115e620
> Signed-off-by: Yintian Tao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  3 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 15 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 33 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   | 11 +
>  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  | 78
> ++
>  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h  |  6 +++
>  7 files changed, 147 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 3ff8899..bb0fd5a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2486,6 +2486,7 @@ int amdgpu_device_init(struct amdgpu_device 
> *adev,
>   mutex_init(&adev->virt.vf_errors.lock);
>   hash_init(adev->mn_hash);
>   mutex_init(&adev->lock_reset);
> + mutex_init(&adev->virt.dpm_mutex);
> 
>   amdgpu_device_check_arguments(adev);
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 6190495..1353955 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -727,6 +727,9 @@ static int amdgpu_info_ioctl(struct drm_device 
> *dev, void *data, struct drm_file
>   if (adev->pm.dpm_enabled) {
>   dev_info.max_engine_clock =
> amdgpu_dpm_get_sclk(adev, false) * 10;
>   dev_info.max_memory_clock =
> amdgpu_dpm_get_mclk(adev, false) * 10;
> + } else if (amdgpu_sriov_vf(adev)) {
> + dev_info.max_engine_clock =
> amdgpu_virt_get_sclk(adev, false) * 10;
> + dev_info.max_memory_clock =
> amdgpu_virt_get_mclk(adev, false) * 10;
>   } else {
>   dev_info.max_engine_clock = adev-
> >clock.default_sclk * 10;
>   dev_info.max_memory_clock = adev-
> >clock.default_mclk * 10; diff --git
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index 5540259..0162d1e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -380,6 +380,17 @@ static ssize_t
> amdgpu_set_dpm_forced_performance_level(struct device *dev,
>   goto fail;
>   }
> 
> +if (amdgpu_sriov_vf(adev)) {
> +if (amdgim_is_hwperf(adev) &&
> +adev->virt.ops->force_dpm_level) {
> +mutex_lock(&adev->pm.mutex);
> +adev->virt.ops->force_dpm_level(adev, level);
> +mutex_unlock(&adev->pm.mutex);
> +return count;
> +} else
> +return -EINVAL;
> +}
> +
>   if (current_level == level)
>   return count;
> 
> @@ -843,6 +854,10 @@ static ssize_t amdgpu_get_pp_dpm_sclk(struct 
> device *dev,
>   struct drm_device *ddev = dev_get_drvdata(dev);
>   struct amdgpu_device *adev = ddev->dev_private;
> 
> + if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
> + adev->virt.ops->get_pp_clk)
> + return adev->virt.ops->get_pp_clk(adev, PP_SCLK, buf);
> +
>   if (is_support_sw_smu(adev))
>   return smu_print_clk_levels(&adev->smu, PP_SCLK, buf);
>   else if (adev->powerplay.pp_funcs->print_clock_levels)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> index 462a04e..ae4b2a1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> @@ -375,4 +375,37 @@ void amdgpu_virt_init_data_

RE: [PATCH] drm/amdgpu: support dpm level modification under virtualization

2019-04-10 Thread Tao, Yintian
Hi  Christian

Many thanks. I got it.

Hi  Alex

Can you help review it? Many thanks in advance.


Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian  
Sent: Wednesday, April 10, 2019 4:44 PM
To: Tao, Yintian 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: support dpm level modification under 
virtualization

Hi Yintian,

sorry but this is power management and that is not something I'm very familiar 
with.

I can only say that at least on first glance the coding style looks good to me 
:)

Probably best to wait for Alex to wake up.

Regards,
Christian.

On 10.04.19 at 10:38, Tao, Yintian wrote:
> Ping
>
>
> Hi Christian
>
>
> Can you help review it? Thanks in advance.
>
> Best Regards
> Yintian Tao
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Yintian Tao
> Sent: Tuesday, April 09, 2019 11:18 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Tao, Yintian 
> Subject: [PATCH] drm/amdgpu: support dpm level modification under 
> virtualization
>
> Under vega10 virtualization, the smu ip block will not be added.
> Therefore, we need to add pp clk query and force dpm level functions to
> amdgpu_virt_ops to support the feature.
>
> Change-Id: I713419c57b854082f6f739f1d32a055c7115e620
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  3 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 15 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 33 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   | 11 +
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  | 78 
> ++
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h  |  6 +++
>   7 files changed, 147 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 3ff8899..bb0fd5a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2486,6 +2486,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   mutex_init(&adev->virt.vf_errors.lock);
>   hash_init(adev->mn_hash);
>   mutex_init(&adev->lock_reset);
> + mutex_init(&adev->virt.dpm_mutex);
>   
>   amdgpu_device_check_arguments(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 6190495..1353955 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -727,6 +727,9 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void 
> *data, struct drm_file
>   if (adev->pm.dpm_enabled) {
>   dev_info.max_engine_clock = amdgpu_dpm_get_sclk(adev, 
> false) * 10;
>   dev_info.max_memory_clock = amdgpu_dpm_get_mclk(adev, 
> false) * 
> 10;
> + } else if (amdgpu_sriov_vf(adev)) {
> + dev_info.max_engine_clock = amdgpu_virt_get_sclk(adev, 
> false) * 10;
> + dev_info.max_memory_clock = amdgpu_virt_get_mclk(adev, 
> false) * 
> +10;
>   } else {
>   dev_info.max_engine_clock = adev->clock.default_sclk * 
> 10;
>   dev_info.max_memory_clock = adev->clock.default_mclk * 
> 10; diff 
> --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index 5540259..0162d1e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -380,6 +380,17 @@ static ssize_t 
> amdgpu_set_dpm_forced_performance_level(struct device *dev,
>   goto fail;
>   }
>   
> +if (amdgpu_sriov_vf(adev)) {
> +if (amdgim_is_hwperf(adev) &&
> +adev->virt.ops->force_dpm_level) {
> +mutex_lock(&adev->pm.mutex);
> +adev->virt.ops->force_dpm_level(adev, level);
> +mutex_unlock(&adev->pm.mutex);
> +return count;
> +} else
> +return -EINVAL;
> +}
> +
>   if (current_level == level)
>   return count;
>   
> @@ -843,6 +854,10 @@ static ssize_t amdgpu_get_pp_dpm_sclk(struct device *dev,
>   struct drm_device *ddev = dev_get_drvdata(dev);
>   struct amdgpu_device *adev = ddev->dev_private;
>   
> + if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
> + adev->virt.ops->get_pp_clk)
> + return adev->virt.ops->get_pp_clk(adev, PP_SCLK, buf);
> +
>   if (is_support_sw

RE: [PATCH] drm/amdgpu: support dpm level modification under virtualization

2019-04-10 Thread Tao, Yintian
Ping


Hi Christian 


Can you help review it? Thanks in advance.

Best Regards
Yintian Tao

-Original Message-
From: amd-gfx  On Behalf Of Yintian Tao
Sent: Tuesday, April 09, 2019 11:18 PM
To: amd-gfx@lists.freedesktop.org
Cc: Tao, Yintian 
Subject: [PATCH] drm/amdgpu: support dpm level modification under virtualization

Under vega10 virtualization, the smu ip block will not be added.
Therefore, we need to add pp clk query and force dpm level functions to
amdgpu_virt_ops to support the feature.

Change-Id: I713419c57b854082f6f739f1d32a055c7115e620
Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 15 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 33 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   | 11 +
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  | 78 ++
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h  |  6 +++
 7 files changed, 147 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 3ff8899..bb0fd5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2486,6 +2486,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	mutex_init(&adev->virt.vf_errors.lock);
 	hash_init(adev->mn_hash);
 	mutex_init(&adev->lock_reset);
+	mutex_init(&adev->virt.dpm_mutex);
 
amdgpu_device_check_arguments(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 6190495..1353955 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -727,6 +727,9 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void 
*data, struct drm_file
if (adev->pm.dpm_enabled) {
dev_info.max_engine_clock = amdgpu_dpm_get_sclk(adev, 
false) * 10;
dev_info.max_memory_clock = amdgpu_dpm_get_mclk(adev, 
false) * 10;
+   } else if (amdgpu_sriov_vf(adev)) {
+   dev_info.max_engine_clock = amdgpu_virt_get_sclk(adev, 
false) * 10;
+   dev_info.max_memory_clock = amdgpu_virt_get_mclk(adev, 
false) * 10;
} else {
dev_info.max_engine_clock = adev->clock.default_sclk * 
10;
dev_info.max_memory_clock = adev->clock.default_mclk * 
10; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index 5540259..0162d1e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -380,6 +380,17 @@ static ssize_t 
amdgpu_set_dpm_forced_performance_level(struct device *dev,
goto fail;
}
 
+if (amdgpu_sriov_vf(adev)) {
+if (amdgim_is_hwperf(adev) &&
+adev->virt.ops->force_dpm_level) {
+mutex_lock(&adev->pm.mutex);
+adev->virt.ops->force_dpm_level(adev, level);
+mutex_unlock(&adev->pm.mutex);
+return count;
+} else
+return -EINVAL;
+}
+
if (current_level == level)
return count;
 
@@ -843,6 +854,10 @@ static ssize_t amdgpu_get_pp_dpm_sclk(struct device *dev,
struct drm_device *ddev = dev_get_drvdata(dev);
struct amdgpu_device *adev = ddev->dev_private;
 
+   if (amdgpu_sriov_vf(adev) && amdgim_is_hwperf(adev) &&
+   adev->virt.ops->get_pp_clk)
+   return adev->virt.ops->get_pp_clk(adev, PP_SCLK, buf);
+
if (is_support_sw_smu(adev))
		return smu_print_clk_levels(&adev->smu, PP_SCLK, buf);
else if (adev->powerplay.pp_funcs->print_clock_levels)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 462a04e..ae4b2a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -375,4 +375,37 @@ void amdgpu_virt_init_data_exchange(struct amdgpu_device 
*adev)
}
 }
 
+static uint32_t parse_clk(char *buf, bool min) {
+char *ptr = buf;
+uint32_t clk = 0;
+
+do {
+ptr = strchr(ptr, ':');
+if (!ptr)
+break;
+ptr+=2;
+clk = simple_strtoul(ptr, NULL, 10);
+} while (!min);
+
+return clk * 100;
+}
+
+uint32_t amdgpu_virt_get_sclk(struct amdgpu_device *adev, bool lowest) 
+{
+char buf[512] = {0};
+
+adev->virt.ops->get_pp_clk(adev, PP_SCLK, buf);
+
+return parse_clk(buf, lowest);
+}
+
+uint32_t amdgpu_virt_get_mclk(struct amdgpu_device *adev, bool lowest) 
+{
+char buf[512] = {0
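A note on how the parse_clk()/get_pp_clk() pair above is meant to behave, as a
minimal sketch. It assumes the host answers in the same "index: <clock>Mhz"
text layout that the pp_dpm_* files use; the exact reply format is defined by
the host side, so the sample string below is an assumption, not driver output.

	/* Sketch only; the reply layout is an assumption. */
	char reply[512] = {0};

	adev->virt.ops->get_pp_clk(adev, PP_SCLK, reply);
	/* reply might now hold, for example: "0: 300Mhz\n1: 1138Mhz *\n"
	 * parse_clk(reply, true)  stops after the first ':' and returns  300 * 100 (lowest level)
	 * parse_clk(reply, false) walks every ':' and returns 1138 * 100 (highest level)
	 */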

RE: [PATCH] drm/amdgpu: move set pg state to suspend phase2

2018-08-22 Thread Tao, Yintian
Please ignore this patch. I will re-submit it with a better solution.

-Original Message-
From: Yintian Tao [mailto:yt...@amd.com] 
Sent: Wednesday, August 22, 2018 2:50 PM
To: amd-gfx@lists.freedesktop.org
Cc: Tao, Yintian 
Subject: [PATCH] drm/amdgpu: move set pg state to suspend phase2

Under virtualization, we have to acquire full access to the GPU at suspend
phase2 due to some special register accesses. In order to guarantee this, we
should move setting the pg and cg state to suspend phase2 so that the register
accesses happen within one full-access lifecycle.

Signed-off-by: Yintian Tao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 42 ++
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c9557d9..2d95769 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1713,10 +1713,11 @@ static int amdgpu_device_set_cg_state(struct 
amdgpu_device *adev,
i = state == AMD_CG_STATE_GATE ? j : adev->num_ip_blocks - j - 
1;
if (!adev->ip_blocks[i].status.valid)
continue;
-   /* skip CG for VCE/UVD, it's handled specially */
+   /* skip CG for VCE/UVD and DCE, it's handled specially */
if (adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_UVD &&
adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_VCE &&
adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_VCN &&
+   adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_DCE &&
adev->ip_blocks[i].version->funcs->set_clockgating_state) {
/* enable clockgating to save power */
r = 
adev->ip_blocks[i].version->funcs->set_clockgating_state((void *)adev, @@ 
-1743,10 +1744,11 @@ static int amdgpu_device_set_pg_state(struct amdgpu_device 
*adev, enum amd_power
i = state == AMD_PG_STATE_GATE ? j : adev->num_ip_blocks - j - 
1;
if (!adev->ip_blocks[i].status.valid)
continue;
-   /* skip CG for VCE/UVD, it's handled specially */
+   /* skip CG for VCE/UVD and DCE, it's handled specially */
if (adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_UVD &&
adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_VCE &&
adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_VCN &&
+   adev->ip_blocks[i].version->type != AMD_IP_BLOCK_TYPE_DCE &&
adev->ip_blocks[i].version->funcs->set_powergating_state) {
/* enable powergating to save power */
r = 
adev->ip_blocks[i].version->funcs->set_powergating_state((void *)adev, @@ 
-1932,17 +1934,29 @@ static int amdgpu_device_ip_suspend_phase1(struct 
amdgpu_device *adev)  {
int i, r;
 
-   if (amdgpu_sriov_vf(adev))
-   amdgpu_virt_request_full_gpu(adev, false);
-
-   amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
-   amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
-
for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
if (!adev->ip_blocks[i].status.valid)
continue;
/* displays are handled separately */
if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_DCE) {
+   if 
(adev->ip_blocks[i].version->funcs->set_powergating_state) {
+   r = 
adev->ip_blocks[i].version->funcs->set_powergating_state((void *)adev,
+   
 AMD_PG_STATE_UNGATE);
+   if (r) {
+   DRM_ERROR("set_powergating_state(gate) 
of IP block <%s> failed %d\n",
+ 
adev->ip_blocks[i].version->funcs->name, r);
+   return r;
+   }
+   }
+   /* ungate blocks so that suspend can properly shut them 
down */
+   if 
(adev->ip_blocks[i].version->funcs->set_clockgating_state) {
+   r = 
adev->ip_blocks[i].version->funcs->set_clockgating_state((void *)adev,
+   
 AMD_CG_STATE_UNGATE);
+   if (r) {
+   
DRM_ERROR("set_clockgating_state(ungate) of IP block <%s> failed %d\n"
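A note on the "one full-access lifecycle" wording in the commit message above,
as a minimal sketch of the constraint: under SR-IOV the VF must bracket
privileged register access with the request/release helpers, and splitting the
programming across suspend phase1 and phase2 breaks that bracket. The release
call below is the matching helper assumed from the request call visible in the
diff; this is a sketch, not the actual suspend path.

	amdgpu_virt_request_full_gpu(adev, false);   /* enter the full-access window */
	/* ... ungate PG/CG and do the suspend-time register programming here ... */
	amdgpu_virt_release_full_gpu(adev, false);   /* leave the full-access window */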

RE: [PATCH] drm/amd/powerplay: fix typo error for '3be7be08ac'

2018-01-04 Thread Tao, Yintian
Hi  Christian

Thanks for your review. I have submitted it.


Best Regards
Yintian Tao

-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] 
Sent: Thursday, January 04, 2018 5:35 PM
To: Zhu, Rex <rex@amd.com>; Tao, Yintian <yintian@amd.com>; 
amd-gfx@lists.freedesktop.org
Cc: Zhou, David(ChunMing) <david1.z...@amd.com>
Subject: Re: [PATCH] drm/amd/powerplay: fix typo error for '3be7be08ac'

Reviewed-by: Christian König <christian.koe...@amd.com>

Please commit ASAP since this is breaking builds, Christian.
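For context, the build break comes from the missing '*' in the hunk quoted
below: the right-hand side is a pointer cast, while the left-hand side declares
a structure object, and C refuses to initialize a struct from a pointer value.
A minimal standalone illustration with hypothetical names, not the driver code:

	struct foo { int x; };

	static void example(void *opaque_backend)
	{
		struct foo  bad  = (struct foo *)opaque_backend; /* error: invalid initializer */
		struct foo *good = (struct foo *)opaque_backend; /* fine: pointer to pointer   */
	}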

On 04.01.2018 at 10:11, Zhu, Rex wrote:
> Reviewed-by: Rex Zhu <rex@amd.com>
>
> Best Regards
> Rex
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf 
> Of Yintian Tao
> Sent: Thursday, January 04, 2018 5:05 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhou, David(ChunMing); Tao, Yintian
> Subject: [PATCH] drm/amd/powerplay: fix typo error for '3be7be08ac'
>
> Due to a typo, it causes a compile error, so fix it.
> Change-Id: Iabe7158e08e6aef155ca3394cafc6eb4256a0030
> Signed-off-by: Yintian Tao <yt...@amd.com>
> ---
>   drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
> index 7dc4cee..25dd778 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
> @@ -648,7 +648,7 @@ int smu7_init(struct pp_hwmgr *hwmgr)
>   
>   int smu7_smu_fini(struct pp_hwmgr *hwmgr)  {
> - struct smu7_smumgr smu_data = (struct smu7_smumgr 
> *)(hwmgr->smu_backend);
> + struct smu7_smumgr *smu_data = (struct smu7_smumgr 
> +*)(hwmgr->smu_backend);
>   
>   smu_free_memory(hwmgr->device, smu_data->header_buffer.handle);
>   if (!cgs_is_virtualization_enabled(hwmgr->device))
> --
> 2.7.4
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amd/powerplay: fix memory leakage when reload

2018-01-03 Thread Tao, Yintian
Hi Alex

Thanks a lot. I got it.

Best Regards
Yintian Tao

From: Deucher, Alexander
Sent: Wednesday, January 03, 2018 10:32 PM
To: Tao, Yintian <yintian@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/powerplay: fix memory leakage when reload


Did you see my reply yesterday?  I reviewed it.  I also think we need to fix up 
cz, rv, and vg10.


From: Tao, Yintian
Sent: Tuesday, January 2, 2018 9:22:23 PM
To: Tao, Yintian; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Deucher, 
Alexander
Subject: RE: [PATCH] drm/amd/powerplay: fix memory leakage when reload

Add Alex

-Original Message-
From: Yintian Tao [mailto:yt...@amd.com]
Sent: Monday, January 01, 2018 11:16 AM
To: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Cc: Tao, Yintian <yintian@amd.com<mailto:yintian@amd.com>>
Subject: [PATCH] drm/amd/powerplay: fix memory leakage when reload

add smu_free_memory when smu fini to prevent memory leakage

Change-Id: Id9103d8b54869b63f22a9af53d9fbc3b7a221191
Signed-off-by: Yintian Tao <yt...@amd.com<mailto:yt...@amd.com>>
---
 drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index c49a6f2..925217e 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -607,6 +607,12 @@ int smu7_init(struct pp_smumgr *smumgr)

 int smu7_smu_fini(struct pp_smumgr *smumgr)  {
+   struct smu7_smumgr *smu_data = (struct smu7_smumgr
+*)(smumgr->backend);
+
+   smu_free_memory(smumgr->device, smu_data->header_buffer.handle);
+   if (!cgs_is_virtualization_enabled(smumgr->device))
+   smu_free_memory(smumgr->device, smu_data->smu_buffer.handle);
+
 if (smumgr->backend) {
 kfree(smumgr->backend);
 smumgr->backend = NULL;
--
2.7.4
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amd/powerplay: fix memory leakage when reload

2018-01-02 Thread Tao, Yintian
Add Alex

-Original Message-
From: Yintian Tao [mailto:yt...@amd.com] 
Sent: Monday, January 01, 2018 11:16 AM
To: amd-gfx@lists.freedesktop.org
Cc: Tao, Yintian <yintian@amd.com>
Subject: [PATCH] drm/amd/powerplay: fix memory leakage when reload

add smu_free_memory when smu fini to prevent memory leakage

Change-Id: Id9103d8b54869b63f22a9af53d9fbc3b7a221191
Signed-off-by: Yintian Tao <yt...@amd.com>
---
 drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index c49a6f2..925217e 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -607,6 +607,12 @@ int smu7_init(struct pp_smumgr *smumgr)
 
 int smu7_smu_fini(struct pp_smumgr *smumgr)  {
+   struct smu7_smumgr *smu_data = (struct smu7_smumgr 
+*)(smumgr->backend);
+
+   smu_free_memory(smumgr->device, smu_data->header_buffer.handle);
+   if (!cgs_is_virtualization_enabled(smumgr->device))
+   smu_free_memory(smumgr->device, smu_data->smu_buffer.handle);
+
if (smumgr->backend) {
kfree(smumgr->backend);
smumgr->backend = NULL;
--
2.7.4
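The frees added above mirror allocations made earlier in smu7_init(); the
sketch below shows the intended pairing and why the
cgs_is_virtualization_enabled() check appears on the free path as well. It is a
schematic summary assumed from the patch, not a verbatim quote of smu7_init().

	/* Schematic pairing (assumption, not quoted from smu7_init()):
	 *
	 *   smu7_init():     allocates smu_data->header_buffer.handle
	 *                    allocates smu_data->smu_buffer.handle (bare metal only)
	 *
	 *   smu7_smu_fini(): frees header_buffer.handle unconditionally and
	 *                    smu_buffer.handle only when not virtualized,
	 *                    matching the condition used at init time.
	 */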

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: Fix no irq process when evict vram

2017-12-13 Thread Tao, Yintian
Hi  Lothian


First of all, thanks for your review.

No, this patch achieves the same goal for that issue, but it addresses the 
root cause of the fence timeout.
The patch 
b9141cd3<https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.16-wip=b9141cd3930e390f156739829ca9589fda7926e4>
 is the workaround for the issue, and I think the “shutdown” variable 
assignment is better located after amdgpu_fini() to ensure no irq is missed.

Best Regards
Yintian Tao


From: Mike Lothian [mailto:m...@fireburn.co.uk]
Sent: Wednesday, December 13, 2017 7:23 PM
To: Tao, Yintian <yintian@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Fix no irq process when evict vram

Is this a follow on to 
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.16-wip=b9141cd3930e390f156739829ca9589fda7926e4

On Wed, 13 Dec 2017 at 07:11 Yintian Tao <yt...@amd.com<mailto:yt...@amd.com>> 
wrote:
When unloading the amdgpu driver we use sdma to evict vram, but there is no
irq processing after the sdma work completes, so waiting for the fence costs
2s, which triggers a VFLR under SRIOV and finally makes the driver unload
fail. The reason is that the shutdown variable in adev is set to true before
evicting vram, which makes the ISR return directly without processing.
Therefore, we need to set the variable after evicting vram.

Change-Id: I7bf75481aa0744b99c41672b49670adc70b478bd
Signed-off-by: Yintian Tao <yt...@amd.com<mailto:yt...@amd.com>>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a269bbc..80934ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2458,7 +2458,6 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
int r;

DRM_INFO("amdgpu: finishing device.\n");
-   adev->shutdown = true;
if (adev->mode_info.mode_config_initialized)
drm_crtc_force_disable_all(adev->ddev);

@@ -2466,6 +2465,7 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
amdgpu_fence_driver_fini(adev);
amdgpu_fbdev_fini(adev);
r = amdgpu_fini(adev);
+   adev->shutdown = true;
if (adev->firmware.gpu_info_fw) {
release_firmware(adev->firmware.gpu_info_fw);
adev->firmware.gpu_info_fw = NULL;
--
2.7.4
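The ordering matters because the interrupt path bails out once adev->shutdown
is set, so the fence written by the eviction SDMA job is never signalled to
the waiter. A minimal sketch of that early return, with a simplified handler
shape rather than the actual amdgpu IRQ code:

	/* Simplified sketch, not the real handler. */
	static irqreturn_t example_irq_handler(struct amdgpu_device *adev)
	{
		if (adev->shutdown)
			return IRQ_NONE;  /* flag set too early: eviction fence irq is dropped */

		/* normal IH ring processing would signal the SDMA fence here */
		return IRQ_HANDLED;
	}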

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [virtual display] code review

2017-01-24 Thread Tao, Yintian
Hi  Christian

Thanks for your guidance. I will follow it next time.


Best Regards
Yintian Tao

-Original Message-
From: Koenig, Christian 
Sent: Tuesday, January 24, 2017 5:31 PM
To: Tao, Yintian <yintian@amd.com>; Deucher, Alexander 
<alexander.deuc...@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [virtual display] code review

For public mailing lists please use text format only, not HTML mail.

Apart from that the patch looks good to me and is Reviewed-by: Christian König 
<christian.koe...@amd.com>.

Regards,
Christian.

On 24.01.2017 at 10:19, Tao, Yintian wrote:
>
> Hi  Christian
>
> Please help to review this patch. The whole story is as follows:
>
> In the pass-through case, the amdgpu module may be built into one image 
> with a specified BDF parameter,
>
> which causes virtual display creation to fail when the slot the GPU is 
> plugged into changes.
>
> Therefore adding the new parameter “all” for virtual display enable 
> will fix this issue. Thanks a lot.
>
> Best Regards
>
> Yintian Tao
>

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[virtual display] code review

2017-01-24 Thread Tao, Yintian
Hi  Christian


Please help to review this patch. The whole story is as follows:

In the pass-through case, the amdgpu module may be built into one image with a 
specified BDF parameter,

which causes virtual display creation to fail when the slot the GPU is plugged 
into changes.

Therefore adding the new parameter “all” for virtual display enable will fix 
this issue. Thanks a lot.


Best Regards
Yintian Tao



0001-drm-amdgpu-add-new-virtual-display-ID.patch
Description: 0001-drm-amdgpu-add-new-virtual-display-ID.patch
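Since the change itself is attached rather than inlined, here is a rough sketch
of the idea behind the new keyword. It assumes the existing
amdgpu_virtual_display module parameter is still consulted per device; the
control flow below is illustrative, not the contents of the attached patch.

	/* Illustrative only; not the attached patch. */
	if (amdgpu_virtual_display && strcmp(amdgpu_virtual_display, "all") == 0) {
		/* Enable virtual display on every probed device, so a parameter
		 * baked into the image keeps working after the GPU moves to a
		 * different PCI slot (different BDF).
		 */
		adev->enable_virtual_display = true;
	} else {
		/* otherwise fall back to matching this device's BDF against the list */
	}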
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


remove static integer at uvd power gate to fix two gpu core boot up

2016-12-21 Thread Tao, Yintian
Hi  Alex


Please help review it. Thanks a lot.


Best Regards
Yintian Tao


0001-drm-amdgpu-remove-static-integer-for-uvd-pp-state.patch
Description: 0001-drm-amdgpu-remove-static-integer-for-uvd-pp-state.patch
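The subject line points at a classic dual-GPU pitfall: a function-local static
caches power-gate state, so the second device sees the value cached by the
first device and skips the real programming. The snippet below is a
hypothetical illustration of that pattern, not the attached patch.

	/* Hypothetical pattern, not the attached patch. */
	static int set_uvd_pg_state(struct amdgpu_device *adev, int state)
	{
		static int cur_state = -1;  /* shared by all devices: the bug */

		if (cur_state == state)
			return 0;           /* second GPU wrongly skips the change */

		cur_state = state;
		/* program this adev's UVD power gating here */
		return 0;
	}

The fix direction is to drop the static, or keep the cached state per device in adev.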
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx