Am 14.08.2018 um 10:13 schrieb Deng, Emily:
-----Original Message-----
From: Christian König <ckoenig.leichtzumer...@gmail.com>
Sent: Tuesday, August 14, 2018 3:54 PM
To: Deng, Emily <emily.d...@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu/sriov: For sriov runtime, use kiq to do
invalidate tlb

Am 14.08.2018 um 09:46 schrieb Emily Deng:
To avoid the tlb flush not interrupted by world switch, use kiq and
one command to do tlb invalidate.
Well NAK, this just duplicates the TLB handling and moves it outside of the GMC
code.
No, it not duplicates the TLB handling, it only send one command, and not 
duplicate. With kiq,
it only use one command to do the invalidate tlb and wait ack, and won't be 
interrupted by world switch.

That is effectively duplicating the TLB handling.

Instead just lower the timeout and suppress the warning when SRIOV is active.
With the kiq to do tlb flush, no warning issue expected.

Either fix that issue thoughtfully or let us live with the warning message.

But please no more SRIOV workaround we can't test on bare metal and break things with older firmware.

Regards,
Christian.


Best wishes
Emily Deng
Christian.

SWDEV-161497

Signed-off-by: Emily Deng <emily.d...@amd.com>
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 50
++++++++++++++++++++++++++++++++
   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h |  2 ++
   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c    |  5 ++++
   3 files changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 21adb1b6..aa6ddcc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -233,6 +233,56 @@ void amdgpu_virt_kiq_wreg(struct amdgpu_device
*adev, uint32_t reg, uint32_t v)
        pr_err("failed to write reg:%x\n", reg);
   }

+void amdgpu_virt_kiq_invalidate_tlb(struct amdgpu_device *adev, struct
amdgpu_vmhub *hub,
+               unsigned eng, u32 req, uint32_t vmid) {
+       signed long r, cnt = 0;
+       unsigned long flags;
+       uint32_t seq;
+       struct amdgpu_kiq *kiq = &adev->gfx.kiq;
+       struct amdgpu_ring *ring = &kiq->ring;
+
+       BUG_ON(!ring->funcs->emit_reg_write_reg_wait);
+
+       spin_lock_irqsave(&kiq->ring_lock, flags);
+       amdgpu_ring_alloc(ring, 32);
+       amdgpu_ring_emit_reg_write_reg_wait(ring, hub->vm_inv_eng0_req +
eng,
+                                           hub->vm_inv_eng0_ack + eng,
+                                           req, 1 << vmid);
+       amdgpu_fence_emit_polling(ring, &seq);
+       amdgpu_ring_commit(ring);
+       spin_unlock_irqrestore(&kiq->ring_lock, flags);
+
+       r = amdgpu_fence_wait_polling(ring, seq, MAX_KIQ_REG_WAIT);
+
+       /* don't wait anymore for gpu reset case because this way may
+        * block gpu_recover() routine forever, e.g. this virt_kiq_rreg
+        * is triggered in TTM and ttm_bo_lock_delayed_workqueue() will
+        * never return if we keep waiting in virt_kiq_rreg, which cause
+        * gpu_recover() hang there.
+        *
+        * also don't wait anymore for IRQ context
+        * */
+       if (r < 1 && (adev->in_gpu_reset || in_interrupt()))
+               goto failed_kiq;
+
+       if (in_interrupt())
+               might_sleep();
+
+       while (r < 1 && cnt++ < MAX_KIQ_REG_TRY) {
+               msleep(MAX_KIQ_REG_BAILOUT_INTERVAL);
+               r = amdgpu_fence_wait_polling(ring, seq,
MAX_KIQ_REG_WAIT);
+       }
+
+       if (cnt > MAX_KIQ_REG_TRY)
+               goto failed_kiq;
+
+       return;
+
+failed_kiq:
+       pr_err("failed to invalidate tlb with kiq\n"); }
+
   /**
    * amdgpu_virt_request_full_gpu() - request full gpu access
    * @amdgpu:  amdgpu device.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index 880ac11..a2e3c78 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -288,6 +288,8 @@ void amdgpu_free_static_csa(struct amdgpu_device
*adev);
   void amdgpu_virt_init_setting(struct amdgpu_device *adev);
   uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg);
   void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg,
uint32_t v);
+void amdgpu_virt_kiq_invalidate_tlb(struct amdgpu_device *adev, struct
amdgpu_vmhub *hub,
+               unsigned eng, u32 req, uint32_t vmid);
   int amdgpu_virt_request_full_gpu(struct amdgpu_device *adev, bool init);
   int amdgpu_virt_release_full_gpu(struct amdgpu_device *adev, bool init);
   int amdgpu_virt_reset_gpu(struct amdgpu_device *adev); diff --git
a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index ed467de..6a886d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -339,6 +339,11 @@ static void gmc_v9_0_flush_gpu_tlb(struct
amdgpu_device *adev,
                struct amdgpu_vmhub *hub = &adev->vmhub[i];
                u32 tmp = gmc_v9_0_get_invalidate_req(vmid);

+               if (amdgpu_sriov_vf(adev) && amdgpu_sriov_runtime(adev)) {
+                       amdgpu_virt_kiq_invalidate_tlb(adev, hub, eng, tmp,
vmid);
+                       continue;
+               }
+
                WREG32_NO_KIQ(hub->vm_inv_eng0_req + eng, tmp);

                /* Busy wait for ACK.*/
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to