Because gpu_srio_reset (will send patch for this routine later) doesn't call 
amdgpu_suspend()
That's most likely not a good idea.

Suspend and resume should always be paired, otherwise you run into exactly those problems and the GART is most likely only the tip of the iceberg here.

For example, you also mess up the reference counting for the buffer containing the UVD and VCE firmware (OK, that won't affect SRIOV for now).
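
To illustrate the reference-counting point, here is a minimal user-space sketch. It is not amdgpu code: `toy_bo`, `toy_suspend()` and `toy_resume()` are hypothetical names, and the only assumption is that resume takes one pin reference and suspend drops one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy buffer object: only the pin count is modelled. */
struct toy_bo {
    int pin_count;
};

/* Pinning takes one reference. */
static void toy_bo_pin(struct toy_bo *bo)
{
    bo->pin_count++;
}

/* Unpinning drops one reference, never going below zero. */
static void toy_bo_unpin(struct toy_bo *bo)
{
    if (bo->pin_count > 0)
        bo->pin_count--;
}

/* The buffer counts as pinned while at least one reference is held. */
static bool toy_bo_is_pinned(const struct toy_bo *bo)
{
    return bo->pin_count > 0;
}

/* Toy suspend/resume pair (assumption): suspend unpins, resume pins. */
static void toy_suspend(struct toy_bo *bo)
{
    toy_bo_unpin(bo);
}

static void toy_resume(struct toy_bo *bo)
{
    toy_bo_pin(bo);
}
```

In this model a paired suspend/resume leaves `pin_count` unchanged, while a resume without a preceding suspend adds an extra reference, so the next suspend no longer releases the buffer.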

Maybe you just want to call hw_init() instead of a resume here?

Regards,
Christian.

On 06.02.2017 at 16:55, Liu, Monk wrote:
I recall why I made this patch

When testing the SRIOV GPU reset feature, the driver always waits and never 
returns without this patch. Looking into it further:

Because gpu_srio_reset (I will send a patch for this routine later) doesn't call 
amdgpu_suspend(), the GART table BO never gets unpinned, which leads to an 
infinite wait loop in the driver if we pin it again in resume.
In the bare-metal case, gpu_reset calls amdgpu_suspend(), so the GART BO is unpinned.

BTW:
GPU_SRIOV_RESET is invoked after the hypervisor calls VF_FLR on this VF device, so 
none of the IP blocks' suspend routines are needed at all.

What about:
+       if (adev->gart.table_addr && amdgpu_sriov_vf(adev)) {
+               /* it's a resume call, gart already pin */
+               return 0;
+       }

BR Monk


-----Original Message-----
From: Christian König [mailto:deathsim...@vodafone.de]
Sent: Monday, February 06, 2017 10:31 PM
To: Liu, Monk <monk....@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin

Huh? We shouldn't need to call this function from a GPU reset. Do we really do 
so?

But even if we call it from GPU reset we certainly should have called the 
matching unpin function before.

Otherwise we certainly won't be able to resume from the next suspend after a 
GPU reset.

Regards,
Christian.

On 06.02.2017 at 15:25, Liu, Monk wrote:
Hmm, it looks like I missed the S3 case.

But if this is called from a GPU reset, we also shouldn't continue running this
function, otherwise the GPU reset will fail (SRIOV reset test).

BR Monk

-----Original Message-----
From: Christian König [mailto:deathsim...@vodafone.de]
Sent: Monday, February 06, 2017 4:14 PM
To: Liu, Monk <monk....@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin

A big NAK on this! amdgpu_gart_table_vram_unpin() must be called during suspend.

Otherwise the GART table can be corrupted and we run into a whole bunch of 
problems.

We could add a "BUG_ON(adev->gart.table_addr != NULL);" here to double check 
that, but just ignoring that something went horribly wrong is clearly the wrong approach.
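
As a rough user-space illustration of "detect it instead of ignoring it": the sketch below is not the amdgpu implementation. `toy_gart`, `toy_gart_table_pin()` and `toy_gart_table_unpin()` are hypothetical names, and returning `-EBUSY` is only a stand-in for the BUG_ON, chosen so the check can be exercised without aborting.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical toy GART state: table_addr == 0 means "not pinned". */
struct toy_gart {
    uint64_t table_addr;
};

/*
 * Pin the table at gpu_addr. If the table is already pinned, the
 * matching unpin was skipped somewhere, so report it as an error
 * (stand-in for BUG_ON) instead of silently returning success.
 */
static int toy_gart_table_pin(struct toy_gart *gart, uint64_t gpu_addr)
{
    if (gart->table_addr != 0)
        return -EBUSY;

    gart->table_addr = gpu_addr;
    return 0;
}

/* Unpin the table; in this model that just clears the address. */
static void toy_gart_table_unpin(struct toy_gart *gart)
{
    gart->table_addr = 0;
}
```

With this shape, a second pin without an unpin in between fails loudly, and a pin that follows a proper unpin succeeds again.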

Regards,
Christian.

On 04.02.2017 at 11:34, Monk Liu wrote:
If this call comes from resume, it shouldn't enter the pin logic at all.

Change-Id: I40a5cdc2a716c4c20d2812fd74ece4ea284b6765
Signed-off-by: Monk Liu <monk....@amd.com>
---
    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 5 +++++
    1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 964d2a9..5e907f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -151,6 +151,11 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev)
        uint64_t gpu_addr;
        int r;
+       if (adev->gart.table_addr) {
+               /* it's a resume call, gart already pin */
+               return 0;
+       }
+
        r = amdgpu_bo_reserve(adev->gart.robj, false);
        if (unlikely(r != 0))
                return r;
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
