On Thu, Jul 6, 2023 at 4:56 AM Chen, Horace <horace.c...@amd.com> wrote:
>
> [AMD Official Use Only - General]
>
> Hi Christian & Leo,
>
> Sorry, missed Leo's email.
>
> Currently SR-IOV VF doesn't have suspend/resume use case and this code is in 
> vcn_v4_0_start_sriov() which will only called by SR-IOV code path. So 
> currently we will not hit the suspend/resume issue even with this patch.
>
> About why this patch is necessary:
>     Because SR-IOV uses FLR instead of mode 1 reset as the default reset 
> method, and FLR will not reset VRAM during the common reset. So if there is 
> some bad data in the cache, it will cause VF continues to fail after reset.  
> We have some test case(such as data poison) which may hit this issue.
>
> If want to be safer in case we may add suspend/resume case in the future:
>     how about add a "if (amdgpu_in_reset(adev))" check to make sure we only 
> clear cache in the reset domain?

Yes, that seems like a better bet as that will keep suspend and resume working.

Alex

>
> Thanks & Regards,
> Horace.
>
> -----Original Message-----
> From: Koenig, Christian <christian.koe...@amd.com>
> Sent: Thursday, July 6, 2023 3:24 PM
> To: Liu, Leo <leo....@amd.com>; Chen, Horace <horace.c...@amd.com>; 
> amd-gfx@lists.freedesktop.org
> Cc: Quan, Evan <evan.q...@amd.com>; Deucher, Alexander 
> <alexander.deuc...@amd.com>; Xiao, Jack <jack.x...@amd.com>; Zhang, Hawking 
> <hawking.zh...@amd.com>; Liu, Monk <monk....@amd.com>; Xu, Feifei 
> <feifei...@amd.com>; Chang, HaiJun <haijun.ch...@amd.com>
> Subject: Re: [PATCH] drm/amdgpu: Clear VCN cache when hw_init
>
> Since this patch was already pushed please revert it until we have a 
> technical consent on why this should be necessary.
>
> Regards,
> Christian.
>
> Am 05.07.23 um 21:57 schrieb Leo Liu:
> > What Christian says is correct, esp. during the playback or encode,
> > when suspend/resume happens, it will save the FW context, and after
> > resume, it will continue the job to where it left during the suspend.
> > Will this apply to SRIOV case? Since the changes only within the SRIOV
> > code, please make sure that also please specify the SRIOV from your
> > patch subject and commit message.
> >
> > Regards,
> >
> > Leo
> >
> >
> > On 2023-06-30 07:38, Christian König wrote:
> >> Am 20.06.23 um 15:29 schrieb Horace Chen:
> >>> [Why]
> >>> VCN will use some framebuffer space as its cache. It needs to be
> >>> reset when reset happens, such as FLR. Otherwise some error may be
> >>> kept after the reset.
> >>
> >> Well this doesn't make sense at all.
> >>
> >> The full content of adev->vcn.inst[i].cpu_addr is saved and restored
> >> during suspend/resume and IIRC GPU resets as well.
> >>
> >> See functions amdgpu_vcn_suspend() and amdgpu_vcn_resume().
> >>
> >> Please let Leo's team take a look at this and review the change
> >> before it is committed.
> >>
> >> Regards,
> >> Christian.
> >>
> >>>
> >>> Signed-off-by: Horace Chen <horace.c...@amd.com>
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 3 +++
> >>>   1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
> >>> b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
> >>> index b48bb5212488..2db73a964031 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
> >>> @@ -1292,6 +1292,7 @@ static int vcn_v4_0_start_sriov(struct
> >>> amdgpu_device *adev)
> >>>               cache_size);
> >>>             cache_addr = adev->vcn.inst[i].gpu_addr + offset;
> >>> +        memset(adev->vcn.inst[i].cpu_addr + offset, 0,
> >>> AMDGPU_VCN_STACK_SIZE);
> >>>           MMSCH_V4_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCN, i,
> >>>               regUVD_LMI_VCPU_CACHE1_64BIT_BAR_LOW),
> >>>               lower_32_bits(cache_addr)); @@ -1307,6 +1308,8 @@
> >>> static int vcn_v4_0_start_sriov(struct amdgpu_device *adev)
> >>>             cache_addr = adev->vcn.inst[i].gpu_addr + offset +
> >>>               AMDGPU_VCN_STACK_SIZE;
> >>> +        memset(adev->vcn.inst[i].cpu_addr + offset +
> >>> AMDGPU_VCN_STACK_SIZE, 0,
> >>> +            AMDGPU_VCN_STACK_SIZE);
> >>>           MMSCH_V4_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCN, i,
> >>>               regUVD_LMI_VCPU_CACHE2_64BIT_BAR_LOW),
> >>>               lower_32_bits(cache_addr));
> >>
>

Reply via email to