On Fri, Apr 22, 2022 at 3:54 PM Wong, Alice <[email protected]> wrote: > > [AMD Official Use Only] > > Hi Alex, > > The attached patch freed most of the allocated memory except for one > allocated by psp_tmr_init during psp_load_fw. > Combination of the attached patch and calling psp_hw_fini when psp_hw_init > failed would fix the issue. > > As a proper fix, we can call psp_tmr_terminate in psp_load_fw when > psp_load_non_psp_fw failed. (attached patch) > We can't move psp_tmr_init to sw_init because we need to load toc to get the > TMR size. > Do you have any concerns with this approach?
That only covers failures in hw_init(). It doesn't handle resume(). Looks like all of the ta functions also potentially leak. I'm working on a cleanup to handle all of these. Alex > > Alice > > > -----Original Message----- > From: Alex Deucher <[email protected]> > Sent: April 21, 2022 1:31 AM > To: Wong, Alice <[email protected]> > Cc: amd-gfx list <[email protected]> > Subject: Re: [PATCH] drm/amdgpu: clean up psp ip if hw_init failed v2 > > On Wed, Apr 20, 2022 at 5:48 PM Alice Wong <[email protected]> wrote: > > > > If at any point psp_hw_init failed, psp_hw_fini would not be called > > during unload due to ip_blocks[PSP].status.hw not being set to true. > > This could cause a memory leak when the driver unloads. > > As a rule of thumb, each IP block should cleanup themselves when their > > hw_init fails. Only previously intialized blocks should be cleaned up > > by the common framework. > > > > v1: Call IP blocks' respective hw_fini when hw_init failed from > > the common framework > > v2: Call psp_hw_fini when psp_hw_init failed. > > > > Signed-off-by: Alice Wong <[email protected]> > > I don't think this is a good idea. The hw programming in hw_fini makes no > sense if the driver never set it up in the first place if hw_init failed > before initializing the hw. It would be better to just properly unwind the > relevant functions. Presumably the problem you are seeing is the failure to > free the GPU memory allocated in fw_fini, depending on where it fails. We > should just allocate the memory in sw_init; that is what we do in other IPs. > Does the attached patch fix the issue you are seeing? > > Alex > > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 57 > > +++++++++++++------------ > > 1 file changed, 29 insertions(+), 28 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > > index 5b9e48d44f5b..52b14efa848a 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c > > @@ -2537,6 +2537,34 @@ static int psp_load_fw(struct amdgpu_device *adev) > > return ret; > > } > > > > +static int psp_hw_fini(void *handle) > > +{ > > + struct amdgpu_device *adev = (struct amdgpu_device *)handle; > > + struct psp_context *psp = &adev->psp; > > + > > + if (psp->ta_fw) { > > + psp_ras_terminate(psp); > > + psp_securedisplay_terminate(psp); > > + psp_rap_terminate(psp); > > + psp_dtm_terminate(psp); > > + psp_hdcp_terminate(psp); > > + } > > + > > + psp_asd_terminate(psp); > > + > > + psp_tmr_terminate(psp); > > + psp_ring_destroy(psp, PSP_RING_TYPE__KM); > > + > > + amdgpu_bo_free_kernel(&psp->fw_pri_bo, > > + &psp->fw_pri_mc_addr, &psp->fw_pri_buf); > > + amdgpu_bo_free_kernel(&psp->fence_buf_bo, > > + &psp->fence_buf_mc_addr, &psp->fence_buf); > > + amdgpu_bo_free_kernel(&psp->cmd_buf_bo, &psp->cmd_buf_mc_addr, > > + (void **)&psp->cmd_buf_mem); > > + > > + return 0; > > +} > > + > > static int psp_hw_init(void *handle) > > { > > int ret; > > @@ -2563,37 +2591,10 @@ static int psp_hw_init(void *handle) > > failed: > > adev->firmware.load_type = AMDGPU_FW_LOAD_DIRECT; > > mutex_unlock(&adev->firmware.mutex); > > + psp_hw_fini(handle); > > return -EINVAL; > > } > > > > -static int psp_hw_fini(void *handle) > > -{ > > - struct amdgpu_device *adev = (struct amdgpu_device *)handle; > > - struct psp_context *psp = &adev->psp; > > - > > - if (psp->ta_fw) { > > - psp_ras_terminate(psp); > > - psp_securedisplay_terminate(psp); > > - psp_rap_terminate(psp); > > - psp_dtm_terminate(psp); > > - psp_hdcp_terminate(psp); > > - } > > - > > - psp_asd_terminate(psp); > > - > > - psp_tmr_terminate(psp); > > - psp_ring_destroy(psp, PSP_RING_TYPE__KM); > > - > > - amdgpu_bo_free_kernel(&psp->fw_pri_bo, > > - &psp->fw_pri_mc_addr, &psp->fw_pri_buf); > > - amdgpu_bo_free_kernel(&psp->fence_buf_bo, > > - &psp->fence_buf_mc_addr, &psp->fence_buf); > > - amdgpu_bo_free_kernel(&psp->cmd_buf_bo, &psp->cmd_buf_mc_addr, > > - (void **)&psp->cmd_buf_mem); > > - > > - return 0; > > -} > > - > > static int psp_suspend(void *handle) > > { > > int ret; > > -- > > 2.25.1 > >
