Re: [PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS baco reset
On Wed, Dec 18, 2019 at 9:12 PM Quan, Evan wrote:
>
> Hi Alex,
>
> "Power saving" means the regular suspend/resume case, right? That was
> considered.

I mean BACO for runtime power management support. I landed the code to
enable BACO for saving power at runtime when the GPU is not in use.
It's still disabled by default until we properly handle KFD support,
but you can enable it with amdgpu.runpm=1.

> With current amdgpu code, the MP1 state was not correctly set for the regular
> suspend case.
> More straightforwardly, I believe PrepareMP1_for_unload should be issued to
> MP1 on the regular suspend path (excluding the GPU reset case).
>
> And with the MP1 state correctly set for all cases, we can remove the
> "adev->in_gpu_reset" check.
> But for now, I do not want to involve too many changes, so I limit this to the
> GPU reset case.
>
> P.S. The MP1 state was correctly handled for mode1 reset. So it's safe to
> enforce this for all GPU reset cases instead of BACO reset only.

Ah, good to hear.

Alex

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS baco reset
Hi Alex,

"Power saving" means the regular suspend/resume case, right? That was considered.
With current amdgpu code, the MP1 state was not correctly set for the regular suspend case.
More straightforwardly, I believe PrepareMP1_for_unload should be issued to MP1 on the regular suspend path (excluding the GPU reset case).

And with the MP1 state correctly set for all cases, we can remove the "adev->in_gpu_reset" check.
But for now, I do not want to involve too many changes, so I limit this to the GPU reset case.

P.S. The MP1 state was correctly handled for mode1 reset. So it's safe to enforce this for all GPU reset cases instead of BACO reset only.

Regards,
Evan

> -Original Message-
> From: Alex Deucher
> Sent: Wednesday, December 18, 2019 10:56 PM
> To: Quan, Evan
> Cc: amd-gfx list
> Subject: Re: [PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS
> baco reset
>
> On Tue, Dec 17, 2019 at 10:25 PM Evan Quan wrote:
> >
> > @@ -1439,7 +1439,8 @@ static int psp_np_fw_load(struct psp_context *psp)
> >  			continue;
> >
> >  		if (ucode->ucode_id == AMDGPU_UCODE_ID_SMC &&
> > -		    (psp_smu_reload_quirk(psp) || psp->autoload_supported))
> > +		    ((adev->in_gpu_reset && psp_smu_reload_quirk(psp))
> > +		      || psp->autoload_supported))
>
> Will this cover the power saving case as well? Do we need to check
> adev->in_gpu_reset as well or can we drop that part?
>
> Alex
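The skip condition under discussion can be read as a small predicate. What follows is a minimal standalone sketch, not the driver code: `struct fw_load_ctx` and `skip_smc_reload()` are hypothetical names, and the three booleans merely stand in for `adev->in_gpu_reset`, the result of `psp_smu_reload_quirk(psp)`, and `psp->autoload_supported`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state consulted by the patch. */
struct fw_load_ctx {
	bool in_gpu_reset;       /* models adev->in_gpu_reset */
	bool smu_alive;          /* models psp_smu_reload_quirk(psp) */
	bool autoload_supported; /* models psp->autoload_supported */
};

/*
 * Returns true when the SMC firmware entry should be skipped.
 * Mirrors the patched condition: an SMU that is still alive only
 * suppresses the reload while a GPU reset is in progress.
 */
static bool skip_smc_reload(const struct fw_load_ctx *c)
{
	return (c->in_gpu_reset && c->smu_alive) || c->autoload_supported;
}
```

With `in_gpu_reset` false (for example regular suspend/resume), an SMU that still reports itself alive does not suppress the reload. Dropping the `in_gpu_reset` term, as asked in the review, would change exactly that one case.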
Re: [PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS baco reset
On Tue, Dec 17, 2019 at 10:25 PM Evan Quan wrote:
>
> For non-RAS baco reset, there is no need to reset the SMC. Thus
> the firmware reloading should be avoided.
>
> Change-Id: I73f6284541d0ca0e82761380a27e32484fb0061c
> Signed-off-by: Evan Quan
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/psp_v11_0.c  | 14 ++
>  2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index c14f2ccd0677..9bf7e92394f5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -1439,7 +1439,8 @@ static int psp_np_fw_load(struct psp_context *psp)
>  			continue;
>
>  		if (ucode->ucode_id == AMDGPU_UCODE_ID_SMC &&
> -		    (psp_smu_reload_quirk(psp) || psp->autoload_supported))
> +		    ((adev->in_gpu_reset && psp_smu_reload_quirk(psp))
> +		      || psp->autoload_supported))

Will this cover the power saving case as well? Do we need to check
adev->in_gpu_reset as well or can we drop that part?

Alex

>  			continue;
>
>  		if (amdgpu_sriov_vf(adev) &&
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> index c66ca8cc2ebd..ba761e9366e3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> @@ -676,6 +676,19 @@ static bool psp_v11_0_compare_sram_data(struct psp_context *psp,
>  	return true;
>  }
>
> +/*
> + * Check whether SMU is still alive. If that's true
> + * (e.g. for non-RAS baco reset), we need to skip SMC firmware reloading.
> + */
> +static bool psp_v11_0_smu_reload_quirk(struct psp_context *psp)
> +{
> +	struct amdgpu_device *adev = psp->adev;
> +	uint32_t reg;
> +
> +	reg = RREG32_PCIE(smnMP1_FIRMWARE_FLAGS | 0x03b0);
> +	return (reg & MP1_FIRMWARE_FLAGS__INTERRUPTS_ENABLED_MASK) ? true : false;
> +}
> +
>  static int psp_v11_0_mode1_reset(struct psp_context *psp)
>  {
>  	int ret;
> @@ -1070,6 +1083,7 @@ static const struct psp_funcs psp_v11_0_funcs = {
>  	.ring_stop = psp_v11_0_ring_stop,
>  	.ring_destroy = psp_v11_0_ring_destroy,
>  	.compare_sram_data = psp_v11_0_compare_sram_data,
> +	.smu_reload_quirk = psp_v11_0_smu_reload_quirk,
>  	.mode1_reset = psp_v11_0_mode1_reset,
>  	.xgmi_get_topology_info = psp_v11_0_xgmi_get_topology_info,
>  	.xgmi_set_topology_info = psp_v11_0_xgmi_set_topology_info,
> --
> 2.24.0
[PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS baco reset
For non-RAS baco reset, there is no need to reset the SMC. Thus
the firmware reloading should be avoided.

Change-Id: I73f6284541d0ca0e82761380a27e32484fb0061c
Signed-off-by: Evan Quan
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c |  3 ++-
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c  | 14 ++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index c14f2ccd0677..9bf7e92394f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1439,7 +1439,8 @@ static int psp_np_fw_load(struct psp_context *psp)
 			continue;

 		if (ucode->ucode_id == AMDGPU_UCODE_ID_SMC &&
-		    (psp_smu_reload_quirk(psp) || psp->autoload_supported))
+		    ((adev->in_gpu_reset && psp_smu_reload_quirk(psp))
+		      || psp->autoload_supported))
 			continue;

 		if (amdgpu_sriov_vf(adev) &&
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index c66ca8cc2ebd..ba761e9366e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -676,6 +676,19 @@ static bool psp_v11_0_compare_sram_data(struct psp_context *psp,
 	return true;
 }

+/*
+ * Check whether SMU is still alive. If that's true
+ * (e.g. for non-RAS baco reset), we need to skip SMC firmware reloading.
+ */
+static bool psp_v11_0_smu_reload_quirk(struct psp_context *psp)
+{
+	struct amdgpu_device *adev = psp->adev;
+	uint32_t reg;
+
+	reg = RREG32_PCIE(smnMP1_FIRMWARE_FLAGS | 0x03b0);
+	return (reg & MP1_FIRMWARE_FLAGS__INTERRUPTS_ENABLED_MASK) ? true : false;
+}
+
 static int psp_v11_0_mode1_reset(struct psp_context *psp)
 {
 	int ret;
@@ -1070,6 +1083,7 @@ static const struct psp_funcs psp_v11_0_funcs = {
 	.ring_stop = psp_v11_0_ring_stop,
 	.ring_destroy = psp_v11_0_ring_destroy,
 	.compare_sram_data = psp_v11_0_compare_sram_data,
+	.smu_reload_quirk = psp_v11_0_smu_reload_quirk,
 	.mode1_reset = psp_v11_0_mode1_reset,
 	.xgmi_get_topology_info = psp_v11_0_xgmi_get_topology_info,
 	.xgmi_set_topology_info = psp_v11_0_xgmi_set_topology_info,
-- 
2.24.0
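For readers who want to try the liveness test outside the kernel, here is a rough userspace sketch of the same idea: read a flags register and test one bit. Everything in it is an assumption made for illustration. `DEMO_INTERRUPTS_ENABLED_MASK`, `reg_read_fn`, `smu_is_alive()` and the two mock readers are hypothetical; the real mask and register offset come from the driver's register headers and are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit, for illustration only; not the real hardware mask. */
#define DEMO_INTERRUPTS_ENABLED_MASK 0x1u

/* Hypothetical register accessor, standing in for RREG32_PCIE(). */
typedef uint32_t (*reg_read_fn)(uint32_t addr);

/*
 * Treat the SMU as alive when the "interrupts enabled" bit is set in
 * the firmware flags register, which is the check the patch performs.
 */
static bool smu_is_alive(reg_read_fn read_reg, uint32_t flags_addr)
{
	uint32_t reg = read_reg(flags_addr);

	return (reg & DEMO_INTERRUPTS_ENABLED_MASK) != 0;
}

/* Mock readers that simulate a running and a stopped SMU. */
static uint32_t read_running(uint32_t addr) { (void)addr; return 0x1u; }
static uint32_t read_stopped(uint32_t addr) { (void)addr; return 0x0u; }
```

Calling `smu_is_alive(read_running, 0)` yields true and `smu_is_alive(read_stopped, 0)` yields false. The `!= 0` comparison is equivalent to the `? true : false` form used in the patch.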