Re: [PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS baco reset

2019-12-18 Thread Alex Deucher
On Wed, Dec 18, 2019 at 9:12 PM Quan, Evan  wrote:
>
> Hi Alex,
>
> "Power saving" means the regular suspend/resume case, right? That was 
> considered.

I mean BACO for runtime power management support.  I landed the code
to enable BACO for saving power at runtime when the GPU is not in use.
It's still disabled by default until we properly handle KFD support,
but you can enable it with amdgpu.runpm=1.

> With the current amdgpu code, the MP1 state is not set correctly for the
> regular suspend case.
> Put more simply, I believe PrepareMP1_for_unload should be issued to MP1 on
> the regular suspend path (excluding the GPU reset case).
>
> With the MP1 state set correctly for all cases, we can remove the
> "adev->in_gpu_reset" check.
> For now, though, I do not want to involve too many changes, so this is
> limited to the GPU reset case.
>
> P.S. The MP1 state is handled correctly for mode1 reset, so it is safe to
> enforce this for all GPU reset cases instead of BACO reset only.

Ah, good to hear.

Alex

>
> Regards,
> Evan
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS baco reset

2019-12-18 Thread Quan, Evan
Hi Alex,

"Power saving" means the regular suspend/resume case, right? That was 
considered. 
With the current amdgpu code, the MP1 state is not set correctly for the
regular suspend case.
Put more simply, I believe PrepareMP1_for_unload should be issued to MP1 on
the regular suspend path (excluding the GPU reset case).

With the MP1 state set correctly for all cases, we can remove the
"adev->in_gpu_reset" check.
For now, though, I do not want to involve too many changes, so this is
limited to the GPU reset case.

P.S. The MP1 state is handled correctly for mode1 reset, so it is safe to
enforce this for all GPU reset cases instead of BACO reset only.
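
For reference, the before/after skip condition can be modelled outside the
driver. This is only a sketch: struct fw_load_state and the two helper names
are hypothetical stand-ins for adev->in_gpu_reset, the result of
psp_smu_reload_quirk(psp), and psp->autoload_supported. It is not driver code.

```c
#include <stdbool.h>

/* Hypothetical model of the three inputs consulted in psp_np_fw_load(). */
struct fw_load_state {
	bool in_gpu_reset;       /* models adev->in_gpu_reset */
	bool smu_alive;          /* models the result of psp_smu_reload_quirk(psp) */
	bool autoload_supported; /* models psp->autoload_supported */
};

/* Condition before the patch: skip the SMC load whenever the quirk fires. */
static bool skip_smc_load_old(const struct fw_load_state *s)
{
	return s->smu_alive || s->autoload_supported;
}

/* Condition after the patch: the quirk only counts during a GPU reset. */
static bool skip_smc_load_new(const struct fw_load_state *s)
{
	return (s->in_gpu_reset && s->smu_alive) || s->autoload_supported;
}
```

The two versions disagree in exactly one input combination: the SMU reports
alive, no GPU reset is in progress, and autoload is not supported. In that
case the old condition skips the SMC load and the new one does not, which
appears to be the regular suspend/resume situation described above.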

Regards,
Evan


Re: [PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS baco reset

2019-12-18 Thread Alex Deucher
On Tue, Dec 17, 2019 at 10:25 PM Evan Quan  wrote:
>
> For non-RAS baco reset, there is no need to reset the SMC. Thus
> the firmware reloading should be avoided.
>
> Change-Id: I73f6284541d0ca0e82761380a27e32484fb0061c
> Signed-off-by: Evan Quan 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/psp_v11_0.c  | 14 ++++++++++++++
>  2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index c14f2ccd0677..9bf7e92394f5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -1439,7 +1439,8 @@ static int psp_np_fw_load(struct psp_context *psp)
> continue;
>
> if (ucode->ucode_id == AMDGPU_UCODE_ID_SMC &&
> -   (psp_smu_reload_quirk(psp) || psp->autoload_supported))
> +   ((adev->in_gpu_reset && psp_smu_reload_quirk(psp))
> + || psp->autoload_supported))

Will this cover the power saving case as well?  Do we need to check
adev->in_gpu_reset as well or can we drop that part?

Alex

> continue;
>
> if (amdgpu_sriov_vf(adev) &&
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> index c66ca8cc2ebd..ba761e9366e3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> @@ -676,6 +676,19 @@ static bool psp_v11_0_compare_sram_data(struct psp_context *psp,
> return true;
>  }
>
> +/*
> + * Check whether SMU is still alive. If that's true
> + * (e.g. for non-RAS baco reset), we need to skip SMC firmware reloading.
> + */
> +static bool psp_v11_0_smu_reload_quirk(struct psp_context *psp)
> +{
> +   struct amdgpu_device *adev = psp->adev;
> +   uint32_t reg;
> +
> +   reg = RREG32_PCIE(smnMP1_FIRMWARE_FLAGS | 0x03b0);
> +   return (reg & MP1_FIRMWARE_FLAGS__INTERRUPTS_ENABLED_MASK) ? true : false;
> +}
> +
>  static int psp_v11_0_mode1_reset(struct psp_context *psp)
>  {
> int ret;
> @@ -1070,6 +1083,7 @@ static const struct psp_funcs psp_v11_0_funcs = {
> .ring_stop = psp_v11_0_ring_stop,
> .ring_destroy = psp_v11_0_ring_destroy,
> .compare_sram_data = psp_v11_0_compare_sram_data,
> +   .smu_reload_quirk = psp_v11_0_smu_reload_quirk,
> .mode1_reset = psp_v11_0_mode1_reset,
> .xgmi_get_topology_info = psp_v11_0_xgmi_get_topology_info,
> .xgmi_set_topology_info = psp_v11_0_xgmi_set_topology_info,
> --
> 2.24.0
>


[PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS baco reset

2019-12-17 Thread Evan Quan
For non-RAS baco reset, there is no need to reset the SMC. Thus
the firmware reloading should be avoided.

Change-Id: I73f6284541d0ca0e82761380a27e32484fb0061c
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c |  3 ++-
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c  | 14 ++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index c14f2ccd0677..9bf7e92394f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1439,7 +1439,8 @@ static int psp_np_fw_load(struct psp_context *psp)
continue;
 
if (ucode->ucode_id == AMDGPU_UCODE_ID_SMC &&
-   (psp_smu_reload_quirk(psp) || psp->autoload_supported))
+   ((adev->in_gpu_reset && psp_smu_reload_quirk(psp))
+ || psp->autoload_supported))
continue;
 
if (amdgpu_sriov_vf(adev) &&
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index c66ca8cc2ebd..ba761e9366e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -676,6 +676,19 @@ static bool psp_v11_0_compare_sram_data(struct psp_context *psp,
return true;
 }
 
+/*
+ * Check whether SMU is still alive. If that's true
+ * (e.g. for non-RAS baco reset), we need to skip SMC firmware reloading.
+ */
+static bool psp_v11_0_smu_reload_quirk(struct psp_context *psp)
+{
+   struct amdgpu_device *adev = psp->adev;
+   uint32_t reg;
+
+   reg = RREG32_PCIE(smnMP1_FIRMWARE_FLAGS | 0x03b0);
+   return (reg & MP1_FIRMWARE_FLAGS__INTERRUPTS_ENABLED_MASK) ? true : false;
+}
+
 static int psp_v11_0_mode1_reset(struct psp_context *psp)
 {
int ret;
@@ -1070,6 +1083,7 @@ static const struct psp_funcs psp_v11_0_funcs = {
.ring_stop = psp_v11_0_ring_stop,
.ring_destroy = psp_v11_0_ring_destroy,
.compare_sram_data = psp_v11_0_compare_sram_data,
+   .smu_reload_quirk = psp_v11_0_smu_reload_quirk,
.mode1_reset = psp_v11_0_mode1_reset,
.xgmi_get_topology_info = psp_v11_0_xgmi_get_topology_info,
.xgmi_set_topology_info = psp_v11_0_xgmi_set_topology_info,
-- 
2.24.0
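
The liveness test in psp_v11_0_smu_reload_quirk() comes down to masking one
bit out of a flags word. A minimal standalone sketch, under two assumptions:
EXAMPLE_INTERRUPTS_ENABLED_MASK is a made-up value for illustration (the real
MP1_FIRMWARE_FLAGS__INTERRUPTS_ENABLED_MASK comes from the register headers),
and the register value is passed in as a parameter instead of being read with
RREG32_PCIE(). The helper name example_smu_is_alive is hypothetical as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define EXAMPLE_INTERRUPTS_ENABLED_MASK 0x00000001u

/*
 * Report whether the "interrupts enabled" bit is set in a firmware flags
 * value. The patch treats that bit as the sign that the SMU is still alive.
 */
static bool example_smu_is_alive(uint32_t firmware_flags)
{
	return (firmware_flags & EXAMPLE_INTERRUPTS_ENABLED_MASK) != 0;
}
```

Comparing the masked value against zero already yields a bool, so the
"? true : false" form is not needed for the result to be correct.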
