[AMD Official Use Only - AMD Internal Distribution Only]

Hi Alex,


> -----Original Message-----
> From: Alex Deucher <[email protected]>
> Sent: Tuesday, January 6, 2026 6:40 AM
> To: Yuan, Perry <[email protected]>
> Cc: [email protected]; Deucher, Alexander
> <[email protected]>; Zhang, Yifan <[email protected]>
> Subject: Re: [PATCH] drm/amd/pm: Disable MMIO access during SMU Mode 1
> reset
>
> On Fri, Dec 26, 2025 at 4:36 AM Perry Yuan <[email protected]> wrote:
> >
> > During Mode 1 reset, the ASIC undergoes a reset cycle and becomes
> > temporarily inaccessible via PCIe. Any attempt to access MMIO
> > registers during this window (e.g., from interrupt handlers or other
> > driver threads) can result in uncompleted PCIe transactions, leading
> > to NMI panics or system hangs.
> >
> > To prevent this, set the `no_hw_access` flag to true immediately after
> > triggering the reset. This signals other driver components to skip
> > register accesses while the device is offline.
> >
> > A memory barrier `smp_mb()` is added to ensure the flag update is
> > globally visible to all cores before the driver enters the sleep/wait
> > state.
>
> Seems like it would make sense to extend this to all asics which support mode1
> reset.
>
> Alex

Sounds good. I will make the same change for the other ASICs that have a
mode 1 reset callback.
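
For reference, a minimal userspace sketch of the flag-and-barrier pattern described in the commit message. It is not the driver code: `struct fake_dev`, `fake_rreg32()`, `fake_mode1_reset_begin()` and `fake_mode1_reset_end()` are hypothetical names, and C11 atomics stand in for the kernel's `smp_mb()`.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_NUM_REGS 4

/* Hypothetical stand-in for the device; not the real struct amdgpu_device. */
struct fake_dev {
	atomic_bool no_hw_access;       /* true while the ASIC is in reset */
	uint32_t regs[FAKE_NUM_REGS];   /* pretend MMIO register file */
};

/* Reader side: skip the register access while the device is offline. */
static bool fake_rreg32(struct fake_dev *dev, unsigned int idx, uint32_t *val)
{
	if (idx >= FAKE_NUM_REGS)
		return false;
	if (atomic_load(&dev->no_hw_access))
		return false;           /* device offline, do not touch "MMIO" */
	*val = dev->regs[idx];
	return true;
}

/* Writer side: same order as in the patch (set flag, barrier, then wait). */
static void fake_mode1_reset_begin(struct fake_dev *dev)
{
	atomic_store_explicit(&dev->no_hw_access, true, memory_order_relaxed);
	/* userspace analogue of smp_mb(): full fence before the wait */
	atomic_thread_fence(memory_order_seq_cst);
	/* the real code sleeps here while the ASIC resets */
}

/* Re-enable access once the reset has completed. */
static void fake_mode1_reset_end(struct fake_dev *dev)
{
	atomic_store(&dev->no_hw_access, false);
}
```

In this sketch a reader that has already passed the flag check can still reach the register access, so the flag narrows the window rather than closing it.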

Best Regards.

Perry.

>
> >
> > Signed-off-by: Perry Yuan <[email protected]>
> > Reviewed-by: Yifan Zhang <[email protected]>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c           | 3 +++
> >  drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 7 ++++++-
> >  drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c | 9 +++++++--
> >  3 files changed, 16 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 824c5489ec85..75b1b78c0437 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -5776,6 +5776,9 @@ int amdgpu_device_mode1_reset(struct amdgpu_device *adev)
> >         if (ret)
> >                 goto mode1_reset_failed;
> >
> > +       /* enable mmio access after mode 1 reset completed */
> > +       adev->no_hw_access = false;
> > +
> >         amdgpu_device_load_pci_state(adev->pdev);
> >         ret = amdgpu_psp_wait_for_bootloader(adev);
> >         if (ret)
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > index 8e35d501e81d..dcb169b25916 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > @@ -2850,8 +2850,13 @@ static int smu_v13_0_0_mode1_reset(struct smu_context *smu)
> >                 break;
> >         }
> >
> > -       if (!ret)
> > +       if (!ret) {
> > +               /* disable mmio access while doing mode 1 reset*/
> > +               smu->adev->no_hw_access = true;
> > +               /* ensure no_hw_access is globally visible before any MMIO */
> > +               smp_mb();
> >                 msleep(SMU13_MODE1_RESET_WAIT_TIME_IN_MS);
> > +       }
> >
> >         return ret;
> >  }
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c
> > b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c
> > index af1bc7b4350b..b1016debdf06 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c
> > @@ -2069,10 +2069,15 @@ static int smu_v14_0_2_mode1_reset(struct smu_context *smu)
> >
> >         ret = smu_cmn_send_debug_smc_msg(smu, DEBUGSMC_MSG_Mode1Reset);
> >         if (!ret) {
> > -               if (amdgpu_emu_mode == 1)
> > +               if (amdgpu_emu_mode == 1) {
> >                         msleep(50000);
> > -               else
> > +               } else {
> > +                       /* disable mmio access while doing mode 1 reset*/
> > +                       smu->adev->no_hw_access = true;
> > +                       /* ensure no_hw_access is globally visible before any MMIO */
> > +                       smp_mb();
> >                         msleep(1000);
> > +               }
> >         }
> >
> >         return ret;
> > --
> > 2.34.1
> >
