On 7/31/2025 2:00 PM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
> Patch looks good to me.
> 
> But I prefer to add a flag such as "link reset" to indicate that the response 
> register may not accurately represent the execution status of msg.

There is nothing wrong with the response register status. The message is
simply not processed since the firmware is doing link reset state. Hence
the response status is actually correct in the sense that previous
message is not processed.

> And mark this flag in the necessary case (such as reset/flr). And the check 
> of the previous msg response can be ignored when the next msg is issued.

A similar thing is done for RAS and there we allow a certain set of
messages to be sent.

In this case, we are marking messages as priority messages which are
expected to be processed any time regardless of the execution status of
the previous message. Reset is considered one such category of messages.
 > Additionally, this should belong to a global state of an SMU/PMFW,
and it doesn't seem very reasonable to bind it to a specific MSG.
> What do you think?

Though it is related to state, only specific messages will be processed
by firmware in that state. Here, this is only having a single message.
If there are others too, those will be flagged as well.

> 
> Btw, the msg of GfxDriverReset/FLR both can cause the response register to be 
> cleared.
> 

This is only bypassing the checks before sending reset message, not
after sending reset message.

Thanks,
Lijo

> Best Regards,
> Kevin
> 
> -----Original Message-----
> From: Lazar, Lijo <lijo.la...@amd.com>
> Sent: Thursday, July 31, 2025 2:09 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking <hawking.zh...@amd.com>; Deucher, Alexander 
> <alexander.deuc...@amd.com>; Kamal, Asad <asad.ka...@amd.com>; Sun, 
> Ce(Overlord) <ce....@amd.com>; Wang, Yang(Kevin) <kevinyang.w...@amd.com>
> Subject: [PATCH] drm/amd/pm: Add priority messages for SMU v13.0.6
> 
> Certain messages will processed with high priority by PMFW even if it hasn't 
> responded to a previous message. Send the priority message regardless of the 
> success/fail status of the previous message. Add support on SMUv13.0.6 and 
> SMUv13.0.12
> 
> Signed-off-by: Lijo Lazar <lijo.la...@amd.com>
> ---
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h       |  1 +
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c  |  2 +-
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c   |  2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c             | 14 +++++++++-----
>  4 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h 
> b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> index d7a9e41820fa..aaf148591a98 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
> @@ -469,6 +469,7 @@ enum smu_feature_mask {
>  /* Message category flags */
>  #define SMU_MSG_VF_FLAG                        (1U << 0)
>  #define SMU_MSG_RAS_PRI                        (1U << 1)
> +#define SMU_MSG_HI_PRI                 (1U << 2)
> 
>  /* Firmware capability flags */
>  #define SMU_FW_CAP_RAS_PRI             (1U << 0)
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
> index 02a455a31c25..17e0303f603b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
> @@ -106,7 +106,7 @@ const struct cmn2asic_msg_mapping 
> smu_v13_0_12_message_map[SMU_MSG_MAX_COUNT] =
>         MSG_MAP(GetDpmFreqByIndex,                   
> PPSMC_MSG_GetDpmFreqByIndex,               1),
>         MSG_MAP(SetPptLimit,                         PPSMC_MSG_SetPptLimit,   
>                   0),
>         MSG_MAP(GetPptLimit,                         PPSMC_MSG_GetPptLimit,   
>                   1),
> -       MSG_MAP(GfxDeviceDriverReset,                
> PPSMC_MSG_GfxDriverReset,                  SMU_MSG_RAS_PRI),
> +       MSG_MAP(GfxDeviceDriverReset,                
> PPSMC_MSG_GfxDriverReset,                  SMU_MSG_RAS_PRI | SMU_MSG_HI_PRI),
>         MSG_MAP(DramLogSetDramAddrHigh,              
> PPSMC_MSG_DramLogSetDramAddrHigh,          0),
>         MSG_MAP(DramLogSetDramAddrLow,               
> PPSMC_MSG_DramLogSetDramAddrLow,           0),
>         MSG_MAP(DramLogSetDramSize,                  
> PPSMC_MSG_DramLogSetDramSize,              0),
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> index 9cc294f4708b..c22b3f646355 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> @@ -145,7 +145,7 @@ static const struct cmn2asic_msg_mapping 
> smu_v13_0_6_message_map[SMU_MSG_MAX_COU
>         MSG_MAP(GetDpmFreqByIndex,                   
> PPSMC_MSG_GetDpmFreqByIndex,               1),
>         MSG_MAP(SetPptLimit,                         PPSMC_MSG_SetPptLimit,   
>                   0),
>         MSG_MAP(GetPptLimit,                         PPSMC_MSG_GetPptLimit,   
>                   1),
> -       MSG_MAP(GfxDeviceDriverReset,                
> PPSMC_MSG_GfxDriverReset,                  SMU_MSG_RAS_PRI),
> +       MSG_MAP(GfxDeviceDriverReset,                
> PPSMC_MSG_GfxDriverReset,                  SMU_MSG_RAS_PRI | SMU_MSG_HI_PRI),
>         MSG_MAP(DramLogSetDramAddrHigh,              
> PPSMC_MSG_DramLogSetDramAddrHigh,          0),
>         MSG_MAP(DramLogSetDramAddrLow,               
> PPSMC_MSG_DramLogSetDramAddrLow,           0),
>         MSG_MAP(DramLogSetDramSize,                  
> PPSMC_MSG_DramLogSetDramSize,              0),
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index 59f9abd0f7b8..f1f5cd8c2cd9 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -256,11 +256,12 @@ static int __smu_cmn_ras_filter_msg(struct smu_context 
> *smu,  {
>         struct amdgpu_device *adev = smu->adev;
>         uint32_t flags, resp;
> -       bool fed_status;
> +       bool fed_status, pri;
> 
>         flags = __smu_cmn_get_msg_flags(smu, msg);
>         *poll = true;
> 
> +       pri = !!(flags & SMU_MSG_HI_PRI);
>         /* When there is RAS fatal error, FW won't process non-RAS priority
>          * messages. Don't allow any messages other than RAS priority 
> messages.
>          */
> @@ -272,15 +273,18 @@ static int __smu_cmn_ras_filter_msg(struct smu_context 
> *smu,
>                                 smu_get_message_name(smu, msg));
>                         return -EACCES;
>                 }
> +       }
> 
> +       if (pri || fed_status) {
>                 /* FW will ignore non-priority messages when a RAS fatal error
> -                * is detected. Hence it is possible that a previous message
> -                * wouldn't have got response. Allow to continue without 
> polling
> -                * for response status for priority messages.
> +                * or reset condition is detected. Hence it is possible that a
> +                * previous message wouldn't have got response. Allow to
> +                * continue without polling for response status for priority
> +                * messages.
>                  */
>                 resp = RREG32(smu->resp_reg);
>                 dev_dbg(adev->dev,
> -                       "Sending RAS priority message %s response status: %x",
> +                       "Sending priority message %s response status: %x",
>                         smu_get_message_name(smu, msg), resp);
>                 if (resp == 0)
>                         *poll = false;
> --
> 2.49.0
> 

Reply via email to