On 8/4/2025 11:10 AM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> RE: SMU_MSG_HI_PRI
>
> The name doesn't look clear and generic enough and how about rename to
> SMU_MSG_NO_RESP_CHECK ?
Since we check the response of the sent message, NO_RESP_CHECK may be
confusing. The check is skipped only before sending; will change to
SMU_MSG_NO_PRECHECK before pushing. Thanks!
Thanks,
Lijo
>
> Anyway, the patch is
>
> Reviewed-by: Yang Wang <kevinyang.w...@amd.com>
>
> Best Regards,
> Kevin
>
> -----Original Message-----
> From: Lazar, Lijo <lijo.la...@amd.com>
> Sent: Friday, August 1, 2025 15:43
> To: Wang, Yang(Kevin) <kevinyang.w...@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking <hawking.zh...@amd.com>; Deucher, Alexander
> <alexander.deuc...@amd.com>; Kamal, Asad <asad.ka...@amd.com>; Sun,
> Ce(Overlord) <ce....@amd.com>
> Subject: Re: [PATCH] drm/amd/pm: Add priority messages for SMU v13.0.6
>
>
>
> On 7/31/2025 2:00 PM, Wang, Yang(Kevin) wrote:
>> [AMD Official Use Only - AMD Internal Distribution Only]
>>
>> Patch looks good to me.
>>
>> But I prefer to add a flag such as "link reset" to indicate that the
>> response register may not accurately represent the execution status of msg.
>
> There is nothing wrong with the response register status. The message is
> simply not processed since the firmware is doing link reset state. Hence the
> response status is actually correct in the sense that previous message is not
> processed.
>
>> And mark this flag in the necessary case (such as reset/flr). And the check
>> of the previous msg response can be ignored when the next msg is issued.
>
> A similar thing is done for RAS and there we allow a certain set of messages
> to be sent.
>
> In this case, we are marking messages as priority messages which are expected
> to be processed any time regardless of the execution status of the previous
> message. Reset is considered one such category of messages.
> > Additionally, this should belong to a global state of an SMU/PMFW, and it
> doesn't seem very reasonable to bind it to a specific MSG.
>> What do you think?
>
> Though it is related to state, only specific messages will be processed by
> firmware in that state. Here, this is only having a single message.
> If there are others too, those will be flagged as well.
>
>>
>> Btw, the msg of GfxDriverReset/FLR both can cause the response register to
>> be cleared.
>>
>
> This is only bypassing the checks before sending reset message, not after
> sending reset message.
>
> Thanks,
> Lijo
>
>> Best Regards,
>> Kevin
>>
>> -----Original Message-----
>> From: Lazar, Lijo <lijo.la...@amd.com>
>> Sent: Thursday, July 31, 2025 2:09 PM
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Zhang, Hawking <hawking.zh...@amd.com>; Deucher, Alexander
>> <alexander.deuc...@amd.com>; Kamal, Asad <asad.ka...@amd.com>; Sun,
>> Ce(Overlord) <ce....@amd.com>; Wang, Yang(Kevin)
>> <kevinyang.w...@amd.com>
>> Subject: [PATCH] drm/amd/pm: Add priority messages for SMU v13.0.6
>>
>> Certain messages will processed with high priority by PMFW even if it
>> hasn't responded to a previous message. Send the priority message
>> regardless of the success/fail status of the previous message. Add
>> support on SMUv13.0.6 and SMUv13.0.12
>>
>> Signed-off-by: Lijo Lazar <lijo.la...@amd.com>
>> ---
>> drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 1 +
>> .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c | 2 +-
>> .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 +-
>> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 14 +++++++++-----
>> 4 files changed, 12 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
>> b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
>> index d7a9e41820fa..aaf148591a98 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h
>> @@ -469,6 +469,7 @@ enum smu_feature_mask {
>> /* Message category flags */
>> #define SMU_MSG_VF_FLAG (1U << 0)
>> #define SMU_MSG_RAS_PRI (1U << 1)
>> +#define SMU_MSG_HI_PRI (1U << 2)
>>
>> /* Firmware capability flags */
>> #define SMU_FW_CAP_RAS_PRI (1U << 0)
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
>> b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
>> index 02a455a31c25..17e0303f603b 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c
>> @@ -106,7 +106,7 @@ const struct cmn2asic_msg_mapping
>> smu_v13_0_12_message_map[SMU_MSG_MAX_COUNT] =
>> MSG_MAP(GetDpmFreqByIndex,
>> PPSMC_MSG_GetDpmFreqByIndex, 1),
>> MSG_MAP(SetPptLimit, PPSMC_MSG_SetPptLimit,
>> 0),
>> MSG_MAP(GetPptLimit, PPSMC_MSG_GetPptLimit,
>> 1),
>> - MSG_MAP(GfxDeviceDriverReset,
>> PPSMC_MSG_GfxDriverReset, SMU_MSG_RAS_PRI),
>> + MSG_MAP(GfxDeviceDriverReset,
>> PPSMC_MSG_GfxDriverReset, SMU_MSG_RAS_PRI | SMU_MSG_HI_PRI),
>> MSG_MAP(DramLogSetDramAddrHigh,
>> PPSMC_MSG_DramLogSetDramAddrHigh, 0),
>> MSG_MAP(DramLogSetDramAddrLow,
>> PPSMC_MSG_DramLogSetDramAddrLow, 0),
>> MSG_MAP(DramLogSetDramSize,
>> PPSMC_MSG_DramLogSetDramSize, 0),
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
>> b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
>> index 9cc294f4708b..c22b3f646355 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
>> @@ -145,7 +145,7 @@ static const struct cmn2asic_msg_mapping
>> smu_v13_0_6_message_map[SMU_MSG_MAX_COU
>> MSG_MAP(GetDpmFreqByIndex,
>> PPSMC_MSG_GetDpmFreqByIndex, 1),
>> MSG_MAP(SetPptLimit, PPSMC_MSG_SetPptLimit,
>> 0),
>> MSG_MAP(GetPptLimit, PPSMC_MSG_GetPptLimit,
>> 1),
>> - MSG_MAP(GfxDeviceDriverReset,
>> PPSMC_MSG_GfxDriverReset, SMU_MSG_RAS_PRI),
>> + MSG_MAP(GfxDeviceDriverReset,
>> PPSMC_MSG_GfxDriverReset, SMU_MSG_RAS_PRI | SMU_MSG_HI_PRI),
>> MSG_MAP(DramLogSetDramAddrHigh,
>> PPSMC_MSG_DramLogSetDramAddrHigh, 0),
>> MSG_MAP(DramLogSetDramAddrLow,
>> PPSMC_MSG_DramLogSetDramAddrLow, 0),
>> MSG_MAP(DramLogSetDramSize,
>> PPSMC_MSG_DramLogSetDramSize, 0),
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
>> b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
>> index 59f9abd0f7b8..f1f5cd8c2cd9 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
>> @@ -256,11 +256,12 @@ static int __smu_cmn_ras_filter_msg(struct smu_context
>> *smu, {
>> struct amdgpu_device *adev = smu->adev;
>> uint32_t flags, resp;
>> - bool fed_status;
>> + bool fed_status, pri;
>>
>> flags = __smu_cmn_get_msg_flags(smu, msg);
>> *poll = true;
>>
>> + pri = !!(flags & SMU_MSG_HI_PRI);
>> /* When there is RAS fatal error, FW won't process non-RAS priority
>> * messages. Don't allow any messages other than RAS priority
>> messages.
>> */
>> @@ -272,15 +273,18 @@ static int __smu_cmn_ras_filter_msg(struct smu_context
>> *smu,
>> smu_get_message_name(smu, msg));
>> return -EACCES;
>> }
>> + }
>>
>> + if (pri || fed_status) {
>> /* FW will ignore non-priority messages when a RAS fatal
>> error
>> - * is detected. Hence it is possible that a previous message
>> - * wouldn't have got response. Allow to continue without
>> polling
>> - * for response status for priority messages.
>> + * or reset condition is detected. Hence it is possible that
>> a
>> + * previous message wouldn't have got response. Allow to
>> + * continue without polling for response status for priority
>> + * messages.
>> */
>> resp = RREG32(smu->resp_reg);
>> dev_dbg(adev->dev,
>> - "Sending RAS priority message %s response status:
>> %x",
>> + "Sending priority message %s response status:
>> + %x",
>> smu_get_message_name(smu, msg), resp);
>> if (resp == 0)
>> *poll = false;
>> --
>> 2.49.0
>>
>