RE: [PATCH 4/5] drm/amdgpu: Query boot status if discovery failed
[AMD Official Use Only - General]

Yes, it is.

Regards,
Hawking

From: Deucher, Alexander
Sent: Wednesday, January 3, 2024 03:45
To: Zhang, Hawking; amd-gfx@lists.freedesktop.org; Zhou1, Tao; Yang, Stanley; Wang, Yang(Kevin); Chai, Thomas; Li, Candice
Cc: Lazar, Lijo; Ma, Le
Subject: Re: [PATCH 4/5] drm/amdgpu: Query boot status if discovery failed

[AMD Official Use Only - General]

Is mmIP_DISCOVERY_VERSION at the same offset across ASIC families?

Alex

From: Hawking Zhang <hawking.zh...@amd.com>
Sent: Monday, January 1, 2024 10:43 PM
To: amd-gfx@lists.freedesktop.org; Zhou1, Tao <tao.zh...@amd.com>; Yang, Stanley <stanley.y...@amd.com>; Wang, Yang(Kevin) <kevinyang.w...@amd.com>; Chai, Thomas <yipeng.c...@amd.com>; Li, Candice <candice...@amd.com>
Cc: Zhang, Hawking <hawking.zh...@amd.com>; Deucher, Alexander <alexander.deuc...@amd.com>; Lazar, Lijo <lijo.la...@amd.com>; Ma, Le <le...@amd.com>
Subject: [PATCH 4/5] drm/amdgpu: Query boot status if discovery failed

Check and report boot status if discovery failed.

Signed-off-by: Hawking Zhang <hawking.zh...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index b8fde08aec8e..302b71e9f1e2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -27,6 +27,7 @@
 #include "amdgpu_discovery.h"
 #include "soc15_hw_ip.h"
 #include "discovery.h"
+#include "amdgpu_ras.h"
 
 #include "soc15.h"
 #include "gfx_v9_0.h"
@@ -98,6 +99,7 @@
 #define FIRMWARE_IP_DISCOVERY "amdgpu/ip_discovery.bin"
 MODULE_FIRMWARE(FIRMWARE_IP_DISCOVERY);
 
+#define mmIP_DISCOVERY_VERSION 0x16A00
 #define mmRCC_CONFIG_MEMSIZE	0xde3
 #define mmMP0_SMN_C2PMSG_33 0x16061
 #define mmMM_INDEX 0x0
@@ -518,7 +520,9 @@ static int amdgpu_discovery_init(struct amdgpu_device *adev)
 
 out:
 	kfree(adev->mman.discovery_bin);
 	adev->mman.discovery_bin = NULL;
-
+	if ((amdgpu_discovery != 2) &&
+	    (RREG32(mmIP_DISCOVERY_VERSION) == 4))
+		amdgpu_ras_query_boot_status(adev, 4);
 	return r;
 }
-- 
2.17.1
RE: [PATCH 4/5] drm/amdgpu: Query boot status if discovery failed
[AMD Official Use Only - General]

> I'm not sure about hard-coding 4 instances here. The code you dropped in
> patch 1 was using adev->aid_mask. But I guess that's not even initialized
> correctly if IP discovery failed. Will this work correctly on the APU
> version?

Yes, aid_mask is not initialized. IP_DISCOVERY_VERSION is the only available fuse setting that can be used to identify (or that is equivalent to) 4 AID instances in such a case.

We switched to a common mailbox register that works for both APU and dGPU. The expectation is that on an APU the driver still reports the firmware boot status, while on a dGPU it gives next-level information about the failure if boot fails.

Regards,
Hawking

-----Original Message-----
From: Kuehling, Felix
Sent: Wednesday, January 3, 2024 01:49
To: Zhang, Hawking; amd-gfx@lists.freedesktop.org; Zhou1, Tao; Yang, Stanley; Wang, Yang(Kevin); Chai, Thomas; Li, Candice
Cc: Deucher, Alexander; Ma, Le; Lazar, Lijo
Subject: Re: [PATCH 4/5] drm/amdgpu: Query boot status if discovery failed

On 2024-01-02 09:07, Hawking Zhang wrote:
> Check and report boot status if discovery failed.
>
> Signed-off-by: Hawking Zhang
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index b8fde08aec8e..302b71e9f1e2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -27,6 +27,7 @@
>  #include "amdgpu_discovery.h"
>  #include "soc15_hw_ip.h"
>  #include "discovery.h"
> +#include "amdgpu_ras.h"
>
>  #include "soc15.h"
>  #include "gfx_v9_0.h"
> @@ -98,6 +99,7 @@
>  #define FIRMWARE_IP_DISCOVERY "amdgpu/ip_discovery.bin"
>  MODULE_FIRMWARE(FIRMWARE_IP_DISCOVERY);
>
> +#define mmIP_DISCOVERY_VERSION 0x16A00
>  #define mmRCC_CONFIG_MEMSIZE	0xde3
>  #define mmMP0_SMN_C2PMSG_33 0x16061
>  #define mmMM_INDEX 0x0
> @@ -518,7 +520,9 @@ static int amdgpu_discovery_init(struct amdgpu_device
> *adev)
>
>  out:
>  	kfree(adev->mman.discovery_bin);
>  	adev->mman.discovery_bin = NULL;
> -
> +	if ((amdgpu_discovery != 2) &&
> +	    (RREG32(mmIP_DISCOVERY_VERSION) == 4))
> +		amdgpu_ras_query_boot_status(adev, 4);

I'm not sure about hard-coding 4 instances here. The code you dropped in patch 1 was using adev->aid_mask. But I guess that's not even initialized correctly if IP discovery failed. Will this work correctly on the APU version?

Regards,
Felix

>  	return r;
>  }
>
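The gating logic discussed in this exchange can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: `struct fake_dev` and `boot_status_instances()` are hypothetical names invented for the sketch, and the register read is replaced by a plain field. The one thing taken from the patch is the condition itself: skip the query when the `amdgpu_discovery` parameter is 2, and otherwise query 4 instances only when the IP_DISCOVERY_VERSION value reads back as 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for driver state; NOT the real struct amdgpu_device. */
struct fake_dev {
	int discovery_mode;         /* plays the role of the amdgpu_discovery parameter */
	uint32_t discovery_version; /* what RREG32(mmIP_DISCOVERY_VERSION) would return */
	uint32_t aid_mask;          /* stays 0 here: not populated when discovery fails */
};

/*
 * Return how many instances to query for boot status after an IP discovery
 * failure, or 0 when the query should be skipped. Mirrors the condition in
 * the patch: mode != 2 and version register == 4 means 4 instances.
 */
static uint32_t boot_status_instances(const struct fake_dev *dev)
{
	assert(dev != NULL);

	if (dev->discovery_mode == 2)
		return 0;
	if (dev->discovery_version != 4)
		return 0;

	/* aid_mask is deliberately ignored: it is not valid on this path. */
	return 4;
}
```

A caller would loop over the returned count and query each instance in turn; a return value of 0 means no boot status query is issued at all.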
Re: [PATCH 4/5] drm/amdgpu: Query boot status if discovery failed
[AMD Official Use Only - General]

Is mmIP_DISCOVERY_VERSION at the same offset across ASIC families?

Alex

From: Hawking Zhang
Sent: Monday, January 1, 2024 10:43 PM
To: amd-gfx@lists.freedesktop.org; Zhou1, Tao; Yang, Stanley; Wang, Yang(Kevin); Chai, Thomas; Li, Candice
Cc: Zhang, Hawking; Deucher, Alexander; Lazar, Lijo; Ma, Le
Subject: [PATCH 4/5] drm/amdgpu: Query boot status if discovery failed

Check and report boot status if discovery failed.

Signed-off-by: Hawking Zhang
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index b8fde08aec8e..302b71e9f1e2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -27,6 +27,7 @@
 #include "amdgpu_discovery.h"
 #include "soc15_hw_ip.h"
 #include "discovery.h"
+#include "amdgpu_ras.h"
 
 #include "soc15.h"
 #include "gfx_v9_0.h"
@@ -98,6 +99,7 @@
 #define FIRMWARE_IP_DISCOVERY "amdgpu/ip_discovery.bin"
 MODULE_FIRMWARE(FIRMWARE_IP_DISCOVERY);
 
+#define mmIP_DISCOVERY_VERSION 0x16A00
 #define mmRCC_CONFIG_MEMSIZE	0xde3
 #define mmMP0_SMN_C2PMSG_33 0x16061
 #define mmMM_INDEX 0x0
@@ -518,7 +520,9 @@ static int amdgpu_discovery_init(struct amdgpu_device *adev)
 
 out:
 	kfree(adev->mman.discovery_bin);
 	adev->mman.discovery_bin = NULL;
-
+	if ((amdgpu_discovery != 2) &&
+	    (RREG32(mmIP_DISCOVERY_VERSION) == 4))
+		amdgpu_ras_query_boot_status(adev, 4);
 	return r;
 }
-- 
2.17.1
Re: [PATCH 4/5] drm/amdgpu: Query boot status if discovery failed
On 2024-01-02 09:07, Hawking Zhang wrote:
> Check and report boot status if discovery failed.
>
> Signed-off-by: Hawking Zhang
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index b8fde08aec8e..302b71e9f1e2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -27,6 +27,7 @@
>  #include "amdgpu_discovery.h"
>  #include "soc15_hw_ip.h"
>  #include "discovery.h"
> +#include "amdgpu_ras.h"
>
>  #include "soc15.h"
>  #include "gfx_v9_0.h"
> @@ -98,6 +99,7 @@
>  #define FIRMWARE_IP_DISCOVERY "amdgpu/ip_discovery.bin"
>  MODULE_FIRMWARE(FIRMWARE_IP_DISCOVERY);
>
> +#define mmIP_DISCOVERY_VERSION 0x16A00
>  #define mmRCC_CONFIG_MEMSIZE	0xde3
>  #define mmMP0_SMN_C2PMSG_33 0x16061
>  #define mmMM_INDEX	0x0
> @@ -518,7 +520,9 @@ static int amdgpu_discovery_init(struct amdgpu_device *adev)
>
>  out:
>  	kfree(adev->mman.discovery_bin);
>  	adev->mman.discovery_bin = NULL;
> -
> +	if ((amdgpu_discovery != 2) &&
> +	    (RREG32(mmIP_DISCOVERY_VERSION) == 4))
> +		amdgpu_ras_query_boot_status(adev, 4);

I'm not sure about hard-coding 4 instances here. The code you dropped in patch 1 was using adev->aid_mask. But I guess that's not even initialized correctly if IP discovery failed. Will this work correctly on the APU version?

Regards,
Felix

>  	return r;
>  }
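The alternative raised above, deriving the instance count from a mask rather than hard-coding it, might look roughly like this. It is a sketch and not what the patch does: it assumes `aid_mask` carries one bit per AID, and both `count_set_bits()` and `num_instances_to_query()` are hypothetical helpers written for illustration (in-kernel code would normally use `hweight32()` for the population count). The fallback argument stands in for the hard-coded 4 used when the mask was never filled in.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count: clears the lowest set bit on each pass. */
static uint32_t count_set_bits(uint32_t mask)
{
	uint32_t n = 0;

	while (mask) {
		mask &= mask - 1;
		n++;
	}
	return n;
}

/*
 * Hypothetical helper: prefer the count implied by aid_mask (assumed to be
 * one bit per AID) and use the fallback only when the mask is empty, for
 * example because IP discovery failed before the mask was populated.
 */
static uint32_t num_instances_to_query(uint32_t aid_mask, uint32_t fallback)
{
	uint32_t n = count_set_bits(aid_mask);

	return n ? n : fallback;
}
```

With an empty mask the helper degenerates to the fixed count, which is the situation on the discovery-failure path described in this thread.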