Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
On 5/14/21 4:13 PM, Alex Deucher wrote:
> On Fri, May 14, 2021 at 4:20 AM wrote:
> >
> > From: changzhu
> >
> > From: Changfeng
> >
> > There is problem with 3DCGCG firmware and it will cause compute test
> > hang on picasso/raven1. It needs to disable 3DCGCG in driver to avoid
> > compute hang.
> >
> > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > Signed-off-by: Changfeng
>
> Reviewed-by: Alex Deucher
>
> With this applied, can we re-enable the additional compute queues?

I didn't push that change, as I was supposed to do more tests with KFD and I probably got distracted by some other activity. Sorry for causing this confusion!

Acked-by: Nirmoy Das

Regards,
Nirmoy

> Alex
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++---
> >  drivers/gpu/drm/amd/amdgpu/soc15.c    |  2 --
> >  2 files changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > index 22608c45f07c..feaa5e4a5538 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > @@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
> >  	amdgpu_gfx_rlc_enter_safe_mode(adev);
> >
> >  	/* Enable 3D CGCG/CGLS */
> > -	if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)) {
> > +	if (enable) {
> >  		/* write cmd to clear cgcg/cgls ov */
> >  		def = data = RREG32_SOC15(GC, 0, mmRLC_CGTT_MGCG_OVERRIDE);
> >  		/* unset CGCG override */
> > @@ -4959,8 +4959,12 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
> >  		/* enable 3Dcgcg FSM(0x363f) */
> >  		def = RREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL_3D);
> >
> > -		data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > -			RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > +		if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)
> > +			data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > +				RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > +		else
> > +			data = 0x0 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT;
> > +
> >  		if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGLS)
> >  			data |= (0x000F << RLC_CGCG_CGLS_CTRL_3D__CGLS_REP_COMPANSAT_DELAY__SHIFT) |
> >  				RLC_CGCG_CGLS_CTRL_3D__CGLS_EN_MASK;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 4b660b2d1c22..080e715799d4 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -1393,7 +1393,6 @@ static int soc15_common_early_init(void *handle)
> >  			adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
> >  				AMD_CG_SUPPORT_GFX_MGLS |
> >  				AMD_CG_SUPPORT_GFX_CP_LS |
> > -				AMD_CG_SUPPORT_GFX_3D_CGCG |
> >  				AMD_CG_SUPPORT_GFX_3D_CGLS |
> >  				AMD_CG_SUPPORT_GFX_CGCG |
> >  				AMD_CG_SUPPORT_GFX_CGLS |
> > @@ -1413,7 +1412,6 @@ static int soc15_common_early_init(void *handle)
> >  				AMD_CG_SUPPORT_GFX_MGLS |
> >  				AMD_CG_SUPPORT_GFX_RLC_LS |
> >  				AMD_CG_SUPPORT_GFX_CP_LS |
> > -				AMD_CG_SUPPORT_GFX_3D_CGCG |
> >  				AMD_CG_SUPPORT_GFX_3D_CGLS |
> >  				AMD_CG_SUPPORT_GFX_CGCG |
> >  				AMD_CG_SUPPORT_GFX_CGLS |
> > --
> > 2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
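To make the effect of the gfx_v9_0.c hunk easier to see, here is a small standalone sketch of how the control value is assembled from the two feature flags after the patch. It is not driver code: the flag bits and field positions below are illustrative stand-ins, chosen only so that enabling both features reproduces the 0x363f mentioned in the patch comment. The real definitions live in the amdgpu headers.

```c
#include <stdint.h>

/* Hypothetical stand-ins for AMD_CG_SUPPORT_GFX_3D_CGCG / _3D_CGLS. */
#define SKETCH_CG_3D_CGCG (1u << 0)
#define SKETCH_CG_3D_CGLS (1u << 1)

/*
 * Assumed field layout, picked so that both features together give 0x363f
 * as stated in the "enable 3Dcgcg FSM(0x363f)" comment of the patch.
 */
#define SKETCH_CGCG_EN_MASK           0x1u
#define SKETCH_CGLS_EN_MASK           0x2u
#define SKETCH_CGLS_REP_DELAY_SHIFT   2
#define SKETCH_CGCG_IDLE_THRESH_SHIFT 8

/* Mirrors the post-patch logic: CGCG and CGLS are gated independently. */
static uint32_t sketch_ctrl_3d_value(uint32_t cg_flags)
{
	uint32_t data;

	if (cg_flags & SKETCH_CG_3D_CGCG)
		data = (0x36u << SKETCH_CGCG_IDLE_THRESH_SHIFT) | SKETCH_CGCG_EN_MASK;
	else
		data = 0x0u << SKETCH_CGCG_IDLE_THRESH_SHIFT;

	if (cg_flags & SKETCH_CG_3D_CGLS)
		data |= (0x000Fu << SKETCH_CGLS_REP_DELAY_SHIFT) | SKETCH_CGLS_EN_MASK;

	return data;
}
```

Under these assumptions, passing both flags yields 0x363f, while passing only the CGLS flag (the picasso/raven1 case once soc15.c drops the 3D CGCG flag) yields 0x3e: the CGLS bits are still programmed, but the CGCG enable bit and idle threshold stay zero.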
Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
On 5/19/21 5:14 AM, Huang, Ray wrote:
> [Public]
>
> I checked, and the patch (below) to disable compute queues for raven has not landed in drm-next. So actually all queues are enabled at this moment. Nirmoy, can we get your confirmation?

I indeed didn't push the commit that disables all but one compute queue for raven. I was supposed to check with KFD, as Felix wanted to know if that bug affects KFD. I think I got distracted with something else.

Regards,
Nirmoy

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 97a8f786cf85..9352fcb77fe9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -812,6 +812,13 @@ void amdgpu_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
>  int amdgpu_gfx_get_num_kcq(struct amdgpu_device *adev)
>  {
>  	if (amdgpu_num_kcq == -1) {
> +		/* raven firmware currently can not load balance jobs
> +		 * among multiple compute queues. Enable only one
> +		 * compute queue till we have a firmware fix.
> +		 */
> +		if (adev->asic_type == CHIP_RAVEN)
> +			return 1;
> +
>  		return 8;
>  	} else if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
>  		dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid parameter provided by user\n");
>
> And I am glad to see that we now have a solution to fix this issue. Nice work, Changfeng!
>
> Best Regards,
> Ray
>
> From: Deucher, Alexander
> Sent: Wednesday, May 19, 2021 11:04 AM
> To: Chen, Guchun; Zhu, Changfeng; Alex Deucher; Das, Nirmoy
> Cc: Huang, Ray; amd-gfx list
> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> [Public]
>
> I thought we had disabled all but one of the compute queues on raven due to this issue or at least disabled the schedulers for the additional queues, but maybe I'm misremembering.
>
> Alex
>
> From: Chen, Guchun <guchun.c...@amd.com>
> Sent: Tuesday, May 18, 2021 11:00 PM
> To: Zhu, Changfeng <changfeng@amd.com>; Deucher, Alexander <alexander.deuc...@amd.com>; Alex Deucher <alexdeuc...@gmail.com>; Das, Nirmoy <nirmoy@amd.com>
> Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
> Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> [Public]
>
> Nirmoy’s patch landed already if I understand correctly.
>
> d41a39dda140 drm/scheduler: improve job distribution with multiple queues
>
> Regards,
> Guchun
>
> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Zhu, Changfeng
> Sent: Wednesday, May 19, 2021 10:56 AM
> To: Deucher, Alexander <alexander.deuc...@amd.com>; Alex Deucher <alexdeuc...@gmail.com>; Das, Nirmoy <nirmoy@amd.com>
> Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
> Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> [Public]
>
> Hi Alex,
>
> This is the issue exposed by Nirmoy's patch that provided better load balancing across queues.
>
> BR,
> Changfeng.
>
> From: Deucher, Alexander <alexander.deuc...@amd.com>
> Sent: Wednesday, May 19, 2021 10:53 AM
> To: Zhu, Changfeng <changfeng@amd.com>; Alex Deucher <alexdeuc...@gmail.com>; Das, Nirmoy <nirmoy....@amd.com>
> Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> [Public]
>
> + Nirmoy
>
> I thought we disabled all but one of the compute queues on raven due to this issue. Maybe that patch never landed? Wasn't this the same issue that was exposed by Nirmoy's patch that provided better load balancing across queues?
>
> Alex
>
> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Zhu, Changfeng <changfeng@amd.com>
> Sent: Tuesday, May 18, 2021 10:28 PM
> To: Alex Deucher <alexdeuc...@gmail.com>
> Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
> Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Alex.
>
> I have submitted the patch: drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> Do you mean we have something else to do for re-enabling the extra compute queues?
>
> BR,
> Changfeng.
>
> -----Original Message-----
> From: Alex Deucher <alexdeuc...@gmail.com>
> Sent: Wednesday, May 19, 2021 10:20 AM
> To
RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
[Public]

I checked, and the patch (below) to disable compute queues for raven has not landed in drm-next. So actually all queues are enabled at this moment. Nirmoy, can we get your confirmation?

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 97a8f786cf85..9352fcb77fe9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -812,6 +812,13 @@ void amdgpu_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
 int amdgpu_gfx_get_num_kcq(struct amdgpu_device *adev)
 {
 	if (amdgpu_num_kcq == -1) {
+		/* raven firmware currently can not load balance jobs
+		 * among multiple compute queues. Enable only one
+		 * compute queue till we have a firmware fix.
+		 */
+		if (adev->asic_type == CHIP_RAVEN)
+			return 1;
+
 		return 8;
 	} else if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
 		dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid parameter provided by user\n");

And I am glad to see that we now have a solution to fix this issue. Nice work, Changfeng!

Best Regards,
Ray

From: Deucher, Alexander
Sent: Wednesday, May 19, 2021 11:04 AM
To: Chen, Guchun; Zhu, Changfeng; Alex Deucher; Das, Nirmoy
Cc: Huang, Ray; amd-gfx list
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

I thought we had disabled all but one of the compute queues on raven due to this issue or at least disabled the schedulers for the additional queues, but maybe I'm misremembering.

Alex

From: Chen, Guchun <guchun.c...@amd.com>
Sent: Tuesday, May 18, 2021 11:00 PM
To: Zhu, Changfeng <changfeng@amd.com>; Deucher, Alexander <alexander.deuc...@amd.com>; Alex Deucher <alexdeuc...@gmail.com>; Das, Nirmoy <nirmoy@amd.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

Nirmoy's patch landed already if I understand correctly.

d41a39dda140 drm/scheduler: improve job distribution with multiple queues

Regards,
Guchun

From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Zhu, Changfeng
Sent: Wednesday, May 19, 2021 10:56 AM
To: Deucher, Alexander <alexander.deuc...@amd.com>; Alex Deucher <alexdeuc...@gmail.com>; Das, Nirmoy <nirmoy@amd.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

Hi Alex,

This is the issue exposed by Nirmoy's patch that provided better load balancing across queues.

BR,
Changfeng.

From: Deucher, Alexander <alexander.deuc...@amd.com>
Sent: Wednesday, May 19, 2021 10:53 AM
To: Zhu, Changfeng <changfeng@amd.com>; Alex Deucher <alexdeuc...@gmail.com>; Das, Nirmoy <nirmoy....@amd.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

+ Nirmoy

I thought we disabled all but one of the compute queues on raven due to this issue. Maybe that patch never landed? Wasn't this the same issue that was exposed by Nirmoy's patch that provided better load balancing across queues?

Alex

From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Zhu, Changfeng <changfeng@amd.com>
Sent: Tuesday, May 18, 2021 10:28 PM
To: Alex Deucher <alexdeuc...@gmail.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[AMD Official Use Only - Internal Distribution Only]

Hi Alex.

I have submitted the patch: drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Do you mean we have something else to do for re-enabling the extra compute queues?

BR,
Changfeng.

-----Original Message-----
From: Alex Deucher <alexdeuc...@gmail.com>
Sent: Wednesday, May 19, 2021 10:20 AM
To: Zhu, Changfeng <changfeng....@amd.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Care to submit a patch to re-enable the extra compute qu
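For readers following the amdgpu_gfx.c workaround quoted in this message, the selection logic can be sketched as a standalone function. This is only an approximation under stated assumptions: the enum, the constant, and the function name are hypothetical, and the final "return the user's value" branch is not visible in the quoted hunk, so it is assumed here.

```c
#include <stdio.h>

/* Hypothetical stand-ins; the driver uses adev->asic_type and CHIP_RAVEN. */
enum sketch_asic { SKETCH_ASIC_RAVEN, SKETCH_ASIC_OTHER };

#define SKETCH_MAX_KCQ 8

/*
 * num_kcq_param plays the role of the amdgpu_num_kcq module parameter:
 * -1 means "let the driver decide".
 */
static int sketch_get_num_kcq(int num_kcq_param, enum sketch_asic asic)
{
	if (num_kcq_param == -1) {
		/* The unmerged workaround: a single compute queue on raven. */
		if (asic == SKETCH_ASIC_RAVEN)
			return 1;
		return SKETCH_MAX_KCQ;
	}

	if (num_kcq_param > SKETCH_MAX_KCQ || num_kcq_param < 0) {
		fprintf(stderr, "invalid kcq count %d, falling back to %d\n",
			num_kcq_param, SKETCH_MAX_KCQ);
		return SKETCH_MAX_KCQ;
	}

	/* Assumed: a valid explicit value is honoured as-is. */
	return num_kcq_param;
}
```

Since that workaround never landed, the behaviour discussed in the thread corresponds to the default branch returning 8 on raven as well, which is why all queues were in fact enabled.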
Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
[Public]

I thought we had disabled all but one of the compute queues on raven due to this issue or at least disabled the schedulers for the additional queues, but maybe I'm misremembering.

Alex

From: Chen, Guchun
Sent: Tuesday, May 18, 2021 11:00 PM
To: Zhu, Changfeng; Deucher, Alexander; Alex Deucher; Das, Nirmoy
Cc: Huang, Ray; amd-gfx list
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

Nirmoy’s patch landed already if I understand correctly.

d41a39dda140 drm/scheduler: improve job distribution with multiple queues

Regards,
Guchun

From: amd-gfx On Behalf Of Zhu, Changfeng
Sent: Wednesday, May 19, 2021 10:56 AM
To: Deucher, Alexander; Alex Deucher; Das, Nirmoy
Cc: Huang, Ray; amd-gfx list
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

Hi Alex,

This is the issue exposed by Nirmoy's patch that provided better load balancing across queues.

BR,
Changfeng.

From: Deucher, Alexander <alexander.deuc...@amd.com>
Sent: Wednesday, May 19, 2021 10:53 AM
To: Zhu, Changfeng <changfeng@amd.com>; Alex Deucher <alexdeuc...@gmail.com>; Das, Nirmoy <nirmoy@amd.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

+ Nirmoy

I thought we disabled all but one of the compute queues on raven due to this issue. Maybe that patch never landed? Wasn't this the same issue that was exposed by Nirmoy's patch that provided better load balancing across queues?

Alex

From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Zhu, Changfeng <changfeng@amd.com>
Sent: Tuesday, May 18, 2021 10:28 PM
To: Alex Deucher <alexdeuc...@gmail.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[AMD Official Use Only - Internal Distribution Only]

Hi Alex.

I have submitted the patch: drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Do you mean we have something else to do for re-enabling the extra compute queues?

BR,
Changfeng.

-----Original Message-----
From: Alex Deucher <alexdeuc...@gmail.com>
Sent: Wednesday, May 19, 2021 10:20 AM
To: Zhu, Changfeng <changfeng@amd.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Care to submit a patch to re-enable the extra compute queues?

Alex

On Mon, May 17, 2021 at 4:09 AM Zhu, Changfeng <changfeng@amd.com> wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Ray and Alex,
>
> I have confirmed it can enable the additional compute queues with this patch:
>
> [ 41.823013] This is ring mec 1, pipe 0, queue 0, value 1
> [ 41.823028] This is ring mec 1, pipe 1, queue 0, value 1
> [ 41.823042] This is ring mec 1, pipe 2, queue 0, value 1
> [ 41.823057] This is ring mec 1, pipe 3, queue 0, value 1
> [ 41.823071] This is ring mec 1, pipe 0, queue 1, value 1
> [ 41.823086] This is ring mec 1, pipe 1, queue 1, value 1
> [ 41.823101] This is ring mec 1, pipe 2, queue 1, value 1
> [ 41.823115] This is ring mec 1, pipe 3, queue 1, value 1
>
> BR,
> Changfeng.
>
>
> -----Original Message-----
> From: Huang, Ray <ray.hu...@amd.com>
> Sent: Monday, May 17, 2021 2:27 PM
> To: Alex Deucher <alexdeuc...@gmail.com>; Zhu, Changfeng <changfeng@amd.com>
> Cc: amd-gfx list <amd-gfx@lists.freedesktop.org>
> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> > On Fri, May 14, 2021 at 4:20 AM <changfeng@amd.com> wrote:
> > >
> > > From: changzhu <changfeng@amd.com>
> > >
> > > From: Changfeng <changfeng@amd.com>
> > >
> > > There is problem with 3DCGCG firmware and it will cause compute
> > > test hang on picasso/raven1. It needs to disable 3DCGCG in driver
> > > to avoid compute hang.
> > >
> > > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > > Signed-off-by: Changfeng <changfeng@amd.com>
> >
> > Reviewed-by: Alex
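The eight log lines quoted in this message list the rings in a fixed order: the pipe index cycles through 0..3 first, then the queue index advances. The sketch below only reproduces that visible ordering for a ring index 0..7. The struct, the function name, and the constant are hypothetical, the pipe count is read off the log rather than taken from the driver, and the meaning of the trailing "value 1" in the log is not stated in the thread, so it is left out.

```c
/* Assumption read off the log: one MEC (mec 1) with pipes 0..3 in use. */
#define SKETCH_PIPES_PER_MEC 4

struct sketch_ring_slot {
	int mec;
	int pipe;
	int queue;
};

/* Map a compute ring index to the (mec, pipe, queue) order seen in the log. */
static struct sketch_ring_slot sketch_ring_slot_for(int ring_index)
{
	struct sketch_ring_slot slot;

	slot.mec = 1;                                    /* every logged ring is on mec 1 */
	slot.pipe = ring_index % SKETCH_PIPES_PER_MEC;   /* pipe varies fastest */
	slot.queue = ring_index / SKETCH_PIPES_PER_MEC;  /* queue advances every 4 rings */
	return slot;
}
```

With this mapping, ring index 0 corresponds to pipe 0, queue 0 and ring index 7 to pipe 3, queue 1, matching the first and last lines of the log. Spreading consecutive rings across pipes before reusing a pipe is simply what the log shows; the thread does not describe the driver's allocation policy.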
RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
[Public]

Nirmoy's patch landed already if I understand correctly.

d41a39dda140 drm/scheduler: improve job distribution with multiple queues

Regards,
Guchun

From: amd-gfx On Behalf Of Zhu, Changfeng
Sent: Wednesday, May 19, 2021 10:56 AM
To: Deucher, Alexander; Alex Deucher; Das, Nirmoy
Cc: Huang, Ray; amd-gfx list
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

Hi Alex,

This is the issue exposed by Nirmoy's patch that provided better load balancing across queues.

BR,
Changfeng.

From: Deucher, Alexander <alexander.deuc...@amd.com>
Sent: Wednesday, May 19, 2021 10:53 AM
To: Zhu, Changfeng <changfeng@amd.com>; Alex Deucher <alexdeuc...@gmail.com>; Das, Nirmoy <nirmoy@amd.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

+ Nirmoy

I thought we disabled all but one of the compute queues on raven due to this issue. Maybe that patch never landed? Wasn't this the same issue that was exposed by Nirmoy's patch that provided better load balancing across queues?

Alex

From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Zhu, Changfeng <changfeng@amd.com>
Sent: Tuesday, May 18, 2021 10:28 PM
To: Alex Deucher <alexdeuc...@gmail.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[AMD Official Use Only - Internal Distribution Only]

Hi Alex.

I have submitted the patch: drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Do you mean we have something else to do for re-enabling the extra compute queues?

BR,
Changfeng.

-----Original Message-----
From: Alex Deucher <alexdeuc...@gmail.com>
Sent: Wednesday, May 19, 2021 10:20 AM
To: Zhu, Changfeng <changfeng@amd.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Care to submit a patch to re-enable the extra compute queues?

Alex

On Mon, May 17, 2021 at 4:09 AM Zhu, Changfeng <changfeng@amd.com> wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Ray and Alex,
>
> I have confirmed it can enable the additional compute queues with this patch:
>
> [ 41.823013] This is ring mec 1, pipe 0, queue 0, value 1
> [ 41.823028] This is ring mec 1, pipe 1, queue 0, value 1
> [ 41.823042] This is ring mec 1, pipe 2, queue 0, value 1
> [ 41.823057] This is ring mec 1, pipe 3, queue 0, value 1
> [ 41.823071] This is ring mec 1, pipe 0, queue 1, value 1
> [ 41.823086] This is ring mec 1, pipe 1, queue 1, value 1
> [ 41.823101] This is ring mec 1, pipe 2, queue 1, value 1
> [ 41.823115] This is ring mec 1, pipe 3, queue 1, value 1
>
> BR,
> Changfeng.
>
>
> -----Original Message-----
> From: Huang, Ray <ray.hu...@amd.com>
> Sent: Monday, May 17, 2021 2:27 PM
> To: Alex Deucher <alexdeuc...@gmail.com>; Zhu, Changfeng <changfeng@amd.com>
> Cc: amd-gfx list <amd-gfx@lists.freedesktop.org>
> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> > On Fri, May 14, 2021 at 4:20 AM <changfeng@amd.com> wrote:
> > >
> > > From: changzhu <changfeng@amd.com>
> > >
> > > From: Changfeng <changfeng@amd.com>
> > >
> > > There is problem with 3DCGCG firmware and it will cause compute
> > > test hang on picasso/raven1. It needs to disable 3DCGCG in driver
> > > to avoid compute hang.
> > >
> > > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > > Signed-off-by: Changfeng <changfeng@amd.com>
> >
> > Reviewed-by: Alex Deucher <alexander.deuc...@amd.com>
> >
> > With this applied, can we re-enable the additional compute queues?
> >
>
> I think so.
>
> Changfeng, could you please confirm this on all raven series?
>
> Patch is Reviewed-by: Huang Rui <ray.hu...@amd.com>
>
> > Alex
> >
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++---
> > >  drivers/gpu/drm/amd/amdgp
RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
[Public]

Hi Alex,

This is the issue exposed by Nirmoy's patch that provided better load balancing across queues.

BR,
Changfeng.

From: Deucher, Alexander
Sent: Wednesday, May 19, 2021 10:53 AM
To: Zhu, Changfeng; Alex Deucher; Das, Nirmoy
Cc: Huang, Ray; amd-gfx list
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[Public]

+ Nirmoy

I thought we disabled all but one of the compute queues on raven due to this issue. Maybe that patch never landed? Wasn't this the same issue that was exposed by Nirmoy's patch that provided better load balancing across queues?

Alex

From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Zhu, Changfeng <changfeng@amd.com>
Sent: Tuesday, May 18, 2021 10:28 PM
To: Alex Deucher <alexdeuc...@gmail.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[AMD Official Use Only - Internal Distribution Only]

Hi Alex.

I have submitted the patch: drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Do you mean we have something else to do for re-enabling the extra compute queues?

BR,
Changfeng.

-----Original Message-----
From: Alex Deucher <alexdeuc...@gmail.com>
Sent: Wednesday, May 19, 2021 10:20 AM
To: Zhu, Changfeng <changfeng@amd.com>
Cc: Huang, Ray <ray.hu...@amd.com>; amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Care to submit a patch to re-enable the extra compute queues?

Alex

On Mon, May 17, 2021 at 4:09 AM Zhu, Changfeng <changfeng@amd.com> wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Ray and Alex,
>
> I have confirmed it can enable the additional compute queues with this patch:
>
> [ 41.823013] This is ring mec 1, pipe 0, queue 0, value 1
> [ 41.823028] This is ring mec 1, pipe 1, queue 0, value 1
> [ 41.823042] This is ring mec 1, pipe 2, queue 0, value 1
> [ 41.823057] This is ring mec 1, pipe 3, queue 0, value 1
> [ 41.823071] This is ring mec 1, pipe 0, queue 1, value 1
> [ 41.823086] This is ring mec 1, pipe 1, queue 1, value 1
> [ 41.823101] This is ring mec 1, pipe 2, queue 1, value 1
> [ 41.823115] This is ring mec 1, pipe 3, queue 1, value 1
>
> BR,
> Changfeng.
>
>
> -----Original Message-----
> From: Huang, Ray <ray.hu...@amd.com>
> Sent: Monday, May 17, 2021 2:27 PM
> To: Alex Deucher <alexdeuc...@gmail.com>; Zhu, Changfeng <changfeng....@amd.com>
> Cc: amd-gfx list <amd-gfx@lists.freedesktop.org>
> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> > On Fri, May 14, 2021 at 4:20 AM <changfeng@amd.com> wrote:
> > >
> > > From: changzhu <changfeng@amd.com>
> > >
> > > From: Changfeng <changfeng@amd.com>
> > >
> > > There is problem with 3DCGCG firmware and it will cause compute
> > > test hang on picasso/raven1. It needs to disable 3DCGCG in driver
> > > to avoid compute hang.
> > >
> > > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > > Signed-off-by: Changfeng <changfeng@amd.com>
> >
> > Reviewed-by: Alex Deucher <alexander.deuc...@amd.com>
> >
> > With this applied, can we re-enable the additional compute queues?
> >
>
> I think so.
>
> Changfeng, could you please confirm this on all raven series?
>
> Patch is Reviewed-by: Huang Rui <ray.hu...@amd.com>
>
> > Alex
> >
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++---
> > >  drivers/gpu/drm/amd/amdgpu/soc15.c    |  2 --
> > >  2 files changed, 7 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > index 22608c45f07c..feaa5e4a5538 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > @@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
> > >  	amdgpu_gfx_rlc_enter_safe_mode(adev);
> > >
> > >  	/* Enable 3D CGCG/CGLS */
> > > -	if
Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
[Public]

+ Nirmoy

I thought we disabled all but one of the compute queues on raven due to this issue. Maybe that patch never landed? Wasn't this the same issue that was exposed by Nirmoy's patch that provided better load balancing across queues?

Alex

From: amd-gfx on behalf of Zhu, Changfeng
Sent: Tuesday, May 18, 2021 10:28 PM
To: Alex Deucher
Cc: Huang, Ray; amd-gfx list
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

[AMD Official Use Only - Internal Distribution Only]

Hi Alex.

I have submitted the patch: drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Do you mean we have something else to do for re-enabling the extra compute queues?

BR,
Changfeng.

-----Original Message-----
From: Alex Deucher
Sent: Wednesday, May 19, 2021 10:20 AM
To: Zhu, Changfeng
Cc: Huang, Ray; amd-gfx list
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Care to submit a patch to re-enable the extra compute queues?

Alex

On Mon, May 17, 2021 at 4:09 AM Zhu, Changfeng wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Ray and Alex,
>
> I have confirmed it can enable the additional compute queues with this patch:
>
> [ 41.823013] This is ring mec 1, pipe 0, queue 0, value 1
> [ 41.823028] This is ring mec 1, pipe 1, queue 0, value 1
> [ 41.823042] This is ring mec 1, pipe 2, queue 0, value 1
> [ 41.823057] This is ring mec 1, pipe 3, queue 0, value 1
> [ 41.823071] This is ring mec 1, pipe 0, queue 1, value 1
> [ 41.823086] This is ring mec 1, pipe 1, queue 1, value 1
> [ 41.823101] This is ring mec 1, pipe 2, queue 1, value 1
> [ 41.823115] This is ring mec 1, pipe 3, queue 1, value 1
>
> BR,
> Changfeng.
>
>
> -----Original Message-----
> From: Huang, Ray
> Sent: Monday, May 17, 2021 2:27 PM
> To: Alex Deucher; Zhu, Changfeng
> Cc: amd-gfx list
> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
>
> On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> > On Fri, May 14, 2021 at 4:20 AM wrote:
> > >
> > > From: changzhu
> > >
> > > From: Changfeng
> > >
> > > There is problem with 3DCGCG firmware and it will cause compute
> > > test hang on picasso/raven1. It needs to disable 3DCGCG in driver
> > > to avoid compute hang.
> > >
> > > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > > Signed-off-by: Changfeng
> >
> > Reviewed-by: Alex Deucher
> >
> > With this applied, can we re-enable the additional compute queues?
> >
>
> I think so.
>
> Changfeng, could you please confirm this on all raven series?
>
> Patch is Reviewed-by: Huang Rui
>
> > Alex
> >
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++---
> > >  drivers/gpu/drm/amd/amdgpu/soc15.c    |  2 --
> > >  2 files changed, 7 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > index 22608c45f07c..feaa5e4a5538 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > @@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
> > >  	amdgpu_gfx_rlc_enter_safe_mode(adev);
> > >
> > >  	/* Enable 3D CGCG/CGLS */
> > > -	if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)) {
> > > +	if (enable) {
> > >  		/* write cmd to clear cgcg/cgls ov */
> > >  		def = data = RREG32_SOC15(GC, 0, mmRLC_CGTT_MGCG_OVERRIDE);
> > >  		/* unset CGCG override */
> > > @@ -4959,8 +4959,12 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
> > >  		/* enable 3Dcgcg FSM(0x363f) */
> > >  		def = RREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL_3D);
> > >
> > > -		data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > > -			RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > > +		if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)
> > > +			data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > > +				RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > > +		else
> > > +
RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
[AMD Official Use Only - Internal Distribution Only]

Hi Alex,

I have submitted the patch:
drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Do you mean we have something else to do for re-enabling the extra compute queues?

BR,
Changfeng.

-----Original Message-----
From: Alex Deucher
Sent: Wednesday, May 19, 2021 10:20 AM
To: Zhu, Changfeng
Cc: Huang, Ray; amd-gfx list
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

Care to submit a patch to re-enable the extra compute queues?

Alex
Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
Care to submit a patch to re-enable the extra compute queues?

Alex

On Mon, May 17, 2021 at 4:09 AM Zhu, Changfeng wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Ray and Alex,
>
> I have confirmed it can enable the additional compute queues with this patch:
>
> [ 41.823013] This is ring mec 1, pipe 0, queue 0, value 1
> [ 41.823028] This is ring mec 1, pipe 1, queue 0, value 1
> [ 41.823042] This is ring mec 1, pipe 2, queue 0, value 1
> [ 41.823057] This is ring mec 1, pipe 3, queue 0, value 1
> [ 41.823071] This is ring mec 1, pipe 0, queue 1, value 1
> [ 41.823086] This is ring mec 1, pipe 1, queue 1, value 1
> [ 41.823101] This is ring mec 1, pipe 2, queue 1, value 1
> [ 41.823115] This is ring mec 1, pipe 3, queue 1, value 1
>
> BR,
> Changfeng.
RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
[AMD Official Use Only - Internal Distribution Only]

Hi Ray and Alex,

I have confirmed that the additional compute queues can be enabled with this patch:

[ 41.823013] This is ring mec 1, pipe 0, queue 0, value 1
[ 41.823028] This is ring mec 1, pipe 1, queue 0, value 1
[ 41.823042] This is ring mec 1, pipe 2, queue 0, value 1
[ 41.823057] This is ring mec 1, pipe 3, queue 0, value 1
[ 41.823071] This is ring mec 1, pipe 0, queue 1, value 1
[ 41.823086] This is ring mec 1, pipe 1, queue 1, value 1
[ 41.823101] This is ring mec 1, pipe 2, queue 1, value 1
[ 41.823115] This is ring mec 1, pipe 3, queue 1, value 1

BR,
Changfeng.

-----Original Message-----
From: Huang, Ray
Sent: Monday, May 17, 2021 2:27 PM
To: Alex Deucher; Zhu, Changfeng
Cc: amd-gfx list
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> Reviewed-by: Alex Deucher
>
> With this applied, can we re-enable the additional compute queues?

I think so.

Changfeng, could you please confirm this on all raven series?

Patch is Reviewed-by: Huang Rui
Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> On Fri, May 14, 2021 at 4:20 AM wrote:
> >
> > From: changzhu
> >
> > From: Changfeng
> >
> > There is a problem with the 3DCGCG firmware that causes the compute
> > test to hang on picasso/raven1. 3DCGCG needs to be disabled in the
> > driver to avoid the compute hang.
> >
> > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > Signed-off-by: Changfeng
>
> Reviewed-by: Alex Deucher
>
> With this applied, can we re-enable the additional compute queues?
>

I think so.

Changfeng, could you please confirm this on all raven series?

Patch is Reviewed-by: Huang Rui

> Alex

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
On Fri, May 14, 2021 at 4:20 AM wrote:
>
> From: changzhu
>
> From: Changfeng
>
> There is a problem with the 3DCGCG firmware that causes the compute
> test to hang on picasso/raven1. 3DCGCG needs to be disabled in the
> driver to avoid the compute hang.
>
> Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> Signed-off-by: Changfeng

Reviewed-by: Alex Deucher

With this applied, can we re-enable the additional compute queues?

Alex
[PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
From: changzhu

From: Changfeng

There is a problem with the 3DCGCG firmware that causes the compute
test to hang on picasso/raven1. 3DCGCG needs to be disabled in the
driver to avoid the compute hang.

Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
Signed-off-by: Changfeng
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++++++---
 drivers/gpu/drm/amd/amdgpu/soc15.c    |  2 --
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 22608c45f07c..feaa5e4a5538 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
 	amdgpu_gfx_rlc_enter_safe_mode(adev);
 
 	/* Enable 3D CGCG/CGLS */
-	if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)) {
+	if (enable) {
 		/* write cmd to clear cgcg/cgls ov */
 		def = data = RREG32_SOC15(GC, 0, mmRLC_CGTT_MGCG_OVERRIDE);
 		/* unset CGCG override */
@@ -4959,8 +4959,12 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
 		/* enable 3Dcgcg FSM(0x363f) */
 		def = RREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL_3D);
 
-		data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
-			RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
+		if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)
+			data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
+				RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
+		else
+			data = 0x0 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT;
+
 		if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGLS)
 			data |= (0x000F << RLC_CGCG_CGLS_CTRL_3D__CGLS_REP_COMPANSAT_DELAY__SHIFT) |
 				RLC_CGCG_CGLS_CTRL_3D__CGLS_EN_MASK;
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 4b660b2d1c22..080e715799d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1393,7 +1393,6 @@ static int soc15_common_early_init(void *handle)
 			adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
 				AMD_CG_SUPPORT_GFX_MGLS |
 				AMD_CG_SUPPORT_GFX_CP_LS |
-				AMD_CG_SUPPORT_GFX_3D_CGCG |
 				AMD_CG_SUPPORT_GFX_3D_CGLS |
 				AMD_CG_SUPPORT_GFX_CGCG |
 				AMD_CG_SUPPORT_GFX_CGLS |
@@ -1413,7 +1412,6 @@ static int soc15_common_early_init(void *handle)
 				AMD_CG_SUPPORT_GFX_MGLS |
 				AMD_CG_SUPPORT_GFX_RLC_LS |
 				AMD_CG_SUPPORT_GFX_CP_LS |
-				AMD_CG_SUPPORT_GFX_3D_CGCG |
 				AMD_CG_SUPPORT_GFX_3D_CGLS |
 				AMD_CG_SUPPORT_GFX_CGCG |
 				AMD_CG_SUPPORT_GFX_CGLS |
-- 
2.17.1
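
For readers following the gfx_v9_0.c hunk, here is a minimal standalone sketch of how the value written to RLC_CGCG_CGLS_CTRL_3D is composed once the CGCG flag is dropped from cg_flags. It is not driver code. The macro names, flag bits and field positions below are hypothetical stand-ins chosen only so that the fully enabled case reproduces the 0x363f mentioned in the patch comment; the real definitions live in the amdgpu register headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the AMD_CG_SUPPORT_GFX_3D_* flags. */
#define CG_SUPPORT_3D_CGCG   (1u << 0)
#define CG_SUPPORT_3D_CGLS   (1u << 1)

/* Assumed field layout, picked to reproduce 0x363f when both are on. */
#define IDLE_THRESHOLD_SHIFT 8
#define CGCG_EN_MASK         0x1u
#define CGLS_DELAY_SHIFT     2
#define CGLS_EN_MASK         0x2u

/* Mirrors the post-patch branch structure: CGCG and CGLS are decided
 * independently, so CGLS can stay enabled while CGCG is off. */
static uint32_t ctrl_3d_value(uint32_t cg_flags)
{
	uint32_t data;

	if (cg_flags & CG_SUPPORT_3D_CGCG)
		data = (0x36u << IDLE_THRESHOLD_SHIFT) | CGCG_EN_MASK;
	else
		data = 0x0u << IDLE_THRESHOLD_SHIFT; /* threshold 0, enable bit clear */

	if (cg_flags & CG_SUPPORT_3D_CGLS)
		data |= (0x000Fu << CGLS_DELAY_SHIFT) | CGLS_EN_MASK;

	return data;
}

/* Small self-check covering the flag combinations discussed in the thread. */
void ctrl_3d_selfcheck(void)
{
	/* Both features: the value named in the patch comment. */
	assert(ctrl_3d_value(CG_SUPPORT_3D_CGCG | CG_SUPPORT_3D_CGLS) == 0x363Fu);
	/* picasso/raven1 after the patch: only the CGLS fields remain. */
	assert(ctrl_3d_value(CG_SUPPORT_3D_CGLS) == 0x3Eu);
}
```

Under these assumptions, the old condition (enable && CGCG flag) would have skipped the whole block once the flag was removed in soc15.c, whereas the new structure still programs the CGLS fields and only leaves the CGCG enable bit and idle threshold at zero.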