RE: [PATCH] drm/amdgpu/mes: refine for maximum packet execution
[Public]

The maximum sched_hw_submission value will be set to 8. With that value, the maximum number of packets executed will not be greater than what the firmware allows.

Regards,
Jack

-----Original Message-----
From: Deucher, Alexander
Sent: Tuesday, August 6, 2024 10:34 PM
To: Xiao, Jack; amd-gfx@lists.freedesktop.org; ckoenig.leichtzumer...@gmail.com
Subject: RE: [PATCH] drm/amdgpu/mes: refine for maximum packet execution

[Public]

> -----Original Message-----
> From: Xiao, Jack
> Sent: Tuesday, July 23, 2024 4:28 AM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander; ckoenig.leichtzumer...@gmail.com
> Cc: Xiao, Jack
> Subject: [PATCH] drm/amdgpu/mes: refine for maximum packet execution
>
> Only allow API_NUMBER_OF_COMMAND_MAX packets in the mes ring buffer,
> refine the code for maximum packet execution.
>
> Signed-off-by: Jack Xiao

Any updates on this patch?

Alex

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 2 ++
>  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c   | 2 +-
>  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c   | 2 +-
>  3 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index ad49cecb20b8..dfe487521a4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -212,6 +212,8 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>  	 */
>  	if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
>  		sched_hw_submission = max(sched_hw_submission, 256);
> +	else if (ring->funcs->type == AMDGPU_RING_TYPE_MES)
> +		sched_hw_submission = 32;
>  	else if (ring == &adev->sdma.instance[0].page)
>  		sched_hw_submission = 256;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> index 68c74adf79f1..951f13e01ee9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> @@ -198,7 +198,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>
>  	seq = ++ring->fence_drv.sync_seq;
>  	r = amdgpu_fence_wait_polling(ring,
> -				      seq - ring->fence_drv.num_fences_mask,
> +				      seq - (API_NUMBER_OF_COMMAND_MAX/2 - 1),
>  				      timeout);
>  	if (r < 1)
>  		goto error_undo;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> index 48e01206bcc4..dc65f28d1795 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> @@ -184,7 +184,7 @@ static int mes_v12_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>
>  	seq = ++ring->fence_drv.sync_seq;
>  	r = amdgpu_fence_wait_polling(ring,
> -				      seq - ring->fence_drv.num_fences_mask,
> +				      seq - (API_NUMBER_OF_COMMAND_MAX/2 - 1),
>  				      timeout);
>  	if (r < 1)
>  		goto error_undo;
> --
> 2.41.0
RE: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
[AMD Official Use Only - AMD Internal Distribution Only]

There is a max command number for the mes queue in mes_api_def.h:

enum { API_NUMBER_OF_COMMAND_MAX = 32 };

That should explain why the patch can't fix the issue. I will send out another patch that refines the code according to the firmware limitation.

Regards,
Jack

From: amd-gfx On Behalf Of Xiao, Jack
Sent: Tuesday, July 23, 2024 11:08 AM
To: Alex Deucher; Christian König
Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

[AMD Official Use Only - AMD Internal Distribution Only]

> Does this patch fix it? https://patchwork.freedesktop.org/patch/605437/

No, please do not check in that patch, it would break my fix.

Regards,
Jack

From: Alex Deucher <alexdeuc...@gmail.com>
Sent: Tuesday, 23 July 2024 03:52
To: Christian König <ckoenig.leichtzumer...@gmail.com>
Cc: Xiao, Jack <jack.x...@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

Does this patch fix it? https://patchwork.freedesktop.org/patch/605437/

Alex

On Mon, Jul 22, 2024 at 7:21 AM Christian König <ckoenig.leichtzumer...@gmail.com> wrote:
>
> What I meant is that the MES ring is now too small for the number of
> packets written to it.
>
> Either reduce the num_hw_submission or increase the MES ring size.
>
> E.g. see this code here in amdgpu_ring_init:
>
>  	if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
>  		sched_hw_submission = max(sched_hw_submission, 256);
>  	else if (ring == &adev->sdma.instance[0].page)
>  		sched_hw_submission = 256;
>
> We are basically just missing a case for the MES here as far as I can see.
>
> Regards,
> Christian.
>
> Am 22.07.24 um 10:46 schrieb Xiao, Jack:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> >> Can you try to reduce num_hw_submission for the MES ring?
> >
> > Smaller num_hw_submission should not help with this issue, since MES
> > works without the drm scheduler, like the legacy KIQ. A smaller
> > num_hw_submission would only result in a smaller mes ring size and
> > more waiting time.
> >
> > Regards,
> > Jack
> >
> > From: Christian König
> > Sent: Monday, 22 July 2024 16:20
> > To: Xiao, Jack; amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
> >
> > Thx, but in that case this patch here won't help either, it just
> > mitigates the problem.
> >
> > Can you try to reduce num_hw_submission for the MES ring?
> >
> > Thanks,
> > Christian.
> >
> > Am 22.07.24 um 05:27 schrieb Xiao, Jack:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> >> I think we rather need to increase the MES ring size instead.
> >
> > Unfortunately, it doesn't work. I guess the mes firmware has a limitation.
> >
> > Regards,
> > Jack
> >
> > From: Christian König
> > Sent: Friday, 19 July 2024 23:44
> > To: Xiao, Jack; amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
> >
> > Am 19.07.24 um 11:16 schrieb Jack Xiao:
> > > wait memory room until enough before writing mes packets
> > > to avoid ring buffer overflow.
> > >
> > > Signed-off-by: Jack Xiao
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
> > >  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
> > >  2 files changed, 28 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > > index 8ce51b9236c1..68c74adf79f1 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > > @@ -168,7 +168,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
> > >  	const char *op_str, *mis
Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
[AMD Official Use Only - AMD Internal Distribution Only]

> Does this patch fix it? https://patchwork.freedesktop.org/patch/605437/

No, please do not check in that patch, it would break my fix.

Regards,
Jack

From: Alex Deucher
Sent: Tuesday, 23 July 2024 03:52
To: Christian König
Cc: Xiao, Jack; amd-gfx@lists.freedesktop.org; Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

Does this patch fix it? https://patchwork.freedesktop.org/patch/605437/

Alex

On Mon, Jul 22, 2024 at 7:21 AM Christian König wrote:
>
> What I meant is that the MES ring is now too small for the number of
> packets written to it.
>
> Either reduce the num_hw_submission or increase the MES ring size.
>
> E.g. see this code here in amdgpu_ring_init:
>
>  	if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
>  		sched_hw_submission = max(sched_hw_submission, 256);
>  	else if (ring == &adev->sdma.instance[0].page)
>  		sched_hw_submission = 256;
>
> We are basically just missing a case for the MES here as far as I can see.
>
> Regards,
> Christian.
>
> Am 22.07.24 um 10:46 schrieb Xiao, Jack:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> >> Can you try to reduce num_hw_submission for the MES ring?
> >
> > Smaller num_hw_submission should not help with this issue, since MES
> > works without the drm scheduler, like the legacy KIQ. A smaller
> > num_hw_submission would only result in a smaller mes ring size and
> > more waiting time.
> >
> > Regards,
> > Jack
> >
> > From: Christian König
> > Sent: Monday, 22 July 2024 16:20
> > To: Xiao, Jack; amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
> >
> > Thx, but in that case this patch here won't help either, it just
> > mitigates the problem.
> >
> > Can you try to reduce num_hw_submission for the MES ring?
> >
> > Thanks,
> > Christian.
> >
> > Am 22.07.24 um 05:27 schrieb Xiao, Jack:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> >> I think we rather need to increase the MES ring size instead.
> >
> > Unfortunately, it doesn't work. I guess the mes firmware has a limitation.
> >
> > Regards,
> > Jack
> >
> > From: Christian König
> > Sent: Friday, 19 July 2024 23:44
> > To: Xiao, Jack; amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
> >
> > Am 19.07.24 um 11:16 schrieb Jack Xiao:
> > > wait memory room until enough before writing mes packets
> > > to avoid ring buffer overflow.
> > >
> > > Signed-off-by: Jack Xiao
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
> > >  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
> > >  2 files changed, 28 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > > index 8ce51b9236c1..68c74adf79f1 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > > @@ -168,7 +168,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
> > >  	const char *op_str, *misc_op_str;
> > >  	unsigned long flags;
> > >  	u64 status_gpu_addr;
> > > -	u32 status_offset;
> > > +	u32 seq, status_offset;
> > >  	u64 *status_ptr;
> > >  	signed long r;
> > >  	int ret;
> > > @@ -196,6 +196,13 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
> > >  	if (r)
> > >  		goto error_unlock_free;
> > >
> > > +	seq = ++ring->fence_drv.sync_seq;
> > > +	r = amdgpu_fence_wait_polling(ring,
> > > +				      seq - ring->fence_drv.num_fences_mask,
> > > +				      timeout);
> > > +	if (r < 1)
> > > +		goto error_undo;
> > > +
> > >  	api_status = (struct MES_API_STATUS *)((char *)pkt + api_status_off);
> > >  	api_status->api_completion_fence_addr = status_gpu_addr;
> > >  	api_status->api_completion_fence_value = 1;
> > > @@ -208,8 +215,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
> > >  	mes_status_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
> > >  	mes_status_pkt.api_status.api_completion_fence_addr =
> > >  		ring->fence_drv.gpu_addr;
> > > -	mes_status_pkt.api_status.api_completion_fence_v
Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
[AMD Official Use Only - AMD Internal Distribution Only]

>> Can you try to reduce num_hw_submission for the MES ring?

Smaller num_hw_submission should not help with this issue, since MES works without the drm scheduler, like the legacy KIQ. A smaller num_hw_submission would only result in a smaller mes ring size and more waiting time.

Regards,
Jack

From: Christian König
Sent: Monday, 22 July 2024 16:20
To: Xiao, Jack; amd-gfx@lists.freedesktop.org; Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

Thx, but in that case this patch here won't help either, it just mitigates the problem.

Can you try to reduce num_hw_submission for the MES ring?

Thanks,
Christian.

Am 22.07.24 um 05:27 schrieb Xiao, Jack:

[AMD Official Use Only - AMD Internal Distribution Only]

>> I think we rather need to increase the MES ring size instead.

Unfortunately, it doesn't work. I guess the mes firmware has a limitation.

Regards,
Jack

From: Christian König
Sent: Friday, 19 July 2024 23:44
To: Xiao, Jack; amd-gfx@lists.freedesktop.org; Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

Am 19.07.24 um 11:16 schrieb Jack Xiao:
> wait memory room until enough before writing mes packets
> to avoid ring buffer overflow.
>
> Signed-off-by: Jack Xiao
> ---
>  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
>  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
>  2 files changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> index 8ce51b9236c1..68c74adf79f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> @@ -168,7 +168,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	const char *op_str, *misc_op_str;
>  	unsigned long flags;
>  	u64 status_gpu_addr;
> -	u32 status_offset;
> +	u32 seq, status_offset;
>  	u64 *status_ptr;
>  	signed long r;
>  	int ret;
> @@ -196,6 +196,13 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	if (r)
>  		goto error_unlock_free;
>
> +	seq = ++ring->fence_drv.sync_seq;
> +	r = amdgpu_fence_wait_polling(ring,
> +				      seq - ring->fence_drv.num_fences_mask,
> +				      timeout);
> +	if (r < 1)
> +		goto error_undo;
> +
>  	api_status = (struct MES_API_STATUS *)((char *)pkt + api_status_off);
>  	api_status->api_completion_fence_addr = status_gpu_addr;
>  	api_status->api_completion_fence_value = 1;
> @@ -208,8 +215,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	mes_status_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
>  	mes_status_pkt.api_status.api_completion_fence_addr =
>  		ring->fence_drv.gpu_addr;
> -	mes_status_pkt.api_status.api_completion_fence_value =
> -		++ring->fence_drv.sync_seq;
> +	mes_status_pkt.api_status.api_completion_fence_value = seq;
>
>  	amdgpu_ring_write_multiple(ring, &mes_status_pkt,
>  				   sizeof(mes_status_pkt) / 4);
> @@ -229,7 +235,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	dev_dbg(adev->dev, "MES msg=%d was emitted\n",
>  		x_pkt->header.opcode);
>
> -	r = amdgpu_fence_wait_polling(ring, ring->fence_drv.sync_seq, timeout);
> +	r = amdgpu_fence_wait_polling(ring, seq, timeout);
>  	if (r < 1 || !*status_ptr) {
>
>  		if (misc_op_str)
> @@ -252,6 +258,10 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	amdgpu_device_wb_free(adev, status_offset);
>  	return 0;
>
> +error_undo:
> +	dev_err(adev->dev, "MES ring buffer is full.\n");
> +	amdgpu_ring_undo(ring);
> +
> error_unlock_free:
>  	spin_unlock_irqrestore(&mes->ring_lock, flags);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> index c9f74231ad59..48e01206bcc4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> @@ -154,7 +154,7 @@ static int mes_v12_0_submit_pkt_and_poll_com
Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
[AMD Official Use Only - AMD Internal Distribution Only]

>> I think we rather need to increase the MES ring size instead.

Unfortunately, it doesn't work. I guess the mes firmware has a limitation.

Regards,
Jack

From: Christian König
Sent: Friday, 19 July 2024 23:44
To: Xiao, Jack; amd-gfx@lists.freedesktop.org; Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

Am 19.07.24 um 11:16 schrieb Jack Xiao:
> wait memory room until enough before writing mes packets
> to avoid ring buffer overflow.
>
> Signed-off-by: Jack Xiao
> ---
>  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
>  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
>  2 files changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> index 8ce51b9236c1..68c74adf79f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> @@ -168,7 +168,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	const char *op_str, *misc_op_str;
>  	unsigned long flags;
>  	u64 status_gpu_addr;
> -	u32 status_offset;
> +	u32 seq, status_offset;
>  	u64 *status_ptr;
>  	signed long r;
>  	int ret;
> @@ -196,6 +196,13 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	if (r)
>  		goto error_unlock_free;
>
> +	seq = ++ring->fence_drv.sync_seq;
> +	r = amdgpu_fence_wait_polling(ring,
> +				      seq - ring->fence_drv.num_fences_mask,
> +				      timeout);
> +	if (r < 1)
> +		goto error_undo;
> +
>  	api_status = (struct MES_API_STATUS *)((char *)pkt + api_status_off);
>  	api_status->api_completion_fence_addr = status_gpu_addr;
>  	api_status->api_completion_fence_value = 1;
> @@ -208,8 +215,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	mes_status_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
>  	mes_status_pkt.api_status.api_completion_fence_addr =
>  		ring->fence_drv.gpu_addr;
> -	mes_status_pkt.api_status.api_completion_fence_value =
> -		++ring->fence_drv.sync_seq;
> +	mes_status_pkt.api_status.api_completion_fence_value = seq;
>
>  	amdgpu_ring_write_multiple(ring, &mes_status_pkt,
>  				   sizeof(mes_status_pkt) / 4);
> @@ -229,7 +235,7 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	dev_dbg(adev->dev, "MES msg=%d was emitted\n",
>  		x_pkt->header.opcode);
>
> -	r = amdgpu_fence_wait_polling(ring, ring->fence_drv.sync_seq, timeout);
> +	r = amdgpu_fence_wait_polling(ring, seq, timeout);
>  	if (r < 1 || !*status_ptr) {
>
>  		if (misc_op_str)
> @@ -252,6 +258,10 @@ static int mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	amdgpu_device_wb_free(adev, status_offset);
>  	return 0;
>
> +error_undo:
> +	dev_err(adev->dev, "MES ring buffer is full.\n");
> +	amdgpu_ring_undo(ring);
> +
> error_unlock_free:
>  	spin_unlock_irqrestore(&mes->ring_lock, flags);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> index c9f74231ad59..48e01206bcc4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> @@ -154,7 +154,7 @@ static int mes_v12_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	const char *op_str, *misc_op_str;
>  	unsigned long flags;
>  	u64 status_gpu_addr;
> -	u32 status_offset;
> +	u32 seq, status_offset;
>  	u64 *status_ptr;
>  	signed long r;
>  	int ret;
> @@ -182,6 +182,13 @@ static int mes_v12_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>  	if (r)
>  		goto error_unlock_free;
>
> +	seq = ++ring->fence_drv.sync_seq;
> +	r = amdgpu_fence_wait_polling(ring,
> +				      seq - ring->fence_drv.num_fences_mask,

That's what amdgpu_fence_emit_polling() does anyway. So this here just moves the polling a bit earlier.

I think we rather need to increase the MES ring size instead.

Regards,
Christian.

> +				      timeout);
> +	if (r < 1)
> +		goto error_undo;
> +
>  	api_status = (struct MES_API_STATUS *)((cha
RE: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access CP_VMID_RESET
[AMD Official Use Only - General]

I can see that the soft reset can be called from amdgpu_device_gpu_recover and the pci error handler; both call amdgpu_device_lock_reset_domain to acquire a reset lock before the soft reset.

Regards,
Jack

-----Original Message-----
From: Christian König
Sent: Wednesday, December 20, 2023 10:06 PM
To: Xiao, Jack; Alex Deucher
Cc: Deucher, Alexander; amd-gfx@lists.freedesktop.org; Zhang, Hawking
Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access CP_VMID_RESET

Well, not the reset lock, but there should only be a single reset queue which this runs on.

Regards,
Christian.

Am 20.12.23 um 10:49 schrieb Xiao, Jack:
> [AMD Official Use Only - General]
>
> It's already protected by the reset lock. In my understanding, soft reset should not run in parallel.
>
> Regards,
> Jack
>
> -----Original Message-----
> From: Alex Deucher
> Sent: Wednesday, December 20, 2023 1:04 AM
> To: Xiao, Jack
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander; Zhang, Hawking
> Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before
> access CP_VMID_RESET
>
> On Tue, Dec 19, 2023 at 4:30 AM Jack Xiao wrote:
>> It's required to take the gfx mutex before access to CP_VMID_RESET,
>> for there is a race condition with CP firmware to write the register.
>>
>> Signed-off-by: Jack Xiao
>> ---
>>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 20 
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> index bdcf96df69e6..ae3370d34d11 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> @@ -4518,6 +4518,22 @@ static int gfx_v11_0_soft_reset(void *handle)
>>  		}
>>  	}
>>
> We probably want a CPU mutex or spinlock to protect this as well unless this is already protected by the reset lock.
>
> Alex
>
>> +	/* Try to require the gfx mutex before access to CP_VMID_RESET */
>> +	for (i = 0; i < adev->usec_timeout; i++) {
>> +		/* Request with MeId=2, PipeId=0 */
>> +		tmp = REG_SET_FIELD(0, CP_GFX_INDEX_MUTEX, REQUEST, 1);
>> +		tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, CLIENTID, 4);
>> +		WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
>> +		if (RREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX) == tmp)
>> +			break;
>> +		udelay(1);
>> +	}
>> +
>> +	if (i >= adev->usec_timeout) {
>> +		printk("Failed to require the gfx mutex during soft reset\n");
>> +		return -EINVAL;
>> +	}
>> +
>>  	WREG32_SOC15(GC, 0, regCP_VMID_RESET, 0xfffe);
>>
>>  	// Read CP_VMID_RESET register three times.
>> @@ -4526,6 +4542,10 @@ static int gfx_v11_0_soft_reset(void *handle)
>>  	RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>>  	RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>>
>> +	/* release the gfx mutex */
>> +	tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, REQUEST, 0);
>> +	WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
>> +
>>  	for (i = 0; i < adev->usec_timeout; i++) {
>>  		if (!RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) &&
>>  		    !RREG32_SOC15(GC, 0, regCP_GFX_HQD_ACTIVE))
>> --
>> 2.41.0
>>
RE: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access CP_VMID_RESET
[AMD Official Use Only - General]

It's already protected by the reset lock. In my understanding, soft reset should not run in parallel.

Regards,
Jack

-----Original Message-----
From: Alex Deucher
Sent: Wednesday, December 20, 2023 1:04 AM
To: Xiao, Jack
Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander; Zhang, Hawking
Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access CP_VMID_RESET

On Tue, Dec 19, 2023 at 4:30 AM Jack Xiao wrote:
>
> It's required to take the gfx mutex before access to CP_VMID_RESET,
> for there is a race condition with CP firmware to write the register.
>
> Signed-off-by: Jack Xiao
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> index bdcf96df69e6..ae3370d34d11 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> @@ -4518,6 +4518,22 @@ static int gfx_v11_0_soft_reset(void *handle)
>  		}
>  	}
>

We probably want a CPU mutex or spinlock to protect this as well unless this is already protected by the reset lock.

Alex

> +	/* Try to require the gfx mutex before access to CP_VMID_RESET */
> +	for (i = 0; i < adev->usec_timeout; i++) {
> +		/* Request with MeId=2, PipeId=0 */
> +		tmp = REG_SET_FIELD(0, CP_GFX_INDEX_MUTEX, REQUEST, 1);
> +		tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, CLIENTID, 4);
> +		WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
> +		if (RREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX) == tmp)
> +			break;
> +		udelay(1);
> +	}
> +
> +	if (i >= adev->usec_timeout) {
> +		printk("Failed to require the gfx mutex during soft reset\n");
> +		return -EINVAL;
> +	}
> +
>  	WREG32_SOC15(GC, 0, regCP_VMID_RESET, 0xfffe);
>
>  	// Read CP_VMID_RESET register three times.
> @@ -4526,6 +4542,10 @@ static int gfx_v11_0_soft_reset(void *handle)
>  	RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>  	RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>
> +	/* release the gfx mutex */
> +	tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, REQUEST, 0);
> +	WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
> +
>  	for (i = 0; i < adev->usec_timeout; i++) {
>  		if (!RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) &&
>  		    !RREG32_SOC15(GC, 0, regCP_GFX_HQD_ACTIVE))
> --
> 2.41.0
>
RE: [PATCH] drm/amdgpu/mes11: remove aggregated doorbell code
[AMD Official Use Only - General]

If the aggregated doorbell will not be used for user queue oversubscription, it's ok to remove it.

Reviewed-by: Jack Xiao

-----Original Message-----
From: amd-gfx On Behalf Of Alex Deucher
Sent: Friday, October 13, 2023 9:53 PM
To: Deucher, Alexander
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu/mes11: remove aggregated doorbell code

Ping?

On Wed, Oct 11, 2023 at 2:27 PM Alex Deucher wrote:
>
> It's not enabled in hardware so the code is dead.
> Remove it.
>
> Signed-off-by: Alex Deucher
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 86 +-
>  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 56 -
>  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 83 -
>  3 files changed, 40 insertions(+), 185 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> index 27b224b0688a..91c07ab4f14e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> @@ -5170,45 +5170,17 @@ static u64 gfx_v11_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
>  static void gfx_v11_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
>  {
>  	struct amdgpu_device *adev = ring->adev;
> -	uint32_t *wptr_saved;
> -	uint32_t *is_queue_unmap;
> -	uint64_t aggregated_db_index;
> -	uint32_t mqd_size = adev->mqds[AMDGPU_HW_IP_GFX].mqd_size;
> -	uint64_t wptr_tmp;
>
> -	if (ring->is_mes_queue) {
> -		wptr_saved = (uint32_t *)(ring->mqd_ptr + mqd_size);
> -		is_queue_unmap = (uint32_t *)(ring->mqd_ptr + mqd_size +
> -					      sizeof(uint32_t));
> -		aggregated_db_index =
> -			amdgpu_mes_get_aggregated_doorbell_index(adev,
> -								 ring->hw_prio);
> -
> -		wptr_tmp = ring->wptr & ring->buf_mask;
> -		atomic64_set((atomic64_t *)ring->wptr_cpu_addr, wptr_tmp);
> -		*wptr_saved = wptr_tmp;
> -		/* assume doorbell always being used by mes mapped queue */
> -		if (*is_queue_unmap) {
> -			WDOORBELL64(aggregated_db_index, wptr_tmp);
> -			WDOORBELL64(ring->doorbell_index, wptr_tmp);
> -		} else {
> -			WDOORBELL64(ring->doorbell_index, wptr_tmp);
> -
> -			if (*is_queue_unmap)
> -				WDOORBELL64(aggregated_db_index, wptr_tmp);
> -		}
> +	if (ring->use_doorbell) {
> +		/* XXX check if swapping is necessary on BE */
> +		atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
> +		WDOORBELL64(ring->doorbell_index, ring->wptr);
>  	} else {
> -		if (ring->use_doorbell) {
> -			/* XXX check if swapping is necessary on BE */
> -			atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
> -			WDOORBELL64(ring->doorbell_index, ring->wptr);
> -		} else {
> -			WREG32_SOC15(GC, 0, regCP_RB0_WPTR, lower_32_bits(ring->wptr));
> -			WREG32_SOC15(GC, 0, regCP_RB0_WPTR_HI, upper_32_bits(ring->wptr));
> -		}
> +		WREG32_SOC15(GC, 0, regCP_RB0_WPTR, lower_32_bits(ring->wptr));
> +		WREG32_SOC15(GC, 0, regCP_RB0_WPTR_HI, upper_32_bits(ring->wptr));
>  	}
>  }
>
> @@ -5233,42 +5205,14 @@ static u64 gfx_v11_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
>  static void gfx_v11_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
>  {
>  	struct amdgpu_device *adev = ring->adev;
> -	uint32_t *wptr_saved;
> -	uint32_t *is_queue_unmap;
> -	uint64_t aggregated_db_index;
> -	uint32_t mqd_size = adev->mqds[AMDGPU_HW_IP_COMPUTE].mqd_size;
> -	uint64_t wptr_tmp;
>
> -	if (ring->is_mes_queue) {
> -		wptr_saved = (uint32_t *)(ring->mqd_ptr + mqd_size);
> -		is_queue_unmap = (uint32_t *)(ring->mqd_ptr + mqd_size +
> -					      sizeof(uint32_t));
> -		aggregated_db_index =
> -			amdgpu_mes_get_aggregated_doorbell_index(adev,
> -								 ring->hw_prio);
> -
> -		wptr_tmp = ring->wptr & ring->buf_mask;
> -		atomic64_set((atomic64_t *)ring->wptr_cpu_addr, wptr_tmp);
> -		*wptr_saved = wptr_tmp;
> -		/* assume doorbell always used by mes mapped queue */
> -		if (*is_queue_unmap) {
> -
Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM
[AMD Official Use Only - General]

Yes, that should work. In that case the routine would behave like a *reset*: the driver would create a new queue instead of restoring a saved queue. I can have a try with this solution.

Regards,
Jack

From: Alex Deucher
Sent: Monday, 15 May 2023 21:08
To: Xiao, Jack
Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM

On Mon, May 15, 2023 at 6:40 AM Xiao, Jack wrote:
>
> [AMD Official Use Only - General]
>
> The MQD data in VRAM would be lost after S3, because the MQD bo is pinned as a kernel bo and can't be evicted to system memory.
> AFAIK, the driver should not allocate/free memory during S3, as issues have been observed when doing memory allocation during S3.

We restore the contents of the MQD in gfx_v*_0_gfx_init_queue() and gfx_v*_0_kcq_init_queue().

Alex

>
> Regards,
> Jack
>
> -----Original Message-----
> From: Alex Deucher
> Sent: Friday, May 12, 2023 9:13 PM
> To: Xiao, Jack
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> Subject: Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM
>
> On Fri, May 12, 2023 at 4:16 AM Jack Xiao wrote:
> >
> > Make the preemption optimization effective only for SRIOV, for it
> > caused failure to resume from S3.
>
> Can you elaborate? We ultimately want MQDs in VRAM for performance reasons even for bare metal.
>
> Alex
>
> > Signed-off-by: Jack Xiao
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 ++-
> >  drivers/gpu/drm/amd/amdgpu/mes_v10_1.c  | 7 +--
> >  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  | 7 +--
> >  3 files changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > index a22d88a4178a..1b795b7bbf38 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > @@ -385,7 +385,8 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
> >  	u32 domain = AMDGPU_GEM_DOMAIN_GTT;
> >
> >  	/* Only enable on gfx10 and 11 for now to avoid changing behavior on older chips */
> > -	if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0))
> > +	if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0) &&
> > +	    amdgpu_sriov_vf(adev))
> >  		domain |= AMDGPU_GEM_DOMAIN_VRAM;
> >
> >  	/* create MQD for KIQ */
> > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > index 4560476c7c31..5c3d3f6c7ebd 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > @@ -889,6 +889,7 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
> >  {
> >  	int r, mqd_size = sizeof(struct v10_compute_mqd);
> >  	struct amdgpu_ring *ring;
> > +	u32 domain = AMDGPU_GEM_DOMAIN_GTT;
> >
> >  	if (pipe == AMDGPU_MES_KIQ_PIPE)
> >  		ring = &adev->gfx.kiq[0].ring;
> > @@ -900,9 +901,11 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
> >  	if (ring->mqd_obj)
> >  		return 0;
> >
> > +	if (amdgpu_sriov_vf(adev))
> > +		domain |= AMDGPU_GEM_DOMAIN_VRAM;
> > +
> >  	r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
> > -				    AMDGPU_GEM_DOMAIN_VRAM |
> > -				    AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
> > +				    domain, &ring->mqd_obj,
> >  				    &ring->mqd_gpu_addr, &ring->mqd_ptr);
> >  	if (r) {
> >  		dev_warn(adev->dev, "failed to create ring mqd bo (%d)", r);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > index 3adb450eec07..79a4d2bfd94a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > @@ -987,6 +987,7 @@ static int mes_v11_0_mqd_sw_init(struct amdgpu_device *adev,
> >  {
> >  	int r, mqd_size = sizeof(struct v11_compute_mqd);
> >  	struct amdgpu_ring *ring;
> > +	u32 domain = AMDGPU_GEM_DOMAIN_GTT;
> >
> >  	if (pipe == AMDGPU_MES_KIQ_PIPE)
> >  		ring = &adev->gfx.kiq[0].ring;
> > @@ -998,9 +999,11 @@ static int mes_v11_0_mqd_s
RE: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM
[AMD Official Use Only - General] The MQD data in VRAM would be lost after S3, because the MQD bo is pinned as a kernel bo and can't be evicted to system memory. AFAIK, the driver should not allocate/free memory during S3, as issues have been observed when allocating memory during S3. Regards, Jack -Original Message- From: Alex Deucher Sent: Friday, May 12, 2023 9:13 PM To: Xiao, Jack Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander Subject: Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM On Fri, May 12, 2023 at 4:16 AM Jack Xiao wrote: > > Make the preemption optimization take effect only for SRIOV, for it caused > failure to resume from S3. Can you elaborate? We ultimately want MQDs in VRAM for performance reasons even for bare metal. Alex > > Signed-off-by: Jack Xiao > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 ++- > drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 7 +-- > drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 7 +-- > 3 files changed, 12 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > index a22d88a4178a..1b795b7bbf38 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > @@ -385,7 +385,8 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev, > u32 domain = AMDGPU_GEM_DOMAIN_GTT; > > /* Only enable on gfx10 and 11 for now to avoid changing behavior on > older chips */ > - if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0)) > + if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0) && > + amdgpu_sriov_vf(adev)) > domain |= AMDGPU_GEM_DOMAIN_VRAM; > > /* create MQD for KIQ */ > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > index 4560476c7c31..5c3d3f6c7ebd 100644 > --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c > @@ -889,6 +889,7 @@ static int mes_v10_1_mqd_sw_init(struct > amdgpu_device *adev, { > int r, mqd_size =
sizeof(struct v10_compute_mqd); > struct amdgpu_ring *ring; > + u32 domain = AMDGPU_GEM_DOMAIN_GTT; > > if (pipe == AMDGPU_MES_KIQ_PIPE) > ring = &adev->gfx.kiq[0].ring; @@ -900,9 +901,11 @@ > static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev, > if (ring->mqd_obj) > return 0; > > + if (amdgpu_sriov_vf(adev)) > + domain |= AMDGPU_GEM_DOMAIN_VRAM; > + > r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, > - AMDGPU_GEM_DOMAIN_VRAM | > - AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, > + domain, &ring->mqd_obj, > &ring->mqd_gpu_addr, &ring->mqd_ptr); > if (r) { > dev_warn(adev->dev, "failed to create ring mqd bo > (%d)", r); diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > index 3adb450eec07..79a4d2bfd94a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > @@ -987,6 +987,7 @@ static int mes_v11_0_mqd_sw_init(struct > amdgpu_device *adev, { > int r, mqd_size = sizeof(struct v11_compute_mqd); > struct amdgpu_ring *ring; > + u32 domain = AMDGPU_GEM_DOMAIN_GTT; > > if (pipe == AMDGPU_MES_KIQ_PIPE) > ring = &adev->gfx.kiq[0].ring; @@ -998,9 +999,11 @@ > static int mes_v11_0_mqd_sw_init(struct amdgpu_device *adev, > if (ring->mqd_obj) > return 0; > > + if (amdgpu_sriov_vf(adev)) > + domain |= AMDGPU_GEM_DOMAIN_VRAM; > + > r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE, > - AMDGPU_GEM_DOMAIN_VRAM | > - AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj, > + domain, &ring->mqd_obj, > &ring->mqd_gpu_addr, &ring->mqd_ptr); > if (r) { > dev_warn(adev->dev, "failed to create ring mqd bo > (%d)", r); > -- > 2.37.3 >
RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable
[AMD Official Use Only - General] >> The driver is resumed before the core Linux memory management is ready to >> serve allocations. Swap, for example, isn't turned on yet. >> This means that this stuff only worked because we were able to allocate >> memory from the pool, which isn't guaranteed in any way. Memory allocation can fail at any time, and every programmer should handle that correctly. If a memory allocation failure is not a critical error and the driver can gracefully continue to run, it should be acceptable. The memory allocation failure during the mes self test is such a case: it will not hang the system, and the driver can gracefully continue to run. Regards, Jack From: Koenig, Christian Sent: Friday, February 10, 2023 6:25 PM To: Xiao, Jack ; Quan, Evan ; Christian König ; amd-gfx@lists.freedesktop.org; Deucher, Alexander Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Hi Jack, Am 10.02.23 um 10:51 schrieb Xiao, Jack: [AMD Official Use Only - General] Hi Christian, >> Allocating buffers temporarily for stuff like that is illegal during resume. Can you *deeply* explain why it is illegal during the ip late_init stage, which is part of resume? Well no, I don't have the time to explain this to everybody individually. [Jack] ... In my understanding, after gmc is ready the driver can allocate/free kernel bos, and after SDMA is ready, eviction should work. What else prevents the driver from doing that during resume? The driver is resumed before the core Linux memory management is ready to serve allocations. Swap, for example, isn't turned on yet. This means that this stuff only worked because we were able to allocate memory from the pool, which isn't guaranteed in any way. >> I strongly suggest to just remove the MES test. It's abusing the kernel ring >> interface in a way we didn't want anyway and is currently being replaced by >> Shahank's work.
The kernel mes unit test is very meaningful and important: it catches MES firmware issues at the earliest point, before they spread to other components like kfd/umd and the problem becomes complicated; otherwise, such issues would be hard to catch and debug. Secondly, since the mes unit test is self-contained and has no dependencies, it is part of the milestone to qualify MES as ready, indicating that it can be delivered to other components, especially during bring-up. It is like the ring test and ib test indicating that gfx is ready to go. After fully transitioning to gfx user queues, the mes unit test may be the only unit test that can indicate gfx is ready at the very early stage of bring-up, when the UMD is not ready. Alex and I are the maintainers of the driver who decide stuff like that, and at least I don't really buy that argument. The ring, IB and benchmark tests are in the kernel module because they are simple. If we have a complicated unit test like simulating the creation of an MES user queue to test the firmware functionality, then that's really overkill, especially when you need to allocate memory for it. We previously had people requesting to add shader code and other complicated testing and rejected that as well, because it just bloats up the kernel driver unnecessarily. If we can modify the MES test to not abuse the amdgpu_ring structure and only work with memory from the SA, for example, we could keep it, but not in its current state. Regards, Christian. Regards, Jack From: Koenig, Christian <mailto:christian.koe...@amd.com> Sent: Friday, February 10, 2023 4:08 PM To: Quan, Evan <mailto:evan.q...@amd.com>; Christian König <mailto:ckoenig.leichtzumer...@gmail.com>; Xiao, Jack <mailto:jack.x...@amd.com>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Deucher, Alexander <mailto:alexander.deuc...@amd.com> Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Hi Evan, yeah, exactly that's what this warning should prevent.
Allocating buffers temporarily for stuff like that is illegal during resume. I strongly suggest to just remove the MES test. It's abusing the kernel ring interface in a way we didn't want anyway and is currently being replaced by Shahank's work. Regards, Christian. Am 10.02.23 um 05:12 schrieb Quan, Evan: [AMD Official Use Only - General] Hi Jack, Are you trying to fix the call trace that pops up on resume below? It seems mes created some bos for its self test and freed them up later at the final stage of the resume process. All of this happened before the in_suspend flag was cleared, and that triggered the call trace. Is my understanding correct? [74084.799260] WARNING: CPU: 2 PID: 2891 at drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:425 amdgpu_bo_free_kernel+0xfc/0x110 [amdgpu] [74084.811019] Modules linked in: nls_iso8859_1 amdgpu(OE) iommu_v2 gpu_sched drm_buddy drm_ttm_helper ttm
RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable
[AMD Official Use Only - General] Hi Christian, >> Allocating buffers temporarily for stuff like that is illegal during resume. Can you *deeply* explain why it is illegal during the ip late_init stage, which is part of resume? In my understanding, after gmc is ready the driver can allocate/free kernel bos, and after SDMA is ready, eviction should work. What else prevents the driver from doing that during resume? >> I strongly suggest to just remove the MES test. It's abusing the kernel ring >> interface in a way we didn't want anyway and is currently being replaced by >> Shahank's work. The kernel mes unit test is very meaningful and important: it catches MES firmware issues at the earliest point, before they spread to other components like kfd/umd and the problem becomes complicated; otherwise, such issues would be hard to catch and debug. Secondly, since the mes unit test is self-contained and has no dependencies, it is part of the milestone to qualify MES as ready, indicating that it can be delivered to other components, especially during bring-up. It is like the ring test and ib test indicating that gfx is ready to go. After fully transitioning to gfx user queues, the mes unit test may be the only unit test that can indicate gfx is ready at the very early stage of bring-up, when the UMD is not ready. Regards, Jack From: Koenig, Christian Sent: Friday, February 10, 2023 4:08 PM To: Quan, Evan ; Christian König ; Xiao, Jack ; amd-gfx@lists.freedesktop.org; Deucher, Alexander Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Hi Evan, yeah, exactly that's what this warning should prevent. Allocating buffers temporarily for stuff like that is illegal during resume. I strongly suggest to just remove the MES test. It's abusing the kernel ring interface in a way we didn't want anyway and is currently being replaced by Shahank's work. Regards, Christian. Am 10.02.23 um 05:12 schrieb Quan, Evan: [AMD Official Use Only - General] Hi Jack, Are you trying to fix the call trace that pops up on resume below?
It seems mes created some bo for its self test and freed it up later at the final stage of the resuming process. All these happened before the in_suspend flag cleared. And that triggered the call trace. Is my understanding correct? [74084.799260] WARNING: CPU: 2 PID: 2891 at drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:425 amdgpu_bo_free_kernel+0xfc/0x110 [amdgpu] [74084.811019] Modules linked in: nls_iso8859_1 amdgpu(OE) iommu_v2 gpu_sched drm_buddy drm_ttm_helper ttm drm_display_helper drm_kms_helper i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_sm [74084.811042] ip_tables x_tables autofs4 hid_logitech_hidpp hid_logitech_dj hid_generic e1000e usbhid ptp uas hid video i2c_i801 ahci pps_core crc32_pclmul i2c_smbus usb_storage libahci wmi [74084.914519] CPU: 2 PID: 2891 Comm: kworker/u16:38 Tainted: GW IOE 6.0.0-custom #1 [74084.923146] Hardware name: ASUS System Product Name/PRIME Z390-A, BIOS 2004 11/02/2021 [74084.931074] Workqueue: events_unbound async_run_entry_fn [74084.936393] RIP: 0010:amdgpu_bo_free_kernel+0xfc/0x110 [amdgpu] [74084.942422] Code: 00 4d 85 ed 74 08 49 c7 45 00 00 00 00 00 4d 85 e4 74 08 49 c7 04 24 00 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc <0f> 0b e9 39 ff ff ff 3d 00 fe ff ff 0f 85 75 96 47 00 ebf [74084.961199] RSP: :bed6812ebb90 EFLAGS: 00010202 [74084.966435] RAX: RBX: bed6812ebc50 RCX: [74084.973578] RDX: bed6812ebc70 RSI: bed6812ebc60 RDI: bed6812ebc50 [74084.980725] RBP: bed6812ebbb8 R08: R09: 01ff [74084.987869] R10: bed6812ebb40 R11: R12: bed6812ebc70 [74084.995015] R13: bed6812ebc60 R14: 963a2945cc00 R15: 9639c7da5630 [74085.002160] FS: () GS:963d1dc8() knlGS: [74085.010262] CS: 0010 DS: ES: CR0: 80050033 [74085.016016] CR2: CR3: 000377c0a001 CR4: 003706e0 [74085.023164] DR0: DR1: DR2: [74085.030307] DR3: DR6: fffe0ff0 DR7: 0400 [74085.037453] Call Trace: [74085.039911] [74085.042023] amdgpu_mes_self_test+0x385/0x460 [amdgpu] [74085.047293] mes_v11_0_late_init+0x44/0x50 [amdgpu] [74085.052291] 
amdgpu_device_ip_late_init+0x50/0x270 [amdgpu] [74085.058032] amdgpu_device_resume+0xb0/0x2d0 [amdgpu] [74085.063187] amdgpu_pmops_resume+0x37/0x70 [amdgpu] [74085.068162] pci_pm_resume+0x68/0x100 [74085.071836] ? pci_legacy_resume+0x80/0x80 [74085.075943] dpm_run_callback+0x4c/0x160 [74085.079873] device_resume+0xad/0x210 [74085.083546] async_resume+0x1e/0x40 [74085.087046] async_run_entry_fn+0x30/0x120 [74085.091152] process_one_work+0x21a/0x3f0 [74085.095173] worker_thread+0x50/0x3e0 [74085.098845] ? process_one_work+0x3f0/0x3f0 [74085.103039] kthread+0xfa
RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable
[AMD Official Use Only - General] >> >> It's simply not allowed to free up resources during suspend since those can't be acquired again during resume. >> The in_suspend flag is set at the beginning of suspend and unset at the end of resume. It can't filter the case you mentioned. Why not? This is exactly what it should do. [Jack] If resources are freed during resume, the issue you described should not be hit; but checking only the in_suspend flag would treat those cases as warnings. Regards, Jack From: Koenig, Christian Sent: Monday, February 6, 2023 4:06 PM To: Xiao, Jack ; Christian König ; amd-gfx@lists.freedesktop.org; Deucher, Alexander Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Am 06.02.23 um 08:23 schrieb Xiao, Jack: [AMD Official Use Only - General] >> Nope, that is not related to any hw state. We can use another flag. >> It's simply not allowed to free up resources during suspend since those >> can't be acquired again during resume. The in_suspend flag is set at the beginning of suspend and unset at the end of resume. It can't filter the case you mentioned. Why not? This is exactly what it should do. Do you know the root cause of the cases that hit this issue, so that we can find an exact point at which to warn about the freeing behavior? Well, the root cause is programming errors. See, between suspending and resuming you should neither allocate nor free memory; otherwise we can run into trouble. And this check here is one part of that; we should probably add another warning during memory allocation. But this here is certainly correct. Regards, Christian.
Thanks, Jack From: Christian König <mailto:ckoenig.leichtzumer...@gmail.com> Sent: Friday, February 3, 2023 9:20 PM To: Xiao, Jack <mailto:jack.x...@amd.com>; Koenig, Christian <mailto:christian.koe...@amd.com>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Deucher, Alexander <mailto:alexander.deuc...@amd.com> Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Nope, that is not related to any hw state. It's simply not allowed to free up resources during suspend since those can't be acquired again during resume. We had a couple of cases now where this was wrong. If you get a warning from that, please fix the code which tried to free something during suspend instead. Regards, Christian. Am 03.02.23 um 07:04 schrieb Xiao, Jack: [AMD Official Use Only - General] >> It's simply illegal to free up memory during suspend. Why? In my understanding, the limit was caused by DMA shutdown. Regards, Jack From: Koenig, Christian <mailto:christian.koe...@amd.com> Sent: Thursday, February 2, 2023 7:43 PM To: Xiao, Jack <mailto:jack.x...@amd.com>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Deucher, Alexander <mailto:alexander.deuc...@amd.com> Subject: AW: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Big NAK to this! This warning is not related in any way to the hw state. It's simply illegal to free up memory during suspend. Regards, Christian. Von: Xiao, Jack mailto:jack.x...@amd.com>> Gesendet: Donnerstag, 2. Februar 2023 10:54 An: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> mailto:amd-gfx@lists.freedesktop.org>>; Deucher, Alexander mailto:alexander.deuc...@amd.com>>; Koenig, Christian mailto:christian.koe...@amd.com>> Cc: Xiao, Jack mailto:jack.x...@amd.com>> Betreff: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Reduce warnings; only warn when DMA is unavailable.
Signed-off-by: Jack Xiao mailto:jack.x...@amd.com>> --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 2d237f3d3a2e..e3e3764ea697 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -422,7 +422,8 @@ void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 *gpu_addr, if (*bo == NULL) return; - WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend); + WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend && + !amdgpu_ttm_adev((*bo)->tbo.bdev)->ip_blocks[AMD_IP_BLOCK_TYPE_SDMA].status.hw); if (likely(amdgpu_bo_reserve(*bo, true) == 0)) { if (cpu_addr) -- 2.37.3
RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable
[AMD Official Use Only - General] >> Nope, that is not related to any hw state. We can use another flag. >> It's simply not allowed to free up resources during suspend since those >> can't be acquired again during resume. The in_suspend flag is set at the beginning of suspend and unset at the end of resume. It can't filter the case you mentioned. Do you know the root cause of the cases that hit this issue, so that we can find an exact point at which to warn about the freeing behavior? Thanks, Jack From: Christian König Sent: Friday, February 3, 2023 9:20 PM To: Xiao, Jack ; Koenig, Christian ; amd-gfx@lists.freedesktop.org; Deucher, Alexander Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Nope, that is not related to any hw state. It's simply not allowed to free up resources during suspend since those can't be acquired again during resume. We had a couple of cases now where this was wrong. If you get a warning from that, please fix the code which tried to free something during suspend instead. Regards, Christian. Am 03.02.23 um 07:04 schrieb Xiao, Jack: [AMD Official Use Only - General] >> It's simply illegal to free up memory during suspend. Why? In my understanding, the limit was caused by DMA shutdown. Regards, Jack From: Koenig, Christian <mailto:christian.koe...@amd.com> Sent: Thursday, February 2, 2023 7:43 PM To: Xiao, Jack <mailto:jack.x...@amd.com>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Deucher, Alexander <mailto:alexander.deuc...@amd.com> Subject: AW: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Big NAK to this! This warning is not related in any way to the hw state. It's simply illegal to free up memory during suspend. Regards, Christian. Von: Xiao, Jack mailto:jack.x...@amd.com>> Gesendet: Donnerstag, 2.
Februar 2023 10:54 An: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> mailto:amd-gfx@lists.freedesktop.org>>; Deucher, Alexander mailto:alexander.deuc...@amd.com>>; Koenig, Christian mailto:christian.koe...@amd.com>> Cc: Xiao, Jack mailto:jack.x...@amd.com>> Betreff: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Reduce warnings; only warn when DMA is unavailable. Signed-off-by: Jack Xiao mailto:jack.x...@amd.com>> --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 2d237f3d3a2e..e3e3764ea697 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -422,7 +422,8 @@ void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 *gpu_addr, if (*bo == NULL) return; - WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend); + WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend && + !amdgpu_ttm_adev((*bo)->tbo.bdev)->ip_blocks[AMD_IP_BLOCK_TYPE_SDMA].status.hw); if (likely(amdgpu_bo_reserve(*bo, true) == 0)) { if (cpu_addr) -- 2.37.3
RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable
[AMD Official Use Only - General] >> It's simply illegal to free up memory during suspend. Why? In my understanding, the limit was caused by DMA shutdown. Regards, Jack From: Koenig, Christian Sent: Thursday, February 2, 2023 7:43 PM To: Xiao, Jack ; amd-gfx@lists.freedesktop.org; Deucher, Alexander Subject: AW: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Big NAK to this! This warning is not related in any way to the hw state. It's simply illegal to free up memory during suspend. Regards, Christian. ____ Von: Xiao, Jack mailto:jack.x...@amd.com>> Gesendet: Donnerstag, 2. Februar 2023 10:54 An: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> mailto:amd-gfx@lists.freedesktop.org>>; Deucher, Alexander mailto:alexander.deuc...@amd.com>>; Koenig, Christian mailto:christian.koe...@amd.com>> Cc: Xiao, Jack mailto:jack.x...@amd.com>> Betreff: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable Reduce warnings; only warn when DMA is unavailable. Signed-off-by: Jack Xiao mailto:jack.x...@amd.com>> --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 2d237f3d3a2e..e3e3764ea697 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -422,7 +422,8 @@ void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 *gpu_addr, if (*bo == NULL) return; - WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend); + WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend && + !amdgpu_ttm_adev((*bo)->tbo.bdev)->ip_blocks[AMD_IP_BLOCK_TYPE_SDMA].status.hw); if (likely(amdgpu_bo_reserve(*bo, true) == 0)) { if (cpu_addr) -- 2.37.3
Re: [PATCH 2/2] drm/amdgpu: remove deprecated MES version vars
[AMD Official Use Only - General] The series is Reviewed-by: Jack Xiao Regards, Jack From: Sider, Graham Sent: Wednesday, 26 October 2022 03:20 To: amd-gfx@lists.freedesktop.org Cc: Xiao, Jack ; Zhang, Hawking ; Sider, Graham Subject: [PATCH 2/2] drm/amdgpu: remove deprecated MES version vars MES scheduler and kiq versions are stored in mes.sched_version and mes.kiq_version, respectively, which are read from a register after their queues are initialized. Remove mes.ucode_fw_version and mes.data_fw_version which tried to read this versioning info from the firmware headers (which don't contain this information). Signed-off-by: Graham Sider --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 2 -- drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 4 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 4 3 files changed, 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h index ad980f4b66e1..97c05d08a551 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h @@ -91,14 +91,12 @@ struct amdgpu_mes { struct amdgpu_bo*ucode_fw_obj[AMDGPU_MAX_MES_PIPES]; uint64_t ucode_fw_gpu_addr[AMDGPU_MAX_MES_PIPES]; uint32_t*ucode_fw_ptr[AMDGPU_MAX_MES_PIPES]; - uint32_tucode_fw_version[AMDGPU_MAX_MES_PIPES]; uint64_tuc_start_addr[AMDGPU_MAX_MES_PIPES]; /* mes ucode data */ struct amdgpu_bo*data_fw_obj[AMDGPU_MAX_MES_PIPES]; uint64_tdata_fw_gpu_addr[AMDGPU_MAX_MES_PIPES]; uint32_t*data_fw_ptr[AMDGPU_MAX_MES_PIPES]; - uint32_tdata_fw_version[AMDGPU_MAX_MES_PIPES]; uint64_tdata_start_addr[AMDGPU_MAX_MES_PIPES]; /* eop gpu obj */ diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c index 067d10073a56..1abdf8b7ab50 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c @@ -415,10 +415,6 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev, mes_hdr = (const struct mes_firmware_header_v1_0 *) adev->mes.fw[pipe]->data; - 
adev->mes.ucode_fw_version[pipe] = - le32_to_cpu(mes_hdr->mes_ucode_version); - adev->mes.ucode_fw_version[pipe] = - le32_to_cpu(mes_hdr->mes_ucode_data_version); adev->mes.uc_start_addr[pipe] = le32_to_cpu(mes_hdr->mes_uc_start_addr_lo) | ((uint64_t)(le32_to_cpu(mes_hdr->mes_uc_start_addr_hi)) << 32); diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index e14f314902b1..27a330f51c7d 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -480,10 +480,6 @@ static int mes_v11_0_init_microcode(struct amdgpu_device *adev, mes_hdr = (const struct mes_firmware_header_v1_0 *) adev->mes.fw[pipe]->data; - adev->mes.ucode_fw_version[pipe] = - le32_to_cpu(mes_hdr->mes_ucode_version); - adev->mes.ucode_fw_version[pipe] = - le32_to_cpu(mes_hdr->mes_ucode_data_version); adev->mes.uc_start_addr[pipe] = le32_to_cpu(mes_hdr->mes_uc_start_addr_lo) | ((uint64_t)(le32_to_cpu(mes_hdr->mes_uc_start_addr_hi)) << 32); -- 2.25.1
Re: [PATCH v3] drm/amdgpu: dequeue mes scheduler during fini
[AMD Official Use Only - General] Reviewed-by: Jack Xiao Regards, Jack From: amd-gfx on behalf of YuBiao Wang Sent: Tuesday, 18 October 2022 13:37 To: amd-gfx@lists.freedesktop.org Cc: Wang, YuBiao ; Grodzovsky, Andrey ; Xiao, Jack ; Xu, Feifei ; Chen, Horace ; Wang, Yang(Kevin) ; Tuikov, Luben ; Deucher, Alexander ; Quan, Evan ; Koenig, Christian ; Liu, Monk ; Zhang, Hawking Subject: [PATCH v3] drm/amdgpu: dequeue mes scheduler during fini [Why] If mes is not dequeued during fini, mes will be in an uncleaned state during reload, then mes couldn't receive some commands which leads to reload failure. [How] Perform MES dequeue via MMIO after all the unmap jobs are done by mes and before kiq fini. v3: Move the dequeue operation inside kiq_hw_fini. Signed-off-by: YuBiao Wang --- drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 42 -- 1 file changed, 39 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index 1174dcc88db5..b477bed40d61 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -1151,6 +1151,42 @@ static int mes_v11_0_sw_fini(void *handle) return 0; } +static void mes_v11_0_kiq_dequeue_sched(struct amdgpu_device *adev) +{ + uint32_t data; + int i; + + mutex_lock(&adev->srbm_mutex); + soc21_grbm_select(adev, 3, AMDGPU_MES_SCHED_PIPE, 0, 0); + + /* disable the queue if it's active */ + if (RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1) { + WREG32_SOC15(GC, 0, regCP_HQD_DEQUEUE_REQUEST, 1); + for (i = 0; i < adev->usec_timeout; i++) { + if (!(RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1)) + break; + udelay(1); + } + } + data = RREG32_SOC15(GC, 0, regCP_HQD_PQ_DOORBELL_CONTROL); + data = REG_SET_FIELD(data, CP_HQD_PQ_DOORBELL_CONTROL, + DOORBELL_EN, 0); + data = REG_SET_FIELD(data, CP_HQD_PQ_DOORBELL_CONTROL, + DOORBELL_HIT, 1); + WREG32_SOC15(GC, 0, regCP_HQD_PQ_DOORBELL_CONTROL, data); + + WREG32_SOC15(GC, 0, regCP_HQD_PQ_DOORBELL_CONTROL, 0); + + 
WREG32_SOC15(GC, 0, regCP_HQD_PQ_WPTR_LO, 0); + WREG32_SOC15(GC, 0, regCP_HQD_PQ_WPTR_HI, 0); + WREG32_SOC15(GC, 0, regCP_HQD_PQ_RPTR, 0); + + soc21_grbm_select(adev, 0, 0, 0, 0); + mutex_unlock(&adev->srbm_mutex); + + adev->mes.ring.sched.ready = false; +} + static void mes_v11_0_kiq_setting(struct amdgpu_ring *ring) { uint32_t tmp; @@ -1202,6 +1238,9 @@ static int mes_v11_0_kiq_hw_init(struct amdgpu_device *adev) static int mes_v11_0_kiq_hw_fini(struct amdgpu_device *adev) { + if (adev->mes.ring.sched.ready) + mes_v11_0_kiq_dequeue_sched(adev); + mes_v11_0_enable(adev, false); return 0; } @@ -1257,9 +1296,6 @@ static int mes_v11_0_hw_init(void *handle) static int mes_v11_0_hw_fini(void *handle) { - struct amdgpu_device *adev = (struct amdgpu_device *)handle; - - adev->mes.ring.sched.ready = false; return 0; } -- 2.25.1
Re: [PATCH v3] drm/amdgpu: dequeue mes scheduler during fini
[AMD Official Use Only - General] >> + /* dequeue sched pipe via kiq */ >> + void(*kiq_dequeue_sched)(struct >> amdgpu_device *adev); >> + It's unnecessary to expose a new callback prototype to dequeue the scheduler queue, since the kiq_fini or mes_hw_fini callbacks already have the appropriate place in the fini sequence to do this. In addition, it seems the issue is caused by uncleared CP_HQD_PQ_WPTR/CONTROL registers, which would make the scheduler read from an unexpected address once CP_MES_CNTL is enabled: a. since the dequeue operation uses MMIO rather than kiq, can the dequeue operation move to mes_v11_hw_fini? b. if something could go wrong there, it's better to move the dequeue operation inside mes_v11_0_kiq_hw_fini, before disabling the mes controller. Regards, Jack From: YuBiao Wang Sent: Friday, 14 October 2022 11:34 To: amd-gfx@lists.freedesktop.org Cc: Grodzovsky, Andrey ; Quan, Evan ; Chen, Horace ; Tuikov, Luben ; Koenig, Christian ; Deucher, Alexander ; Xiao, Jack ; Zhang, Hawking ; Liu, Monk ; Xu, Feifei ; Wang, Yang(Kevin) ; Wang, YuBiao Subject: [PATCH v3] drm/amdgpu: dequeue mes scheduler during fini Update: Remove redundant comments as Christian suggests. [Why] If mes is not dequeued during fini, mes will be in an uncleaned state during reload, and then mes can't receive some commands, which leads to reload failure. [How] Perform MES dequeue via MMIO after all the unmap jobs are done by mes and before kiq fini.
Signed-off-by: YuBiao Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 3 ++ drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 3 ++ drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 41 +++-- 3 files changed, 44 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h index ad980f4b66e1..ea8efa52503b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h @@ -130,6 +130,9 @@ struct amdgpu_mes { int (*kiq_hw_init)(struct amdgpu_device *adev); int (*kiq_hw_fini)(struct amdgpu_device *adev); + /* dequeue sched pipe via kiq */ + void(*kiq_dequeue_sched)(struct amdgpu_device *adev); + /* ip specific functions */ const struct amdgpu_mes_funcs *funcs; }; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index 257b2e4de600..7c75758f58e1 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c @@ -4406,6 +4406,9 @@ static int gfx_v11_0_hw_fini(void *handle) if (amdgpu_gfx_disable_kcq(adev)) DRM_ERROR("KCQ disable failed\n"); + if (adev->mes.ring.sched.ready && adev->mes.kiq_dequeue_sched) + adev->mes.kiq_dequeue_sched(adev); + amdgpu_mes_kiq_hw_fini(adev); } diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index b48684db2832..eef29646b074 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -44,6 +44,7 @@ MODULE_FIRMWARE("amdgpu/gc_11_0_3_mes1.bin"); static int mes_v11_0_hw_fini(void *handle); static int mes_v11_0_kiq_hw_init(struct amdgpu_device *adev); static int mes_v11_0_kiq_hw_fini(struct amdgpu_device *adev); +static void mes_v11_0_kiq_dequeue_sched(struct amdgpu_device *adev); #define MES_EOP_SIZE 2048 @@ -1078,6 +1079,7 @@ static int mes_v11_0_sw_init(void *handle) adev->mes.funcs = &mes_v11_0_funcs; adev->mes.kiq_hw_init = &mes_v11_0_kiq_hw_init; adev->mes.kiq_hw_fini = &mes_v11_0_kiq_hw_fini; + 
adev->mes.kiq_dequeue_sched = &mes_v11_0_kiq_dequeue_sched; r = amdgpu_mes_init(adev); if (r) @@ -1151,6 +1153,42 @@ static int mes_v11_0_sw_fini(void *handle) return 0; } +static void mes_v11_0_kiq_dequeue_sched(struct amdgpu_device *adev) +{ + uint32_t data; + int i; + + mutex_lock(&adev->srbm_mutex); + soc21_grbm_select(adev, 3, AMDGPU_MES_SCHED_PIPE, 0, 0); + + /* disable the queue if it's active */ + if (RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1) { + WREG32_SOC15(GC, 0, regCP_HQD_DEQUEUE_REQUEST, 1); + for (i = 0; i < adev->usec_timeout; i++) { + if (!(RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1)) + break; + udelay(1); + } + } + data = RREG32_SOC15(GC, 0, regCP_HQD_PQ_DOORBELL_CONTROL); + data = REG_SET_FIELD(data, CP_HQD_PQ_DOORBELL_CONTROL, + DOORBELL_EN, 0); + d
Re: [PATCH 6/7] drm/amdgpu/mes: add mes ring test
[AMD Official Use Only - General] will drop this single patch, and send out another independent patch for this. Thanks, Jack From: Yu, Lang Sent: Wednesday, 29 June 2022 16:35 To: Xiao, Jack Cc: amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 6/7] drm/amdgpu/mes: add mes ring test On 06/29/ , Jack Xiao wrote: > Use read/write register to test mes ring. > > Signed-off-by: Jack Xiao > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 36 + > drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 1 + > drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 6 + > 3 files changed, 43 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c > index b6c2a5058b64..c18ea0bc00eb 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c > @@ -926,6 +926,42 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, > uint32_t reg, >return r; > } > > +int amdgpu_mes_ring_test_ring(struct amdgpu_device *adev) > +{ > + uint32_t scratch; > + uint32_t tmp = 0; > + unsigned i; > + int r = 0; > + > + r = amdgpu_gfx_scratch_get(adev, &scratch); amdgpu_gfx_scratch_get/free() have been removed in latest amd-staging-drm-next. See e9b8129d8ca5 (drm/amdgpu: nuke dynamic gfx scratch reg allocation). 
Regards, Lang > + if (r) { > + DRM_ERROR("amdgpu: mes failed to get scratch reg (%d).\n", r); > + return r; > + } > + > + WREG32(scratch, 0xCAFEDEAD); > + > + tmp = amdgpu_mes_rreg(adev, scratch); > + if (tmp != 0xCAFEDEAD) { > + DRM_ERROR("mes failed to read register\n"); > + goto error; > + } > + > + r = amdgpu_mes_wreg(adev, scratch, 0xDEADBEEF); > + if (r) > + goto error; > + > + tmp = RREG32(scratch); > + if (tmp != 0xDEADBEEF) { > + DRM_ERROR("mes failed to write register\n"); > + r = -EIO; > + } > + > +error: > + amdgpu_gfx_scratch_free(adev, scratch); > + return r; > +} > + > static void > amdgpu_mes_ring_to_queue_props(struct amdgpu_device *adev, > struct amdgpu_ring *ring, > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h > index 93b2ba817916..81610e3f3059 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h > @@ -341,6 +341,7 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, > uint32_t reg, > int amdgpu_mes_reg_write_reg_wait(struct amdgpu_device *adev, > uint32_t reg0, uint32_t reg1, > uint32_t ref, uint32_t mask); > +int amdgpu_mes_ring_test_ring(struct amdgpu_device *adev); > > int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id, >int queue_type, int idx, > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > index 2a6c7a680c62..c4d085429d26 100644 > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > @@ -1194,6 +1194,12 @@ static int mes_v11_0_hw_init(void *handle) >goto failure; >} > > + r = amdgpu_mes_ring_test_ring(adev); > + if (r) { > + DRM_ERROR("MES ring test failed\n"); > + goto failure; > + } > + >/* > * Disable KIQ ring usage from the driver once MES is enabled. > * MES uses KIQ ring exclusively so driver cannot access KIQ ring > -- > 2.35.1 >
Re: [PATCH v3 3/3] drm/amdgpu: Update mes_v11_api_def.h
[AMD Official Use Only - General] >> + uint64_t oversubscription_no_aggregated_en; uint64_t is unnecessary here; a 32-bit or bool type would be better. With that fixed, the patch is Reviewed-by: Jack Xiao Regards, Jack From: amd-gfx on behalf of Graham Sider Sent: Monday, 13 June 2022 23:20 To: amd-gfx@lists.freedesktop.org Cc: Joshi, Mukul ; Kuehling, Felix ; Sider, Graham ; Yang, Philip Subject: [PATCH v3 3/3] drm/amdgpu: Update mes_v11_api_def.h Update MES API to support oversubscription without aggregated doorbell for usermode queues. Signed-off-by: Graham Sider --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 1 + drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 3 +++ drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 ++ drivers/gpu/drm/amd/include/mes_v11_api_def.h | 4 +++- 5 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c index 2e86baa32c55..3d9a81a8fa1c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c @@ -681,6 +681,7 @@ int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id, queue_input.wptr_addr = qprops->wptr_gpu_addr; queue_input.queue_type = qprops->queue_type; queue_input.paging = qprops->paging; + queue_input.oversubscription_no_aggregated_en = 0; r = adev->mes.funcs->add_hw_queue(&adev->mes, &queue_input); if (r) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h index aa06c8396ee0..26765a9946a9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h @@ -207,6 +207,7 @@ struct mes_add_queue_input { uint32_t debug_vmid; uint64_t tba_addr; uint64_t tma_addr; + uint64_t oversubscription_no_aggregated_en; }; struct mes_remove_queue_input { diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index 2a9ef308e71c..95a1394d3943 100644 ---
a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -163,6 +163,8 @@ static int mes_v11_0_add_hw_queue(struct amdgpu_mes *mes, mes_add_queue_pkt.gws_size = input->gws_size; mes_add_queue_pkt.trap_handler_addr = input->tba_addr; mes_add_queue_pkt.tma_addr = input->tma_addr; + mes_add_queue_pkt.oversubscription_no_aggregated_en = + input->oversubscription_no_aggregated_en; mes_add_queue_pkt.api_status.api_completion_fence_addr = mes->ring.fence_drv.gpu_addr; @@ -341,6 +343,7 @@ static int mes_v11_0_set_hw_resources(struct amdgpu_mes *mes) mes_set_hw_res_pkt.disable_reset = 1; mes_set_hw_res_pkt.disable_mes_log = 1; mes_set_hw_res_pkt.use_different_vmid_compute = 1; + mes_set_hw_res_pkt.oversubscription_timer = 50; mes_set_hw_res_pkt.api_status.api_completion_fence_addr = mes->ring.fence_drv.gpu_addr; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index d8de2fbdfc7d..762bc6059387 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -235,6 +235,8 @@ static int add_queue_mes(struct device_queue_manager *dqm, struct queue *q, } else queue_input.wptr_addr = (uint64_t)q->properties.write_ptr; + queue_input.oversubscription_no_aggregated_en = 1; + queue_input.paging = false; queue_input.tba_addr = qpd->tba_addr; queue_input.tma_addr = qpd->tma_addr; diff --git a/drivers/gpu/drm/amd/include/mes_v11_api_def.h b/drivers/gpu/drm/amd/include/mes_v11_api_def.h index f9d02d7bdf77..95f0246eb045 100644 --- a/drivers/gpu/drm/amd/include/mes_v11_api_def.h +++ b/drivers/gpu/drm/amd/include/mes_v11_api_def.h @@ -226,6 +226,7 @@ union MESAPI_SET_HW_RESOURCES { }; uint32_tuint32_t_all; }; + uint32_toversubscription_timer; }; uint32_tmax_dwords_in_api[API_FRAME_SIZE_IN_DWORDS]; @@ -265,7 +266,8 @@ union MESAPI__ADD_QUEUE { uint32_t is_gang_suspended : 1; uint32_t is_tmz_queue : 1; uint32_t 
map_kiq_utility_queue : 1; - uint32_t reserved : 23; + uint32_t oversubscription_no_aggregated_en : 1; + uint32_t reserved : 22; }; struct MES_API_STATUS api_status; uint64_ttma_addr; -- 2.25.
Re: [PATCH v3 1/3] drm/amdgpu: Fetch MES scheduler/KIQ versions
[AMD Official Use Only - General] Reviewed-by: Jack Xiao Regards, Jack From: amd-gfx on behalf of Graham Sider Sent: Monday, 13 June 2022 23:19 To: amd-gfx@lists.freedesktop.org Cc: Joshi, Mukul ; Kuehling, Felix ; Sider, Graham ; Yang, Philip Subject: [PATCH v3 1/3] drm/amdgpu: Fetch MES scheduler/KIQ versions Store MES scheduler and MES KIQ version numbers in amdgpu_mes for GFX11. Signed-off-by: Graham Sider --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 3 +++ drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 12 2 files changed, 15 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h index 92ddee5e33db..aa06c8396ee0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h @@ -64,6 +64,9 @@ struct amdgpu_mes { spinlock_t queue_id_lock; + uint32_tsched_version; + uint32_tkiq_version; + uint32_ttotal_max_queue; uint32_tdoorbell_id_offset; uint32_tmax_doorbell_slices; diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index e4eb87689f7f..2a9ef308e71c 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -890,6 +890,18 @@ static int mes_v11_0_queue_init(struct amdgpu_device *adev, mes_v11_0_queue_init_register(ring); } + /* get MES scheduler/KIQ versions */ + mutex_lock(&adev->srbm_mutex); + soc21_grbm_select(adev, 3, pipe, 0, 0); + + if (pipe == AMDGPU_MES_SCHED_PIPE) + adev->mes.sched_version = RREG32_SOC15(GC, 0, regCP_MES_GP3_LO); + else if (pipe == AMDGPU_MES_KIQ_PIPE && adev->enable_mes_kiq) + adev->mes.kiq_version = RREG32_SOC15(GC, 0, regCP_MES_GP3_LO); + + soc21_grbm_select(adev, 0, 0, 0, 0); + mutex_unlock(&adev->srbm_mutex); + return 0; } -- 2.25.1
Re: [PATCH] drm/amdgpu: remove redundant enable_mes/enable_mes_kiq setting
[AMD Official Use Only - General] Reviewed-by: Jack Xiao Regards, Jack From: amd-gfx on behalf of Yifan Zhang Sent: Sunday, 12 June 2022 12:31 To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Yifan Subject: [PATCH] drm/amdgpu: remove redundant enable_mes/enable_mes_kiq setting enable_mes and enable_mes_kiq are set in both device init and MES IP init. Keep the ones in MES IP init, since the GC IP version is a more accurate basis for the decision. Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 1 file changed, 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 68a53528089f..2b92281dd0c1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3614,14 +3614,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, if (amdgpu_mcbp) DRM_INFO("MCBP is enabled\n"); - if (adev->asic_type >= CHIP_NAVI10) { - if (amdgpu_mes || amdgpu_mes_kiq) - adev->enable_mes = true; - - if (amdgpu_mes_kiq) - adev->enable_mes_kiq = true; - } - /* * Reset domain needs to be present early, before XGMI hive discovered * (if any) and intitialized to use reset sem and in_gpu reset flag -- 2.35.1
RE: [PATCH] drm/amd/powerplay: bug fix for memory clock request from display
Reviewed-by: Jack Xiao -Original Message- From: amd-gfx On Behalf Of Kenneth Feng Sent: Wednesday, October 16, 2019 4:58 PM To: amd-gfx@lists.freedesktop.org Cc: Feng, Kenneth Subject: [PATCH] drm/amd/powerplay: bug fix for memory clock request from display In some cases, display fixes the memory clock frequency to a high value rather than allowing natural memory clock switching. When we come back from S3 resume, the request from display is not reset; this causes the bug that makes the memory clock go to a low value. Then, due to the insufficient memory clock, the screen flickers. Signed-off-by: Kenneth Feng --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c index e2a03f4..ee374df 100644 --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c @@ -1354,6 +1354,8 @@ static int smu_resume(void *handle) if (smu->is_apu) smu_set_gfx_cgpg(&adev->smu, true); + smu->disable_uclk_switch = 0; + mutex_unlock(&smu->mutex); pr_info("SMU is resumed successfully!\n"); -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: add dummy read for some GCVM status registers
Reviewed-by: Jack Xiao -Original Message- From: Yuan, Xiaojie Sent: Thursday, August 22, 2019 11:01 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Hawking ; Xiao, Jack ; Yuan, Xiaojie Subject: [PATCH] drm/amdgpu: add dummy read for some GCVM status registers The GRBM register interface is now capable of bursting 1 cycle per register wr->wr, wr->rd much faster than previous muticycle per transaction done interface. This has caused a problem where status registers requiring HW to update have a 1 cycle delay, due to the register update having to go through GRBM. SW may operate on an incorrect value if they write a register and immediately check the corresponding status register. Registers requiring HW to clear or set fields may be delayed by 1 cycle. For example, 1. write VM_INVALIDATE_ENG0_REQ mask = 5a 2. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a a. HW will reset VM_INVALIDATE_ENG0_ACK = 0 until invalidation is complete 3. write VM_INVALIDATE_ENG0_REQ mask = 5a 4. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a a. First read of VM_INVALIDATE_ENG0_ACK = 5a instead of 0 b. Second read of VM_INVALIDATE_ENG0_ACK = 0 because the remote GRBM h/w register takes one extra cycle to be cleared c. 
In this case, SW will see a false ACK if they exit on first read.

Affected registers (only GC variant)   | Recommended Dummy Read
---------------------------------------+--------------------------------------
VM_INVALIDATE_ENG*_ACK                 | VM_INVALIDATE_ENG*_REQ
VM_L2_STATUS                           | VM_L2_STATUS
VM_L2_PROTECTION_FAULT_STATUS          | VM_L2_PROTECTION_FAULT_STATUS
VM_L2_PROTECTION_FAULT_ADDR_HI/LO32    | VM_L2_PROTECTION_FAULT_ADDR_HI/LO32
VM_L2_IH_LOG_BUSY                      | VM_L2_IH_LOG_BUSY
MC_VM_L2_PERFCOUNTER_HI/LO             | MC_VM_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER_HI/LO               | ATC_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER2_HI/LO              | ATC_L2_PERFCOUNTER2_HI/LO

Signed-off-by: Xiaojie Yuan --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 15 +++ drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 16 2 files changed, 31 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index d83d8a6a1fc0..56f76a1f32ee 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -135,6 +135,14 @@ static int gmc_v10_0_process_interrupt(struct amdgpu_device *adev, addr |= ((u64)entry->src_data[1] & 0xf) << 44; if (!amdgpu_sriov_vf(adev)) { + /* +* Issue a dummy read to wait for the status register to +* be updated to avoid reading an incorrect value due to +* the new fast GRBM interface. +*/ + if (entry->vmid_src == AMDGPU_GFXHUB_0) + RREG32(hub->vm_l2_pro_fault_status); + status = RREG32(hub->vm_l2_pro_fault_status); WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1); } @@ -228,6 +236,13 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device *adev, uint32_t vmid, WREG32_NO_KIQ(hub->vm_inv_eng0_req + eng, tmp); + /* +* Issue a dummy read to wait for the ACK register to be cleared +* to avoid a false ACK due to the new fast GRBM interface.
+*/ + if (vmhub == AMDGPU_GFXHUB_0) + RREG32_NO_KIQ(hub->vm_inv_eng0_req + eng); + /* Wait for ACK with a delay.*/ for (i = 0; i < adev->usec_timeout; i++) { tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_ack + eng); diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 6de17267fc35..17700606f54b 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -352,6 +352,14 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev, /* If it's the first fault for this address, process it normally */ if (!amdgpu_sriov_vf(adev)) { + /* +* Issue a dummy read to wait for the status register to +* be updated to avoid reading an incorrect value due to +* the new fast GRBM interface. +*/ + if (entry->vmid_src == AMDGPU_GFXHUB_0) + RREG32(hub->vm_l2_pro_fault_status); + status = RREG32(hub->vm_l2_pro_fault_status); WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1); } @@ -480,6 +488,14 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, spin_lock(&adev->gmc.invalidate_lock); WREG32_NO_KIQ(hub->vm_inv_eng0_req + eng, tmp); + + /* +* Issue a dummy read to wait for the ACK register to be c
[PATCH] drm/amd/powerplay: increase waiting time for smu response
We observed that some SMU commands take more time to execute, so increase the waiting time for the response. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c index bc39690..88d3127 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c @@ -64,9 +64,9 @@ static int smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg) static int smu_v11_0_wait_for_response(struct smu_context *smu) { struct amdgpu_device *adev = smu->adev; - uint32_t cur_value, i; + uint32_t cur_value, i, timeout = adev->usec_timeout * 10; - for (i = 0; i < adev->usec_timeout; i++) { + for (i = 0; i < timeout; i++) { cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90); if ((cur_value & MP1_C2PMSG_90__CONTENT_MASK) != 0) break; @@ -74,7 +74,7 @@ static int smu_v11_0_wait_for_response(struct smu_context *smu) } /* timeout means wrong logic */ - if (i == adev->usec_timeout) + if (i == timeout) return -ETIME; return RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90) == 0x1 ? 0 : -EIO; -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdkfd: remove an unused variable
Just for cleanup. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 24db82b..3322a44 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -487,7 +487,6 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev, const struct kfd2kgd_calls *f2g) { struct kfd_dev *kfd; - int ret; const struct kfd_device_info *device_info = lookup_device_info(pdev->device); -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 092/459] drm/amdgpu: add ib preemption status in amdgpu_job
I sent out a separate patch to add job->preamble_status back. Regards, Jack -Original Message- From: Koenig, Christian Sent: Tuesday, June 18, 2019 5:03 PM To: Xiao, Jack ; Alex Deucher ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Hawking Subject: Re: [PATCH 092/459] drm/amdgpu: add ib preemption status in amdgpu_job Yeah thought so, cause that would break immediately and won't work at all. It must be something like a merge or rebase artifact. What's the best way to fix that up? In other words how should it look like? Christian. Am 18.06.19 um 10:32 schrieb Xiao, Jack: > In original patch, there is no deletion on job->preamble_status. > > Regards, > Jack > > -Original Message- > From: Christian König > Sent: Tuesday, June 18, 2019 4:04 PM > To: Alex Deucher ; > amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Xiao, Jack > ; Zhang, Hawking > Subject: Re: [PATCH 092/459] drm/amdgpu: add ib preemption status in > amdgpu_job > > Am 17.06.19 um 21:10 schrieb Alex Deucher: >> From: Jack Xiao >> >> Add ib preemption status in amdgpu_job, so that ring level function >> can detect preemption and program for resuming it. 
>> >> Acked-by: Alex Deucher >> Reviewed-by: Hawking Zhang >> Signed-off-by: Jack Xiao >> Signed-off-by: Alex Deucher >> --- >>drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +- >>drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ >>2 files changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c >> index e15531bfcd47..90cc0841b564 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c >> @@ -208,7 +208,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, >> unsigned num_ibs, >> >> skip_preamble = ring->current_ctx == fence_ctx; >> if (job && ring->funcs->emit_cntxcntl) { >> -status |= job->preamble_status; >> +status |= job->preemption_status; > That actually looks completely incorrect to me since it breaks the old status > handling. > > Christian. > >> amdgpu_ring_emit_cntxcntl(ring, status); >> } >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >> index e1b46a6703de..51e62504c279 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >> @@ -29,6 +29,8 @@ >>#define AMDGPU_PREAMBLE_IB_PRESENT_FIRST(1 << 1) >>/* bit set means context switch occured */ >>#define AMDGPU_HAVE_CTX_SWITCH (1 << 2) >> +/* bit set means IB is preempted */ >> +#define AMDGPU_IB_PREEMPTED (1 << 3) >> >>#define to_amdgpu_job(sched_job) \ >> container_of((sched_job), struct amdgpu_job, base) @@ -45,6 >> +47,7 @@ struct amdgpu_job { >> struct amdgpu_ib*ibs; >> struct dma_fence*fence; /* the hw fence */ >> uint32_tpreamble_status; >> +uint32_tpreemption_status; >> uint32_tnum_ibs; >> void*owner; >> boolvm_needs_flush; ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu: add job->preamble_status back to status value
job->preamble_status was accidentally dropped from the combined status value that is passed to some ring-level functions. This patch adds it back. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c index 90cc084..6882eeb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c @@ -208,6 +208,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs, skip_preamble = ring->current_ctx == fence_ctx; if (job && ring->funcs->emit_cntxcntl) { + status |= job->preamble_status; status |= job->preemption_status; amdgpu_ring_emit_cntxcntl(ring, status); } -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 092/459] drm/amdgpu: add ib preemption status in amdgpu_job
In original patch, there is no deletion on job->preamble_status. Regards, Jack -Original Message- From: Christian König Sent: Tuesday, June 18, 2019 4:04 PM To: Alex Deucher ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Xiao, Jack ; Zhang, Hawking Subject: Re: [PATCH 092/459] drm/amdgpu: add ib preemption status in amdgpu_job Am 17.06.19 um 21:10 schrieb Alex Deucher: > From: Jack Xiao > > Add ib preemption status in amdgpu_job, so that ring level function > can detect preemption and program for resuming it. > > Acked-by: Alex Deucher > Reviewed-by: Hawking Zhang > Signed-off-by: Jack Xiao > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ > 2 files changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c > index e15531bfcd47..90cc0841b564 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c > @@ -208,7 +208,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, > unsigned num_ibs, > > skip_preamble = ring->current_ctx == fence_ctx; > if (job && ring->funcs->emit_cntxcntl) { > - status |= job->preamble_status; > + status |= job->preemption_status; That actually looks completely incorrect to me since it breaks the old status handling. Christian. 
> amdgpu_ring_emit_cntxcntl(ring, status); > } > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > index e1b46a6703de..51e62504c279 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > @@ -29,6 +29,8 @@ > #define AMDGPU_PREAMBLE_IB_PRESENT_FIRST(1 << 1) > /* bit set means context switch occured */ > #define AMDGPU_HAVE_CTX_SWITCH (1 << 2) > +/* bit set means IB is preempted */ > +#define AMDGPU_IB_PREEMPTED (1 << 3) > > #define to_amdgpu_job(sched_job)\ > container_of((sched_job), struct amdgpu_job, base) @@ -45,6 > +47,7 > @@ struct amdgpu_job { > struct amdgpu_ib*ibs; > struct dma_fence*fence; /* the hw fence */ > uint32_tpreamble_status; > + uint32_tpreemption_status; > uint32_tnum_ibs; > void*owner; > boolvm_needs_flush; ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amd/powerplay: add smu message mutex
Dropping it because a modification was missing. A new patch was sent out. -Original Message- From: Xiao, Jack Sent: Monday, June 03, 2019 2:12 PM To: amd-gfx@lists.freedesktop.org; Deucher, Alexander ; Zhang, Hawking Cc: Xiao, Jack Subject: [PATCH] drm/amd/powerplay: add smu message mutex Add smu message mutex preventing against race condition issue. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 1 + drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 7 ++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h index 3eb1de9..735233e 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h @@ -374,6 +374,7 @@ struct smu_context const struct smu_funcs *funcs; const struct pptable_funcs *ppt_funcs; struct mutexmutex; + struct mutexmsg_mutex; uint64_t pool_size; struct smu_table_contextsmu_table; diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c index d2eeb62..de737a0 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c @@ -104,6 +104,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) if (index < 0) return index; + mutex_lock(&smu->msg_mutex); + smu_v11_0_wait_for_response(smu); WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90, 0); @@ -111,11 +113,11 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) smu_v11_0_send_msg_without_waiting(smu, (uint16_t)index); ret = smu_v11_0_wait_for_response(smu); - if (ret) pr_err("Failed to send message 0x%x, response 0x%x\n", index, ret); + mutex_unlock(&smu->msg_mutex); return ret; } @@ -132,6 +134,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) if (index < 0) return index; + mutex_lock(&smu->msg_mutex); + ret = smu_v11_0_wait_for_response(smu); if (ret) pr_err("Failed to send message 0x%x, response
0x%x, param 0x%x\n", @@ -148,6 +152,7 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) pr_err("Failed to send message 0x%x, response 0x%x param 0x%x\n", index, ret, param); + mutex_unlock(&smu->msg_mutex); return ret; } -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amd/powerplay: add smu message mutex
Add smu message mutex preventing against race condition issue. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 1 + drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 1 + drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 7 ++- 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c index 3026c7e..db2bbec 100644 --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c @@ -350,6 +350,7 @@ static int smu_early_init(void *handle) smu->adev = adev; smu->pm_enabled = !!amdgpu_dpm; mutex_init(&smu->mutex); + mutex_init(&smu->msg_mutex); return smu_set_funcs(adev); } diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h index 3eb1de9..735233e 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h @@ -374,6 +374,7 @@ struct smu_context const struct smu_funcs *funcs; const struct pptable_funcs *ppt_funcs; struct mutexmutex; + struct mutexmsg_mutex; uint64_t pool_size; struct smu_table_contextsmu_table; diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c index d2eeb62..de737a0 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c @@ -104,6 +104,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) if (index < 0) return index; + mutex_lock(&smu->msg_mutex); + smu_v11_0_wait_for_response(smu); WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90, 0); @@ -111,11 +113,11 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) smu_v11_0_send_msg_without_waiting(smu, (uint16_t)index); ret = smu_v11_0_wait_for_response(smu); - if (ret) pr_err("Failed to send message 0x%x, response 0x%x\n", index, ret); + mutex_unlock(&smu->msg_mutex); return ret; } @@ -132,6 +134,8 @@ static int smu_v11_0_send_msg(struct 
smu_context *smu, uint16_t msg) if (index < 0) return index; + mutex_lock(&smu->msg_mutex); + ret = smu_v11_0_wait_for_response(smu); if (ret) pr_err("Failed to send message 0x%x, response 0x%x, param 0x%x\n", @@ -148,6 +152,7 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) pr_err("Failed to send message 0x%x, response 0x%x param 0x%x\n", index, ret, param); + mutex_unlock(&smu->msg_mutex); return ret; } -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amd/powerplay: add smu message mutex
Add smu message mutex preventing against race condition issue. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 1 + drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 7 ++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h index 3eb1de9..735233e 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h @@ -374,6 +374,7 @@ struct smu_context const struct smu_funcs *funcs; const struct pptable_funcs *ppt_funcs; struct mutexmutex; + struct mutexmsg_mutex; uint64_t pool_size; struct smu_table_contextsmu_table; diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c index d2eeb62..de737a0 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c @@ -104,6 +104,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) if (index < 0) return index; + mutex_lock(&smu->msg_mutex); + smu_v11_0_wait_for_response(smu); WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90, 0); @@ -111,11 +113,11 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) smu_v11_0_send_msg_without_waiting(smu, (uint16_t)index); ret = smu_v11_0_wait_for_response(smu); - if (ret) pr_err("Failed to send message 0x%x, response 0x%x\n", index, ret); + mutex_unlock(&smu->msg_mutex); return ret; } @@ -132,6 +134,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) if (index < 0) return index; + mutex_lock(&smu->msg_mutex); + ret = smu_v11_0_wait_for_response(smu); if (ret) pr_err("Failed to send message 0x%x, response 0x%x, param 0x%x\n", @@ -148,6 +152,7 @@ static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg) pr_err("Failed to send message 0x%x, response 0x%x param 0x%x\n", index, ret, param); + mutex_unlock(&smu->msg_mutex); return ret; } -- 1.9.1 ___ amd-gfx mailing list 
amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 1/3] drm/amdgpu: add field indicating if has PCIE atomics support
Add a new field to the amdgpu device to record whether the system has PCIE atomics support. The field can be exposed to UMD or KFD to indicate whether PCIE atomics are supported. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index d355e9a..b0a01e4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -767,6 +767,7 @@ struct amdgpu_device { struct mutexgrbm_idx_mutex; struct dev_pm_domainvga_pm_domain; boolhave_disp_power_ref; + boolhave_atomics_support; /* BIOS */ boolis_atom_fw; -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 2/3] drm/amdgpu: enable PCIE atomics ops support
GPU atomic operations depend on PCIE atomics support. Always enable PCIE atomic ops in case they haven't been enabled yet. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index bdd1fe73..55772eb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2562,6 +2562,17 @@ int amdgpu_device_init(struct amdgpu_device *adev, if (adev->rio_mem == NULL) DRM_INFO("PCI I/O BAR is not found.\n"); + /* enable PCIE atomic ops */ + r = pci_enable_atomic_ops_to_root(adev->pdev, + PCI_EXP_DEVCAP2_ATOMIC_COMP32 | + PCI_EXP_DEVCAP2_ATOMIC_COMP64); + if (r) { + adev->have_atomics_support = false; + DRM_INFO("PCIE atomic ops is not supported\n"); + } else { + adev->have_atomics_support = true; + } + amdgpu_device_get_pcie_info(adev); /* early init functions */ -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 0/3] always enable PCIE atomics
GPU atomic operations depend on PCIE atomics support. Always enable PCIE atomic ops in case they haven't been enabled yet. Add a field to the amdgpu device, which can be exposed to UMD or KFD, indicating whether the system has PCIE atomics support. Jack Xiao (3): drm/amdgpu: add field indicating if has PCIE atomics support drm/amdgpu: enable PCIE atomics ops support drm/amdkfd: remove duplicated PCIE atomics request drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++ drivers/gpu/drm/amd/amdkfd/kfd_device.c| 10 -- 5 files changed, 24 insertions(+), 6 deletions(-) -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 3/3] drm/amdkfd: remove duplicated PCIE atomics request
Since amdgpu now always requests PCIE atomics, KFD doesn't need to duplicate the PCIE atomics enablement; referring to the amdgpu request result is enough. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 + drivers/gpu/drm/amd/amdkfd/kfd_device.c| 10 -- 3 files changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 98326e3b..ddd6c52 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -620,6 +620,13 @@ bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid) return false; } +bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)kgd; + + return adev->have_atomics_support; +} + #ifndef CONFIG_HSA_AMD bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index f57f297..8d135c82 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -135,6 +135,7 @@ int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine, uint32_t vmid, uint64_t gpu_addr, uint32_t *ib_cmd, uint32_t ib_len); void amdgpu_amdkfd_set_compute_idle(struct kgd_dev *kgd, bool idle); +bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd); struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void); struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 7b4ea24..76a1599 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -481,17 +481,15 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, * 32 and 64-bit requests are possible and must be * supported. 
*/ - ret = pci_enable_atomic_ops_to_root(pdev, - PCI_EXP_DEVCAP2_ATOMIC_COMP32 | - PCI_EXP_DEVCAP2_ATOMIC_COMP64); - if (device_info->needs_pci_atomics && ret < 0) { + kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd); + if (device_info->needs_pci_atomics && + !kfd->pci_atomic_requested) { dev_info(kfd_device, "skipped device %x:%x, PCI rejects atomics\n", pdev->vendor, pdev->device); kfree(kfd); return NULL; - } else if (!ret) - kfd->pci_atomic_requested = true; + } kfd->kgd = kgd; kfd->device_info = device_info; -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu: enable PCIE atomics ops support
GPU atomic operations depend on PCIE atomics support. Always enable PCIE atomic ops in case they haven't been enabled yet. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index bdd1fe73..a2c6064 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2562,6 +2562,13 @@ int amdgpu_device_init(struct amdgpu_device *adev, if (adev->rio_mem == NULL) DRM_INFO("PCI I/O BAR is not found.\n"); + /* enable PCIE atomic ops */ + r = pci_enable_atomic_ops_to_root(adev->pdev, + PCI_EXP_DEVCAP2_ATOMIC_COMP32 | + PCI_EXP_DEVCAP2_ATOMIC_COMP64); + if (r) + DRM_INFO("PCIE atomic ops is not supported\n"); + amdgpu_device_get_pcie_info(adev); /* early init functions */ -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH v2] drm/amdgpu: add flags to emit_ib interface v2
Replace the last bool parameter with a general flags parameter, so that the last parameter can carry more information. v2: drop setting need_ctx_switch = false Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 10 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h | 2 +- drivers/gpu/drm/amd/amdgpu/cik_sdma.c| 2 +- drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c| 6 +++--- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 2 +- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/si_dma.c | 2 +- drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c| 2 +- drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c| 2 +- drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/vce_v3_0.c| 2 +- drivers/gpu/drm/amd/amdgpu/vce_v4_0.c| 2 +- drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c| 6 +++--- 20 files changed, 34 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c index c48207b3..0b8ef2d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c @@ -202,12 +202,12 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs, amdgpu_asic_flush_hdp(adev, ring); } + if (need_ctx_switch) + status |= AMDGPU_HAVE_CTX_SWITCH; + skip_preamble = ring->current_ctx == fence_ctx; if (job && ring->funcs->emit_cntxcntl) { - if (need_ctx_switch) - status |= AMDGPU_HAVE_CTX_SWITCH; status |= job->preamble_status; - amdgpu_ring_emit_cntxcntl(ring, status); } @@ -221,8 +221,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs, !amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */ continue; - 
amdgpu_ring_emit_ib(ring, job, ib, need_ctx_switch); - need_ctx_switch = false; + amdgpu_ring_emit_ib(ring, job, ib, status); + status &= ~AMDGPU_HAVE_CTX_SWITCH; } if (ring->funcs->emit_tmz) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index d87e828..d7fae26 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -131,7 +131,7 @@ struct amdgpu_ring_funcs { void (*emit_ib)(struct amdgpu_ring *ring, struct amdgpu_job *job, struct amdgpu_ib *ib, - bool ctx_switch); + uint32_t flags); void (*emit_fence)(struct amdgpu_ring *ring, uint64_t addr, uint64_t seq, unsigned flags); void (*emit_pipeline_sync)(struct amdgpu_ring *ring); @@ -229,7 +229,7 @@ struct amdgpu_ring { #define amdgpu_ring_get_rptr(r) (r)->funcs->get_rptr((r)) #define amdgpu_ring_get_wptr(r) (r)->funcs->get_wptr((r)) #define amdgpu_ring_set_wptr(r) (r)->funcs->set_wptr((r)) -#define amdgpu_ring_emit_ib(r, job, ib, c) ((r)->funcs->emit_ib((r), (job), (ib), (c))) +#define amdgpu_ring_emit_ib(r, job, ib, flags) ((r)->funcs->emit_ib((r), (job), (ib), (flags))) #define amdgpu_ring_emit_pipeline_sync(r) (r)->funcs->emit_pipeline_sync((r)) #define amdgpu_ring_emit_vm_flush(r, vmid, addr) (r)->funcs->emit_vm_flush((r), (vmid), (addr)) #define amdgpu_ring_emit_fence(r, addr, seq, flags) (r)->funcs->emit_fence((r), (addr), (seq), (flags)) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c index 98a1b2c..c021b11 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c @@ -1035,7 +1035,7 @@ int amdgpu_vce_ring_parse_cs_vm(struct amdgpu_cs_parser *p, uint32_t ib_idx) void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring, struct amdgpu_job *job, struct amdgpu_ib *ib, - bool ctx_switch) + uint32_t flags) { amdgpu_ring_write(ring, VCE_CMD_IB); amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr)); diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h index 5029365..30ea54d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h @@ -66,7 +66,7 @@ int amdgpu_vce_get_destroy_msg(struct amdgpu_ring *ring, uint32_t handle, int amdgpu_vce_ring_parse_cs(struct amdgpu_cs_parse
[PATCH] drm/amdgpu: add flags to emit_ib interface
Replace the last bool type parameter with a general flags parameter, to make the last parameter be able to contain more information. Signed-off-by: Jack Xiao --- drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 7 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h | 2 +- drivers/gpu/drm/amd/amdgpu/cik_sdma.c| 2 +- drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c| 6 +++--- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 2 +- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/si_dma.c | 2 +- drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c| 2 +- drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c| 2 +- drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/vce_v3_0.c| 2 +- drivers/gpu/drm/amd/amdgpu/vce_v4_0.c| 2 +- drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c| 6 +++--- 20 files changed, 35 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c index c48207b3..1b7edd7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c @@ -221,7 +221,12 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs, !amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */ continue; - amdgpu_ring_emit_ib(ring, job, ib, need_ctx_switch); + if (need_ctx_switch) + status |= AMDGPU_HAVE_CTX_SWITCH; + else + status &= ~AMDGPU_HAVE_CTX_SWITCH; + + amdgpu_ring_emit_ib(ring, job, ib, status); need_ctx_switch = false; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index d87e828..d7fae26 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -131,7 
+131,7 @@ struct amdgpu_ring_funcs { void (*emit_ib)(struct amdgpu_ring *ring, struct amdgpu_job *job, struct amdgpu_ib *ib, - bool ctx_switch); + uint32_t flags); void (*emit_fence)(struct amdgpu_ring *ring, uint64_t addr, uint64_t seq, unsigned flags); void (*emit_pipeline_sync)(struct amdgpu_ring *ring); @@ -229,7 +229,7 @@ struct amdgpu_ring { #define amdgpu_ring_get_rptr(r) (r)->funcs->get_rptr((r)) #define amdgpu_ring_get_wptr(r) (r)->funcs->get_wptr((r)) #define amdgpu_ring_set_wptr(r) (r)->funcs->set_wptr((r)) -#define amdgpu_ring_emit_ib(r, job, ib, c) ((r)->funcs->emit_ib((r), (job), (ib), (c))) +#define amdgpu_ring_emit_ib(r, job, ib, flags) ((r)->funcs->emit_ib((r), (job), (ib), (flags))) #define amdgpu_ring_emit_pipeline_sync(r) (r)->funcs->emit_pipeline_sync((r)) #define amdgpu_ring_emit_vm_flush(r, vmid, addr) (r)->funcs->emit_vm_flush((r), (vmid), (addr)) #define amdgpu_ring_emit_fence(r, addr, seq, flags) (r)->funcs->emit_fence((r), (addr), (seq), (flags)) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c index 98a1b2c..c021b11 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c @@ -1035,7 +1035,7 @@ int amdgpu_vce_ring_parse_cs_vm(struct amdgpu_cs_parser *p, uint32_t ib_idx) void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring, struct amdgpu_job *job, struct amdgpu_ib *ib, - bool ctx_switch) + uint32_t flags) { amdgpu_ring_write(ring, VCE_CMD_IB); amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr)); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h index 5029365..30ea54d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h @@ -66,7 +66,7 @@ int amdgpu_vce_get_destroy_msg(struct amdgpu_ring *ring, uint32_t handle, int amdgpu_vce_ring_parse_cs(struct amdgpu_cs_parser *p, uint32_t ib_idx); int amdgpu_vce_ring_parse_cs_vm(struct amdgpu_cs_parser *p, uint32_t ib_idx); 
void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring, struct amdgpu_job *job, - struct amdgpu_ib *ib, bool ctx_switch); + struct amdgpu_ib *ib, uint32_t flags); void amdgpu_vce_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 seq, unsigned flags); int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring); diff --git a/drivers/gp
RE: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9
Can you explain what benefits are gained from enabling the AGP aperture? Otherwise, it just increases our maintenance workload. Regards, Jack -Original Message- From: Christian König Sent: Wednesday, August 29, 2018 4:47 PM To: Kuehling, Felix ; Koenig, Christian ; Xiao, Jack ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9 Completely agree with Felix here. It makes system memory access slightly simpler, but I would say that you accidentally corrupt the GART table and that you accidentally corrupt the AGP aperture is equally likely. Regards, Christian. Am 28.08.2018 um 20:12 schrieb Felix Kuehling: > The GPU always has access to the entire guest physical address space. > You can just program arbitrary addresses into page table entries and > set the system bit. That's how GART and GPUVM mappings work. They will > not go through the AGP aperture. And there is no mechanism (other than > an > IOMMU) to protect system memory from GPU access. > > Regards, > Felix > > > On 2018-08-28 07:41 AM, Christian König wrote: >> Well that is indeed true, but we still have IOMMU between the GPU and >> the system memory. >> >> So we should still catch issues when something goes terrible wrong. >> >> Additional to that only the system domain, e.g. kernel copies, page >> table updates etc are allowed to use it. >> >> Regards, >> Christian. >> >> Am 28.08.2018 um 09:06 schrieb Xiao, Jack: >>> I mean it has risk to make GPU allowed to access to most system >>> memory without explicit claiming, it's easier to mask problem. 
>>> >>> Regards, >>> Jack >>> >>> -Original Message- >>> From: Koenig, Christian >>> Sent: Tuesday, August 28, 2018 2:46 PM >>> To: Xiao, Jack ; Kuehling, Felix >>> ; Christian König >>> ; amd-gfx@lists.freedesktop.org >>> Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address >>> space on GMC9 >>> >>>> This series patches seems to make AGP aperture allowed to access >>>> any system memory (16GB) bypass GPU VM protection. >>> The system aperture should only be active in the system domain, or >>> otherwise applications would have access to local memory as well. >>> >>> There are some bits in the VM registers to enable/disable that, but >>> when we would have that setting incorrect we would see quite a bunch >>> of other problems. >>> >>> Might still be a good idea to double check if all the bits are setup >>> correctly. >>> >>> Regards, >>> Christian. >>> >>> Am 28.08.2018 um 07:31 schrieb Xiao, Jack: >>>> This series patches seems to make AGP aperture allowed to access >>>> any system memory (16GB) bypass GPU VM protection. >>>> If someone made a wrong logic requesting an illegal address which >>>> occasionally was located inside AGP aperture, but without any VM >>>> protection, the illegal address would be finally translated into a >>>> system memory address; if GPU read/wrote such system memory >>>> address, the system memory address might belong to kernel or any >>>> user application, the r/w operation would result in any unpredictable >>>> issue. >>>> The most important is that such kind of issue is so hard to be >>>> addressed. >>>> Is it worth doing this, but exposing risk? 
>>>> >>>> Regards, >>>> Jack >>>> >>>> -Original Message- >>>> From: amd-gfx On Behalf Of >>>> Felix Kuehling >>>> Sent: Tuesday, August 28, 2018 3:03 AM >>>> To: Christian König ; >>>> amd-gfx@lists.freedesktop.org; Koenig, Christian >>>> >>>> Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address >>>> space on GMC9 >>>> >>>> The point of this series seems to be to allow access to small >>>> system memory BOs (one page) without a GART mapping. I'm guessing >>>> that reduces pressure on the GART and removes the need for HDP and >>>> TLB flushes. Why does Patch 10 only enable that on GFXv9? Is there >>>> less benefit on older chips? >>>> >>>> Is this related to your recent changes to allow page tables in >>>> system memory? >>>> >>>> See my replies to patch 6 and 8. Other than that, the series is >>>> Acked-by: Felix Kuehling >>&
RE: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9
> Well that is indeed true, but we still have IOMMU between the GPU and the > system memory. Some platforms have no IOMMU, or the IOMMU isn't enabled by default. If such an issue were reported by a customer, it would be very hard to narrow down. > Additional to that only the system domain, e.g. kernel copies, page table > updates etc are allowed to use it. What was the motivation for this patch series? I guess it is performance improvement, right? But AGP aperture memory is translated as mtype=UC, the same as GART (mtype=UC). The extra overhead of GART is the GPUVM page table walk, but I guess that overhead is small. Regards, Jack -Original Message- From: Christian König Sent: Tuesday, August 28, 2018 7:42 PM To: Xiao, Jack ; Koenig, Christian ; Kuehling, Felix ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9 Well that is indeed true, but we still have IOMMU between the GPU and the system memory. So we should still catch issues when something goes terrible wrong. Additional to that only the system domain, e.g. kernel copies, page table updates etc are allowed to use it. Regards, Christian. Am 28.08.2018 um 09:06 schrieb Xiao, Jack: > I mean it has risk to make GPU allowed to access to most system memory > without explicit claiming, it's easier to mask problem. > > Regards, > Jack > > -Original Message- > From: Koenig, Christian > Sent: Tuesday, August 28, 2018 2:46 PM > To: Xiao, Jack ; Kuehling, Felix > ; Christian König > ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address > space on GMC9 > >> This series patches seems to make AGP aperture allowed to access any system >> memory (16GB) bypass GPU VM protection. > The system aperture should only be active in the system domain, or otherwise > applications would have access to local memory as well. 
> > There are some bits in the VM registers to enable/disable that, but when we > would have that setting incorrect we would see quite a bunch of other > problems. > > Might still be a good idea to double check if all the bits are setup > correctly. > > Regards, > Christian. > > Am 28.08.2018 um 07:31 schrieb Xiao, Jack: >> This series patches seems to make AGP aperture allowed to access any system >> memory (16GB) bypass GPU VM protection. >> If someone made a wrong logic requesting an illegal address which >> occasionally was located inside AGP aperture, but without any VM >> protection, the illegal address would be finally translated into a >> system memory address; if GPU read/wrote such system memory address, the >> system memory address might belong to kernel or any user application, the >> r/w operation would result in any unpredictable issue. >> The most important is that such kind of issue is so hard to be addressed. >> Is it worth doing this, but exposing risk? >> >> Regards, >> Jack >> >> -Original Message- >> From: amd-gfx On Behalf Of >> Felix Kuehling >> Sent: Tuesday, August 28, 2018 3:03 AM >> To: Christian König ; >> amd-gfx@lists.freedesktop.org; Koenig, Christian >> >> Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address >> space on GMC9 >> >> The point of this series seems to be to allow access to small system memory >> BOs (one page) without a GART mapping. I'm guessing that reduces pressure on >> the GART and removes the need for HDP and TLB flushes. Why does Patch 10 >> only enable that on GFXv9? Is there less benefit on older chips? >> >> Is this related to your recent changes to allow page tables in system memory? >> >> See my replies to patch 6 and 8. Other than that, the series is >> Acked-by: Felix Kuehling >> >> Regards, >> Felix >> >> >> On 2018-08-27 12:53 PM, Christian König wrote: >>> Only use the lower address space on GMC9 for the system domain. >>> Otherwise we would need to sign extend GMC addresses. 
>>> >>> Signed-off-by: Christian König >>> --- >>>drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++ >>>1 file changed, 3 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >>> index e44b5191735d..d982956c8329 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >>> @@ -938,11 +938,10 @@ static int gmc_v9_0_sw_init(void *handle) >>> if (r) >>> return r; >>> >>> - /* Set the i
RE: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9
I mean it is risky to allow the GPU to access most system memory without explicitly claiming it; it makes problems easier to mask. Regards, Jack -Original Message- From: Koenig, Christian Sent: Tuesday, August 28, 2018 2:46 PM To: Xiao, Jack ; Kuehling, Felix ; Christian König ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9 > This series patches seems to make AGP aperture allowed to access any system > memory (16GB) bypass GPU VM protection. The system aperture should only be active in the system domain, or otherwise applications would have access to local memory as well. There are some bits in the VM registers to enable/disable that, but when we would have that setting incorrect we would see quite a bunch of other problems. Might still be a good idea to double check if all the bits are setup correctly. Regards, Christian. Am 28.08.2018 um 07:31 schrieb Xiao, Jack: > This series patches seems to make AGP aperture allowed to access any system > memory (16GB) bypass GPU VM protection. > If someone made a wrong logic requesting an illegal address which > occasionally was located inside AGP aperture, but without any VM > protection, the illegal address would be finally translated into a > system memory address; if GPU read/wrote such system memory address, the > system memory address might belong to kernel or any user application, the r/w > operation would result in any unpredictable issue. > The most important is that such kind of issue is so hard to be addressed. > Is it worth doing this, but exposing risk? 
> > Regards, > Jack > > -Original Message- > From: amd-gfx On Behalf Of > Felix Kuehling > Sent: Tuesday, August 28, 2018 3:03 AM > To: Christian König ; > amd-gfx@lists.freedesktop.org; Koenig, Christian > > Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address > space on GMC9 > > The point of this series seems to be to allow access to small system memory > BOs (one page) without a GART mapping. I'm guessing that reduces pressure on > the GART and removes the need for HDP and TLB flushes. Why does Patch 10 only > enable that on GFXv9? Is there less benefit on older chips? > > Is this related to your recent changes to allow page tables in system memory? > > See my replies to patch 6 and 8. Other than that, the series is > Acked-by: Felix Kuehling > > Regards, > Felix > > > On 2018-08-27 12:53 PM, Christian König wrote: >> Only use the lower address space on GMC9 for the system domain. >> Otherwise we would need to sign extend GMC addresses. >> >> Signed-off-by: Christian König >> --- >> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++ >> 1 file changed, 3 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >> index e44b5191735d..d982956c8329 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c >> @@ -938,11 +938,10 @@ static int gmc_v9_0_sw_init(void *handle) >> if (r) >> return r; >> >> -/* Set the internal MC address mask >> - * This is the max address of the GPU's >> - * internal address space. >> +/* Use only the lower range for the internal MC address mask. This is >> + * the max address of the GPU's internal address space. >> */ >> -adev->gmc.mc_mask = 0xULL; /* 48 bit MC */ >> +adev->gmc.mc_mask = 0x7fffULL; >> >> /* set DMA mask + need_dma32 flags. >> * PCIE - can handle 44-bits. 
> ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9
This patch series seems to let the AGP aperture access any system memory (16GB), bypassing GPU VM protection. If some faulty logic requested an illegal address that happened to fall inside the AGP aperture, then without any VM protection that illegal address would ultimately be translated into a system memory address; if the GPU read or wrote that address, which might belong to the kernel or to any user application, the access could cause unpredictable issues. Worst of all, such issues are very hard to track down. Is it worth doing this and exposing that risk? Regards, Jack -Original Message- From: amd-gfx On Behalf Of Felix Kuehling Sent: Tuesday, August 28, 2018 3:03 AM To: Christian König ; amd-gfx@lists.freedesktop.org; Koenig, Christian Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9 The point of this series seems to be to allow access to small system memory BOs (one page) without a GART mapping. I'm guessing that reduces pressure on the GART and removes the need for HDP and TLB flushes. Why does Patch 10 only enable that on GFXv9? Is there less benefit on older chips? Is this related to your recent changes to allow page tables in system memory? See my replies to patch 6 and 8. Other than that, the series is Acked-by: Felix Kuehling Regards, Felix On 2018-08-27 12:53 PM, Christian König wrote: > Only use the lower address space on GMC9 for the system domain. > Otherwise we would need to sign extend GMC addresses. 
> > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++ > 1 file changed, 3 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c > b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c > index e44b5191735d..d982956c8329 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c > @@ -938,11 +938,10 @@ static int gmc_v9_0_sw_init(void *handle) > if (r) > return r; > > - /* Set the internal MC address mask > - * This is the max address of the GPU's > - * internal address space. > + /* Use only the lower range for the internal MC address mask. This is > + * the max address of the GPU's internal address space. >*/ > - adev->gmc.mc_mask = 0xULL; /* 48 bit MC */ > + adev->gmc.mc_mask = 0x7fffULL; > > /* set DMA mask + need_dma32 flags. >* PCIE - can handle 44-bits. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx