RE: [PATCH] drm/amdgpu/mes: refine for maximum packet execution

2024-08-07 Thread Xiao, Jack
[Public]

The maximum sched_hw_submission value will be set to 8. This value ensures that 
the maximum number of packets in execution does not exceed what the firmware 
allows.

Regards,
Jack

-Original Message-
From: Deucher, Alexander 
Sent: Tuesday, August 6, 2024 10:34 PM
To: Xiao, Jack ; amd-gfx@lists.freedesktop.org; 
ckoenig.leichtzumer...@gmail.com
Subject: RE: [PATCH] drm/amdgpu/mes: refine for maximum packet execution

[Public]

> -Original Message-
> From: Xiao, Jack 
> Sent: Tuesday, July 23, 2024 4:28 AM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; ckoenig.leichtzumer...@gmail.com
> Cc: Xiao, Jack 
> Subject: [PATCH] drm/amdgpu/mes: refine for maximum packet execution
>
> Only allow API_NUMBER_OF_COMMAND_MAX packets in the mes ring buffer, refine
> the code for maximum packet execution.
>
> Signed-off-by: Jack Xiao 

Any updates on this patch?

Alex

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 2 ++
>  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c   | 2 +-
>  drivers/gpu/drm/amd/amdgpu/mes_v12_0.c   | 2 +-
>  3 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index ad49cecb20b8..dfe487521a4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -212,6 +212,8 @@ int amdgpu_ring_init(struct amdgpu_device *adev,
> struct amdgpu_ring *ring,
>*/
>   if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
>   sched_hw_submission = max(sched_hw_submission, 256);
> + else if (ring->funcs->type == AMDGPU_RING_TYPE_MES)
> + sched_hw_submission = 32;
>   else if (ring == &adev->sdma.instance[0].page)
>   sched_hw_submission = 256;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> index 68c74adf79f1..951f13e01ee9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> @@ -198,7 +198,7 @@ static int
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>
>   seq = ++ring->fence_drv.sync_seq;
>   r = amdgpu_fence_wait_polling(ring,
> -   seq - ring->fence_drv.num_fences_mask,
> +   seq - (API_NUMBER_OF_COMMAND_MAX/2 - 1),
> timeout);
>   if (r < 1)
>   goto error_undo;
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> index 48e01206bcc4..dc65f28d1795 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> @@ -184,7 +184,7 @@ static int
> mes_v12_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>
>   seq = ++ring->fence_drv.sync_seq;
>   r = amdgpu_fence_wait_polling(ring,
> -   seq - ring->fence_drv.num_fences_mask,
> +   seq - (API_NUMBER_OF_COMMAND_MAX/2 - 1),
> timeout);
>   if (r < 1)
>   goto error_undo;
> --
> 2.41.0




RE: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

2024-07-23 Thread Xiao, Jack
[AMD Official Use Only - AMD Internal Distribution Only]

There is a maximum command number for the mes queue in mes_api_def.h:

enum {API_NUMBER_OF_COMMAND_MAX = 32};

This should explain why that patch can't fix the issue.

I will send out another patch to refine the code according to the firmware 
limitation.
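
For reference, a minimal sketch of how the refined patch throttles submissions
against this firmware limit (pieced together from the hunks quoted in this
thread; the wrapper function name and error handling below are illustrative,
not real driver symbols):

/* From mes_api_def.h: the firmware can only track this many API commands. */
enum { API_NUMBER_OF_COMMAND_MAX = 32 };

/* Sketch only: before emitting a new packet, wait until the fence that is
 * (API_NUMBER_OF_COMMAND_MAX/2 - 1) submissions behind has signaled, so the
 * firmware command queue never holds more entries than it can track
 * (presumably /2 because a submission can occupy an API packet plus a
 * status packet). */
static int mes_throttle_sketch(struct amdgpu_ring *ring, long timeout)
{
	u32 seq;
	signed long r;

	seq = ++ring->fence_drv.sync_seq;
	r = amdgpu_fence_wait_polling(ring,
				      seq - (API_NUMBER_OF_COMMAND_MAX / 2 - 1),
				      timeout);
	if (r < 1)
		return -ENOSPC;	/* no room: caller undoes the ring reservation */

	return 0;
}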

Regards,
Jack

From: amd-gfx  On Behalf Of Xiao, Jack
Sent: Tuesday, July 23, 2024 11:08 AM
To: Alex Deucher ; Christian König 

Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander 

Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow


[AMD Official Use Only - AMD Internal Distribution Only]



> Does this patch fix it?
https://patchwork.freedesktop.org/patch/605437/

No, please do not check in that patch; it will make my fix stop working.

Regards,
Jack

From: Alex Deucher <alexdeuc...@gmail.com>
Sent: Tuesday, 23 July 2024 03:52
To: Christian König <ckoenig.leichtzumer...@gmail.com>
Cc: Xiao, Jack <jack.x...@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

Does this patch fix it?
https://patchwork.freedesktop.org/patch/605437/

Alex

On Mon, Jul 22, 2024 at 7:21 AM Christian König
<ckoenig.leichtzumer...@gmail.com> wrote:
>
> What I meant is that the MES ring is now too small for the number of packets 
> written to it.
>
> Either reduce the num_hw_submission or increase the MES ring size.
>
> E.g. see this code here in amdgpu_ring_init:
>
> if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
> sched_hw_submission = max(sched_hw_submission, 256);
> else if (ring == &adev->sdma.instance[0].page)
> sched_hw_submission = 256;
>
> We are basically just missing a case for the MES here as far as I can see.
>
> Regards,
> Christian.
>
> On 22.07.24 at 10:46, Xiao, Jack wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> >> Can you try to reduce num_hw_submission for the MES ring?
>
> A smaller num_hw_submission should not help with this issue, because MES works 
> without the drm scheduler, like the legacy kiq. A smaller num_hw_submission will 
> only result in a smaller mes ring size and more waiting time.
>
> Regards,
> Jack
> 
> From: Christian König <ckoenig.leichtzumer...@gmail.com>
> Sent: Monday, 22 July 2024 16:20
> To: Xiao, Jack <jack.x...@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>
> Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
>
> Thx, but in that case this patch here won't help either, it just 
> mitigates the problem.
>
> Can you try to reduce num_hw_submission for the MES ring?
>
> Thanks,
> Christian.
>
> On 22.07.24 at 05:27, Xiao, Jack wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> >> I think we rather need to increase the MES ring size instead.
>
> Unfortunately, it doesn't work. I guess the mes firmware has a limitation.
>
> Regards,
> Jack
>
> 
> From: Christian König <ckoenig.leichtzumer...@gmail.com>
> Sent: Friday, 19 July 2024 23:44
> To: Xiao, Jack <jack.x...@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>
> Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
>
> On 19.07.24 at 11:16, Jack Xiao wrote:
> > wait for enough free room in the ring before writing mes packets
> > to avoid ring buffer overflow.
> >
> > Signed-off-by: Jack Xiao <jack.x...@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
> >   drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
> >   2 files changed, 28 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
> > b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > index 8ce51b9236c1..68c74adf79f1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > @@ -168,7 +168,7 @@ static int 
> > mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
> >const char *op_str, *mis

Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

2024-07-22 Thread Xiao, Jack
[AMD Official Use Only - AMD Internal Distribution Only]

> Does this patch fix it?
https://patchwork.freedesktop.org/patch/605437/

No, please do not check in that patch; it will make my fix stop working.

Regards,
Jack

From: Alex Deucher 
Sent: Tuesday, 23 July 2024 03:52
To: Christian König 
Cc: Xiao, Jack ; amd-gfx@lists.freedesktop.org 
; Deucher, Alexander 
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

Does this patch fix it?
https://patchwork.freedesktop.org/patch/605437/

Alex

On Mon, Jul 22, 2024 at 7:21 AM Christian König
 wrote:
>
> What I meant is that the MES ring is now too small for the number of packets 
> written to it.
>
> Either reduce the num_hw_submission or increase the MES ring size.
>
> E.g. see this code here in amdgpu_ring_init:
>
> if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
> sched_hw_submission = max(sched_hw_submission, 256);
> else if (ring == &adev->sdma.instance[0].page)
> sched_hw_submission = 256;
>
> We are basically just missing a case for the MES here as far as I can see.
>
> Regards,
> Christian.
>
> On 22.07.24 at 10:46, Xiao, Jack wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> >> Can you try to reduce num_hw_submission for the MES ring?
>
> A smaller num_hw_submission should not help with this issue, because MES works 
> without the drm scheduler, like the legacy kiq. A smaller num_hw_submission will 
> only result in a smaller mes ring size and more waiting time.
>
> Regards,
> Jack
> 
> From: Christian König 
> Sent: Monday, 22 July 2024 16:20
> To: Xiao, Jack ; amd-gfx@lists.freedesktop.org 
> ; Deucher, Alexander 
> 
> Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
>
> Thx, but in that case this patch here won't help either, it just 
> mitigates the problem.
>
> Can you try to reduce num_hw_submission for the MES ring?
>
> Thanks,
> Christian.
>
> On 22.07.24 at 05:27, Xiao, Jack wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> >> I think we rather need to increase the MES ring size instead.
>
> Unfortunately, it doesn't work. I guess the mes firmware has a limitation.
>
> Regards,
> Jack
>
> 
> From: Christian König 
> Sent: Friday, 19 July 2024 23:44
> To: Xiao, Jack ; amd-gfx@lists.freedesktop.org 
> ; Deucher, Alexander 
> 
> Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow
>
> On 19.07.24 at 11:16, Jack Xiao wrote:
> > wait for enough free room in the ring before writing mes packets
> > to avoid ring buffer overflow.
> >
> > Signed-off-by: Jack Xiao 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
> >   drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
> >   2 files changed, 28 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
> > b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > index 8ce51b9236c1..68c74adf79f1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > @@ -168,7 +168,7 @@ static int 
> > mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
> >const char *op_str, *misc_op_str;
> >unsigned long flags;
> >u64 status_gpu_addr;
> > - u32 status_offset;
> > + u32 seq, status_offset;
> >u64 *status_ptr;
> >signed long r;
> >int ret;
> > @@ -196,6 +196,13 @@ static int 
> > mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
> >if (r)
> >goto error_unlock_free;
> >
> > + seq = ++ring->fence_drv.sync_seq;
> > + r = amdgpu_fence_wait_polling(ring,
> > +   seq - ring->fence_drv.num_fences_mask,
> > +   timeout);
> > + if (r < 1)
> > + goto error_undo;
> > +
> >api_status = (struct MES_API_STATUS *)((char *)pkt + api_status_off);
> >api_status->api_completion_fence_addr = status_gpu_addr;
> >api_status->api_completion_fence_value = 1;
> > @@ -208,8 +215,7 @@ static int 
> > mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
> >mes_status_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
> >mes_status_pkt.api_status.api_completion_fence_addr =
> >ring->fence_drv.gpu_addr;
> > - mes_status_pkt.api_status.api_completion_fence_v

Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

2024-07-22 Thread Xiao, Jack
[AMD Official Use Only - AMD Internal Distribution Only]

>> Can you try to reduce num_hw_submission for the MES ring?

A smaller num_hw_submission should not help with this issue, because MES works 
without the drm scheduler, like the legacy kiq. A smaller num_hw_submission will 
only result in a smaller mes ring size and more waiting time.
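
For context, the MES submit path (roughly as in 
mes_v11_0_submit_pkt_and_poll_completion(), simplified from the hunks quoted in 
this thread; the ring reservation and commit helpers shown here are assumptions 
drawn from the surrounding driver, not part of the quoted patch) writes packets 
directly into the ring and polls the completion fence itself, with no drm 
scheduler in between:

	spin_lock_irqsave(&mes->ring_lock, flags);

	/* reserve space in the MES ring (assumed helper in this sketch) */
	r = amdgpu_ring_alloc(ring, ndw);
	if (r)
		goto error_unlock_free;

	seq = ++ring->fence_drv.sync_seq;

	/* write the API packet and its status packet straight into the ring */
	amdgpu_ring_write_multiple(ring, pkt, size / 4);
	amdgpu_ring_write_multiple(ring, &mes_status_pkt, sizeof(mes_status_pkt) / 4);
	amdgpu_ring_commit(ring);

	spin_unlock_irqrestore(&mes->ring_lock, flags);

	/* no drm scheduler job involved: the caller polls the fence directly,
	 * so num_hw_submission only affects the ring size, not how many
	 * callers may pile packets into the ring */
	r = amdgpu_fence_wait_polling(ring, seq, timeout);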

Regards,
Jack

From: Christian König 
Sent: Monday, 22 July 2024 16:20
To: Xiao, Jack ; amd-gfx@lists.freedesktop.org 
; Deucher, Alexander 
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

Thx, but in that case this patch here won't help either, it just mitigates 
the problem.

Can you try to reduce num_hw_submission for the MES ring?

Thanks,
Christian.

On 22.07.24 at 05:27, Xiao, Jack wrote:

[AMD Official Use Only - AMD Internal Distribution Only]

>> I think we rather need to increase the MES ring size instead.

Unfortunately, it doesn't work. I guess the mes firmware has a limitation.

Regards,
Jack


From: Christian König <ckoenig.leichtzumer...@gmail.com>
Sent: Friday, 19 July 2024 23:44
To: Xiao, Jack <jack.x...@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

On 19.07.24 at 11:16, Jack Xiao wrote:
> wait for enough free room in the ring before writing mes packets
> to avoid ring buffer overflow.
>
> Signed-off-by: Jack Xiao <jack.x...@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
>   drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
>   2 files changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
> b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> index 8ce51b9236c1..68c74adf79f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> @@ -168,7 +168,7 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>const char *op_str, *misc_op_str;
>unsigned long flags;
>u64 status_gpu_addr;
> - u32 status_offset;
> + u32 seq, status_offset;
>u64 *status_ptr;
>signed long r;
>int ret;
> @@ -196,6 +196,13 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>if (r)
>goto error_unlock_free;
>
> + seq = ++ring->fence_drv.sync_seq;
> + r = amdgpu_fence_wait_polling(ring,
> +   seq - ring->fence_drv.num_fences_mask,
> +   timeout);
> + if (r < 1)
> + goto error_undo;
> +
>api_status = (struct MES_API_STATUS *)((char *)pkt + api_status_off);
>api_status->api_completion_fence_addr = status_gpu_addr;
>api_status->api_completion_fence_value = 1;
> @@ -208,8 +215,7 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>mes_status_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
>mes_status_pkt.api_status.api_completion_fence_addr =
>ring->fence_drv.gpu_addr;
> - mes_status_pkt.api_status.api_completion_fence_value =
> - ++ring->fence_drv.sync_seq;
> + mes_status_pkt.api_status.api_completion_fence_value = seq;
>
>amdgpu_ring_write_multiple(ring, &mes_status_pkt,
>   sizeof(mes_status_pkt) / 4);
> @@ -229,7 +235,7 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>dev_dbg(adev->dev, "MES msg=%d was emitted\n",
>x_pkt->header.opcode);
>
> - r = amdgpu_fence_wait_polling(ring, ring->fence_drv.sync_seq, timeout);
> + r = amdgpu_fence_wait_polling(ring, seq, timeout);
>if (r < 1 || !*status_ptr) {
>
>if (misc_op_str)
> @@ -252,6 +258,10 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>amdgpu_device_wb_free(adev, status_offset);
>return 0;
>
> +error_undo:
> + dev_err(adev->dev, "MES ring buffer is full.\n");
> + amdgpu_ring_undo(ring);
> +
>   error_unlock_free:
>spin_unlock_irqrestore(&mes->ring_lock, flags);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c 
> b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> index c9f74231ad59..48e01206bcc4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> @@ -154,7 +154,7 @@ static int 
> mes_v12_0_submit_pkt_and_poll_com

Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

2024-07-21 Thread Xiao, Jack
[AMD Official Use Only - AMD Internal Distribution Only]

>> I think we rather need to increase the MES ring size instead.

Unfortunately, it doesn't work. I guess the mes firmware has a limitation.

Regards,
Jack


From: Christian König 
Sent: Friday, 19 July 2024 23:44
To: Xiao, Jack ; amd-gfx@lists.freedesktop.org 
; Deucher, Alexander 
Subject: Re: [PATCH] drm/amdgpu/mes: fix mes ring buffer overflow

On 19.07.24 at 11:16, Jack Xiao wrote:
> wait for enough free room in the ring before writing mes packets
> to avoid ring buffer overflow.
>
> Signed-off-by: Jack Xiao 
> ---
>   drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
>   drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
>   2 files changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
> b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> index 8ce51b9236c1..68c74adf79f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> @@ -168,7 +168,7 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>const char *op_str, *misc_op_str;
>unsigned long flags;
>u64 status_gpu_addr;
> - u32 status_offset;
> + u32 seq, status_offset;
>u64 *status_ptr;
>signed long r;
>int ret;
> @@ -196,6 +196,13 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>if (r)
>goto error_unlock_free;
>
> + seq = ++ring->fence_drv.sync_seq;
> + r = amdgpu_fence_wait_polling(ring,
> +   seq - ring->fence_drv.num_fences_mask,
> +   timeout);
> + if (r < 1)
> + goto error_undo;
> +
>api_status = (struct MES_API_STATUS *)((char *)pkt + api_status_off);
>api_status->api_completion_fence_addr = status_gpu_addr;
>api_status->api_completion_fence_value = 1;
> @@ -208,8 +215,7 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>mes_status_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
>mes_status_pkt.api_status.api_completion_fence_addr =
>ring->fence_drv.gpu_addr;
> - mes_status_pkt.api_status.api_completion_fence_value =
> - ++ring->fence_drv.sync_seq;
> + mes_status_pkt.api_status.api_completion_fence_value = seq;
>
>amdgpu_ring_write_multiple(ring, &mes_status_pkt,
>   sizeof(mes_status_pkt) / 4);
> @@ -229,7 +235,7 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>dev_dbg(adev->dev, "MES msg=%d was emitted\n",
>x_pkt->header.opcode);
>
> - r = amdgpu_fence_wait_polling(ring, ring->fence_drv.sync_seq, timeout);
> + r = amdgpu_fence_wait_polling(ring, seq, timeout);
>if (r < 1 || !*status_ptr) {
>
>if (misc_op_str)
> @@ -252,6 +258,10 @@ static int 
> mes_v11_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>amdgpu_device_wb_free(adev, status_offset);
>return 0;
>
> +error_undo:
> + dev_err(adev->dev, "MES ring buffer is full.\n");
> + amdgpu_ring_undo(ring);
> +
>   error_unlock_free:
>spin_unlock_irqrestore(&mes->ring_lock, flags);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c 
> b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> index c9f74231ad59..48e01206bcc4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
> @@ -154,7 +154,7 @@ static int 
> mes_v12_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>const char *op_str, *misc_op_str;
>unsigned long flags;
>u64 status_gpu_addr;
> - u32 status_offset;
> + u32 seq, status_offset;
>u64 *status_ptr;
>signed long r;
>int ret;
> @@ -182,6 +182,13 @@ static int 
> mes_v12_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
>if (r)
>goto error_unlock_free;
>
> + seq = ++ring->fence_drv.sync_seq;
> + r = amdgpu_fence_wait_polling(ring,
> +   seq - ring->fence_drv.num_fences_mask,

That's what amdgpu_fence_emit_polling() does anyway.

So this here just moves the polling a bit earlier.

I think we rather need to increase the MES ring size instead.

Regards,
Christian.
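
For reference, a rough sketch of amdgpu_fence_emit_polling() (paraphrased from 
memory of amdgpu_fence.c, so the exact details may differ), which already does 
the same num_fences_mask wait before emitting the fence:

int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s,
			      uint32_t timeout)
{
	uint32_t seq;
	signed long r;

	if (!s)
		return -EINVAL;

	/* reserve the next polled sequence number */
	seq = ++ring->fence_drv.sync_seq;

	/* wait for older fences so only num_fences submissions stay in flight */
	r = amdgpu_fence_wait_polling(ring,
				      seq - ring->fence_drv.num_fences_mask,
				      timeout);
	if (r < 1)
		return -ETIMEDOUT;

	/* emit the fence packet for this submission */
	ring->funcs->emit_fence(ring, ring->fence_drv.gpu_addr, seq, 0);

	*s = seq;
	return 0;
}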


> +   timeout);
> + if (r < 1)
> + goto error_undo;
> +
>api_status = (struct MES_API_STATUS *)((cha

RE: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access CP_VMID_RESET

2023-12-21 Thread Xiao, Jack
[AMD Official Use Only - General]

I can see that soft reset can be called from amdgpu_device_gpu_recover and the pci 
error handler; both have called amdgpu_device_lock_reset_domain to acquire the 
reset lock before soft reset.
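
For reference, a rough sketch of the serialization being described (illustrative 
only; the real locking lives in the reset handlers and may differ in detail):

	/* Both amdgpu_device_gpu_recover() and the pci error handler take the
	 * reset-domain lock before walking the per-IP soft reset callbacks,
	 * so gfx_v11_0_soft_reset() never races with another soft reset. */
	amdgpu_device_lock_reset_domain(adev->reset_domain);

	/* ... suspend IPs and run the soft_reset callbacks, including
	 * gfx_v11_0_soft_reset(), which programs CP_VMID_RESET ... */

	amdgpu_device_unlock_reset_domain(adev->reset_domain);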

Regards,
Jack

-Original Message-
From: Christian König 
Sent: Wednesday, December 20, 2023 10:06 PM
To: Xiao, Jack ; Alex Deucher 
Cc: Deucher, Alexander ; 
amd-gfx@lists.freedesktop.org; Zhang, Hawking 
Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access 
CP_VMID_RESET

Well not the reset lock, but there should only be a single reset queue which 
this runs on.

Regards,
Christian.

On 20.12.23 at 10:49, Xiao, Jack wrote:
> [AMD Official Use Only - General]
>
> It's already protected by the reset lock. In my understanding, soft reset 
> should not run in parallel.
>
> Regards,
> Jack
>
> -Original Message-
> From: Alex Deucher 
> Sent: Wednesday, December 20, 2023 1:04 AM
> To: Xiao, Jack 
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; Zhang, Hawking 
> Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before
> access CP_VMID_RESET
>
> On Tue, Dec 19, 2023 at 4:30 AM Jack Xiao  wrote:
>> It's required to take the gfx mutex before access to CP_VMID_RESET,
>> for there is a race condition with CP firmware to write the register.
>>
>> Signed-off-by: Jack Xiao 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 20 
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> index bdcf96df69e6..ae3370d34d11 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> @@ -4518,6 +4518,22 @@ static int gfx_v11_0_soft_reset(void *handle)
>>  }
>>  }
>>
> We probably want a CPU mutex or spinlock to protect this as well unless this 
> is already protected by the reset lock.
>
> Alex
>
>> +   /* Try to acquire the gfx mutex before access to CP_VMID_RESET */
>> +   for (i = 0; i < adev->usec_timeout; i++) {
>> +   /* Request with MeId=2, PipeId=0 */
>> +   tmp = REG_SET_FIELD(0, CP_GFX_INDEX_MUTEX, REQUEST, 1);
>> +   tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, CLIENTID, 4);
>> +   WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
>> +   if (RREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX) == tmp)
>> +   break;
>> +   udelay(1);
>> +   }
>> +
>> +   if (i >= adev->usec_timeout) {
>> +   printk("Failed to acquire the gfx mutex during soft 
>> reset\n");
>> +   return -EINVAL;
>> +   }
>> +
>>  WREG32_SOC15(GC, 0, regCP_VMID_RESET, 0xfffe);
>>
>>  // Read CP_VMID_RESET register three times.
>> @@ -4526,6 +4542,10 @@ static int gfx_v11_0_soft_reset(void *handle)
>>  RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>>  RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>>
>> +   /* release the gfx mutex */
>> +   tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, REQUEST, 0);
>> +   WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
>> +
>>  for (i = 0; i < adev->usec_timeout; i++) {
>>  if (!RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) &&
>>  !RREG32_SOC15(GC, 0, regCP_GFX_HQD_ACTIVE))
>> --
>> 2.41.0
>>



RE: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access CP_VMID_RESET

2023-12-20 Thread Xiao, Jack
[AMD Official Use Only - General]

It's already protected by the reset lock. In my understanding, soft reset 
should not run in parallel.

Regards,
Jack

-Original Message-
From: Alex Deucher 
Sent: Wednesday, December 20, 2023 1:04 AM
To: Xiao, Jack 
Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
; Zhang, Hawking 
Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access 
CP_VMID_RESET

On Tue, Dec 19, 2023 at 4:30 AM Jack Xiao  wrote:
>
> It's required to take the gfx mutex before access to CP_VMID_RESET,
> for there is a race condition with CP firmware to write the register.
>
> Signed-off-by: Jack Xiao 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> index bdcf96df69e6..ae3370d34d11 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> @@ -4518,6 +4518,22 @@ static int gfx_v11_0_soft_reset(void *handle)
> }
> }
>

We probably want a CPU mutex or spinlock to protect this as well unless this is 
already protected by the reset lock.

Alex

> +   /* Try to acquire the gfx mutex before access to CP_VMID_RESET */
> +   for (i = 0; i < adev->usec_timeout; i++) {
> +   /* Request with MeId=2, PipeId=0 */
> +   tmp = REG_SET_FIELD(0, CP_GFX_INDEX_MUTEX, REQUEST, 1);
> +   tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, CLIENTID, 4);
> +   WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
> +   if (RREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX) == tmp)
> +   break;
> +   udelay(1);
> +   }
> +
> +   if (i >= adev->usec_timeout) {
> +   printk("Failed to acquire the gfx mutex during soft reset\n");
> +   return -EINVAL;
> +   }
> +
> WREG32_SOC15(GC, 0, regCP_VMID_RESET, 0xfffe);
>
> // Read CP_VMID_RESET register three times.
> @@ -4526,6 +4542,10 @@ static int gfx_v11_0_soft_reset(void *handle)
> RREG32_SOC15(GC, 0, regCP_VMID_RESET);
> RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>
> +   /* release the gfx mutex */
> +   tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, REQUEST, 0);
> +   WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
> +
> for (i = 0; i < adev->usec_timeout; i++) {
> if (!RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) &&
> !RREG32_SOC15(GC, 0, regCP_GFX_HQD_ACTIVE))
> --
> 2.41.0
>


RE: [PATCH] drm/amdgpu/mes11: remove aggregated doorbell code

2023-10-16 Thread Xiao, Jack
[AMD Official Use Only - General]

If the aggregated doorbell is not intended for user queue oversubscription, 
it's ok to remove it.

Reviewed-by: Jack Xiao 

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Friday, October 13, 2023 9:53 PM
To: Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu/mes11: remove aggregated doorbell code

Ping?

On Wed, Oct 11, 2023 at 2:27 PM Alex Deucher  wrote:
>
> It's not enabled in hardware so the code is dead.
> Remove it.
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 86
> +-  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c |
> 56 -  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 83
> -
>  3 files changed, 40 insertions(+), 185 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> index 27b224b0688a..91c07ab4f14e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> @@ -5170,45 +5170,17 @@ static u64 gfx_v11_0_ring_get_wptr_gfx(struct
> amdgpu_ring *ring)  static void gfx_v11_0_ring_set_wptr_gfx(struct
> amdgpu_ring *ring)  {
> struct amdgpu_device *adev = ring->adev;
> -   uint32_t *wptr_saved;
> -   uint32_t *is_queue_unmap;
> -   uint64_t aggregated_db_index;
> -   uint32_t mqd_size = adev->mqds[AMDGPU_HW_IP_GFX].mqd_size;
> -   uint64_t wptr_tmp;
>
> -   if (ring->is_mes_queue) {
> -   wptr_saved = (uint32_t *)(ring->mqd_ptr + mqd_size);
> -   is_queue_unmap = (uint32_t *)(ring->mqd_ptr + mqd_size +
> - sizeof(uint32_t));
> -   aggregated_db_index =
> -   amdgpu_mes_get_aggregated_doorbell_index(adev,
> -
> ring->hw_prio);
> -
> -   wptr_tmp = ring->wptr & ring->buf_mask;
> -   atomic64_set((atomic64_t *)ring->wptr_cpu_addr, wptr_tmp);
> -   *wptr_saved = wptr_tmp;
> -   /* assume doorbell always being used by mes mapped queue */
> -   if (*is_queue_unmap) {
> -   WDOORBELL64(aggregated_db_index, wptr_tmp);
> -   WDOORBELL64(ring->doorbell_index, wptr_tmp);
> -   } else {
> -   WDOORBELL64(ring->doorbell_index, wptr_tmp);
> -
> -   if (*is_queue_unmap)
> -   WDOORBELL64(aggregated_db_index, wptr_tmp);
> -   }
> +   if (ring->use_doorbell) {
> +   /* XXX check if swapping is necessary on BE */
> +   atomic64_set((atomic64_t *)ring->wptr_cpu_addr,
> +ring->wptr);
> +   WDOORBELL64(ring->doorbell_index, ring->wptr);
> } else {
> -   if (ring->use_doorbell) {
> -   /* XXX check if swapping is necessary on BE */
> -   atomic64_set((atomic64_t *)ring->wptr_cpu_addr,
> -ring->wptr);
> -   WDOORBELL64(ring->doorbell_index, ring->wptr);
> -   } else {
> -   WREG32_SOC15(GC, 0, regCP_RB0_WPTR,
> -lower_32_bits(ring->wptr));
> -   WREG32_SOC15(GC, 0, regCP_RB0_WPTR_HI,
> -upper_32_bits(ring->wptr));
> -   }
> +   WREG32_SOC15(GC, 0, regCP_RB0_WPTR,
> +lower_32_bits(ring->wptr));
> +   WREG32_SOC15(GC, 0, regCP_RB0_WPTR_HI,
> +upper_32_bits(ring->wptr));
> }
>  }
>
> @@ -5233,42 +5205,14 @@ static u64
> gfx_v11_0_ring_get_wptr_compute(struct amdgpu_ring *ring)  static void
> gfx_v11_0_ring_set_wptr_compute(struct amdgpu_ring *ring)  {
> struct amdgpu_device *adev = ring->adev;
> -   uint32_t *wptr_saved;
> -   uint32_t *is_queue_unmap;
> -   uint64_t aggregated_db_index;
> -   uint32_t mqd_size = adev->mqds[AMDGPU_HW_IP_COMPUTE].mqd_size;
> -   uint64_t wptr_tmp;
>
> -   if (ring->is_mes_queue) {
> -   wptr_saved = (uint32_t *)(ring->mqd_ptr + mqd_size);
> -   is_queue_unmap = (uint32_t *)(ring->mqd_ptr + mqd_size +
> - sizeof(uint32_t));
> -   aggregated_db_index =
> -   amdgpu_mes_get_aggregated_doorbell_index(adev,
> -
> ring->hw_prio);
> -
> -   wptr_tmp = ring->wptr & ring->buf_mask;
> -   atomic64_set((atomic64_t *)ring->wptr_cpu_addr, wptr_tmp);
> -   *wptr_saved = wptr_tmp;
> -   /* assume doorbell always used by mes mapped queue */
> -   if (*is_queue_unmap) {
> -   

Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM

2023-05-15 Thread Xiao, Jack
[AMD Official Use Only - General]

Yes, it should work. In that case, the routine would behave like *reset*; that 
means the driver would effectively create a new queue instead of restoring a 
saved queue. I can have a try with this solution.
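
For reference, a rough sketch of the init-vs-restore pattern Alex points to in 
gfx_v*_0_kcq_init_queue() (paraphrased from memory; field and helper names may 
differ across ASIC generations, and the MQD programming itself is elided):

static int kcq_init_queue_sketch(struct amdgpu_ring *ring)
{
	struct amdgpu_device *adev = ring->adev;
	struct v11_compute_mqd *mqd = ring->mqd_ptr;
	int mqd_idx = ring - &adev->gfx.compute_ring[0];

	if (!amdgpu_in_reset(adev) && !adev->in_suspend) {
		/* cold init: build the MQD and keep a backup copy */
		memset((void *)mqd, 0, sizeof(*mqd));
		/* ... program the MQD for this queue ... */
		if (adev->gfx.mec.mqd_backup[mqd_idx])
			memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(*mqd));
	} else {
		/* reset/resume: restore the MQD from the backup, i.e. the
		 * queue is effectively re-created rather than resumed */
		if (adev->gfx.mec.mqd_backup[mqd_idx])
			memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(*mqd));
		ring->wptr = 0;
		amdgpu_ring_clear_ring(ring);
	}
	return 0;
}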

Regards,
Jack

From: Alex Deucher 
Sent: Monday, 15 May 2023 21:08
To: Xiao, Jack 
Cc: amd-gfx@lists.freedesktop.org ; Deucher, 
Alexander 
Subject: Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM

On Mon, May 15, 2023 at 6:40 AM Xiao, Jack  wrote:
>
> [AMD Official Use Only - General]
>
> The MQD data in VRAM would be lost after S3, because the MQD bo is pinned down 
> as a kernel bo and can't be evicted to system memory.
> AFAIK, the driver should not allocate/free memory during S3, as there have been 
> issues observed with memory allocation during S3.

We restore the contents of the MQD in gfx_v*_0_gfx_init_queue() and
gfx_v*_0_kcq_init_queue().

Alex

>
> Regards,
> Jack
>
> -Original Message-
> From: Alex Deucher 
> Sent: Friday, May 12, 2023 9:13 PM
> To: Xiao, Jack 
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
> 
> Subject: Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM
>
> On Fri, May 12, 2023 at 4:16 AM Jack Xiao  wrote:
> >
> > Make the preemption optimization take effect only for SRIOV, for it caused
> > failure to resume from S3.
>
> Can you elaborate?  We ultimately want MQDs in VRAM for performance reasons 
> even for bare metal.
>
> Alex
>
> >
> > Signed-off-by: Jack Xiao 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 ++-
> > drivers/gpu/drm/amd/amdgpu/mes_v10_1.c  | 7 +--
> > drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  | 7 +--
> >  3 files changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > index a22d88a4178a..1b795b7bbf38 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> > @@ -385,7 +385,8 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
> > u32 domain = AMDGPU_GEM_DOMAIN_GTT;
> >
> > /* Only enable on gfx10 and 11 for now to avoid changing behavior 
> > on older chips */
> > -   if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0))
> > +   if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0) &&
> > +   amdgpu_sriov_vf(adev))
> > domain |= AMDGPU_GEM_DOMAIN_VRAM;
> >
> > /* create MQD for KIQ */
> > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > index 4560476c7c31..5c3d3f6c7ebd 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> > @@ -889,6 +889,7 @@ static int mes_v10_1_mqd_sw_init(struct
> > amdgpu_device *adev,  {
> > int r, mqd_size = sizeof(struct v10_compute_mqd);
> > struct amdgpu_ring *ring;
> > +   u32 domain = AMDGPU_GEM_DOMAIN_GTT;
> >
> > if (pipe == AMDGPU_MES_KIQ_PIPE)
> > ring = &adev->gfx.kiq[0].ring; @@ -900,9 +901,11 @@
> > static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
> > if (ring->mqd_obj)
> > return 0;
> >
> > +   if (amdgpu_sriov_vf(adev))
> > +   domain |= AMDGPU_GEM_DOMAIN_VRAM;
> > +
> > r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
> > -   AMDGPU_GEM_DOMAIN_VRAM |
> > -   AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
> > +   domain, &ring->mqd_obj,
> > &ring->mqd_gpu_addr, &ring->mqd_ptr);
> > if (r) {
> > dev_warn(adev->dev, "failed to create ring mqd bo
> > (%d)", r); diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > index 3adb450eec07..79a4d2bfd94a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > @@ -987,6 +987,7 @@ static int mes_v11_0_mqd_sw_init(struct
> > amdgpu_device *adev,  {
> > int r, mqd_size = sizeof(struct v11_compute_mqd);
> > struct amdgpu_ring *ring;
> > +   u32 domain = AMDGPU_GEM_DOMAIN_GTT;
> >
> > if (pipe == AMDGPU_MES_KIQ_PIPE)
> > ring = &adev->gfx.kiq[0].ring; @@ -998,9 +999,11 @@
> > static int mes_v11_0_mqd_s

RE: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM

2023-05-15 Thread Xiao, Jack
[AMD Official Use Only - General]

The MQD data in VRAM would be lost after S3, because the MQD bo is pinned down as 
a kernel bo and can't be evicted to system memory.
AFAIK, the driver should not allocate/free memory during S3, as there have been 
issues observed with memory allocation during S3.

Regards,
Jack

-Original Message-
From: Alex Deucher  
Sent: Friday, May 12, 2023 9:13 PM
To: Xiao, Jack 
Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander 

Subject: Re: [PATCH] drm/amdgpu: fix S3 issue if MQD in VRAM

On Fri, May 12, 2023 at 4:16 AM Jack Xiao  wrote:
>
> Make the preemption optimization take effect only for SRIOV, for it caused 
> failure to resume from S3.

Can you elaborate?  We ultimately want MQDs in VRAM for performance reasons 
even for bare metal.

Alex

>
> Signed-off-by: Jack Xiao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 ++-  
> drivers/gpu/drm/amd/amdgpu/mes_v10_1.c  | 7 +--  
> drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  | 7 +--
>  3 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index a22d88a4178a..1b795b7bbf38 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -385,7 +385,8 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
> u32 domain = AMDGPU_GEM_DOMAIN_GTT;
>
> /* Only enable on gfx10 and 11 for now to avoid changing behavior on 
> older chips */
> -   if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0))
> +   if (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(10, 0, 0) &&
> +   amdgpu_sriov_vf(adev))
> domain |= AMDGPU_GEM_DOMAIN_VRAM;
>
> /* create MQD for KIQ */
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c 
> b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> index 4560476c7c31..5c3d3f6c7ebd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
> @@ -889,6 +889,7 @@ static int mes_v10_1_mqd_sw_init(struct 
> amdgpu_device *adev,  {
> int r, mqd_size = sizeof(struct v10_compute_mqd);
> struct amdgpu_ring *ring;
> +   u32 domain = AMDGPU_GEM_DOMAIN_GTT;
>
> if (pipe == AMDGPU_MES_KIQ_PIPE)
> ring = &adev->gfx.kiq[0].ring; @@ -900,9 +901,11 @@ 
> static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
> if (ring->mqd_obj)
> return 0;
>
> +   if (amdgpu_sriov_vf(adev))
> +   domain |= AMDGPU_GEM_DOMAIN_VRAM;
> +
> r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
> -   AMDGPU_GEM_DOMAIN_VRAM |
> -   AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
> +   domain, &ring->mqd_obj,
> &ring->mqd_gpu_addr, &ring->mqd_ptr);
> if (r) {
> dev_warn(adev->dev, "failed to create ring mqd bo 
> (%d)", r); diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
> b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> index 3adb450eec07..79a4d2bfd94a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> @@ -987,6 +987,7 @@ static int mes_v11_0_mqd_sw_init(struct 
> amdgpu_device *adev,  {
> int r, mqd_size = sizeof(struct v11_compute_mqd);
> struct amdgpu_ring *ring;
> +   u32 domain = AMDGPU_GEM_DOMAIN_GTT;
>
> if (pipe == AMDGPU_MES_KIQ_PIPE)
> ring = &adev->gfx.kiq[0].ring; @@ -998,9 +999,11 @@ 
> static int mes_v11_0_mqd_sw_init(struct amdgpu_device *adev,
> if (ring->mqd_obj)
> return 0;
>
> +   if (amdgpu_sriov_vf(adev))
> +   domain |= AMDGPU_GEM_DOMAIN_VRAM;
> +
> r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
> -   AMDGPU_GEM_DOMAIN_VRAM |
> -   AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
> +   domain, &ring->mqd_obj,
> &ring->mqd_gpu_addr, &ring->mqd_ptr);
> if (r) {
> dev_warn(adev->dev, "failed to create ring mqd bo 
> (%d)", r);
> --
> 2.37.3
>


RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable

2023-02-10 Thread Xiao, Jack
[AMD Official Use Only - General]

>> The driver is resumed before the core Linux memory management is ready to 
>> serve allocations. Swap, for example, isn't turned on yet.

>> This means that this stuff only worked because we were able to allocate 
>> memory from the pool which isn't guaranteed in any way.

Memory allocation failure can happen at any time; every programmer should 
handle it correctly.
If a memory allocation failure is not a critical error and the driver can 
gracefully continue to run, it should be acceptable.
The memory allocation failure during the mes self test is such an acceptable 
one: it will not hang the system, and
the driver can gracefully continue to run.

Regards,
Jack

From: Koenig, Christian 
Sent: Friday, February 10, 2023 6:25 PM
To: Xiao, Jack ; Quan, Evan ; Christian 
König ; amd-gfx@lists.freedesktop.org; 
Deucher, Alexander 
Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is 
unavailable

Hi Jack,

On 10.02.23 at 10:51, Xiao, Jack wrote:

[AMD Official Use Only - General]

Hi Christian,

>> Allocating buffers temporarily for stuff like that is illegal during resume.

Can you *deeply* explain why it is illegal during the ip late_init stage, which is 
part of the resume stage?

Well no, I don't have the time to explain this to everybody individually.

[Jack] ...

In my understanding, after gmc is ready the driver can allocate/free kernel bos, 
and after SDMA is ready,
eviction should work. What else prevents the driver from doing that during resume?

The driver is resumed before the core Linux memory management is ready to 
serve allocations. Swap, for example, isn't turned on yet.

This means that this stuff only worked because we were able to allocate memory 
from the pool which isn't guaranteed in any way.

>> I strongly suggest to just remove the MES test. It's abusing the kernel ring 
>> interface in a way we didn't want anyway and is currently replaced by 
>> Shahanks work.

The kernel mes unit test is very meaningful and important: it catches MES firmware 
issues early, before an issue spreads to other components like kfd/umd and the 
problem becomes complicated. Otherwise, the issue would become hard to catch and 
debug.

Secondly, since the mes unit test is self-contained and has no dependencies, it is 
part of the milestone to qualify MES as ready, indicating that it can be delivered 
to other components, especially during bringup. It is like the ring test and ib 
test indicating that gfx is ready to go. After fully transitioning to gfx user 
queues, the mes unit test may be the only unit test that can indicate gfx is ready 
at the very early stage of bringup, when UMD is not ready.

Alex and I are the maintainers of the driver who decide stuff like that, 
and at least I don't really buy that argument. The ring, IB and benchmark tests 
are in the kernel module because they are simple.

If we have a complicated unit test like simulating the creation of an MES user queue 
to test the firmware functionality, then that's really overkill. Especially when 
you need to allocate memory for it.

We previously had people requesting to add shader code and other complicated 
testing and rejected that as well because it just bloat up the kernel driver 
unnecessarily.

If we can modify the MES test to not abuse the amdgpu_ring structure and only work 
with memory from the SA, for example, we could keep this, but not really in the 
current state.

Regards,
Christian.


Regards,
Jack

From: Koenig, Christian <christian.koe...@amd.com>
Sent: Friday, February 10, 2023 4:08 PM
To: Quan, Evan <evan.q...@amd.com>; Christian König <ckoenig.leichtzumer...@gmail.com>; Xiao, Jack <jack.x...@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>
Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is 
unavailable

Hi Evan,

yeah, exactly that's what this warning should prevent. Allocating buffers 
temporarily for stuff like that is illegal during resume.

I strongly suggest to just remove the MES test. It's abusing the kernel ring 
interface in a way we didn't want anyway and is currently replaced by Shahanks 
work.

Regards,
Christian.
On 10.02.23 at 05:12, Quan, Evan wrote:

[AMD Official Use Only - General]

Hi Jack,

Are you trying to fix the call trace that popped up on resume below?
It seems mes created some bo for its self test and freed it up later at the 
final stage of the resume process.
All this happened before the in_suspend flag was cleared, and that triggered the 
call trace.
Is my understanding correct?

[74084.799260] WARNING: CPU: 2 PID: 2891 at 
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:425 amdgpu_bo_free_kernel+0xfc/0x110 
[amdgpu]
[74084.811019] Modules linked in: nls_iso8859_1 amdgpu(OE) iommu_v2 gpu_sched 
drm_buddy drm_ttm_helper ttm

RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable

2023-02-10 Thread Xiao, Jack
[AMD Official Use Only - General]

Hi Christian,

>> Allocating buffers temporarily for stuff like that is illegal during resume.

Can you *deeply* explain why it is illegal during the ip late_init stage, which is 
part of the resume stage?
In my understanding, after gmc is ready the driver can allocate/free kernel bos, 
and after SDMA is ready,
eviction should work. What else prevents the driver from doing that during resume?

>> I strongly suggest to just remove the MES test. It's abusing the kernel ring 
>> interface in a way we didn't want anyway and is currently replaced by 
>> Shahanks work.

The kernel mes unit test is very meaningful and important: it catches MES firmware 
issues early, before an issue spreads to other components like kfd/umd and the 
problem becomes complicated. Otherwise, the issue would become hard to catch and 
debug.

Secondly, since the mes unit test is self-contained and has no dependencies, it is 
part of the milestone to qualify MES as ready, indicating that it can be delivered 
to other components, especially during bringup. It is like the ring test and ib 
test indicating that gfx is ready to go. After fully transitioning to gfx user 
queues, the mes unit test may be the only unit test that can indicate gfx is ready 
at the very early stage of bringup, when UMD is not ready.

Regards,
Jack

From: Koenig, Christian 
Sent: Friday, February 10, 2023 4:08 PM
To: Quan, Evan ; Christian König 
; Xiao, Jack ; 
amd-gfx@lists.freedesktop.org; Deucher, Alexander 
Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is 
unavailable

Hi Evan,

yeah, exactly that's what this warning should prevent. Allocating buffers 
temporarily for stuff like that is illegal during resume.

I strongly suggest to just remove the MES test. It's abusing the kernel ring 
interface in a way we didn't want anyway and is currently replaced by Shahanks 
work.

Regards,
Christian.
On 10.02.23 at 05:12, Quan, Evan wrote:

[AMD Official Use Only - General]

Hi Jack,

Are you trying to fix the call trace that popped up on resume below?
It seems mes created some bo for its self test and freed it up later at the 
final stage of the resume process.
All this happened before the in_suspend flag was cleared, and that triggered the 
call trace.
Is my understanding correct?

[74084.799260] WARNING: CPU: 2 PID: 2891 at 
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:425 amdgpu_bo_free_kernel+0xfc/0x110 
[amdgpu]
[74084.811019] Modules linked in: nls_iso8859_1 amdgpu(OE) iommu_v2 gpu_sched 
drm_buddy drm_ttm_helper ttm drm_display_helper drm_kms_helper i2c_algo_bit 
fb_sys_fops syscopyarea sysfillrect sysimgblt snd_sm
[74084.811042]  ip_tables x_tables autofs4 hid_logitech_hidpp hid_logitech_dj 
hid_generic e1000e usbhid ptp uas hid video i2c_i801 ahci pps_core crc32_pclmul 
i2c_smbus usb_storage libahci wmi
[74084.914519] CPU: 2 PID: 2891 Comm: kworker/u16:38 Tainted: GW IOE
  6.0.0-custom #1
[74084.923146] Hardware name: ASUS System Product Name/PRIME Z390-A, BIOS 2004 
11/02/2021
[74084.931074] Workqueue: events_unbound async_run_entry_fn
[74084.936393] RIP: 0010:amdgpu_bo_free_kernel+0xfc/0x110 [amdgpu]
[74084.942422] Code: 00 4d 85 ed 74 08 49 c7 45 00 00 00 00 00 4d 85 e4 74 08 
49 c7 04 24 00 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc <0f> 0b e9 
39 ff ff ff 3d 00 fe ff ff 0f 85 75 96 47 00 ebf
[74084.961199] RSP: :bed6812ebb90 EFLAGS: 00010202
[74084.966435] RAX:  RBX: bed6812ebc50 RCX: 
[74084.973578] RDX: bed6812ebc70 RSI: bed6812ebc60 RDI: bed6812ebc50
[74084.980725] RBP: bed6812ebbb8 R08:  R09: 01ff
[74084.987869] R10: bed6812ebb40 R11:  R12: bed6812ebc70
[74084.995015] R13: bed6812ebc60 R14: 963a2945cc00 R15: 9639c7da5630
[74085.002160] FS:  () GS:963d1dc8() 
knlGS:
[74085.010262] CS:  0010 DS:  ES:  CR0: 80050033
[74085.016016] CR2:  CR3: 000377c0a001 CR4: 003706e0
[74085.023164] DR0:  DR1:  DR2: 
[74085.030307] DR3:  DR6: fffe0ff0 DR7: 0400
[74085.037453] Call Trace:
[74085.039911]  
[74085.042023]  amdgpu_mes_self_test+0x385/0x460 [amdgpu]
[74085.047293]  mes_v11_0_late_init+0x44/0x50 [amdgpu]
[74085.052291]  amdgpu_device_ip_late_init+0x50/0x270 [amdgpu]
[74085.058032]  amdgpu_device_resume+0xb0/0x2d0 [amdgpu]
[74085.063187]  amdgpu_pmops_resume+0x37/0x70 [amdgpu]
[74085.068162]  pci_pm_resume+0x68/0x100
[74085.071836]  ? pci_legacy_resume+0x80/0x80
[74085.075943]  dpm_run_callback+0x4c/0x160
[74085.079873]  device_resume+0xad/0x210
[74085.083546]  async_resume+0x1e/0x40
[74085.087046]  async_run_entry_fn+0x30/0x120
[74085.091152]  process_one_work+0x21a/0x3f0
[74085.095173]  worker_thread+0x50/0x3e0
[74085.098845]  ? process_one_work+0x3f0/0x3f0
[74085.103039]  kthread+0xfa

RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable

2023-02-06 Thread Xiao, Jack
[AMD Official Use Only - General]

>> >> It's simply not allowed to free up resources during suspend since those 
>> >> can't be acquired again during resume.
>> The in_suspend flag is set at the beginning of suspend and unset at the end of 
>> resume. It can't filter the case you mentioned.

Why not? This is exactly what it should do.

[Jack] If resources are freed during resume, it should not hit the issue you 
described. But only checking the in_suspend flag would treat these cases as warnings.

Regards,
Jack

From: Koenig, Christian 
Sent: Monday, February 6, 2023 4:06 PM
To: Xiao, Jack ; Christian König 
; amd-gfx@lists.freedesktop.org; Deucher, 
Alexander 
Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is 
unavailable

On 06.02.23 at 08:23, Xiao, Jack wrote:

[AMD Official Use Only - General]

>> Nope, that is not related to any hw state.

We can use another flag.

>> It's simply not allowed to free up resources during suspend since those 
>> can't be acquired again during resume.
The in_suspend flag is set at the beginning of suspend and unset at the end of 
resume. It can't filter the case you mentioned.

Why not? This is exactly what it should do.

Do you know the root cause of these cases hitting the issue? Then we can find 
an exact point at which to warn about the freeing behavior.

Well, the root cause is programming errors. See, between suspending and resuming 
you should not allocate or free memory.

Otherwise we can run into trouble. And this check here is one part of that; we 
should probably add another warning during memory allocation. But this here 
is certainly correct.

Regards,
Christian.


Thanks,
Jack

From: Christian König <ckoenig.leichtzumer...@gmail.com>
Sent: Friday, February 3, 2023 9:20 PM
To: Xiao, Jack <jack.x...@amd.com>; Koenig, Christian <christian.koe...@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>
Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is 
unavailable

Nope, that is not related to any hw state.

It's simply not allowed to free up resources during suspend since those can't 
be acquired again during resume.

We had a couple of cases now where this was wrong. If you get a warning from 
that please fix the code which tried to free something during suspend instead.

Regards,
Christian.
On 03.02.23 at 07:04, Xiao, Jack wrote:

[AMD Official Use Only - General]

>> It's simply illegal to free up memory during suspend.
Why? In my understanding, the limit was caused by DMA shutdown.

Regards,
Jack

From: Koenig, Christian <christian.koe...@amd.com>
Sent: Thursday, February 2, 2023 7:43 PM
To: Xiao, Jack <jack.x...@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>
Subject: AW: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is 
unavailable

Big NAK to this! This warning is not related in any way to the hw state.

It's simply illegal to free up memory during suspend.

Regards,
Christian.


From: Xiao, Jack <jack.x...@amd.com>
Sent: Thursday, 2 February 2023 10:54
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>; Koenig, Christian <christian.koe...@amd.com>
Cc: Xiao, Jack <jack.x...@amd.com>
Subject: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable

Reduce warnings, only warn when DMA is unavailable.

Signed-off-by: Jack Xiao <jack.x...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 2d237f3d3a2e..e3e3764ea697 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -422,7 +422,8 @@ void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 
*gpu_addr,
 if (*bo == NULL)
 return;

-   WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend);
+   WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend &&
+   
!amdgpu_ttm_adev((*bo)->tbo.bdev)->ip_blocks[AMD_IP_BLOCK_TYPE_SDMA].status.hw);

 if (likely(amdgpu_bo_reserve(*bo, true) == 0)) {
 if (cpu_addr)
--
2.37.3




RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable

2023-02-05 Thread Xiao, Jack
[AMD Official Use Only - General]

>> Nope, that is not related to any hw state.

We can use another flag.

>> It's simply not allowed to free up resources during suspend since those 
>> can't be acquired again during resume.
The in_suspend flag is set at the beginning of suspend and unset at the end of 
resume. It can't filter the case you mentioned.
Do you know the root cause of these cases hitting the issue? Then we can find 
an exact point at which to warn about the freeing behavior.

Thanks,
Jack

From: Christian König 
Sent: Friday, February 3, 2023 9:20 PM
To: Xiao, Jack ; Koenig, Christian 
; amd-gfx@lists.freedesktop.org; Deucher, Alexander 

Subject: Re: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is 
unavailable

Nope, that is not related to any hw state.

It's simply not allowed to free up resources during suspend since those can't 
be acquired again during resume.

We had a couple of cases now where this was wrong. If you get a warning from 
that please fix the code which tried to free something during suspend instead.

Regards,
Christian.
On 03.02.23 at 07:04, Xiao, Jack wrote:

[AMD Official Use Only - General]

>> It's simply illegal to free up memory during suspend.
Why? In my understanding, the limit was caused by DMA shutdown.

Regards,
Jack

From: Koenig, Christian <christian.koe...@amd.com>
Sent: Thursday, February 2, 2023 7:43 PM
To: Xiao, Jack <jack.x...@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>
Subject: AW: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is 
unavailable

Big NAK to this! This warning is not related in any way to the hw state.

It's simply illegal to free up memory during suspend.

Regards,
Christian.


From: Xiao, Jack <jack.x...@amd.com>
Sent: Thursday, 2 February 2023 10:54
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>; Koenig, Christian <christian.koe...@amd.com>
Cc: Xiao, Jack <jack.x...@amd.com>
Subject: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable

Reduce warnings, only warn when DMA is unavailable.

Signed-off-by: Jack Xiao <jack.x...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 2d237f3d3a2e..e3e3764ea697 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -422,7 +422,8 @@ void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 
*gpu_addr,
 if (*bo == NULL)
 return;

-   WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend);
+   WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend &&
+   
!amdgpu_ttm_adev((*bo)->tbo.bdev)->ip_blocks[AMD_IP_BLOCK_TYPE_SDMA].status.hw);

 if (likely(amdgpu_bo_reserve(*bo, true) == 0)) {
 if (cpu_addr)
--
2.37.3



RE: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable

2023-02-02 Thread Xiao, Jack
[AMD Official Use Only - General]

>> It's simply illegal to free up memory during suspend.
Why? In my understanding, the limit was caused by DMA shutdown.

Regards,
Jack

From: Koenig, Christian 
Sent: Thursday, February 2, 2023 7:43 PM
To: Xiao, Jack ; amd-gfx@lists.freedesktop.org; Deucher, 
Alexander 
Subject: AW: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is 
unavailable

Big NAK to this! This warning is not related in any way to the hw state.

It's simply illegal to free up memory during suspend.

Regards,
Christian.

____
From: Xiao, Jack <jack.x...@amd.com>
Sent: Thursday, 2 February 2023 10:54
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander <alexander.deuc...@amd.com>; Koenig, Christian <christian.koe...@amd.com>
Cc: Xiao, Jack <jack.x...@amd.com>
Subject: [PATCH] drm/amdgpu: only WARN freeing buffers when DMA is unavailable

Reduce warnings, only warn when DMA is unavailable.

Signed-off-by: Jack Xiao <jack.x...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 2d237f3d3a2e..e3e3764ea697 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -422,7 +422,8 @@ void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 
*gpu_addr,
 if (*bo == NULL)
 return;

-   WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend);
+   WARN_ON(amdgpu_ttm_adev((*bo)->tbo.bdev)->in_suspend &&
+   
!amdgpu_ttm_adev((*bo)->tbo.bdev)->ip_blocks[AMD_IP_BLOCK_TYPE_SDMA].status.hw);

 if (likely(amdgpu_bo_reserve(*bo, true) == 0)) {
 if (cpu_addr)
--
2.37.3


Re: [PATCH 2/2] drm/amdgpu: remove deprecated MES version vars

2022-10-26 Thread Xiao, Jack
[AMD Official Use Only - General]

The series is Reviewed-by: Jack Xiao 

Regards,
Jack

From: Sider, Graham 
Sent: Wednesday, 26 October 2022 03:20
To: amd-gfx@lists.freedesktop.org 
Cc: Xiao, Jack ; Zhang, Hawking ; 
Sider, Graham 
Subject: [PATCH 2/2] drm/amdgpu: remove deprecated MES version vars

MES scheduler and kiq versions are stored in mes.sched_version and
mes.kiq_version, respectively, which are read from a register after
their queues are initialized. Remove mes.ucode_fw_version and
mes.data_fw_version which tried to read this versioning info from the
firmware headers (which don't contain this information).

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 2 --
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c  | 4 
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  | 4 
 3 files changed, 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index ad980f4b66e1..97c05d08a551 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -91,14 +91,12 @@ struct amdgpu_mes {
 struct amdgpu_bo*ucode_fw_obj[AMDGPU_MAX_MES_PIPES];
 uint64_t
ucode_fw_gpu_addr[AMDGPU_MAX_MES_PIPES];
 uint32_t*ucode_fw_ptr[AMDGPU_MAX_MES_PIPES];
-   uint32_tucode_fw_version[AMDGPU_MAX_MES_PIPES];
 uint64_tuc_start_addr[AMDGPU_MAX_MES_PIPES];

 /* mes ucode data */
 struct amdgpu_bo*data_fw_obj[AMDGPU_MAX_MES_PIPES];
 uint64_tdata_fw_gpu_addr[AMDGPU_MAX_MES_PIPES];
 uint32_t*data_fw_ptr[AMDGPU_MAX_MES_PIPES];
-   uint32_tdata_fw_version[AMDGPU_MAX_MES_PIPES];
 uint64_tdata_start_addr[AMDGPU_MAX_MES_PIPES];

 /* eop gpu obj */
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index 067d10073a56..1abdf8b7ab50 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -415,10 +415,6 @@ static int mes_v10_1_init_microcode(struct amdgpu_device 
*adev,

 mes_hdr = (const struct mes_firmware_header_v1_0 *)
 adev->mes.fw[pipe]->data;
-   adev->mes.ucode_fw_version[pipe] =
-   le32_to_cpu(mes_hdr->mes_ucode_version);
-   adev->mes.ucode_fw_version[pipe] =
-   le32_to_cpu(mes_hdr->mes_ucode_data_version);
 adev->mes.uc_start_addr[pipe] =
 le32_to_cpu(mes_hdr->mes_uc_start_addr_lo) |
 ((uint64_t)(le32_to_cpu(mes_hdr->mes_uc_start_addr_hi)) << 32);
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index e14f314902b1..27a330f51c7d 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -480,10 +480,6 @@ static int mes_v11_0_init_microcode(struct amdgpu_device 
*adev,

 mes_hdr = (const struct mes_firmware_header_v1_0 *)
 adev->mes.fw[pipe]->data;
-   adev->mes.ucode_fw_version[pipe] =
-   le32_to_cpu(mes_hdr->mes_ucode_version);
-   adev->mes.ucode_fw_version[pipe] =
-   le32_to_cpu(mes_hdr->mes_ucode_data_version);
 adev->mes.uc_start_addr[pipe] =
 le32_to_cpu(mes_hdr->mes_uc_start_addr_lo) |
 ((uint64_t)(le32_to_cpu(mes_hdr->mes_uc_start_addr_hi)) << 32);
--
2.25.1



Re: [PATCH v3] drm/amdgpu: dequeue mes scheduler during fini

2022-10-18 Thread Xiao, Jack
[AMD Official Use Only - General]

Reviewed-by: Jack Xiao 

Regards,
Jack

From: amd-gfx  on behalf of YuBiao Wang 

Sent: Tuesday, 18 October 2022 13:37
To: amd-gfx@lists.freedesktop.org 
Cc: Wang, YuBiao ; Grodzovsky, Andrey 
; Xiao, Jack ; Xu, Feifei 
; Chen, Horace ; Wang, Yang(Kevin) 
; Tuikov, Luben ; Deucher, 
Alexander ; Quan, Evan ; Koenig, 
Christian ; Liu, Monk ; Zhang, 
Hawking 
Subject: [PATCH v3] drm/amdgpu: dequeue mes scheduler during fini

[Why]
If mes is not dequeued during fini, mes will be left in an uncleaned state
during reload; mes then can't receive some commands, which leads to
reload failure.

[How]
Perform MES dequeue via MMIO after all the unmap jobs are done by mes
and before kiq fini.

v3: Move the dequeue operation inside kiq_hw_fini.

Signed-off-by: YuBiao Wang 
---
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 42 --
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 1174dcc88db5..b477bed40d61 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -1151,6 +1151,42 @@ static int mes_v11_0_sw_fini(void *handle)
 return 0;
 }

+static void mes_v11_0_kiq_dequeue_sched(struct amdgpu_device *adev)
+{
+   uint32_t data;
+   int i;
+
+   mutex_lock(&adev->srbm_mutex);
+   soc21_grbm_select(adev, 3, AMDGPU_MES_SCHED_PIPE, 0, 0);
+
+   /* disable the queue if it's active */
+   if (RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1) {
+   WREG32_SOC15(GC, 0, regCP_HQD_DEQUEUE_REQUEST, 1);
+   for (i = 0; i < adev->usec_timeout; i++) {
+   if (!(RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1))
+   break;
+   udelay(1);
+   }
+   }
+   data = RREG32_SOC15(GC, 0, regCP_HQD_PQ_DOORBELL_CONTROL);
+   data = REG_SET_FIELD(data, CP_HQD_PQ_DOORBELL_CONTROL,
+   DOORBELL_EN, 0);
+   data = REG_SET_FIELD(data, CP_HQD_PQ_DOORBELL_CONTROL,
+   DOORBELL_HIT, 1);
+   WREG32_SOC15(GC, 0, regCP_HQD_PQ_DOORBELL_CONTROL, data);
+
+   WREG32_SOC15(GC, 0, regCP_HQD_PQ_DOORBELL_CONTROL, 0);
+
+   WREG32_SOC15(GC, 0, regCP_HQD_PQ_WPTR_LO, 0);
+   WREG32_SOC15(GC, 0, regCP_HQD_PQ_WPTR_HI, 0);
+   WREG32_SOC15(GC, 0, regCP_HQD_PQ_RPTR, 0);
+
+   soc21_grbm_select(adev, 0, 0, 0, 0);
+   mutex_unlock(&adev->srbm_mutex);
+
+   adev->mes.ring.sched.ready = false;
+}
+
 static void mes_v11_0_kiq_setting(struct amdgpu_ring *ring)
 {
 uint32_t tmp;
@@ -1202,6 +1238,9 @@ static int mes_v11_0_kiq_hw_init(struct amdgpu_device 
*adev)

 static int mes_v11_0_kiq_hw_fini(struct amdgpu_device *adev)
 {
+   if (adev->mes.ring.sched.ready)
+   mes_v11_0_kiq_dequeue_sched(adev);
+
 mes_v11_0_enable(adev, false);
 return 0;
 }
@@ -1257,9 +1296,6 @@ static int mes_v11_0_hw_init(void *handle)

 static int mes_v11_0_hw_fini(void *handle)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-   adev->mes.ring.sched.ready = false;
 return 0;
 }

--
2.25.1



Re: [PATCH v3] drm/amdgpu: dequeue mes scheduler during fini

2022-10-17 Thread Xiao, Jack
[AMD Official Use Only - General]


>> +   /* dequeue sched pipe via kiq */
>> +   void(*kiq_dequeue_sched)(struct 
>> amdgpu_device *adev);
>> +

It's unnecessary to expose a new callback function prototype to dequeue the 
scheduler queue, since the kiq_fini or mes_hw_fini callback already provides the 
proper place in the fini sequence to do this.
In addition, the issue seems to be caused by stale CP_HQD_PQ_WPTR/CONTROL 
registers, which make the scheduler read from an unexpected address once 
CP_MES_CNTL is enabled:
a. since the dequeue operation uses MMIO rather than the KIQ, can the dequeue 
operation be moved to mes_v11_0_hw_fini?
b. if that turns out to be problematic, it's better to move the dequeue operation 
inside mes_v11_0_kiq_hw_fini, before disabling the MES controller (see the sketch 
below).
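
A minimal sketch of suggestion (b), assuming the mes_v11_0_kiq_dequeue_sched() 
MMIO helper from the patch is kept as-is; this is essentially what the v3 
revision of the patch ends up doing:

static int mes_v11_0_kiq_hw_fini(struct amdgpu_device *adev)
{
	/* dequeue the scheduler pipe via MMIO while MES is still enabled */
	if (adev->mes.ring.sched.ready)
		mes_v11_0_kiq_dequeue_sched(adev);

	/* only then disable the MES controller */
	mes_v11_0_enable(adev, false);
	return 0;
}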

Regards,
Jack


From: YuBiao Wang 
Sent: Friday, 14 October 2022 11:34
To: amd-gfx@lists.freedesktop.org 
Cc: Grodzovsky, Andrey ; Quan, Evan 
; Chen, Horace ; Tuikov, Luben 
; Koenig, Christian ; Deucher, 
Alexander ; Xiao, Jack ; Zhang, 
Hawking ; Liu, Monk ; Xu, Feifei 
; Wang, Yang(Kevin) ; Wang, YuBiao 

Subject: [PATCH v3] drm/amdgpu: dequeue mes scheduler during fini

Update: Remove redundant comments as Christian suggests.

[Why]
If mes is not dequeued during fini, mes will be left in an uncleaned state
during reload; mes then can't receive some commands, which leads to
reload failure.

[How]
Perform MES dequeue via MMIO after all the unmap jobs are done by mes
and before kiq fini.

Signed-off-by: YuBiao Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  3 ++
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  |  3 ++
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  | 41 +++--
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index ad980f4b66e1..ea8efa52503b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -130,6 +130,9 @@ struct amdgpu_mes {
 int (*kiq_hw_init)(struct amdgpu_device 
*adev);
 int (*kiq_hw_fini)(struct amdgpu_device 
*adev);

+   /* dequeue sched pipe via kiq */
+   void(*kiq_dequeue_sched)(struct 
amdgpu_device *adev);
+
 /* ip specific functions */
 const struct amdgpu_mes_funcs   *funcs;
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 257b2e4de600..7c75758f58e1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -4406,6 +4406,9 @@ static int gfx_v11_0_hw_fini(void *handle)
 if (amdgpu_gfx_disable_kcq(adev))
 DRM_ERROR("KCQ disable failed\n");

+   if (adev->mes.ring.sched.ready && adev->mes.kiq_dequeue_sched)
+   adev->mes.kiq_dequeue_sched(adev);
+
 amdgpu_mes_kiq_hw_fini(adev);
 }

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index b48684db2832..eef29646b074 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -44,6 +44,7 @@ MODULE_FIRMWARE("amdgpu/gc_11_0_3_mes1.bin");
 static int mes_v11_0_hw_fini(void *handle);
 static int mes_v11_0_kiq_hw_init(struct amdgpu_device *adev);
 static int mes_v11_0_kiq_hw_fini(struct amdgpu_device *adev);
+static void mes_v11_0_kiq_dequeue_sched(struct amdgpu_device *adev);

 #define MES_EOP_SIZE   2048

@@ -1078,6 +1079,7 @@ static int mes_v11_0_sw_init(void *handle)
 adev->mes.funcs = &mes_v11_0_funcs;
 adev->mes.kiq_hw_init = &mes_v11_0_kiq_hw_init;
 adev->mes.kiq_hw_fini = &mes_v11_0_kiq_hw_fini;
+   adev->mes.kiq_dequeue_sched = &mes_v11_0_kiq_dequeue_sched;

 r = amdgpu_mes_init(adev);
 if (r)
@@ -1151,6 +1153,42 @@ static int mes_v11_0_sw_fini(void *handle)
 return 0;
 }

+static void mes_v11_0_kiq_dequeue_sched(struct amdgpu_device *adev)
+{
+   uint32_t data;
+   int i;
+
+   mutex_lock(&adev->srbm_mutex);
+   soc21_grbm_select(adev, 3, AMDGPU_MES_SCHED_PIPE, 0, 0);
+
+   /* disable the queue if it's active */
+   if (RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1) {
+   WREG32_SOC15(GC, 0, regCP_HQD_DEQUEUE_REQUEST, 1);
+   for (i = 0; i < adev->usec_timeout; i++) {
+   if (!(RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) & 1))
+   break;
+   udelay(1);
+   }
+   }
+   data = RREG32_SOC15(GC, 0, regCP_HQD_PQ_DOORBELL_CONTROL);
+   data = REG_SET_FIELD(data, CP_HQD_PQ_DOORBELL_CONTROL,
+   DOORBELL_EN, 0);
+   d

Re: [PATCH 6/7] drm/amdgpu/mes: add mes ring test

2022-06-29 Thread Xiao, Jack
[AMD Official Use Only - General]

I will drop this single patch and send out an independent patch for this instead; a 
sketch of how the ring test could look without the removed scratch helpers follows below.

Thanks,
Jack
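
A rough sketch of what such an independent ring test might look like without the 
removed scratch helpers; the use of regSCRATCH_REG0 as a fixed test register is 
only an assumption for illustration here, not the actual follow-up patch:

static int mes_ring_test_sketch(struct amdgpu_device *adev)
{
	/* assumed fixed scratch register instead of dynamic allocation */
	uint32_t scratch = SOC15_REG_OFFSET(GC, 0, regSCRATCH_REG0);
	uint32_t tmp;
	int r;

	WREG32(scratch, 0xCAFEDEAD);

	/* read the register back through the MES queue */
	tmp = amdgpu_mes_rreg(adev, scratch);
	if (tmp != 0xCAFEDEAD)
		return -EIO;

	/* write through the MES queue, then verify with a direct MMIO read */
	r = amdgpu_mes_wreg(adev, scratch, 0xDEADBEEF);
	if (r)
		return r;

	return (RREG32(scratch) == 0xDEADBEEF) ? 0 : -EIO;
}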

From: Yu, Lang 
Sent: Wednesday, 29 June 2022 16:35
To: Xiao, Jack 
Cc: amd-gfx@lists.freedesktop.org 
Subject: Re: [PATCH 6/7] drm/amdgpu/mes: add mes ring test

On 06/29/ , Jack Xiao wrote:
> Use read/write register to test mes ring.
>
> Signed-off-by: Jack Xiao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 36 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  |  6 +
>  3 files changed, 43 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> index b6c2a5058b64..c18ea0bc00eb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> @@ -926,6 +926,42 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, 
> uint32_t reg,
>return r;
>  }
>
> +int amdgpu_mes_ring_test_ring(struct amdgpu_device *adev)
> +{
> + uint32_t scratch;
> + uint32_t tmp = 0;
> + unsigned i;
> + int r = 0;
> +
> + r = amdgpu_gfx_scratch_get(adev, &scratch);

amdgpu_gfx_scratch_get/free() have been removed in the latest amd-staging-drm-next.
See e9b8129d8ca5 (drm/amdgpu: nuke dynamic gfx scratch reg allocation).

Regards,
Lang

> + if (r) {
> + DRM_ERROR("amdgpu: mes failed to get scratch reg (%d).\n", r);
> + return r;
> + }
> +
> + WREG32(scratch, 0xCAFEDEAD);
> +
> + tmp = amdgpu_mes_rreg(adev, scratch);
> + if (tmp != 0xCAFEDEAD) {
> + DRM_ERROR("mes failed to read register\n");
> + goto error;
> + }
> +
> + r = amdgpu_mes_wreg(adev, scratch, 0xDEADBEEF);
> + if (r)
> + goto error;
> +
> + tmp = RREG32(scratch);
> + if (tmp != 0xDEADBEEF) {
> + DRM_ERROR("mes failed to write register\n");
> + r = -EIO;
> + }
> +
> +error:
> + amdgpu_gfx_scratch_free(adev, scratch);
> + return r;
> +}
> +
>  static void
>  amdgpu_mes_ring_to_queue_props(struct amdgpu_device *adev,
>   struct amdgpu_ring *ring,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> index 93b2ba817916..81610e3f3059 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> @@ -341,6 +341,7 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, 
> uint32_t reg,
>  int amdgpu_mes_reg_write_reg_wait(struct amdgpu_device *adev,
>  uint32_t reg0, uint32_t reg1,
>  uint32_t ref, uint32_t mask);
> +int amdgpu_mes_ring_test_ring(struct amdgpu_device *adev);
>
>  int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
>int queue_type, int idx,
> diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
> b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> index 2a6c7a680c62..c4d085429d26 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> @@ -1194,6 +1194,12 @@ static int mes_v11_0_hw_init(void *handle)
>goto failure;
>}
>
> + r = amdgpu_mes_ring_test_ring(adev);
> + if (r) {
> + DRM_ERROR("MES ring test failed\n");
> + goto failure;
> + }
> +
>/*
> * Disable KIQ ring usage from the driver once MES is enabled.
> * MES uses KIQ ring exclusively so driver cannot access KIQ ring
> --
> 2.35.1
>


Re: [PATCH v3 3/3] drm/amdgpu: Update mes_v11_api_def.h

2022-06-14 Thread Xiao, Jack
[AMD Official Use Only - General]

>> +   uint64_t        oversubscription_no_aggregated_en;

uint64_t is unnecessary here; a 32-bit or bool type would be better.
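
A tiny sketch of the suggested change (the other members of the real struct are 
omitted here):

struct mes_add_queue_input_sketch {
	uint32_t	oversubscription_no_aggregated_en;	/* was uint64_t */
};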

With that fixed, the patch is Reviewed-by: Jack Xiao 

Regards,
Jack

From: amd-gfx  on behalf of Graham Sider 

Sent: Monday, 13 June 2022 23:20
To: amd-gfx@lists.freedesktop.org 
Cc: Joshi, Mukul ; Kuehling, Felix 
; Sider, Graham ; Yang, Philip 

Subject: [PATCH v3 3/3] drm/amdgpu: Update mes_v11_api_def.h

Update MES API to support oversubscription without aggregated doorbell
for usermode queues.

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c   | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h   | 1 +
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 3 +++
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 ++
 drivers/gpu/drm/amd/include/mes_v11_api_def.h | 4 +++-
 5 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 2e86baa32c55..3d9a81a8fa1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -681,6 +681,7 @@ int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int 
gang_id,
 queue_input.wptr_addr = qprops->wptr_gpu_addr;
 queue_input.queue_type = qprops->queue_type;
 queue_input.paging = qprops->paging;
+   queue_input.oversubscription_no_aggregated_en = 0;

 r = adev->mes.funcs->add_hw_queue(&adev->mes, &queue_input);
 if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index aa06c8396ee0..26765a9946a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -207,6 +207,7 @@ struct mes_add_queue_input {
 uint32_tdebug_vmid;
 uint64_ttba_addr;
 uint64_ttma_addr;
+   uint64_toversubscription_no_aggregated_en;
 };

 struct mes_remove_queue_input {
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 2a9ef308e71c..95a1394d3943 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -163,6 +163,8 @@ static int mes_v11_0_add_hw_queue(struct amdgpu_mes *mes,
 mes_add_queue_pkt.gws_size = input->gws_size;
 mes_add_queue_pkt.trap_handler_addr = input->tba_addr;
 mes_add_queue_pkt.tma_addr = input->tma_addr;
+   mes_add_queue_pkt.oversubscription_no_aggregated_en =
+   input->oversubscription_no_aggregated_en;

 mes_add_queue_pkt.api_status.api_completion_fence_addr =
 mes->ring.fence_drv.gpu_addr;
@@ -341,6 +343,7 @@ static int mes_v11_0_set_hw_resources(struct amdgpu_mes 
*mes)
 mes_set_hw_res_pkt.disable_reset = 1;
 mes_set_hw_res_pkt.disable_mes_log = 1;
 mes_set_hw_res_pkt.use_different_vmid_compute = 1;
+   mes_set_hw_res_pkt.oversubscription_timer = 50;

 mes_set_hw_res_pkt.api_status.api_completion_fence_addr =
 mes->ring.fence_drv.gpu_addr;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index d8de2fbdfc7d..762bc6059387 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -235,6 +235,8 @@ static int add_queue_mes(struct device_queue_manager *dqm, 
struct queue *q,
 } else
 queue_input.wptr_addr = (uint64_t)q->properties.write_ptr;

+   queue_input.oversubscription_no_aggregated_en = 1;
+
 queue_input.paging = false;
 queue_input.tba_addr = qpd->tba_addr;
 queue_input.tma_addr = qpd->tma_addr;
diff --git a/drivers/gpu/drm/amd/include/mes_v11_api_def.h 
b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
index f9d02d7bdf77..95f0246eb045 100644
--- a/drivers/gpu/drm/amd/include/mes_v11_api_def.h
+++ b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
@@ -226,6 +226,7 @@ union MESAPI_SET_HW_RESOURCES {
 };
 uint32_tuint32_t_all;
 };
+   uint32_toversubscription_timer;
 };

 uint32_tmax_dwords_in_api[API_FRAME_SIZE_IN_DWORDS];
@@ -265,7 +266,8 @@ union MESAPI__ADD_QUEUE {
 uint32_t is_gang_suspended  : 1;
 uint32_t is_tmz_queue   : 1;
 uint32_t map_kiq_utility_queue  : 1;
-   uint32_t reserved   : 23;
+   uint32_t oversubscription_no_aggregated_en : 1;
+   uint32_t reserved   : 22;
 };
 struct MES_API_STATUS   api_status;
 uint64_ttma_addr;
--
2.25.

Re: [PATCH v3 1/3] drm/amdgpu: Fetch MES scheduler/KIQ versions

2022-06-14 Thread Xiao, Jack
[AMD Official Use Only - General]

Reviewed-by: Jack Xiao 

Regards,
Jack

From: amd-gfx  on behalf of Graham Sider 

Sent: Monday, 13 June 2022 23:19
To: amd-gfx@lists.freedesktop.org 
Cc: Joshi, Mukul ; Kuehling, Felix 
; Sider, Graham ; Yang, Philip 

Subject: [PATCH v3 1/3] drm/amdgpu: Fetch MES scheduler/KIQ versions

Store MES scheduler and MES KIQ version numbers in amdgpu_mes for GFX11.

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h |  3 +++
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c  | 12 
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 92ddee5e33db..aa06c8396ee0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -64,6 +64,9 @@ struct amdgpu_mes {

 spinlock_t  queue_id_lock;

+   uint32_tsched_version;
+   uint32_tkiq_version;
+
 uint32_ttotal_max_queue;
 uint32_tdoorbell_id_offset;
 uint32_tmax_doorbell_slices;
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index e4eb87689f7f..2a9ef308e71c 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -890,6 +890,18 @@ static int mes_v11_0_queue_init(struct amdgpu_device *adev,
 mes_v11_0_queue_init_register(ring);
 }

+   /* get MES scheduler/KIQ versions */
+   mutex_lock(&adev->srbm_mutex);
+   soc21_grbm_select(adev, 3, pipe, 0, 0);
+
+   if (pipe == AMDGPU_MES_SCHED_PIPE)
+   adev->mes.sched_version = RREG32_SOC15(GC, 0, regCP_MES_GP3_LO);
+   else if (pipe == AMDGPU_MES_KIQ_PIPE && adev->enable_mes_kiq)
+   adev->mes.kiq_version = RREG32_SOC15(GC, 0, regCP_MES_GP3_LO);
+
+   soc21_grbm_select(adev, 0, 0, 0, 0);
+   mutex_unlock(&adev->srbm_mutex);
+
 return 0;
 }

--
2.25.1



Re: [PATCH] drm/amdgpu: remove redundant enable_mes/enable_mes_kiq setting

2022-06-13 Thread Xiao, Jack
[AMD Official Use Only - General]


Reviewed-by: Jack Xiao 

Regards,
Jack

From: amd-gfx  on behalf of Yifan Zhang 

Sent: Sunday, 12 June 2022 12:31
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Zhang, Yifan 

Subject: [PATCH] drm/amdgpu: remove redundant enable_mes/enable_mes_kiq setting

enable_mes and enable_mes_kiq are set in both device init and
MES IP init. Leave the ones in MES IP init, since judging from
the GC IP version is more accurate.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 68a53528089f..2b92281dd0c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3614,14 +3614,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 if (amdgpu_mcbp)
 DRM_INFO("MCBP is enabled\n");

-   if (adev->asic_type >= CHIP_NAVI10) {
-   if (amdgpu_mes || amdgpu_mes_kiq)
-   adev->enable_mes = true;
-
-   if (amdgpu_mes_kiq)
-   adev->enable_mes_kiq = true;
-   }
-
 /*
  * Reset domain needs to be present early, before XGMI hive discovered
  * (if any) and intitialized to use reset sem and in_gpu reset flag
--
2.35.1



RE: [PATCH] drm/amd/powerplay: bug fix for memory clock request from display

2019-10-16 Thread Xiao, Jack
Reviewed-by: Jack Xiao 

-Original Message-
From: amd-gfx  On Behalf Of Kenneth Feng
Sent: Wednesday, October 16, 2019 4:58 PM
To: amd-gfx@lists.freedesktop.org
Cc: Feng, Kenneth 
Subject: [PATCH] drm/amd/powerplay: bug fix for memory clock request from 
display

In some cases, display fixes the memory clock frequency to a high value rather than 
allowing natural memory clock switching.
When we come back from S3 resume, the request from display is not reset; this 
causes the bug which makes the memory clock drop to a low value.
Then, due to the insufficient memory clock, the screen flickers.

Signed-off-by: Kenneth Feng 
---
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index e2a03f4..ee374df 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -1354,6 +1354,8 @@ static int smu_resume(void *handle)
if (smu->is_apu)
smu_set_gfx_cgpg(&adev->smu, true);
 
+   smu->disable_uclk_switch = 0;
+
mutex_unlock(&smu->mutex);
 
pr_info("SMU is resumed successfully!\n");
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: add dummy read for some GCVM status registers

2019-08-25 Thread Xiao, Jack
Reviewed-by: Jack Xiao 

-Original Message-
From: Yuan, Xiaojie  
Sent: Thursday, August 22, 2019 11:01 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Zhang, Hawking 
; Xiao, Jack ; Yuan, Xiaojie 

Subject: [PATCH] drm/amdgpu: add dummy read for some GCVM status registers

The GRBM register interface is now capable of bursting 1 cycle per register 
(wr->wr, wr->rd), much faster than the previous multicycle-per-transaction-done 
interface.  This has caused a problem where status registers requiring HW to 
update have a 1-cycle delay, due to the register update having to go through 
GRBM.

SW may operate on an incorrect value if they write a register and immediately 
check the corresponding status register.

Registers requiring HW to clear or set fields may be delayed by 1 cycle.
For example,

1. write VM_INVALIDATE_ENG0_REQ mask = 5a 2. read VM_INVALIDATE_ENG0_ACK till 
the ack is same as the request mask = 5a
a. HW will reset VM_INVALIDATE_ENG0_ACK = 0 until invalidation is complete 
3. write VM_INVALIDATE_ENG0_REQ mask = 5a 4. read VM_INVALIDATE_ENG0_ACK till 
the ack is same as the request mask = 5a
a. First read of VM_INVALIDATE_ENG0_ACK = 5a instead of 0
b. Second read of VM_INVALIDATE_ENG0_ACK = 0 because
   the remote GRBM h/w register takes one extra cycle to be cleared
c. In this case, SW will see a false ACK if they exit on first read

Affected registers (only GC variant)  |  Recommended Dummy Read
--------------------------------------+--------------------------------------
VM_INVALIDATE_ENG*_ACK|  VM_INVALIDATE_ENG*_REQ
VM_L2_STATUS  |  VM_L2_STATUS
VM_L2_PROTECTION_FAULT_STATUS |  VM_L2_PROTECTION_FAULT_STATUS
VM_L2_PROTECTION_FAULT_ADDR_HI/LO32   |  VM_L2_PROTECTION_FAULT_ADDR_HI/LO32
VM_L2_IH_LOG_BUSY |  VM_L2_IH_LOG_BUSY
MC_VM_L2_PERFCOUNTER_HI/LO|  MC_VM_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER_HI/LO  |  ATC_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER2_HI/LO |  ATC_L2_PERFCOUNTER2_HI/LO

Signed-off-by: Xiaojie Yuan 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 15 +++  
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 16 
 2 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index d83d8a6a1fc0..56f76a1f32ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -135,6 +135,14 @@ static int gmc_v10_0_process_interrupt(struct 
amdgpu_device *adev,
addr |= ((u64)entry->src_data[1] & 0xf) << 44;
 
if (!amdgpu_sriov_vf(adev)) {
+   /*
+* Issue a dummy read to wait for the status register to
+* be updated to avoid reading an incorrect value due to
+* the new fast GRBM interface.
+*/
+   if (entry->vmid_src == AMDGPU_GFXHUB_0)
+   RREG32(hub->vm_l2_pro_fault_status);
+
status = RREG32(hub->vm_l2_pro_fault_status);
WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
}
@@ -228,6 +236,13 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device 
*adev, uint32_t vmid,
 
WREG32_NO_KIQ(hub->vm_inv_eng0_req + eng, tmp);
 
+   /*
+* Issue a dummy read to wait for the ACK register to be cleared
+* to avoid a false ACK due to the new fast GRBM interface.
+*/
+   if (vmhub == AMDGPU_GFXHUB_0)
+   RREG32_NO_KIQ(hub->vm_inv_eng0_req + eng);
+
/* Wait for ACK with a delay.*/
for (i = 0; i < adev->usec_timeout; i++) {
tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_ack + eng); diff --git 
a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 6de17267fc35..17700606f54b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -352,6 +352,14 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device 
*adev,
 
/* If it's the first fault for this address, process it normally */
if (!amdgpu_sriov_vf(adev)) {
+   /*
+* Issue a dummy read to wait for the status register to
+* be updated to avoid reading an incorrect value due to
+* the new fast GRBM interface.
+*/
+   if (entry->vmid_src == AMDGPU_GFXHUB_0)
+   RREG32(hub->vm_l2_pro_fault_status);
+
status = RREG32(hub->vm_l2_pro_fault_status);
WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
}
@@ -480,6 +488,14 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
*adev, uint32_t vmid,
 
spin_lock(&adev->gmc.invalidate_lock);
WREG32_NO_KIQ(hub->vm_inv_eng0_req + eng, tmp);
+
+   /*
+* Issue a dummy read to wait for the ACK register to be c

[PATCH] drm/amd/powerplay: increase waiting time for smu response

2019-07-02 Thread Xiao, Jack
We observed that some SMU commands take more time to execute,
so increase the waiting time for the response.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c 
b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index bc39690..88d3127 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -64,9 +64,9 @@ static int smu_v11_0_read_arg(struct smu_context *smu, 
uint32_t *arg)
 static int smu_v11_0_wait_for_response(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
-   uint32_t cur_value, i;
+   uint32_t cur_value, i, timeout = adev->usec_timeout * 10;
 
-   for (i = 0; i < adev->usec_timeout; i++) {
+   for (i = 0; i < timeout; i++) {
cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
if ((cur_value & MP1_C2PMSG_90__CONTENT_MASK) != 0)
break;
@@ -74,7 +74,7 @@ static int smu_v11_0_wait_for_response(struct smu_context 
*smu)
}
 
/* timeout means wrong logic */
-   if (i == adev->usec_timeout)
+   if (i == timeout)
return -ETIME;
 
return RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90) == 0x1 ? 0 : -EIO;
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amdkfd: remove an unused variable

2019-07-01 Thread Xiao, Jack
Just for cleanup.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 24db82b..3322a44 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -487,7 +487,6 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
struct pci_dev *pdev, const struct kfd2kgd_calls *f2g)
 {
struct kfd_dev *kfd;
-   int ret;
const struct kfd_device_info *device_info =
lookup_device_info(pdev->device);
 
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 092/459] drm/amdgpu: add ib preemption status in amdgpu_job

2019-06-18 Thread Xiao, Jack
I sent out a separate patch to add job->preamble_status back.

Regards,
Jack

-Original Message-
From: Koenig, Christian  
Sent: Tuesday, June 18, 2019 5:03 PM
To: Xiao, Jack ; Alex Deucher ; 
amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Zhang, Hawking 

Subject: Re: [PATCH 092/459] drm/amdgpu: add ib preemption status in amdgpu_job

Yeah thought so, cause that would break immediately and won't work at all.

It must be something like a merge or rebase artifact.

What's the best way to fix that up? In other words how should it look like?

Christian.

Am 18.06.19 um 10:32 schrieb Xiao, Jack:
> In original patch, there is no deletion on job->preamble_status.
>
> Regards,
> Jack
>
> -Original Message-
> From: Christian König 
> Sent: Tuesday, June 18, 2019 4:04 PM
> To: Alex Deucher ; 
> amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Xiao, Jack 
> ; Zhang, Hawking 
> Subject: Re: [PATCH 092/459] drm/amdgpu: add ib preemption status in 
> amdgpu_job
>
> Am 17.06.19 um 21:10 schrieb Alex Deucher:
>> From: Jack Xiao 
>>
>> Add ib preemption status in amdgpu_job, so that ring level function 
>> can detect preemption and program for resuming it.
>>
>> Acked-by: Alex Deucher 
>> Reviewed-by: Hawking Zhang 
>> Signed-off-by: Jack Xiao 
>> Signed-off-by: Alex Deucher 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 2 +-
>>drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++
>>2 files changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> index e15531bfcd47..90cc0841b564 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> @@ -208,7 +208,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
>> unsigned num_ibs,
>>
>>  skip_preamble = ring->current_ctx == fence_ctx;
>>  if (job && ring->funcs->emit_cntxcntl) {
>> -status |= job->preamble_status;
>> +status |= job->preemption_status;
> That actually looks completely incorrect to me since it breaks the old status 
> handling.
>
> Christian.
>
>>  amdgpu_ring_emit_cntxcntl(ring, status);
>>  }
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>> index e1b46a6703de..51e62504c279 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
>> @@ -29,6 +29,8 @@
>>#define AMDGPU_PREAMBLE_IB_PRESENT_FIRST(1 << 1)
>>/* bit set means context switch occured */
>>#define AMDGPU_HAVE_CTX_SWITCH  (1 << 2)
>> +/* bit set means IB is preempted */
>> +#define AMDGPU_IB_PREEMPTED (1 << 3)
>>
>>#define to_amdgpu_job(sched_job)  \
>>  container_of((sched_job), struct amdgpu_job, base) @@ -45,6 
>> +47,7 @@ struct amdgpu_job {
>>  struct amdgpu_ib*ibs;
>>  struct dma_fence*fence; /* the hw fence */
>>  uint32_tpreamble_status;
>> +uint32_tpreemption_status;
>>  uint32_tnum_ibs;
>>  void*owner;
>>  boolvm_needs_flush;

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amdgpu: add job->preamble_status back to status value

2019-06-18 Thread Xiao, Jack
job->preamble_status was accidentally dropped from the combined status value
that is passed to some ring-level functions.
This patch adds it back.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 90cc084..6882eeb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -208,6 +208,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
 
skip_preamble = ring->current_ctx == fence_ctx;
if (job && ring->funcs->emit_cntxcntl) {
+   status |= job->preamble_status;
status |= job->preemption_status;
amdgpu_ring_emit_cntxcntl(ring, status);
}
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 092/459] drm/amdgpu: add ib preemption status in amdgpu_job

2019-06-18 Thread Xiao, Jack
In the original patch, there is no deletion of job->preamble_status. 

Regards,
Jack

-Original Message-
From: Christian König  
Sent: Tuesday, June 18, 2019 4:04 PM
To: Alex Deucher ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Xiao, Jack 
; Zhang, Hawking 
Subject: Re: [PATCH 092/459] drm/amdgpu: add ib preemption status in amdgpu_job

Am 17.06.19 um 21:10 schrieb Alex Deucher:
> From: Jack Xiao 
>
> Add ib preemption status in amdgpu_job, so that ring level function 
> can detect preemption and program for resuming it.
>
> Acked-by: Alex Deucher 
> Reviewed-by: Hawking Zhang 
> Signed-off-by: Jack Xiao 
> Signed-off-by: Alex Deucher 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++
>   2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index e15531bfcd47..90cc0841b564 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -208,7 +208,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
> unsigned num_ibs,
>   
>   skip_preamble = ring->current_ctx == fence_ctx;
>   if (job && ring->funcs->emit_cntxcntl) {
> - status |= job->preamble_status;
> + status |= job->preemption_status;

That actually looks completely incorrect to me since it breaks the old status 
handling.

Christian.

>   amdgpu_ring_emit_cntxcntl(ring, status);
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> index e1b46a6703de..51e62504c279 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> @@ -29,6 +29,8 @@
>   #define AMDGPU_PREAMBLE_IB_PRESENT_FIRST(1 << 1)
>   /* bit set means context switch occured */
>   #define AMDGPU_HAVE_CTX_SWITCH  (1 << 2)
> +/* bit set means IB is preempted */
> +#define AMDGPU_IB_PREEMPTED (1 << 3)
>   
>   #define to_amdgpu_job(sched_job)\
>   container_of((sched_job), struct amdgpu_job, base) @@ -45,6 
> +47,7 
> @@ struct amdgpu_job {
>   struct amdgpu_ib*ibs;
>   struct dma_fence*fence; /* the hw fence */
>   uint32_tpreamble_status;
> + uint32_tpreemption_status;
>   uint32_tnum_ibs;
>   void*owner;
>   boolvm_needs_flush;

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amd/powerplay: add smu message mutex

2019-06-03 Thread Xiao, Jack
Dropped it because a modification was missing.
A new patch was sent out.

-Original Message-
From: Xiao, Jack  
Sent: Monday, June 03, 2019 2:12 PM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
; Zhang, Hawking 
Cc: Xiao, Jack 
Subject: [PATCH] drm/amd/powerplay: add smu message mutex

Add an SMU message mutex to prevent race conditions.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 1 +
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c  | 7 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
index 3eb1de9..735233e 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
@@ -374,6 +374,7 @@ struct smu_context
const struct smu_funcs  *funcs;
const struct pptable_funcs  *ppt_funcs;
struct mutexmutex;
+   struct mutexmsg_mutex;
uint64_t pool_size;
 
struct smu_table_contextsmu_table;
diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c 
b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index d2eeb62..de737a0 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -104,6 +104,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
if (index < 0)
return index;
 
+   mutex_lock(&smu->msg_mutex);
+
smu_v11_0_wait_for_response(smu);
 
WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90, 0); @@ -111,11 +113,11 @@ 
static int smu_v11_0_send_msg(struct smu_context *smu, uint16_t msg)
smu_v11_0_send_msg_without_waiting(smu, (uint16_t)index);
 
ret = smu_v11_0_wait_for_response(smu);
-
if (ret)
pr_err("Failed to send message 0x%x, response 0x%x\n", index,
   ret);
 
+   mutex_unlock(&smu->msg_mutex);
return ret;
 
 }
@@ -132,6 +134,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
if (index < 0)
return index;
 
+   mutex_lock(&smu->msg_mutex);
+
ret = smu_v11_0_wait_for_response(smu);
if (ret)
pr_err("Failed to send message 0x%x, response 0x%x, param 
0x%x\n", @@ -148,6 +152,7 @@ static int smu_v11_0_send_msg(struct smu_context 
*smu, uint16_t msg)
pr_err("Failed to send message 0x%x, response 0x%x param 
0x%x\n",
   index, ret, param);
 
+   mutex_unlock(&smu->msg_mutex);
return ret;
 }
 
--
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amd/powerplay: add smu message mutex

2019-06-03 Thread Xiao, Jack
Add an SMU message mutex to prevent race conditions.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 1 +
 drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 1 +
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c  | 7 ++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index 3026c7e..db2bbec 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -350,6 +350,7 @@ static int smu_early_init(void *handle)
smu->adev = adev;
smu->pm_enabled = !!amdgpu_dpm;
mutex_init(&smu->mutex);
+   mutex_init(&smu->msg_mutex);
 
return smu_set_funcs(adev);
 }
diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
index 3eb1de9..735233e 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
@@ -374,6 +374,7 @@ struct smu_context
const struct smu_funcs  *funcs;
const struct pptable_funcs  *ppt_funcs;
struct mutexmutex;
+   struct mutexmsg_mutex;
uint64_t pool_size;
 
struct smu_table_contextsmu_table;
diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c 
b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index d2eeb62..de737a0 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -104,6 +104,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
if (index < 0)
return index;
 
+   mutex_lock(&smu->msg_mutex);
+
smu_v11_0_wait_for_response(smu);
 
WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90, 0);
@@ -111,11 +113,11 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
smu_v11_0_send_msg_without_waiting(smu, (uint16_t)index);
 
ret = smu_v11_0_wait_for_response(smu);
-
if (ret)
pr_err("Failed to send message 0x%x, response 0x%x\n", index,
   ret);
 
+   mutex_unlock(&smu->msg_mutex);
return ret;
 
 }
@@ -132,6 +134,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
if (index < 0)
return index;
 
+   mutex_lock(&smu->msg_mutex);
+
ret = smu_v11_0_wait_for_response(smu);
if (ret)
pr_err("Failed to send message 0x%x, response 0x%x, param 
0x%x\n",
@@ -148,6 +152,7 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
pr_err("Failed to send message 0x%x, response 0x%x param 
0x%x\n",
   index, ret, param);
 
+   mutex_unlock(&smu->msg_mutex);
return ret;
 }
 
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amd/powerplay: add smu message mutex

2019-06-02 Thread Xiao, Jack
Add an SMU message mutex to prevent race conditions.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 1 +
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c  | 7 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
index 3eb1de9..735233e 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
@@ -374,6 +374,7 @@ struct smu_context
const struct smu_funcs  *funcs;
const struct pptable_funcs  *ppt_funcs;
struct mutexmutex;
+   struct mutexmsg_mutex;
uint64_t pool_size;
 
struct smu_table_contextsmu_table;
diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c 
b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index d2eeb62..de737a0 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -104,6 +104,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
if (index < 0)
return index;
 
+   mutex_lock(&smu->msg_mutex);
+
smu_v11_0_wait_for_response(smu);
 
WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90, 0);
@@ -111,11 +113,11 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
smu_v11_0_send_msg_without_waiting(smu, (uint16_t)index);
 
ret = smu_v11_0_wait_for_response(smu);
-
if (ret)
pr_err("Failed to send message 0x%x, response 0x%x\n", index,
   ret);
 
+   mutex_unlock(&smu->msg_mutex);
return ret;
 
 }
@@ -132,6 +134,8 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
if (index < 0)
return index;
 
+   mutex_lock(&smu->msg_mutex);
+
ret = smu_v11_0_wait_for_response(smu);
if (ret)
pr_err("Failed to send message 0x%x, response 0x%x, param 
0x%x\n",
@@ -148,6 +152,7 @@ static int smu_v11_0_send_msg(struct smu_context *smu, 
uint16_t msg)
pr_err("Failed to send message 0x%x, response 0x%x param 
0x%x\n",
   index, ret, param);
 
+   mutex_unlock(&smu->msg_mutex);
return ret;
 }
 
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 1/3] drm/amdgpu: add field indicating if has PCIE atomics support

2019-05-28 Thread Xiao, Jack
The new field in the amdgpu device is used to record whether the
system has PCIE atomics support. The field can be exposed to
UMD or kfd to indicate whether PCIE atomics are supported.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d355e9a..b0a01e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -767,6 +767,7 @@ struct amdgpu_device {
struct mutexgrbm_idx_mutex;
struct dev_pm_domainvga_pm_domain;
boolhave_disp_power_ref;
+   boolhave_atomics_support;
 
/* BIOS */
boolis_atom_fw;
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 2/3] drm/amdgpu: enable PCIE atomics ops support

2019-05-28 Thread Xiao, Jack
GPU atomics operations depend on PCIE atomics support.
Always enable PCIE atomics ops support in case it
hasn't been enabled.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bdd1fe73..55772eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2562,6 +2562,17 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (adev->rio_mem == NULL)
DRM_INFO("PCI I/O BAR is not found.\n");
 
+   /* enable PCIE atomic ops */
+   r = pci_enable_atomic_ops_to_root(adev->pdev,
+   PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+   PCI_EXP_DEVCAP2_ATOMIC_COMP64);
+   if (r) {
+   adev->have_atomics_support = false;
+   DRM_INFO("PCIE atomic ops is not supported\n");
+   } else {
+   adev->have_atomics_support = true;
+   }
+
amdgpu_device_get_pcie_info(adev);
 
/* early init functions */
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 0/3] always enable PCIE atomics

2019-05-28 Thread Xiao, Jack
GPU atomics operations depend on PCIE atomics support.
Always enable PCIE atomics ops support in case it
hasn't been enabled.

Add a field to the amdgpu device that can be exposed to UMD or
kfd, indicating whether the system has PCIE atomics support.

Jack Xiao (3):
  drm/amdgpu: add field indicating if has PCIE atomics support
  drm/amdgpu: enable PCIE atomics ops support
  drm/amdkfd: remove duplicated PCIE atomics request

 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 10 --
 5 files changed, 24 insertions(+), 6 deletions(-)

-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 3/3] drm/amdkfd: remove duplicated PCIE atomics request

2019-05-28 Thread Xiao, Jack
Since amdgpu now always requests PCIE atomics, kfd doesn't
need a duplicated PCIE atomics enablement. Referring to the amdgpu
request result is enough.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 10 --
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 98326e3b..ddd6c52 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -620,6 +620,13 @@ bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, 
u32 vmid)
return false;
 }
 
+bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
+
+   return adev->have_atomics_support;
+}
+
 #ifndef CONFIG_HSA_AMD
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index f57f297..8d135c82 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -135,6 +135,7 @@ int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum 
kgd_engine_type engine,
uint32_t vmid, uint64_t gpu_addr,
uint32_t *ib_cmd, uint32_t ib_len);
 void amdgpu_amdkfd_set_compute_idle(struct kgd_dev *kgd, bool idle);
+bool amdgpu_amdkfd_have_atomics_support(struct kgd_dev *kgd);
 
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void);
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 7b4ea24..76a1599 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -481,17 +481,15 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 * 32 and 64-bit requests are possible and must be
 * supported.
 */
-   ret = pci_enable_atomic_ops_to_root(pdev,
-   PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
-   PCI_EXP_DEVCAP2_ATOMIC_COMP64);
-   if (device_info->needs_pci_atomics && ret < 0) {
+   kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd);
+   if (device_info->needs_pci_atomics &&
+   !kfd->pci_atomic_requested) {
dev_info(kfd_device,
 "skipped device %x:%x, PCI rejects atomics\n",
 pdev->vendor, pdev->device);
kfree(kfd);
return NULL;
-   } else if (!ret)
-   kfd->pci_atomic_requested = true;
+   }
 
kfd->kgd = kgd;
kfd->device_info = device_info;
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amdgpu: enable PCIE atomics ops support

2019-05-27 Thread Xiao, Jack
GPU atomics operations depend on PCIE atomics support.
Always enable PCIE atomics ops support in case it
hasn't been enabled.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bdd1fe73..a2c6064 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2562,6 +2562,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (adev->rio_mem == NULL)
DRM_INFO("PCI I/O BAR is not found.\n");
 
+   /* enable PCIE atomic ops */
+   r = pci_enable_atomic_ops_to_root(adev->pdev,
+   PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+   PCI_EXP_DEVCAP2_ATOMIC_COMP64);
+   if (r)
+   DRM_INFO("PCIE atomic ops is not supported\n");
+
amdgpu_device_get_pcie_info(adev);
 
/* early init functions */
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amdgpu: enable PCIE atomics ops support

2019-05-23 Thread Xiao, Jack
GPU atomics operations depend on PCIE atomics support.
Always enable PCIE atomics ops support in case it
hasn't been enabled.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bdd1fe73..a2c6064 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2562,6 +2562,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (adev->rio_mem == NULL)
DRM_INFO("PCI I/O BAR is not found.\n");
 
+   /* enable PCIE atomic ops */
+   r = pci_enable_atomic_ops_to_root(adev->pdev,
+   PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+   PCI_EXP_DEVCAP2_ATOMIC_COMP64);
+   if (r)
+   DRM_INFO("PCIE atomic ops is not supported\n");
+
amdgpu_device_get_pcie_info(adev);
 
/* early init functions */
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH v2] drm/amdgpu: add flags to emit_ib interface v2

2019-01-20 Thread Xiao, Jack
Replace the last bool-type parameter with a general flags parameter,
so that the last parameter can carry more information.

v2: drop setting need_ctx_switch = false

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c   | 10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h  |  2 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c|  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c|  6 +++---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c|  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c|  4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/si_dma.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c|  4 ++--
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c|  4 ++--
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c|  6 +++---
 20 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index c48207b3..0b8ef2d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -202,12 +202,12 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
amdgpu_asic_flush_hdp(adev, ring);
}
 
+   if (need_ctx_switch)
+   status |= AMDGPU_HAVE_CTX_SWITCH;
+
skip_preamble = ring->current_ctx == fence_ctx;
if (job && ring->funcs->emit_cntxcntl) {
-   if (need_ctx_switch)
-   status |= AMDGPU_HAVE_CTX_SWITCH;
status |= job->preamble_status;
-
amdgpu_ring_emit_cntxcntl(ring, status);
}
 
@@ -221,8 +221,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
!amdgpu_sriov_vf(adev)) /* for SRIOV preemption, 
Preamble CE ib must be inserted anyway */
continue;
 
-   amdgpu_ring_emit_ib(ring, job, ib, need_ctx_switch);
-   need_ctx_switch = false;
+   amdgpu_ring_emit_ib(ring, job, ib, status);
+   status &= ~AMDGPU_HAVE_CTX_SWITCH;
}
 
if (ring->funcs->emit_tmz)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index d87e828..d7fae26 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -131,7 +131,7 @@ struct amdgpu_ring_funcs {
void (*emit_ib)(struct amdgpu_ring *ring,
struct amdgpu_job *job,
struct amdgpu_ib *ib,
-   bool ctx_switch);
+   uint32_t flags);
void (*emit_fence)(struct amdgpu_ring *ring, uint64_t addr,
   uint64_t seq, unsigned flags);
void (*emit_pipeline_sync)(struct amdgpu_ring *ring);
@@ -229,7 +229,7 @@ struct amdgpu_ring {
 #define amdgpu_ring_get_rptr(r) (r)->funcs->get_rptr((r))
 #define amdgpu_ring_get_wptr(r) (r)->funcs->get_wptr((r))
 #define amdgpu_ring_set_wptr(r) (r)->funcs->set_wptr((r))
-#define amdgpu_ring_emit_ib(r, job, ib, c) ((r)->funcs->emit_ib((r), (job), 
(ib), (c)))
+#define amdgpu_ring_emit_ib(r, job, ib, flags) ((r)->funcs->emit_ib((r), 
(job), (ib), (flags)))
 #define amdgpu_ring_emit_pipeline_sync(r) (r)->funcs->emit_pipeline_sync((r))
 #define amdgpu_ring_emit_vm_flush(r, vmid, addr) 
(r)->funcs->emit_vm_flush((r), (vmid), (addr))
 #define amdgpu_ring_emit_fence(r, addr, seq, flags) 
(r)->funcs->emit_fence((r), (addr), (seq), (flags))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 98a1b2c..c021b11 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -1035,7 +1035,7 @@ int amdgpu_vce_ring_parse_cs_vm(struct amdgpu_cs_parser 
*p, uint32_t ib_idx)
 void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring,
struct amdgpu_job *job,
struct amdgpu_ib *ib,
-   bool ctx_switch)
+   uint32_t flags)
 {
amdgpu_ring_write(ring, VCE_CMD_IB);
amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr));
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
index 5029365..30ea54d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
@@ -66,7 +66,7 @@ int amdgpu_vce_get_destroy_msg(struct amdgpu_ring *ring, 
uint32_t handle,
 int amdgpu_vce_ring_parse_cs(struct amdgpu_cs_parse

[PATCH] drm/amdgpu: add flags to emit_ib interface

2019-01-18 Thread Xiao, Jack
Replace the last bool-type parameter with a general flags parameter,
so that the last parameter can carry more information.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c   | 7 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h  | 2 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c| 2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c| 6 +++---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/si_dma.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c| 2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c| 2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c| 4 ++--
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c| 2 +-
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c| 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c| 6 +++---
 20 files changed, 35 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index c48207b3..1b7edd7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -221,7 +221,12 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
!amdgpu_sriov_vf(adev)) /* for SRIOV preemption, 
Preamble CE ib must be inserted anyway */
continue;
 
-   amdgpu_ring_emit_ib(ring, job, ib, need_ctx_switch);
+   if (need_ctx_switch)
+   status |= AMDGPU_HAVE_CTX_SWITCH;
+   else
+   status &= ~AMDGPU_HAVE_CTX_SWITCH;
+
+   amdgpu_ring_emit_ib(ring, job, ib, status);
need_ctx_switch = false;
}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index d87e828..d7fae26 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -131,7 +131,7 @@ struct amdgpu_ring_funcs {
void (*emit_ib)(struct amdgpu_ring *ring,
struct amdgpu_job *job,
struct amdgpu_ib *ib,
-   bool ctx_switch);
+   uint32_t flags);
void (*emit_fence)(struct amdgpu_ring *ring, uint64_t addr,
   uint64_t seq, unsigned flags);
void (*emit_pipeline_sync)(struct amdgpu_ring *ring);
@@ -229,7 +229,7 @@ struct amdgpu_ring {
 #define amdgpu_ring_get_rptr(r) (r)->funcs->get_rptr((r))
 #define amdgpu_ring_get_wptr(r) (r)->funcs->get_wptr((r))
 #define amdgpu_ring_set_wptr(r) (r)->funcs->set_wptr((r))
-#define amdgpu_ring_emit_ib(r, job, ib, c) ((r)->funcs->emit_ib((r), (job), 
(ib), (c)))
+#define amdgpu_ring_emit_ib(r, job, ib, flags) ((r)->funcs->emit_ib((r), 
(job), (ib), (flags)))
 #define amdgpu_ring_emit_pipeline_sync(r) (r)->funcs->emit_pipeline_sync((r))
 #define amdgpu_ring_emit_vm_flush(r, vmid, addr) 
(r)->funcs->emit_vm_flush((r), (vmid), (addr))
 #define amdgpu_ring_emit_fence(r, addr, seq, flags) 
(r)->funcs->emit_fence((r), (addr), (seq), (flags))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 98a1b2c..c021b11 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -1035,7 +1035,7 @@ int amdgpu_vce_ring_parse_cs_vm(struct amdgpu_cs_parser 
*p, uint32_t ib_idx)
 void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring,
struct amdgpu_job *job,
struct amdgpu_ib *ib,
-   bool ctx_switch)
+   uint32_t flags)
 {
amdgpu_ring_write(ring, VCE_CMD_IB);
amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr));
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
index 5029365..30ea54d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
@@ -66,7 +66,7 @@ int amdgpu_vce_get_destroy_msg(struct amdgpu_ring *ring, 
uint32_t handle,
 int amdgpu_vce_ring_parse_cs(struct amdgpu_cs_parser *p, uint32_t ib_idx);
 int amdgpu_vce_ring_parse_cs_vm(struct amdgpu_cs_parser *p, uint32_t ib_idx);
 void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring, struct amdgpu_job *job,
-   struct amdgpu_ib *ib, bool ctx_switch);
+   struct amdgpu_ib *ib, uint32_t flags);
 void amdgpu_vce_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 seq,
unsigned flags);
 int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring);
diff --git a/drivers/gp

RE: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

2018-08-29 Thread Xiao, Jack
Can you explain what benefits can be gained from enabling the AGP aperture?
Otherwise, it only increases our maintenance workload.

Regards,
Jack

-Original Message-
From: Christian König  
Sent: Wednesday, August 29, 2018 4:47 PM
To: Kuehling, Felix ; Koenig, Christian 
; Xiao, Jack ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

Completely agree with Felix here.

It makes system memory access slightly simpler, but I would say that
accidentally corrupting the GART table and accidentally corrupting the AGP
aperture are equally likely.

Regards,
Christian.

Am 28.08.2018 um 20:12 schrieb Felix Kuehling:
> The GPU always has access to the entire guest physical address space.
> You can just program arbitrary addresses into page table entries and 
> set the system bit. That's how GART and GPUVM mappings work. They will 
> not go through the AGP aperture. And there is no mechanism (other than 
> an
> IOMMU) to protect system memory from GPU access.
>
> Regards,
>    Felix
>
>
> On 2018-08-28 07:41 AM, Christian König wrote:
>> Well that is indeed true, but we still have IOMMU between the GPU and 
>> the system memory.
>>
>> So we should still catch issues when something goes terrible wrong.
>>
>> Additional to that only the system domain, e.g. kernel copies, page 
>> table updates etc are allowed to use it.
>>
>> Regards,
>> Christian.
>>
>> Am 28.08.2018 um 09:06 schrieb Xiao, Jack:
>>> I mean it has risk to make GPU allowed to access to most system 
>>> memory without explicit claiming, it's easier to mask problem.
>>>
>>> Regards,
>>> Jack
>>>
>>> -Original Message-
>>> From: Koenig, Christian
>>> Sent: Tuesday, August 28, 2018 2:46 PM
>>> To: Xiao, Jack ; Kuehling, Felix 
>>> ; Christian König 
>>> ; amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address 
>>> space on GMC9
>>>
>>>> This series patches seems to make AGP aperture allowed to access 
>>>> any system memory (16GB) bypass GPU VM protection.
>>> The system aperture should only be active in the system domain, or 
>>> otherwise applications would have access to local memory as well.
>>>
>>> There are some bits in the VM registers to enable/disable that, but 
>>> when we would have that setting incorrect we would see quite a bunch 
>>> of other problems.
>>>
>>> Might still be a good idea to double check if all the bits are setup 
>>> correctly.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 28.08.2018 um 07:31 schrieb Xiao, Jack:
>>>> This series patches seems to make AGP aperture allowed to access 
>>>> any system memory (16GB) bypass GPU VM protection.
>>>> If someone made a wrong logic requesting an illegal address which 
>>>> occasionally was located inside AGP aperture, but without any VM 
>>>> protection, the illegal address would be finally translated into a 
>>>> system memory address; if GPU read/wrote such system memory 
>>>> address, the system memory address might belong to kernel or any 
>>>> user application, the r/w operation would result in any unpredictable 
>>>> issue.
>>>> The most important is that such kind of issue is so hard to be 
>>>> addressed.
>>>> Is it worth doing this, but exposing risk?
>>>>
>>>> Regards,
>>>> Jack
>>>>
>>>> -Original Message-
>>>> From: amd-gfx  On Behalf Of 
>>>> Felix Kuehling
>>>> Sent: Tuesday, August 28, 2018 3:03 AM
>>>> To: Christian König ;
>>>> amd-gfx@lists.freedesktop.org; Koenig, Christian 
>>>> 
>>>> Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address 
>>>> space on GMC9
>>>>
>>>> The point of this series seems to be to allow access to small 
>>>> system memory BOs (one page) without a GART mapping. I'm guessing 
>>>> that reduces pressure on the GART and removes the need for HDP and 
>>>> TLB flushes. Why does Patch 10 only enable that on GFXv9? Is there 
>>>> less benefit on older chips?
>>>>
>>>> Is this related to your recent changes to allow page tables in 
>>>> system memory?
>>>>
>>>> See my replies to patch 6 and 8. Other than that, the series is
>>>> Acked-by: Felix Kuehling 
>>&

RE: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

2018-08-28 Thread Xiao, Jack
> Well that is indeed true, but we still have IOMMU between the GPU and the 
> system memory.
Some platforms have no IOMMU, or the IOMMU isn't enabled by default.
If such an issue were reported by a customer, it would be very hard to narrow down.

> Additional to that only the system domain, e.g. kernel copies, page table 
> updates etc are allowed to use it.
What is the motivation for this patch series?
I guess it is a performance improvement, right? But AGP aperture memory is
translated as mtype=UC, the same as GART (mtype=UC).
The extra overhead of GART is the GPUVM page-table walk, but I guess that
overhead is small.
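As a rough illustration of that last point, here is a toy user-space model
(not the real GMC9 hardware path; window addresses and sizes are made up): an
aperture hit is a range check plus an offset, while a GART hit needs one extra
page-table lookup per translation.

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT	12
#define GART_PAGES	16

/* toy translation state: one aperture window plus a flat one-level table */
static const uint64_t agp_start = 0x100000000ull;	/* GPU address of window */
static const uint64_t agp_size  = 0x40000000ull;	/* 1 GB window */
static const uint64_t agp_base  = 0x200000000ull;	/* system address it maps to */
static uint64_t gart_table[GART_PAGES];			/* PTE = page-aligned system address */

static int translate(uint64_t gpu_addr, uint64_t *sys_addr)
{
	if (gpu_addr >= agp_start && gpu_addr < agp_start + agp_size) {
		/* aperture hit: base + offset, no table walk */
		*sys_addr = agp_base + (gpu_addr - agp_start);
		return 0;
	}
	if ((gpu_addr >> PAGE_SHIFT) < GART_PAGES) {
		/* GART hit: costs one extra memory read for the PTE */
		uint64_t pte = gart_table[gpu_addr >> PAGE_SHIFT];

		*sys_addr = pte + (gpu_addr & ((1ull << PAGE_SHIFT) - 1));
		return 0;
	}
	return -1;
}

int main(void)
{
	uint64_t out;

	gart_table[3] = 0x7654321000ull;
	if (!translate(0x100000040ull, &out))
		printf("aperture: 0x%llx\n", (unsigned long long)out);
	if (!translate(0x3040ull, &out))
		printf("gart:     0x%llx\n", (unsigned long long)out);
	return 0;
}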

Regards,
Jack

-Original Message-
From: Christian König  
Sent: Tuesday, August 28, 2018 7:42 PM
To: Xiao, Jack ; Koenig, Christian 
; Kuehling, Felix ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

Well that is indeed true, but we still have IOMMU between the GPU and the 
system memory.

So we should still catch issues when something goes terrible wrong.

Additional to that only the system domain, e.g. kernel copies, page table 
updates etc are allowed to use it.

Regards,
Christian.

Am 28.08.2018 um 09:06 schrieb Xiao, Jack:
> I mean it has risk to make GPU allowed to access to most system memory 
> without explicit claiming, it's easier to mask problem.
>
> Regards,
> Jack
>
> -Original Message-
> From: Koenig, Christian
> Sent: Tuesday, August 28, 2018 2:46 PM
> To: Xiao, Jack ; Kuehling, Felix 
> ; Christian König 
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address 
> space on GMC9
>
>> This series patches seems to make AGP aperture allowed to access any system 
>> memory (16GB) bypass GPU VM protection.
> The system aperture should only be active in the system domain, or otherwise 
> applications would have access to local memory as well.
>
> There are some bits in the VM registers to enable/disable that, but when we 
> would have that setting incorrect we would see quite a bunch of other 
> problems.
>
> Might still be a good idea to double check if all the bits are setup 
> correctly.
>
> Regards,
> Christian.
>
> Am 28.08.2018 um 07:31 schrieb Xiao, Jack:
>> This series patches seems to make AGP aperture allowed to access any system 
>> memory (16GB) bypass GPU VM protection.
>> If someone made a wrong logic requesting an illegal address which 
>> occasionally was located inside AGP aperture, but without any VM 
>> protection, the illegal address would be finally translated into a 
>> system memory address; if GPU read/wrote such system memory address, the 
>> system memory address might belong to kernel or any user application, the 
>> r/w operation would result in any unpredictable issue.
>> The most important is that such kind of issue is so hard to be addressed.
>> Is it worth doing this, but exposing risk?
>>
>> Regards,
>> Jack
>>
>> -Original Message-
>> From: amd-gfx  On Behalf Of 
>> Felix Kuehling
>> Sent: Tuesday, August 28, 2018 3:03 AM
>> To: Christian König ;
>> amd-gfx@lists.freedesktop.org; Koenig, Christian 
>> 
>> Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address 
>> space on GMC9
>>
>> The point of this series seems to be to allow access to small system memory 
>> BOs (one page) without a GART mapping. I'm guessing that reduces pressure on 
>> the GART and removes the need for HDP and TLB flushes. Why does Patch 10 
>> only enable that on GFXv9? Is there less benefit on older chips?
>>
>> Is this related to your recent changes to allow page tables in system memory?
>>
>> See my replies to patch 6 and 8. Other than that, the series is
>> Acked-by: Felix Kuehling 
>>
>> Regards,
>>     Felix
>>
>>
>> On 2018-08-27 12:53 PM, Christian König wrote:
>>> Only use the lower address space on GMC9 for the system domain.
>>> Otherwise we would need to sign extend GMC addresses.
>>>
>>> Signed-off-by: Christian König 
>>> ---
>>>drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++
>>>1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> index e44b5191735d..d982956c8329 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> @@ -938,11 +938,10 @@ static int gmc_v9_0_sw_init(void *handle)
>>> if (r)
>>> return r;
>>>
>>> -   /* Set the i

RE: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

2018-08-28 Thread Xiao, Jack
I mean there is a risk in allowing the GPU to access most system memory
without explicitly claiming it; that makes problems easier to mask.

Regards,
Jack

-Original Message-
From: Koenig, Christian 
Sent: Tuesday, August 28, 2018 2:46 PM
To: Xiao, Jack ; Kuehling, Felix ; 
Christian König ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

> This series patches seems to make AGP aperture allowed to access any system 
> memory (16GB) bypass GPU VM protection.
The system aperture should only be active in the system domain, or otherwise 
applications would have access to local memory as well.

There are some bits in the VM registers to enable/disable that, but when we 
would have that setting incorrect we would see quite a bunch of other problems.

Might still be a good idea to double check if all the bits are setup correctly.

Regards,
Christian.

Am 28.08.2018 um 07:31 schrieb Xiao, Jack:
> This series patches seems to make AGP aperture allowed to access any system 
> memory (16GB) bypass GPU VM protection.
> If someone made a wrong logic requesting an illegal address which 
> occasionally was located inside AGP aperture, but without any VM 
> protection, the illegal address would be finally translated into a 
> system memory address; if GPU read/wrote such system memory address, the 
> system memory address might belong to kernel or any user application, the r/w 
> operation would result in any unpredictable issue.
> The most important is that such kind of issue is so hard to be addressed.
> Is it worth doing this, but exposing risk?
>
> Regards,
> Jack
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Felix Kuehling
> Sent: Tuesday, August 28, 2018 3:03 AM
> To: Christian König ; 
> amd-gfx@lists.freedesktop.org; Koenig, Christian 
> 
> Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address 
> space on GMC9
>
> The point of this series seems to be to allow access to small system memory 
> BOs (one page) without a GART mapping. I'm guessing that reduces pressure on 
> the GART and removes the need for HDP and TLB flushes. Why does Patch 10 only 
> enable that on GFXv9? Is there less benefit on older chips?
>
> Is this related to your recent changes to allow page tables in system memory?
>
> See my replies to patch 6 and 8. Other than that, the series is
> Acked-by: Felix Kuehling 
>
> Regards,
>    Felix
>
>
> On 2018-08-27 12:53 PM, Christian König wrote:
>> Only use the lower address space on GMC9 for the system domain.
>> Otherwise we would need to sign extend GMC addresses.
>>
>> Signed-off-by: Christian König 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++
>>   1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index e44b5191735d..d982956c8329 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -938,11 +938,10 @@ static int gmc_v9_0_sw_init(void *handle)
>>  if (r)
>>  return r;
>>   
>> -/* Set the internal MC address mask
>> - * This is the max address of the GPU's
>> - * internal address space.
>> +/* Use only the lower range for the internal MC address mask. This is
>> + * the max address of the GPU's internal address space.
>>   */
>> -adev->gmc.mc_mask = 0xffffffffffffULL; /* 48 bit MC */
>> +adev->gmc.mc_mask = 0x7fffffffffffULL;
>>   
>>  /* set DMA mask + need_dma32 flags.
>>   * PCIE - can handle 44-bits.
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

2018-08-27 Thread Xiao, Jack
This patch series seems to allow the AGP aperture to access any system memory
(16GB), bypassing GPU VM protection.
If some wrong logic requested an illegal address that happened to fall inside
the AGP aperture, then without any VM protection that illegal address would be
translated straight into a system memory address; if the GPU read or wrote that
address, the memory might belong to the kernel or to any user application, and
the access could cause unpredictable issues.
Most importantly, this kind of issue is very hard to track down.
Is it worth doing this while exposing such a risk?

Regards,
Jack

-Original Message-
From: amd-gfx  On Behalf Of Felix 
Kuehling
Sent: Tuesday, August 28, 2018 3:03 AM
To: Christian König ; 
amd-gfx@lists.freedesktop.org; Koenig, Christian 
Subject: Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

The point of this series seems to be to allow access to small system memory BOs 
(one page) without a GART mapping. I'm guessing that reduces pressure on the 
GART and removes the need for HDP and TLB flushes. Why does Patch 10 only 
enable that on GFXv9? Is there less benefit on older chips?

Is this related to your recent changes to allow page tables in system memory?

See my replies to patch 6 and 8. Other than that, the series is
Acked-by: Felix Kuehling 

Regards,
  Felix


On 2018-08-27 12:53 PM, Christian König wrote:
> Only use the lower address space on GMC9 for the system domain.
> Otherwise we would need to sign extend GMC addresses.
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index e44b5191735d..d982956c8329 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -938,11 +938,10 @@ static int gmc_v9_0_sw_init(void *handle)
>   if (r)
>   return r;
>  
> - /* Set the internal MC address mask
> -  * This is the max address of the GPU's
> -  * internal address space.
> + /* Use only the lower range for the internal MC address mask. This is
> +  * the max address of the GPU's internal address space.
>*/
> - adev->gmc.mc_mask = 0xffffffffffffULL; /* 48 bit MC */
> + adev->gmc.mc_mask = 0x7fffffffffffULL;
>  
>   /* set DMA mask + need_dma32 flags.
>* PCIE - can handle 44-bits.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx