On 11/15/2024 3:28 PM, Victor Zhao wrote:
> In a consecutive packet submission, for example unmap followed by query
> status, the CP reads wptr after the doorbell ring for the unmap packet.
> If the CP operates slowly (e.g. doorbell_mode=1), wptr may already have
> been advanced past the next packet (query status) while that packet's
> content has not yet been flushed to memory, so the CP fetches stale data.
>
> Add an mb() to ensure the ring buffer has been updated before wptr is
> updated. Also add an mb() to ensure wptr is updated before the doorbell
> is rung.
>
> Signed-off-by: Victor Zhao <[email protected]>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index 4843dcb9a5f7..55d18aed257b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -306,12 +306,17 @@ int kq_submit_packet(struct kernel_queue *kq)
>  	if (amdgpu_amdkfd_is_fed(kq->dev->adev))
>  		return -EIO;
>
> +	/* Make sure ring buffer is updated before wptr updated */
> +	mb();
> +

Maybe add a more specific comment here noting that this is especially
needed with DOORBELL_MODE=1, where the CP fetches the value from WPTR
memory instead of from the doorbell packet.

Reviewed-by: Lijo Lazar <[email protected]>

Thanks,
Lijo

>  	if (kq->dev->kfd->device_info.doorbell_size == 8) {
>  		*kq->wptr64_kernel = kq->pending_wptr64;
> +		mb(); /* Make sure wptr updated before ring doorbell */
>  		write_kernel_doorbell64(kq->queue->properties.doorbell_ptr,
>  					kq->pending_wptr64);
>  	} else {
>  		*kq->wptr_kernel = kq->pending_wptr;
> +		mb(); /* Make sure wptr updated before ring doorbell */
>  		write_kernel_doorbell(kq->queue->properties.doorbell_ptr,
>  					kq->pending_wptr);
>  	}
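
For reference, the ordering the patch enforces (packet contents, then wptr memory, then doorbell) can be modeled in plain userspace C11. This is only a rough sketch under stated assumptions, not the driver code. `toy_queue`, `toy_write_packet` and `toy_submit` are hypothetical names, and `atomic_thread_fence()` stands in for the kernel's `mb()`. The C11 fences order CPU memory accesses only and say nothing about device or MMIO visibility, which is what the real barriers have to cover.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_DWORDS 16u

/* Hypothetical userspace stand-in for a kernel queue; not the KFD layout. */
struct toy_queue {
	uint32_t ring[RING_DWORDS];	/* packet memory the consumer fetches from */
	_Atomic uint32_t wptr_mem;	/* models the wptr memory location */
	_Atomic uint32_t doorbell;	/* models the doorbell write */
	uint32_t pending_wptr;		/* producer-private write pointer */
};

/* Copy a packet into the ring; nothing is published to the consumer yet. */
static void toy_write_packet(struct toy_queue *q, const uint32_t *pkt, uint32_t n)
{
	assert(n <= RING_DWORDS);
	for (uint32_t i = 0; i < n; i++)
		q->ring[(q->pending_wptr + i) % RING_DWORDS] = pkt[i];
	q->pending_wptr += n;
}

/* Publish in the same order as the patch: ring data, wptr, then doorbell. */
static void toy_submit(struct toy_queue *q)
{
	/* Packet contents must be visible before wptr is updated. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&q->wptr_mem, q->pending_wptr, memory_order_relaxed);

	/* wptr must be visible before the doorbell is rung. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&q->doorbell, q->pending_wptr, memory_order_relaxed);
}
```

In this model, a consumer that re-reads `wptr_mem` after a doorbell (the DOORBELL_MODE=1 case described above) can only observe a new wptr once the packet it covers has been written, which is the property the first `mb()` is meant to provide.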