RE: [PATCH] drm/amdgpu: remove distinction between explicit and implicit sync (v2)

2020-06-10 Thread Zhou, David(ChunMing)
[AMD Official Use Only - Internal Distribution Only]

Not sure if this is the right direction; I think user mode wants all synchronization 
to be explicit. Implicit sync often confuses people who don't know its history. 
I remember Jason from Intel is driving explicit synchronization through the 
Linux ecosystem, which even removes implicit sync for shared buffers.

-David

From: amd-gfx  On Behalf Of Marek Olšák
Sent: Tuesday, June 9, 2020 6:58 PM
To: amd-gfx mailing list 
Subject: [PATCH] drm/amdgpu: remove distinction between explicit and implicit 
sync (v2)

Hi,

This enables a full pipeline sync for implicit sync. It's Christian's patch 
with the driver version bumped. With this, user mode drivers don't have to wait 
for idle at the end of gfx IBs.

Any concerns?

Thanks,
Marek


RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

2020-02-21 Thread Zhou, David(ChunMing)
[AMD Official Use Only - Internal Distribution Only]

That's fine with me.

-David

From: Koenig, Christian 
Sent: Friday, February 21, 2020 11:33 PM
To: Deucher, Alexander ; Christian König 
; Zhou, David(ChunMing) 
; He, Jacob ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

I would just do this as part of the vm_flush() callback on the ring.

E.g. check if the VMID you want to flush is reserved and if yes enable SPM.

Maybe pass along a flag or something in the job to make things easier.

Christian.
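
A rough sketch of what that could look like (illustration only; the job flag and
the ring callback name below are made up and not part of any existing patch):

	/* in amdgpu_vm_flush(), once the VMID for this job is known */
	if (job->enable_spm_trace && ring->funcs->emit_spm_vmid)
		/* program the SPM VMID through the ring so the update is
		 * ordered with the rest of the submission */
		ring->funcs->emit_spm_vmid(ring, job->vmid);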

On 21.02.20 at 16:31, Deucher, Alexander wrote:

[AMD Public Use]

We already have the RESERVE_VMID ioctl interface, can't we just use that 
internally in the kernel to update the rlc register via the ring when we 
schedule the relevant IB?  E.g., add a new ring callback to set SPM state and 
then set it to the reserved vmid before we schedule the ib, and then reset it 
to 0 after the IB in amdgpu_ib_schedule().

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 4b2342d11520..e0db9362c6ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -185,6 +185,9 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
if (ring->funcs->insert_start)
ring->funcs->insert_start(ring);

+   if (ring->funcs->setup_spm)
+   ring->funcs->setup_spm(ring, job);
+
if (job) {
r = amdgpu_vm_flush(ring, job, need_pipe_sync);
if (r) {
@@ -273,6 +276,9 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
return r;
}

+   if (ring->funcs->setup_spm)
+   ring->funcs->setup_spm(ring, NULL);
+
if (ring->funcs->insert_end)
ring->funcs->insert_end(ring);



Alex

From: amd-gfx 
<mailto:amd-gfx-boun...@lists.freedesktop.org>
 on behalf of Christian König 
<mailto:ckoenig.leichtzumer...@gmail.com>
Sent: Friday, February 21, 2020 5:28 AM
To: Zhou, David(ChunMing) <mailto:david1.z...@amd.com>; 
He, Jacob <mailto:jacob...@amd.com>; Koenig, Christian 
<mailto:christian.koe...@amd.com>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> 
<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

That would probably be a no-go, but we could enhance the kernel driver to 
update the RLC_SPM_VMID register with the reserved VMID.

Handling that in userspace most likely won't work anyway, since the RLC 
registers are usually not accessible by userspace.

Regards,
Christian.
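
For reference, a kernel-side hook for this could look roughly like the sketch
below for a gfx9-type part (register and field names as in the usual gfx9
headers; treat this as an illustration, not the actual patch):

static void gfx_v9_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)
{
	u32 data;

	/* read-modify-write only the SPM VMID field of RLC_SPM_MC_CNTL */
	data = RREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL);
	data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
	data |= (vmid & 0xf) << RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
	WREG32_SOC15(GC, 0, mmRLC_SPM_MC_CNTL, data);
}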

On 20.02.20 at 16:15, Zhou, David(ChunMing) wrote:

[AMD Official Use Only - Internal Distribution Only]



You can enhance amdgpu_vm_ioctl in amdgpu_vm.c to return the VMID to user space.



-David
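
A minimal sketch of that suggestion (for illustration only; the out.vmid field
and the helper used below are hypothetical and do not exist in the current
UAPI or kernel code):

	/* in amdgpu_vm_ioctl(), amdgpu_vm.c */
	case AMDGPU_VM_OP_RESERVE_VMID:
		r = amdgpu_vm_reserve_vmid(adev, &fpriv->vm, AMDGPU_GFXHUB_0);
		if (!r)
			/* hypothetical: report the reserved VMID back to the UMD */
			args->out.vmid = amdgpu_vm_get_reserved_vmid(&fpriv->vm);
		break;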





From: He, Jacob <mailto:jacob...@amd.com>
Sent: Thursday, February 20, 2020 10:46 PM
To: Zhou, David(ChunMing) <mailto:david1.z...@amd.com>; 
Koenig, Christian <mailto:christian.koe...@amd.com>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace



amdgpu_vm_reserve_vmid doesn't return the reserved VMID back to user space, so 
there is no way for the user mode driver to update RLC_SPM_VMID.



Thanks

Jacob



From: He, Jacob<mailto:jacob...@amd.com>
Sent: Thursday, February 20, 2020 6:20 PM
To: Zhou, David(ChunMing)<mailto:david1.z...@amd.com>; Koenig, 
Christian<mailto:christian.koe...@amd.com>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace



Looks like amdgpu_vm_reserve_vmid could work; let me try updating 
RLC_SPM_VMID with PM4 packets in the UMD.



Thanks

Jacob



From: Zhou, David(ChunMing)<mailto:david1.z...@amd.com>
Sent: Thursday, February 20, 2020 10:13 AM
To: Koenig, Christian<mailto:christian.koe...@amd.com>; He, 
Jacob<mailto:jacob...@amd.com>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace



[AMD Official Use Only - Internal Distribution Only]

Christian is right here; simply using the VMID in the kernel like that will cause 
many problems.
We already have a pair of interfaces for RGP; I think you can use them instead of 
involving an additional kernel change:
amdgpu_vm_reserve_vmid / amdgpu_vm_unreserve_vmid.

-David

-Original Message-
From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 On Behalf Of Christian König
Sent: Wednesday, February 19, 2020 7:03 PM
To: He, Jacob mailto:jacob...@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

2020-02-20 Thread Zhou, David(ChunMing)
[AMD Official Use Only - Internal Distribution Only]

You can enhance amdgpu_vm_ioctl in amdgpu_vm.c to return the VMID to user space.

-David


From: He, Jacob 
Sent: Thursday, February 20, 2020 10:46 PM
To: Zhou, David(ChunMing) ; Koenig, Christian 
; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

amdgpu_vm_reserve_vmid doesn't return the reserved VMID back to user space, so 
there is no way for the user mode driver to update RLC_SPM_VMID.

Thanks
Jacob

From: He, Jacob<mailto:jacob...@amd.com>
Sent: Thursday, February 20, 2020 6:20 PM
To: Zhou, David(ChunMing)<mailto:david1.z...@amd.com>; Koenig, 
Christian<mailto:christian.koe...@amd.com>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

Looks like amdgpu_vm_reserve_vmid could work; let me try updating 
RLC_SPM_VMID with PM4 packets in the UMD.

Thanks
Jacob

From: Zhou, David(ChunMing)<mailto:david1.z...@amd.com>
Sent: Thursday, February 20, 2020 10:13 AM
To: Koenig, Christian<mailto:christian.koe...@amd.com>; He, 
Jacob<mailto:jacob...@amd.com>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

[AMD Official Use Only - Internal Distribution Only]

Christian is right here; simply using the VMID in the kernel like that will cause 
many problems.
We already have a pair of interfaces for RGP; I think you can use them instead of 
involving an additional kernel change:
amdgpu_vm_reserve_vmid / amdgpu_vm_unreserve_vmid.

-David

-Original Message-
From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 On Behalf Of Christian König
Sent: Wednesday, February 19, 2020 7:03 PM
To: He, Jacob mailto:jacob...@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

On 19.02.20 at 11:15, Jacob He wrote:
> [WHY]
> When SPM trace enabled, SPM_VMID should be updated with the current
> vmid.
>
> [HOW]
> Add a chunk id, AMDGPU_CHUNK_ID_SPM_TRACE, so that UMD can tell us
> which job should update SPM_VMID.
> Right before a job is submitted to GPU, set the SPM_VMID accordingly.
>
> [Limitation]
> Running more than one SPM trace enabled processes simultaneously is
> not supported.

Well there are multiple problems with that patch.

First of all you need to better describe what SPM tracing is in the commit 
message.

Then the updating of mmRLC_SPM_MC_CNTL must be executed asynchronously on the 
ring. Otherwise we might corrupt an already executing SPM trace.

And you also need to make sure to disable the tracing again or otherwise we run 
into a bunch of trouble when the VMID is reused.

You also need to make sure that IBs using the SPM trace are serialized with 
each other, e.g. hack into amdgpu_ids.c file and make sure that only one VMID 
at a time can have that attribute.

Regards,
Christian.

>
> Change-Id: Ic932ef6ac9dbf244f03aaee90550e8ff3a675666
> Signed-off-by: Jacob He mailto:jacob...@amd.com>>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 10 +++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h |  1 +
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 15 ++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 15 ++-
>   8 files changed, 48 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index f9fa6e104fef..3f32c4db5232 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -113,6 +113,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
> *p, union drm_amdgpu_cs
>uint32_t uf_offset = 0;
>int i;
>int ret;
> + bool update_spm_vmid = false;
>
>if (cs->in.num_chunks == 0)
>return 0;
> @@ -221,6 +222,10 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
> *p, union drm_amdgpu_cs
>case AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL:
>break;
>
> + case AMDGPU_CHUNK_ID_SPM_TRACE:
> + update_spm_vmid = true;
> + break;
> +
>default:
>ret = -EINVAL;
>goto free_partial_kdata;
> @@ -231,6 +236,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
> *p, union drm_amdgpu_cs
>if (ret)
>goto free_all_kdata;
>
> + p->job->ne

RE: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

2020-02-19 Thread Zhou, David(ChunMing)
[AMD Official Use Only - Internal Distribution Only]

Christian is right here; simply using the VMID in the kernel like that will cause 
many problems.
We already have a pair of interfaces for RGP; I think you can use them instead of 
involving an additional kernel change:
amdgpu_vm_reserve_vmid / amdgpu_vm_unreserve_vmid.

-David
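
For completeness, this is roughly how the existing pair is reachable from user 
space through the VM ioctl (a minimal sketch using the usual libdrm command 
helpers; error handling omitted):

#include <xf86drm.h>
#include <amdgpu_drm.h>

/* reserve (or release) a dedicated VMID for this DRM file descriptor,
 * the same mechanism RGP uses */
static int spm_reserve_vmid(int fd, int reserve)
{
	union drm_amdgpu_vm args = {};

	args.in.op = reserve ? AMDGPU_VM_OP_RESERVE_VMID
			     : AMDGPU_VM_OP_UNRESERVE_VMID;
	return drmCommandWriteRead(fd, DRM_AMDGPU_VM, &args, sizeof(args));
}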

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: Wednesday, February 19, 2020 7:03 PM
To: He, Jacob ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Add a chunk ID for spm trace

On 19.02.20 at 11:15, Jacob He wrote:
> [WHY]
> When SPM trace enabled, SPM_VMID should be updated with the current 
> vmid.
>
> [HOW]
> Add a chunk id, AMDGPU_CHUNK_ID_SPM_TRACE, so that UMD can tell us 
> which job should update SPM_VMID.
> Right before a job is submitted to GPU, set the SPM_VMID accordingly.
>
> [Limitation]
> Running more than one SPM trace enabled processes simultaneously is 
> not supported.

Well there are multiple problems with that patch.

First of all you need to better describe what SPM tracing is in the commit 
message.

Then the updating of mmRLC_SPM_MC_CNTL must be executed asynchronously on the 
ring. Otherwise we might corrupt an already executing SPM trace.

And you also need to make sure to disable the tracing again or otherwise we run 
into a bunch of trouble when the VMID is reused.

You also need to make sure that IBs using the SPM trace are serialized with 
each other, e.g. hack into amdgpu_ids.c file and make sure that only one VMID 
at a time can have that attribute.

Regards,
Christian.

>
> Change-Id: Ic932ef6ac9dbf244f03aaee90550e8ff3a675666
> Signed-off-by: Jacob He 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 10 +++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h |  1 +
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 15 ++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 15 ++-
>   8 files changed, 48 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index f9fa6e104fef..3f32c4db5232 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -113,6 +113,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
> *p, union drm_amdgpu_cs
>   uint32_t uf_offset = 0;
>   int i;
>   int ret;
> + bool update_spm_vmid = false;
>   
>   if (cs->in.num_chunks == 0)
>   return 0;
> @@ -221,6 +222,10 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
> *p, union drm_amdgpu_cs
>   case AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL:
>   break;
>   
> + case AMDGPU_CHUNK_ID_SPM_TRACE:
> + update_spm_vmid = true;
> + break;
> +
>   default:
>   ret = -EINVAL;
>   goto free_partial_kdata;
> @@ -231,6 +236,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
> *p, union drm_amdgpu_cs
>   if (ret)
>   goto free_all_kdata;
>   
> + p->job->need_update_spm_vmid = update_spm_vmid;
> +
>   if (p->ctx->vram_lost_counter != p->job->vram_lost_counter) {
>   ret = -ECANCELED;
>   goto free_all_kdata;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index cae81914c821..36faab12b585 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -156,9 +156,13 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, 
> unsigned num_ibs,
>   return -EINVAL;
>   }
>   
> - if (vm && !job->vmid) {
> - dev_err(adev->dev, "VM IB without ID\n");
> - return -EINVAL;
> + if (vm) {
> + if (!job->vmid) {
> + dev_err(adev->dev, "VM IB without ID\n");
> + return -EINVAL;
> + } else if (adev->gfx.rlc.funcs->update_spm_vmid && 
> job->need_update_spm_vmid) {
> + adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
> + }
>   }
>   
>   alloc_size = ring->funcs->emit_frame_size + num_ibs * diff --git 
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> index 2e2110dddb76..4582536961c7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
> @@ -52,6 +52,7 @@ struct amdgpu_job {
>   boolvm_needs_flush;
>   uint64_tvm_pd_addr;
>   unsignedvmid;
> + boolneed_update_spm_vmid;
>   unsignedpasid;
>   uint32_tgds_base, gds_size;
>  

Re: [PATCH] drm/ttm: use the parent resv for ghost objects v2

2019-10-24 Thread Zhou, David(ChunMing)

On 2019/10/24 6:25 PM, Christian König wrote:
> Ping?
>
> On 18.10.19 at 13:58, Christian König wrote:
>> This way the TTM is destroyed with the correct dma_resv object
>> locked and we can even pipeline imported BO evictions.
>>
>> v2: Limit this to only cases when the parent object uses a separate
>>  reservation object as well. This fixes another OOM problem.
>>
>> Signed-off-by: Christian König 
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +---
>>   1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
>> b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index e030c27f53cf..45e440f80b7b 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -512,7 +512,9 @@ static int ttm_buffer_object_transfer(struct 
>> ttm_buffer_object *bo,
>>   kref_init(&fbo->base.kref);
>>   fbo->base.destroy = &ttm_transfered_destroy;
>>   fbo->base.acc_size = 0;
>> -    fbo->base.base.resv = &fbo->base.base._resv;
>> +    if (bo->base.resv == &bo->base._resv)
>> +    fbo->base.base.resv = &fbo->base.base._resv;
>> +
>>   dma_resv_init(fbo->base.base.resv);

Doesn't this lead to an issue if you end up initializing the parent resv? 
Otherwise, how do we deal with the case where the parent's resv is locked?


>>   ret = dma_resv_trylock(fbo->base.base.resv);
>>   WARN_ON(!ret);
>> @@ -711,7 +713,7 @@ int ttm_bo_move_accel_cleanup(struct 
>> ttm_buffer_object *bo,
>>   if (ret)
>>   return ret;
>>   -    dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
>> +    dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>>     /**
>>    * If we're not moving to fixed memory, the TTM object
>> @@ -724,7 +726,7 @@ int ttm_bo_move_accel_cleanup(struct 
>> ttm_buffer_object *bo,
>>   else
>>   bo->ttm = NULL;
>>   -    ttm_bo_unreserve(ghost_obj);
>> +    dma_resv_unlock(&ghost_obj->base._resv);

fbo->base.base.resv?

-David

>>   ttm_bo_put(ghost_obj);
>>   }
>>   @@ -767,7 +769,7 @@ int ttm_bo_pipeline_move(struct 
>> ttm_buffer_object *bo,
>>   if (ret)
>>   return ret;
>>   -    dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
>> +    dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>>     /**
>>    * If we're not moving to fixed memory, the TTM object
>> @@ -780,7 +782,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object 
>> *bo,
>>   else
>>   bo->ttm = NULL;
>>   -    ttm_bo_unreserve(ghost_obj);
>> +    dma_resv_unlock(&ghost_obj->base._resv);
>>   ttm_bo_put(ghost_obj);
>>     } else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
>> @@ -836,7 +838,7 @@ int ttm_bo_pipeline_gutting(struct 
>> ttm_buffer_object *bo)
>>   if (ret)
>>   return ret;
>>   -    ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
>> +    ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
>>   /* Last resort, wait for the BO to be idle when we are OOM */
>>   if (ret)
>>   ttm_bo_wait(bo, false, false);
>> @@ -845,7 +847,7 @@ int ttm_bo_pipeline_gutting(struct 
>> ttm_buffer_object *bo)
>>   bo->mem.mem_type = TTM_PL_SYSTEM;
>>   bo->ttm = NULL;
>>   -    ttm_bo_unreserve(ghost);
>> +    dma_resv_unlock(&ghost->base._resv);
>>   ttm_bo_put(ghost);
>>     return 0;
>

RE: [PATCH] drm/amdgpu: remove gfx9 NGG

2019-09-19 Thread Zhou, David(ChunMing)
+Alex Yan to confirm that this doesn't affect us.



-Original Message-
From: amd-gfx  On Behalf Of Marek Olšák
Sent: Friday, September 20, 2019 10:16 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: remove gfx9 NGG

From: Marek Olšák 

Never used.

Signed-off-by: Marek Olšák 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |   5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  41 -  
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  25 ---  
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |  11 --
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 195 
 5 files changed, 277 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 6ff02bb60140..80116e63e209 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -140,25 +140,20 @@ extern int amdgpu_dc;  extern int amdgpu_sched_jobs;  
extern int amdgpu_sched_hw_submission;  extern uint amdgpu_pcie_gen_cap;  
extern uint amdgpu_pcie_lane_cap;  extern uint amdgpu_cg_mask;  extern uint 
amdgpu_pg_mask;  extern uint amdgpu_sdma_phase_quantum;  extern char 
*amdgpu_disable_cu;  extern char *amdgpu_virtual_display;  extern uint 
amdgpu_pp_feature_mask; -extern int amdgpu_ngg; -extern int 
amdgpu_prim_buf_per_se; -extern int amdgpu_pos_buf_per_se; -extern int 
amdgpu_cntl_sb_buf_per_se; -extern int amdgpu_param_buf_per_se;  extern int 
amdgpu_job_hang_limit;  extern int amdgpu_lbpw;  extern int 
amdgpu_compute_multipipe;  extern int amdgpu_gpu_recovery;  extern int 
amdgpu_emu_mode;  extern uint amdgpu_smu_memory_pool_size;  extern uint 
amdgpu_dc_feature_mask;  extern uint amdgpu_dm_abm_level;  extern struct 
amdgpu_mgpu_info mgpu_info;  extern int amdgpu_ras_enable; diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index b49ed39c1fea..cbe4ef4813f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -119,25 +119,20 @@ int amdgpu_sched_jobs = 32;  int 
amdgpu_sched_hw_submission = 2;  uint amdgpu_pcie_gen_cap = 0;  uint 
amdgpu_pcie_lane_cap = 0;  uint amdgpu_cg_mask = 0x;  uint 
amdgpu_pg_mask = 0x;  uint amdgpu_sdma_phase_quantum = 32;  char 
*amdgpu_disable_cu = NULL;  char *amdgpu_virtual_display = NULL;
 /* OverDrive(bit 14) disabled by default*/  uint amdgpu_pp_feature_mask = 
0xbfff; -int amdgpu_ngg = 0; -int amdgpu_prim_buf_per_se = 0; -int 
amdgpu_pos_buf_per_se = 0; -int amdgpu_cntl_sb_buf_per_se = 0; -int 
amdgpu_param_buf_per_se = 0;  int amdgpu_job_hang_limit = 0;  int amdgpu_lbpw = 
-1;  int amdgpu_compute_multipipe = -1;  int amdgpu_gpu_recovery = -1; /* auto 
*/  int amdgpu_emu_mode = 0;  uint amdgpu_smu_memory_pool_size = 0;
 /* FBC (bit 0) disabled by default*/
 uint amdgpu_dc_feature_mask = 0;
 int amdgpu_async_gfx_ring = 1;
 int amdgpu_mcbp = 0;
@@ -443,56 +438,20 @@ module_param_named(disable_cu, amdgpu_disable_cu, charp, 
0444);
  * DOC: virtual_display (charp)
  * Set to enable virtual display feature. This feature provides a virtual 
display hardware on headless boards
  * or in virtualized environments. It will be set like 
:xx:xx.x,x;:xx:xx.x,x. It's the pci address of
  * the device, plus the number of crtcs to expose. E.g., :26:00.0,4 would 
enable 4 virtual crtcs on the pci
  * device at 26:00.0. The default is NULL.
  */
 MODULE_PARM_DESC(virtual_display,
 "Enable virtual display feature (the virtual_display will be 
set like :xx:xx.x,x;:xx:xx.x,x)");  module_param_named(virtual_display, 
amdgpu_virtual_display, charp, 0444);
 
-/**
- * DOC: ngg (int)
- * Set to enable Next Generation Graphics (1 = enable). The default is 0 
(disabled).
- */
-MODULE_PARM_DESC(ngg, "Next Generation Graphics (1 = enable, 0 = 
disable(default depending on gfx))"); -module_param_named(ngg, amdgpu_ngg, int, 
0444);
-
-/**
- * DOC: prim_buf_per_se (int)
- * Override the size of Primitive Buffer per Shader Engine in Byte. The 
default is 0 (depending on gfx).
- */
-MODULE_PARM_DESC(prim_buf_per_se, "the size of Primitive Buffer per Shader 
Engine (default depending on gfx)"); -module_param_named(prim_buf_per_se, 
amdgpu_prim_buf_per_se, int, 0444);
-
-/**
- * DOC: pos_buf_per_se (int)
- * Override the size of Position Buffer per Shader Engine in Byte. The default 
is 0 (depending on gfx).
- */
-MODULE_PARM_DESC(pos_buf_per_se, "the size of Position Buffer per Shader 
Engine (default depending on gfx)"); -module_param_named(pos_buf_per_se, 
amdgpu_pos_buf_per_se, int, 0444);
-
-/**
- * DOC: cntl_sb_buf_per_se (int)
- * Override the size of Control Sideband per Shader Engine in Byte. The 
default is 0 (depending on gfx).
- */
-MODULE_PARM_DESC(cntl_sb_buf_per_se, "the size of Control Sideband per Shader 
Engine (default depending on gfx)"); -module_param_named(cntl_sb_buf_per_se, 
amdgpu_cntl_sb_buf_per_se, int, 0444);
-
-/**
- * DOC: param_buf_per_se (int)
- * Override the size of Off-Chip Parameter 

Re:[PATCH] drm/amdgpu: resvert "disable bulk moves for now"

2019-09-12 Thread Zhou, David(ChunMing)
I don't know the DKMS status; anyway, we should submit this one as early as possible.

 Original Message 
Subject: Re: [PATCH] drm/amdgpu: resvert "disable bulk moves for now"
From: Christian König
To: "Zhou, David(ChunMing)" ,amd-gfx@lists.freedesktop.org
Cc:

Just to double check: We do have that enabled in the DKMS package for a
while and doesn't encounter any more problems with it, correct?

Thanks,
Christian.

On 12.09.19 at 16:02, Chunming Zhou wrote:
> RB on it to go ahead.
>
> -David
>
> On 2019/9/12 18:15, Christian König wrote:
>> This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a.
>>
>> The changes to fix this should have landed in 5.1.
>>
>> Signed-off-by: Christian König 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 --
>>1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 48349e4f0701..fd3fbaa73fa3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -603,14 +603,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device 
>> *adev,
>>   struct ttm_bo_global *glob = adev->mman.bdev.glob;
>>   struct amdgpu_vm_bo_base *bo_base;
>>
>> -#if 0
>>   if (vm->bulk_moveable) {
>>   spin_lock(&glob->lru_lock);
>>   ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
>>   spin_unlock(&glob->lru_lock);
>>   return;
>>   }
>> -#endif
>>
>>   memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
>>


RE: [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables

2019-09-03 Thread Zhou, David(ChunMing)
Do you need to update the VRAM size reported to the UMD?

-David

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: Monday, September 2, 2019 6:52 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables

This hopefully helps reduce the contention for page tables.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h   | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 9 +++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 2eda3a8c330d..3352a87b822e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -99,6 +99,9 @@ struct amdgpu_bo_list_entry;
 #define AMDGPU_VM_FAULT_STOP_FIRST 1
 #define AMDGPU_VM_FAULT_STOP_ALWAYS2
 
+/* Reserve 4MB VRAM for page tables */
+#define AMDGPU_VM_RESERVED_VRAM(4ULL << 20)
+
 /* max number of VMHUB */
 #define AMDGPU_MAX_VMHUBS  3
 #define AMDGPU_GFXHUB_00
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 1150e34bc28f..59440f71d304 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -24,6 +24,7 @@
 
 #include 
 #include "amdgpu.h"
+#include "amdgpu_vm.h"
 
 struct amdgpu_vram_mgr {
struct drm_mm mm;
@@ -276,7 +277,7 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager 
*man,
struct drm_mm_node *nodes;
enum drm_mm_insert_mode mode;
unsigned long lpfn, num_nodes, pages_per_node, pages_left;
-   uint64_t vis_usage = 0, mem_bytes;
+   uint64_t vis_usage = 0, mem_bytes, max_bytes;
unsigned i;
int r;
 
@@ -284,9 +285,13 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager 
*man,
if (!lpfn)
lpfn = man->size;
 
+   max_bytes = adev->gmc.mc_vram_size;
+   if (tbo->type != ttm_bo_type_kernel)
+   max_bytes -= AMDGPU_VM_RESERVED_VRAM;
+
/* bail out quickly if there's likely not enough VRAM for this BO */
mem_bytes = (u64)mem->num_pages << PAGE_SHIFT;
-   if (atomic64_add_return(mem_bytes, &mgr->usage) > adev->gmc.mc_vram_size) {
+   if (atomic64_add_return(mem_bytes, &mgr->usage) > max_bytes) {
	atomic64_sub(mem_bytes, &mgr->usage);
mem->mm_node = NULL;
return 0;
-- 
2.17.1


RE: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3

2019-05-29 Thread Zhou, David(ChunMing)
Patch #1,#5,#6,#8,#9,#10 are Reviewed-by: Chunming Zhou 
Patch #2,#3,#4 are Acked-by: Chunming Zhou 

-David

> -Original Message-
> From: dri-devel  On Behalf Of
> Christian König
> Sent: Wednesday, May 29, 2019 8:27 PM
> To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3
> 
> This avoids OOM situations when we have lots of threads submitting at the
> same time.
> 
> v3: apply this to the whole driver, not just CS
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c| 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 4 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 20f2955d2a55..3e2da24cd17a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct
> amdgpu_cs_parser *p,
>   }
> 
> r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
> -, true);
> +, false);
>   if (unlikely(r != 0)) {
>   if (r != -ERESTARTSYS)
>   DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> index 06f83cac0d3a..f660628e6af9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> @@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device
> *adev, struct amdgpu_vm *vm,
>   list_add(_tv.head, );
>   amdgpu_vm_get_pd_bo(vm, , );
> 
> - r = ttm_eu_reserve_buffers(, , true, NULL, true);
> + r = ttm_eu_reserve_buffers(, , true, NULL, false);
>   if (r) {
>   DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
>   return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index d513a5ad03dd..ed25a4e14404 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct
> drm_gem_object *obj,
> 
>   amdgpu_vm_get_pd_bo(vm, , _pd);
> 
> - r = ttm_eu_reserve_buffers(, , false, , true);
> + r = ttm_eu_reserve_buffers(, , false, , false);
>   if (r) {
>   dev_err(adev->dev, "leaking bo va because "
>   "we fail to reserve bo (%d)\n", r);
> @@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev,
> void *data,
> 
>   amdgpu_vm_get_pd_bo(>vm, , _pd);
> 
> - r = ttm_eu_reserve_buffers(, , true, , true);
> + r = ttm_eu_reserve_buffers(, , true, , false);
>   if (r)
>   goto error_unref;
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index c430e8259038..d60593cc436e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct
> amdgpu_bo *bo, bool no_intr)
>   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>   int r;
> 
> - r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
> + r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
>   if (unlikely(r != 0)) {
>   if (r != -ERESTARTSYS)
>   dev_err(adev->dev, "%p reserve failed\n", bo);
> --
> 2.17.1
> 

RE: [PATCH 01/11] drm/ttm: Make LRU removal optional.

2019-05-17 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Christian König 
> Sent: Tuesday, May 14, 2019 8:31 PM
> To: Olsak, Marek ; Zhou, David(ChunMing)
> ; Liang, Prike ; dri-
> de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: [PATCH 01/11] drm/ttm: Make LRU removal optional.
> 
> [CAUTION: External Email]
> 
> We are already doing this for DMA-buf imports and also for amdgpu VM BOs
> for quite a while now.
> 
> If this doesn't run into any problems we are probably going to stop removing
> BOs from the LRU altogether.
> 
> Signed-off-by: Christian König 
> ---
[snip]
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> index 0075eb9a0b52..957ec375a4ba 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> @@ -69,7 +69,8 @@ void ttm_eu_backoff_reservation(struct
> ww_acquire_ctx *ticket,
> list_for_each_entry(entry, list, head) {
> struct ttm_buffer_object *bo = entry->bo;
> 
> -   ttm_bo_add_to_lru(bo);
> +   if (list_empty(&bo->lru))
> +   ttm_bo_add_to_lru(bo);
> reservation_object_unlock(bo->resv);
> }
> spin_unlock(&glob->lru_lock);
> @@ -93,7 +94,7 @@ EXPORT_SYMBOL(ttm_eu_backoff_reservation);
> 
>  int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
>struct list_head *list, bool intr,
> -  struct list_head *dups)
> +  struct list_head *dups, bool del_lru)
>  {
> struct ttm_bo_global *glob;
> struct ttm_validate_buffer *entry; @@ -172,11 +173,11 @@ int
> ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
> list_add(&entry->head, list);
> }
> 
> -   if (ticket)
> -   ww_acquire_done(ticket);
> -   spin_lock(&glob->lru_lock);
> -   ttm_eu_del_from_lru_locked(list);
> -   spin_unlock(&glob->lru_lock);
> +   if (del_lru) {
> +   spin_lock(&glob->lru_lock);
> +   ttm_eu_del_from_lru_locked(list);
> +   spin_unlock(&glob->lru_lock);
> +   }

Can you move the BO to the LRU tail here when del_lru is false?

The busy iteration in evict_first will then try other processes' BOs first, which 
could save loop time.
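
A rough sketch of that suggestion, on top of the hunk above (not a tested change):

	if (del_lru) {
		spin_lock(&glob->lru_lock);
		ttm_eu_del_from_lru_locked(list);
		spin_unlock(&glob->lru_lock);
	} else {
		/* keep the BOs on the LRU, but bump them to the tail so
		 * eviction tries other processes' BOs first */
		spin_lock(&glob->lru_lock);
		list_for_each_entry(entry, list, head)
			ttm_bo_move_to_lru_tail(entry->bo, NULL);
		spin_unlock(&glob->lru_lock);
	}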

> return 0;
>  }
>  EXPORT_SYMBOL(ttm_eu_reserve_buffers);
> @@ -203,7 +204,10 @@ void ttm_eu_fence_buffer_objects(struct
> ww_acquire_ctx *ticket,
> reservation_object_add_shared_fence(bo->resv, fence);
> else
> reservation_object_add_excl_fence(bo->resv, fence);
> -   ttm_bo_add_to_lru(bo);
> +   if (list_empty(&bo->lru))
> +   ttm_bo_add_to_lru(bo);
> +   else
> +   ttm_bo_move_to_lru_tail(bo, NULL);

If this line is done in above, then we don't need this here.

-David
> reservation_object_unlock(bo->resv);
> }
> spin_unlock(&glob->lru_lock);
> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> index 161b80fee492..5cffaa24259f 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> @@ -63,7 +63,7 @@ static int virtio_gpu_object_list_validate(struct
> ww_acquire_ctx *ticket,
> struct virtio_gpu_object *qobj;
> int ret;
> 
> -   ret = ttm_eu_reserve_buffers(ticket, head, true, NULL);
> +   ret = ttm_eu_reserve_buffers(ticket, head, true, NULL, true);
> if (ret != 0)
> return ret;
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> index a7c30e567f09..d28cbedba0b5 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> @@ -465,7 +465,8 @@ vmw_resource_check_buffer(struct ww_acquire_ctx
> *ticket,
> val_buf->bo = >backup->base;
> val_buf->num_shared = 0;
> list_add_tail(_buf->head, _list);
> -   ret = ttm_eu_reserve_buffers(ticket, _list, interruptible, NULL);
> +   ret = ttm_eu_reserve_buffers(ticket, _list, interruptible, NULL,
> +true);
> if (unlikely(ret != 0))
> goto out_no_reserve;
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> index 3b396fea40d7..ac435b51f4eb 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.h
> @@ -165,7 +165,7 @@ vmw_validation_bo_reserve

Re:[PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-15 Thread Zhou, David(ChunMing)
Ah, sorry, I missed "+ ttm_bo_move_to_lru_tail(bo, NULL);".

Right, moving them to the end before releasing fixes my concern.

Sorry for the noise.
-David


 Original Message 
Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: "Koenig, Christian"
To: "Zhou, David(ChunMing)" ,"Olsak, Marek" ,"Liang, Prike" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org
CC:

[CAUTION: External Email]
BO list? No, we stop removing them from the LRU.

But we still move them to the end of the LRU before releasing them.

Christian.

On 15.05.19 at 16:21, Zhou, David(ChunMing) wrote:
Isn't this patch trying to stop the removal of all BOs from the BO list?

-David

 Original Message 
Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: Christian König
To: "Zhou, David(ChunMing)" ,"Koenig, Christian" ,"Olsak, Marek" ,"Liang, 
Prike" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org>
CC:

[CAUTION: External Email]
That is a good point, but actually not a problem in practice.

See the change to ttm_eu_fence_buffer_objects:
-   ttm_bo_add_to_lru(bo);
+   if (list_empty(&bo->lru))
+   ttm_bo_add_to_lru(bo);
+   else
+   ttm_bo_move_to_lru_tail(bo, NULL);

We still move the BOs to the end of the LRU in the same order we have before, 
we just don't remove them when they are reserved.

Regards,
Christian.

On 14.05.19 at 16:31, Zhou, David(ChunMing) wrote:
How do we refresh the LRU to keep the order aligned with the BO list passed from user space?

You can verify it with some games; performance can differ a lot between 
multiple runs.

-David

 Original Message 
Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: Christian König
To: "Zhou, David(ChunMing)" ,"Olsak, Marek" ,"Liang, Prike" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org>
CC:

[CAUTION: External Email]
Hui? What do you mean with that?

Christian.

On 14.05.19 at 15:12, Zhou, David(ChunMing) wrote:
My only concern is how to refresh the LRU when the BO comes from the BO list.

-David

 Original Message ----
Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: Christian König
To: "Olsak, Marek" ,"Zhou, David(ChunMing)" ,"Liang, Prike" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org>
CC:

[CAUTION: External Email]

This avoids OOM situations when we have lots of threads
submitting at the same time.

Signed-off-by: Christian König 
<mailto:christian.koe...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index fff558cf385b..f9240a94217b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}

r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-  , true);
+  , false);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
--
2.17.1






Re:[PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-15 Thread Zhou, David(ChunMing)
Isn't this patch trying to stop the removal of all BOs from the BO list?

-David

 Original Message 
Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: Christian König
To: "Zhou, David(ChunMing)" ,"Koenig, Christian" ,"Olsak, Marek" ,"Liang, 
Prike" ,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org
CC:

[CAUTION: External Email]
That is a good point, but actually not a problem in practice.

See the change to ttm_eu_fence_buffer_objects:
-   ttm_bo_add_to_lru(bo);
+   if (list_empty(&bo->lru))
+   ttm_bo_add_to_lru(bo);
+   else
+   ttm_bo_move_to_lru_tail(bo, NULL);

We still move the BOs to the end of the LRU in the same order we have before, 
we just don't remove them when they are reserved.

Regards,
Christian.

On 14.05.19 at 16:31, Zhou, David(ChunMing) wrote:
How do we refresh the LRU to keep the order aligned with the BO list passed from user space?

You can verify it with some games; performance can differ a lot between 
multiple runs.

-David

 Original Message 
Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: Christian König
To: "Zhou, David(ChunMing)" ,"Olsak, Marek" ,"Liang, Prike" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org>
CC:

[CAUTION: External Email]
Hui? What do you mean with that?

Christian.

On 14.05.19 at 15:12, Zhou, David(ChunMing) wrote:
My only concern is how to refresh the LRU when the BO comes from the BO list.

-David

 Original Message 
Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: Christian König
To: "Olsak, Marek" ,"Zhou, David(ChunMing)" ,"Liang, Prike" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org>
CC:

[CAUTION: External Email]

This avoids OOM situations when we have lots of threads
submitting at the same time.

Signed-off-by: Christian König 
<mailto:christian.koe...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index fff558cf385b..f9240a94217b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}

r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-  , true);
+  , false);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
--
2.17.1






Re:[PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-14 Thread Zhou, David(ChunMing)
How do we refresh the LRU to keep the order aligned with the BO list passed from user space?

You can verify it with some games; performance can differ a lot between 
multiple runs.

-David

 Original Message 
Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: Christian König
To: "Zhou, David(ChunMing)" ,"Olsak, Marek" ,"Liang, Prike" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org
CC:

[CAUTION: External Email]
Hui? What do you mean with that?

Christian.

On 14.05.19 at 15:12, Zhou, David(ChunMing) wrote:
My only concern is how to refresh the LRU when the BO comes from the BO list.

-David

 Original Message 
Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: Christian König
To: "Olsak, Marek" ,"Zhou, David(ChunMing)" ,"Liang, Prike" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org>
CC:

[CAUTION: External Email]

This avoids OOM situations when we have lots of threads
submitting at the same time.

Signed-off-by: Christian König 
<mailto:christian.koe...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index fff558cf385b..f9240a94217b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}

r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-  , true);
+  , false);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
--
2.17.1



Re:[PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-14 Thread Zhou, David(ChunMing)
My only concern is how to refresh the LRU when the BO comes from the BO list.

-David

 Original Message 
Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS
From: Christian König
To: "Olsak, Marek" ,"Zhou, David(ChunMing)" ,"Liang, Prike" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org
CC:

[CAUTION: External Email]

This avoids OOM situations when we have lots of threads
submitting at the same time.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index fff558cf385b..f9240a94217b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
}

r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-  , true);
+  , false);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
--
2.17.1


RE: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-27 Thread Zhou, David(ChunMing)
Sorry, I can only put my Acked-by: Chunming Zhou on patch #3.

I cannot fully judge patches #4, #5, and #6.

-David

From: amd-gfx  On Behalf Of Grodzovsky, 
Andrey
Sent: Friday, April 26, 2019 10:09 PM
To: Koenig, Christian ; Zhou, David(ChunMing) 
; dri-de...@lists.freedesktop.org; 
amd-gfx@lists.freedesktop.org; e...@anholt.net; etna...@lists.freedesktop.org
Cc: Kazlauskas, Nicholas ; Liu, Monk 

Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already 
signaled.


Ping (mostly David and Monk).

Andrey
On 4/24/19 3:09 AM, Christian König wrote:
On 24.04.19 at 05:02, Zhou, David(ChunMing) wrote:
>> -drm_sched_stop(>sched, >base);
>> -
>>   /* after all hw jobs are reset, hw fence is meaningless, so 
>> force_completion */
>>   amdgpu_fence_driver_force_completion(ring);
>>   }

HW fences are already force-completed, so I think we can just disable fence 
interrupt processing and ignore HW fence signals while we are trying to do a GPU 
reset. Otherwise the logic will become much more complex.
If this situation happens because of long execution times, we can increase the 
timeout of the reset detection.

You are not thinking widely enough; forcing the HW fence to complete can 
trigger others to start other activity in the system.

We first need to stop everything and make sure that we don't do any processing 
any more and then start with our reset procedure including forcing all hw 
fences to complete.

Christian.



-David

From: amd-gfx 
<mailto:amd-gfx-boun...@lists.freedesktop.org>
 On Behalf Of Grodzovsky, Andrey
Sent: Wednesday, April 24, 2019 12:00 AM
To: Zhou, David(ChunMing) <mailto:david1.z...@amd.com>; 
dri-de...@lists.freedesktop.org<mailto:dri-de...@lists.freedesktop.org>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; 
e...@anholt.net<mailto:e...@anholt.net>; 
etna...@lists.freedesktop.org<mailto:etna...@lists.freedesktop.org>; 
ckoenig.leichtzumer...@gmail.com<mailto:ckoenig.leichtzumer...@gmail.com>
Cc: Kazlauskas, Nicholas 
<mailto:nicholas.kazlaus...@amd.com>; Liu, Monk 
<mailto:monk@amd.com>
Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already 
signaled.


No, i mean the actual HW fence which signals when the job finished execution on 
the HW.

Andrey
On 4/23/19 11:19 AM, Zhou, David(ChunMing) wrote:
Do you mean the fence timer? Why not stop it as well when stopping the scheduler 
for the HW reset?

 Original Message 
Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already 
signaled.
From: "Grodzovsky, Andrey"
To: "Zhou, David(ChunMing)" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com>
CC: "Kazlauskas, Nicholas" ,"Liu, Monk"

On 4/22/19 9:09 AM, Zhou, David(ChunMing) wrote:
> +Monk.
>
> GPU reset is used widely in SRIOV, so we need a virtualization guy to take a look.
>
> But out of curiosity, why can the guilty job still signal if it is already
> set to guilty? Was it set wrongly?
>
>
> -David


It's possible that the job completes at a later time than when its
timeout handler started processing, so in this patch we try to protect
against this by rechecking the HW fence after stopping all SW
schedulers. We do it BEFORE marking the job's sched_entity as guilty, so
at the point we check, the guilty flag is not set yet.

Andrey


>
> On 2019/4/18 23:00, Andrey Grodzovsky wrote:
>> Also reject TDRs if another one already running.
>>
>> v2:
>> Stop all schedulers across device and entire XGMI hive before
>> force signaling HW fences.
>> Avoid passing job_signaled to helper functions to keep all the decision
>> making about skipping HW reset in one place.
>>
>> v3:
>> Fix SW sched. hang after non HW reset. sched.hw_rq_count has to be balanced
>> against it's decrement in drm_sched_stop in non HW reset case.
>> v4: rebase
>> v5: Revert v3 as we do it now in scheduler code.
>>
>> Signed-off-by: Andrey Grodzovsky 
>> <mailto:andrey.grodzov...@amd.com>
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 143 
>> +++--
>>1 file changed, 95 insertions(+), 48 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index a0e165c..85f8792 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -3334,8 +3334,6 @@ static int amdgpu_device_pre_

RE: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Zhou, David(ChunMing)
>> -drm_sched_stop(>sched, >base);
>> -
>>   /* after all hw jobs are reset, hw fence is meaningless, so 
>> force_completion */
>>   amdgpu_fence_driver_force_completion(ring);
>>   }

HW fences are already force-completed, so I think we can just disable fence 
interrupt processing and ignore HW fence signals while we are trying to do a GPU 
reset. Otherwise the logic will become much more complex.
If this situation happens because of long execution times, we can increase the 
timeout of the reset detection.

-David
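
A rough sketch of what "disable fence interrupt processing during reset" could
mean, using the per-ring fence IRQ source (not a tested change, just to
illustrate the idea):

	/* keep late HW fence interrupts from racing with the reset */
	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
		struct amdgpu_ring *ring = adev->rings[i];

		if (!ring || !ring->sched.thread)
			continue;

		/* disable the fence IRQ source for this ring ... */
		if (ring->fence_drv.irq_src)
			amdgpu_irq_put(adev, ring->fence_drv.irq_src,
				       ring->fence_drv.irq_type);

		/* ... and force-complete whatever is left, as today */
		amdgpu_fence_driver_force_completion(ring);
	}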

From: amd-gfx  On Behalf Of Grodzovsky, 
Andrey
Sent: Wednesday, April 24, 2019 12:00 AM
To: Zhou, David(ChunMing) ; 
dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; 
e...@anholt.net; etna...@lists.freedesktop.org; ckoenig.leichtzumer...@gmail.com
Cc: Kazlauskas, Nicholas ; Liu, Monk 

Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already 
signaled.


No, i mean the actual HW fence which signals when the job finished execution on 
the HW.

Andrey
On 4/23/19 11:19 AM, Zhou, David(ChunMing) wrote:
Do you mean the fence timer? Why not stop it as well when stopping the scheduler 
for the HW reset?

 Original Message 
Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already 
signaled.
From: "Grodzovsky, Andrey"
To: "Zhou, David(ChunMing)" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com<mailto:dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com>
CC: "Kazlauskas, Nicholas" ,"Liu, Monk"

On 4/22/19 9:09 AM, Zhou, David(ChunMing) wrote:
> +Monk.
>
> GPU reset is used widely in SRIOV, so we need a virtualization guy to take a look.
>
> But out of curiosity, why can the guilty job still signal if it is already
> set to guilty? Was it set wrongly?
>
>
> -David


It's possible that the job completes at a later time than when its
timeout handler started processing, so in this patch we try to protect
against this by rechecking the HW fence after stopping all SW
schedulers. We do it BEFORE marking the job's sched_entity as guilty, so
at the point we check, the guilty flag is not set yet.

Andrey


>
> On 2019/4/18 23:00, Andrey Grodzovsky wrote:
>> Also reject TDRs if another one already running.
>>
>> v2:
>> Stop all schedulers across device and entire XGMI hive before
>> force signaling HW fences.
>> Avoid passing job_signaled to helper functions to keep all the decision
>> making about skipping HW reset in one place.
>>
>> v3:
>> Fix SW sched. hang after non HW reset. sched.hw_rq_count has to be balanced
>> against it's decrement in drm_sched_stop in non HW reset case.
>> v4: rebase
>> v5: Revert v3 as we do it now in scheduler code.
>>
>> Signed-off-by: Andrey Grodzovsky 
>> <mailto:andrey.grodzov...@amd.com>
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 143 
>> +++--
>>1 file changed, 95 insertions(+), 48 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index a0e165c..85f8792 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -3334,8 +3334,6 @@ static int amdgpu_device_pre_asic_reset(struct 
>> amdgpu_device *adev,
>>   if (!ring || !ring->sched.thread)
>>   continue;
>>
>> -drm_sched_stop(>sched, >base);
>> -
>>   /* after all hw jobs are reset, hw fence is meaningless, so 
>> force_completion */
>>   amdgpu_fence_driver_force_completion(ring);
>>   }
>> @@ -3343,6 +3341,7 @@ static int amdgpu_device_pre_asic_reset(struct 
>> amdgpu_device *adev,
>>   if(job)
>>   drm_sched_increase_karma(>base);
>>
>> +/* Don't suspend on bare metal if we are not going to HW reset the ASIC 
>> */
>>   if (!amdgpu_sriov_vf(adev)) {
>>
>>   if (!need_full_reset)
>> @@ -3480,37 +3479,21 @@ static int amdgpu_do_asic_reset(struct 
>> amdgpu_hive_info *hive,
>>   return r;
>>}
>>
>> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev)
>> +static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, bool 
>> trylock)
>>{
>> -int i;
>> -
>> -for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>> -struct amdgpu_ring *ring = ad

Re:[PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Zhou, David(ChunMing)
Do you mean the fence timer? Why not stop it as well when stopping the scheduler 
for the HW reset?

 Original Message 
Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already 
signaled.
From: "Grodzovsky, Andrey"
To: "Zhou, David(ChunMing)" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com
CC: "Kazlauskas, Nicholas" ,"Liu, Monk"


On 4/22/19 9:09 AM, Zhou, David(ChunMing) wrote:
> +Monk.
>
> GPU reset is used widely in SRIOV, so we need a virtualization guy to take a look.
>
> But out of curiosity, why can the guilty job still signal if it is already
> set to guilty? Was it set wrongly?
>
>
> -David


It's possible that the job completes at a later time than when its
timeout handler started processing, so in this patch we try to protect
against this by rechecking the HW fence after stopping all SW
schedulers. We do it BEFORE marking the job's sched_entity as guilty, so
at the point we check, the guilty flag is not set yet.

Andrey
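
A condensed sketch of that ordering (roughly what the patch implements, shortened
here for illustration; not the literal diff):

	bool job_signaled = false;

	/* 1. stop all software schedulers first */
	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
		struct amdgpu_ring *ring = adev->rings[i];

		if (!ring || !ring->sched.thread)
			continue;
		drm_sched_stop(&ring->sched, job ? &job->base : NULL);
	}

	/* 2. only now look at the HW fence: the job may have completed
	 *    after its timeout handler already fired */
	job_signaled = job && dma_fence_is_signaled(job->base.s_fence->parent);
	if (job_signaled)
		/* skip the HW reset and don't mark the entity guilty */
		dev_info(adev->dev,
			 "Guilty job already signaled, skipping HW reset");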


>
> On 2019/4/18 23:00, Andrey Grodzovsky wrote:
>> Also reject TDRs if another one already running.
>>
>> v2:
>> Stop all schedulers across device and entire XGMI hive before
>> force signaling HW fences.
>> Avoid passing job_signaled to helper functions to keep all the decision
>> making about skipping HW reset in one place.
>>
>> v3:
>> Fix SW sched. hang after non HW reset. sched.hw_rq_count has to be balanced
>> against it's decrement in drm_sched_stop in non HW reset case.
>> v4: rebase
>> v5: Revert v3 as we do it now in scheduler code.
>>
>> Signed-off-by: Andrey Grodzovsky 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 143 
>> +++--
>>1 file changed, 95 insertions(+), 48 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index a0e165c..85f8792 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -3334,8 +3334,6 @@ static int amdgpu_device_pre_asic_reset(struct 
>> amdgpu_device *adev,
>>   if (!ring || !ring->sched.thread)
>>   continue;
>>
>> -drm_sched_stop(>sched, >base);
>> -
>>   /* after all hw jobs are reset, hw fence is meaningless, so 
>> force_completion */
>>   amdgpu_fence_driver_force_completion(ring);
>>   }
>> @@ -3343,6 +3341,7 @@ static int amdgpu_device_pre_asic_reset(struct 
>> amdgpu_device *adev,
>>   if(job)
>>   drm_sched_increase_karma(>base);
>>
>> +/* Don't suspend on bare metal if we are not going to HW reset the ASIC 
>> */
>>   if (!amdgpu_sriov_vf(adev)) {
>>
>>   if (!need_full_reset)
>> @@ -3480,37 +3479,21 @@ static int amdgpu_do_asic_reset(struct 
>> amdgpu_hive_info *hive,
>>   return r;
>>}
>>
>> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev)
>> +static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, bool 
>> trylock)
>>{
>> -int i;
>> -
>> -for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>> -struct amdgpu_ring *ring = adev->rings[i];
>> -
>> -if (!ring || !ring->sched.thread)
>> -continue;
>> -
>> -if (!adev->asic_reset_res)
>> -drm_sched_resubmit_jobs(>sched);
>> +if (trylock) {
>> +if (!mutex_trylock(>lock_reset))
>> +return false;
>> +} else
>> +mutex_lock(>lock_reset);
>>
>> -drm_sched_start(>sched, !adev->asic_reset_res);
>> -}
>> -
>> -if (!amdgpu_device_has_dc_support(adev)) {
>> -drm_helper_resume_force_mode(adev->ddev);
>> -}
>> -
>> -adev->asic_reset_res = 0;
>> -}
>> -
>> -static void amdgpu_device_lock_adev(struct amdgpu_device *adev)
>> -{
>> -mutex_lock(>lock_reset);
>>   atomic_inc(>gpu_reset_counter);
>>   adev->in_gpu_reset = 1;
>>   /* Block kfd: SRIOV would do it separately */
>>   if (!amdgpu_sriov_vf(adev))
>>amdgpu_amdkfd_pre_reset(adev);
>> +
>> +return true;
>>}
>>
>>static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
>> @@ -3538,40 +3521,42 @@ s

Re:[PATCH v5 3/6] drm/scheduler: rework job destruction

2019-04-23 Thread Zhou, David(ChunMing)
This patch is to fix the deadlock between fence->lock and sched->job_list_lock, 
right?
So I suggest just moving list_del_init(&s_job->node) from 
drm_sched_process_job to the work thread. That would avoid the deadlock described 
in the link.
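
Just to illustrate what I mean, an untested sketch (based on the existing
finish_work / s_job fields) could look like:

	static void drm_sched_job_finish(struct work_struct *work)
	{
		struct drm_sched_job *s_job = container_of(work, struct drm_sched_job,
							   finish_work);
		struct drm_gpu_scheduler *sched = s_job->sched;
		unsigned long flags;

		/* removal moved here from drm_sched_process_job(), so the fence
		 * callback never takes job_list_lock while fence->lock is held */
		spin_lock_irqsave(&sched->job_list_lock, flags);
		list_del_init(&s_job->node);
		spin_unlock_irqrestore(&sched->job_list_lock, flags);

		/* rest of the finish work (timeout cancel, free_job) as before */
	}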


 Original Message 
Subject: Re: [PATCH v5 3/6] drm/scheduler: rework job destruction
From: "Grodzovsky, Andrey"
To: "Zhou, David(ChunMing)" 
,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.com
CC: "Kazlauskas, Nicholas" ,"Koenig, Christian"


On 4/22/19 8:48 AM, Chunming Zhou wrote:
> Hi Andrey,
>
> static void drm_sched_process_job(struct dma_fence *f, struct
> dma_fence_cb *cb)
> {
> ...
>   spin_lock_irqsave(&sched->job_list_lock, flags);
>   /* remove job from ring_mirror_list */
>   list_del_init(&s_job->node);
>   spin_unlock_irqrestore(&sched->job_list_lock, flags);
> [David] How about just moving the above into the worker instead of doing it
> in IRQ processing? Any problem with that? Maybe I missed your previous
> discussion, but I think removing the lock for the list is a risk for future
> maintenance, even though you make sure it is thread safe currently.
>
> -David


We remove the lock exactly because of the fact that insertion into and
removal from the list will be done from exactly one thread at any
time now. So I am not sure I understand what you mean.

Andrey


>
> ...
>
>   schedule_work(&s_job->finish_work);
> }
>
> 在 2019/4/18 23:00, Andrey Grodzovsky 写道:
>> From: Christian König 
>>
>> We now destroy finished jobs from the worker thread to make sure that
>> we never destroy a job currently in timeout processing.
>> By this we avoid holding lock around ring mirror list in drm_sched_stop
>> which should solve a deadlock reported by a user.
>>
>> v2: Remove unused variable.
>> v4: Move guilty job free into sched code.
>> v5:
>> Move sched->hw_rq_count to drm_sched_start to account for counter
>> decrement in drm_sched_stop even when we don't call resubmit jobs
>> if guily job did signal.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692
>>
>> Signed-off-by: Christian König 
>> Signed-off-by: Andrey Grodzovsky 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   9 +-
>>drivers/gpu/drm/etnaviv/etnaviv_dump.c |   4 -
>>drivers/gpu/drm/etnaviv/etnaviv_sched.c|   2 +-
>>drivers/gpu/drm/lima/lima_sched.c  |   2 +-
>>drivers/gpu/drm/panfrost/panfrost_job.c|   2 +-
>>drivers/gpu/drm/scheduler/sched_main.c | 159 
>> +
>>drivers/gpu/drm/v3d/v3d_sched.c|   2 +-
>>include/drm/gpu_scheduler.h|   6 +-
>>8 files changed, 102 insertions(+), 84 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 7cee269..a0e165c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -3334,7 +3334,7 @@ static int amdgpu_device_pre_asic_reset(struct 
>> amdgpu_device *adev,
>>if (!ring || !ring->sched.thread)
>>continue;
>>
>> - drm_sched_stop(>sched);
>> + drm_sched_stop(>sched, >base);
>>
>>/* after all hw jobs are reset, hw fence is meaningless, so 
>> force_completion */
>>amdgpu_fence_driver_force_completion(ring);
>> @@ -3343,8 +3343,6 @@ static int amdgpu_device_pre_asic_reset(struct 
>> amdgpu_device *adev,
>>if(job)
>>drm_sched_increase_karma(>base);
>>
>> -
>> -
>>if (!amdgpu_sriov_vf(adev)) {
>>
>>if (!need_full_reset)
>> @@ -3482,8 +3480,7 @@ static int amdgpu_do_asic_reset(struct 
>> amdgpu_hive_info *hive,
>>return r;
>>}
>>
>> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev,
>> -   struct amdgpu_job *job)
>> +static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev)
>>{
>>int i;
>>
>> @@ -3623,7 +3620,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
>> *adev,
>>
>>/* Post ASIC reset for all devs .*/
>>list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
>> - amdgpu_device_post_asic_reset(tmp_adev, tmp_adev == adev ? job : NULL);
>> + amdgpu_device_post_asic_reset(tmp_adev);
>>
>>if (r) {
>>/* bad news, how to tell it to userspace ? */
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c 
>> b/drivers/gpu/drm/e

RE: DMA-buf P2P

2019-04-19 Thread Zhou, David(ChunMing)
Which test are you using? Can you share it?

-David

> -Original Message-
> From: dri-devel  On Behalf Of
> Christian K?nig
> Sent: Thursday, April 18, 2019 8:09 PM
> To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: DMA-buf P2P
> 
> Hi guys,
> 
> as promised this is the patch set which enables P2P buffer sharing with DMA-
> buf.
> 
> Basic idea is that importers can set a flag noting that they can deal with and
> sgt which doesn't contains pages.
> 
> This in turn is the signal to the exporter that we don't need to move a buffer
> to system memory any more when a remote device wants to access it.
> 
> Please review and/or comment,
> Christian.
> 
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4

2019-03-31 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Lionel Landwerlin 
> Sent: Saturday, March 30, 2019 10:09 PM
> To: Koenig, Christian ; Zhou, David(ChunMing)
> ; dri-de...@lists.freedesktop.org; amd-
> g...@lists.freedesktop.org; ja...@jlekstrand.net; Hector, Tobias
> 
> Subject: Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point
> interface v4
> 
> On 28/03/2019 15:18, Christian König wrote:
> > Am 28.03.19 um 14:50 schrieb Lionel Landwerlin:
> >> On 25/03/2019 08:32, Chunming Zhou wrote:
> >>> From: Christian König 
> >>>
> >>> Use the dma_fence_chain object to create a timeline of fence objects
> >>> instead of just replacing the existing fence.
> >>>
> >>> v2: rebase and cleanup
> >>> v3: fix garbage collection parameters
> >>> v4: add unorder point check, print a warn calltrace
> >>>
> >>> Signed-off-by: Christian König 
> >>> Cc: Lionel Landwerlin 
> >>> ---
> >>>   drivers/gpu/drm/drm_syncobj.c | 39
> >>> +++
> >>>   include/drm/drm_syncobj.h |  5 +
> >>>   2 files changed, 44 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/drm_syncobj.c
> >>> b/drivers/gpu/drm/drm_syncobj.c index 5329e66598c6..19a9ce638119
> >>> 100644
> >>> --- a/drivers/gpu/drm/drm_syncobj.c
> >>> +++ b/drivers/gpu/drm/drm_syncobj.c
> >>> @@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct
> >>> drm_syncobj *syncobj,
> >>>   spin_unlock(>lock);
> >>>   }
> >>>   +/**
> >>> + * drm_syncobj_add_point - add new timeline point to the syncobj
> >>> + * @syncobj: sync object to add timeline point do
> >>> + * @chain: chain node to use to add the point
> >>> + * @fence: fence to encapsulate in the chain node
> >>> + * @point: sequence number to use for the point
> >>> + *
> >>> + * Add the chain node as new timeline point to the syncobj.
> >>> + */
> >>> +void drm_syncobj_add_point(struct drm_syncobj *syncobj,
> >>> +   struct dma_fence_chain *chain,
> >>> +   struct dma_fence *fence,
> >>> +   uint64_t point)
> >>> +{
> >>> +    struct syncobj_wait_entry *cur, *tmp;
> >>> +    struct dma_fence *prev;
> >>> +
> >>> +    dma_fence_get(fence);
> >>> +
> >>> +    spin_lock(>lock);
> >>> +
> >>> +    prev = drm_syncobj_fence_get(syncobj);
> >>> +    /* You are adding an unorder point to timeline, which could
> >>> cause payload returned from query_ioctl is 0! */
> >>> +    WARN_ON_ONCE(prev && prev->seqno >= point);
> >>
> >>
> >> I think the WARN/BUG macros should only fire when there is an issue
> >> with programming from within the kernel.
> >>
> >> But this particular warning can be triggered by an application.
> >>
> >>
> >> Probably best to just remove it?
> >
> > Yeah, that was also my argument against it.
> >
> > Key point here is that we still want to note somehow that userspace
> > did something wrong and returning an error is not an option.
> >
> > Maybe just use DRM_ERROR with a static variable to print the message
> > only once.
> >
> > Christian.
> 
> I don't really see any point in printing an error once. If you run your
> application twice you end up thinking there was an issue just on the first run
> but it's actually always wrong.
> 

Apart from this nitpick, are there any other concerns about pushing the whole 
patch set? Is it time to push it?

-David
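
(As a side note on the "DRM_ERROR with a static variable" idea quoted above, I
imagine it would be roughly the following; only a sketch:)

	/* print the unordered-point message only once per boot */
	static bool warned;

	if (prev && prev->seqno >= point && !warned) {
		warned = true;
		DRM_ERROR("adding an unordered point to syncobj timeline\n");
	}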

> 
> Unless we're willing to take the syncobj lock for longer periods of time when
> adding points, I guess we'll have to defer validation to validation layers.
> 
> 
> -Lionel
> 
> >
> >>
> >>
> >> -Lionel
> >>
> >>
> >>> +    dma_fence_chain_init(chain, prev, fence, point);
> >>> +    rcu_assign_pointer(syncobj->fence, >base);
> >>> +
> >>> +    list_for_each_entry_safe(cur, tmp, >cb_list, node) {
> >>> +    list_del_init(>node);
> >>> +    syncobj_wait_syncobj_func(syncobj, cur);
> >>> +    }
> >>> +    spin_unlock(>lock);
> >>> +
> >>> +    /* Walk the chain once to trigger garbage collection */
> >>> +    dma_fence_c

RE: [PATCH] drm/amdgpu: fix old fence check in amdgpu_fence_emit

2019-03-31 Thread Zhou, David(ChunMing)


> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian K?nig
> Sent: Saturday, March 30, 2019 2:33 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: fix old fence check in amdgpu_fence_emit
> 
> We don't hold a reference to the old fence, so it can go away any time we are
> waiting for it to signal.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 24 -
> --
>  1 file changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index ee47c11e92ce..4dee2326b29c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -136,8 +136,9 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring,
> struct dma_fence **f,  {
>   struct amdgpu_device *adev = ring->adev;
>   struct amdgpu_fence *fence;
> - struct dma_fence *old, **ptr;
> + struct dma_fence __rcu **ptr;
>   uint32_t seq;
> + int r;
> 
>   fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL);
>   if (fence == NULL)
> @@ -153,15 +154,24 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring,
> struct dma_fence **f,
>  seq, flags | AMDGPU_FENCE_FLAG_INT);
> 
>   ptr = >fence_drv.fences[seq & ring-
> >fence_drv.num_fences_mask];
> + if (unlikely(rcu_dereference_protected(*ptr, 1))) {

Isn't this line redundant with dma_fence_get_rcu_safe? I think it's unnecessary.
Otherwise looks ok to me.

-David
> + struct dma_fence *old;
> +
> + rcu_read_lock();
> + old = dma_fence_get_rcu_safe(ptr);
> + rcu_read_unlock();
> +
> + if (old) {
> + r = dma_fence_wait(old, false);
> + dma_fence_put(old);
> + if (r)
> + return r;
> + }
> + }
> +
>   /* This function can't be called concurrently anyway, otherwise
>* emitting the fence would mess up the hardware ring buffer.
>*/
> - old = rcu_dereference_protected(*ptr, 1);
> - if (old && !dma_fence_is_signaled(old)) {
> - DRM_INFO("rcu slot is busy\n");
> - dma_fence_wait(old, false);
> - }
> -
>   rcu_assign_pointer(*ptr, dma_fence_get(>base));
> 
>   *f = >base;
> --
> 2.17.1
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re:[PATCH 1/9] dma-buf: add new dma_fence_chain container v6

2019-03-21 Thread Zhou, David(ChunMing)
Can cmpxchg be replaced by some simple C statement?
Otherwise we have to remove __rcu from chain->prev.

-David
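
One possible alternative, if we want to keep the __rcu annotation and still use
cmpxchg(), might be to cast the address space away explicitly (untested sketch,
just to show the idea):

	struct dma_fence *tmp;

	/* chain->prev is __rcu annotated; the __force cast tells sparse that a
	 * plain cmpxchg on the pointer is intentional here */
	tmp = cmpxchg((struct dma_fence __force **)&chain->prev,
		      prev, replacement);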

 Original Message 
Subject: Re: [PATCH 1/9] dma-buf: add new dma_fence_chain container v6
From: Christian König
To: "Zhou, David(ChunMing)" ,kbuild test robot ,"Zhou, David(ChunMing)"
CC: 
kbuild-...@01.org,dri-de...@lists.freedesktop.org,amd-gfx@lists.freedesktop.org,lionel.g.landwer...@intel.com,ja...@jlekstrand.net,"Koenig,
 Christian" ,"Hector, Tobias"

Hi David,

For the cmpxchg() case I don't know offhand either. Looks like so far
nobody has used cmpxchg() with RCU-protected structures.

The other cases should be replaced by RCU_INIT_POINTER() or
rcu_dereference_protected(.., true);

Regards,
Christian.

Am 21.03.19 um 07:34 schrieb zhoucm1:
> Hi Lionel and Christian,
>
> Below is robot report for chain->prev, which was added __rcu as you
> suggested.
>
> How to fix this line "tmp = cmpxchg(&chain->prev, prev, replacement);"?
> I checked the kernel header files; it seems there is no cmpxchg variant for RCU pointers.
>
> Any suggestion to fix this robot report?
>
> Thanks,
> -David
>
> On 2019年03月21日 08:24, kbuild test robot wrote:
>> Hi Chunming,
>>
>> I love your patch! Perhaps something to improve:
>>
>> [auto build test WARNING on linus/master]
>> [also build test WARNING on v5.1-rc1 next-20190320]
>> [if your patch is applied to the wrong git tree, please drop us a
>> note to help improve the system]
>>
>> url:
>> https://github.com/0day-ci/linux/commits/Chunming-Zhou/dma-buf-add-new-dma_fence_chain-container-v6/20190320-223607
>> reproduce:
>>  # apt-get install sparse
>>  make ARCH=x86_64 allmodconfig
>>  make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'
>>
>>
>> sparse warnings: (new ones prefixed by >>)
>>
>>>> drivers/dma-buf/dma-fence-chain.c:73:23: sparse: incorrect type in
>>>> initializer (different address spaces) @@expected struct
>>>> dma_fence [noderef] *__old @@got  dma_fence [noderef]
>>>> *__old @@
>> drivers/dma-buf/dma-fence-chain.c:73:23:expected struct
>> dma_fence [noderef] *__old
>> drivers/dma-buf/dma-fence-chain.c:73:23:got struct dma_fence
>> *[assigned] prev
>>>> drivers/dma-buf/dma-fence-chain.c:73:23: sparse: incorrect type in
>>>> initializer (different address spaces) @@expected struct
>>>> dma_fence [noderef] *__new @@got  dma_fence [noderef]
>>>> *__new @@
>> drivers/dma-buf/dma-fence-chain.c:73:23:expected struct
>> dma_fence [noderef] *__new
>> drivers/dma-buf/dma-fence-chain.c:73:23:got struct dma_fence
>> *[assigned] replacement
>>>> drivers/dma-buf/dma-fence-chain.c:73:21: sparse: incorrect type in
>>>> assignment (different address spaces) @@expected struct
>>>> dma_fence *tmp @@got struct dma_fence [noderef] >>> dma_fence *tmp @@
>> drivers/dma-buf/dma-fence-chain.c:73:21:expected struct
>> dma_fence *tmp
>> drivers/dma-buf/dma-fence-chain.c:73:21:got struct dma_fence
>> [noderef] *[assigned] __ret
>>>> drivers/dma-buf/dma-fence-chain.c:190:28: sparse: incorrect type in
>>>> argument 1 (different address spaces) @@expected struct
>>>> dma_fence *fence @@got struct dma_fence struct dma_fence *fence @@
>> drivers/dma-buf/dma-fence-chain.c:190:28:expected struct
>> dma_fence *fence
>> drivers/dma-buf/dma-fence-chain.c:190:28:got struct dma_fence
>> [noderef] *prev
>>>> drivers/dma-buf/dma-fence-chain.c:222:21: sparse: incorrect type in
>>>> assignment (different address spaces) @@expected struct
>>>> dma_fence [noderef] *prev @@got [noderef] *prev @@
>> drivers/dma-buf/dma-fence-chain.c:222:21:expected struct
>> dma_fence [noderef] *prev
>> drivers/dma-buf/dma-fence-chain.c:222:21:got struct dma_fence
>> *prev
>> drivers/dma-buf/dma-fence-chain.c:235:33: sparse: expression
>> using sizeof(void)
>> drivers/dma-buf/dma-fence-chain.c:235:33: sparse: expression
>> using sizeof(void)
>>
>> vim +73 drivers/dma-buf/dma-fence-chain.c
>>
>>  38
>>  39/**
>>  40 * dma_fence_chain_walk - chain walking function
>>  41 * @fence: current chain node
>>  42 *
>>  43 * Walk the chain to the next node. Returns the next fence
>> or NULL if we are at
>>  44 * the end of the chain. Garbage collects chain nodes
>> which are already
>> 

Re:[PATCH] drm/amdgpu: enable bo priority setting from user space

2019-03-07 Thread Zhou, David(ChunMing)
Yes, per-submission BO list priority is already used by us. But per-VM BOs are 
still in flight; there is no priority for those yet.

-David

send from my phone

 Original Message 
Subject: Re: [PATCH] drm/amdgpu: enable bo priority setting from user space
From: "Koenig, Christian"
To: "Zhou, David(ChunMing)" ,amd-gfx@lists.freedesktop.org
CC:

Well you can already use the per submission priority for the BOs.

Additional to that as I said for per VM BOs we can add a priority to sort them 
in the LRU.

Not sure how effective both of those actually are.

Regards,
Christian.

Am 07.03.19 um 14:09 schrieb Zhou, David(ChunMing):
Yes, you are right, thanks for pointing it out. Will see if there is another way.

-David

send from my phone

 Original Message 
Subject: Re: [PATCH] drm/amdgpu: enable bo priority setting from user space
From: Christian König
To: "Zhou, David(ChunMing)" 
,amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
CC:

Am 07.03.19 um 10:15 schrieb Chunming Zhou:
> Signed-off-by: Chunming Zhou <mailto:david1.z...@amd.com>

Well NAK to the whole approach.

The TTM priority is a global priority, but processes are only allowed to
specify the priority inside their own allocations. So this approach
will never fly upstream.

What you can do is to add a priority for per vm BOs to affect their sort
order on the LRU, but I doubt that this will have much of an effect.

Regards,
Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 13 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h|  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  1 +
>   include/drm/ttm/ttm_bo_driver.h|  9 -
>   include/uapi/drm/amdgpu_drm.h  |  3 +++
>   7 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> index 5cbde74b97dd..70a6baf20c22 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> @@ -144,6 +144,7 @@ static int amdgpufb_create_pinned_object(struct 
> amdgpu_fbdev *rfbdev,
>size = mode_cmd->pitches[0] * height;
>aligned_size = ALIGN(size, PAGE_SIZE);
>ret = amdgpu_gem_object_create(adev, aligned_size, 0, domain,
> +TTM_BO_PRIORITY_NORMAL,
>   AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
>   AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
>   AMDGPU_GEM_CREATE_VRAM_CLEARED,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index d21dd2f369da..7c1c2362c67e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -44,6 +44,7 @@ void amdgpu_gem_object_free(struct drm_gem_object *gobj)
>
>   int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
> int alignment, u32 initial_domain,
> +  enum ttm_bo_priority priority,
> u64 flags, enum ttm_bo_type type,
> struct reservation_object *resv,
> struct drm_gem_object **obj)
> @@ -60,6 +61,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, 
> unsigned long size,
>bp.type = type;
>bp.resv = resv;
>bp.preferred_domain = initial_domain;
> + bp.priority = priority;
>   retry:
>bp.flags = flags;
>bp.domain = initial_domain;
> @@ -229,6 +231,14 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
> *data,
>if (args->in.domains & ~AMDGPU_GEM_DOMAIN_MASK)
>return -EINVAL;
>
> + /* check priority */
> + if (args->in.priority == 0) {
> + /* default is normal */
> + args->in.priority = TTM_BO_PRIORITY_NORMAL;
> + } else if (args->in.priority > TTM_MAX_BO_PRIORITY) {
> + args->in.priority = TTM_MAX_BO_PRIORITY;
> + DRM_ERROR("priority specified from user space is over MAX 
> priority\n");
> + }
>/* create a gem object to contain this object in */
>if (args->in.domains & (AMDGPU_GEM_DOMAIN_GDS |
>AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) {
> @@ -252,6 +262,7 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
> *data,
>
>r = amdgpu_gem_object_create(adev, size, args->in.alignment,
> (u32)(0x & args->in.domains),
> +

Re:[PATCH] drm/amdgpu: enable bo priority setting from user space

2019-03-07 Thread Zhou, David(ChunMing)
Yes, you are right, thanks for pointing it out. Will see if there is another way.

-David

send from my phone

 Original Message 
Subject: Re: [PATCH] drm/amdgpu: enable bo priority setting from user space
From: Christian König
To: "Zhou, David(ChunMing)" ,amd-gfx@lists.freedesktop.org
CC:

Am 07.03.19 um 10:15 schrieb Chunming Zhou:
> Signed-off-by: Chunming Zhou 

Well NAK to the whole approach.

The TTM priority is a global priority, but processes are only allowed to
specify the priority inside their own allocations. So this approach
will never fly upstream.

What you can do is to add a priority for per vm BOs to affect their sort
order on the LRU, but I doubt that this will have much of an effect.

Regards,
Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 13 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h|  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  1 +
>   include/drm/ttm/ttm_bo_driver.h|  9 -
>   include/uapi/drm/amdgpu_drm.h  |  3 +++
>   7 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> index 5cbde74b97dd..70a6baf20c22 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> @@ -144,6 +144,7 @@ static int amdgpufb_create_pinned_object(struct 
> amdgpu_fbdev *rfbdev,
>size = mode_cmd->pitches[0] * height;
>aligned_size = ALIGN(size, PAGE_SIZE);
>ret = amdgpu_gem_object_create(adev, aligned_size, 0, domain,
> +TTM_BO_PRIORITY_NORMAL,
>   AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
>   AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
>   AMDGPU_GEM_CREATE_VRAM_CLEARED,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index d21dd2f369da..7c1c2362c67e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -44,6 +44,7 @@ void amdgpu_gem_object_free(struct drm_gem_object *gobj)
>
>   int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
> int alignment, u32 initial_domain,
> +  enum ttm_bo_priority priority,
> u64 flags, enum ttm_bo_type type,
> struct reservation_object *resv,
> struct drm_gem_object **obj)
> @@ -60,6 +61,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, 
> unsigned long size,
>bp.type = type;
>bp.resv = resv;
>bp.preferred_domain = initial_domain;
> + bp.priority = priority;
>   retry:
>bp.flags = flags;
>bp.domain = initial_domain;
> @@ -229,6 +231,14 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
> *data,
>if (args->in.domains & ~AMDGPU_GEM_DOMAIN_MASK)
>return -EINVAL;
>
> + /* check priority */
> + if (args->in.priority == 0) {
> + /* default is normal */
> + args->in.priority = TTM_BO_PRIORITY_NORMAL;
> + } else if (args->in.priority > TTM_MAX_BO_PRIORITY) {
> + args->in.priority = TTM_MAX_BO_PRIORITY;
> + DRM_ERROR("priority specified from user space is over MAX 
> priority\n");
> + }
>/* create a gem object to contain this object in */
>if (args->in.domains & (AMDGPU_GEM_DOMAIN_GDS |
>AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) {
> @@ -252,6 +262,7 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
> *data,
>
>r = amdgpu_gem_object_create(adev, size, args->in.alignment,
> (u32)(0x & args->in.domains),
> +  args->in.priority - 1,
> flags, ttm_bo_type_device, resv, );
>if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
>if (!r) {
> @@ -304,6 +315,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void 
> *data,
>
>/* create a gem object to contain this object in */
>r = amdgpu_gem_object_create(adev, args->size, 0, 
> AMDGPU_GEM_DOMAIN_CPU,
> +  TTM_BO_PRIORITY_NORMAL,
> 0, ttm_bo_type_device, NULL, );
>if (r)
>return r;
> @@ -755,6 +

RE: [PATCH] drm/amdgpu: force to use CPU_ACCESS hint optimization

2019-03-06 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Christian König 
> Sent: Wednesday, March 06, 2019 7:55 PM
> To: Zhou, David(ChunMing) ; Koenig, Christian
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: force to use CPU_ACCESS hint
> optimization
> 
> Am 06.03.19 um 12:52 schrieb Chunming Zhou:
> > As we know, visible vram can be placed to invisible when no cpu access.
> >
> > Signed-off-by: Chunming Zhou 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 8 +++-
> >   1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> > index bc62bf41b7e9..823deb66f5da 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> > @@ -592,8 +592,7 @@ static int amdgpu_info_ioctl(struct drm_device
> > *dev, void *data, struct drm_file
> >
> > vram_gtt.vram_size = adev->gmc.real_vram_size -
> > atomic64_read(>vram_pin_size);
> > -   vram_gtt.vram_cpu_accessible_size = adev-
> >gmc.visible_vram_size -
> > -   atomic64_read(>visible_pin_size);
> > +   vram_gtt.vram_cpu_accessible_size = vram_gtt.vram_size;
> 
> Well, NAK that would of course report the full VRAM as visible which isn't
> correct.

UMD also gave the same reason: they would like to report explicit VRAM info to 
the application.
No idea how else to do that.

-David
> 
> Christian.
> 
> > vram_gtt.gtt_size = adev-
> >mman.bdev.man[TTM_PL_TT].size;
> > vram_gtt.gtt_size *= PAGE_SIZE;
> > vram_gtt.gtt_size -= atomic64_read(>gart_pin_size);
> > @@ -612,9 +611,8 @@ static int amdgpu_info_ioctl(struct drm_device
> *dev, void *data, struct drm_file
> > mem.vram.max_allocation = mem.vram.usable_heap_size *
> 3 / 4;
> >
> > mem.cpu_accessible_vram.total_heap_size =
> > -   adev->gmc.visible_vram_size;
> > -   mem.cpu_accessible_vram.usable_heap_size = adev-
> >gmc.visible_vram_size -
> > -   atomic64_read(>visible_pin_size);
> > +   mem.vram.total_heap_size;
> > +   mem.cpu_accessible_vram.usable_heap_size =
> > +mem.vram.usable_heap_size;
> > mem.cpu_accessible_vram.heap_usage =
> > amdgpu_vram_mgr_vis_usage(
> >mman.bdev.man[TTM_PL_VRAM]);
> > mem.cpu_accessible_vram.max_allocation =

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 1/3] drm/amdgpu: change Vega IH ring 1 config

2019-03-06 Thread Zhou, David(ChunMing)
Acked-by: Chunming Zhou 


> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian K?nig
> Sent: Wednesday, March 06, 2019 5:29 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/3] drm/amdgpu: change Vega IH ring 1 config
> 
> Ping? Can anybody review this?
> 
> Thanks,
> Christian.
> 
> Am 04.03.19 um 20:10 schrieb Christian König:
> > Disable overflow and enable full drain. This makes fault handling on
> > ring 1 much more reliable since we don't generate back pressure any more.
> >
> > Signed-off-by: Christian König 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 4 
> >   1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> > b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> > index 6d1f804277f8..d4a3cc413af8 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> > @@ -203,6 +203,10 @@ static int vega10_ih_irq_init(struct
> > amdgpu_device *adev)
> >
> > ih_rb_cntl = RREG32_SOC15(OSSSYS, 0,
> mmIH_RB_CNTL_RING1);
> > ih_rb_cntl = vega10_ih_rb_cntl(ih, ih_rb_cntl);
> > +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
> > +  WPTR_OVERFLOW_ENABLE, 0);
> > +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
> > +  RB_FULL_DRAIN_ENABLE, 1);
> > WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1,
> ih_rb_cntl);
> >
> > /* set rptr, wptr to 0 */
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: Error handling issues about CHECKED_RETURN

2019-02-13 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Bo YU 
> Sent: Thursday, February 14, 2019 12:46 PM
> To: Deucher, Alexander ; Koenig, Christian
> ; Zhou, David(ChunMing)
> ; airl...@linux.ie; dan...@ffwll.ch; Zhu, Rex
> ; Grodzovsky, Andrey
> ; dri-de...@lists.freedesktop.org; linux-
> ker...@vger.kernel.org
> Cc: Bo Yu ; amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: Error handling issues about
> CHECKED_RETURN
> 
> From: Bo Yu 
> 
> Calling "amdgpu_ring_test_helper" without checking return value

We need to continue the ring tests even if one ring test failed.
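
Something like this untested sketch is what I have in mind - keep testing all
rings, but remember the first failure:

	int r, ret = 0;

	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
		ring = &adev->gfx.compute_ring[i];
		r = amdgpu_ring_test_helper(ring);
		if (r && !ret)
			ret = r;	/* remember the first error, keep testing */
	}

	return ret;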

-David

> 
> Signed-off-by: Bo Yu 
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> index 57cb3a51bda7..48465a61516b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> @@ -4728,7 +4728,9 @@ static int gfx_v8_0_cp_test_all_rings(struct
> amdgpu_device *adev)
> 
>   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
>   ring = >gfx.compute_ring[i];
> - amdgpu_ring_test_helper(ring);
> + r = amdgpu_ring_test_helper(ring);
> + if (r)
> + return r;
>   }
> 
>   return 0;
> --
> 2.11.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: partial revert cleanup setting bulk_movable v2

2019-01-31 Thread Zhou, David(ChunMing)
If Tom tests it OK as well, feel free to add my RB and submit it ASAP.

-David

> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian K?nig
> Sent: Thursday, January 31, 2019 3:57 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: partial revert cleanup setting bulk_movable
> v2
> 
> We still need to set bulk_movable to false when new BOs are added or
> removed.
> 
> v2: also set it to false on removal
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 79f9dde70bc0..822546a149fa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -332,6 +332,7 @@ static void amdgpu_vm_bo_base_init(struct
> amdgpu_vm_bo_base *base,
>   if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
>   return;
> 
> + vm->bulk_moveable = false;
>   if (bo->tbo.type == ttm_bo_type_kernel)
>   amdgpu_vm_bo_relocated(base);
>   else
> @@ -2772,6 +2773,9 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device
> *adev,
>   struct amdgpu_vm_bo_base **base;
> 
>   if (bo) {
> + if (bo->tbo.resv == vm->root.base.bo->tbo.resv)
> + vm->bulk_moveable = false;
> +
>   for (base = _va->base.bo->vm_bo; *base;
>base = &(*base)->next) {
>   if (*base != _va->base)
> --
> 2.17.1
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


回复:[PATCH 2/2] drm/amdgpu: cleanup setting bulk_movable

2019-01-28 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou 


send from my phone

 原始邮件 
主题:[PATCH 2/2] drm/amdgpu: cleanup setting bulk_movable
发件人:Christian König
收件人:amd-gfx@lists.freedesktop.org
抄送:

We only need to set this to false now when BOs are removed from the LRU.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a404ac17e5ae..79f9dde70bc0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -332,7 +332,6 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base 
*base,
 if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
 return;

-   vm->bulk_moveable = false;
 if (bo->tbo.type == ttm_bo_type_kernel)
 amdgpu_vm_bo_relocated(base);
 else
@@ -698,8 +697,6 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
 struct amdgpu_vm_bo_base *bo_base, *tmp;
 int r = 0;

-   vm->bulk_moveable &= list_empty(>evicted);
-
 list_for_each_entry_safe(bo_base, tmp, >evicted, vm_status) {
 struct amdgpu_bo *bo = bo_base->bo;

@@ -2775,9 +2772,6 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
 struct amdgpu_vm_bo_base **base;

 if (bo) {
-   if (bo->tbo.resv == vm->root.base.bo->tbo.resv)
-   vm->bulk_moveable = false;
-
 for (base = _va->base.bo->vm_bo; *base;
  base = &(*base)->next) {
 if (*base != _va->base)
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH libdrm] amdgpu: add a faster BO list API

2019-01-07 Thread Zhou, David(ChunMing)
Looks good to me, Reviewed-by: Chunming Zhou 
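
For context, usage of the new raw API would presumably look roughly like the
sketch below (bo_handle being the KMS handle of the BO, e.g. obtained via
amdgpu_bo_export(); handles here are hypothetical and error handling is omitted):

	struct drm_amdgpu_bo_list_entry entries[2] = {};
	uint32_t bo_list;
	int r;

	entries[0].bo_handle = kms_handle0;
	entries[1].bo_handle = kms_handle1;

	r = amdgpu_bo_list_create_raw(dev, 2, entries, &bo_list);
	/* ... pass bo_list to amdgpu_cs_submit_raw2() for the submission ... */
	r = amdgpu_bo_list_destroy_raw(dev, bo_list);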

> -Original Message-
> From: amd-gfx  On Behalf Of
> Marek Ol?ák
> Sent: Tuesday, January 08, 2019 3:31 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH libdrm] amdgpu: add a faster BO list API
> 
> From: Marek Olšák 
> 
> ---
>  amdgpu/amdgpu-symbol-check |  3 ++
>  amdgpu/amdgpu.h| 56
> +-
>  amdgpu/amdgpu_bo.c | 36 
>  amdgpu/amdgpu_cs.c | 25 +
>  4 files changed, 119 insertions(+), 1 deletion(-)
> 
> diff --git a/amdgpu/amdgpu-symbol-check b/amdgpu/amdgpu-symbol-
> check index 6f5e0f95..96a44b40 100755
> --- a/amdgpu/amdgpu-symbol-check
> +++ b/amdgpu/amdgpu-symbol-check
> @@ -12,20 +12,22 @@ _edata
>  _end
>  _fini
>  _init
>  amdgpu_bo_alloc
>  amdgpu_bo_cpu_map
>  amdgpu_bo_cpu_unmap
>  amdgpu_bo_export
>  amdgpu_bo_free
>  amdgpu_bo_import
>  amdgpu_bo_inc_ref
> +amdgpu_bo_list_create_raw
> +amdgpu_bo_list_destroy_raw
>  amdgpu_bo_list_create
>  amdgpu_bo_list_destroy
>  amdgpu_bo_list_update
>  amdgpu_bo_query_info
>  amdgpu_bo_set_metadata
>  amdgpu_bo_va_op
>  amdgpu_bo_va_op_raw
>  amdgpu_bo_wait_for_idle
>  amdgpu_create_bo_from_user_mem
>  amdgpu_cs_chunk_fence_info_to_data
> @@ -40,20 +42,21 @@ amdgpu_cs_destroy_semaphore
> amdgpu_cs_destroy_syncobj  amdgpu_cs_export_syncobj
> amdgpu_cs_fence_to_handle  amdgpu_cs_import_syncobj
> amdgpu_cs_query_fence_status  amdgpu_cs_query_reset_state
> amdgpu_query_sw_info  amdgpu_cs_signal_semaphore
> amdgpu_cs_submit  amdgpu_cs_submit_raw
> +amdgpu_cs_submit_raw2
>  amdgpu_cs_syncobj_export_sync_file
>  amdgpu_cs_syncobj_import_sync_file
>  amdgpu_cs_syncobj_reset
>  amdgpu_cs_syncobj_signal
>  amdgpu_cs_syncobj_wait
>  amdgpu_cs_wait_fences
>  amdgpu_cs_wait_semaphore
>  amdgpu_device_deinitialize
>  amdgpu_device_initialize
>  amdgpu_find_bo_by_cpu_mapping
> diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h index
> dc51659a..5b800033 100644
> --- a/amdgpu/amdgpu.h
> +++ b/amdgpu/amdgpu.h
> @@ -35,20 +35,21 @@
>  #define _AMDGPU_H_
> 
>  #include 
>  #include 
> 
>  #ifdef __cplusplus
>  extern "C" {
>  #endif
> 
>  struct drm_amdgpu_info_hw_ip;
> +struct drm_amdgpu_bo_list_entry;
> 
>  
> /*--*/
>  /* --- Defines  
> */  /*-
> -*/
> 
>  /**
>   * Define max. number of Command Buffers (IB) which could be sent to the
> single
>   * hardware IP to accommodate CE/DE requirements
>   *
>   * \sa amdgpu_cs_ib_info
> @@ -767,34 +768,65 @@ int amdgpu_bo_cpu_unmap(amdgpu_bo_handle
> buf_handle);
>   *and no GPU access is scheduled.
>   *  1 GPU access is in fly or scheduled
>   *
>   * \return   0 - on success
>   *  <0 - Negative POSIX Error code
>   */
>  int amdgpu_bo_wait_for_idle(amdgpu_bo_handle buf_handle,
>   uint64_t timeout_ns,
>   bool *buffer_busy);
> 
> +/**
> + * Creates a BO list handle for command submission.
> + *
> + * \param   dev  - \c [in] Device handle.
> + *  See #amdgpu_device_initialize()
> + * \param   number_of_buffers- \c [in] Number of BOs in the list
> + * \param   buffers  - \c [in] List of BO handles
> + * \param   result   - \c [out] Created BO list handle
> + *
> + * \return   0 on success\n
> + *  <0 - Negative POSIX Error code
> + *
> + * \sa amdgpu_bo_list_destroy_raw()
> +*/
> +int amdgpu_bo_list_create_raw(amdgpu_device_handle dev,
> +   uint32_t number_of_buffers,
> +   struct drm_amdgpu_bo_list_entry *buffers,
> +   uint32_t *result);
> +
> +/**
> + * Destroys a BO list handle.
> + *
> + * \param   bo_list  - \c [in] BO list handle.
> + *
> + * \return   0 on success\n
> + *  <0 - Negative POSIX Error code
> + *
> + * \sa amdgpu_bo_list_create_raw(), amdgpu_cs_submit_raw2() */ int
> +amdgpu_bo_list_destroy_raw(amdgpu_device_handle dev, uint32_t
> bo_list);
> +
>  /**
>   * Creates a BO list handle for command submission.
>   *
>   * \param   dev  - \c [in] Device handle.
>   *  See #amdgpu_device_initialize()
>   * \param   number_of_resources  - \c [in] Number of BOs in the list
>   * \param   resources- \c [in] List of BO handles
>   * \param   resource_prios   - \c [in] Optional priority for each handle
>   * \param   result   - \c [out] Created BO list handle
>   *
>   * \return   0 on success\n
>   *  <0 - Negative POSIX Error code
>   *
> - * \sa amdgpu_bo_list_destroy()
> + * \sa amdgpu_bo_list_destroy(), amdgpu_cs_submit_raw2()
>  */
>  int amdgpu_bo_list_create(amdgpu_device_handle dev,
> 

RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt accessed

2019-01-03 Thread Zhou, David(ChunMing)
Doesn't the GPU check the PTE PRT bit first and then access the VA range?

Even for writes to the dummy page, it seems there is still no problem; we don't 
care about that content at all.

-David

> -Original Message-
> From: Christian König 
> Sent: Thursday, January 03, 2019 5:54 PM
> To: Zhou, David(ChunMing) ; Koenig, Christian
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt
> accessed
> 
> Writes are then not ignored and garbage the dummy page.
> 
> Christian.
> 
> Am 03.01.19 um 10:46 schrieb Zhou, David(ChunMing):
> > Seems we don't need two page table, we just map every prt range to
> dummy page, any problem?
> >
> > -David
> >
> >> -Original Message-
> >> From: Zhou, David(ChunMing)
> >> Sent: Thursday, January 03, 2019 5:23 PM
> >> To: Koenig, Christian ; amd-
> >> g...@lists.freedesktop.org
> >> Subject: RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt
> >> accessed
> >>
> >>
> >>
> >>> -Original Message-
> >>> From: Christian König 
> >>> Sent: Thursday, January 03, 2019 5:05 PM
> >>> To: Zhou, David(ChunMing) ; Koenig, Christian
> >>> ; Zhou, David(ChunMing)
> >>> ; amd-gfx@lists.freedesktop.org
> >>> Subject: Re: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt
> >>> accessed
> >>>
> >>> Yes, exactly.
> >>>
> >>> Problem is that we then probably need two page tables. One for the
> >>> CB/TC and one for the SDMA.
> >> But when setup page table, how can we know the client is CB/TC or SDMA?
> >>
> >> -David
> >>
> >>> Christian.
> >>>
> >>> Am 03.01.19 um 10:02 schrieb zhoucm1:
> >>>> need dummy page for that?
> >>>>
> >>>>
> >>>> -David
> >>>>
> >>>>
> >>>> On 2019年01月03日 17:01, Christian König wrote:
> >>>>> NAK, the problem is not the interrupt.
> >>>>>
> >>>>> E.g. causing faults by accessing unmapped pages with the SDMA can
> >>>>> still crash the MC.
> >>>>>
> >>>>> The key point is that SDMA can't work with PRT tiles on pre-gmc9
> >>>>> and we need to forbid access on the application side.
> >>>>>
> >>>>> Regards,
> >>>>> Christian.
> >>>>>
> >>>>> Am 03.01.19 um 09:54 schrieb Chunming Zhou:
> >>>>>> For pre-gmc9, UMD can only access unmapped PRT tile from CB/TC
> >>>>>> without firing VM fault. Kernel would still receive the VM fault
> >>>>>> interrupt and output the error message if SDMA is the mc_client.
> >>>>>> GMC9 don't need the same since it handle the PRT in different way.
> >>>>>> We cannot just skip message for SDMA, as Christian pointed, VM
> >>>>>> fault could crash mc block, so we disable vm fault irq during prt
> >>>>>> range is accesed.
> >>>>>> The nagative is normal vm fault could be ignored during that
> >>>>>> peroid without enabling vm_debug kernel parameter.
> >>>>>>
> >>>>>> Change-Id: Ic3c62393768eca90e3e45eaf81e7f26f2e91de84
> >>>>>> Signed-off-by: Chunming Zhou 
> >>>>>> ---
> >>>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 6 ++
> >>>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 6 ++
> >>>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 6 ++
> >>>>>>    3 files changed, 18 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> >>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> >>>>>> index dae73f6768c2..175c4b319559 100644
> >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> >>>>>> @@ -486,6 +486,10 @@ static void gmc_v6_0_set_prt(struct
> >>>>>> amdgpu_device *adev, bool enable)
> >>>>>>    WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, high);
> >>>>>>    WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, high);
> >>>>>>    WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, high);
> >>>>>> +    /* Note: whe

RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt accessed

2019-01-03 Thread Zhou, David(ChunMing)
It seems we don't need two page tables; we can just map every PRT range to the 
dummy page. Any problem with that?

-David

> -Original Message-
> From: Zhou, David(ChunMing)
> Sent: Thursday, January 03, 2019 5:23 PM
> To: Koenig, Christian ; amd-
> g...@lists.freedesktop.org
> Subject: RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt
> accessed
> 
> 
> 
> > -Original Message-
> > From: Christian König 
> > Sent: Thursday, January 03, 2019 5:05 PM
> > To: Zhou, David(ChunMing) ; Koenig, Christian
> > ; Zhou, David(ChunMing)
> > ; amd-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt
> > accessed
> >
> > Yes, exactly.
> >
> > Problem is that we then probably need two page tables. One for the
> > CB/TC and one for the SDMA.
> 
> But when setup page table, how can we know the client is CB/TC or SDMA?
> 
> -David
> 
> >
> > Christian.
> >
> > Am 03.01.19 um 10:02 schrieb zhoucm1:
> > > need dummy page for that?
> > >
> > >
> > > -David
> > >
> > >
> > > On 2019年01月03日 17:01, Christian König wrote:
> > >> NAK, the problem is not the interrupt.
> > >>
> > >> E.g. causing faults by accessing unmapped pages with the SDMA can
> > >> still crash the MC.
> > >>
> > >> The key point is that SDMA can't work with PRT tiles on pre-gmc9
> > >> and we need to forbid access on the application side.
> > >>
> > >> Regards,
> > >> Christian.
> > >>
> > >> Am 03.01.19 um 09:54 schrieb Chunming Zhou:
> > >>> For pre-gmc9, UMD can only access unmapped PRT tile from CB/TC
> > >>> without firing VM fault. Kernel would still receive the VM fault
> > >>> interrupt and output the error message if SDMA is the mc_client.
> > >>> GMC9 don't need the same since it handle the PRT in different way.
> > >>> We cannot just skip message for SDMA, as Christian pointed, VM
> > >>> fault could crash mc block, so we disable vm fault irq during prt
> > >>> range is accesed.
> > >>> The nagative is normal vm fault could be ignored during that
> > >>> peroid without enabling vm_debug kernel parameter.
> > >>>
> > >>> Change-Id: Ic3c62393768eca90e3e45eaf81e7f26f2e91de84
> > >>> Signed-off-by: Chunming Zhou 
> > >>> ---
> > >>>   drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 6 ++
> > >>>   drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 6 ++
> > >>>   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 6 ++
> > >>>   3 files changed, 18 insertions(+)
> > >>>
> > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> > >>> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> > >>> index dae73f6768c2..175c4b319559 100644
> > >>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> > >>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> > >>> @@ -486,6 +486,10 @@ static void gmc_v6_0_set_prt(struct
> > >>> amdgpu_device *adev, bool enable)
> > >>>   WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, high);
> > >>>   WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, high);
> > >>>   WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, high);
> > >>> +    /* Note: when vm_debug enabled, vm fault from SDMAx
> > >>> +accessing
> > >>> + * PRT range is normal. */
> > >>> +    if (!amdgpu_vm_debug)
> > >>> +    amdgpu_irq_put(adev, >gmc.vm_fault, 0);
> > >>>   } else {
> > >>>   WREG32(mmVM_PRT_APERTURE0_LOW_ADDR, 0xfff);
> > >>>   WREG32(mmVM_PRT_APERTURE1_LOW_ADDR, 0xfff); @@ -
> > 495,6
> > >>> +499,8 @@ static void gmc_v6_0_set_prt(struct amdgpu_device
> *adev,
> > >>> bool enable)
> > >>>   WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, 0x0);
> > >>>   WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, 0x0);
> > >>>   WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, 0x0);
> > >>> +    if (!amdgpu_vm_debug)
> > >>> +    amdgpu_irq_get(adev, >gmc.vm_fault, 0);
> > >>>   }
> > >>>   }
> > >>>   diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> > >>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> > >>> index 5bdeb358bfb5..a4d6d219

RE: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt accessed

2019-01-03 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Christian König 
> Sent: Thursday, January 03, 2019 5:05 PM
> To: Zhou, David(ChunMing) ; Koenig, Christian
> ; Zhou, David(ChunMing)
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 2/2] drm/amdgpu: disable vm fault irq during prt
> accessed
> 
> Yes, exactly.
> 
> Problem is that we then probably need two page tables. One for the CB/TC
> and one for the SDMA.

But when setting up the page table, how can we know whether the client is CB/TC or SDMA?

-David

> 
> Christian.
> 
> Am 03.01.19 um 10:02 schrieb zhoucm1:
> > need dummy page for that?
> >
> >
> > -David
> >
> >
> > On 2019年01月03日 17:01, Christian König wrote:
> >> NAK, the problem is not the interrupt.
> >>
> >> E.g. causing faults by accessing unmapped pages with the SDMA can
> >> still crash the MC.
> >>
> >> The key point is that SDMA can't work with PRT tiles on pre-gmc9 and
> >> we need to forbid access on the application side.
> >>
> >> Regards,
> >> Christian.
> >>
> >> Am 03.01.19 um 09:54 schrieb Chunming Zhou:
> >>> For pre-gmc9, UMD can only access unmapped PRT tile from CB/TC
> >>> without firing VM fault. Kernel would still receive the VM fault
> >>> interrupt and output the error message if SDMA is the mc_client.
> >>> GMC9 don't need the same since it handle the PRT in different way.
> >>> We cannot just skip message for SDMA, as Christian pointed, VM fault
> >>> could crash mc block, so we disable vm fault irq during prt range is
> >>> accesed.
> >>> The nagative is normal vm fault could be ignored during that peroid
> >>> without enabling vm_debug kernel parameter.
> >>>
> >>> Change-Id: Ic3c62393768eca90e3e45eaf81e7f26f2e91de84
> >>> Signed-off-by: Chunming Zhou 
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 6 ++
> >>>   drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 6 ++
> >>>   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 6 ++
> >>>   3 files changed, 18 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> >>> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> >>> index dae73f6768c2..175c4b319559 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> >>> @@ -486,6 +486,10 @@ static void gmc_v6_0_set_prt(struct
> >>> amdgpu_device *adev, bool enable)
> >>>   WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, high);
> >>>   WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, high);
> >>>   WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, high);
> >>> +    /* Note: when vm_debug enabled, vm fault from SDMAx
> >>> +accessing
> >>> + * PRT range is normal. */
> >>> +    if (!amdgpu_vm_debug)
> >>> +    amdgpu_irq_put(adev, >gmc.vm_fault, 0);
> >>>   } else {
> >>>   WREG32(mmVM_PRT_APERTURE0_LOW_ADDR, 0xfff);
> >>>   WREG32(mmVM_PRT_APERTURE1_LOW_ADDR, 0xfff); @@ -
> 495,6
> >>> +499,8 @@ static void gmc_v6_0_set_prt(struct amdgpu_device *adev,
> >>> bool enable)
> >>>   WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, 0x0);
> >>>   WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, 0x0);
> >>>   WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, 0x0);
> >>> +    if (!amdgpu_vm_debug)
> >>> +    amdgpu_irq_get(adev, >gmc.vm_fault, 0);
> >>>   }
> >>>   }
> >>>   diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> >>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> >>> index 5bdeb358bfb5..a4d6d219f4e8 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> >>> @@ -582,6 +582,10 @@ static void gmc_v7_0_set_prt(struct
> >>> amdgpu_device *adev, bool enable)
> >>>   WREG32(mmVM_PRT_APERTURE1_HIGH_ADDR, high);
> >>>   WREG32(mmVM_PRT_APERTURE2_HIGH_ADDR, high);
> >>>   WREG32(mmVM_PRT_APERTURE3_HIGH_ADDR, high);
> >>> +    /* Note: when vm_debug enabled, vm fault from SDMAx
> >>> +accessing
> >>> + * PRT range is normal. */
> >>> +    if (!amdgpu_vm_debug)
> >>> +    amdgpu_irq_put(adev, >gmc.vm_fault, 0);
> >>>   } else {
> >>>    

RE: [Intel-gfx] [PATCH 03/10] drm/syncobj: add new drm_syncobj_add_point interface v2

2018-12-12 Thread Zhou, David(ChunMing)
+ Daniel Rakos and Jason Ekstrand.

Below is the background from Daniel R, which should explain why:
" ISVs, especially those coming from D3D12, are unsatisfied with the behavior 
of the Vulkan semaphores as they are unhappy with the fact that for every 
single dependency they need to use separate semaphores due to their binary 
nature.
Compared to that a synchronization primitive like D3D12 monitored fences enable 
one of those to be used to track a sequence of operations by simply associating 
timeline values to the completion of individual operations. This allows them to 
track the lifetime and usage of resources and the ordered completion of 
sequences.
Besides that, they also want to use a single synchronization primitive to be 
able to handle GPU-to-GPU and GPU-to-CPU dependencies, compared to using 
semaphores for the former and fences for the latter.
In addition, compared to legacy semaphores, timeline semaphores are proposed to 
support wait-before-signal, i.e. allow enqueueing a semaphore wait operation 
with a wait value that is larger than any of the already enqueued signal 
values. This seems to be a hard requirement for ISVs without UMD-side queue 
batching, and even UMD-side queue batching doesn't help the situation when such 
a semaphore is externally shared with another API. Thus in order to properly 
support wait-before-signal the KMD implementation has to also be able to 
support such dependencies.
"

Btw, we already added a test case to IGT, and the series has been tested with 
many existing tests, like the libdrm unit tests, the IGT tests, the Vulkan CTS, 
and Steam games.

-David
> -Original Message-
> From: Daniel Vetter 
> Sent: Wednesday, December 12, 2018 7:15 PM
> To: Koenig, Christian 
> Cc: Zhou, David(ChunMing) ; dri-devel  de...@lists.freedesktop.org>; amd-gfx list ;
> intel-gfx ; Christian König
> 
> Subject: Re: [Intel-gfx] [PATCH 03/10] drm/syncobj: add new
> drm_syncobj_add_point interface v2
> 
> On Wed, Dec 12, 2018 at 12:08 PM Koenig, Christian
>  wrote:
> >
> > Am 12.12.18 um 11:49 schrieb Daniel Vetter:
> > > On Fri, Dec 07, 2018 at 11:54:15PM +0800, Chunming Zhou wrote:
> > >> From: Christian König 
> > >>
> > >> Use the dma_fence_chain object to create a timeline of fence
> > >> objects instead of just replacing the existing fence.
> > >>
> > >> v2: rebase and cleanup
> > >>
> > >> Signed-off-by: Christian König 
> > > Somewhat jumping back into this. Not sure we discussed this already
> > > or not. I'm a bit unclear on why we have to chain the fences in the
> timeline:
> > >
> > > - The timeline stuff is modelled after the WDDM2 monitored fences.
> Which
> > >really are just u64 counters in memory somewhere (I think could be
> > >system ram or vram). Because WDDM2 has the memory management
> entirely
> > >separated from rendering synchronization it totally allows userspace to
> > >create loops and deadlocks and everything else nasty using this - the
> > >memory manager won't deadlock because these monitored fences
> never leak
> > >into the buffer manager. And if CS deadlock, gpu reset takes care of 
> > > the
> > >mess.
> > >
> > > - This has a few consequences, as in they seem to indeed work like a
> > >memory location: Userspace incrementing out-of-order (because they
> run
> > >batches updating the same fence on different engines) is totally fine,
> > >as is doing anything else "stupid".
> > >
> > > - Now on linux we can't allow anything, because we need to make sure
> that
> > >deadlocks don't leak into the memory manager. But as long as we block
> > >until the underlying dma_fence has materialized, nothing userspace can
> > >do will lead to such a deadlock. Even if userspace ends up submitting
> > >jobs without enough built-in synchronization, leading to out-of-order
> > >signalling of fences on that "timeline". And I don't think that would
> > >pose a problem for us.
> > >
> > > Essentially I think we can look at timeline syncobj as a dma_fence
> > > container indexed through an integer, and there's no need to enforce
> > > that the timline works like a real dma_fence timeline, with all it's
> > > guarantees. It's just a pile of (possibly, if userspace is stupid)
> > > unrelated dma_fences. You could implement the entire thing in
> > > userspace after all, except for the "we want to share these timeline
> > > objects between processes" problem.
> > >
> > > tldr; I think

RE: [PATCH 01/10] dma-buf: add new dma_fence_chain container v4

2018-12-11 Thread Zhou, David(ChunMing)
Hi Daniel and Chris,

Could you take a look at all the patches? Can we get your RB or AB on all 
patches, including the IGT patch, before we submit to drm-misc?

We already fixed all existing issues, and also added a test case in IGT as you 
required.

Btw, the patch set is tested by below tests:
a. vulkan cts  " ./deqp-vk -n dEQP-VK. *semaphore*" 
b. internal vulkan timeline test
c. libdrm test "sudo ./amdgpu_test -s 9"
d. IGT test, "sudo ./syncobj_basic"
e. IGT test, "sudo ./syncobj_wait"
f. IGT test, "sudo ./syncobj_timeline"

Any other suggestion or requirement is welcome.

-David

> -Original Message-
> From: dri-devel  On Behalf Of
> Chunming Zhou
> Sent: Tuesday, December 11, 2018 6:35 PM
> To: Koenig, Christian ; dri-
> de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; intel-
> g...@lists.freedesktop.org
> Cc: Christian König ; Koenig, Christian
> 
> Subject: [PATCH 01/10] dma-buf: add new dma_fence_chain container v4
> 
> From: Christian König 
> 
> Lockless container implementation similar to a dma_fence_array, but with
> only two elements per node and automatic garbage collection.
> 
> v2: properly document dma_fence_chain_for_each, add
> dma_fence_chain_find_seqno,
> drop prev reference during garbage collection if it's not a chain fence.
> v3: use head and iterator for dma_fence_chain_for_each
> v4: fix reference count in dma_fence_chain_enable_signaling
> 
> Signed-off-by: Christian König 
> ---
>  drivers/dma-buf/Makefile  |   3 +-
>  drivers/dma-buf/dma-fence-chain.c | 241
> ++
>  include/linux/dma-fence-chain.h   |  81 ++
>  3 files changed, 324 insertions(+), 1 deletion(-)  create mode 100644
> drivers/dma-buf/dma-fence-chain.c  create mode 100644 include/linux/dma-
> fence-chain.h
> 
> diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index
> 0913a6ccab5a..1f006e083eb9 100644
> --- a/drivers/dma-buf/Makefile
> +++ b/drivers/dma-buf/Makefile
> @@ -1,4 +1,5 @@
> -obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-
> fence.o
> +obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
> +  reservation.o seqno-fence.o
>  obj-$(CONFIG_SYNC_FILE)  += sync_file.o
>  obj-$(CONFIG_SW_SYNC)+= sw_sync.o sync_debug.o
>  obj-$(CONFIG_UDMABUF)+= udmabuf.o
> diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-
> fence-chain.c
> new file mode 100644
> index ..0c5e3c902fa0
> --- /dev/null
> +++ b/drivers/dma-buf/dma-fence-chain.c
> @@ -0,0 +1,241 @@
> +/*
> + * fence-chain: chain fences together in a timeline
> + *
> + * Copyright (C) 2018 Advanced Micro Devices, Inc.
> + * Authors:
> + *   Christian König 
> + *
> + * This program is free software; you can redistribute it and/or modify
> +it
> + * under the terms of the GNU General Public License version 2 as
> +published by
> + * the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> +WITHOUT
> + * ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY
> +or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +License for
> + * more details.
> + */
> +
> +#include 
> +
> +static bool dma_fence_chain_enable_signaling(struct dma_fence *fence);
> +
> +/**
> + * dma_fence_chain_get_prev - use RCU to get a reference to the
> +previous fence
> + * @chain: chain node to get the previous node from
> + *
> + * Use dma_fence_get_rcu_safe to get a reference to the previous fence
> +of the
> + * chain node.
> + */
> +static struct dma_fence *dma_fence_chain_get_prev(struct
> +dma_fence_chain *chain) {
> + struct dma_fence *prev;
> +
> + rcu_read_lock();
> + prev = dma_fence_get_rcu_safe(&chain->prev);
> + rcu_read_unlock();
> + return prev;
> +}
> +
> +/**
> + * dma_fence_chain_walk - chain walking function
> + * @fence: current chain node
> + *
> + * Walk the chain to the next node. Returns the next fence or NULL if
> +we are at
> + * the end of the chain. Garbage collects chain nodes which are already
> + * signaled.
> + */
> +struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence) {
> + struct dma_fence_chain *chain, *prev_chain;
> + struct dma_fence *prev, *replacement, *tmp;
> +
> + chain = to_dma_fence_chain(fence);
> + if (!chain) {
> + dma_fence_put(fence);
> + return NULL;
> + }
> +
> + while ((prev = dma_fence_chain_get_prev(chain))) {
> +
> + prev_chain = to_dma_fence_chain(prev);
> + if (prev_chain) {
> + if (!dma_fence_is_signaled(prev_chain->fence))
> + break;
> +
> + replacement =
> dma_fence_chain_get_prev(prev_chain);
> + } else {
> + if (!dma_fence_is_signaled(prev))
> + break;
> +
> + replacement = NULL;
> + }
> +
> + tmp 

RE: [PATCH v3 2/2] drm/sched: Rework HW fence processing.

2018-12-10 Thread Zhou, David(ChunMing)
I don't think adding the callback to the sched job will work, since its
lifetime is different from the fence's.
Unless you make the sched job reference counted, we will run into trouble
sooner or later.

-David

> -Original Message-
> From: amd-gfx  On Behalf Of
> Andrey Grodzovsky
> Sent: Tuesday, December 11, 2018 5:44 AM
> To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
> ckoenig.leichtzumer...@gmail.com; e...@anholt.net;
> etna...@lists.freedesktop.org
> Cc: Zhou, David(ChunMing) ; Liu, Monk
> ; Grodzovsky, Andrey
> 
> Subject: [PATCH v3 2/2] drm/sched: Rework HW fence processing.
> 
> Expedite job deletion from ring mirror list to the HW fence signal callback
> instead from finish_work, together with waiting for all such fences to signal 
> in
> drm_sched_stop we garantee that already signaled job will not be processed
> twice.
> Remove the sched finish fence callback and just submit finish_work directly
> from the HW fence callback.
> 
> v2: Fix comments.
> 
> v3: Attach  hw fence cb to sched_job
> 
> Suggested-by: Christian Koenig 
> Signed-off-by: Andrey Grodzovsky 
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 58 --
> 
>  include/drm/gpu_scheduler.h|  6 ++--
>  2 files changed, 30 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> b/drivers/gpu/drm/scheduler/sched_main.c
> index cdf95e2..f0c1f32 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -284,8 +284,6 @@ static void drm_sched_job_finish(struct work_struct
> *work)
>   cancel_delayed_work_sync(>work_tdr);
> 
>   spin_lock_irqsave(>job_list_lock, flags);
> - /* remove job from ring_mirror_list */
> - list_del_init(&s_job->node);
>   /* queue TDR for next job */
>   drm_sched_start_timeout(sched);
>   spin_unlock_irqrestore(>job_list_lock, flags); @@ -293,22
> +291,11 @@ static void drm_sched_job_finish(struct work_struct *work)
>   sched->ops->free_job(s_job);
>  }
> 
> -static void drm_sched_job_finish_cb(struct dma_fence *f,
> - struct dma_fence_cb *cb)
> -{
> - struct drm_sched_job *job = container_of(cb, struct drm_sched_job,
> -  finish_cb);
> - schedule_work(>finish_work);
> -}
> -
>  static void drm_sched_job_begin(struct drm_sched_job *s_job)  {
>   struct drm_gpu_scheduler *sched = s_job->sched;
>   unsigned long flags;
> 
> - dma_fence_add_callback(_job->s_fence->finished, _job-
> >finish_cb,
> -drm_sched_job_finish_cb);
> -
>   spin_lock_irqsave(>job_list_lock, flags);
>   list_add_tail(_job->node, >ring_mirror_list);
>   drm_sched_start_timeout(sched);
> @@ -359,12 +346,11 @@ void drm_sched_stop(struct drm_gpu_scheduler
> *sched, struct drm_sched_job *bad,
>   list_for_each_entry_reverse(s_job, >ring_mirror_list, node)
> {
>   if (s_job->s_fence->parent &&
>   dma_fence_remove_callback(s_job->s_fence->parent,
> -   _job->s_fence->cb)) {
> +   _job->cb)) {
>   dma_fence_put(s_job->s_fence->parent);
>   s_job->s_fence->parent = NULL;
>   atomic_dec(>hw_rq_count);
> - }
> - else {
> + } else {
>   /* TODO Is it get/put neccessey here ? */
>   dma_fence_get(_job->s_fence->finished);
>   list_add(_job->finish_node, _list); @@ -
> 417,31 +403,34 @@ EXPORT_SYMBOL(drm_sched_stop);  void
> drm_sched_start(struct drm_gpu_scheduler *sched, bool unpark_only)  {
>   struct drm_sched_job *s_job, *tmp;
> - unsigned long flags;
>   int r;
> 
>   if (unpark_only)
>   goto unpark;
> 
> - spin_lock_irqsave(>job_list_lock, flags);
> + /*
> +  * Locking the list is not required here as the sched thread is parked
> +  * so no new jobs are being pushed in to HW and in drm_sched_stop
> we
> +  * flushed all the jobs who were still in mirror list but who already
> +  * signaled and removed them self from the list. Also concurrent
> +  * GPU recovers can't run in parallel.
> +  */
>   list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list,
> node) {
> - struct drm_sched_fence *s_fence = s_job->s_fence;
>   struct dma_fence *fence = s_job->s_fence->p

RE: [PATCH -next] drm/amdgpu: remove set but not used variable 'grbm_soft_reset'

2018-12-09 Thread Zhou, David(ChunMing)


> -Original Message-
> From: YueHaibing 
> Sent: Saturday, December 08, 2018 11:01 PM
> To: Deucher, Alexander ; Koenig, Christian
> ; Zhou, David(ChunMing)
> ; airl...@linux.ie; Liu, Leo ;
> Gao, Likun ; Panariti, David
> ; S, Shirish ; Zhu, Rex
> ; Grodzovsky, Andrey 
> Cc: YueHaibing ; amd-gfx@lists.freedesktop.org;
> dri-de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; kernel-
> janit...@vger.kernel.org
> Subject: [PATCH -next] drm/amdgpu: remove set but not used variable
> 'grbm_soft_reset'
> 
> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c: In function
> 'gfx_v8_0_pre_soft_reset':
> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:4950:27: warning:
>  variable 'srbm_soft_reset' set but not used [-Wunused-but-set-variable]
> 
> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c: In function
> 'gfx_v8_0_post_soft_reset':
> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:5054:27: warning:
>  variable 'srbm_soft_reset' set but not used [-Wunused-but-set-variable]
> 
> It never used since introduction in commit d31a501ead7f ("drm/amdgpu: add
> pre_soft_reset ip func") and e4ae0fc33631 ("drm/amdgpu: implement
> gfx8 post_soft_reset")
> 
> Signed-off-by: YueHaibing 

Reviewed-by: Chunming Zhou 

> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> index 1454fc3..8c1ba79 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
> @@ -4947,14 +4947,13 @@ static bool gfx_v8_0_check_soft_reset(void
> *handle)  static int gfx_v8_0_pre_soft_reset(void *handle)  {
>   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> - u32 grbm_soft_reset = 0, srbm_soft_reset = 0;
> + u32 grbm_soft_reset = 0;
> 
>   if ((!adev->gfx.grbm_soft_reset) &&
>   (!adev->gfx.srbm_soft_reset))
>   return 0;
> 
>   grbm_soft_reset = adev->gfx.grbm_soft_reset;
> - srbm_soft_reset = adev->gfx.srbm_soft_reset;
> 
>   /* stop the rlc */
>   adev->gfx.rlc.funcs->stop(adev);
> @@ -5051,14 +5050,13 @@ static int gfx_v8_0_soft_reset(void *handle)
> static int gfx_v8_0_post_soft_reset(void *handle)  {
>   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> - u32 grbm_soft_reset = 0, srbm_soft_reset = 0;
> + u32 grbm_soft_reset = 0;
> 
>   if ((!adev->gfx.grbm_soft_reset) &&
>   (!adev->gfx.srbm_soft_reset))
>   return 0;
> 
>   grbm_soft_reset = adev->gfx.grbm_soft_reset;
> - srbm_soft_reset = adev->gfx.srbm_soft_reset;
> 
>   if (REG_GET_FIELD(grbm_soft_reset, GRBM_SOFT_RESET,
> SOFT_RESET_CP) ||
>   REG_GET_FIELD(grbm_soft_reset, GRBM_SOFT_RESET,
> SOFT_RESET_CPF) ||
> 
> 



RE: [PATCH 1/3] drm/amdgpu: use HMM mirror callback to replace mmu notifier v6

2018-12-07 Thread Zhou, David(ChunMing)
You should even rename amdgpu_mn.c/h to amdgpu_hmm.c/h.

-David

> -Original Message-
> From: amd-gfx  On Behalf Of Yang,
> Philip
> Sent: Friday, December 07, 2018 5:03 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Yang, Philip 
> Subject: [PATCH 1/3] drm/amdgpu: use HMM mirror callback to replace mmu
> notifier v6
> 
> Replace our MMU notifier with
> hmm_mirror_ops.sync_cpu_device_pagetables
> callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a
> dependency in DRM_AMDGPU_USERPTR Kconfig.
> 
> It supports both KFD userptr and gfx userptr paths.
> 
> The depdent HMM patchset from Jérôme Glisse are all merged into 4.20.0
> kernel now.
> 
> Change-Id: Ie62c3c5e3c5b8521ab3b438d1eff2aa2a003835e
> Signed-off-by: Philip Yang 
> ---
>  drivers/gpu/drm/amd/amdgpu/Kconfig |   6 +-
>  drivers/gpu/drm/amd/amdgpu/Makefile|   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 122 ++-
> --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h |   2 +-
>  4 files changed, 55 insertions(+), 77 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig
> b/drivers/gpu/drm/amd/amdgpu/Kconfig
> index 9221e5489069..960a63355705 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Kconfig
> +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
> @@ -26,10 +26,10 @@ config DRM_AMDGPU_CIK  config
> DRM_AMDGPU_USERPTR
>   bool "Always enable userptr write support"
>   depends on DRM_AMDGPU
> - select MMU_NOTIFIER
> + select HMM_MIRROR
>   help
> -   This option selects CONFIG_MMU_NOTIFIER if it isn't already
> -   selected to enabled full userptr support.
> +   This option selects CONFIG_HMM and CONFIG_HMM_MIRROR if it
> +   isn't already selected to enabled full userptr support.
> 
>  config DRM_AMDGPU_GART_DEBUGFS
>   bool "Allow GART access through debugfs"
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index f76bcb9c45e4..675efc850ff4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -172,7 +172,7 @@ endif
>  amdgpu-$(CONFIG_COMPAT) += amdgpu_ioc32.o
>  amdgpu-$(CONFIG_VGA_SWITCHEROO) += amdgpu_atpx_handler.o
>  amdgpu-$(CONFIG_ACPI) += amdgpu_acpi.o
> -amdgpu-$(CONFIG_MMU_NOTIFIER) += amdgpu_mn.o
> +amdgpu-$(CONFIG_HMM_MIRROR) += amdgpu_mn.o
> 
>  include $(FULL_AMD_PATH)/powerplay/Makefile
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> index e55508b39496..5d518d2bb9be 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> @@ -45,7 +45,7 @@
> 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -58,7 +58,6 @@
>   *
>   * @adev: amdgpu device pointer
>   * @mm: process address space
> - * @mn: MMU notifier structure
>   * @type: type of MMU notifier
>   * @work: destruction work item
>   * @node: hash table node to find structure by adev and mn @@ -66,6 +65,7
> @@
>   * @objects: interval tree containing amdgpu_mn_nodes
>   * @read_lock: mutex for recursive locking of @lock
>   * @recursion: depth of recursion
> + * @mirror: HMM mirror function support
>   *
>   * Data for each amdgpu device and process address space.
>   */
> @@ -73,7 +73,6 @@ struct amdgpu_mn {
>   /* constant after initialisation */
>   struct amdgpu_device*adev;
>   struct mm_struct*mm;
> - struct mmu_notifier mn;
>   enum amdgpu_mn_type type;
> 
>   /* only used on destruction */
> @@ -87,6 +86,9 @@ struct amdgpu_mn {
>   struct rb_root_cached   objects;
>   struct mutexread_lock;
>   atomic_trecursion;
> +
> + /* HMM mirror */
> + struct hmm_mirror   mirror;
>  };
> 
>  /**
> @@ -103,7 +105,7 @@ struct amdgpu_mn_node {  };
> 
>  /**
> - * amdgpu_mn_destroy - destroy the MMU notifier
> + * amdgpu_mn_destroy - destroy the HMM mirror
>   *
>   * @work: previously sheduled work item
>   *
> @@ -129,28 +131,26 @@ static void amdgpu_mn_destroy(struct work_struct
> *work)
>   }
>   up_write(>lock);
>   mutex_unlock(>mn_lock);
> - mmu_notifier_unregister_no_release(>mn, amn->mm);
> +
> + hmm_mirror_unregister(>mirror);
>   kfree(amn);
>  }
> 
>  /**
> - * amdgpu_mn_release - callback to notify about mm destruction
> + * amdgpu_hmm_mirror_release - callback to notify about mm destruction
>   *
> - * @mn: our notifier
> - * @mm: the mm this callback is about
> + * @mirror: the HMM mirror (mm) this callback is about
>   *
> - * Shedule a work item to lazy destroy our notifier.
> + * Shedule a work item to lazy destroy HMM mirror.
>   */
> -static void amdgpu_mn_release(struct mmu_notifier *mn,
> -   struct mm_struct *mm)
> +static void amdgpu_hmm_mirror_release(struct hmm_mirror *mirror)
>  {
> - struct amdgpu_mn *amn = container_of(mn, struct amdgpu_mn,
> mn);
> + struct amdgpu_mn *amn = container_of(mirror, struct amdgpu_mn,
> 

RE: [PATCH 02/11] dma-buf: add new dma_fence_chain container v2

2018-12-03 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Christian König 
> Sent: Monday, December 03, 2018 9:56 PM
> To: Zhou, David(ChunMing) ; Koenig, Christian
> ; dri-de...@lists.freedesktop.org; amd-
> g...@lists.freedesktop.org
> Subject: Re: [PATCH 02/11] dma-buf: add new dma_fence_chain container
> v2
> 
> Am 03.12.18 um 14:44 schrieb Chunming Zhou:
> >
> > 在 2018/12/3 21:28, Christian König 写道:
> >> Am 03.12.18 um 14:18 schrieb Chunming Zhou:
> >>> 在 2018/12/3 19:00, Christian König 写道:
> >>>> Am 03.12.18 um 06:25 schrieb zhoucm1:
> >>>>> On 2018年11月28日 22:50, Christian König wrote:
> >>>>>> Lockless container implementation similar to a dma_fence_array,
> >>>>>> but with only two elements per node and automatic garbage
> >>>>>> collection.
> >>>>>>
> >>>>>> v2: properly document dma_fence_chain_for_each, add
> >>>>>> dma_fence_chain_find_seqno,
> >>>>>>    drop prev reference during garbage collection if it's not
> >>>>>> a chain fence.
> >>>>>>
> >>>>>> Signed-off-by: Christian König 
> >>>>>> ---
 [snip]
> >>>>>> +
> >>>>>> +/**
> >>>>>> + * dma_fence_chain_init - initialize a fence chain
> >>>>>> + * @chain: the chain node to initialize
> >>>>>> + * @prev: the previous fence
> >>>>>> + * @fence: the current fence
> >>>>>> + *
> >>>>>> + * Initialize a new chain node and either start a new chain or
> >>>>>> +add
> >>>>>> the node to
> >>>>>> + * the existing chain of the previous fence.
> >>>>>> + */
> >>>>>> +void dma_fence_chain_init(struct dma_fence_chain *chain,
> >>>>>> +  struct dma_fence *prev,
> >>>>>> +  struct dma_fence *fence,
> >>>>>> +  uint64_t seqno)
> >>>>>> +{
> >>>>>> +    struct dma_fence_chain *prev_chain =
> >>>>>> +to_dma_fence_chain(prev);
> >>>>>> +    uint64_t context;
> >>>>>> +
> >>>>>> +    spin_lock_init(>lock);
> >>>>>> +    chain->prev = prev;
> >>>>>> +    chain->fence = fence;
> >>>>>> +    chain->prev_seqno = 0;
> >>>>>> +    init_irq_work(>work, dma_fence_chain_irq_work);
> >>>>>> +
> >>>>>> +    /* Try to reuse the context of the previous chain node. */
> >>>>>> +    if (prev_chain && seqno > prev->seqno &&
> >>>>>> +    __dma_fence_is_later(seqno, prev->seqno)) {
> >>>>> As your patch#1 makes __dma_fence_is_later only be valid for
> >>>>> 32bit, we cannot use it for 64bit here, we should remove it from
> >>>>> here, just compare seqno directly.
> >>>> That is intentional. We must make sure that the number both
> >>>> increments as 64bit number as well as not wraps around as 32bit
> number.
> >>>>
> >>>> In other words the largest difference between two sequence numbers
> >>>> userspace is allowed to submit is 1<<31.
> >>> Why? no one can make sure that, application users would only think
> >>> it is an uint64 sequence nubmer, and they can signal any advanced
> >>> point. I already see umd guys writing timeline test use max_uint64-1
> >>> as a final signal.
> >>> We shouldn't add this limitation here.
> >> We need to be backward compatible to hardware which can only do 32bit
> >> signaling with the dma-fence implementation.
> > I see that, you already explained that before.
> > but can't we just grep low 32bit of seqno only when 32bit hardware try
> > to use?
> >
> > then we can make dma_fence_later use 64bit comparation.
> 
> The problem is that we don't know at all times when to use a 32bit compare
> and when to use a 64bit compare.
> 
> What we could do is test if any of the upper 32bits of a sequence number is
> not 0 and if that is the case do a 64bit compare. This way max_uint64_t would
> still be handled correctly.
Sounds like we can give that a try, and in the meanwhile we need to mask off
the upper 32 bits for the 32-bit hardware case, right?
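
Something along these lines is what I understand from your suggestion (just a
sketch, not tested):

	/* compare as 64 bit as soon as either number uses the upper 32 bits,
	 * otherwise keep the wrap-safe 32 bit comparison for old hardware */
	static inline bool __dma_fence_is_later(u64 f1, u64 f2)
	{
		if (upper_32_bits(f1) || upper_32_bits(f2))
			return f1 > f2;

		return (int)(lower_32_bits(f1) - lower_32_bits(f2)) > 0;
	}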

-David
> 
> 
> Christian.
> 
> >
> >> Otherwise dma_fence_later() could return an inconsistent result and
> >> break at other places.
> >>
> >> So if userspace wants to use more than 1<<31 difference between
> >> sequence numbers we need to push back on this.
> > It's rare case, but I don't think we can convince them add this
> > limitation. So we cannot add this limitation here.
> >
> > -David



RE: [PATCH libdrm 4/5] wrap syncobj timeline query/wait APIs for amdgpu v3

2018-11-30 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Christian König 
> Sent: Friday, November 30, 2018 5:15 PM
> To: Zhou, David(ChunMing) ; dri-
> de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH libdrm 4/5] wrap syncobj timeline query/wait APIs for
> amdgpu v3
> 
[snip]
> >> +drm_public int amdgpu_cs_syncobj_query(amdgpu_device_handle dev,
> >> +   uint32_t *handles, uint64_t *points,
> > This interfaces is public to umd, I think they like "uint64_t
> > **points" for batch query, I've verified before, it works well and
> > more convenience.
> > If removing num_handles, that means only one syncobj to query, I agree
> > with "uint64_t *point".
> 
> "handles" as well as "points" are an array of objects. If the UMD wants to
> write the points to separate locations it can do so manually after calling the
> function.

Ok, it doesn't matter.

-David
> 
> It doesn't make any sense that libdrm or the kernel does the extra
> indirection, the transferred pointers are 64bit as well (even on a 32bit
> system) so the overhead is identical.
> 
> Adding another indirection just makes the implementation unnecessary
> complex.


> 
> Christian.
> 
> >
> > -David
> >> +   unsigned num_handles) {
> >> +    if (NULL == dev)
> >> +    return -EINVAL;
> >> +
> >> +    return drmSyncobjQuery(dev->fd, handles, points, num_handles); }
> >> +
> >>   drm_public int amdgpu_cs_export_syncobj(amdgpu_device_handle dev,
> >>   uint32_t handle,
> >>   int *shared_fd)
> >



RE: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer dereference

2018-11-26 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Christian König 
> Sent: Monday, November 26, 2018 5:23 PM
> To: Sharma, Deepak ; Zhou, David(ChunMing)
> ; Koenig, Christian ;
> amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer
> dereference
> 
> Am 26.11.18 um 02:59 schrieb Sharma, Deepak:
> >
> > 在 2018/11/24 2:10, Koenig, Christian 写道:
> >> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing):
> >>> 在 2018/11/23 21:30, Koenig, Christian 写道:
> >>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing):
> >>>>> 在 2018/11/22 19:25, Christian König 写道:
> >>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak:
> >>>>>>> when returned fence is not valid mostly due to userspace ignored
> >>>>>>> previous error causes NULL pointer dereference.
> >>>>>> Again, this is clearly incorrect. The my other mails on the
> >>>>>> earlier patch.
> >>>>> Sorry for I didn't get your history, but looks from the patch
> >>>>> itself, it is still a valid patch, isn't it?
> >>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL
> >>>> when the fence is already signaled.
> >>>>
> >>>> So this patch could totally break userspace because it changes the
> >>>> behavior when we try to sync to an already signaled fence.
> >>> Ah, I got your meaning, how about attached patch?
> >> Yeah something like this, but I would just give the
> >> DRM_SYNCOBJ_CREATE_SIGNALED instead.
> >>
> >> I mean that's what this flag is good for isn't it?
> > Yeah, I give a flag initally when creating patch, but as you know, there is 
> > a
> swich case not be able to use that flag:
> >
> >       case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD:
> >       fd = get_unused_fd_flags(O_CLOEXEC);
> >       if (fd < 0) {
> >       dma_fence_put(fence);
> >       return fd;
> >       }
> >
> >       sync_file = sync_file_create(fence);
> >       dma_fence_put(fence);
> >       if (!sync_file) {
> >       put_unused_fd(fd);
> >       return -ENOMEM;
> >       }
> >
> >       fd_install(fd, sync_file->file);
> >       info->out.handle = fd;
> >       return 0;
> >
> > So I change to stub fence instead.
> 
> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL
> fence.
> 
> We should then probably move the stub fence function into
> dma_fence_stub.c under drivers/dma-buf to keep the stuff together.

Yes, please wrap it up with your stub fence and send it for review first; we
can handle that part separately.
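
Roughly what I have in mind, as an untested sketch (the helper name is just a
suggestion, and it would live in drivers/dma-buf/ as you said):

	static const char *dma_fence_stub_get_name(struct dma_fence *fence)
	{
		return "stub";
	}

	static const struct dma_fence_ops dma_fence_stub_ops = {
		.get_driver_name = dma_fence_stub_get_name,
		.get_timeline_name = dma_fence_stub_get_name,
	};

	static struct dma_fence dma_fence_stub;
	static DEFINE_SPINLOCK(dma_fence_stub_lock);

	/* return a reference to a global, already signaled fence instead of
	 * NULL, so sync_file_create() always gets a valid fence */
	struct dma_fence *dma_fence_get_stub(void)
	{
		spin_lock(&dma_fence_stub_lock);
		if (!dma_fence_stub.ops) {
			dma_fence_init(&dma_fence_stub, &dma_fence_stub_ops,
				       &dma_fence_stub_lock, 0, 0);
			dma_fence_signal_locked(&dma_fence_stub);
		}
		spin_unlock(&dma_fence_stub_lock);

		return dma_fence_get(&dma_fence_stub);
	}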

-David 
> 
> >
> > -David
> >
> > I have not applied this patch.
> > The issue was trying to address is when amdgpu_cs_ioctl() failed due to
> low memory (ENOMEM) but userspace chose to proceed and called
> amdgpu_cs_fence_to_handle_ioctl().
> > In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing
> > NULL pointer dereference, this patch was to avoid that and system panic
> But I understand now that its a valid case retuning NULL if fence was already
> signaled but need to handle case so avoid kernel panic. Seems David patch
> should fix this, I will test it tomorrow.
> 
> Mhm, but don't we bail out with an error if we ask for a failed command
> submission? If not that sounds like a bug as well.
> 
> Christian.
> 
> >
> > -Deepak
> >> Christian.
> >>
> >>> -David
> >>>> If that patch was applied then please revert it immediately.
> >>>>
> >>>> Christian.
> >>>>
> >>>>> -David
> >>>>>> If you have already pushed the patch then please revert.
> >>>>>>
> >>>>>> Christian.
> >>>>>>
> >>>>>>> Signed-off-by: Deepak Sharma 
> >>>>>>> ---
> >>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
> >>>>>>>    1 file changed, 2 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> index 024dfbd87f11..14166cd8a12f 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence
> >>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev,
> >>>>>>>      fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no);
> >>>>>>>    amdgpu_ctx_put(ctx);
> >>>>>>> +    if(!fence)
> >>>>>>> +    return ERR_PTR(-EINVAL);
> >>>>>>>      return fence;
> >>>>>>>    }
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx



RE: [PATCH] drm/amd: add the checking to avoid NULL pointer dereference

2018-11-21 Thread Zhou, David(ChunMing)


> -Original Message-
> From: amd-gfx  On Behalf Of
> Sharma, Deepak
> Sent: Thursday, November 22, 2018 10:37 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Sharma, Deepak 
> Subject: [PATCH] drm/amd: add the checking to avoid NULL pointer
> dereference
> 
> when returned fence is not valid mostly due to userspace ignored previous
> error causes NULL pointer dereference
> 
> Signed-off-by: Deepak Sharma 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 024dfbd87f11..c85bb313e6df 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1420,6 +1420,8 @@ int amdgpu_cs_fence_to_handle_ioctl(struct
> drm_device *dev, void *data,
>   fence = amdgpu_cs_get_fence(adev, filp, >in.fence);
>   if (IS_ERR(fence))
>   return PTR_ERR(fence);
> + if (!fence)
> + return -EINVAL;
Could you move this check to the end of amdgpu_cs_get_fence()? Like:
	if (!fence)
		return ERR_PTR(-EINVAL);

Thanks,
-David
> 
>   switch (info->in.what) {
>   case AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ:
> --
> 2.15.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH libdrm 5/5] [libdrm] add syncobj timeline tests

2018-11-05 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Daniel Vetter  On Behalf Of Daniel Vetter
> Sent: Monday, November 05, 2018 5:39 PM
> To: Zhou, David(ChunMing) 
> Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH libdrm 5/5] [libdrm] add syncobj timeline tests
> 
> On Fri, Nov 02, 2018 at 04:26:49PM +0800, Chunming Zhou wrote:
> > Signed-off-by: Chunming Zhou 
> > ---
> >  tests/amdgpu/Makefile.am |   3 +-
> >  tests/amdgpu/amdgpu_test.c   |  12 ++
> >  tests/amdgpu/amdgpu_test.h   |  21 +++
> >  tests/amdgpu/meson.build |   2 +-
> >  tests/amdgpu/syncobj_tests.c | 263
> > +++
> >  5 files changed, 299 insertions(+), 2 deletions(-)  create mode
> > 100644 tests/amdgpu/syncobj_tests.c
> 
> This testcase seems very much a happy sunday scenario, no tests at all for
> corner cases, invalid input, and generally trying to pull the kernel over the
> table. I think we need a lot more, and preferrably in igt, where we already
> have a good baseline of drm_syncobj tests.
Hi Daniel,

OK, if you insist on that, I will switch to implementing the timeline test in IGT.
Btw, the timeline syncobj test needs to be based on command submission; can I
write it against the amdgpu driver in IGT?
And after that, where should I send the IGT patch for review?
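
For the IGT side I am thinking of a skeleton roughly like this, using the
standard harness macros (the test body names are placeholders):

	igt_main
	{
		int fd = -1;

		igt_fixture
			fd = drm_open_driver(DRIVER_AMDGPU);

		igt_subtest("wait-point")
			test_timeline_wait(fd);		/* placeholder */

		igt_subtest("signal-point")
			test_timeline_signal(fd);	/* placeholder */

		igt_fixture
			close(fd);
	}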

Lastly, if you have time, could you also take a look at the user/kernel
interface of the timeline syncobj?


Thanks,
David Zhou
> -Daniel
> 
> >
> > diff --git a/tests/amdgpu/Makefile.am b/tests/amdgpu/Makefile.am index
> > 447ff217..d3fbe2bb 100644
> > --- a/tests/amdgpu/Makefile.am
> > +++ b/tests/amdgpu/Makefile.am
> > @@ -33,4 +33,5 @@ amdgpu_test_SOURCES = \
> > vcn_tests.c \
> > uve_ib.h \
> > deadlock_tests.c \
> > -   vm_tests.c
> > +   vm_tests.c \
> > +   syncobj_tests.c
> > diff --git a/tests/amdgpu/amdgpu_test.c b/tests/amdgpu/amdgpu_test.c
> > index 96fcd687..cdcb93a5 100644
> > --- a/tests/amdgpu/amdgpu_test.c
> > +++ b/tests/amdgpu/amdgpu_test.c
> > @@ -56,6 +56,7 @@
> >  #define UVD_ENC_TESTS_STR "UVD ENC Tests"
> >  #define DEADLOCK_TESTS_STR "Deadlock Tests"
> >  #define VM_TESTS_STR "VM Tests"
> > +#define SYNCOBJ_TIMELINE_TESTS_STR "SYNCOBJ TIMELINE Tests"
> >
> >  /**
> >   *  Open handles for amdgpu devices
> > @@ -116,6 +117,12 @@ static CU_SuiteInfo suites[] = {
> > .pCleanupFunc = suite_vm_tests_clean,
> > .pTests = vm_tests,
> > },
> > +   {
> > +   .pName = SYNCOBJ_TIMELINE_TESTS_STR,
> > +   .pInitFunc = suite_syncobj_timeline_tests_init,
> > +   .pCleanupFunc = suite_syncobj_timeline_tests_clean,
> > +   .pTests = syncobj_timeline_tests,
> > +   },
> >
> > CU_SUITE_INFO_NULL,
> >  };
> > @@ -165,6 +172,11 @@ static Suites_Active_Status suites_active_stat[] = {
> > .pName = VM_TESTS_STR,
> > .pActive = suite_vm_tests_enable,
> > },
> > +   {
> > +   .pName = SYNCOBJ_TIMELINE_TESTS_STR,
> > +   .pActive = suite_syncobj_timeline_tests_enable,
> > +   },
> > +
> >  };
> >
> >
> > diff --git a/tests/amdgpu/amdgpu_test.h b/tests/amdgpu/amdgpu_test.h
> > index 0609a74b..946e91c2 100644
> > --- a/tests/amdgpu/amdgpu_test.h
> > +++ b/tests/amdgpu/amdgpu_test.h
> > @@ -194,6 +194,27 @@ CU_BOOL suite_vm_tests_enable(void);
> >   */
> >  extern CU_TestInfo vm_tests[];
> >
> > +/**
> > + * Initialize syncobj timeline test suite  */ int
> > +suite_syncobj_timeline_tests_init();
> > +
> > +/**
> > + * Deinitialize syncobj timeline test suite  */ int
> > +suite_syncobj_timeline_tests_clean();
> > +
> > +/**
> > + * Decide if the suite is enabled by default or not.
> > + */
> > +CU_BOOL suite_syncobj_timeline_tests_enable(void);
> > +
> > +/**
> > + * Tests in syncobj timeline test suite  */ extern CU_TestInfo
> > +syncobj_timeline_tests[];
> > +
> > +
> >  /**
> >   * Helper functions
> >   */
> > diff --git a/tests/amdgpu/meson.build b/tests/amdgpu/meson.build index
> > 4c1237c6..3ceec715 100644
> > --- a/tests/amdgpu/meson.build
> > +++ b/tests/amdgpu/meson.build
> > @@ -24,7 +24,7 @@ if dep_cunit.found()
> >  files(
> >'amdgpu_test.c', 'basic_tests.c', 'bo_tests.c', 'cs_tests.c',
> >'vce_tests.c', 'uvd_enc_tests.c', 'vcn_tests.c', 'deadlock_tests.c',
> > -  'vm_tests.c'

RE: [PATCH 2/3] drm/amdgpu: drop the sched_sync

2018-11-04 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Koenig, Christian
> Sent: Monday, November 05, 2018 3:48 PM
> To: Liu, Monk ; amd-gfx@lists.freedesktop.org; Zhou,
> David(ChunMing) 
> Subject: Re: [PATCH 2/3] drm/amdgpu: drop the sched_sync
> 
> Am 05.11.18 um 08:24 schrieb Liu, Monk:
> >> David Zhou had an use case which saw a >10% performance drop the last
> time he tried it.
> > I really don't believe that, because if you insert a WAIT_MEM on an already
> signaled fence, it only cost GPU couple clocks to move  on, right ? no reason
> to slow down up to 10% ... with 3dmark vulkan version test, the performance
> is barely different ... with my patch applied ...
> 
> Why do you think that we insert a WAIT_MEM on an already signaled fence?
> The pipeline sync always wait for the last fence value (because we can't
> handle wraparounds in PM4).
> 
> So you have a pipeline sync when you don't need one and that is really really
> bad for things shared between processes, e.g. X/Wayland and it's clients.
> 
> I also expects that this doesn't effect 3dmark at all, but everything which 
> runs
> in a window which is composed by X could be slowed down massively.
> 
> David do you remember which use case was affected when you tried to drop
> this optimization?
That was a long time ago. I remember Andrey also tried to remove sched_sync
before, but he eventually kept it, right?
From Monk's patch it seems he doesn't change the main logic, he just moves the
sched_sync logic to job->need_pipe_sync.
But I can at least see a small effect: the job processing evaluates the fence
for sched_sync, but the fence could already be signaled by the time
amdgpu_ib_schedule() runs, in which case no pipeline sync needs to be inserted.

Anyway, this is a sensitive path; we should change it carefully and give it
wide testing.
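
To illustrate the effect I mean, at schedule time something along these lines
would be possible (very rough sketch, not the actual code):

	/* the fence collected into sched_sync may already have signaled by
	 * the time the job is really scheduled; only then can the pipeline
	 * sync be skipped */
	fence = amdgpu_sync_get_fence(&job->sched_sync, NULL);
	need_pipe_sync = fence && !dma_fence_is_signaled(fence);

	r = amdgpu_vm_flush(ring, job, need_pipe_sync);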

Regards,
David Zhou
> 
> >> When a reset happens we flush the VMIDs when re-submitting the jobs
> to the rings and while doing so we also always do a pipeline sync.
> > I will check that point in my branch, I didn't use drm-next, maybe
> > there is gap in this part
> 
> We had that logic for a very long time now, but we recently simplified it.
> Could be that there was a bug introduced doing so.
> 
> Maybe we should add a specific flag to run_job() to note that we are re-
> running a job and then always add VM flushes/pipeline syncs?
> 
> But my main question is why do you see any issues with quark? That is a
> workaround for an issue for Vulkan sync handling and should only surface
> when a specific test is run many many times.
> 
> Regards,
> Christian.
> 
> >
> > /Monk
> > -Original Message-
> > From: Koenig, Christian
> > Sent: Monday, November 5, 2018 3:02 AM
> > To: Liu, Monk ; amd-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH 2/3] drm/amdgpu: drop the sched_sync
> >
> >> Can you tell me which game/benchmark will have performance drop with
> this fix by your understanding ?
> > When you sync between submission things like composing X windows are
> slowed down massively.
> >
> > David Zhou had an use case which saw a >10% performance drop the last
> time he tried it.
> >
> >> The problem I hit is during the massive stress test against
> >> multi-process + quark , if the quark process hang the engine while there is
> another two job following the bad job, After the TDR these two job will lose
> the explicit and the pipeline-sync was also lost.
> > Well that is really strange. This workaround is only for a very specific 
> > Vulkan
> CTS test which we are still not 100% sure is actually valid.
> >
> > When a reset happens we flush the VMIDs when re-submitting the jobs to
> the rings and while doing so we also always do a pipeline sync.
> >
> > So you should never ever run into any issues in quark with that, even when
> we completely disable this workaround.
> >
> > Regards,
> > Christian.
> >
> > Am 04.11.18 um 01:48 schrieb Liu, Monk:
> >>> NAK, that would result in a severe performance drop.
> >>> We need the fence here to determine if we actually need to do the
> pipeline sync or not.
> >>> E.g. the explicit requested fence could already be signaled.
> >> For the performance issue, only insert a WAIT_REG_MEM on
> GFX/compute ring *doesn't* give the "severe" drop (it's mimic in fact) ...  At
> least I didn't observe any performance drop with 3dmark benchmark (also
> tested vulkan CTS), Can you tell me which game/benchmark will have
> performance drop with this fix by your understanding ? let me check it .
> >>
> >> The problem I hit is during the massive stress test against
> >> multi-process + qua

RE: [PATCH] drm/amdgpu: wait for IB test on first device open

2018-11-02 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou 

> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian K?nig
> Sent: Friday, November 02, 2018 4:45 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: wait for IB test on first device open
> 
> Instead of delaying that to the first query. Otherwise we could try to use the
> SDMA for VM updates before the IB tests are done.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 08d04f68dfeb..f87f717cc905 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -467,9 +467,6 @@ static int amdgpu_info_ioctl(struct drm_device *dev,
> void *data, struct drm_file
>   if (!info->return_size || !info->return_pointer)
>   return -EINVAL;
> 
> - /* Ensure IB tests are run on ring */
> - flush_delayed_work(>late_init_work);
> -
>   switch (info->query) {
>   case AMDGPU_INFO_ACCEL_WORKING:
>   ui32 = adev->accel_working;
> @@ -950,6 +947,9 @@ int amdgpu_driver_open_kms(struct drm_device
> *dev, struct drm_file *file_priv)
>   struct amdgpu_fpriv *fpriv;
>   int r, pasid;
> 
> + /* Ensure IB tests are run on ring */
> + flush_delayed_work(>late_init_work);
> +
>   file_priv->driver_priv = NULL;
> 
>   r = pm_runtime_get_sync(dev->dev);
> --
> 2.17.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [igt-dev] [PATCH] RFC: Make igts for cross-driver stuff mandatory?

2018-10-25 Thread Zhou, David(ChunMing)
To make IGT cross-driver, I think you should rename it first, since it is not
Intel specific. No company wants its employees working on another company's
stuff.
You could rename it to DGT (drm graphics test) and publish it alongside libdrm,
or merge it directly into libdrm; then everyone can use it and develop it on
the same page. That is only my personal opinion.

Regards,
David

> -Original Message-
> From: dri-devel  On Behalf Of Eric
> Anholt
> Sent: Friday, October 26, 2018 12:36 AM
> To: Sean Paul ; Daniel Vetter 
> Cc: IGT development ; Intel Graphics
> Development ; DRI Development  de...@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org
> Subject: Re: [igt-dev] [PATCH] RFC: Make igts for cross-driver stuff
> mandatory?
> 
> Sean Paul  writes:
> 
> > On Fri, Oct 19, 2018 at 10:50:49AM +0200, Daniel Vetter wrote:
> >> Hi all,
> >>
> >> This is just to collect feedback on this idea, and see whether the
> >> overall dri-devel community stands on all this. I think the past few
> >> cross-vendor uapi extensions all came with igts attached, and
> >> personally I think there's lots of value in having them: A
> >> cross-vendor interface isn't useful if every driver implements it
> >> slightly differently.
> >>
> >> I think there's 2 questions here:
> >>
> >> - Do we want to make such testcases mandatory?
> >>
> >
> > Yes, more testing == better code.
> >
> >
> >> - If yes, are we there yet, or is there something crucially missing
> >>   still?
> >
> > In my experience, no. Last week while trying to replicate an intel-gfx
> > CI failure, I tried compiling igt for one of my (intel) chromebooks.
> > It seems like cross-compilation (or, in my case, just specifying
> > prefix/ld_library_path/sbin_path) is broken on igt. If we want to
> > impose restrictions across the entire subsystem, we need to make sure
> > that everyone can build and deploy igt easily.
> >
> > I managed to hack around everything and get it working, but I still
> > haven't tried switching out the toolchain. Once we have some GitLab CI
> > to validate cross-compilation, then we can consider making IGT mandatory.
> >
> > It's possible that I'm just a meson n00b and didn't use the right
> > incantation, so maybe it already works, but then we need better
> documentation.
> >
> > I've pasted my horrible hacks below, I also didn't have libunwind, so
> > removed its usage.
> 
> I've also had to cut out libunwind for cross-compiling on many occasions.
> Worst library.


RE: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet

2018-10-24 Thread Zhou, David(ChunMing)


> -Original Message-
> From: amd-gfx  On Behalf Of Rex
> Zhu
> Sent: Wednesday, October 24, 2018 2:03 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex 
> Subject: [PATCH v2] drm/amdgpu: Patch csa mc address in IB packet
> 
> the csa buffer is used by sdma engine to do context save when preemption
> happens. it the mc address is zero, mean the preemtpion feature(MCBP) is
> disabled.
> 
> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 13 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h |  2 ++
>  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c   |  8 ++--
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c   |  8 ++--
>  4 files changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> index 0fb9907..24b80bc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> @@ -40,3 +40,16 @@ struct amdgpu_sdma_instance *
> amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
> 
>   return NULL;
>  }
> +
> +int amdgpu_get_sdma_index(struct amdgpu_ring *ring, uint32_t *index) {
> + struct amdgpu_device *adev = ring->adev;
> + int i;
> +
> + for (i = 0; i < adev->sdma.num_instances; i++)
> + if (ring == &adev->sdma.instance[i].ring ||
> + ring == &adev->sdma.instance[i].page)
> + return i;
> +
> + return -EINVAL;
> +}

Looping to look up the index works, but it doesn't look good.

If you need the ring index, you could define it as an enum first and assign the
enum index to the ring when the ring is initialized.
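
Just to illustrate (rough sketch, untested; reusing ring->me for the instance
index is an assumption):

	/* remember the instance index once at ring init time ... */
	for (i = 0; i < adev->sdma.num_instances; i++)
		adev->sdma.instance[i].ring.me = i;

	/* ... so the emit path needs no lookup at all */
	csa_mc_addr += ring->me * AMDGPU_SDMA_CSA_SIZE;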

Regards,
David Zhou
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> index 479a245..314078a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> @@ -26,6 +26,7 @@
> 
>  /* max number of IP instances */
>  #define AMDGPU_MAX_SDMA_INSTANCES2
> +#define AMDGPU_SDMA_CSA_SIZE (1024)
> 
>  enum amdgpu_sdma_irq {
>   AMDGPU_SDMA_IRQ_TRAP0 = 0,
> @@ -96,4 +97,5 @@ struct amdgpu_buffer_funcs {  struct
> amdgpu_sdma_instance *  amdgpu_get_sdma_instance(struct amdgpu_ring
> *ring);
> 
> +int amdgpu_get_sdma_index(struct amdgpu_ring *ring, uint32_t *index);
>  #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> index f5e6aa2..fdc5d75 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
> @@ -424,7 +424,11 @@ static void sdma_v3_0_ring_emit_ib(struct
> amdgpu_ring *ring,
>  bool ctx_switch)
>  {
>   unsigned vmid = GET_VMID(job);
> + uint64_t csa_mc_addr = job ? job->csa_mc_addr : 0;
> + uint32_t i = 0;
> 
> + if (amdgpu_get_sdma_index(ring, &i))
> + return -EINVAL;
>   /* IB packet must end on a 8 DW boundary */
>   sdma_v3_0_ring_insert_nop(ring, (10 - (lower_32_bits(ring->wptr) &
> 7)) % 8);
> 
> @@ -434,8 +438,8 @@ static void sdma_v3_0_ring_emit_ib(struct
> amdgpu_ring *ring,
>   amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr) & 0xffe0);
>   amdgpu_ring_write(ring, upper_32_bits(ib->gpu_addr));
>   amdgpu_ring_write(ring, ib->length_dw);
> - amdgpu_ring_write(ring, 0);
> - amdgpu_ring_write(ring, 0);
> + amdgpu_ring_write(ring, lower_32_bits(csa_mc_addr + i *
> AMDGPU_SDMA_CSA_SIZE));
> + amdgpu_ring_write(ring, upper_32_bits(csa_mc_addr + i *
> +AMDGPU_SDMA_CSA_SIZE));
> 
>  }
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> index 2282ac1..e69a584 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> @@ -507,7 +507,11 @@ static void sdma_v4_0_ring_emit_ib(struct
> amdgpu_ring *ring,
>  bool ctx_switch)
>  {
>   unsigned vmid = GET_VMID(job);
> + uint64_t csa_mc_addr = job ? job->csa_mc_addr : 0;
> + uint32_t i = 0;
> 
> + if (amdgpu_get_sdma_index(ring, &i))
> + return -EINVAL;
>   /* IB packet must end on a 8 DW boundary */
>   sdma_v4_0_ring_insert_nop(ring, (10 - (lower_32_bits(ring->wptr) &
> 7)) % 8);
> 
> @@ -517,8 +521,8 @@ static void sdma_v4_0_ring_emit_ib(struct
> amdgpu_ring *ring,
>   amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr) & 0xffe0);
>   amdgpu_ring_write(ring, upper_32_bits(ib->gpu_addr));
>   amdgpu_ring_write(ring, ib->length_dw);
> - amdgpu_ring_write(ring, 0);
> - amdgpu_ring_write(ring, 0);
> + amdgpu_ring_write(ring, lower_32_bits(csa_mc_addr + i *
> AMDGPU_SDMA_CSA_SIZE));
> + amdgpu_ring_write(ring, upper_32_bits(csa_mc_addr + i *
> +AMDGPU_SDMA_CSA_SIZE));
> 
>  }
> 
> --
> 1.9.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 2/2] drm/amdgpu: Fix null point errro

2018-10-18 Thread Zhou, David(ChunMing)
A minor suggestion, not sure if it's proper: can we move these callback checks
into the functions themselves? I know these funcs are currently defined as
macros; can we change them to function definitions?
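
Something like this is what I mean, as a sketch (the error code is just an
example):

	static inline int
	amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev,
					  uint32_t block_type, bool gate)
	{
		if (!adev->powerplay.pp_funcs ||
		    !adev->powerplay.pp_funcs->set_powergating_by_smu)
			return -EOPNOTSUPP;

		return adev->powerplay.pp_funcs->set_powergating_by_smu(
				adev->powerplay.pp_handle, block_type, gate);
	}

Then the callers would not need to open-code the NULL checks at every site.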

David

> -Original Message-
> From: amd-gfx  On Behalf Of Rex
> Zhu
> Sent: Friday, October 19, 2018 10:51 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex 
> Subject: [PATCH 2/2] drm/amdgpu: Fix null point errro
> 
> need to check adev->powerplay.pp_funcs first, becasue from AI, the smu ip
> may be disabled by user, and the pp_handle is null in this case.
> 
> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c| 6 --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 2 +-
>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c| 2 +-
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 6 --
>  5 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
> index 297a549..0a4fba1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
> @@ -135,7 +135,8 @@ static int acp_poweroff(struct generic_pm_domain
> *genpd)
>* 2. power off the acp tiles
>* 3. check and enter ulv state
>*/
> - if (adev->powerplay.pp_funcs->set_powergating_by_smu)
> + if (adev->powerplay.pp_funcs &&
> + adev->powerplay.pp_funcs-
> >set_powergating_by_smu)
>   amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_ACP, true);
>   }
>   return 0;
> @@ -517,7 +518,8 @@ static int acp_set_powergating_state(void *handle,
>   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>   bool enable = state == AMD_PG_STATE_GATE ? true : false;
> 
> - if (adev->powerplay.pp_funcs->set_powergating_by_smu)
> + if (adev->powerplay.pp_funcs &&
> + adev->powerplay.pp_funcs->set_powergating_by_smu)
>   amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_ACP, enable);
> 
>   return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 4fca67a..7dad682 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1783,6 +1783,7 @@ static int amdgpu_device_set_pg_state(struct
> amdgpu_device *adev, enum amd_power
>   adev->ip_blocks[i].version->type ==
> AMD_IP_BLOCK_TYPE_VCE ||
>   adev->ip_blocks[i].version->type ==
> AMD_IP_BLOCK_TYPE_VCN ||
>   adev->ip_blocks[i].version->type ==
> AMD_IP_BLOCK_TYPE_ACP) &&
> + adev->powerplay.pp_funcs &&
>   adev->powerplay.pp_funcs->set_powergating_by_smu) {
>   if (!adev->ip_blocks[i].status.valid) {
> 
>   amdgpu_dpm_set_powergating_by_smu(adev, adev-
> >ip_blocks[i].version->type, state == AMD_PG_STATE_GATE ?
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 790fd54..1a656b8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -392,7 +392,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device
> *adev, bool enable)
>   if (!(adev->powerplay.pp_feature & PP_GFXOFF_MASK))
>   return;
> 
> - if (!adev->powerplay.pp_funcs->set_powergating_by_smu)
> + if (!adev->powerplay.pp_funcs ||
> +!adev->powerplay.pp_funcs->set_powergating_by_smu)
>   return;
> 
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> index 14649f8..fd23ba1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
> @@ -280,7 +280,7 @@ void mmhub_v1_0_update_power_gating(struct
> amdgpu_device *adev,
>   return;
> 
>   if (enable && adev->pg_flags & AMD_PG_SUPPORT_MMHUB) {
> - if (adev->powerplay.pp_funcs->set_powergating_by_smu)
> + if (adev->powerplay.pp_funcs &&
> +adev->powerplay.pp_funcs->set_powergating_by_smu)
>   amdgpu_dpm_set_powergating_by_smu(adev,
> AMD_IP_BLOCK_TYPE_GMC, true);
> 
>   }
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> index 2e8365d..d97e6a2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> @@ -1595,7 +1595,8 @@ static int sdma_v4_0_hw_init(void *handle)
>   int r;
>   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> 
> - if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs-
> >set_powergating_by_smu)
> + if (adev->asic_type == CHIP_RAVEN && adev->powerplay.pp_funcs
> &&
> + adev->powerplay.pp_funcs-
> >set_powergating_by_smu)
>   amdgpu_dpm_set_powergating_by_smu(adev,
> 

RE: [PATCH 7/7] drm/amdgpu: update version for timeline syncobj support in amdgpu

2018-10-15 Thread Zhou, David(ChunMing)
Ping...
Christian, could I get your R-b on the series? And could you help me push it to drm-misc?
After that I can rebase the libdrm header file on drm-next.

Thanks,
David Zhou

> -Original Message-
> From: amd-gfx  On Behalf Of
> Chunming Zhou
> Sent: Monday, October 15, 2018 4:56 PM
> To: dri-de...@lists.freedesktop.org
> Cc: Zhou, David(ChunMing) ; amd-
> g...@lists.freedesktop.org
> Subject: [PATCH 7/7] drm/amdgpu: update version for timeline syncobj
> support in amdgpu
> 
> Signed-off-by: Chunming Zhou 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6870909da926..58cba492ba55 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -70,9 +70,10 @@
>   * - 3.25.0 - Add support for sensor query info (stable pstate sclk/mclk).
>   * - 3.26.0 - GFX9: Process AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE.
>   * - 3.27.0 - Add new chunk to to AMDGPU_CS to enable BO_LIST creation.
> + * - 3.28.0 - Add syncobj timeline support to AMDGPU_CS.
>   */
>  #define KMS_DRIVER_MAJOR 3
> -#define KMS_DRIVER_MINOR 27
> +#define KMS_DRIVER_MINOR 28
>  #define KMS_DRIVER_PATCHLEVEL0
> 
>  int amdgpu_vram_limit = 0;
> --
> 2.17.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 3/6] drm: add support of syncobj timeline point wait v2

2018-10-07 Thread Zhou, David(ChunMing)
>> Another general comment (no good place to put it) is that I think we want 
>> two kinds of waits:  Wait for time point to be completed and wait for time 
>> point to become available.  The first is the usual CPU wait for completion 
>> while the second is for use by userspace drivers to wait until the first 
>> moment where they can submit work which depends on a given time point.

Hi Jason,

How about adding two new wait flags?
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_COMPLETED
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE
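
i.e. in include/uapi/drm/drm.h, next to the existing flags, something like this
(the bit values are just an example):

	#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL		(1 << 0)
	#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT	(1 << 1)
	/* wait until the point has signaled (today's default behaviour) */
	#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_COMPLETED	(1 << 2)
	/* wait only until the fence backing the point exists */
	#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE	(1 << 3)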

Thanks,
David

From: Christian König 
Sent: Tuesday, September 25, 2018 5:50 PM
To: Jason Ekstrand ; Zhou, David(ChunMing) 

Cc: amd-gfx mailing list ; Maling list - DRI 
developers 
Subject: Re: [PATCH 3/6] drm: add support of syncobj timeline point wait v2

Am 25.09.2018 um 11:22 schrieb Jason Ekstrand:
On Thu, Sep 20, 2018 at 6:04 AM Chunming Zhou 
mailto:david1.z...@amd.com>> wrote:
points array is one-to-one match with syncobjs array.
v2:
add seperate ioctl for timeline point wait, otherwise break uapi.

I think ioctl structs can be extended as long as fields aren't re-ordered.  I'm 
not sure on the details of this though as I'm not a particularly experienced 
kernel developer.

Yeah, that is correct. The problem in this particular case is that we don't 
change the direct IOCTL parameter, but rather the array it points to.

We could do something like keep the existing handles array and add a separate 
optional one for the timeline points. That would also drop the need for the 
padding of the structure.


Another general comment (no good place to put it) is that I think we want two 
kinds of waits:  Wait for time point to be completed and wait for time point to 
become available.  The first is the usual CPU wait for completion while the 
second is for use by userspace drivers to wait until the first moment where 
they can submit work which depends on a given time point.

Oh, yeah that is a really good point as ell.

Christian.



Signed-off-by: Chunming Zhou mailto:david1.z...@amd.com>>
---
 drivers/gpu/drm/drm_internal.h |  2 +
 drivers/gpu/drm/drm_ioctl.c|  2 +
 drivers/gpu/drm/drm_syncobj.c  | 99 +-
 include/uapi/drm/drm.h | 14 +
 4 files changed, 103 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 0c4eb4a9ab31..566d44e3c782 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -183,6 +183,8 @@ int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, 
void *data,
   struct drm_file *file_private);
 int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_private);
+int drm_syncobj_timeline_wait_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file_private);
 int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_private);
 int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 6b4a633b4240..c0891614f516 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -669,6 +669,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, 
drm_syncobj_timeline_wait_ioctl,
+ DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl,
  DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 67472bd77c83..a43de0e4616c 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -126,13 +126,14 @@ static void drm_syncobj_add_callback_locked(struct 
drm_syncobj *syncobj,
 }

 static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
+u64 point,
 struct dma_fence **fence,
 struct drm_syncobj_cb *cb,
 drm_syncobj_func_t func)
 {
int ret;

-   ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
+   ret = drm_syncobj_search_fence(syncobj, point, 0, fence);
if (!ret)
return 1;

@@ -143,7 +144,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct 
drm_syncobj *syncobj,
 */
if (!list_empty(>signal_pt_list)) {
spin_unlock(>lock);
-   drm_syncobj_search_fence(syncobj, 0, 0, 

RE: [PATCH 5/6] drm/amdgpu: add timeline support in amdgpu CS

2018-10-07 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Nicolai Hähnle 
> Sent: Wednesday, September 26, 2018 4:44 PM
> To: Zhou, David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 5/6] drm/amdgpu: add timeline support in amdgpu CS
> 
> >   static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, diff --git
> > a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index
> > 1ceec56de015..412359b446f1 100644
> > --- a/include/uapi/drm/amdgpu_drm.h
> > +++ b/include/uapi/drm/amdgpu_drm.h
> > @@ -517,6 +517,8 @@ struct drm_amdgpu_gem_va {
> >   #define AMDGPU_CHUNK_ID_SYNCOBJ_IN  0x04
> >   #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05
> >   #define AMDGPU_CHUNK_ID_BO_HANDLES  0x06
> > +#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT0x07
> > +#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x08
> >
> >   struct drm_amdgpu_cs_chunk {
> > __u32   chunk_id;
> > @@ -592,6 +594,14 @@ struct drm_amdgpu_cs_chunk_sem {
> > __u32 handle;
> >   };
> >
> > +struct drm_amdgpu_cs_chunk_syncobj {
> > +   __u32 handle;
> > +   __u32 pad;
> > +   __u64 point;
> > +   __u64 flags;
> > +};
> 
> Sure it's nice to be forward-looking, but can't we just put the flags into the
> padding?

Will change.
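
i.e. roughly:

	struct drm_amdgpu_cs_chunk_syncobj {
		__u32 handle;
		__u32 flags;
		__u64 point;
	};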

Thanks,
David
> 
> Cheers,
> Nicolai
> 
> 
> > +
> > +
> >   #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ0
> >   #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD 1
> >   #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD   2
> >
> 
> 
> --
> Lerne, wie die Welt wirklich ist,
> Aber vergiss niemals, wie sie sein sollte.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 5/6] drm/amdgpu: add timeline support in amdgpu CS

2018-10-07 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Nicolai Hähnle 
> Sent: Wednesday, September 26, 2018 5:06 PM
> To: Zhou, David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 5/6] drm/amdgpu: add timeline support in amdgpu CS
> 
> Hey Chunming,
> 
> On 20.09.2018 13:03, Chunming Zhou wrote:
> > @@ -1113,48 +1117,91 @@ static int
> amdgpu_syncobj_lookup_and_add_to_sync(struct amdgpu_cs_parser *p,
> >   }
> >
> >   static int amdgpu_cs_process_syncobj_in_dep(struct amdgpu_cs_parser
> *p,
> > -   struct amdgpu_cs_chunk *chunk)
> > +   struct amdgpu_cs_chunk *chunk,
> > +   bool timeline)
> >   {
> > unsigned num_deps;
> > int i, r;
> > -   struct drm_amdgpu_cs_chunk_sem *deps;
> >
> > -   deps = (struct drm_amdgpu_cs_chunk_sem *)chunk->kdata;
> > -   num_deps = chunk->length_dw * 4 /
> > -   sizeof(struct drm_amdgpu_cs_chunk_sem);
> > +   if (!timeline) {
> > +   struct drm_amdgpu_cs_chunk_sem *deps;
> >
> > -   for (i = 0; i < num_deps; ++i) {
> > -   r = amdgpu_syncobj_lookup_and_add_to_sync(p,
> deps[i].handle);
> > +   deps = (struct drm_amdgpu_cs_chunk_sem *)chunk->kdata;
> > +   num_deps = chunk->length_dw * 4 /
> > +   sizeof(struct drm_amdgpu_cs_chunk_sem);
> > +   for (i = 0; i < num_deps; ++i) {
> > +   r = amdgpu_syncobj_lookup_and_add_to_sync(p,
> deps[i].handle,
> > + 0, 0);
> > if (r)
> > return r;
> 
> The indentation looks wrong.
> 
> 
> > +   }
> > +   } else {
> > +   struct drm_amdgpu_cs_chunk_syncobj *syncobj_deps;
> > +
> > +   syncobj_deps = (struct drm_amdgpu_cs_chunk_syncobj
> *)chunk->kdata;
> > +   num_deps = chunk->length_dw * 4 /
> > +   sizeof(struct drm_amdgpu_cs_chunk_syncobj);
> > +   for (i = 0; i < num_deps; ++i) {
> > +   r = amdgpu_syncobj_lookup_and_add_to_sync(p,
> syncobj_deps[i].handle,
> > +
> syncobj_deps[i].point,
> > +
> syncobj_deps[i].flags);
> > +   if (r)
> > +   return r;
> 
> Here as well.
> 
> So I'm wondering a bit about this uapi. Specifically, what happens if you try 
> to
> use timeline syncobjs here as dependencies _without_
> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT?
> 
> My understanding is, it'll just return -EINVAL without any indication as to
> which syncobj actually failed. What's the caller supposed to do then?

How about adding a print to indicate which syncobj failed?
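For example, something like this in the lookup error path (rough sketch; the names follow the chunk parsing code above):

	r = amdgpu_syncobj_lookup_and_add_to_sync(p, syncobj_deps[i].handle,
						  syncobj_deps[i].point,
						  syncobj_deps[i].flags);
	if (r) {
		/* tell the caller which handle/point could not be resolved */
		DRM_ERROR("syncobj %u failed to find fence @ point %llu (%d)\n",
			  syncobj_deps[i].handle,
			  (unsigned long long)syncobj_deps[i].point, r);
		return r;
	}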

Thanks,
David Zhou
> 
> Cheers,
> Nicolai
> --
> Lerne, wie die Welt wirklich ist,
> Aber vergiss niemals, wie sie sein sollte.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: Making use of more Gitlab features for xf86-video-amdgpu

2018-09-21 Thread Zhou, David(ChunMing)
After patches move to being submitted and reviewed as MRs, will the mailing list no longer be useful?
And many people could miss new patches, right?

Regards,
David Zhou

> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Friday, September 21, 2018 3:13 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: Re: Making use of more Gitlab features for xf86-video-amdgpu
> 
> On 2018-09-19 6:46 p.m., Michel Dänzer wrote:
> >
> > With the 18.1.0 release out the door, I want to start making use of
> > more Gitlab features for xf86-video-amdgpu development.
> >
> > I've already enabled merge requests (MRs) at
> > https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu . From
> > now on, patches should primarily be submitted and reviewed as MRs. I
> > don't know yet if it'll be possible for this mailing list to get
> > notifications of new MRs; you may want to enable notifications on the page
> above.
> 
> FWIW,
> https://gitlab.freedesktop.org/xorg/driver/xf86-video-
> amdgpu/merge_requests
> now has some actual content. :)
> 
> 
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 3/6] drm: add support of syncobj timeline point wait v2

2018-09-21 Thread Zhou, David(ChunMing)


> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian König
> Sent: Thursday, September 20, 2018 7:11 PM
> To: Zhou, David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 3/6] drm: add support of syncobj timeline point wait v2
> 
> Am 20.09.2018 um 13:03 schrieb Chunming Zhou:
> > points array is one-to-one match with syncobjs array.
> > v2:
> > add seperate ioctl for timeline point wait, otherwise break uapi.
> >
> > Signed-off-by: Chunming Zhou 
> > ---
> >   drivers/gpu/drm/drm_internal.h |  2 +
> >   drivers/gpu/drm/drm_ioctl.c|  2 +
> >   drivers/gpu/drm/drm_syncobj.c  | 99
> +-
> >   include/uapi/drm/drm.h | 14 +
> >   4 files changed, 103 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_internal.h
> > b/drivers/gpu/drm/drm_internal.h index 0c4eb4a9ab31..566d44e3c782
> > 100644
> > --- a/drivers/gpu/drm/drm_internal.h
> > +++ b/drivers/gpu/drm/drm_internal.h
> > @@ -183,6 +183,8 @@ int drm_syncobj_fd_to_handle_ioctl(struct
> drm_device *dev, void *data,
> >struct drm_file *file_private);
> >   int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
> >struct drm_file *file_private);
> > +int drm_syncobj_timeline_wait_ioctl(struct drm_device *dev, void *data,
> > +   struct drm_file *file_private);
> >   int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file_private);
> >   int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
> > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
> > index 6b4a633b4240..c0891614f516 100644
> > --- a/drivers/gpu/drm/drm_ioctl.c
> > +++ b/drivers/gpu/drm/drm_ioctl.c
> > @@ -669,6 +669,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
> >   DRM_UNLOCKED|DRM_RENDER_ALLOW),
> > DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT,
> drm_syncobj_wait_ioctl,
> >   DRM_UNLOCKED|DRM_RENDER_ALLOW),
> > +   DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT,
> drm_syncobj_timeline_wait_ioctl,
> > + DRM_UNLOCKED|DRM_RENDER_ALLOW),
> > DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET,
> drm_syncobj_reset_ioctl,
> >   DRM_UNLOCKED|DRM_RENDER_ALLOW),
> > DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL,
> drm_syncobj_signal_ioctl,
> > diff --git a/drivers/gpu/drm/drm_syncobj.c
> > b/drivers/gpu/drm/drm_syncobj.c index 67472bd77c83..a43de0e4616c
> > 100644
> > --- a/drivers/gpu/drm/drm_syncobj.c
> > +++ b/drivers/gpu/drm/drm_syncobj.c
> > @@ -126,13 +126,14 @@ static void
> drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
> >   }
> >
> >   static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj
> > *syncobj,
> > +u64 point,
> >  struct dma_fence **fence,
> >  struct drm_syncobj_cb *cb,
> >  drm_syncobj_func_t func)
> >   {
> > int ret;
> >
> > -   ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
> > +   ret = drm_syncobj_search_fence(syncobj, point, 0, fence);
> > if (!ret)
> > return 1;
> >
> > @@ -143,7 +144,7 @@ static int
> drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
> >  */
> > if (!list_empty(>signal_pt_list)) {
> > spin_unlock(>lock);
> > -   drm_syncobj_search_fence(syncobj, 0, 0, fence);
> > +   drm_syncobj_search_fence(syncobj, point, 0, fence);
> > if (*fence)
> > return 1;
> > spin_lock(>lock);
> > @@ -358,7 +359,9 @@ void drm_syncobj_replace_fence(struct
> drm_syncobj *syncobj,
> > spin_lock(>lock);
> > list_for_each_entry_safe(cur, tmp, >cb_list, node)
> {
> > list_del_init(>node);
> > +   spin_unlock(>lock);
> > cur->func(syncobj, cur);
> > +   spin_lock(>lock);
> 
> That looks fishy to me. Why do we need to unlock 

The cb func will call _search_fence, which needs to grab the lock; otherwise we deadlock.


>and who guarantees that
> tmp is still valid when we grab the lock again?

Sorry for that; I quickly fixed the deadlock and forgot to

RE: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7

2018-09-20 Thread Zhou, David(ChunMing)


> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian König
> Sent: Thursday, September 20, 2018 5:35 PM
> To: Zhou, David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; Daniel Vetter ; amd-
> g...@lists.freedesktop.org
> Subject: Re: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7
> 
> The only thing I can still see is that you use wait_event_timeout() instead of
> wait_event_interruptible().
> 
> Any particular reason for that?

I tried that again after you mentioned it in the last thread; CTS always fails, and the syncobj unit test fails as well.


> 
> Apart from that it now looks good to me.

Thanks. Can I get your RB on it?

Btw, I realize the Vulkan spec names the semaphore types binary and timeline, so how about changing _TYPE_INDIVIDUAL to _TYPE_BINARY?

Regards,
David Zhou
> 
> Christian.
> 
> Am 20.09.2018 um 11:29 schrieb Zhou, David(ChunMing):
> > Ping...
> >
> >> -Original Message-
> >> From: amd-gfx  On Behalf Of
> >> Chunming Zhou
> >> Sent: Wednesday, September 19, 2018 5:18 PM
> >> To: dri-de...@lists.freedesktop.org
> >> Cc: Zhou, David(ChunMing) ; amd-
> >> g...@lists.freedesktop.org; Rakos, Daniel ;
> >> Daniel Vetter ; Dave Airlie ;
> >> Koenig, Christian 
> >> Subject: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7
> >>
> >> This patch is for VK_KHR_timeline_semaphore extension, semaphore is
> >> called syncobj in kernel side:
> >> This extension introduces a new type of syncobj that has an integer
> >> payload identifying a point in a timeline. Such timeline syncobjs
> >> support the following
> >> operations:
> >> * CPU query - A host operation that allows querying the payload of the
> >>   timeline syncobj.
> >> * CPU wait - A host operation that allows a blocking wait for a
> >>   timeline syncobj to reach a specified value.
> >> * Device wait - A device operation that allows waiting for a
> >>   timeline syncobj to reach a specified value.
> >> * Device signal - A device operation that allows advancing the
> >>   timeline syncobj to a specified value.
> >>
> >> v1:
> >> Since it's a timeline, that means the front time point(PT) always is
> >> signaled before the late PT.
> >> a. signal PT design:
> >> Signal PT fence N depends on PT[N-1] fence and signal opertion fence,
> >> when PT[N] fence is signaled, the timeline will increase to value of PT[N].
> >> b. wait PT design:
> >> Wait PT fence is signaled by reaching timeline point value, when
> >> timeline is increasing, will compare wait PTs value with new timeline
> >> value, if PT value is lower than timeline value, then wait PT will be 
> >> signaled,
> otherwise keep in list.
> >> syncobj wait operation can wait on any point of timeline, so need a
> >> RB tree to order them. And wait PT could ahead of signal PT, we need
> >> a sumission fence to perform that.
> >>
> >> v2:
> >> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2.
> >> move unexposed definitions to .c file. (Daniel Vetter) 3. split up the
> >> change to
> >> drm_syncobj_find_fence() in a separate patch. (Christian) 4. split up
> >> the change to drm_syncobj_replace_fence() in a separate patch.
> >> 5. drop the submission_fence implementation and instead use
> >> wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL
> type syncobj case.
> >> (Daniel Vetter)
> >>
> >> v3:
> >> 1. replace normal syncobj with timeline implemenation. (Vetter and
> Christian)
> >>  a. normal syncobj signal op will create a signal PT to tail of signal 
> >> pt list.
> >>  b. normal syncobj wait op will create a wait pt with last signal
> >> point, and this wait PT is only signaled by related signal point PT.
> >> 2. many bug fix and clean up
> >> 3. stub fence moving is moved to other patch.
> >>
> >> v4:
> >> 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2.
> >> fix syncobj lifecycle. (Christian) 3. only enable_signaling when
> >> there is wait_pt. (Christian) 4. fix timeline path issues.
> >> 5. write a timeline test in libdrm
> >>
> >> v5: (Christian)
> >> 1. semaphore is called syncobj in kernel side.
> >> 2. don't need 'timeline' characters in some function name.
> >> 3. keep syncobj cb.
> >>
> >> v6: (Christian)
&

RE: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7

2018-09-20 Thread Zhou, David(ChunMing)
Ping...

> -Original Message-
> From: amd-gfx  On Behalf Of
> Chunming Zhou
> Sent: Wednesday, September 19, 2018 5:18 PM
> To: dri-de...@lists.freedesktop.org
> Cc: Zhou, David(ChunMing) ; amd-
> g...@lists.freedesktop.org; Rakos, Daniel ; Daniel
> Vetter ; Dave Airlie ; Koenig,
> Christian 
> Subject: [PATCH 2/6] [RFC]drm: add syncobj timeline support v7
> 
> This patch is for VK_KHR_timeline_semaphore extension, semaphore is
> called syncobj in kernel side:
> This extension introduces a new type of syncobj that has an integer payload
> identifying a point in a timeline. Such timeline syncobjs support the 
> following
> operations:
>* CPU query - A host operation that allows querying the payload of the
>  timeline syncobj.
>* CPU wait - A host operation that allows a blocking wait for a
>  timeline syncobj to reach a specified value.
>* Device wait - A device operation that allows waiting for a
>  timeline syncobj to reach a specified value.
>* Device signal - A device operation that allows advancing the
>  timeline syncobj to a specified value.
> 
> v1:
> Since it's a timeline, that means the front time point(PT) always is signaled
> before the late PT.
> a. signal PT design:
> Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when
> PT[N] fence is signaled, the timeline will increase to value of PT[N].
> b. wait PT design:
> Wait PT fence is signaled by reaching timeline point value, when timeline is
> increasing, will compare wait PTs value with new timeline value, if PT value 
> is
> lower than timeline value, then wait PT will be signaled, otherwise keep in 
> list.
> syncobj wait operation can wait on any point of timeline, so need a RB tree to
> order them. And wait PT could ahead of signal PT, we need a sumission fence
> to perform that.
> 
> v2:
> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2.
> move unexposed definitions to .c file. (Daniel Vetter) 3. split up the change to
> drm_syncobj_find_fence() in a separate patch. (Christian) 4. split up the
> change to drm_syncobj_replace_fence() in a separate patch.
> 5. drop the submission_fence implementation and instead use wait_event()
> for that. (Christian) 6. WARN_ON(point != 0) for NORMAL type syncobj case.
> (Daniel Vetter)
> 
> v3:
> 1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
> a. normal syncobj signal op will create a signal PT to tail of signal pt 
> list.
> b. normal syncobj wait op will create a wait pt with last signal point, 
> and this
> wait PT is only signaled by related signal point PT.
> 2. many bug fix and clean up
> 3. stub fence moving is moved to other patch.
> 
> v4:
> 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. fix syncobj
> lifecycle. (Christian) 3. only enable_signaling when there is wait_pt. 
> (Christian)
> 4. fix timeline path issues.
> 5. write a timeline test in libdrm
> 
> v5: (Christian)
> 1. semaphore is called syncobj in kernel side.
> 2. don't need 'timeline' characters in some function name.
> 3. keep syncobj cb.
> 
> v6: (Christian)
> 1. merge syncobj_timeline to syncobj structure.
> 2. simplify some check sentences.
> 3. some misc change.
> 4. fix CTS failed issue.
> 
> v7: (Christian)
> 1. error handling when creating signal pt.
> 2. remove timeline naming in func.
> 3. export flags in find_fence.
> 4. allow reset timeline.
> 
> individual syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline
> syncobj is tested by ./amdgpu_test -s 9
> 
> Signed-off-by: Chunming Zhou 
> Cc: Christian Konig 
> Cc: Dave Airlie 
> Cc: Daniel Rakos 
> Cc: Daniel Vetter 
> ---
>  drivers/gpu/drm/drm_syncobj.c  | 293 ++---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +-
>  include/drm/drm_syncobj.h  |  65 ++---
>  include/uapi/drm/drm.h |   1 +
>  4 files changed, 287 insertions(+), 74 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_syncobj.c
> b/drivers/gpu/drm/drm_syncobj.c index f796c9fc3858..95b60ac045c6 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -56,6 +56,9 @@
>  #include "drm_internal.h"
>  #include 
> 
> +/* merge normal syncobj to timeline syncobj, the point interval is 1 */
> +#define DRM_SYNCOBJ_INDIVIDUAL_POINT 1
> +
>  struct drm_syncobj_stub_fence {
>   struct dma_fence base;
>   spinlock_t lock;
> @@ -82,6 +85,11 @@ static const struct dma_fence_ops
> drm_syncobj_stub_fence_ops = {
>   .release = drm_syncobj_stub_fence_release,  };
> 
> +struct drm_syncobj_signal_pt {
> +

RE: [PATCH 1/4] [RFC]drm: add syncobj timeline support v6

2018-09-19 Thread Zhou, David(ChunMing)


> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian König
> Sent: Wednesday, September 19, 2018 3:45 PM
> To: Zhou, David(ChunMing) ; Zhou,
> David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; Daniel Vetter ; amd-
> g...@lists.freedesktop.org
> Subject: Re: [PATCH 1/4] [RFC]drm: add syncobj timeline support v6
> 
> Am 19.09.2018 um 09:32 schrieb zhoucm1:
> >
> >
> > On 2018年09月19日 15:18, Christian König wrote:
> >> Am 19.09.2018 um 06:26 schrieb Chunming Zhou:
> > [snip]
> >>>   *fence = NULL;
> >>>   drm_syncobj_add_callback_locked(syncobj, cb, func); @@
> >>> -164,6 +177,153 @@ void drm_syncobj_remove_callback(struct
> >>> drm_syncobj *syncobj,
> >>>   spin_unlock(>lock);
> >>>   }
> >>>   +static void drm_syncobj_timeline_init(struct drm_syncobj
> >>> *syncobj)
> >>
> >> We still have _timeline_ in the name here.
> > the func is relevant to timeline members, or which name is proper?
> 
> Yeah, but we now use the timeline implementation for the individual syncobj
> as well.
> 
> Not a big issue, but I would just name it
> drm_syncobj_init()/drm_syncobj_fini.

There are already drm_syncobj_init/fini in drm_syncobj.c; can you suggest another name?

> 
> >
> >>
> >>> +{
> >>> +    spin_lock(>lock);
> >>> +    syncobj->timeline_context = dma_fence_context_alloc(1);
> > [snip]
> >>> +}
> >>> +
> >>> +int drm_syncobj_lookup_fence(struct drm_syncobj *syncobj, u64
> >>> +point,
> >>> +   struct dma_fence **fence) {
> >>> +
> >>> +    return drm_syncobj_search_fence(syncobj, point,
> >>> +    DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
> >>
> >> I still have a bad feeling setting that flag as default cause it
> >> might change the behavior for the UAPI.
> >>
> >> Maybe export drm_syncobj_search_fence directly? E.g. with the flags
> >> parameter.
> > previous v5 indeed do this, you let me wrap it, need change back?
> 
> No, the problem is that drm_syncobj_find_fence() is still using
> drm_syncobj_lookup_fence() which sets the flag instead of
> drm_syncobj_search_fence() without the flag.
> 
> That changes the UAPI behavior because previously we would have returned
> an error code and now we block for a fence to appear.
> 
> So I think the right solution would be to add the flags parameter to
> drm_syncobj_find_fence() and let the driver decide if we need to block or
> get -ENOENT.

I see what you mean.
Exporting the flag in the function is easy, but the driver doesn't pass a flag, so which flag is proper by default? We still need to give a default flag in the patch, don't we?
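To make it concrete, a rough sketch of the exported-flags variant (hypothetical, just to illustrate the question):

	int drm_syncobj_find_fence(struct drm_file *file_private,
				   u32 handle, u64 point, u64 flags,
				   struct dma_fence **fence)
	{
		struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
		int ret;

		if (!syncobj)
			return -ENOENT;

		/* The caller decides: flags == 0 returns -ENOENT when no fence is
		 * available yet, DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT blocks until
		 * one appears.  The open question is which of the two a driver gets
		 * when it doesn't pass anything explicitly. */
		ret = drm_syncobj_search_fence(syncobj, point, flags, fence);
		drm_syncobj_put(syncobj);
		return ret;
	}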

Thanks,
David Zhou

> 
> Regards,
> Christian.
> 
> >
> > Regards,
> > David Zhou
> >>
> >> Regards,
> >> Christian.
> >>
> >>> +    fence);
> >>> +}
> >>> +EXPORT_SYMBOL(drm_syncobj_lookup_fence);
> >>> +
> >>>   /**
> >>>    * drm_syncobj_find_fence - lookup and reference the fence in a
> >>> sync object
> >>>    * @file_private: drm file private pointer @@ -228,7 +443,7 @@
> >>> static int drm_syncobj_assign_null_handle(struct
> >>> drm_syncobj *syncobj)
> >>>    * @fence: out parameter for the fence
> >>>    *
> >>>    * This is just a convenience function that combines
> >>> drm_syncobj_find() and
> >>> - * drm_syncobj_fence_get().
> >>> + * drm_syncobj_lookup_fence().
> >>>    *
> >>>    * Returns 0 on success or a negative error value on failure. On
> >>> success @fence
> >>>    * contains a reference to the fence, which must be released by
> >>> calling @@ -236,18 +451,11 @@ static int
> >>> drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
> >>>    */
> >>>   int drm_syncobj_find_fence(struct drm_file *file_private,
> >>>  u32 handle, u64 point,
> >>> -   struct dma_fence **fence) -{
> >>> +   struct dma_fence **fence) {
> >>>   struct drm_syncobj *syncobj = drm_syncobj_find(file_private,
> >>> handle);
> >>> -    int ret = 0;
> >>> -
> >>> -    if (!syncobj)
> >>> -    return -ENOENT;
> >>> +    int ret;
> >>>   -    *fe

RE: [PATCH] drm/amdgpu: remove fence fallback

2018-09-18 Thread Zhou, David(ChunMing)


> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian König
> Sent: Tuesday, September 18, 2018 4:43 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: remove fence fallback
> 
> DC doesn't seem to have a fallback path either.
> 
> So when interrupts don't work any more we are pretty much busted no
> matter what.
> 
> Signed-off-by: Christian König 

Reviewed-by: Chunming Zhou 


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 56 ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |  1 -
>  3 files changed, 58 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 27382767e15a..c18d68575462 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -146,7 +146,6 @@ extern int amdgpu_cik_support;
>  #define AMDGPU_DEFAULT_GTT_SIZE_MB   3072ULL /* 3GB by
> default */
>  #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS   3000
>  #define AMDGPU_MAX_USEC_TIMEOUT  10  /* 100
> ms */
> -#define AMDGPU_FENCE_JIFFIES_TIMEOUT (HZ / 2)
>  /* AMDGPU_IB_POOL_SIZE must be a power of 2 */
>  #define AMDGPU_IB_POOL_SIZE  16
>  #define AMDGPU_DEBUGFS_MAX_COMPONENTS32
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index da36731460b5..176f28777f5e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -195,19 +195,6 @@ int amdgpu_fence_emit_polling(struct amdgpu_ring
> *ring, uint32_t *s)
>   return 0;
>  }
> 
> -/**
> - * amdgpu_fence_schedule_fallback - schedule fallback check
> - *
> - * @ring: pointer to struct amdgpu_ring
> - *
> - * Start a timer as fallback to our interrupts.
> - */
> -static void amdgpu_fence_schedule_fallback(struct amdgpu_ring *ring) -{
> - mod_timer(>fence_drv.fallback_timer,
> -   jiffies + AMDGPU_FENCE_JIFFIES_TIMEOUT);
> -}
> -
>  /**
>   * amdgpu_fence_process - check for fence activity
>   *
> @@ -229,9 +216,6 @@ void amdgpu_fence_process(struct amdgpu_ring
> *ring)
> 
>   } while (atomic_cmpxchg(>last_seq, last_seq, seq) != last_seq);
> 
> - if (seq != ring->fence_drv.sync_seq)
> - amdgpu_fence_schedule_fallback(ring);
> -
>   if (unlikely(seq == last_seq))
>   return;
> 
> @@ -262,21 +246,6 @@ void amdgpu_fence_process(struct amdgpu_ring
> *ring)
>   } while (last_seq != seq);
>  }
> 
> -/**
> - * amdgpu_fence_fallback - fallback for hardware interrupts
> - *
> - * @work: delayed work item
> - *
> - * Checks for fence activity.
> - */
> -static void amdgpu_fence_fallback(struct timer_list *t) -{
> - struct amdgpu_ring *ring = from_timer(ring, t,
> -   fence_drv.fallback_timer);
> -
> - amdgpu_fence_process(ring);
> -}
> -
>  /**
>   * amdgpu_fence_wait_empty - wait for all fences to signal
>   *
> @@ -424,8 +393,6 @@ int amdgpu_fence_driver_init_ring(struct
> amdgpu_ring *ring,
>   atomic_set(>fence_drv.last_seq, 0);
>   ring->fence_drv.initialized = false;
> 
> - timer_setup(>fence_drv.fallback_timer,
> amdgpu_fence_fallback, 0);
> -
>   ring->fence_drv.num_fences_mask = num_hw_submission * 2 - 1;
>   spin_lock_init(>fence_drv.lock);
>   ring->fence_drv.fences = kcalloc(num_hw_submission * 2,
> sizeof(void *), @@ -501,7 +468,6 @@ void amdgpu_fence_driver_fini(struct
> amdgpu_device *adev)
>   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
>  ring->fence_drv.irq_type);
>   drm_sched_fini(>sched);
> - del_timer_sync(>fence_drv.fallback_timer);
>   for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
>   dma_fence_put(ring->fence_drv.fences[j]);
>   kfree(ring->fence_drv.fences);
> @@ -594,27 +560,6 @@ static const char
> *amdgpu_fence_get_timeline_name(struct dma_fence *f)
>   return (const char *)fence->ring->name;  }
> 
> -/**
> - * amdgpu_fence_enable_signaling - enable signalling on fence
> - * @fence: fence
> - *
> - * This function is called with fence_queue lock held, and adds a callback
> - * to fence_queue that checks if this fence is signaled, and if so it
> - * signals the fence and removes itself.
> - */
> -static bool amdgpu_fence_enable_signaling(struct dma_fence *f) -{
> - struct amdgpu_fence *fence = to_amdgpu_fence(f);
> - struct amdgpu_ring *ring = fence->ring;
> -
> - if (!timer_pending(>fence_drv.fallback_timer))
> - amdgpu_fence_schedule_fallback(ring);
> -
> - DMA_FENCE_TRACE(>base, "armed on ring %i!\n", ring-
> >idx);
> -
> - return true;
> -}
> -
>  /**
>   * amdgpu_fence_free - free up the fence memory
>   *
> @@ -645,7 +590,6 @@ static void amdgpu_fence_release(struct dma_fence
> *f)  

RE: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-16 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Daniel Vetter  On Behalf Of Daniel Vetter
> Sent: Saturday, September 15, 2018 12:11 AM
> To: Koenig, Christian 
> Cc: Zhou, David(ChunMing) ; dri-
> de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Dave Airlie
> ; Rakos, Daniel ; Daniel
> Vetter 
> Subject: Re: [PATCH] [RFC]drm: add syncobj timeline support v5
> 
> On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:
> > Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
> > > This patch is for VK_KHR_timeline_semaphore extension, semaphore is
> called syncobj in kernel side:
> > > This extension introduces a new type of syncobj that has an integer
> > > payload identifying a point in a timeline. Such timeline syncobjs
> > > support the following operations:
> > > * CPU query - A host operation that allows querying the payload of the
> > >   timeline syncobj.
> > > * CPU wait - A host operation that allows a blocking wait for a
> > >   timeline syncobj to reach a specified value.
> > > * Device wait - A device operation that allows waiting for a
> > >   timeline syncobj to reach a specified value.
> > > * Device signal - A device operation that allows advancing the
> > >   timeline syncobj to a specified value.
> > >
> > > Since it's a timeline, that means the front time point(PT) always is
> signaled before the late PT.
> > > a. signal PT design:
> > > Signal PT fence N depends on PT[N-1] fence and signal opertion
> > > fence, when PT[N] fence is signaled, the timeline will increase to value 
> > > of
> PT[N].
> > > b. wait PT design:
> > > Wait PT fence is signaled by reaching timeline point value, when
> > > timeline is increasing, will compare wait PTs value with new
> > > timeline value, if PT value is lower than timeline value, then wait
> > > PT will be signaled, otherwise keep in list. syncobj wait operation
> > > can wait on any point of timeline, so need a RB tree to order them. And
> wait PT could ahead of signal PT, we need a sumission fence to perform that.
> > >
> > > v2:
> > > 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2.
> move
> > > unexposed definitions to .c file. (Daniel Vetter) 3. split up the
> > > change to drm_syncobj_find_fence() in a separate patch. (Christian)
> > > 4. split up the change to drm_syncobj_replace_fence() in a separate
> patch.
> > > 5. drop the submission_fence implementation and instead use
> > > wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL
> > > type syncobj case. (Daniel Vetter)
> > >
> > > v3:
> > > 1. replace normal syncobj with timeline implemenation. (Vetter and
> Christian)
> > >  a. normal syncobj signal op will create a signal PT to tail of 
> > > signal pt list.
> > >  b. normal syncobj wait op will create a wait pt with last signal 
> > > point, and
> this wait PT is only signaled by related signal point PT.
> > > 2. many bug fix and clean up
> > > 3. stub fence moving is moved to other patch.
> > >
> > > v4:
> > > 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2.
> > > fix syncobj lifecycle. (Christian) 3. only enable_signaling when
> > > there is wait_pt. (Christian) 4. fix timeline path issues.
> > > 5. write a timeline test in libdrm
> > >
> > > v5: (Christian)
> > > 1. semaphore is called syncobj in kernel side.
> > > 2. don't need 'timeline' characters in some function name.
> > > 3. keep syncobj cb
> > >
> > > normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline
> > > syncobj is tested by ./amdgpu_test -s 9
> > >
> > > Signed-off-by: Chunming Zhou 
> > > Cc: Christian Konig 
> > > Cc: Dave Airlie 
> > > Cc: Daniel Rakos 
> > > Cc: Daniel Vetter 
> >
> > At least on first glance that looks like it should work, going to do a
> > detailed review on Monday.
> 
> Just for my understanding, it's all condensed down to 1 patch now?

Yes, Christian suggested that.

 >I kinda
> didn't follow the detailed discussion last few days at all :-/
> 
> Also, is there a testcase, igt highly preferred (because then we'll run it in 
> our
> intel-gfx CI, and a bunch of people outside of intel have already discovered
> that and are using it).


I already wrote the test as a libdrm unit test, since I'm not familiar with the IGT stuff.

Thanks,
David Zhou
> 
> Thanks, Daniel
> 
> >
>

RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-14 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Koenig, Christian
> Sent: Friday, September 14, 2018 3:27 PM
> To: Zhou, David(ChunMing) ; Zhou,
> David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; amd-gfx@lists.freedesktop.org; Daniel Vetter
> 
> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> 
> Am 14.09.2018 um 05:59 schrieb zhoucm1:
> >
> >
> > On 2018年09月14日 11:14, zhoucm1 wrote:
> >>
> >>
> >> On 2018年09月13日 18:22, Christian König wrote:
> >>> Am 13.09.2018 um 11:35 schrieb Zhou, David(ChunMing):
> >>>>
> >>>>> -----Original Message-
> >>>>> From: Koenig, Christian
> >>>>> Sent: Thursday, September 13, 2018 5:20 PM
> >>>>> To: Zhou, David(ChunMing) ; dri-
> >>>>> de...@lists.freedesktop.org
> >>>>> Cc: Dave Airlie ; Rakos, Daniel
> >>>>> ; amd-gfx@lists.freedesktop.org
> >>>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> >>>>>
> >>>>> Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):
> >>>>>>> -Original Message-
> >>>>>>> From: Christian König 
> >>>>>>> Sent: Thursday, September 13, 2018 4:50 PM
> >>>>>>> To: Zhou, David(ChunMing) ; Koenig,
> >>>>>>> Christian ;
> >>>>>>> dri-de...@lists.freedesktop.org
> >>>>>>> Cc: Dave Airlie ; Rakos, Daniel
> >>>>>>> ; amd-gfx@lists.freedesktop.org
> >>>>>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support
> >>>>>>> v4
> >>>>>>>
> >>>>>>> Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):
> >>>>>>>>> -Original Message-
> >>>>>>>>> From: Koenig, Christian
> >>>>>>>>> Sent: Thursday, September 13, 2018 2:56 PM
> >>>>>>>>> To: Zhou, David(ChunMing) ; Zhou,
> >>>>>>>>> David(ChunMing) ; dri-
> >>>>>>>>> de...@lists.freedesktop.org
> >>>>>>>>> Cc: Dave Airlie ; Rakos, Daniel
> >>>>>>>>> ; amd-gfx@lists.freedesktop.org
> >>>>>>>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline
> >>>>>>>>> support v4
> >>>>>>>>>
> >>>>>>>>> Am 13.09.2018 um 04:15 schrieb zhoucm1:
> >>>>>>>>>> On 2018年09月12日 19:05, Christian König wrote:
> >>>>>>>>>>>>>>> [SNIP]
> >>>>>>>>>>>>>>> +static void
> >>>>>>>>>>>>>>> +drm_syncobj_find_signal_pt_for_wait_pt(struct
> >>>>>>>>>>>>>>> drm_syncobj *syncobj,
> >>>>>>>>>>>>>>> +   struct drm_syncobj_wait_pt
> >>>>>>>>>>>>>>> +*wait_pt) {
> >>>>>>>>>>>>>> That whole approach still looks horrible complicated to me.
> >>>>>>>>>>>> It's already very close to what you said before.
> >>>>>>>>>>>>
> >>>>>>>>>>>>>> Especially the separation of signal and wait pt is
> >>>>>>>>>>>>>> completely unnecessary as far as I can see.
> >>>>>>>>>>>>>> When a wait pt is requested we just need to search for
> >>>>>>>>>>>>>> the signal point which it will trigger.
> >>>>>>>>>>>> Yeah, I tried this, but when I implement cpu wait ioctl on
> >>>>>>>>>>>> specific point, we need a advanced wait pt fence,
> >>>>>>>>>>>> otherwise, we could still need old syncobj cb.
> >>>>>>>>>>> Why? I mean you just need to call drm_syncobj_find_fence()
> >>>>>>>>>>> and
> >>>>>>> when
> >>>>>>>>>>> that one returns NULL you use wait_event_*() to wait for a
> >>>>>>>>>>> signal point >= your wait point to appear and tr

RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Koenig, Christian
> Sent: Thursday, September 13, 2018 5:20 PM
> To: Zhou, David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> 
> Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):
> >
> >> -Original Message-
> >> From: Christian König 
> >> Sent: Thursday, September 13, 2018 4:50 PM
> >> To: Zhou, David(ChunMing) ; Koenig, Christian
> >> ; dri-de...@lists.freedesktop.org
> >> Cc: Dave Airlie ; Rakos, Daniel
> >> ; amd-gfx@lists.freedesktop.org
> >> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> >>
> >> Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):
> >>>> -----Original Message-
> >>>> From: Koenig, Christian
> >>>> Sent: Thursday, September 13, 2018 2:56 PM
> >>>> To: Zhou, David(ChunMing) ; Zhou,
> >>>> David(ChunMing) ; dri-
> >>>> de...@lists.freedesktop.org
> >>>> Cc: Dave Airlie ; Rakos, Daniel
> >>>> ; amd-gfx@lists.freedesktop.org
> >>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> >>>>
> >>>> Am 13.09.2018 um 04:15 schrieb zhoucm1:
> >>>>> On 2018年09月12日 19:05, Christian König wrote:
> >>>>>>>>>> [SNIP]
> >>>>>>>>>> +static void drm_syncobj_find_signal_pt_for_wait_pt(struct
> >>>>>>>>>> drm_syncobj *syncobj,
> >>>>>>>>>> +   struct drm_syncobj_wait_pt
> >>>>>>>>>> +*wait_pt) {
> >>>>>>>>> That whole approach still looks horrible complicated to me.
> >>>>>>> It's already very close to what you said before.
> >>>>>>>
> >>>>>>>>> Especially the separation of signal and wait pt is completely
> >>>>>>>>> unnecessary as far as I can see.
> >>>>>>>>> When a wait pt is requested we just need to search for the
> >>>>>>>>> signal point which it will trigger.
> >>>>>>> Yeah, I tried this, but when I implement cpu wait ioctl on
> >>>>>>> specific point, we need a advanced wait pt fence, otherwise, we
> >>>>>>> could still need old syncobj cb.
> >>>>>> Why? I mean you just need to call drm_syncobj_find_fence() and
> >> when
> >>>>>> that one returns NULL you use wait_event_*() to wait for a signal
> >>>>>> point >= your wait point to appear and try again.
> >>>>> e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have
> >>>>> no fence yet, as you said, during drm_syncobj_find_fence(A) is
> >>>>> working on wait_event, syncobjB and syncobjC could already be
> >>>>> signaled, then we don't know which one is first signaled, which is
> >>>>> need when wait ioctl returns.
> >>>> I don't really see a problem with that. When you wait for the first
> >>>> one you need to wait for A,B,C at the same time anyway.
> >>>>
> >>>> So what you do is to register a fence callback on the fences you
> >>>> already have and for the syncobj which doesn't yet have a fence you
> >>>> make sure that they wake up your thread when they get one.
> >>>>
> >>>> So essentially exactly what drm_syncobj_fence_get_or_add_callback()
> >>>> already does today.
> >>> So do you mean we need still use old syncobj CB for that?
> >> Yes, as far as I can see it should work.
> >>
> >>>Advanced wait pt is bad?
> >> Well it isn't bad, I just don't see any advantage in it.
> >
> > The advantage is to replace old syncobj cb.
> >
> >> The existing mechanism
> >> should already be able to handle that.
> > I thought more a bit, we don't that mechanism at all, if use advanced wait
> pt, we can easily use fence array to achieve it for wait ioctl, we should use
> kernel existing feature as much as possible, not invent another, shouldn't we?
> I remember  you said  it before.
> 
> Yeah, but the syncobj cb is an existing feature.

This is obviously a workaround when used for the wait ioctl. Do you see it used anywhere else?

> And I absolutely don't

RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Christian König 
> Sent: Thursday, September 13, 2018 4:50 PM
> To: Zhou, David(ChunMing) ; Koenig, Christian
> ; dri-de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> 
> Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):
> >
> >> -Original Message-
> >> From: Koenig, Christian
> >> Sent: Thursday, September 13, 2018 2:56 PM
> >> To: Zhou, David(ChunMing) ; Zhou,
> >> David(ChunMing) ; dri-
> >> de...@lists.freedesktop.org
> >> Cc: Dave Airlie ; Rakos, Daniel
> >> ; amd-gfx@lists.freedesktop.org
> >> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> >>
> >> Am 13.09.2018 um 04:15 schrieb zhoucm1:
> >>> On 2018年09月12日 19:05, Christian König wrote:
> >>>>>>>> [SNIP]
> >>>>>>>> +static void drm_syncobj_find_signal_pt_for_wait_pt(struct
> >>>>>>>> drm_syncobj *syncobj,
> >>>>>>>> +   struct drm_syncobj_wait_pt
> >>>>>>>> +*wait_pt) {
> >>>>>>> That whole approach still looks horrible complicated to me.
> >>>>> It's already very close to what you said before.
> >>>>>
> >>>>>>> Especially the separation of signal and wait pt is completely
> >>>>>>> unnecessary as far as I can see.
> >>>>>>> When a wait pt is requested we just need to search for the
> >>>>>>> signal point which it will trigger.
> >>>>> Yeah, I tried this, but when I implement cpu wait ioctl on
> >>>>> specific point, we need a advanced wait pt fence, otherwise, we
> >>>>> could still need old syncobj cb.
> >>>> Why? I mean you just need to call drm_syncobj_find_fence() and
> when
> >>>> that one returns NULL you use wait_event_*() to wait for a signal
> >>>> point >= your wait point to appear and try again.
> >>> e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have
> >>> no fence yet, as you said, during drm_syncobj_find_fence(A) is
> >>> working on wait_event, syncobjB and syncobjC could already be
> >>> signaled, then we don't know which one is first signaled, which is
> >>> need when wait ioctl returns.
> >> I don't really see a problem with that. When you wait for the first
> >> one you need to wait for A,B,C at the same time anyway.
> >>
> >> So what you do is to register a fence callback on the fences you
> >> already have and for the syncobj which doesn't yet have a fence you
> >> make sure that they wake up your thread when they get one.
> >>
> >> So essentially exactly what drm_syncobj_fence_get_or_add_callback()
> >> already does today.
> > So do you mean we need still use old syncobj CB for that?
> 
> Yes, as far as I can see it should work.
> 
> >   Advanced wait pt is bad?
> 
> Well it isn't bad, I just don't see any advantage in it.


The advantage is that it replaces the old syncobj cb.

> The existing mechanism
> should already be able to handle that.

I thought about it a bit more; we don't need that mechanism at all. If we use the advanced wait pt, we can easily use a fence array to achieve this for the wait ioctl, as sketched below. We should use existing kernel features as much as possible rather than inventing another, shouldn't we? I remember you said that before.
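Roughly what I have in mind for the wait-all case (just a sketch; drm_syncobj_get_wait_pt_fence() is an assumed helper that hands back the advanced wait-pt fence for a given point):

	static signed long
	syncobj_array_wait(struct drm_syncobj **syncobjs, u64 *points,
			   u32 count, signed long timeout)
	{
		struct dma_fence **fences;
		struct dma_fence_array *array;
		signed long ret;
		u32 i;

		fences = kcalloc(count, sizeof(*fences), GFP_KERNEL);
		if (!fences)
			return -ENOMEM;

		for (i = 0; i < count; i++)
			fences[i] = drm_syncobj_get_wait_pt_fence(syncobjs[i],
								  points[i]);

		/* the array takes ownership of the fences on success */
		array = dma_fence_array_create(count, fences,
					       dma_fence_context_alloc(1), 1,
					       false);
		if (!array) {
			while (count--)
				dma_fence_put(fences[count]);
			kfree(fences);
			return -ENOMEM;
		}

		ret = dma_fence_wait_timeout(&array->base, true, timeout);
		dma_fence_put(&array->base);
		return ret;
	}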

Thanks,
David Zhou
> 
> Christian.
> 
> >
> > Thanks,
> > David Zhou
> >> Regards,
> >> Christian.
> >>
> >>> Back to my implementation, it already fixes all your concerns
> >>> before, and can be able to easily used in wait_ioctl. When you feel
> >>> that is complicated, I guess that is because we merged all logic to
> >>> that and much clean up in one patch. In fact, it already is very
> >>> simple, timeline_init/fini, create signal/wait_pt, find signal_pt
> >>> for wait_pt, garbage collection, just them.
> >>>
> >>> Thanks,
> >>> David Zhou
> >>>> Regards,
> >>>> Christian.
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Koenig, Christian
> Sent: Thursday, September 13, 2018 2:56 PM
> To: Zhou, David(ChunMing) ; Zhou,
> David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> 
> Am 13.09.2018 um 04:15 schrieb zhoucm1:
> > On 2018年09月12日 19:05, Christian König wrote:
> >>>>>
> >>>>>> [SNIP]
> >>>>>> +static void drm_syncobj_find_signal_pt_for_wait_pt(struct
> >>>>>> drm_syncobj *syncobj,
> >>>>>> +   struct drm_syncobj_wait_pt *wait_pt)
> >>>>>> +{
> >>>>>
> >>>>> That whole approach still looks horrible complicated to me.
> >>> It's already very close to what you said before.
> >>>
> >>>>>
> >>>>> Especially the separation of signal and wait pt is completely
> >>>>> unnecessary as far as I can see.
> >>>>> When a wait pt is requested we just need to search for the signal
> >>>>> point which it will trigger.
> >>> Yeah, I tried this, but when I implement cpu wait ioctl on specific
> >>> point, we need a advanced wait pt fence, otherwise, we could still
> >>> need old syncobj cb.
> >>
> >> Why? I mean you just need to call drm_syncobj_find_fence() and when
> >> that one returns NULL you use wait_event_*() to wait for a signal
> >> point >= your wait point to appear and try again.
> > e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have no
> > fence yet, as you said, during drm_syncobj_find_fence(A) is working on
> > wait_event, syncobjB and syncobjC could already be signaled, then we
> > don't know which one is first signaled, which is need when wait ioctl
> > returns.
> 
> I don't really see a problem with that. When you wait for the first one you
> need to wait for A,B,C at the same time anyway.
> 
> So what you do is to register a fence callback on the fences you already have
> and for the syncobj which doesn't yet have a fence you make sure that they
> wake up your thread when they get one.
> 
> So essentially exactly what drm_syncobj_fence_get_or_add_callback()
> already does today.

So do you mean we still need to use the old syncobj CB for that? Is the advanced wait pt bad?

Thanks,
David Zhou
> 
> Regards,
> Christian.
> 
> >
> > Back to my implementation, it already fixes all your concerns before,
> > and can be able to easily used in wait_ioctl. When you feel that is
> > complicated, I guess that is because we merged all logic to that and
> > much clean up in one patch. In fact, it already is very simple,
> > timeline_init/fini, create signal/wait_pt, find signal_pt for wait_pt,
> > garbage collection, just them.
> >
> > Thanks,
> > David Zhou
> >>
> >> Regards,
> >> Christian.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/scheduler: Add stopped flag to drm_sched_entity

2018-08-20 Thread Zhou, David(ChunMing)


-Original Message-
From: dri-devel  On Behalf Of Andrey 
Grodzovsky
Sent: Friday, August 17, 2018 11:16 PM
To: dri-de...@lists.freedesktop.org
Cc: Koenig, Christian ; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/scheduler: Add stopped flag to drm_sched_entity

The flag will prevent another thread from the same process from reinserting the entity queue into the scheduler's rq after it was already removed from there by another thread during drm_sched_entity_flush.

Signed-off-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 10 +-
 include/drm/gpu_scheduler.h  |  2 ++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 1416edb..07cfe63 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -177,8 +177,12 @@ long drm_sched_entity_flush(struct drm_sched_entity 
*entity, long timeout)
/* For killed process disable any more IBs enqueue right now */
last_user = cmpxchg(>last_user, current->group_leader, NULL);
if ((!last_user || last_user == current->group_leader) &&
-   (current->flags & PF_EXITING) && (current->exit_code == SIGKILL))
+   (current->flags & PF_EXITING) && (current->exit_code == SIGKILL)) {
+   spin_lock(>rq_lock);
+   entity->stopped = true;
drm_sched_rq_remove_entity(entity->rq, entity);
+   spin_unlock(>rq_lock);
+   }
 
return ret;
 }
@@ -504,6 +508,10 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job,
if (first) {
/* Add the entity to the run queue */
spin_lock(>rq_lock);
+   if (entity->stopped) {
+   spin_unlock(>rq_lock);
+   return;
+   }
[DZ] The code has changed so frequently recently that it has this regression; my code synced last Friday still has the check below:
	spin_lock(&entity->rq_lock);
	if (!entity->rq) {
		DRM_ERROR("Trying to push to a killed entity\n");
		spin_unlock(&entity->rq_lock);
		return;
	}
So you should add DRM_ERROR as well when hitting it.
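For example (rough sketch of the combination I mean):
	spin_lock(&entity->rq_lock);
	if (entity->stopped) {
		spin_unlock(&entity->rq_lock);
		DRM_ERROR("Trying to push to a killed entity\n");
		return;
	}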

With that fix, patch is Reviewed-by: Chunming Zhou 

Regards,
David Zhou
drm_sched_rq_add_entity(entity->rq, entity);
spin_unlock(>rq_lock);
drm_sched_wakeup(entity->rq->sched);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 
919ae57..daec50f 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -70,6 +70,7 @@ enum drm_sched_priority {
  * @fini_status: contains the exit status in case the process was signalled.
  * @last_scheduled: points to the finished fence of the last scheduled job.
  * @last_user: last group leader pushing a job into the entity.
+ * @stopped: Marks the enity as removed from rq and destined for termination.
  *
  * Entities will emit jobs in order to their corresponding hardware
  * ring, and the scheduler will alternate between entities based on @@ -92,6 
+93,7 @@ struct drm_sched_entity {
atomic_t*guilty;
struct dma_fence*last_scheduled;
struct task_struct  *last_user;
+   boolstopped;
 };
 
 /**
--
2.7.4

___
dri-devel mailing list
dri-de...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 3/4] drm/scheduler: add new function to get least loaded sched v2

2018-08-01 Thread Zhou, David(ChunMing)
Another big question:
I agree the general idea of balancing scheduler load within the same ring family is good.
But when jobs from the same entity run on different schedulers, a later job could complete ahead of an earlier one, right?
That would break the fence design: a later fence must be signaled after the earlier fence in the same fence context.
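In dma_fence terms, the constraint is roughly this (sketch; job1/job2 are two jobs pushed from the same entity):

	struct dma_fence *f1 = &job1->s_fence->finished;
	struct dma_fence *f2 = &job2->s_fence->finished;

	/* f1->context == f2->context (the entity's fence context) and
	 * f2->seqno > f1->seqno, so consumers may assume f1 signals no
	 * later than f2.  If the two jobs land on different HW rings,
	 * that ordering can be violated. */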

Anything I missed?

Regards,
David Zhou

From: dri-devel  On Behalf Of Nayan 
Deshmukh
Sent: Thursday, August 02, 2018 12:07 AM
To: Grodzovsky, Andrey 
Cc: amd-gfx@lists.freedesktop.org; Maling list - DRI developers 
; Koenig, Christian 
Subject: Re: [PATCH 3/4] drm/scheduler: add new function to get least loaded 
sched v2

Yes, that is correct.

Nayan

On Wed, Aug 1, 2018, 9:05 PM Andrey Grodzovsky 
mailto:andrey.grodzov...@amd.com>> wrote:
Clarification question - if the run queues belong to different schedulers they effectively point to different rings.

It means we allow moving (rescheduling) a drm_sched_entity from one ring to another - I assume that was the idea in the first place: you have a set of HW rings and you can utilize any of them for your jobs (like compute rings). Correct?

Andrey


On 08/01/2018 04:20 AM, Nayan Deshmukh wrote:
> The function selects the run queue from the rq_list with the
> least load. The load is decided by the number of jobs in a
> scheduler.
>
> v2: avoid using atomic read twice consecutively, instead store
>  it locally
>
> Signed-off-by: Nayan Deshmukh 
> mailto:nayan26deshm...@gmail.com>>
> ---
>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 25 +
>   1 file changed, 25 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c 
> b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 375f6f7f6a93..fb4e542660b0 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -255,6 +255,31 @@ static bool drm_sched_entity_is_ready(struct 
> drm_sched_entity *entity)
>   return true;
>   }
>
> +/**
> + * drm_sched_entity_get_free_sched - Get the rq from rq_list with least load
> + *
> + * @entity: scheduler entity
> + *
> + * Return the pointer to the rq with least load.
> + */
> +static struct drm_sched_rq *
> +drm_sched_entity_get_free_sched(struct drm_sched_entity *entity)
> +{
> + struct drm_sched_rq *rq = NULL;
> + unsigned int min_jobs = UINT_MAX, num_jobs;
> + int i;
> +
> + for (i = 0; i < entity->num_rq_list; ++i) {
> + num_jobs = atomic_read(>rq_list[i]->sched->num_jobs);
> + if (num_jobs < min_jobs) {
> + min_jobs = num_jobs;
> + rq = entity->rq_list[i];
> + }
> + }
> +
> + return rq;
> +}
> +
>   static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>   struct dma_fence_cb *cb)
>   {
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of bo (v3)

2018-07-30 Thread Zhou, David(ChunMing)
Typo, excepted -> expected

-Original Message-
From: amd-gfx  On Behalf Of Zhou, 
David(ChunMing)
Sent: Tuesday, July 31, 2018 9:41 AM
To: Koenig, Christian ; Zhang, Jerry 
; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of 
bo (v3)

Thanks to Jerry for still remembering this series.

Hi Christian,

For upstreaming this feature, it seems we already had agreement a long time ago. Two reasons for upstreaming:
1. this bug was found by an OpenGL game, so the bug is also present in the Mesa driver in theory.
2. after upstreaming these patches, we can reduce the pro-specific patches and get closer to open source.

Btw, a unit test is excepted when upstreaming, as I remember Alex mentioned.

Thanks,
David Zhou
-Original Message-
From: Christian König 
Sent: Monday, July 30, 2018 6:48 PM
To: Zhang, Jerry ; amd-gfx@lists.freedesktop.org
Cc: Zhou, David(ChunMing) ; Koenig, Christian 

Subject: Re: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of 
bo (v3)

Am 30.07.2018 um 12:02 schrieb Junwei Zhang:
> From: Chunming Zhou 
>
> v2: get original gem handle from gobj
> v3: update find bo data structure as union(in, out)
>  simply some code logic

Do we now have an open source user for this, so that we can upstream it? 
One more point below.

>
> Signed-off-by: Chunming Zhou 
> Signed-off-by: Junwei Zhang  (v3)
> Reviewed-by: Christian König 
> Reviewed-by: Jammy Zhou 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 63 
> +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |  3 +-
>   include/uapi/drm/amdgpu_drm.h   | 21 +++
>   4 files changed, 88 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 4cd20e7..46c370b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1213,6 +1213,8 @@ int amdgpu_gem_info_ioctl(struct drm_device *dev, void 
> *data,
> struct drm_file *filp);
>   int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
>   struct drm_file *filp);
> +int amdgpu_gem_find_bo_by_cpu_mapping_ioctl(struct drm_device *dev, void 
> *data,
> + struct drm_file *filp);
>   int amdgpu_gem_mmap_ioctl(struct drm_device *dev, void *data,
> struct drm_file *filp);
>   int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data, 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 71792d8..bae8417 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -288,6 +288,69 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
> *data,
>   return 0;
>   }
>   
> +static int amdgpu_gem_get_handle_from_object(struct drm_file *filp,
> +  struct drm_gem_object *obj) {
> + int i;
> + struct drm_gem_object *tmp;
> +
> + spin_lock(>table_lock);
> + idr_for_each_entry(>object_idr, tmp, i) {
> + if (obj == tmp) {
> + drm_gem_object_reference(obj);
> + spin_unlock(>table_lock);
> + return i;
> + }
> + }

Please double check if that is still up to date.

I think we could as well try to use the DMA-buf handle tree for that.

Christian.

> + spin_unlock(>table_lock);
> +
> + return 0;
> +}
> +
> +int amdgpu_gem_find_bo_by_cpu_mapping_ioctl(struct drm_device *dev, void 
> *data,
> + struct drm_file *filp)
> +{
> + union drm_amdgpu_gem_find_bo *args = data;
> + struct drm_gem_object *gobj;
> + struct amdgpu_bo *bo;
> + struct ttm_buffer_object *tbo;
> + struct vm_area_struct *vma;
> + uint32_t handle;
> + int r;
> +
> + if (offset_in_page(args->in.addr | args->in.size))
> + return -EINVAL;
> +
> + down_read(>mm->mmap_sem);
> + vma = find_vma(current->mm, args->in.addr);
> + if (!vma || vma->vm_file != filp->filp ||
> + (args->in.size > (vma->vm_end - args->in.addr))) {
> + args->out.handle = 0;
> + up_read(>mm->mmap_sem);
> + return -EINVAL;
> + }
> + args->out.offset = args->in.addr - vma->vm_start;
> +
> + tbo = vma->vm_private_data;
> + bo = container_of(tbo, struct amdgpu_bo, tbo);
> + amdgpu_bo_ref(bo);
> + gobj = >gem_base;
> +
> + handle = amdgpu

RE: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of bo (v3)

2018-07-30 Thread Zhou, David(ChunMing)
Thanks to Jerry for still remembering this series.

Hi Christian,

For upstreaming this feature, it seems we already had agreement a long time ago. Two reasons for upstreaming:
1. this bug was found by an OpenGL game, so the bug is also present in the Mesa driver in theory.
2. after upstreaming these patches, we can reduce the pro-specific patches and get closer to open source.

Btw, a unit test is excepted when upstreaming, as I remember Alex mentioned.
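A rough sketch of what such a test could look like, assuming the new ioctl ends up exposed as DRM_AMDGPU_GEM_FIND_BO with the union drm_amdgpu_gem_find_bo layout from this patch (the command name and the device node are assumptions, error handling omitted):

	#include <fcntl.h>
	#include <stdio.h>
	#include <stdint.h>
	#include <sys/mman.h>
	#include <xf86drm.h>
	#include <amdgpu_drm.h>

	int main(void)
	{
		int fd = open("/dev/dri/renderD128", O_RDWR);
		union drm_amdgpu_gem_create create = {};
		union drm_amdgpu_gem_mmap map = {};
		union drm_amdgpu_gem_find_bo find = {};
		void *cpu;

		/* create a small GTT BO and CPU map it */
		create.in.bo_size = 4096;
		create.in.domains = AMDGPU_GEM_DOMAIN_GTT;
		create.in.domain_flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
		drmCommandWriteRead(fd, DRM_AMDGPU_GEM_CREATE, &create, sizeof(create));

		map.in.handle = create.out.handle;
		drmCommandWriteRead(fd, DRM_AMDGPU_GEM_MMAP, &map, sizeof(map));
		cpu = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
			   fd, map.out.addr_ptr);

		/* ask the kernel which BO backs this CPU address */
		find.in.addr = (uintptr_t)cpu;
		find.in.size = 4096;
		drmCommandWriteRead(fd, DRM_AMDGPU_GEM_FIND_BO, &find, sizeof(find));

		printf("handle %u (expected %u), offset %llu\n",
		       find.out.handle, create.out.handle,
		       (unsigned long long)find.out.offset);
		return 0;
	}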

Thanks,
David Zhou
-Original Message-
From: Christian König  
Sent: Monday, July 30, 2018 6:48 PM
To: Zhang, Jerry ; amd-gfx@lists.freedesktop.org
Cc: Zhou, David(ChunMing) ; Koenig, Christian 

Subject: Re: [PATCH 1/2] drm/amdgpu: return bo itself if userptr is cpu addr of 
bo (v3)

Am 30.07.2018 um 12:02 schrieb Junwei Zhang:
> From: Chunming Zhou 
>
> v2: get original gem handle from gobj
> v3: update find bo data structure as union(in, out)
>  simply some code logic

Do we now have an open source user for this, so that we can upstream it? 
One more point below.

>
> Signed-off-by: Chunming Zhou 
> Signed-off-by: Junwei Zhang  (v3)
> Reviewed-by: Christian König 
> Reviewed-by: Jammy Zhou 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 63 
> +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |  3 +-
>   include/uapi/drm/amdgpu_drm.h   | 21 +++
>   4 files changed, 88 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 4cd20e7..46c370b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1213,6 +1213,8 @@ int amdgpu_gem_info_ioctl(struct drm_device *dev, void 
> *data,
> struct drm_file *filp);
>   int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
>   struct drm_file *filp);
> +int amdgpu_gem_find_bo_by_cpu_mapping_ioctl(struct drm_device *dev, void 
> *data,
> + struct drm_file *filp);
>   int amdgpu_gem_mmap_ioctl(struct drm_device *dev, void *data,
> struct drm_file *filp);
>   int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data, 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 71792d8..bae8417 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -288,6 +288,69 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
> *data,
>   return 0;
>   }
>   
> +static int amdgpu_gem_get_handle_from_object(struct drm_file *filp,
> +  struct drm_gem_object *obj) {
> + int i;
> + struct drm_gem_object *tmp;
> +
> + spin_lock(&filp->table_lock);
> + idr_for_each_entry(&filp->object_idr, tmp, i) {
> + if (obj == tmp) {
> + drm_gem_object_reference(obj);
> + spin_unlock(&filp->table_lock);
> + return i;
> + }
> + }

Please double check if that is still up to date.

I think we could as well try to use the DMA-buf handle tree for that.

Christian.

> + spin_unlock(&filp->table_lock);
> +
> + return 0;
> +}
> +
> +int amdgpu_gem_find_bo_by_cpu_mapping_ioctl(struct drm_device *dev, void 
> *data,
> + struct drm_file *filp)
> +{
> + union drm_amdgpu_gem_find_bo *args = data;
> + struct drm_gem_object *gobj;
> + struct amdgpu_bo *bo;
> + struct ttm_buffer_object *tbo;
> + struct vm_area_struct *vma;
> + uint32_t handle;
> + int r;
> +
> + if (offset_in_page(args->in.addr | args->in.size))
> + return -EINVAL;
> +
> + down_read(&current->mm->mmap_sem);
> + vma = find_vma(current->mm, args->in.addr);
> + if (!vma || vma->vm_file != filp->filp ||
> + (args->in.size > (vma->vm_end - args->in.addr))) {
> + args->out.handle = 0;
> + up_read(&current->mm->mmap_sem);
> + return -EINVAL;
> + }
> + args->out.offset = args->in.addr - vma->vm_start;
> +
> + tbo = vma->vm_private_data;
> + bo = container_of(tbo, struct amdgpu_bo, tbo);
> + amdgpu_bo_ref(bo);
> + gobj = &bo->gem_base;
> +
> + handle = amdgpu_gem_get_handle_from_object(filp, gobj);
> + if (!handle) {
> + r = drm_gem_handle_create(filp, gobj, );
> + if (r) {
> + DRM_ERROR("create gem handle failed\n");
> + up_read(&current->mm->mmap_sem);
> +  

RE: [PATCH 9/9] drm/amdgpu: create an empty bo_list if no handle is provided

2018-07-30 Thread Zhou, David(ChunMing)
Series is Reviewed-by: Chunming  Zhou 

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: Monday, July 30, 2018 10:52 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 9/9] drm/amdgpu: create an empty bo_list if no handle is 
provided

Instead of having extra handling just create an empty bo_list when no handle is 
provided.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 111 ++---
 1 file changed, 46 insertions(+), 65 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 1d7292ab2b62..502b94fb116a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -561,6 +561,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
union drm_amdgpu_cs *cs)
 {
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
+   struct amdgpu_vm *vm = &fpriv->vm;
struct amdgpu_bo_list_entry *e;
struct list_head duplicates;
struct amdgpu_bo *gds;
@@ -580,13 +581,17 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
*p,
   &p->bo_list);
if (r)
return r;
+   } else if (!p->bo_list) {
+   /* Create a empty bo_list when no handle is provided */
+   r = amdgpu_bo_list_create(p->adev, p->filp, NULL, 0,
+ &p->bo_list);
+   if (r)
+   return r;
}
 
-   if (p->bo_list) {
-   amdgpu_bo_list_get_list(p->bo_list, &p->validated);
-   if (p->bo_list->first_userptr != p->bo_list->num_entries)
-   p->mn = amdgpu_mn_get(p->adev, AMDGPU_MN_TYPE_GFX);
-   }
+   amdgpu_bo_list_get_list(p->bo_list, &p->validated);
+   if (p->bo_list->first_userptr != p->bo_list->num_entries)
+   p->mn = amdgpu_mn_get(p->adev, AMDGPU_MN_TYPE_GFX);
 
INIT_LIST_HEAD(&duplicates);
amdgpu_vm_get_pd_bo(&fpriv->vm, &p->validated, &p->vm_pd); @@ -605,10 
+610,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
goto error_free_pages;
}
 
-   /* Without a BO list we don't have userptr BOs */
-   if (!p->bo_list)
-   break;
-
INIT_LIST_HEAD(&need_pages);
amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) {
struct amdgpu_bo *bo = e->robj;
@@ -703,21 +704,12 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
*p,
amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved,
 p->bytes_moved_vis);
 
-   if (p->bo_list) {
-   struct amdgpu_vm *vm = &fpriv->vm;
-   struct amdgpu_bo_list_entry *e;
+   gds = p->bo_list->gds_obj;
+   gws = p->bo_list->gws_obj;
+   oa = p->bo_list->oa_obj;
 
-   gds = p->bo_list->gds_obj;
-   gws = p->bo_list->gws_obj;
-   oa = p->bo_list->oa_obj;
-
-   amdgpu_bo_list_for_each_entry(e, p->bo_list)
-   e->bo_va = amdgpu_vm_bo_find(vm, e->robj);
-   } else {
-   gds = p->adev->gds.gds_gfx_bo;
-   gws = p->adev->gds.gws_gfx_bo;
-   oa = p->adev->gds.oa_gfx_bo;
-   }
+   amdgpu_bo_list_for_each_entry(e, p->bo_list)
+   e->bo_va = amdgpu_vm_bo_find(vm, e->robj);
 
if (gds) {
p->job->gds_base = amdgpu_bo_gpu_offset(gds); @@ -745,15 
+737,13 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 
 error_free_pages:
 
-   if (p->bo_list) {
-   amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) {
-   if (!e->user_pages)
-   continue;
+   amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) {
+   if (!e->user_pages)
+   continue;
 
-   release_pages(e->user_pages,
- e->robj->tbo.ttm->num_pages);
-   kvfree(e->user_pages);
-   }
+   release_pages(e->user_pages,
+ e->robj->tbo.ttm->num_pages);
+   kvfree(e->user_pages);
}
 
return r;
@@ -815,9 +805,10 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser 
*parser, int error,
 
 static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)  {
-   struct amdgpu_device *adev = p->adev;
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
+   struct amdgpu_device *adev = p->adev;
struct amdgpu_vm *vm = &fpriv->vm;
+   struct amdgpu_bo_list_entry *e;
struct amdgpu_bo_va *bo_va;
struct amdgpu_bo *bo;
int r;
@@ -850,31 +841,26 @@ static int amdgpu_bo_vm_update_pte(struct 
amdgpu_cs_parser *p)
return r;
}
 
-  

RE: [PATCH] drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function v2

2018-07-30 Thread Zhou, David(ChunMing)
Go ahead with my RB.

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: Monday, July 30, 2018 5:19 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function v2

This allows us to trace all VM ranges which should be valid inside a CS.

v2: dump mappings without BO as well
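For testing, enabling the new event from a small helper could look roughly like this (the 
tracefs path just follows the usual events/<system>/<event> layout for the amdgpu trace 
system; illustration only, not part of the patch):

#include <fcntl.h>
#include <unistd.h>

/* Enable the amdgpu_vm_bo_cs tracepoint added by this patch. */
static int enable_vm_bo_cs_event(void)
{
	const char *path =
		"/sys/kernel/debug/tracing/events/amdgpu/amdgpu_vm_bo_cs/enable";
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	if (write(fd, "1", 1) != 1) {
		close(fd);
		return -1;
	}
	return close(fd);
}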

Signed-off-by: Christian König 
Reviewed-by: Chunming  Zhou  (v1)
Reviewed-and-tested-by: Andrey Grodzovsky  (v1)
Reviewed-by: Huang Rui  (v1)
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h |  5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 29 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  1 +
 4 files changed, 37 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 8a49c3b97bd4..871401cd9997 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1220,6 +1220,7 @@ static void amdgpu_cs_post_dependencies(struct 
amdgpu_cs_parser *p)  static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
union drm_amdgpu_cs *cs)
 {
+   struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
struct amdgpu_ring *ring = p->ring;
struct drm_sched_entity *entity = &p->ctx->rings[ring->idx].entity;
enum drm_sched_priority priority;
@@ -1272,6 +1273,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
amdgpu_job_free_resources(job);
 
trace_amdgpu_cs_ioctl(job);
+   amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket);
priority = job->base.s_priority;
drm_sched_entity_push_job(>base, entity);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 11f262f15200..7206a0025b17 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -314,6 +314,11 @@ DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_mapping,
TP_ARGS(mapping)
 );
 
+DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_cs,
+   TP_PROTO(struct amdgpu_bo_va_mapping *mapping),
+   TP_ARGS(mapping)
+);
+
 TRACE_EVENT(amdgpu_vm_set_ptes,
TP_PROTO(uint64_t pe, uint64_t addr, unsigned count,
 uint32_t incr, uint64_t flags),
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 5d7d7900ccab..015613b4f98b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2343,6 +2343,35 @@ struct amdgpu_bo_va_mapping 
*amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm,
return amdgpu_vm_it_iter_first(&vm->va, addr, addr);  }
 
+/**
+ * amdgpu_vm_bo_trace_cs - trace all reserved mappings
+ *
+ * @vm: the requested vm
+ * @ticket: CS ticket
+ *
+ * Trace all mappings of BOs reserved during a command submission.
+ */
+void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx 
+*ticket) {
+   struct amdgpu_bo_va_mapping *mapping;
+
+   if (!trace_amdgpu_vm_bo_cs_enabled())
+   return;
+
for (mapping = amdgpu_vm_it_iter_first(&vm->va, 0, U64_MAX); mapping;
+mapping = amdgpu_vm_it_iter_next(mapping, 0, U64_MAX)) {
+   if (mapping->bo_va && mapping->bo_va->base.bo) {
+   struct amdgpu_bo *bo;
+
+   bo = mapping->bo_va->base.bo;
+   if (READ_ONCE(bo->tbo.resv->lock.ctx) != ticket)
+   continue;
+   }
+
+   trace_amdgpu_vm_bo_cs(mapping);
+   }
+}
+
 /**
  * amdgpu_vm_bo_rmv - remove a bo to a specific vm
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index d416f895233d..67a15d439ac0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -318,6 +318,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
uint64_t saddr, uint64_t size);
 struct amdgpu_bo_va_mapping *amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm,
 uint64_t addr);
+void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx 
+*ticket);
 void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
  struct amdgpu_bo_va *bo_va);
 void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function

2018-07-29 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming  Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Friday, July 27, 2018 10:58 PM
To: amd-gfx@lists.freedesktop.org; Grodzovsky, Andrey 

Subject: [PATCH] drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function

This allows us to trace all VM ranges which should be valid inside a CS.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h |  5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 30 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  1 +
 4 files changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 8a49c3b97bd4..871401cd9997 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1220,6 +1220,7 @@ static void amdgpu_cs_post_dependencies(struct 
amdgpu_cs_parser *p)  static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
union drm_amdgpu_cs *cs)
 {
+   struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
struct amdgpu_ring *ring = p->ring;
struct drm_sched_entity *entity = &p->ctx->rings[ring->idx].entity;
enum drm_sched_priority priority;
@@ -1272,6 +1273,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
amdgpu_job_free_resources(job);
 
trace_amdgpu_cs_ioctl(job);
+   amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket);
priority = job->base.s_priority;
drm_sched_entity_push_job(>base, entity);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 11f262f15200..7206a0025b17 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -314,6 +314,11 @@ DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_mapping,
TP_ARGS(mapping)
 );
 
+DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_cs,
+   TP_PROTO(struct amdgpu_bo_va_mapping *mapping),
+   TP_ARGS(mapping)
+);
+
 TRACE_EVENT(amdgpu_vm_set_ptes,
TP_PROTO(uint64_t pe, uint64_t addr, unsigned count,
 uint32_t incr, uint64_t flags),
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 5d7d7900ccab..7aedf3184e36 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2343,6 +2343,36 @@ struct amdgpu_bo_va_mapping 
*amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm,
return amdgpu_vm_it_iter_first(&vm->va, addr, addr);  }
 
+/**
+ * amdgpu_vm_bo_trace_cs - trace all reserved mappings
+ *
+ * @vm: the requested vm
+ * @ticket: CS ticket
+ *
+ * Trace all mappings of BOs reserved during a command submission.
+ */
+void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx 
+*ticket) {
+   struct amdgpu_bo_va_mapping *mapping;
+
+   if (!trace_amdgpu_vm_bo_cs_enabled())
+   return;
+
for (mapping = amdgpu_vm_it_iter_first(&vm->va, 0, U64_MAX); mapping;
+mapping = amdgpu_vm_it_iter_next(mapping, 0, U64_MAX)) {
+   struct amdgpu_bo *bo;
+
+   if (!mapping->bo_va || !mapping->bo_va->base.bo)
+   continue;
+
+   bo = mapping->bo_va->base.bo;
+   if (READ_ONCE(bo->tbo.resv->lock.ctx) != ticket)
+   continue;
+
+   trace_amdgpu_vm_bo_cs(mapping);
+   }
+}
+
 /**
  * amdgpu_vm_bo_rmv - remove a bo to a specific vm
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index d416f895233d..67a15d439ac0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -318,6 +318,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
uint64_t saddr, uint64_t size);
 struct amdgpu_bo_va_mapping *amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm,
 uint64_t addr);
+void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx 
+*ticket);
 void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
  struct amdgpu_bo_va *bo_va);
 void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: add proper error handling to amdgpu_bo_list_get

2018-07-29 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian K?nig
Sent: Friday, July 27, 2018 9:39 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: add proper error handling to amdgpu_bo_list_get

Otherwise we silently don't use a BO list when the handle is invalid.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 28 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 11 ---
 3 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 13aaa118aca4..4cd20e722d70 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -713,8 +713,8 @@ struct amdgpu_bo_list {
struct amdgpu_bo_list_entry *array;
 };
 
-struct amdgpu_bo_list *
-amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id);
+int amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id,
+  struct amdgpu_bo_list **result);
 void amdgpu_bo_list_get_list(struct amdgpu_bo_list *list,
 struct list_head *validated);
 void amdgpu_bo_list_put(struct amdgpu_bo_list *list); diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index 7679c068c89a..944868e47119 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -180,27 +180,20 @@ static int amdgpu_bo_list_set(struct amdgpu_device *adev,
return r;
 }
 
-struct amdgpu_bo_list *
-amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
+int amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id,
+  struct amdgpu_bo_list **result)
 {
-   struct amdgpu_bo_list *result;
-
rcu_read_lock();
-   result = idr_find(&fpriv->bo_list_handles, id);
+   *result = idr_find(&fpriv->bo_list_handles, id);
 
-   if (result) {
-   if (kref_get_unless_zero(&result->refcount)) {
-   rcu_read_unlock();
-   mutex_lock(&result->lock);
-   } else {
-   rcu_read_unlock();
-   result = NULL;
-   }
-   } else {
+   if (*result && kref_get_unless_zero(&(*result)->refcount)) {
rcu_read_unlock();
+   mutex_lock(&(*result)->lock);
+   return 0;
}
 
-   return result;
+   rcu_read_unlock();
+   return -ENOENT;
 }
 
 void amdgpu_bo_list_get_list(struct amdgpu_bo_list *list, @@ -335,9 +328,8 @@ 
int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
break;
 
case AMDGPU_BO_LIST_OP_UPDATE:
-   r = -ENOENT;
-   list = amdgpu_bo_list_get(fpriv, handle);
-   if (!list)
+   r = amdgpu_bo_list_get(fpriv, handle, &list);
+   if (r)
goto error_free;
 
r = amdgpu_bo_list_set(adev, filp, list, info, diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 533b2e7656c0..8a49c3b97bd4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -572,11 +572,16 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
*p,
INIT_LIST_HEAD(&p->validated);
 
/* p->bo_list could already be assigned if AMDGPU_CHUNK_ID_BO_HANDLES 
is present */
-   if (!p->bo_list)
-   p->bo_list = amdgpu_bo_list_get(fpriv, cs->in.bo_list_handle);
-   else
+   if (p->bo_list) {
mutex_lock(&p->bo_list->lock);
 
+   } else if (cs->in.bo_list_handle) {
+   r = amdgpu_bo_list_get(fpriv, cs->in.bo_list_handle,
+  &p->bo_list);
+   if (r)
+   return r;
+   }
+
if (p->bo_list) {
amdgpu_bo_list_get_list(p->bo_list, &p->validated);
if (p->bo_list->first_userptr != p->bo_list->num_entries)
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/2] drm/amdgpu: remove superflous UVD encode entity

2018-07-18 Thread Zhou, David(ChunMing)
Acked-by: Chunming  Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Thursday, July 19, 2018 2:45 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 1/2] drm/amdgpu: remove superflous UVD encode entity

Not sure what that was ever used for, but now it is completely unused.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h |  1 -
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   | 12 
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   | 14 --
 3 files changed, 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
index 8b23a1b00c76..cae3f526216b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
@@ -48,7 +48,6 @@ struct amdgpu_uvd_inst {
struct amdgpu_ring  ring_enc[AMDGPU_MAX_UVD_ENC_RINGS];
struct amdgpu_irq_src   irq;
struct drm_sched_entity entity;
-   struct drm_sched_entity entity_enc;
uint32_tsrbm_soft_reset;
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index b796dc8375cd..598dbeaba636 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -418,16 +418,6 @@ static int uvd_v6_0_sw_init(void *handle)
adev->uvd.num_enc_rings = 0;
 
DRM_INFO("UVD ENC is disabled\n");
-   } else {
-   struct drm_sched_rq *rq;
-   ring = &adev->uvd.inst->ring_enc[0];
-   rq = &ring->sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL];
-   r = drm_sched_entity_init(&adev->uvd.inst->entity_enc,
- &rq, 1, NULL);
-   if (r) {
-   DRM_ERROR("Failed setting up UVD ENC run queue.\n");
-   return r;
-   }
}
 
r = amdgpu_uvd_resume(adev);
@@ -463,8 +453,6 @@ static int uvd_v6_0_sw_fini(void *handle)
return r;
 
if (uvd_v6_0_enc_support(adev)) {
-   drm_sched_entity_destroy(&adev->uvd.inst->ring_enc[0].sched, 
&adev->uvd.inst->entity_enc);

for (i = 0; i < adev->uvd.num_enc_rings; ++i)
amdgpu_ring_fini(&adev->uvd.inst->ring_enc[i]);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index 89fe910e5c9a..2192f4536c24 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -389,7 +389,6 @@ static int uvd_v7_0_early_init(void *handle)  static int 
uvd_v7_0_sw_init(void *handle)  {
struct amdgpu_ring *ring;
-   struct drm_sched_rq *rq;
int i, j, r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
@@ -421,17 +420,6 @@ static int uvd_v7_0_sw_init(void *handle)
DRM_INFO("PSP loading UVD firmware\n");
}
 
-   for (j = 0; j < adev->uvd.num_uvd_inst; j++) {
-   ring = &adev->uvd.inst[j].ring_enc[0];
-   rq = &ring->sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL];
-   r = drm_sched_entity_init(&adev->uvd.inst[j].entity_enc,
- &rq, 1, NULL);
-   if (r) {
-   DRM_ERROR("(%d)Failed setting up UVD ENC run queue.\n", 
j);
-   return r;
-   }
-   }
-
r = amdgpu_uvd_resume(adev);
if (r)
return r;
@@ -484,8 +472,6 @@ static int uvd_v7_0_sw_fini(void *handle)
return r;
 
for (j = 0; j < adev->uvd.num_uvd_inst; ++j) {
-   drm_sched_entity_destroy(&adev->uvd.inst[j].ring_enc[0].sched, 
&adev->uvd.inst[j].entity_enc);
-
for (i = 0; i < adev->uvd.num_enc_rings; ++i)
amdgpu_ring_fini(&adev->uvd.inst[j].ring_enc[i]);
}
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu/powerplay: use irq source defines for smu7 sources

2018-07-18 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Alex 
Deucher
Sent: Thursday, July 19, 2018 5:09 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: [PATCH] drm/amdgpu/powerplay: use irq source defines for smu7 sources

Use the newly added irq source defines rather than magic numbers for smu7 
thermal interrupts.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c
index 8eea49e4c74d..2aab1b475945 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c
@@ -27,6 +27,7 @@
 #include "atom.h"
 #include "ivsrcid/thm/irqsrcs_thm_9_0.h"
 #include "ivsrcid/smuio/irqsrcs_smuio_9_0.h"
+#include "ivsrcid/ivsrcid_vislands30.h"
 
 uint8_t convert_to_vid(uint16_t vddc)
 {
@@ -545,17 +546,17 @@ int phm_irq_process(struct amdgpu_device *adev,
uint32_t src_id = entry->src_id;
 
if (client_id == AMDGPU_IH_CLIENTID_LEGACY) {
-   if (src_id == 230)
+   if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_LOW_TO_HIGH)
pr_warn("GPU over temperature range detected on PCIe 
%d:%d.%d!\n",
PCI_BUS_NUM(adev->pdev->devfn),
PCI_SLOT(adev->pdev->devfn),
PCI_FUNC(adev->pdev->devfn));
-   else if (src_id == 231)
+   else if (src_id == 
VISLANDS30_IV_SRCID_CG_TSS_THERMAL_HIGH_TO_LOW)
pr_warn("GPU under temperature range detected on PCIe 
%d:%d.%d!\n",
PCI_BUS_NUM(adev->pdev->devfn),
PCI_SLOT(adev->pdev->devfn),
PCI_FUNC(adev->pdev->devfn));
-   else if (src_id == 83)
+   else if (src_id == VISLANDS30_IV_SRCID_GPIO_19)
pr_warn("GPU Critical Temperature Fault detected on 
PCIe %d:%d.%d!\n",
PCI_BUS_NUM(adev->pdev->devfn),
PCI_SLOT(adev->pdev->devfn),
--
2.13.6

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 2/2] drm/amdgpu: clean up UVD instance handling v2

2018-07-18 Thread Zhou, David(ChunMing)
Acked-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Thursday, July 19, 2018 2:45 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 2/2] drm/amdgpu: clean up UVD instance handling v2

The whole handle, filp and entity handling is superfluous here.

We should have reviewed that more thoughtfully. It looks like somebody just 
made the code instance aware without knowing the background.

v2: fix one more missed case in amdgpu_uvd_suspend

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 121  
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h |  10 +--
 2 files changed, 64 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index d708970244eb..80b5c453f8c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -263,21 +263,20 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
dev_err(adev->dev, "(%d) failed to allocate UVD bo\n", 
r);
return r;
}
+   }
 
-   ring = >uvd.inst[j].ring;
-   rq = >sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL];
-   r = drm_sched_entity_init(>uvd.inst[j].entity, ,
- 1, NULL);
-   if (r != 0) {
-   DRM_ERROR("Failed setting up UVD(%d) run queue.\n", j);
-   return r;
-   }
-
-   for (i = 0; i < adev->uvd.max_handles; ++i) {
-   atomic_set(>uvd.inst[j].handles[i], 0);
-   adev->uvd.inst[j].filp[i] = NULL;
-   }
+   ring = >uvd.inst[0].ring;
+   rq = >sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL];
+   r = drm_sched_entity_init(>uvd.entity, , 1, NULL);
+   if (r) {
+   DRM_ERROR("Failed setting up UVD kernel entity.\n");
+   return r;
}
+   for (i = 0; i < adev->uvd.max_handles; ++i) {
+   atomic_set(>uvd.handles[i], 0);
+   adev->uvd.filp[i] = NULL;
+   }
+
/* from uvd v5.0 HW addressing capacity increased to 64 bits */
if (!amdgpu_device_ip_block_version_cmp(adev, AMD_IP_BLOCK_TYPE_UVD, 5, 
0))
adev->uvd.address_64_bit = true;
@@ -306,11 +305,12 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)  {
int i, j;
 
+   drm_sched_entity_destroy(>uvd.inst->ring.sched,
+>uvd.entity);
+
for (j = 0; j < adev->uvd.num_uvd_inst; ++j) {
kfree(adev->uvd.inst[j].saved_bo);
 
-   drm_sched_entity_destroy(>uvd.inst[j].ring.sched, 
>uvd.inst[j].entity);
-
amdgpu_bo_free_kernel(>uvd.inst[j].vcpu_bo,
  >uvd.inst[j].gpu_addr,
  (void **)>uvd.inst[j].cpu_addr); @@ 
-333,20 +333,20 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
 
cancel_delayed_work_sync(>uvd.idle_work);
 
+   /* only valid for physical mode */
+   if (adev->asic_type < CHIP_POLARIS10) {
+   for (i = 0; i < adev->uvd.max_handles; ++i)
+   if (atomic_read(>uvd.handles[i]))
+   break;
+
+   if (i == adev->uvd.max_handles)
+   return 0;
+   }
+
for (j = 0; j < adev->uvd.num_uvd_inst; ++j) {
if (adev->uvd.inst[j].vcpu_bo == NULL)
continue;
 
-   /* only valid for physical mode */
-   if (adev->asic_type < CHIP_POLARIS10) {
-   for (i = 0; i < adev->uvd.max_handles; ++i)
-   if (atomic_read(>uvd.inst[j].handles[i]))
-   break;
-
-   if (i == adev->uvd.max_handles)
-   continue;
-   }
-
size = amdgpu_bo_size(adev->uvd.inst[j].vcpu_bo);
ptr = adev->uvd.inst[j].cpu_addr;
 
@@ -398,30 +398,27 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 
 void amdgpu_uvd_free_handles(struct amdgpu_device *adev, struct drm_file 
*filp)  {
-   struct amdgpu_ring *ring;
-   int i, j, r;
-
-   for (j = 0; j < adev->uvd.num_uvd_inst; j++) {
-   ring = >uvd.inst[j].ring;
+   struct amdgpu_ring *ring = >uvd.inst[0].ring;
+   int i, r;
 
-   for (i = 0; i < adev->uvd.max_handles; ++i) {
-   uint32_t handle = 
atomic_read(>uvd.inst[j].handles[i]);
-   if (handle != 0 && adev->uvd.inst[j].filp[i] == filp) {
-   struct dma_fence *fence;
-
-   r = amdgpu_uvd_get_destroy_msg(ring, handle,
-  false, );
- 

RE: [PATCH] drm/amdgpu: fix job priority handling

2018-07-18 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Thursday, July 19, 2018 2:15 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: fix job priority handling

The job might already be released at this point.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 4 +++-  
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 911c4a12a163..7c5cc33d0cda 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1209,6 +1209,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,  {
struct amdgpu_ring *ring = p->ring;
struct drm_sched_entity *entity = &p->ctx->rings[ring->idx].entity;
+   enum drm_sched_priority priority;
struct amdgpu_job *job;
unsigned i;
uint64_t seq;
@@ -1258,10 +1259,11 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
amdgpu_job_free_resources(job);
 
trace_amdgpu_cs_ioctl(job);
+   priority = job->base.s_priority;
drm_sched_entity_push_job(&job->base, entity);
 
ring = to_amdgpu_ring(entity->sched);
-   amdgpu_ring_priority_get(ring, job->base.s_priority);
+   amdgpu_ring_priority_get(ring, priority);
 
ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
amdgpu_mn_unlock(p->mn);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 8b679c85d213..5a2c26a85984 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -126,6 +126,7 @@ void amdgpu_job_free(struct amdgpu_job *job)  int 
amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
  void *owner, struct dma_fence **f)  {
+   enum drm_sched_priority priority;
struct amdgpu_ring *ring;
int r;
 
@@ -139,10 +140,11 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
job->owner = owner;
*f = dma_fence_get(&job->base.s_fence->finished);
amdgpu_job_free_resources(job);
+   priority = job->base.s_priority;
drm_sched_entity_push_job(&job->base, entity);
 
ring = to_amdgpu_ring(entity->sched);
-   amdgpu_ring_priority_get(ring, job->base.s_priority);
+   amdgpu_ring_priority_get(ring, priority);
 
return 0;
 }
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: always initialize job->base.sched

2018-07-17 Thread Zhou, David(ChunMing)
Acked-by: Chunming Zhou , but I don't think it is a nice approach, even though there is 
a comment in the code.


-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Tuesday, July 17, 2018 3:05 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: always initialize job->base.sched

Otherwise we can't clean up the job if we run into an error before it is pushed 
to the scheduler.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 024efb7ea6d6..42a4764d728e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -54,6 +54,11 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned 
num_ibs,
if (!*job)
return -ENOMEM;
 
+   /*
+* Initialize the scheduler to at least some ring so that we always
+* have a pointer to adev.
+*/
(*job)->base.sched = &adev->rings[0]->sched;
(*job)->vm = vm;
(*job)->ibs = (void *)&(*job)[1];
(*job)->num_ibs = num_ibs;
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 2/2] drm/amdgpu: change ring priority after pushing the job

2018-07-16 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou  for series.

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Monday, July 16, 2018 9:25 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 2/2] drm/amdgpu: change ring priority after pushing the job

Pushing a job can change the ring assignment of an entity.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 4 +++-  
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 --
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 72dc9b36b937..911c4a12a163 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1256,11 +1256,13 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
job->uf_sequence = seq;
 
amdgpu_job_free_resources(job);
-   amdgpu_ring_priority_get(p->ring, job->base.s_priority);
 
trace_amdgpu_cs_ioctl(job);
drm_sched_entity_push_job(&job->base, entity);
 
+   ring = to_amdgpu_ring(entity->sched);
+   amdgpu_ring_priority_get(ring, job->base.s_priority);
+
ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
amdgpu_mn_unlock(p->mn);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 024efb7ea6d6..10c769db5d67 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -121,7 +121,7 @@ void amdgpu_job_free(struct amdgpu_job *job)  int 
amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
  void *owner, struct dma_fence **f)  {
-   struct amdgpu_ring *ring = to_amdgpu_ring(entity->sched);
+   struct amdgpu_ring *ring;
int r;
 
if (!f)
@@ -134,9 +134,11 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
job->owner = owner;
*f = dma_fence_get(&job->base.s_fence->finished);
amdgpu_job_free_resources(job);
-   amdgpu_ring_priority_get(ring, job->base.s_priority);
drm_sched_entity_push_job(&job->base, entity);
 
+   ring = to_amdgpu_ring(entity->sched);
+   amdgpu_ring_priority_get(ring, job->base.s_priority);
+
return 0;
 }
 
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 7/7] drm/amdgpu: minor cleanup in amdgpu_job.c

2018-07-15 Thread Zhou, David(ChunMing)
Series is Acked-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Friday, July 13, 2018 11:20 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 7/7] drm/amdgpu: minor cleanup in amdgpu_job.c

Remove superflous NULL check, fix coding style a bit, shorten error messages.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index bd708b726003..024efb7ea6d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -33,7 +33,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
struct amdgpu_job *job = to_amdgpu_job(s_job);
 
-   DRM_ERROR("ring %s timeout, last signaled seq=%u, last emitted 
seq=%u\n",
+   DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
job->base.sched->name, atomic_read(&ring->fence_drv.last_seq),
  ring->fence_drv.sync_seq);
 
@@ -161,16 +161,17 @@ static struct dma_fence *amdgpu_job_dependency(struct 
drm_sched_job *sched_job,
struct amdgpu_ring *ring = to_amdgpu_ring(s_entity->sched);
struct amdgpu_job *job = to_amdgpu_job(sched_job);
struct amdgpu_vm *vm = job->vm;
+   struct dma_fence *fence;
bool explicit = false;
int r;
-   struct dma_fence *fence = amdgpu_sync_get_fence(&job->sync, &explicit);
 
+   fence = amdgpu_sync_get_fence(&job->sync, &explicit);
if (fence && explicit) {
if (drm_sched_dependency_optimized(fence, s_entity)) {
r = amdgpu_sync_fence(ring->adev, &job->sched_sync,
  fence, false);
if (r)
-   DRM_ERROR("Error adding fence to sync (%d)\n", 
r);
+   DRM_ERROR("Error adding fence (%d)\n", r);
}
}
 
@@ -194,10 +195,6 @@ static struct dma_fence *amdgpu_job_run(struct 
drm_sched_job *sched_job)
struct amdgpu_job *job;
int r;
 
-   if (!sched_job) {
-   DRM_ERROR("job is null\n");
-   return NULL;
-   }
job = to_amdgpu_job(sched_job);
finished = &job->base.s_fence->finished;
 
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH v2] drm/amdgpu: Allow to create BO lists in CS ioctl v2

2018-07-11 Thread Zhou, David(ChunMing)
Hi Andrey,

Could you add a compatibility flag or bump the KMS driver version, so that user 
space can keep the old path and only use the new one when the kernel supports it?
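For example, the user-space side of such a check could be as simple as the sketch below 
(the exact driver version that would advertise the new CS path is an assumption here, not 
a number from this thread):

#include <stdbool.h>
#include <xf86drm.h>

/* Only take the new one-ioctl CS path when the kernel is new enough. */
static bool cs_bo_handles_supported(int fd)
{
	drmVersionPtr ver = drmGetVersion(fd);
	bool ok;

	if (!ver)
		return false;
	/* 3.27 is a placeholder for whatever version bump the patch would use. */
	ok = ver->version_major > 3 ||
	     (ver->version_major == 3 && ver->version_minor >= 27);
	drmFreeVersion(ver);
	return ok;
}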

Regards,
David Zhou

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
zhoucm1
Sent: Thursday, July 12, 2018 10:31 AM
To: Grodzovsky, Andrey ; 
amd-gfx@lists.freedesktop.org
Cc: Olsak, Marek ; Koenig, Christian 

Subject: Re: [PATCH v2] drm/amdgpu: Allow to create BO lists in CS ioctl v2



On 2018-07-12 04:57, Andrey Grodzovsky wrote:
> This change is to support a Mesa performance optimization.
> Modify the CS IOCTL to allow its input as a command buffer and an array of 
> buffer handles to create a temporary bo list and then destroy it when the 
> IOCTL completes.
> This saves calling the BO_LIST create and destroy IOCTLs in Mesa and 
> thereby improves performance.
>
> v2: Avoid inserting the temp list into idr struct.
>
> Signed-off-by: Andrey Grodzovsky 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h | 11 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 86 
> ++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 51 +++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  3 +-
>   include/uapi/drm/amdgpu_drm.h   |  1 +
>   5 files changed, 114 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 8eaba0f..9b472b2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -732,6 +732,17 @@ void amdgpu_bo_list_get_list(struct amdgpu_bo_list *list,
>struct list_head *validated);
>   void amdgpu_bo_list_put(struct amdgpu_bo_list *list);
>   void amdgpu_bo_list_free(struct amdgpu_bo_list *list);
> +int amdgpu_bo_create_list_entry_array(struct drm_amdgpu_bo_list_in *in,
> +   struct drm_amdgpu_bo_list_entry 
> **info_param);
> +
> +int amdgpu_bo_list_create(struct amdgpu_device *adev,
> +  struct drm_file *filp,
> +  struct drm_amdgpu_bo_list_entry *info,
> +  unsigned num_entries,
> +  int *id,
> +  struct amdgpu_bo_list **list);
> +
> +void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id);
>   
>   /*
>* GFX stuff
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> index 92be7f6..14c7c59 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> @@ -55,11 +55,12 @@ static void amdgpu_bo_list_release_rcu(struct kref *ref)
>   kfree_rcu(list, rhead);
>   }
>   
> -static int amdgpu_bo_list_create(struct amdgpu_device *adev,
> +int amdgpu_bo_list_create(struct amdgpu_device *adev,
>struct drm_file *filp,
>struct drm_amdgpu_bo_list_entry *info,
>unsigned num_entries,
> -  int *id)
> +  int *id,
> +  struct amdgpu_bo_list **list_out)
>   {
>   int r;
>   struct amdgpu_fpriv *fpriv = filp->driver_priv; @@ -78,20 +79,25 @@ 
> static int amdgpu_bo_list_create(struct amdgpu_device *adev,
>   return r;
>   }
>   
> + if (id) {
>   /* idr alloc should be called only after initialization of bo list. */
> - mutex_lock(&fpriv->bo_list_lock);
> - r = idr_alloc(&fpriv->bo_list_handles, list, 1, 0, GFP_KERNEL);
> - mutex_unlock(&fpriv->bo_list_lock);
> - if (r < 0) {
> - amdgpu_bo_list_free(list);
> - return r;
> + mutex_lock(&fpriv->bo_list_lock);
> + r = idr_alloc(&fpriv->bo_list_handles, list, 1, 0, GFP_KERNEL);
> + mutex_unlock(&fpriv->bo_list_lock);
> + if (r < 0) {
> + amdgpu_bo_list_free(list);
> + return r;
> + }
> + *id = r;
>   }
> - *id = r;
> +
> + if (list_out)
> + *list_out = list;
>   
>   return 0;
>   }
>   
> -static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int 
> id)
> +void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
>   {
>   struct amdgpu_bo_list *list;
>   
> @@ -263,53 +269,68 @@ void amdgpu_bo_list_free(struct amdgpu_bo_list *list)
>   kfree(list);
>   }
>   
> -int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
> - struct drm_file *filp)
> +int amdgpu_bo_create_list_entry_array(struct drm_amdgpu_bo_list_in *in,
> +   struct drm_amdgpu_bo_list_entry 
> **info_param)
>   {
> - const uint32_t info_size = sizeof(struct drm_amdgpu_bo_list_entry);
> -
> - struct amdgpu_device *adev = dev->dev_private;
> - struct amdgpu_fpriv *fpriv = filp->driver_priv;
> - 

RE: [PATCH 1/2] drm/amdgpu: switch firmware path for CIK parts

2018-07-02 Thread Zhou, David(ChunMing)
Yes, agree, radeon driver uses radeon path, amdgpu uses amdgpu path, which 
makes sense to me.

The series is Reviewed-by: Chunming Zhou 

Regards,
David Zhou

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Alex 
Deucher
Sent: Tuesday, July 03, 2018 4:32 AM
To: Dave Airlie 
Cc: Deucher, Alexander ; amd-gfx mailing list 

Subject: Re: [PATCH 1/2] drm/amdgpu: switch firmware path for CIK parts

On Mon, Jul 2, 2018 at 4:12 PM, Dave Airlie  wrote:
> On 3 July 2018 at 05:36, Alex Deucher  wrote:
>> Use separate firmware path for amdgpu to avoid conflicts with radeon 
>> on CIK parts.
>>
>
> Won't that cause a chicken and egg problem, new kernel with old 
> firmware package will suddenly start failing, or do we not really care 
> since in theory we don't suppose amdgpu on those parts yet?
>
> Seems like we'd want to fallback to the old paths if possible.

I guess we could fall back, but in most cases the firmware loader will have to 
timeout first and then most users will assume it's broken anyway.  radeon is 
still the default with most distros, so I don't think it's super critical.

Alex

>
> Dave.
>
>> Signed-off-by: Alex Deucher 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c |  8 ++--  
>> drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 10 ++---  
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 10 ++---
>>  drivers/gpu/drm/amd/amdgpu/ci_dpm.c | 10 ++---
>>  drivers/gpu/drm/amd/amdgpu/cik_sdma.c   | 24 +--
>>  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   | 72 
>> -
>>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   |  6 +--
>>  7 files changed, 70 insertions(+), 70 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
>> index e950730f1933..693ec5ea4950 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
>> @@ -314,17 +314,17 @@ static int amdgpu_cgs_get_firmware_info(struct 
>> cgs_device *cgs_device,
>> (adev->pdev->revision == 0x81) ||
>> (adev->pdev->device == 0x665f)) {
>> info->is_kicker = true;
>> -   strcpy(fw_name, 
>> "radeon/bonaire_k_smc.bin");
>> +   strcpy(fw_name, 
>> + "amdgpu/bonaire_k_smc.bin");
>> } else {
>> -   strcpy(fw_name, 
>> "radeon/bonaire_smc.bin");
>> +   strcpy(fw_name, 
>> + "amdgpu/bonaire_smc.bin");
>> }
>> break;
>> case CHIP_HAWAII:
>> if (adev->pdev->revision == 0x80) {
>> info->is_kicker = true;
>> -   strcpy(fw_name, 
>> "radeon/hawaii_k_smc.bin");
>> +   strcpy(fw_name, 
>> + "amdgpu/hawaii_k_smc.bin");
>> } else {
>> -   strcpy(fw_name, 
>> "radeon/hawaii_smc.bin");
>> +   strcpy(fw_name, 
>> + "amdgpu/hawaii_smc.bin");
>> }
>> break;
>> case CHIP_TOPAZ:
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> index 0b46ea1c6290..3e70eb61a960 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> @@ -53,11 +53,11 @@
>>
>>  /* Firmware Names */
>>  #ifdef CONFIG_DRM_AMDGPU_CIK
>> -#define FIRMWARE_BONAIRE   "radeon/bonaire_uvd.bin"
>> -#define FIRMWARE_KABINI"radeon/kabini_uvd.bin"
>> -#define FIRMWARE_KAVERI"radeon/kaveri_uvd.bin"
>> -#define FIRMWARE_HAWAII"radeon/hawaii_uvd.bin"
>> -#define FIRMWARE_MULLINS   "radeon/mullins_uvd.bin"
>> +#define FIRMWARE_BONAIRE   "amdgpu/bonaire_uvd.bin"
>> +#define FIRMWARE_KABINI"amdgpu/kabini_uvd.bin"
>> +#define FIRMWARE_KAVERI"amdgpu/kaveri_uvd.bin"
>> +#define FIRMWARE_HAWAII"amdgpu/hawaii_uvd.bin"
>> +#define FIRMWARE_MULLINS   "amdgpu/mullins_uvd.bin"
>>  #endif
>>  #define FIRMWARE_TONGA "amdgpu/tonga_uvd.bin"
>>  #define FIRMWARE_CARRIZO   "amdgpu/carrizo_uvd.bin"
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> index b0dcdfd85f5b..6ae1ad7e83b3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>> @@ -40,11 +40,11 @@
>>
>>  /* Firmware Names */
>>  #ifdef CONFIG_DRM_AMDGPU_CIK
>> -#define FIRMWARE_BONAIRE   "radeon/bonaire_vce.bin"
>> -#define FIRMWARE_KABINI"radeon/kabini_vce.bin"
>> -#define FIRMWARE_KAVERI

RE: [PATCH] drm/amdgpu: remove duplicated codes

2018-06-27 Thread Zhou, David(ChunMing)
Feel free to add my RB on that.

Thanks,
David Zhou

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Flora 
Cui
Sent: Wednesday, June 27, 2018 3:06 PM
To: Zhou, David(ChunMing) 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove duplicated codes


The fence_context and seqno are initialized in both amdgpu_vm_manager_init() and 
amdgpu_vmid_mgr_init(); remove the copy in amdgpu_vmid_mgr_init().

Change-Id: Ic0dbd693bac093e54eb95b5e547c89b64a5743b8
Signed-off-by: Flora Cui 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index a1c78f9..3a072a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -578,11 +578,6 @@ void amdgpu_vmid_mgr_init(struct amdgpu_device *adev)
list_add_tail(&id_mgr->ids[j].list, &id_mgr->ids_lru);
}
}
-
-   adev->vm_manager.fence_context =
-   dma_fence_context_alloc(AMDGPU_MAX_RINGS);
-   for (i = 0; i < AMDGPU_MAX_RINGS; ++i)
-   adev->vm_manager.seqno[i] = 0;
 }
 
 /**
--
2.7.4

On Wed, Jun 27, 2018 at 02:38:09PM +0800, Zhou, David(ChunMing) wrote:
> Please add a comment to the patch describing what is duplicated and where.
> 
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf 
> Of Flora Cui
> Sent: Wednesday, June 27, 2018 2:10 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Cui, Flora 
> Subject: [PATCH] drm/amdgpu: remove duplicated codes
> 
> Change-Id: Ic0dbd693bac093e54eb95b5e547c89b64a5743b8
> Signed-off-by: Flora Cui 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> index a1c78f9..3a072a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> @@ -578,11 +578,6 @@ void amdgpu_vmid_mgr_init(struct amdgpu_device *adev)
>   list_add_tail(&id_mgr->ids[j].list, &id_mgr->ids_lru);
>   }
>   }
> -
> - adev->vm_manager.fence_context =
> - dma_fence_context_alloc(AMDGPU_MAX_RINGS);
> - for (i = 0; i < AMDGPU_MAX_RINGS; ++i)
> - adev->vm_manager.seqno[i] = 0;
>  }
>  
>  /**
> --
> 2.7.4
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: remove duplicated codes

2018-06-27 Thread Zhou, David(ChunMing)
Please add a comment to the patch describing what is duplicated and where.

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Flora 
Cui
Sent: Wednesday, June 27, 2018 2:10 PM
To: amd-gfx@lists.freedesktop.org
Cc: Cui, Flora 
Subject: [PATCH] drm/amdgpu: remove duplicated codes

Change-Id: Ic0dbd693bac093e54eb95b5e547c89b64a5743b8
Signed-off-by: Flora Cui 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index a1c78f9..3a072a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -578,11 +578,6 @@ void amdgpu_vmid_mgr_init(struct amdgpu_device *adev)
list_add_tail(&id_mgr->ids[j].list, &id_mgr->ids_lru);
}
}
-
-   adev->vm_manager.fence_context =
-   dma_fence_context_alloc(AMDGPU_MAX_RINGS);
-   for (i = 0; i < AMDGPU_MAX_RINGS; ++i)
-   adev->vm_manager.seqno[i] = 0;
 }
 
 /**
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Add AMDGPU_GPU_PAGES_IN_CPU_PAGE define

2018-06-25 Thread Zhou, David(ChunMing)
The current amdgpu driver indeed always sets the GPU page size to 4096.
In fact, our GPU supports bigger page sizes like 64KB, we just don't use them. I 
remember the previous amdsoc (old Android kernel driver) used 64KB.

Correct me if I'm wrong.


Sent from Smartisan Pro

Michel Dänzer wrote on 2018-06-25 at 5:10 PM:

On 2018-06-25 03:56 AM, zhoucm1 wrote:
> one question to you:
>
> Did you consider the case that GPU_PAGE_SIZE > CPU_PAGE_SIZE?

That is never the case: AMDGPU_GPU_PAGE_SIZE is always 4096, and
PAGE_SIZE is always >= 4096 (an integer multiple of it).
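In other words, the define under discussion presumably boils down to something like this 
sketch (an assumption based on the thread, not quoted from the patch):

/* A CPU page always holds a whole number of 4 KiB GPU pages. */
#define AMDGPU_GPU_PAGE_SIZE		4096
#define AMDGPU_GPU_PAGES_IN_CPU_PAGE	(PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE)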


--
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH v2 1/2] drm/scheduler: Rename cleanup functions v2.

2018-06-21 Thread Zhou, David(ChunMing)
Acked-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Andrey Grodzovsky
Sent: Thursday, June 21, 2018 11:33 PM
To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org
Cc: e...@anholt.net; Koenig, Christian ; Grodzovsky, 
Andrey ; l.st...@pengutronix.de
Subject: [PATCH v2 1/2] drm/scheduler: Rename cleanup functions v2.

Everything in the flush code path (i.e. waiting for the SW queue to become empty) 
is named *_flush(), and everything in the release code path is named *_fini().
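In other words, the expected teardown sequence for an entity becomes roughly the sketch 
below (signatures as used in this patch; the helper itself is only an illustration):

#include <drm/gpu_scheduler.h>

/* flush on file close (drain the software queue), fini on final release */
static void example_entity_teardown(struct drm_gpu_scheduler *sched,
				    struct drm_sched_entity *entity,
				    long max_wait)
{
	drm_sched_entity_flush(sched, entity, max_wait);
	drm_sched_entity_fini(sched, entity);
}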

This patch also affects the amdgpu and etnaviv drivers which use those functions.

v2:
Also apply the change to vd3.

Signed-off-by: Andrey Grodzovsky 
Suggested-by: Christian König 
Acked-by: Lucas Stach 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   |  8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  4 ++--
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_drv.c |  4 ++--
 drivers/gpu/drm/scheduler/gpu_scheduler.c | 18 +-
 drivers/gpu/drm/v3d/v3d_drv.c |  2 +-
 include/drm/gpu_scheduler.h   |  6 +++---
 11 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 64b3a1e..c0f06c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -104,7 +104,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
 
 failed:
for (j = 0; j < i; j++)
-   drm_sched_entity_fini(&adev->rings[j]->sched,
+   drm_sched_entity_destroy(&adev->rings[j]->sched,
  &ctx->rings[j].entity);
kfree(ctx->fences);
ctx->fences = NULL;
@@ -178,7 +178,7 @@ static void amdgpu_ctx_do_release(struct kref *ref)
if (ctx->adev->rings[i] == &ctx->adev->gfx.kiq.ring)
continue;
 
-   drm_sched_entity_fini(&ctx->adev->rings[i]->sched,
+   drm_sched_entity_destroy(&ctx->adev->rings[i]->sched,
&ctx->rings[i].entity);
}
 
@@ -466,7 +466,7 @@ void amdgpu_ctx_mgr_entity_fini(struct amdgpu_ctx_mgr *mgr)
if (ctx->adev->rings[i] == &ctx->adev->gfx.kiq.ring)
continue;
 
-   max_wait = 
drm_sched_entity_do_release(&ctx->adev->rings[i]->sched,
+   max_wait = 
drm_sched_entity_flush(&ctx->adev->rings[i]->sched,
  &ctx->rings[i].entity, max_wait);
}
}
@@ -492,7 +492,7 @@ void amdgpu_ctx_mgr_entity_cleanup(struct amdgpu_ctx_mgr 
*mgr)
continue;
 
if (kref_read(>refcount) == 1)
-   
drm_sched_entity_cleanup(>adev->rings[i]->sched,
+   
drm_sched_entity_fini(>adev->rings[i]->sched,
>rings[i].entity);
else
DRM_ERROR("ctx %p is still alive\n", ctx); diff 
--git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 0c084d3..0246cb8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -162,7 +162,7 @@ static int amdgpu_ttm_global_init(struct amdgpu_device 
*adev)  static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)  {
if (adev->mman.mem_global_referenced) {
-   drm_sched_entity_fini(adev->mman.entity.sched,
+   drm_sched_entity_destroy(adev->mman.entity.sched,
  >mman.entity);
mutex_destroy(>mman.gtt_window_lock);
drm_global_item_unref(>mman.bo_global_ref.ref);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index cc15d32..0b46ea1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -309,7 +309,7 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
for (j = 0; j < adev->uvd.num_uvd_inst; ++j) {
kfree(adev->uvd.inst[j].saved_bo);
 
-   drm_sched_entity_fini(&adev->uvd.inst[j].ring.sched, 
&adev->uvd.inst[j].entity);
+   drm_sched_entity_destroy(&adev->uvd.inst[j].ring.sched, 
+&adev->uvd.inst[j].entity);
 
amdgpu_bo_free_kernel(&adev->uvd.inst[j].vcpu_bo,
  &adev->uvd.inst[j].gpu_addr,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 23d960e..b0dcdfd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -222,7 +222,7 @@ int amdgpu_vce_sw_fini(struct 

RE: [PATCH] drm/amdgpu: skip huge page for PRT mapping

2018-06-03 Thread Zhou, David(ChunMing)
Good catch, Reviewed-by: Chunming  Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Junwei Zhang
Sent: Monday, June 04, 2018 10:04 AM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Jerry 
Subject: [PATCH] drm/amdgpu: skip huge page for PRT mapping

PRT mapping doesn't support huge page, since it's per PTE basis.

Signed-off-by: Junwei Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 850cd66..4ce8bb0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -,7 +,8 @@ static void amdgpu_vm_handle_huge_pages(struct 
amdgpu_pte_update_params *p,
 
/* In the case of a mixed PT the PDE must point to it*/
if (p->adev->asic_type >= CHIP_VEGA10 && !p->src &&
-   nptes == AMDGPU_VM_PTE_COUNT(p->adev)) {
+   nptes == AMDGPU_VM_PTE_COUNT(p->adev) &&
+   !(flags & AMDGPU_PTE_PRT)) {
/* Set the huge page flag to stop scanning at this PDE */
flags |= AMDGPU_PDE_PTE;
}
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: add rcu_barrier after entity fini

2018-05-16 Thread Zhou, David(ChunMing)
Looks good, Acked-by: Chunming  Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Emily 
Deng
Sent: Thursday, May 17, 2018 11:05 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily 
Subject: [PATCH] drm/amdgpu: add rcu_barrier after entity fini

Freeing a fence from the amdgpu_fence_slab takes two call_rcu() steps. To avoid 
amdgpu_fence_slab_fini() calling kmem_cache_destroy(amdgpu_fence_slab) before 
kmem_cache_free(amdgpu_fence_slab, fence) has run, add an rcu_barrier() after 
drm_sched_entity_fini().

The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below:
1.drm_sched_entity_fini ->
drm_sched_entity_cleanup ->
dma_fence_put(entity->last_scheduled) -> drm_sched_fence_release_finished -> 
drm_sched_fence_release_scheduled -> call_rcu(&fence->finished.rcu, 
drm_sched_fence_free)

2.drm_sched_fence_free ->
dma_fence_put(fence->parent) ->
amdgpu_fence_release ->
call_rcu(&f->rcu, amdgpu_fence_free) ->
kmem_cache_free(amdgpu_fence_slab, fence);

Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index cc3b067..07b2e10 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -134,6 +134,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device 
*adev)
if (adev->mman.mem_global_referenced) {
drm_sched_entity_fini(adev->mman.entity.sched,
			      &adev->mman.entity);
+	rcu_barrier();
	mutex_destroy(&adev->mman.gtt_window_lock);
	drm_global_item_unref(&adev->mman.bo_global_ref.ref);
	drm_global_item_unref(&adev->mman.mem_global_ref);
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
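
(For illustration only, not part of Emily's patch: a minimal, self-contained sketch
of the pattern the fix relies on. A slab whose objects are freed through call_rcu()
must drain those callbacks with rcu_barrier() before the cache itself is destroyed,
otherwise kmem_cache_destroy() can run while a kmem_cache_free() is still pending in
an RCU callback. All names below, example_slab / example_obj, are invented for this
sketch.)

#include <linux/slab.h>
#include <linux/rcupdate.h>

static struct kmem_cache *example_slab;

struct example_obj {
	struct rcu_head rcu;
	/* payload ... */
};

static int example_slab_init(void)
{
	example_slab = kmem_cache_create("example_slab",
					 sizeof(struct example_obj), 0, 0, NULL);
	return example_slab ? 0 : -ENOMEM;
}

static void example_free_rcu(struct rcu_head *rcu)
{
	struct example_obj *obj = container_of(rcu, struct example_obj, rcu);

	/* Runs after a grace period; example_slab must still be alive here. */
	kmem_cache_free(example_slab, obj);
}

static void example_obj_put(struct example_obj *obj)
{
	/* Defer the actual free until all RCU readers are done. */
	call_rcu(&obj->rcu, example_free_rcu);
}

static void example_slab_fini(void)
{
	/* Wait for every pending call_rcu() callback before the cache goes
	 * away. With chained callbacks (one RCU callback queueing another,
	 * as in the fence path traced in this thread) the barrier has to sit
	 * after the point where no further call_rcu() can be queued.
	 */
	rcu_barrier();
	kmem_cache_destroy(example_slab);
}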


RE: [PATCH 2/2] drm/amdgpu: set ttm bo priority before initialization

2018-05-10 Thread Zhou, David(ChunMing)
The series  is OK to me, Reviewed-by: Chunming  Zhou <david1.z...@amd.com>
It is better to wait Christian to have a look  before pushing patch.

Regards,
David Zhou
-Original Message-
From: Junwei Zhang [mailto:jerry.zh...@amd.com] 
Sent: Friday, May 11, 2018 12:58 PM
To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org
Cc: Koenig, Christian <christian.koe...@amd.com>; Zhou, David(ChunMing) 
<david1.z...@amd.com>; Zhang, Jerry <jerry.zh...@amd.com>
Subject: [PATCH 2/2] drm/amdgpu: set ttm bo priority before initialization

Signed-off-by: Junwei Zhang <jerry.zh...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index e62153a..6a9e46a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -419,6 +419,8 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 
	bo->tbo.bdev = &adev->mman.bdev;
	amdgpu_ttm_placement_from_domain(bo, bp->domain);
+	if (bp->type == ttm_bo_type_kernel)
+		bo->tbo.priority = 1;
 
	r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, bp->type,
				 &bo->placement, page_align, &ctx, acc_size,
@@ -434,9 +436,6 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
else
amdgpu_cs_report_moved_bytes(adev, ctx.bytes_moved, 0);
 
-   if (bp->type == ttm_bo_type_kernel)
-   bo->tbo.priority = 1;
-
if (bp->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED &&
bo->tbo.mem.placement & TTM_PL_FLAG_VRAM) {
struct dma_fence *fence;
--
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: bo could be null when access in vm bo update

2018-04-23 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Junwei Zhang
Sent: Monday, April 23, 2018 5:29 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Jerry 
Subject: [PATCH] drm/amdgpu: bo could be null when access in vm bo update

Signed-off-by: Junwei Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 6a372ca..1c00f1a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1509,7 +1509,6 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
struct drm_mm_node *nodes;
struct dma_fence *exclusive, **last_update;
uint64_t flags;
-   uint32_t mem_type;
int r;
 
if (clear || !bo_va->base.bo) {
@@ -1568,9 +1567,9 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 * the evicted list so that it gets validated again on the
 * next command submission.
 */
-   mem_type = bo->tbo.mem.mem_type;
if (bo && bo->tbo.resv == vm->root.base.bo->tbo.resv &&
-   !(bo->preferred_domains & amdgpu_mem_type_to_domain(mem_type)))
+   !(bo->preferred_domains &
+   amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type)))
		list_add_tail(&bo_va->base.vm_status, &vm->evicted);
	spin_unlock(&vm->status_lock);
 
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/scheduler: fix build broken by "move last_sched fence updating prior to job popping"

2018-04-18 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Wednesday, April 18, 2018 6:06 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/scheduler: fix build broken by "move last_sched fence 
updating prior to job popping"

We don't have s_fence as local variable here.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/scheduler/gpu_scheduler.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c 
b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index 5de79bbb12c8..f4b862503710 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -402,7 +402,7 @@ drm_sched_entity_pop_job(struct drm_sched_entity *entity)
		dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
 
	dma_fence_put(entity->last_scheduled);
-	entity->last_scheduled = dma_fence_get(&s_fence->finished);
+	entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
 
	spsc_queue_pop(&entity->job_queue);
return sched_job;
-- 
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: PROBLEM: linux-firmware provided firmware files do not support AMDVLK driver

2018-04-17 Thread Zhou, David(ChunMing)
As far as I know, AMDVLK runs on the amdgpu driver; it does not support the radeon 
kernel driver.

Regards,
David Zhou

-Original Message-
From: boomboom psh [mailto:andrewston...@gmail.com] 
Sent: Wednesday, April 18, 2018 11:16 AM
To: Deucher, Alexander <alexander.deuc...@amd.com>; Zhou, David(ChunMing) 
<david1.z...@amd.com>; Koenig, Christian <christian.koe...@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Subject: PROBLEM: linux-firmware provided firmware files do not support AMDVLK 
driver

[1.] linux-firmware provided firmware files do not support AMDVLK driver [2.] 
Full description of the problem/report: Vulkan instance fails to load on the 
AMDVLK driver, using a radeon HD7770 card. It throws a 
VK_ERROR_OUT_OF_HOST_MEMORY. This appears to be due to an outdated firmware, as 
replacing the firmware with the firmware provided by the amdgpu-pro driver 
fixes the issue ( https://github.com/GPUOpen-Drivers/AMDVLK/issues/17). The 
issue appears to also be present on pitcairn cards. (
https://github.com/GPUOpen-Drivers/AMDVLK/issues/25)
[3.] firmware, AMDVLK, vulkan, SI, verde, pitcairn [4.1.] Linux version 
4.15.15-1-ARCH (builduser@heftig-4572) (gcc version
7.3.1 20180312 (GCC)) #1 SMP PREEMPT Sat Mar 31 23:59:25 UTC 2018 [7.] 
vulkaninfo from https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers
demonstrates the problem
[8.] AMD Ryzen 3 1200, 8GB DDR4, Radeon HD7770 1GB [X.] Workaround: copy 
firmware from amdgpupro driver, however this must be redone every time there is 
an update to linux-firmware.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/3] drm/amdgpu: refresh per vm bo lru

2018-03-27 Thread Zhou, David(ChunMing)
Then how do we keep a unique LRU order? Any ideas?

For stable performance, we have to keep a unique LRU order. Otherwise, as in the 
issue I am looking into, the F1 game sometimes runs at 40fps and sometimes at 
28fps, even when re-validating allowed-domain BOs.

The remaining root cause is that the moved BOs are not the same.


Sent from Smartisan Pro

Christian König wrote on 2018-03-27 at 6:50 PM:

NAK, we already tried that and it is really not a good idea because it
massively increases the per submission overhead.

Christian.

On 27.03.2018 at 12:16, Chunming Zhou wrote:
> Change-Id: Ibad84ed585b0746867a5f4cd1eadc2273e7cf596
> Signed-off-by: Chunming Zhou 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  1 +
>   3 files changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 383bf2d31c92..414e61799236 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -919,6 +919,8 @@ static int amdgpu_bo_vm_update_pte(struct 
> amdgpu_cs_parser *p)
>}
>}
>
> + amdgpu_vm_refresh_lru(adev, vm);
> +
>return r;
>   }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 5e35e23511cf..8ad2bb705765 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1902,6 +1902,21 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct 
> amdgpu_device *adev,
>return bo_va;
>   }
>
> +void amdgpu_vm_refresh_lru(struct amdgpu_device *adev, struct amdgpu_vm *vm)
> +{
> + struct ttm_bo_global *glob = adev->mman.bdev.glob;
> + struct amdgpu_vm_bo_base *bo_base;
> +
> + spin_lock(&vm->status_lock);
> + list_for_each_entry(bo_base, &vm->vm_bo_list, vm_bo) {
> + spin_lock(&glob->lru_lock);
> + ttm_bo_move_to_lru_tail(&bo_base->bo->tbo);
> + if (bo_base->bo->shadow)
> + ttm_bo_move_to_lru_tail(&bo_base->bo->shadow->tbo);
> + spin_unlock(&glob->lru_lock);
> + }
> + spin_unlock(&vm->status_lock);
> +}
>
>   /**
>* amdgpu_vm_bo_insert_mapping - insert a new mapping
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 1886a561c84e..e01895581489 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -285,6 +285,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>  struct dma_fence **fence);
>   int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
>   struct amdgpu_vm *vm);
> +void amdgpu_vm_refresh_lru(struct amdgpu_device *adev, struct amdgpu_vm *vm);
>   int amdgpu_vm_bo_update(struct amdgpu_device *adev,
>struct amdgpu_bo_va *bo_va,
>bool clear);

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: use separate status for buffer funcs availability v2

2018-03-01 Thread Zhou, David(ChunMing)
Patch#1 is Reviewed-by: Chunming Zhou 
Patch#2~#4 are Acked-by: Chunming zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Thursday, March 01, 2018 7:53 PM
To: amd-gfx@lists.freedesktop.org
Cc: ckoenig.leichtzumer...@gmail.com
Subject: [PATCH] drm/amdgpu: use separate status for buffer funcs availability 
v2

The ring status can change during GPU reset, but we still need to be able to 
schedule TTM buffer moves in the meantime.

Otherwise we can run into problems because of aborted move/fill operations 
during GPU resets.

v2: still check if ring is available during direct submit.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 +++--  
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  1 +
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 2aa6823ef503..614811061d3d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -213,9 +213,7 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
abo = ttm_to_amdgpu_bo(bo);
switch (bo->mem.mem_type) {
case TTM_PL_VRAM:
-   if (adev->mman.buffer_funcs &&
-   adev->mman.buffer_funcs_ring &&
-   adev->mman.buffer_funcs_ring->ready == false) {
+   if (!adev->mman.buffer_funcs_enabled) {
amdgpu_ttm_placement_from_domain(abo, 
AMDGPU_GEM_DOMAIN_CPU);
} else if (adev->gmc.visible_vram_size < 
adev->gmc.real_vram_size &&
   !(abo->flags & 
AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) { @@ -331,7 +329,7 @@ int 
amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
const uint64_t GTT_MAX_BYTES = (AMDGPU_GTT_MAX_TRANSFER_SIZE *
AMDGPU_GPU_PAGE_SIZE);
 
-   if (!ring->ready) {
+   if (!adev->mman.buffer_funcs_enabled) {
DRM_ERROR("Trying to move memory with ring turned off.\n");
return -EINVAL;
}
@@ -577,12 +575,9 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, 
bool evict,
amdgpu_move_null(bo, new_mem);
return 0;
}
-   if (adev->mman.buffer_funcs == NULL ||
-   adev->mman.buffer_funcs_ring == NULL ||
-   !adev->mman.buffer_funcs_ring->ready) {
-   /* use memcpy */
+
+   if (!adev->mman.buffer_funcs_enabled)
goto memcpy;
-   }
 
if (old_mem->mem_type == TTM_PL_VRAM &&
new_mem->mem_type == TTM_PL_SYSTEM) { @@ -1549,6 +1544,7 @@ void 
amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, bool enable)
else
size = adev->gmc.visible_vram_size;
man->size = size >> PAGE_SHIFT;
+   adev->mman.buffer_funcs_enabled = enable;
 }
 
 int amdgpu_mmap(struct file *filp, struct vm_area_struct *vma) @@ -1647,6 
+1643,11 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t 
src_offset,
unsigned i;
int r;
 
+   if (direct_submit && !ring->ready) {
+   DRM_ERROR("Trying to move memory with ring turned off.\n");
+   return -EINVAL;
+   }
+
max_bytes = adev->mman.buffer_funcs->copy_max_bytes;
num_loops = DIV_ROUND_UP(byte_count, max_bytes);
num_dw = num_loops * adev->mman.buffer_funcs->copy_num_dw;
@@ -1720,7 +1721,7 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
struct amdgpu_job *job;
int r;
 
-   if (!ring->ready) {
+   if (!adev->mman.buffer_funcs_enabled) {
DRM_ERROR("Trying to clear memory with ring turned off.\n");
return -EINVAL;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index b8117c6e51f1..6ea7de863041 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -53,6 +53,7 @@ struct amdgpu_mman {
/* buffer handling */
const struct amdgpu_buffer_funcs*buffer_funcs;
struct amdgpu_ring  *buffer_funcs_ring;
+   boolbuffer_funcs_enabled;
 
struct mutexgtt_window_lock;
/* Scheduler entity for buffer moves */
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] [WIP]drm/ttm: add waiter list to prevent allocation not in order

2018-01-26 Thread Zhou, David(ChunMing)
I don't want to prevent all concurrent allocation. My new approach is to prevent a 
later allocation from getting ahead of an earlier one and taking the memory space 
that the earlier one freed up through eviction.



Sent from Smartisan Pro

Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 9:24 PM:

Yes, exactly that's the problem.

See when you want to prevent a process B from allocating the memory process A 
has evicted, you need to prevent all concurrent allocation.

And we don't do that because it causes a major performance drop.

Regards,
Christian.

On 26.01.2018 at 14:21, Zhou, David(ChunMing) wrote:
Your patch will prevent concurrent allocation and will result in a large drop in 
allocation performance.


Sent from Smartisan Pro

Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 9:04 PM:

Attached is what you actually want to do cleanly implemented. But as I said 
this is a NO-GO.

Regards,
Christian.

On 26.01.2018 at 13:43, Christian König wrote:
After my investigation, this issue appears to be a defect of the TTM design itself, 
which breaks scheduling balance.
Yeah, but again: this is intended design that we can't change easily.

Regards,
Christian.

On 26.01.2018 at 13:36, Zhou, David(ChunMing) wrote:
I am off work, so I am replying by phone; the formatting may not come through as 
plain text.

Back to the topic itself:
the problem does happen with the amdgpu driver. Someone reported to me that when an 
application runs two instances, their performance differs.
I also reproduced the issue with a unit test (bo_eviction_test). People keep 
thinking our scheduler isn't working as expected.

After my investigation, this issue appears to be a defect of the TTM design itself, 
which breaks scheduling balance.

Further, if we run containers on our GPU, container A could get a high score while 
container B gets a low score on the same benchmark.

So this is a bug that we need to fix.

Regards,
David Zhou


Sent from Smartisan Pro

Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 6:31 PM:

On 26.01.2018 at 11:22, Chunming Zhou wrote:
> There is a scheduling balance issue around get_node, like:
> a. process A allocates all of memory and uses it for submission.
> b. process B tries to allocate memory and has to wait for process A's BO to go 
> idle during eviction.
> c. process A completes its job and process B's eviction frees process A's BO node,
> but in the meantime process C comes along to allocate a BO, gets a node directly, 
> and does its submission,
> so process B again has to wait for process C's BO to go idle.
> d. repeating the above steps, process B can be delayed much longer.
>
> A later allocation must not get ahead of an earlier one in the same place.

Again NAK to the whole approach.

At least with amdgpu the problem you described above never occurs
because evictions are pipelined operations. We could only block for
deleted regions to become free.

But independent of that incoming memory requests while we make room for
eviction are intended to be served first.

Changing that is certainly a no-go cause that would favor memory hungry
applications over small clients.

Regards,
Christian.

>
> Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
> Signed-off-by: Chunming Zhou <david1.z...@amd.com><mailto:david1.z...@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_bo.c| 69 
> +++--
>   include/drm/ttm/ttm_bo_api.h|  7 +
>   include/drm/ttm/ttm_bo_driver.h |  7 +
>   3 files changed, 80 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index d33a6bb742a1..558ec2cf465d 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct 
> ttm_buffer_object *bo,
>return 0;
>   }
>
> +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
> + struct ttm_buffer_object *bo,
> + const struct ttm_place *place)
> +{
> + waiter->tbo = bo;
> + memcpy((void *)&waiter->place, (void *)place, sizeof(*place));
> + INIT_LIST_HEAD(&waiter->list);
> +}
> +
> +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
> +struct ttm_bo_waiter *waiter)
> +{
> + if (!waiter)
> + return;
> + spin_lock(&man->wait_lock);
> + list_add_tail(&waiter->list, &man->waiter_list);
> + spin_unlock(&man->wait_lock);
> +}
> +
> +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
> +struct ttm_bo_waiter *waiter)
> +{
> + if (!waiter)
> + return;
> + spin_lock(&man->wait_lock);
> + if (!list_empty(&waiter->list))
> + list_del(&waiter->list);
> + spin_unlock(>wait_lock);
> + kfree(waiter);
> +}
> +
> +int ttm_man_check_bo

Re: [PATCH 1/2] [WIP]drm/ttm: add waiter list to prevent allocation not in order

2018-01-26 Thread Zhou, David(ChunMing)
Your patch will prevent concurrent allocation and will result in a large drop in 
allocation performance.


Sent from Smartisan Pro

Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 9:04 PM:

Attached is what you actually want to do cleanly implemented. But as I said 
this is a NO-GO.

Regards,
Christian.

On 26.01.2018 at 13:43, Christian König wrote:
After my investigation, this issue appears to be a defect of the TTM design itself, 
which breaks scheduling balance.
Yeah, but again: this is intended design that we can't change easily.

Regards,
Christian.

On 26.01.2018 at 13:36, Zhou, David(ChunMing) wrote:
I am off work, so I am replying by phone; the formatting may not come through as 
plain text.

Back to the topic itself:
the problem does happen with the amdgpu driver. Someone reported to me that when an 
application runs two instances, their performance differs.
I also reproduced the issue with a unit test (bo_eviction_test). People keep 
thinking our scheduler isn't working as expected.

After my investigation, this issue appears to be a defect of the TTM design itself, 
which breaks scheduling balance.

Further, if we run containers on our GPU, container A could get a high score while 
container B gets a low score on the same benchmark.

So this is a bug that we need to fix.

Regards,
David Zhou


Sent from Smartisan Pro

Christian König <ckoenig.leichtzumer...@gmail.com> wrote on 2018-01-26 at 6:31 PM:

On 26.01.2018 at 11:22, Chunming Zhou wrote:
> There is a scheduling balance issue around get_node, like:
> a. process A allocates all of memory and uses it for submission.
> b. process B tries to allocate memory and has to wait for process A's BO to go 
> idle during eviction.
> c. process A completes its job and process B's eviction frees process A's BO node,
> but in the meantime process C comes along to allocate a BO, gets a node directly, 
> and does its submission,
> so process B again has to wait for process C's BO to go idle.
> d. repeating the above steps, process B can be delayed much longer.
>
> A later allocation must not get ahead of an earlier one in the same place.

Again NAK to the whole approach.

At least with amdgpu the problem you described above never occurs
because evictions are pipelined operations. We could only block for
deleted regions to become free.

But independent of that incoming memory requests while we make room for
eviction are intended to be served first.

Changing that is certainly a no-go cause that would favor memory hungry
applications over small clients.

Regards,
Christian.

>
> Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
> Signed-off-by: Chunming Zhou <david1.z...@amd.com><mailto:david1.z...@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_bo.c| 69 
> +++--
>   include/drm/ttm/ttm_bo_api.h|  7 +
>   include/drm/ttm/ttm_bo_driver.h |  7 +
>   3 files changed, 80 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index d33a6bb742a1..558ec2cf465d 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct 
> ttm_buffer_object *bo,
>return 0;
>   }
>
> +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
> + struct ttm_buffer_object *bo,
> + const struct ttm_place *place)
> +{
> + waiter->tbo = bo;
> + memcpy((void *)&waiter->place, (void *)place, sizeof(*place));
> + INIT_LIST_HEAD(&waiter->list);
> +}
> +
> +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
> +struct ttm_bo_waiter *waiter)
> +{
> + if (!waiter)
> + return;
> + spin_lock(&man->wait_lock);
> + list_add_tail(&waiter->list, &man->waiter_list);
> + spin_unlock(&man->wait_lock);
> +}
> +
> +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
> +struct ttm_bo_waiter *waiter)
> +{
> + if (!waiter)
> + return;
> + spin_lock(&man->wait_lock);
> + if (!list_empty(&waiter->list))
> + list_del(&waiter->list);
> + spin_unlock(&man->wait_lock);
> + kfree(waiter);
> +}
> +
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +  struct ttm_buffer_object *bo,
> +  const struct ttm_place *place)
> +{
> + struct ttm_bo_waiter *waiter, *tmp;
> +
> + spin_lock(&man->wait_lock);
> + list_for_each_entry_safe(waiter, tmp, &man->waiter_list, list) {
> + if ((bo != waiter->tbo) &&
> + ((place->fpfn >= waiter->place.fpfn &&
> +   place->fpfn <= waiter->place.lpfn) ||
> +  (place->lpfn <= waiter->place.lpfn && place->lpfn >

Re: [PATCH 1/2] [WIP]drm/ttm: add waiter list to prevent allocation not in order

2018-01-26 Thread Zhou, David(ChunMing)
I am off work, so I am replying by phone; the formatting may not come through as 
plain text.

Back to the topic itself:
the problem does happen with the amdgpu driver. Someone reported to me that when an 
application runs two instances, their performance differs.
I also reproduced the issue with a unit test (bo_eviction_test). People keep 
thinking our scheduler isn't working as expected.

After my investigation, this issue appears to be a defect of the TTM design itself, 
which breaks scheduling balance.

Further, if we run containers on our GPU, container A could get a high score while 
container B gets a low score on the same benchmark.

So this is a bug that we need to fix.

Regards,
David Zhou


Sent from Smartisan Pro

Christian König wrote on 2018-01-26 at 6:31 PM:

On 26.01.2018 at 11:22, Chunming Zhou wrote:
> There is a scheduling balance issue around get_node, like:
> a. process A allocates all of memory and uses it for submission.
> b. process B tries to allocate memory and has to wait for process A's BO to go 
> idle during eviction.
> c. process A completes its job and process B's eviction frees process A's BO node,
> but in the meantime process C comes along to allocate a BO, gets a node directly, 
> and does its submission,
> so process B again has to wait for process C's BO to go idle.
> d. repeating the above steps, process B can be delayed much longer.
>
> A later allocation must not get ahead of an earlier one in the same place.

Again NAK to the whole approach.

At least with amdgpu the problem you described above never occurs
because evictions are pipelined operations. We could only block for
deleted regions to become free.

But independent of that incoming memory requests while we make room for
eviction are intended to be served first.

Changing that is certainly a no-go cause that would favor memory hungry
applications over small clients.

Regards,
Christian.

>
> Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
> Signed-off-by: Chunming Zhou 
> ---
>   drivers/gpu/drm/ttm/ttm_bo.c| 69 
> +++--
>   include/drm/ttm/ttm_bo_api.h|  7 +
>   include/drm/ttm/ttm_bo_driver.h |  7 +
>   3 files changed, 80 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index d33a6bb742a1..558ec2cf465d 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct 
> ttm_buffer_object *bo,
>return 0;
>   }
>
> +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
> + struct ttm_buffer_object *bo,
> + const struct ttm_place *place)
> +{
> + waiter->tbo = bo;
> + memcpy((void *)&waiter->place, (void *)place, sizeof(*place));
> + INIT_LIST_HEAD(&waiter->list);
> +}
> +
> +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
> +struct ttm_bo_waiter *waiter)
> +{
> + if (!waiter)
> + return;
> + spin_lock(&man->wait_lock);
> + list_add_tail(&waiter->list, &man->waiter_list);
> + spin_unlock(&man->wait_lock);
> +}
> +
> +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
> +struct ttm_bo_waiter *waiter)
> +{
> + if (!waiter)
> + return;
> + spin_lock(&man->wait_lock);
> + if (!list_empty(&waiter->list))
> + list_del(&waiter->list);
> + spin_unlock(&man->wait_lock);
> + kfree(waiter);
> +}
> +
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +  struct ttm_buffer_object *bo,
> +  const struct ttm_place *place)
> +{
> + struct ttm_bo_waiter *waiter, *tmp;
> +
> + spin_lock(&man->wait_lock);
> + list_for_each_entry_safe(waiter, tmp, &man->waiter_list, list) {
> + if ((bo != waiter->tbo) &&
> + ((place->fpfn >= waiter->place.fpfn &&
> +   place->fpfn <= waiter->place.lpfn) ||
> +  (place->lpfn <= waiter->place.lpfn && place->lpfn >=
> +   waiter->place.fpfn)))
> + goto later_bo;
> + }
> + spin_unlock(&man->wait_lock);
> + return true;
> +later_bo:
> + spin_unlock(&man->wait_lock);
> + return false;
> +}
>   /**
>* Repeatedly evict memory from the LRU for @mem_type until we create enough
>* space, or we've evicted everything and there isn't enough space.
> @@ -853,17 +905,26 @@ static int ttm_bo_mem_force_space(struct 
> ttm_buffer_object *bo,
>   {
>struct ttm_bo_device *bdev = bo->bdev;
>struct ttm_mem_type_manager *man = &bdev->man[mem_type];
> + struct ttm_bo_waiter waiter;
>int ret;
>
> + ttm_man_init_waiter(&waiter, bo, place);
> + ttm_man_add_waiter(man, &waiter);
>do {
>ret = (*man->func->get_node)(man, bo, place, mem);
> - if (unlikely(ret != 0))
> + if (unlikely(ret != 0)) {
> + ttm_man_del_waiter(man, &waiter);
>return ret;
> - if (mem->mm_node)
> + 
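
(For illustration only, not the actual TTM code: a simplified, self-contained
sketch of the waiter-list idea from the [WIP] patch quoted above. A later request
backs off when its placement range overlaps a range that an earlier waiter is still
freeing up through eviction. The types and names below, range_waiter /
range_manager / range_may_allocate, are invented for this sketch.)

#include <linux/list.h>
#include <linux/spinlock.h>

/* A request that is still evicting to make room in [fpfn, lpfn]. */
struct range_waiter {
	struct list_head list;
	unsigned long fpfn;	/* first page frame of the wanted range */
	unsigned long lpfn;	/* last page frame of the wanted range */
};

struct range_manager {
	spinlock_t wait_lock;
	struct list_head waiter_list;	/* earlier requests, oldest first */
};

/* True if [fpfn, lpfn] overlaps no earlier waiter, i.e. a later request may
 * allocate here without stealing the space an earlier request is currently
 * making free through eviction. */
static bool range_may_allocate(struct range_manager *man,
			       unsigned long fpfn, unsigned long lpfn)
{
	struct range_waiter *w;
	bool ok = true;

	spin_lock(&man->wait_lock);
	list_for_each_entry(w, &man->waiter_list, list) {
		if (fpfn <= w->lpfn && lpfn >= w->fpfn) {
			ok = false;	/* ranges overlap, back off */
			break;
		}
	}
	spin_unlock(&man->wait_lock);
	return ok;
}

A check like this trades some allocation concurrency for fairness, which is exactly
the trade-off being debated in this thread.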

RE: [PATCH 1/5] drm/amdgpu: add new asic callbacks for HDP flush/invalidation

2018-01-04 Thread Zhou, David(ChunMing)
It seems amdgpu_asic_invalidate_hdp() isn't used in the following patches.

Regards,
David Zhou

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Alex 
Deucher
Sent: Friday, January 05, 2018 12:19 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Li, Samuel 

Subject: [PATCH 1/5] drm/amdgpu: add new asic callbacks for HDP 
flush/invalidation

Needed to properly flush the HDP cache with the CPU rather than the GPU.

Signed-off-by: Alex Deucher 
Signed-off-by: Samuel Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 6 ++
 1 file changed, 6 insertions(+)

I keep needing to resurrect these patches to test things periodically so I'd 
like to get them merged even if we don't have a pressing use case at the moment.

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 642bea2c9b3a..88f41c41c70a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1287,6 +1287,10 @@ struct amdgpu_asic_funcs {
void (*set_pcie_lanes)(struct amdgpu_device *adev, int lanes);
/* get config memsize register */
u32 (*get_config_memsize)(struct amdgpu_device *adev);
+   /* flush hdp write queue */
+   void (*flush_hdp)(struct amdgpu_device *adev);
+   /* invalidate hdp read cache */
+   void (*invalidate_hdp)(struct amdgpu_device *adev);
 };
 
 /*
@@ -1836,6 +1840,8 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)  
#define amdgpu_asic_read_bios_from_rom(adev, b, l) 
(adev)->asic_funcs->read_bios_from_rom((adev), (b), (l))  #define 
amdgpu_asic_read_register(adev, se, sh, offset, 
v)((adev)->asic_funcs->read_register((adev), (se), (sh), (offset), (v)))  
#define amdgpu_asic_get_config_memsize(adev) 
(adev)->asic_funcs->get_config_memsize((adev))
+#define amdgpu_asic_flush_hdp(adev) 
+(adev)->asic_funcs->flush_hdp((adev))
+#define amdgpu_asic_invalidate_hdp(adev) 
+(adev)->asic_funcs->invalidate_hdp((adev))
 #define amdgpu_gart_flush_gpu_tlb(adev, vmid) 
(adev)->gart.gart_funcs->flush_gpu_tlb((adev), (vmid))  #define 
amdgpu_gart_set_pte_pde(adev, pt, idx, addr, flags) 
(adev)->gart.gart_funcs->set_pte_pde((adev), (pt), (idx), (addr), (flags))  
#define amdgpu_gart_get_vm_pde(adev, level, dst, flags) 
(adev)->gart.gart_funcs->get_vm_pde((adev), (level), (dst), (flags))
--
2.13.6

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/4] drm/amdgpu: minor optimize VM moved handling v2

2018-01-03 Thread Zhou, David(ChunMing)
>>>> +else if (reservation_object_trylock(resv))
>>>> +clear = false;

This will affect BOs in the BO list, won't it?


发自坚果 Pro

Koenig, Christian <christian.koe...@amd.com> wrote on 2018-01-03 at 6:47 PM:

On 03.01.2018 at 11:43, Chunming Zhou wrote:
>
>
> On 2018-01-03 at 17:25, Christian König wrote:
>> On 03.01.2018 at 09:10, Zhou, David(ChunMing) wrote:
>>>
>>> On 2018-01-02 at 22:47, Christian König wrote:
>>>> Try to lock moved BOs if it's successful we can update the
>>>> PTEs directly to the new location.
>>>>
>>>> v2: rebase
>>>>
>>>> Signed-off-by: Christian König <christian.koe...@amd.com>
>>>> ---
>>>>drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15 ++-
>>>>1 file changed, 14 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 3632c69f1814..c1c5ccdee783 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -1697,18 +1697,31 @@ int amdgpu_vm_handle_moved(struct
>>>> amdgpu_device *adev,
>>>>spin_lock(>status_lock);
>>>>while (!list_empty(>moved)) {
>>>>struct amdgpu_bo_va *bo_va;
>>>> +struct reservation_object *resv;
>>>>   bo_va = list_first_entry(>moved,
>>>>struct amdgpu_bo_va, base.vm_status);
>>>>spin_unlock(>status_lock);
>>>>+resv = bo_va->base.bo->tbo.resv;
>>>> +
>>>>/* Per VM BOs never need to bo cleared in the page
>>>> tables */
>>> This reminds us that per-VM BOs need to be cleared as well after we allow
>>> evicting/swapping out per-VM BOs.
>>
>> Actually they don't. The page tables only need to be valid during CS.
>>
>> So what happens is that the per-VM BOs are validated right before
>> we call amdgpu_vm_handle_moved().
> Yeah, I agree for the per-VM-BO situation, after checking all the cases that
> add to the moved list:
> 1. validate pt bos
> 2. bo invalidate
> 3. insert_map for per-vm-bo
> Items #1 and #3 are both per-VM BOs; they are already validated before
> handle_moved().
>
> For item #2, there are three places that call it:
> a. amdgpu_bo_vm_update_pte in CS for amdgpu_vm_debug
> b. amdgpu_gem_op_ioctl, but that is for the evicted list, nothing to do with
> the moved list.
> c. amdgpu_bo_move_notify when a bo is validated.
>
> For case c, your optimization is valid; we don't need a clear for a validated
> bo.
> But for case a, yours will break the amdgpu_vm_debug functionality.
> Right?
> Right?

Interesting point, but no that should be handled as well.

The vm_debug handling is only for the BOs on the BO-list. E.g. per VM
BOs are never handled here.

Regards,
Christian.

>
> Regards,
> David Zhou
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> Regards,
>>> David Zhou
>>>> -clear = bo_va->base.bo->tbo.resv !=
>>>> vm->root.base.bo->tbo.resv;
>>>> +if (resv == vm->root.base.bo->tbo.resv)
>>>> +clear = false;
>>>> +/* Try to reserve the BO to avoid clearing its ptes */
>>>> +else if (reservation_object_trylock(resv))
>>>> +clear = false;
>>>> +/* Somebody else is using the BO right now */
>>>> +else
>>>> +clear = true;
>>>>   r = amdgpu_vm_bo_update(adev, bo_va, clear);
>>>>if (r)
>>>>return r;
>>>>+if (!clear && resv != vm->root.base.bo->tbo.resv)
>>>> +reservation_object_unlock(resv);
>>>> +
>>>>spin_lock(>status_lock);
>>>>}
>>>>spin_unlock(>status_lock);
>>
>

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/4] drm/amdgpu: minor optimize VM moved handling v2

2018-01-03 Thread Zhou, David(ChunMing)


On 2018-01-02 at 22:47, Christian König wrote:
> Try to lock moved BOs if it's successful we can update the
> PTEs directly to the new location.
>
> v2: rebase
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15 ++-
>   1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 3632c69f1814..c1c5ccdee783 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1697,18 +1697,31 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
>   spin_lock(>status_lock);
>   while (!list_empty(>moved)) {
>   struct amdgpu_bo_va *bo_va;
> + struct reservation_object *resv;
>   
>   bo_va = list_first_entry(>moved,
>   struct amdgpu_bo_va, base.vm_status);
>   spin_unlock(>status_lock);
>   
> + resv = bo_va->base.bo->tbo.resv;
> +
>   /* Per VM BOs never need to bo cleared in the page tables */
This reminds us that per-VM BOs need to be cleared as well after we allow 
evicting/swapping out per-VM BOs.

Regards,
David Zhou
> - clear = bo_va->base.bo->tbo.resv != vm->root.base.bo->tbo.resv;
> + if (resv == vm->root.base.bo->tbo.resv)
> + clear = false;
> + /* Try to reserve the BO to avoid clearing its ptes */
> + else if (reservation_object_trylock(resv))
> + clear = false;
> + /* Somebody else is using the BO right now */
> + else
> + clear = true;
>   
>   r = amdgpu_vm_bo_update(adev, bo_va, clear);
>   if (r)
>   return r;
>   
> + if (!clear && resv != vm->root.base.bo->tbo.resv)
> + reservation_object_unlock(resv);
> +
>   spin_lock(>status_lock);
>   }
>   spin_unlock(>status_lock);

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 4/5] drm/amdgpu: rename vm_id to vmid

2017-12-20 Thread Zhou, David(ChunMing)
Reviewed-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Wednesday, December 20, 2017 9:21 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 4/5] drm/amdgpu: rename vm_id to vmid

sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.h

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c|  8 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c   |  6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h|  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 28 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 14 ++--
 drivers/gpu/drm/amd/amdgpu/cik_ih.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 14 ++--
 drivers/gpu/drm/amd/amdgpu/cz_ih.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c | 14 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 18 
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 18 
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 18 
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 14 ++--
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c| 16 ++
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c| 16 ++
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c| 18 +++-
 drivers/gpu/drm/amd/amdgpu/si_dma.c   | 16 +++---
 drivers/gpu/drm/amd/amdgpu/si_ih.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 26 +++---
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 36 +++
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 10 -
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 18 
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 36 +++
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c|  4 ++--
 33 files changed, 188 insertions(+), 194 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 680d4f6de52d..15903ffdf0b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -351,7 +351,7 @@ struct amdgpu_gart_funcs {
/* get the pde for a given mc addr */
void (*get_vm_pde)(struct amdgpu_device *adev, int level,
   u64 *dst, u64 *flags);
-   uint32_t (*get_invalidate_req)(unsigned int vm_id);
+   uint32_t (*get_invalidate_req)(unsigned int vmid);
 };
 
 /* provided by the ih block */
@@ -1124,7 +1124,7 @@ struct amdgpu_job {
void*owner;
uint64_tfence_ctx; /* the fence_context this job uses */
boolvm_needs_flush;
-   unsignedvm_id;
+   unsignedvmid;
uint64_tvm_pd_addr;
uint32_tgds_base, gds_size;
uint32_tgws_base, gws_size;
@@ -1852,7 +1852,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_ring_get_rptr(r) (r)->funcs->get_rptr((r))
 #define amdgpu_ring_get_wptr(r) (r)->funcs->get_wptr((r))
 #define amdgpu_ring_set_wptr(r) (r)->funcs->set_wptr((r))
-#define amdgpu_ring_emit_ib(r, ib, vm_id, c) (r)->funcs->emit_ib((r), (ib), 
(vm_id), (c))
+#define amdgpu_ring_emit_ib(r, ib, vmid, c) (r)->funcs->emit_ib((r), (ib), 
(vmid), (c))
 #define amdgpu_ring_emit_pipeline_sync(r) (r)->funcs->emit_pipeline_sync((r))
 #define amdgpu_ring_emit_vm_flush(r, vmid, addr) 
(r)->funcs->emit_vm_flush((r), (vmid), (addr))
 #define amdgpu_ring_emit_fence(r, addr, seq, flags) 
(r)->funcs->emit_fence((r), (addr), (seq), (flags))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 03a69942cce5..a162d87ca0c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -149,7 +149,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
return -EINVAL;
}
 
-   if (vm && !job->vm_id) {
+   if (vm && !job->vmid) {
dev_err(adev->dev, "VM IB without ID\n");
return -EINVAL;
}
@@ -211,7 +211,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
!amdgpu_sriov_vf(adev)) /* for SRIOV preemption, 
Preamble CE ib must be inserted anyway */
continue;
 
-   

RE: [PATCH 3/5] drm/amdgpu: separate VMID and PASID handling

2017-12-20 Thread Zhou, David(ChunMing)
Looks very good, Reviewed-by: Chunming Zhou 

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Wednesday, December 20, 2017 9:21 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 3/5] drm/amdgpu: separate VMID and PASID handling

Move both into the new files amdgpu_ids.[ch]. No functional change.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/Makefile   |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c   | 459 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h   |  91 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 422 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  44 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c |   2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c |   2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c |   2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |   2 +-
 13 files changed, 579 insertions(+), 465 deletions(-)  create mode 100644 
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index d8da12c114b1..d6e5b7273853 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -52,7 +52,8 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \
-   amdgpu_queue_mgr.o amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o
+   amdgpu_queue_mgr.o amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o \
+   amdgpu_ids.o
 
 # add asic specific block
 amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \ diff 
--git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 1e3e9be7d77e..1ae149456c9f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -169,8 +169,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_vmem_size = get_vmem_size,
.get_gpu_clock_counter = get_gpu_clock_counter,
.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
-   .alloc_pasid = amdgpu_vm_alloc_pasid,
-   .free_pasid = amdgpu_vm_free_pasid,
+   .alloc_pasid = amdgpu_pasid_alloc,
+   .free_pasid = amdgpu_pasid_free,
.program_sh_mem_settings = kgd_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
.init_pipeline = kgd_init_pipeline,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 056929b8ccd0..e9b436bc8dcb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -128,8 +128,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_vmem_size = get_vmem_size,
.get_gpu_clock_counter = get_gpu_clock_counter,
.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
-   .alloc_pasid = amdgpu_vm_alloc_pasid,
-   .free_pasid = amdgpu_vm_free_pasid,
+   .alloc_pasid = amdgpu_pasid_alloc,
+   .free_pasid = amdgpu_pasid_free,
.program_sh_mem_settings = kgd_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
.init_pipeline = kgd_init_pipeline,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 0cf86eb357d6..03a69942cce5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -230,8 +230,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
if (r) {
dev_err(adev->dev, "failed to emit fence (%d)\n", r);
if (job && job->vm_id)
-   amdgpu_vm_reset_id(adev, ring->funcs->vmhub,
-  job->vm_id);
+   amdgpu_vmid_reset(adev, ring->funcs->vmhub, job->vm_id);
amdgpu_ring_undo(ring);
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
new file mode 100644
index ..71f8a76d4c10
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -0,0 +1,459 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person 
+obtaining a
+ * copy of 

  1   2   >