from:"Chunming Zhou"

Re: [PATCH] drm/amdgpu: remove distinction between explicit and implicit sync (v2)

2020-06-11 Thread Chunming Zhou

I didn't check the patch details. If it is for the existing implicit sync
of shared buffers, feel free to go ahead.

But if you add a description of its usage, it will be clearer to others.

-David

On 2020/6/11 15:19, Marek Olšák wrote:

Hi David,

Explicit sync has nothing to do with this. This is for implicit sync,
which is required by DRI3. This fix allows removing existing
inefficiencies from drivers, so it's a good thing.

Marek

On Wed., Jun. 10, 2020, 03:56 Chunming Zhou <zhou...@amd.com> wrote:

On 2020/6/10 15:41, Christian König wrote:

That's true, but for now we are stuck with the implicit sync for
quite a number of use cases.

My problem is rather that we already tried this and it backfired
immediately.

I do remember that it was your patch that introduced the pipeline
sync flag handling and I warned that this could be problematic.
You then came back with a QA result saying that this is indeed
causing a huge performance drop in one test case and we need to
do something else. Together we then came up with the different
handling between implicit and explicit sync.

Isn't the pipeline sync flag meant to fix some issue caused by parallel
execution between jobs in one pipeline? I really don't recall why
that's related to this. Or do you mean the extra sync hides many other
potential issues?

Anyway, when I went through the Vulkan WSI code, the synchronization
with the OS window system wasn't so smooth. And when I saw Jason
driving explicit sync through the whole Linux ecosystem, like the
Android window system does, I felt that's really a good direction.

-David

But I can't find that stupid mail thread any more. I know that it
was a couple of years ago when we started with the explicit sync
for Vulkan.

Christian.

On 10.06.20 at 08:29, Zhou, David(ChunMing) wrote:

[AMD Official Use Only - Internal Distribution Only]

Not sure if this is the right direction. I think usermode wants all
synchronization to be explicit. Implicit sync often confuses
people who don’t know its history. I remember Jason from Intel
is driving explicit synchronization through the Linux
ecosystem, which even removes implicit sync of shared buffers.

-David

From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Marek Olšák
Sent: Tuesday, June 9, 2020 6:58 PM
To: amd-gfx mailing list <amd-gfx@lists.freedesktop.org>
Subject: [PATCH] drm/amdgpu: remove distinction between explicit and implicit sync (v2)

Hi,

This enables a full pipeline sync for implicit sync. It's
Christian's patch with the driver version bumped. With this,
user mode drivers don't have to wait for idle at the end of gfx IBs.

Any concerns?

Thanks,

Marek

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu: remove distinction between explicit and implicit sync (v2)

2020-06-10 Thread Chunming Zhou



On 2020/6/10 15:41, Christian König wrote:
That's true, but for now we are stuck with the implicit sync for quite 
a number of use cases.


My problem is rather that we already tried this and it backfired 
immediately.


I do remember that it was your patch that introduced the pipeline sync 
flag handling and I warned that this could be problematic. You then 
came back with a QA result saying that this is indeed causing a huge 
performance drop in one test case and we need to do something else. 
Together we then came up with the different handling between implicit 
and explicit sync.


Isn't the pipeline sync flag meant to fix some issue caused by parallel
execution between jobs in one pipeline? I really don't recall why
that's related to this. Or do you mean the extra sync hides many other
potential issues?


Anyway, when I went through the Vulkan WSI code, the synchronization
with the OS window system wasn't so smooth. And when I saw Jason
driving explicit sync through the whole Linux ecosystem, like the
Android window system does, I felt that's really a good direction.


-David



But I can't find that stupid mail thread any more. I know that it was
a couple of years ago when we started with the explicit sync for Vulkan.


Christian.

On 10.06.20 at 08:29, Zhou, David(ChunMing) wrote:


[AMD Official Use Only - Internal Distribution Only]

Not sure if this is the right direction. I think usermode wants all
synchronization to be explicit. Implicit sync often confuses people
who don’t know its history. I remember Jason from Intel is driving
explicit synchronization through the Linux ecosystem, which even
removes implicit sync of shared buffers.


-David

From: amd-gfx On Behalf Of Marek Olšák
Sent: Tuesday, June 9, 2020 6:58 PM
To: amd-gfx mailing list
Subject: [PATCH] drm/amdgpu: remove distinction between explicit and implicit sync (v2)


Hi,

This enables a full pipeline sync for implicit sync. It's Christian's 
patch with the driver version bumped. With this, user mode drivers 
don't have to wait for idle at the end of gfx IBs.


Any concerns?

Thanks,

Marek



[PATCH] MAINTAINERS: Remove me from amdgpu maintainers

2020-05-06 Thread Chunming Zhou

Glad to have spent time on the kernel driver in past years.
I've moved to a new focus in UMD and can't commit
enough time to discussions.

Signed-off-by: Chunming Zhou 
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 938316092634..4ca508bd4c9e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14066,7 +14066,6 @@ F:  drivers/net/wireless/quantenna
 RADEON and AMDGPU DRM DRIVERS
 M: Alex Deucher 
 M: Christian König 
-M: David (ChunMing) Zhou 
 L: amd-gfx@lists.freedesktop.org
 S: Supported
 T: git git://people.freedesktop.org/~agd5f/linux
-- 
2.17.1


Re: drm/amdgpu: apply AMDGPU_IB_FLAG_EMIT_MEM_SYNC to compute IBs too

2020-04-27 Thread Chunming Zhou

Yes, same question.

In fact, the PAL command stream has its own Release/Acquire packets. We use
the flag per your request.

-David

On 2020/4/27 22:53, Christian König wrote:

Yeah, but is Mesa going to use it?

Christian.

On 27.04.20 at 15:54, Marek Olšák wrote:
PAL requested it and they are going to use it. (it looks like they
have to use it for correctness)

Marek

On Mon, Apr 27, 2020 at 9:02 AM Deucher, Alexander
<alexander.deuc...@amd.com> wrote:

[AMD Official Use Only - Internal Distribution Only]

Do we have open source UMD code which uses this?

Alex

From: Christian König <ckoenig.leichtzumer...@gmail.com>
Sent: Sunday, April 26, 2020 4:55 AM
To: Marek Olšák <mar...@gmail.com>; Koenig, Christian <christian.koe...@amd.com>
Cc: Deucher, Alexander <alexander.deuc...@amd.com>; amd-gfx mailing list <amd-gfx@lists.freedesktop.org>
Subject: Re: drm/amdgpu: apply AMDGPU_IB_FLAG_EMIT_MEM_SYNC to compute IBs too

Thanks for that explanation. I suspected that there was a good
reason to have that in the kernel, but couldn't find one.

In this case the patch is Reviewed-by: Christian König

We should probably add this explanation as comment to the flag as
well.

Thanks,
Christian.

On 26.04.20 at 02:43, Marek Olšák wrote:

It was merged into amd-staging-drm-next.

I'm not absolutely sure, but I think we need to invalidate
before IBs if an IB is cached in L2 and the CPU has updated it.
It can only be cached in L2 if something other than CP has read
it or written to it without invalidation. CP reads don't cache
it but they can hit the cache if it's already cached.

For CE, we need to invalidate before the IB in the kernel,
because CE IBs can't do cache invalidations IIRC. This is the
number one reason for merging the already pushed commits.

Marek

On Sat., Apr. 25, 2020, 11:03 Christian König
<ckoenig.leichtzumer...@gmail.com> wrote:

Was that patch set actually merged upstream? My last status
is that we couldn't find a reason why we need to do this in
the kernel.

Christian.

On 25.04.20 at 10:52, Marek Olšák wrote:

This was missed.

Marek


Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete worker

2020-04-09 Thread Chunming Zhou


I think we can have both.

Even if we switch to spin_trylock, I think it is OK to keep the
cond_resched() Xinhui added in this patch. That gives urgent tasks more
chance to use the CPU.



-David

On 2020/4/9 22:59, Christian König wrote:

Why do we break out of the loops when there are pending BOs to be released?


We do this anyway if we can't acquire the necessary locks. Freeing 
already deleted BOs is just a very lazy background work.



So I think this patch does not break anything.


Oh, the patch will certainly work. I'm just not sure if it's the ideal 
behavior.



https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026

This is another example of the usage of cond_resched.


Yes, and that is also a good example of what I mean here:

	if (!mutex_trylock(&slab_mutex))
		/* Give up. Setup the next iteration. */
		goto out;


If the function can't acquire the lock immediately it gives up and 
waits for the next iteration.


I think it would be better if we do this in TTM as well if we spend too
much time cleaning up old BOs.


On the other hand you are right that cond_resched() has the advantage
that we could spend more time on cleaning up old BOs if there is
nothing else for the CPU to do.


Regards,
Christian.
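
The give-up-on-contention pattern described above can be sketched in userspace C. This is only a minimal sketch under stated assumptions, not TTM or slab code: struct lazy_cleaner and the cleaner_* helpers are hypothetical names, a C11 atomic_flag stands in for the kernel's mutex_trylock(), and the work budget is an illustrative addition for "spending too much time".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a background cleanup worker's state. */
struct lazy_cleaner {
	atomic_flag busy;	/* set while someone holds the lock */
	size_t pending;		/* objects still waiting to be freed */
};

/* Returns true if the lock was taken, false if it is already held. */
static bool cleaner_trylock(struct lazy_cleaner *c)
{
	return !atomic_flag_test_and_set_explicit(&c->busy,
						  memory_order_acquire);
}

static void cleaner_unlock(struct lazy_cleaner *c)
{
	atomic_flag_clear_explicit(&c->busy, memory_order_release);
}

/*
 * One pass of the background worker. Frees at most `budget` objects and
 * returns how many were freed. If the lock is contended it returns 0
 * without blocking and leaves the work for the next iteration.
 */
static size_t cleaner_run_once(struct lazy_cleaner *c, size_t budget)
{
	size_t freed = 0;

	assert(c != NULL);

	if (!cleaner_trylock(c))
		return 0;	/* give up, retry on the next cycle */

	while (c->pending > 0 && freed < budget) {
		c->pending--;	/* stand-in for freeing one object */
		freed++;
	}

	cleaner_unlock(c);
	return freed;
}
```

The worker never sleeps on the lock and never yields on purpose; it simply returns and relies on being run again later, which is the behavior argued for in the reply above.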

On 09.04.20 at 16:24, Pan, Xinhui wrote:

https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026

This is another example of the usage of cond_resched.

From: Pan, Xinhui
Sent: Thursday, April 9, 2020 10:11:08 PM
To: Lucas Stach; amd-gfx@lists.freedesktop.org; Koenig, Christian
Cc: dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete worker

I think it doesn't matter if the work item schedules out. Even if we did
not schedule out, the workqueue itself would schedule out later.

So I think this patch does not break anything.

From: Pan, Xinhui
Sent: Thursday, April 9, 2020 10:07:09 PM
To: Lucas Stach; amd-gfx@lists.freedesktop.org; Koenig, Christian
Cc: dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete worker

Why do we break out of the loops when there are pending BOs to be released?

And I just checked process_one_work. Right after the work item
callback is called, the workqueue itself will call cond_resched. So
I think


From: Koenig, Christian
Sent: Thursday, April 9, 2020 9:38:24 PM
To: Lucas Stach; Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete worker

On 09.04.20 at 15:25, Lucas Stach wrote:
> On Thursday, 09.04.2020 at 14:35 +0200, Christian König wrote:
>> On 09.04.20 at 03:31, xinhui pan wrote:
>>> The delayed delete list is per device, which might be very huge. And in
>>> a heavy workload test, the list might never be empty. That will
>>> trigger RCU stall warnings or softlockups in non-preemptible kernels.
>>> Let's schedule out if possible in that case.
>> Mhm, I'm not sure if that is actually allowed. This is called from a
>> work item and those are not really supposed to be scheduled away.
> Huh? Workitems can schedule out just fine, otherwise they would be
> horribly broken when it comes to sleeping locks.

Let me refine the sentence: Work items are not really supposed to be
scheduled purposely. E.g. you shouldn't call schedule() or
cond_resched() like in the case here.

Getting scheduled away because we wait for a lock is of course perfectly
fine.

>   The workqueue code
> even has measures to keep the workqueues at the expected concurrency
> level by starting other workitems when one of them goes to sleep.

Yeah, and exactly that's what I would say we should avoid here :)

In other words work items can be scheduled away, but they should not if
not really necessary (e.g. waiting for a lock).

Otherwise as you said new threads for work item processing are started
up and I don't think we want that.

Just returning from the work item and waiting for the next cycle is most
likely the better option.

Regards,
Christian.

>
>

Re: [PATCH] drm/amdgpu: resvert "disable bulk moves for now"

2019-09-12 Thread Chunming Zhou

RB on it to go ahead.

-David

On 2019/9/12 18:15, Christian König wrote:
> This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a.
>
> The changes to fix this should have landed in 5.1.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 --
>   1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 48349e4f0701..fd3fbaa73fa3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -603,14 +603,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device 
> *adev,
>   struct ttm_bo_global *glob = adev->mman.bdev.glob;
>   struct amdgpu_vm_bo_base *bo_base;
>   
> -#if 0
>   if (vm->bulk_moveable) {
>   spin_lock(&glob->lru_lock);
>   ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
>   spin_unlock(&glob->lru_lock);
>   return;
>   }
> -#endif
>   
>   memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
>   

Re: [PATCH] drm/amdgpu: grab the id mgr lock while accessing passid_mapping

2019-09-10 Thread Chunming Zhou

Reviewed-by: Chunming Zhou 

On 2019/9/10 16:56, Christian König wrote:
> Ping!
>
> On 09.09.19 at 13:59, Christian König wrote:
>> Need to make sure that we are actually dropping the right fence.
>> Could be done with RCU as well, but too complicated for a fix.
>>
>> Signed-off-by: Christian König 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 12 +---
>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index b285ab25146d..e11764164cbf 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1036,10 +1036,8 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, 
>> struct amdgpu_job *job, bool need_
>>   id->oa_base != job->oa_base ||
>>   id->oa_size != job->oa_size);
>>   bool vm_flush_needed = job->vm_needs_flush;
>> -    bool pasid_mapping_needed = id->pasid != job->pasid ||
>> -    !id->pasid_mapping ||
>> -    !dma_fence_is_signaled(id->pasid_mapping);
>>   struct dma_fence *fence = NULL;
>> +    bool pasid_mapping_needed;
>>   unsigned patch_offset = 0;
>>   int r;
>>   @@ -1049,6 +1047,12 @@ int amdgpu_vm_flush(struct amdgpu_ring 
>> *ring, struct amdgpu_job *job, bool need_
>>   pasid_mapping_needed = true;
>>   }
>> +    mutex_lock(&id_mgr->lock);
>> +    if (id->pasid != job->pasid || !id->pasid_mapping ||
>> +    !dma_fence_is_signaled(id->pasid_mapping))
>> +    pasid_mapping_needed = true;
>> +    mutex_unlock(&id_mgr->lock);
>> +
>>   gds_switch_needed &= !!ring->funcs->emit_gds_switch;
>>   vm_flush_needed &= !!ring->funcs->emit_vm_flush &&
>>   job->vm_pd_addr != AMDGPU_BO_INVALID_OFFSET;
>> @@ -1088,9 +1092,11 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, 
>> struct amdgpu_job *job, bool need_
>>   }
>>     if (pasid_mapping_needed) {
>> +    mutex_lock(&id_mgr->lock);
>>   id->pasid = job->pasid;
>>   dma_fence_put(id->pasid_mapping);
>>   id->pasid_mapping = dma_fence_get(fence);
>> +    mutex_unlock(&id_mgr->lock);
>>   }
>>   dma_fence_put(fence);
>

Re: [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict

2019-09-03 Thread Chunming Zhou

Reviewed-by: Chunming Zhou  for series.

-David

On 2019/9/3 17:09, Christian König wrote:
> Trying to evict things from the current working set doesn't work that
> well anymore because of per VM BOs.
>
> Rely on reserving VRAM for page tables to avoid contention.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 -
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 71 +-
>   2 files changed, 1 insertion(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index a236213f8e8e..d1995156733e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -478,7 +478,6 @@ struct amdgpu_cs_parser {
>   uint64_t bytes_moved_vis_threshold;
>   uint64_t bytes_moved;
>   uint64_t bytes_moved_vis;
> - struct amdgpu_bo_list_entry *evictable;
>   
>   /* user fence */
>   struct amdgpu_bo_list_entry uf_entry;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index fd95b586b590..03182d968d3d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -447,75 +447,12 @@ static int amdgpu_cs_bo_validate(struct 
> amdgpu_cs_parser *p,
>   return r;
>   }
>   
> -/* Last resort, try to evict something from the current working set */
> -static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
> - struct amdgpu_bo *validated)
> -{
> - uint32_t domain = validated->allowed_domains;
> - struct ttm_operation_ctx ctx = { true, false };
> - int r;
> -
> - if (!p->evictable)
> - return false;
> -
> - for (;&p->evictable->tv.head != &p->validated;
> -  p->evictable = list_prev_entry(p->evictable, tv.head)) {
> -
> - struct amdgpu_bo_list_entry *candidate = p->evictable;
> - struct amdgpu_bo *bo = ttm_to_amdgpu_bo(candidate->tv.bo);
> - struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> - bool update_bytes_moved_vis;
> - uint32_t other;
> -
> - /* If we reached our current BO we can forget it */
> - if (bo == validated)
> - break;
> -
> - /* We can't move pinned BOs here */
> - if (bo->pin_count)
> - continue;
> -
> - other = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
> -
> - /* Check if this BO is in one of the domains we need space for 
> */
> - if (!(other & domain))
> - continue;
> -
> - /* Check if we can move this BO somewhere else */
> - other = bo->allowed_domains & ~domain;
> - if (!other)
> - continue;
> -
> - /* Good we can try to move this BO somewhere else */
> - update_bytes_moved_vis =
> - !amdgpu_gmc_vram_full_visible(&adev->gmc) &&
> - amdgpu_bo_in_cpu_visible_vram(bo);
> - amdgpu_bo_placement_from_domain(bo, other);
> - r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> - p->bytes_moved += ctx.bytes_moved;
> - if (update_bytes_moved_vis)
> - p->bytes_moved_vis += ctx.bytes_moved;
> -
> - if (unlikely(r))
> - break;
> -
> - p->evictable = list_prev_entry(p->evictable, tv.head);
> - list_move(&candidate->tv.head, &p->validated);
> -
> - return true;
> - }
> -
> - return false;
> -}
> -
>   static int amdgpu_cs_validate(void *param, struct amdgpu_bo *bo)
>   {
>   struct amdgpu_cs_parser *p = param;
>   int r;
>   
> - do {
> - r = amdgpu_cs_bo_validate(p, bo);
> - } while (r == -ENOMEM && amdgpu_cs_try_evict(p, bo));
> + r = amdgpu_cs_bo_validate(p, bo);
>   if (r)
>   return r;
>   
> @@ -554,9 +491,6 @@ static int amdgpu_cs_list_validate(struct 
> amdgpu_cs_parser *p,
>   binding_userptr = true;
>   }
>   
> - if (p->evictable == lobj)
> - p->evictable = NULL;
> -
>   r = amdgpu_cs_validate(p, bo);
>   if (r)
>   return r;
> @@ -659,9 +593,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser 
> *p,
>

Re: [PATCH] drm/amdgpu: fix dma_fence_wait without reference

2019-08-16 Thread Chunming Zhou

Reviewed-by: Chunming Zhou 

On 2019/8/16 21:21, Christian König wrote:
> We need to grab a reference to the fence we wait for.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 27 ++---
>   1 file changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index f539a2a92774..7398b4850649 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -534,21 +534,24 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx,
>  struct drm_sched_entity *entity)
>   {
>   struct amdgpu_ctx_entity *centity = to_amdgpu_ctx_entity(entity);
> - unsigned idx = centity->sequence & (amdgpu_sched_jobs - 1);
> - struct dma_fence *other = centity->fences[idx];
> + struct dma_fence *other;
> + unsigned idx;
> + long r;
>   
> - if (other) {
> - signed long r;
> - r = dma_fence_wait(other, true);
> - if (r < 0) {
> - if (r != -ERESTARTSYS)
> - DRM_ERROR("Error (%ld) waiting for fence!\n", 
> r);
> + spin_lock(&ctx->ring_lock);
> + idx = centity->sequence & (amdgpu_sched_jobs - 1);
> + other = dma_fence_get(centity->fences[idx]);
> + spin_unlock(&ctx->ring_lock);
>   
> - return r;
> - }
> - }
> + if (!other)
> + return 0;
>   
> - return 0;
> + r = dma_fence_wait(other, true);
> + if (r < 0 && r != -ERESTARTSYS)
> + DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> +
> + dma_fence_put(other);
> + return r;
>   }
>   
>   void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr)
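
The idea behind the patch above (take a reference while the slot is protected, wait outside the lock, then drop the reference on every path) can be sketched in plain C. This is a hedged illustration, not the driver code: struct mini_fence, struct mini_ctx and the helpers are hypothetical, the lock is only marked by comments, and checking a flag stands in for dma_fence_wait().

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 4	/* power of two, playing the role of amdgpu_sched_jobs */

/* Hypothetical miniature fence: a reference count and a signaled bit. */
struct mini_fence {
	int refcount;
	int signaled;
};

/* Takes a reference; NULL is passed through, like dma_fence_get(). */
static struct mini_fence *fence_get(struct mini_fence *f)
{
	if (f)
		f->refcount++;
	return f;
}

static void fence_put(struct mini_fence *f)
{
	if (f) {
		assert(f->refcount > 0);
		f->refcount--;
	}
}

struct mini_ctx {
	unsigned long sequence;
	struct mini_fence *fences[NUM_SLOTS];
};

/*
 * Waits for the fence in the slot that `sequence` is about to reuse.
 * Returns 0 if the slot was empty or the fence had signaled, -1 otherwise.
 */
static int wait_prev_fence(struct mini_ctx *ctx)
{
	struct mini_fence *other;
	unsigned long idx;
	int r;

	/* lock: slot lookup and reference grab belong together */
	idx = ctx->sequence & (NUM_SLOTS - 1);
	other = fence_get(ctx->fences[idx]);
	/* unlock */

	if (!other)
		return 0;

	r = other->signaled ? 0 : -1;	/* stand-in for the actual wait */

	fence_put(other);	/* drop our reference on every path */
	return r;
}
```

Because the waiter holds its own reference, the slot can be overwritten by another thread during the wait without the fence being freed underneath it.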

Re: [PATCH] drm/amdgpu: fix a potential information leaking bug

2019-07-27 Thread Chunming Zhou


On 2019/7/27 17:30, Wang Xiayang wrote:
> Coccinelle reports a path that the array "data" is never initialized.
> The path skips the checks in the conditional branches when either
> of callback functions, read_wave_vgprs and read_wave_sgprs, is not
> registered. Later, the uninitialized "data" array is read
> in the while-loop below and passed to put_user().
>
> Fix the path by allocating the array with kcalloc().
>
> The patch is simpler than adding a fall-back branch that explicitly
> calls memset(data, 0, ...). Also it does not need the multiplication
> 1024*sizeof(*data) as the size parameter for memset() though there is
> no risk of integer overflow.
>
> Signed-off-by: Wang Xiayang 

Reviewed-by: Chunming Zhou 

-David

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index 6d54decef7f8..5652cc72ed3a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -707,7 +707,7 @@ static ssize_t amdgpu_debugfs_gpr_read(struct file *f, 
> char __user *buf,
>   thread = (*pos & GENMASK_ULL(59, 52)) >> 52;
>   bank = (*pos & GENMASK_ULL(61, 60)) >> 60;
>   
> - data = kmalloc_array(1024, sizeof(*data), GFP_KERNEL);
> + data = kcalloc(1024, sizeof(*data), GFP_KERNEL);
>   if (!data)
>   return -ENOMEM;
>   
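
The leak described in the commit message can be illustrated with a userspace analogue. This is a sketch under assumptions, not the debugfs code: dump_regs(), read_regs_fn and fake_reader() are hypothetical names, calloc() plays the role of kcalloc(), and memcpy() stands in for put_user().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NUM_REGS 1024

/* Hypothetical optional callback, like read_wave_vgprs/read_wave_sgprs. */
typedef void (*read_regs_fn)(uint32_t *dst, size_t count);

/* Example reader used for illustration: fills dst with 1, 2, 3, ... */
static void fake_reader(uint32_t *dst, size_t count)
{
	for (size_t i = 0; i < count; i++)
		dst[i] = (uint32_t)(i + 1);
}

/*
 * Copies up to NUM_REGS register values into `out`. If no callback is
 * registered the scratch buffer is never written, so it must start out
 * zeroed; otherwise stale heap contents would be copied to the caller.
 * Returns 0 on success, -1 on allocation failure.
 */
static int dump_regs(read_regs_fn reader, uint32_t *out, size_t count)
{
	uint32_t *data;

	assert(out != NULL);
	if (count > NUM_REGS)
		count = NUM_REGS;

	data = calloc(NUM_REGS, sizeof(*data));	/* zero-filled allocation */
	if (!data)
		return -1;

	if (reader)
		reader(data, count);

	memcpy(out, data, count * sizeof(*data));
	free(data);
	return 0;
}
```

With a zeroing allocator the "no callback" path returns zeros rather than whatever the allocator handed back, and no separate memset() with a hand-computed size is needed.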

Re: Intermittent errors when using amdgpu_job_submit_direct

2019-07-10 Thread Chunming Zhou


On 2019/7/10 3:26, Kuehling, Felix wrote:
> On 2019-07-09 8:58 a.m., Zhou, David(ChunMing) wrote:
>> I raised this when Christian did the page fault work. In that patch,
>> amdgpu_job_submit_direct uses an exclusive page fault ring.
>>
>> But if you use amdgpu_job_submit_direct for general rings occupied by
>> the scheduler, I guess various bugs will happen.
> The problem is, even the paging ring is used by the scheduler. There are
> several places where buffer operations are submitted to the paging ring
> through the scheduler. That makes any use of the paging ring through
> direct submission problematic.
>
> Even ignoring the scheduler, if it's possible that multiple threads
> submit to the paging ring, we'll need locking to ensure that the
> contents of the ring remain consistent. IIRC, the rings used to have
> locking before we had a GPU scheduler. For comparison, see
> radeon_ring.c, which still has locking. With the GPU scheduler, the
> rings became single-producer queues that no longer needed locking. But
> with direct submission that is no longer true. I think a good place to
> do that locking now would be in amdgpu_ib_schedule.

Yes, that is exactly why we removed the ring lock at that time.

You can add it back when submit_direct co-exists with the scheduler.

-David

>
> Regards,
>     Felix
>
>
>> -David
>>
>> On 2019/7/9 12:53, Kuehling, Felix wrote:
>>> I'm seeing some weird intermittent bugs (vm faults, hangs, etc) when
>>> trying to use amdgpu_job_submit_direct. I'm wondering if there is a
>>> possibility of a race condition, when a submit_direct and a GPU
>>> scheduler thread try to submit to the same ring at the same time. I
>>> didn't see any locking to allow multiple threads safely submitting to
>>> the same ring.
>>>
>>> Am I missing something?
>>>
>>> Thanks,
>>>   Felix
>>>
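
The locking discussed in this thread can be sketched as a ring with a per-ring lock held for the whole packet write. This is a hypothetical userspace illustration, not amdgpu code: struct mini_ring and the ring_* helpers are made-up names, a C11 atomic_flag spinlock stands in for whatever lock the driver would use, and free-space checks against a consumer are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_DW 64	/* ring size in dwords, power of two */

struct mini_ring {
	atomic_flag lock;	/* hypothetical per-ring lock */
	uint32_t buf[RING_DW];
	uint32_t wptr;		/* write pointer, in dwords */
};

static void ring_lock(struct mini_ring *r)
{
	while (atomic_flag_test_and_set_explicit(&r->lock,
						 memory_order_acquire))
		;	/* spin until the current holder releases it */
}

static void ring_unlock(struct mini_ring *r)
{
	atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/*
 * Writes one packet of `ndw` dwords and returns the offset it starts at.
 * The lock covers the whole write, so packets from two producers (for
 * example a scheduler thread and a direct submission) cannot interleave.
 */
static uint32_t ring_submit(struct mini_ring *r, const uint32_t *pkt,
			    size_t ndw)
{
	uint32_t start;
	size_t i;

	assert(ndw <= RING_DW);

	ring_lock(r);
	start = r->wptr;
	for (i = 0; i < ndw; i++)
		r->buf[(start + i) & (RING_DW - 1)] = pkt[i];
	r->wptr = (uint32_t)((start + ndw) & (RING_DW - 1));
	ring_unlock(r);

	return start;
}
```

With a single producer the lock is uncontended and cheap; it only matters once a second submission path writes to the same ring.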

Re: [PATCH] drm/ttm: Fix the memory delay free issue

2019-07-10 Thread Chunming Zhou

It doesn't make sense that freeing a BO still uses the per-VM resv.

As I remember, when a BO is on the release list, its resv is a copy of
the per-VM resv. Could you check that?

-David

On 2019/7/10 17:29, Emily Deng wrote:
> The Vulkan CTS allocation test cases create a series of BOs and then free
> them. Because many allocation test cases run in the same VM and the per-VM
> BO feature is enabled, all of those BOs share the same resv object. Freeing
> a BO is quite slow: every time a BO is freed, the resv is checked for
> whether it has signaled, and the BO is only freed once it has. But as the
> test cases keep creating BOs, the number of resv fences keeps increasing,
> so freeing is slower than creating, which exhausts memory.
>
> Method:
> When the resv signals, release all the BOs which use the same
> resv object.
>
> Signed-off-by: Emily Deng 
> ---
>   drivers/gpu/drm/ttm/ttm_bo.c | 29 -
>   1 file changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index f9a3d4c..57ec59b 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -543,6 +543,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object 
> *bo,
>   {
>   struct ttm_bo_global *glob = bo->bdev->glob;
>   struct reservation_object *resv;
> + struct ttm_buffer_object *resv_bo, *resv_bo_next;
>   int ret;
>   
>   if (unlikely(list_empty(&bo->ddestroy)))
> @@ -566,10 +567,14 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object 
> *bo,
>  interruptible,
>  30 * HZ);
>   
> - if (lret < 0)
> + if (lret < 0) {
> + kref_put(&bo->list_kref, ttm_bo_release_list);
>   return lret;
> - else if (lret == 0)
> + }
> + else if (lret == 0) {
> + kref_put(&bo->list_kref, ttm_bo_release_list);
>   return -EBUSY;
> + }
>   
>   spin_lock(&glob->lru_lock);
>   if (unlock_resv && !kcl_reservation_object_trylock(bo->resv)) {
> @@ -582,6 +587,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object 
> *bo,
>* here.
>*/
>   spin_unlock(&glob->lru_lock);
> + kref_put(&bo->list_kref, ttm_bo_release_list);
>   return 0;
>   }
>   ret = 0;
> @@ -591,15 +597,29 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object 
> *bo,
>   if (unlock_resv)
>   kcl_reservation_object_unlock(bo->resv);
>   spin_unlock(&glob->lru_lock);
> + kref_put(&bo->list_kref, ttm_bo_release_list);
>   return ret;
>   }
>   
>   ttm_bo_del_from_lru(bo);
>   list_del_init(&bo->ddestroy);
>   kref_put(&bo->list_kref, ttm_bo_ref_bug);
> -
>   spin_unlock(&glob->lru_lock);
>   ttm_bo_cleanup_memtype_use(bo);
> + kref_put(&bo->list_kref, ttm_bo_release_list);
> +
> + spin_lock(&glob->lru_lock);
> + list_for_each_entry_safe(resv_bo, resv_bo_next, &bo->bdev->ddestroy, 
> ddestroy) {
> + if (resv_bo->resv == bo->resv) {
> + ttm_bo_del_from_lru(resv_bo);
> + list_del_init(&resv_bo->ddestroy);
> + spin_unlock(&glob->lru_lock);
> + ttm_bo_cleanup_memtype_use(resv_bo);
> + kref_put(&resv_bo->list_kref, ttm_bo_release_list);
> + spin_lock(&glob->lru_lock);
> + }
> + }
> + spin_unlock(&glob->lru_lock);
>   
>   if (unlock_resv)
>   kcl_reservation_object_unlock(bo->resv);
> @@ -639,9 +659,8 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device 
> *bdev, bool remove_all)
>   ttm_bo_cleanup_refs(bo, false, !remove_all, true);
>   } else {
>   spin_unlock(&glob->lru_lock);
> + kref_put(&bo->list_kref, ttm_bo_release_list);
>   }
> -
> - kref_put(&bo->list_kref, ttm_bo_release_list);
>   spin_lock(&glob->lru_lock);
>   }
>   list_splice_tail(&removed, &bdev->ddestroy);
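
The "Method" in the quoted commit message (once a resv has signaled, release every pending BO that shares it) can be sketched with a simple list walk. This only illustrates the proposal as described, with hypothetical types (struct mini_resv, struct mini_bo) and a singly linked list in place of the TTM ddestroy list; it makes no claim about what was merged, and the reply above questions whether freed BOs should still share the per-VM resv at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a reservation object and a buffer object. */
struct mini_resv {
	int signaled;
};

struct mini_bo {
	struct mini_resv *resv;
	struct mini_bo *next;	/* link on the delayed-destroy list */
	int freed;
};

/*
 * Unlinks and "frees" every BO on the list that shares `resv`.
 * Returns the number of BOs released in this single pass.
 */
static size_t release_same_resv(struct mini_bo **head,
				const struct mini_resv *resv)
{
	struct mini_bo **link = head;
	size_t released = 0;

	assert(head != NULL);

	while (*link) {
		struct mini_bo *bo = *link;

		if (bo->resv == resv) {
			*link = bo->next;	/* unlink from the list */
			bo->next = NULL;
			bo->freed = 1;		/* stand-in for the real cleanup */
			released++;
		} else {
			link = &bo->next;	/* keep it, move on */
		}
	}

	return released;
}
```

One signaled check then pays for all BOs that share the object, instead of one check per BO.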

Re: Intermittent errors when using amdgpu_job_submit_direct

2019-07-09 Thread Chunming Zhou

I raised this when Christian did the page fault work. In that patch,
amdgpu_job_submit_direct uses an exclusive page fault ring.

But if you use amdgpu_job_submit_direct for general rings occupied by
the scheduler, I guess various bugs will happen.

-David

On 2019/7/9 12:53, Kuehling, Felix wrote:
> I'm seeing some weird intermittent bugs (vm faults, hangs, etc) when
> trying to use amdgpu_job_submit_direct. I'm wondering if there is a
> possibility of a race condition, when a submit_direct and a GPU
> scheduler thread try to submit to the same ring at the same time. I
> didn't see any locking to allow multiple threads safely submitting to
> the same ring.
>
> Am I missing something?
>
> Thanks,
>     Felix
>

Re: [PATCH 1/5] drm/amdgpu: allow direct submission in the VM backends

2019-06-28 Thread Chunming Zhou


On 2019/6/28 20:18, Christian König wrote:
> This allows us to update page tables directly while in a page fault.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  5 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c  |  4 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 29 +
>   3 files changed, 27 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 489a162ca620..5941accea061 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -197,6 +197,11 @@ struct amdgpu_vm_update_params {
>*/
>   struct amdgpu_vm *vm;
>   
> + /**
> +  * @direct: if changes should be made directly
> +  */
> + bool direct;
> +
>   /**
>* @pages_addr:
>*
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
> index 5222d165abfc..f94e4896079c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
> @@ -49,6 +49,10 @@ static int amdgpu_vm_cpu_prepare(struct 
> amdgpu_vm_update_params *p, void *owner,
>   {
>   int r;
>   
> + /* Don't wait for anything during page fault */
> + if (p->direct)
> + return 0;
> +
>   /* Wait for PT BOs to be idle. PTs share the same resv. object
>* as the root PD BO
>*/
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index ddd181f5ed37..891d597063cb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -68,17 +68,17 @@ static int amdgpu_vm_sdma_prepare(struct 
> amdgpu_vm_update_params *p,
>   if (r)
>   return r;
>   
> - r = amdgpu_sync_fence(p->adev, &p->job->sync, exclusive, false);
> - if (r)
> - return r;
> + p->num_dw_left = ndw;
> +
> + if (p->direct)
> + return 0;
>   
> - r = amdgpu_sync_resv(p->adev, &p->job->sync, root->tbo.resv,
> -  owner, false);
> + r = amdgpu_sync_fence(p->adev, &p->job->sync, exclusive, false);
>   if (r)
>   return r;
>   
> - p->num_dw_left = ndw;
> - return 0;
> + return amdgpu_sync_resv(p->adev, &p->job->sync, root->tbo.resv,
> + owner, false);
>   }
>   
>   /**
> @@ -99,13 +99,21 @@ static int amdgpu_vm_sdma_commit(struct 
> amdgpu_vm_update_params *p,
>   struct dma_fence *f;
>   int r;
>   
> - ring = container_of(p->vm->entity.rq->sched, struct amdgpu_ring, sched);
> + if (p->direct)
> + ring = p->adev->vm_manager.page_fault;
> + else
> + ring = container_of(p->vm->entity.rq->sched,
> + struct amdgpu_ring, sched);
>   
>   WARN_ON(ib->length_dw == 0);
>   amdgpu_ring_pad_ib(ring, ib);
>   WARN_ON(ib->length_dw > p->num_dw_left);
> - r = amdgpu_job_submit(p->job, &p->vm->entity,
> -   AMDGPU_FENCE_OWNER_VM, &f);
> +
> + if (p->direct)
> + r = amdgpu_job_submit_direct(p->job, ring, &f);

When we use direct submission after initialization, we need to take care 
of the ring race condition, don't we? Am I missing anything?


-David

> + else
> + r = amdgpu_job_submit(p->job, &p->vm->entity,
> +   AMDGPU_FENCE_OWNER_VM, &f);
>   if (r)
>   goto error;
>   
> @@ -120,7 +128,6 @@ static int amdgpu_vm_sdma_commit(struct 
> amdgpu_vm_update_params *p,
>   return r;
>   }
>   
> -
>   /**
>* amdgpu_vm_sdma_copy_ptes - copy the PTEs from mapping
>*

Re: [PATCH 1/2] drm/amdgpu: fix transform feedback GDS hang on gfx10

2019-06-20 Thread Chunming Zhou

Please take care of the .emit_ib_size member; otherwise it looks OK to me.

-David
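
A rough sketch of why .emit_ib_size matters here, under stated assumptions: the quoted patch adds three ring writes in front of the indirect buffer packet, and this sketch assumes the existing packet takes four dwords (header, two address dwords, control). The toy_ring type and helpers are hypothetical, and the values written are placeholders, not real packet encodings.

```c
#include <assert.h>
#include <stdint.h>

struct toy_ring {
	uint32_t buf[32];
	unsigned wptr;
	unsigned emit_ib_size;	/* dwords the emit_ib callback declares it needs */
};

static void toy_ring_write(struct toy_ring *ring, uint32_t v)
{
	ring->buf[ring->wptr++] = v;
}

/* Models the patched emit_ib path and returns the dwords actually written. */
static unsigned toy_emit_ib(struct toy_ring *ring)
{
	unsigned start = ring->wptr;

	/* New in the patch: SET_CONFIG_REG header, register offset, value. */
	toy_ring_write(ring, 0);
	toy_ring_write(ring, 0);
	toy_ring_write(ring, 0);

	/* Assumed existing packet: header, address low, address high, control. */
	toy_ring_write(ring, 0);
	toy_ring_write(ring, 0);
	toy_ring_write(ring, 0);
	toy_ring_write(ring, 0);

	return ring->wptr - start;
}

/* Returns 1 if the declared budget covers what emit_ib really wrote. */
static int toy_emit_ib_fits(struct toy_ring *ring)
{
	return toy_emit_ib(ring) <= ring->emit_ib_size;
}
```

If the declared size stays at the old value while the function writes three more dwords, the space reserved for a submission can come up short, so the declared size would need to grow by the same three dwords.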

On 2019/6/20 8:02, Marek Olšák wrote:
> From: Marek Olšák 
>
> Signed-off-by: Marek Olšák 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h |  3 ++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 12 ++--
>   2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
> index dad2186f4ed5..df8a23554831 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
> @@ -24,21 +24,22 @@
>   #ifndef __AMDGPU_GDS_H__
>   #define __AMDGPU_GDS_H__
>   
>   struct amdgpu_ring;
>   struct amdgpu_bo;
>   
>   struct amdgpu_gds {
>   uint32_t gds_size;
>   uint32_t gws_size;
>   uint32_t oa_size;
> - uint32_t	gds_compute_max_wave_id;
> + uint32_t gds_compute_max_wave_id;
> + uint32_t vgt_gs_max_wave_id;
>   };
>   
>   struct amdgpu_gds_reg_offset {
>   uint32_t mem_base;
>   uint32_t mem_size;
>   uint32_t gws;
>   uint32_t oa;
>   };
>   
>   #endif /* __AMDGPU_GDS_H__ */
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 0090cba2d24d..75a34779a57c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -4213,20 +4213,29 @@ static void gfx_v10_0_ring_emit_hdp_flush(struct 
> amdgpu_ring *ring)
>   }
>   
>   static void gfx_v10_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
>  struct amdgpu_job *job,
>  struct amdgpu_ib *ib,
>  uint32_t flags)
>   {
>   unsigned vmid = AMDGPU_JOB_GET_VMID(job);
>   u32 header, control = 0;
>   
> + /* Prevent a hw deadlock due to a wave ID mismatch between ME and GDS.
> +  * This resets the wave ID counters. (needed by transform feedback)
> +  * TODO: This might only be needed on a VMID switch when we change
> +  *   the GDS OA mapping, not sure.
> +  */
> + amdgpu_ring_write(ring, PACKET3(PACKET3_SET_CONFIG_REG, 1));
> + amdgpu_ring_write(ring, mmVGT_GS_MAX_WAVE_ID);
> + amdgpu_ring_write(ring, ring->adev->gds.vgt_gs_max_wave_id);
> +
>   if (ib->flags & AMDGPU_IB_FLAG_CE)
>   header = PACKET3(PACKET3_INDIRECT_BUFFER_CNST, 2);
>   else
>   header = PACKET3(PACKET3_INDIRECT_BUFFER, 2);
>   
>   control |= ib->length_dw | (vmid << 24);
>   
>   if (amdgpu_mcbp && (ib->flags & AMDGPU_IB_FLAG_PREEMPT)) {
>   control |= INDIRECT_BUFFER_PRE_ENB(1);
>   
> @@ -5094,24 +5103,23 @@ static void gfx_v10_0_set_rlc_funcs(struct 
> amdgpu_device *adev)
>   default:
>   break;
>   }
>   }
>   
>   static void gfx_v10_0_set_gds_init(struct amdgpu_device *adev)
>   {
>   /* init asic gds info */
>   switch (adev->asic_type) {
>   case CHIP_NAVI10:
> - adev->gds.gds_size = 0x1;
> - break;
>   default:
>   adev->gds.gds_size = 0x1;
> + adev->gds.vgt_gs_max_wave_id = 0x3ff;
>   break;
>   }
>   
>   adev->gds.gws_size = 64;
>   adev->gds.oa_size = 16;
>   }
>   
>   static void gfx_v10_0_set_user_wgp_inactive_bitmap_per_sh(struct 
> amdgpu_device *adev,
> u32 bitmap)
>   {

[PATCH] drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu

2019-05-27 Thread Chunming Zhou

Change-Id: I2b1af1478fbddbb5084b90b3ff85c2eb964bd217
Signed-off-by: Chunming Zhou 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 78706dfa753a..1f38d6fc1fe3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1307,7 +1307,8 @@ static struct drm_driver kms_driver = {
.driver_features =
DRIVER_USE_AGP | DRIVER_ATOMIC |
DRIVER_GEM |
-   DRIVER_PRIME | DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ,
+   DRIVER_PRIME | DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ |
+   DRIVER_SYNCOBJ_TIMELINE,
.load = amdgpu_driver_load_kms,
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,
-- 
2.17.1


Re: [PATCH 06/10] drm/ttm: fix busy memory to fail other user v10

2019-05-23 Thread Chunming Zhou


On 2019/5/23 19:03, Christian König wrote:
> On 23.05.19 at 12:24, zhoucm1 wrote:
>>
>>
>> On 2019-05-22 20:59, Christian König wrote:
>>> BOs on the LRU might be blocked during command submission
>>> and cause OOM situations.
>>>
>>> Avoid this by blocking for the first busy BO not locked by
>>> the same ticket as the BO we are searching space for.
>>>
>>> v10: completely start over with the patch since we didn't
>>>   handle a whole bunch of corner cases.
>>>
>>> Signed-off-by: Christian König 
>>> ---
>>>   drivers/gpu/drm/ttm/ttm_bo.c | 77 
>>> ++--
>>>   1 file changed, 66 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c 
>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 4c6389d849ed..861facac33d4 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -771,32 +771,72 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
>>>    * b. Otherwise, trylock it.
>>>    */
>>>   static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object
>>> *bo,
>>> -   struct ttm_operation_ctx *ctx, bool *locked)
>>> +   struct ttm_operation_ctx *ctx, bool *locked,
>>> bool *busy)
>>>   {
>>>  bool ret = false;
>>>
>>> -   *locked = false;
>>>  if (bo->resv == ctx->resv) {
>>>  reservation_object_assert_held(bo->resv);
>>>  if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
>>>  || !list_empty(&bo->ddestroy))
>>>  ret = true;
>>> +   *locked = false;
>>> +   if (busy)
>>> +   *busy = false;
>>>  } else {
>>> -   *locked = reservation_object_trylock(bo->resv);
>>> -   ret = *locked;
>>> +   ret = reservation_object_trylock(bo->resv);
>>> +   *locked = ret;
>>> +   if (busy)
>>> +   *busy = !ret;
>>>  }
>>>
>>>  return ret;
>>>   }
>>>
>>> +/**
>>> + * ttm_mem_evict_wait_busy - wait for a busy BO to become available
>>> + *
>>> + * @busy_bo: BO which couldn't be locked with trylock
>>> + * @ctx: operation context
>>> + * @ticket: acquire ticket
>>> + *
>>> + * Try to lock a busy buffer object to avoid failing eviction.
>>> + */
>>> +static int ttm_mem_evict_wait_busy(struct ttm_buffer_object *busy_bo,
>>> +  struct ttm_operation_ctx *ctx,
>>> +  struct ww_acquire_ctx *ticket)
>>> +{
>>> +   int r;
>>> +
>>> +   if (!busy_bo || !ticket)
>>> +   return -EBUSY;
>>> +
>>> +   if (ctx->interruptible)
>>> +   r = 
>>> reservation_object_lock_interruptible(busy_bo->resv,
>>> + ticket);
>>> +   else
>>> +   r = reservation_object_lock(busy_bo->resv, ticket);
>>> +
>>> +   /*
>>> +    * TODO: It would be better to keep the BO locked until
>>> allocation is at
>>> +    * least tried one more time, but that would mean a much
>>> larger rework
>>> +    * of TTM.
>>> +    */
>>> +   if (!r)
>>> +   reservation_object_unlock(busy_bo->resv);
>>> +
>>> +   return r == -EDEADLK ? -EAGAIN : r;
>>> +}
>>> +
>>>   static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>>>     uint32_t mem_type,
>>>     const struct ttm_place *place,
>>> -  struct ttm_operation_ctx *ctx)
>>> +  struct ttm_operation_ctx *ctx,
>>> +  struct ww_acquire_ctx *ticket)
>>>   {
>>> +   struct ttm_buffer_object *bo = NULL, *busy_bo = NULL;
>>>  struct ttm_bo_global *glob = bdev->glob;
>>>  struct ttm_mem_type_manager *man = &bdev->man[mem_type];
>>> -   struct ttm_buffer_object *bo = NULL;
>>>  bool locked = false;
>>>  unsigned i;
>>>  int ret;
>>> @@ -804,8 +844,15 @@ static int ttm_mem_evict_first(struct
>>> ttm_bo_device *bdev,
>>>  spin_lock(&glob->lru_lock);
>>>  for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
>>>  list_for_each_entry(bo, &man->lru[i], lru) {
>>> -   if (!ttm_bo_evict_swapout_allowable(bo, ctx,
>>> &locked))
>>> +   bool busy;
>>> +
>>> +   if (!ttm_bo_evict_swapout_allowable(bo, ctx,
>>> &locked,
>>> + &busy)) {
>>> +   if (busy && !busy_bo &&
>>> +   bo->resv->lock.ctx != ticket)
>>> +   busy_bo = bo;
>>>  continue;
>>> +   }
>>>
>>>  if (place &&
>>> !bdev->driver->eviction_valuable(bo,
>>> place)) {
>>> @@ -824,8 +871,13 @@ static int ttm_mem_evict_first(struct
>>> ttm_bo_device *bdev,
>>>  }
>>>
>>>  if (!bo) {
>>> +   if

[PATCH libdrm 3/7] wrap syncobj timeline query/wait APIs for amdgpu v3

2019-05-13 Thread Chunming Zhou

v2: symbols are stored in lexical order.
v3: drop export/import and extra query indirection

Signed-off-by: Chunming Zhou 
Acked-by: Christian König 
---
 amdgpu/amdgpu-symbol-check |  2 ++
 amdgpu/amdgpu.h| 39 ++
 amdgpu/amdgpu_cs.c | 23 ++
 3 files changed, 64 insertions(+)

diff --git a/amdgpu/amdgpu-symbol-check b/amdgpu/amdgpu-symbol-check
index 4d806922..d3c5bb89 100755
--- a/amdgpu/amdgpu-symbol-check
+++ b/amdgpu/amdgpu-symbol-check
@@ -53,8 +53,10 @@ amdgpu_cs_submit_raw
 amdgpu_cs_submit_raw2
 amdgpu_cs_syncobj_export_sync_file
 amdgpu_cs_syncobj_import_sync_file
+amdgpu_cs_syncobj_query
 amdgpu_cs_syncobj_reset
 amdgpu_cs_syncobj_signal
+amdgpu_cs_syncobj_timeline_wait
 amdgpu_cs_syncobj_wait
 amdgpu_cs_wait_fences
 amdgpu_cs_wait_semaphore
diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index c44a495a..5ebfe1e3 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -1536,6 +1536,45 @@ int amdgpu_cs_syncobj_wait(amdgpu_device_handle dev,
   int64_t timeout_nsec, unsigned flags,
   uint32_t *first_signaled);
 
+/**
+ *  Wait for one or all sync objects on their points to signal.
+ *
+ * \param   dev- \c [in] self-explanatory
+ * \param   handles - \c [in] array of sync object handles
+ * \param   points - \c [in] array of sync points to wait
+ * \param   num_handles - \c [in] self-explanatory
+ * \param   timeout_nsec - \c [in] self-explanatory
+ * \param   flags   - \c [in] a bitmask of DRM_SYNCOBJ_WAIT_FLAGS_*
+ * \param   first_signaled - \c [in] self-explanatory
+ *
+ * \return   0 on success\n
+ *  -ETIME - Timeout
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_timeline_wait(amdgpu_device_handle dev,
+   uint32_t *handles, uint64_t *points,
+   unsigned num_handles,
+   int64_t timeout_nsec, unsigned flags,
+   uint32_t *first_signaled);
+/**
+ *  Query sync objects payloads.
+ *
+ * \param   dev- \c [in] self-explanatory
+ * \param   handles - \c [in] array of sync object handles
+ * \param   points - \c [out] array of sync points returned, which presents
+ * syncobj payload.
+ * \param   num_handles - \c [in] self-explanatory
+ *
+ * \return   0 on success\n
+ *  -ETIME - Timeout
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_query(amdgpu_device_handle dev,
+   uint32_t *handles, uint64_t *points,
+   unsigned num_handles);
+
 /**
  *  Export kernel sync object to shareable fd.
  *
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 7c5b9d13..9fcaf2c4 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -686,6 +686,29 @@ drm_public int amdgpu_cs_syncobj_wait(amdgpu_device_handle 
dev,
  flags, first_signaled);
 }
 
+drm_public int amdgpu_cs_syncobj_timeline_wait(amdgpu_device_handle dev,
+  uint32_t *handles, uint64_t 
*points,
+  unsigned num_handles,
+  int64_t timeout_nsec, unsigned 
flags,
+  uint32_t *first_signaled)
+{
+   if (NULL == dev)
+   return -EINVAL;
+
+   return drmSyncobjTimelineWait(dev->fd, handles, points, num_handles,
+ timeout_nsec, flags, first_signaled);
+}
+
+drm_public int amdgpu_cs_syncobj_query(amdgpu_device_handle dev,
+  uint32_t *handles, uint64_t *points,
+  unsigned num_handles)
+{
+   if (NULL == dev)
+   return -EINVAL;
+
+   return drmSyncobjQuery(dev->fd, handles, points, num_handles);
+}
+
 drm_public int amdgpu_cs_export_syncobj(amdgpu_device_handle dev,
uint32_t handle,
int *shared_fd)
-- 
2.17.1
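
For readers new to timeline sync objects, the following is a small model of the wait and query semantics that the wrappers above expose. It does not call libdrm. It assumes that a timeline carries a monotonically increasing 64-bit payload, that a wait on a point is satisfied once the payload has reached that point, and that a zero timeout is used (no blocking). The toy_ names and the TOY_WAIT_ALL flag are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_WAIT_ALL 1u	/* stand-in for a "wait for all" flag */

struct toy_timeline {
	uint64_t payload;	/* last signaled point */
};

/* Query: copy each timeline's current payload into points[]. */
static void toy_timeline_query(const struct toy_timeline *tl,
			       uint64_t *points, unsigned num)
{
	unsigned i;

	for (i = 0; i < num; i++)
		points[i] = tl[i].payload;
}

/*
 * Non-blocking wait: returns 0 if the condition already holds, -ETIME
 * otherwise. Without TOY_WAIT_ALL one reached point is enough; with it
 * every point must be reached. first_signaled (optional) receives the
 * index of the first timeline whose point was reached.
 */
static int toy_timeline_wait(const struct toy_timeline *tl,
			     const uint64_t *points, unsigned num,
			     unsigned flags, uint32_t *first_signaled)
{
	unsigned i, signaled = 0;

	for (i = 0; i < num; i++) {
		if (tl[i].payload < points[i])
			continue;
		if (!signaled && first_signaled)
			*first_signaled = i;
		signaled++;
	}

	if (flags & TOY_WAIT_ALL)
		return signaled == num ? 0 : -ETIME;
	return signaled ? 0 : -ETIME;
}
```

In the real API the caller passes syncobj handles and a timeout instead of the timelines themselves; the model only illustrates how points relate to the payload.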


[PATCH libdrm 2/7] add timeline wait/query ioctl v2

2019-05-13 Thread Chunming Zhou

v2: drop export/import

Signed-off-by: Chunming Zhou 
---
 xf86drm.c | 44 
 xf86drm.h |  6 ++
 2 files changed, 50 insertions(+)

diff --git a/xf86drm.c b/xf86drm.c
index 2c19376b..17e3d880 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -4256,3 +4256,47 @@ drm_public int drmSyncobjSignal(int fd, const uint32_t 
*handles,
 ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_SIGNAL, &args);
 return ret;
 }
+
+drm_public int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t 
*points,
+ unsigned num_handles,
+ int64_t timeout_nsec, unsigned flags,
+ uint32_t *first_signaled)
+{
+struct drm_syncobj_timeline_wait args;
+int ret;
+
+memclear(args);
+args.handles = (uintptr_t)handles;
+args.points = (uint64_t)(uintptr_t)points;
+args.timeout_nsec = timeout_nsec;
+args.count_handles = num_handles;
+args.flags = flags;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, &args);
+if (ret < 0)
+return -errno;
+
+if (first_signaled)
+*first_signaled = args.first_signaled;
+return ret;
+}
+
+
+drm_public int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points,
+  uint32_t handle_count)
+{
+struct drm_syncobj_timeline_array args;
+int ret;
+
+memclear(args);
+args.handles = (uintptr_t)handles;
+args.points = (uint64_t)(uintptr_t)points;
+args.count_handles = handle_count;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_QUERY, &args);
+if (ret)
+return ret;
+return 0;
+}
+
+
diff --git a/xf86drm.h b/xf86drm.h
index 887ecc76..60c7a84f 100644
--- a/xf86drm.h
+++ b/xf86drm.h
@@ -876,6 +876,12 @@ extern int drmSyncobjWait(int fd, uint32_t *handles, 
unsigned num_handles,
  uint32_t *first_signaled);
 extern int drmSyncobjReset(int fd, const uint32_t *handles, uint32_t 
handle_count);
 extern int drmSyncobjSignal(int fd, const uint32_t *handles, uint32_t 
handle_count);
+extern int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points,
+ unsigned num_handles,
+ int64_t timeout_nsec, unsigned flags,
+ uint32_t *first_signaled);
+extern int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points,
+  uint32_t handle_count);
 
 #if defined(__cplusplus)
 }
-- 
2.17.1


[PATCH libdrm 7/7] add syncobj timeline tests v3

2019-05-13 Thread Chunming Zhou

v2: drop DRM_SYNCOBJ_CREATE_TYPE_TIMELINE, fix timeout calculation,
fix some warnings
v3: add export/import and cpu signal testing cases

Signed-off-by: Chunming Zhou 
Acked-by: Christian König 
---
 tests/amdgpu/Makefile.am |   3 +-
 tests/amdgpu/amdgpu_test.c   |  11 ++
 tests/amdgpu/amdgpu_test.h   |  21 +++
 tests/amdgpu/meson.build |   2 +-
 tests/amdgpu/syncobj_tests.c | 290 +++
 5 files changed, 325 insertions(+), 2 deletions(-)
 create mode 100644 tests/amdgpu/syncobj_tests.c

diff --git a/tests/amdgpu/Makefile.am b/tests/amdgpu/Makefile.am
index 48278848..920882d0 100644
--- a/tests/amdgpu/Makefile.am
+++ b/tests/amdgpu/Makefile.am
@@ -34,4 +34,5 @@ amdgpu_test_SOURCES = \
uve_ib.h \
deadlock_tests.c \
vm_tests.c  \
-   ras_tests.c
+   ras_tests.c \
+   syncobj_tests.c
diff --git a/tests/amdgpu/amdgpu_test.c b/tests/amdgpu/amdgpu_test.c
index 35c8bf6c..73403fb4 100644
--- a/tests/amdgpu/amdgpu_test.c
+++ b/tests/amdgpu/amdgpu_test.c
@@ -57,6 +57,7 @@
 #define DEADLOCK_TESTS_STR "Deadlock Tests"
 #define VM_TESTS_STR "VM Tests"
 #define RAS_TESTS_STR "RAS Tests"
+#define SYNCOBJ_TIMELINE_TESTS_STR "SYNCOBJ TIMELINE Tests"
 
 /**
  *  Open handles for amdgpu devices
@@ -123,6 +124,12 @@ static CU_SuiteInfo suites[] = {
.pCleanupFunc = suite_ras_tests_clean,
.pTests = ras_tests,
},
+   {
+   .pName = SYNCOBJ_TIMELINE_TESTS_STR,
+   .pInitFunc = suite_syncobj_timeline_tests_init,
+   .pCleanupFunc = suite_syncobj_timeline_tests_clean,
+   .pTests = syncobj_timeline_tests,
+   },
 
CU_SUITE_INFO_NULL,
 };
@@ -176,6 +183,10 @@ static Suites_Active_Status suites_active_stat[] = {
.pName = RAS_TESTS_STR,
.pActive = suite_ras_tests_enable,
},
+   {
+   .pName = SYNCOBJ_TIMELINE_TESTS_STR,
+   .pActive = suite_syncobj_timeline_tests_enable,
+   },
 };
 
 
diff --git a/tests/amdgpu/amdgpu_test.h b/tests/amdgpu/amdgpu_test.h
index bcd0bc7e..36675ea3 100644
--- a/tests/amdgpu/amdgpu_test.h
+++ b/tests/amdgpu/amdgpu_test.h
@@ -216,6 +216,27 @@ CU_BOOL suite_ras_tests_enable(void);
 extern CU_TestInfo ras_tests[];
 
 
+/**
+ * Initialize syncobj timeline test suite
+ */
+int suite_syncobj_timeline_tests_init();
+
+/**
+ * Deinitialize syncobj timeline test suite
+ */
+int suite_syncobj_timeline_tests_clean();
+
+/**
+ * Decide if the suite is enabled by default or not.
+ */
+CU_BOOL suite_syncobj_timeline_tests_enable(void);
+
+/**
+ * Tests in syncobj timeline test suite
+ */
+extern CU_TestInfo syncobj_timeline_tests[];
+
+
 /**
  * Helper functions
  */
diff --git a/tests/amdgpu/meson.build b/tests/amdgpu/meson.build
index 95ed9305..1726cb43 100644
--- a/tests/amdgpu/meson.build
+++ b/tests/amdgpu/meson.build
@@ -24,7 +24,7 @@ if dep_cunit.found()
 files(
   'amdgpu_test.c', 'basic_tests.c', 'bo_tests.c', 'cs_tests.c',
   'vce_tests.c', 'uvd_enc_tests.c', 'vcn_tests.c', 'deadlock_tests.c',
-  'vm_tests.c', 'ras_tests.c',
+  'vm_tests.c', 'ras_tests.c', 'syncobj_tests.c',
 ),
 dependencies : [dep_cunit, dep_threads],
 include_directories : [inc_root, inc_drm, 
include_directories('../../amdgpu')],
diff --git a/tests/amdgpu/syncobj_tests.c b/tests/amdgpu/syncobj_tests.c
new file mode 100644
index ..a0c627d7
--- /dev/null
+++ b/tests/amdgpu/syncobj_tests.c
@@ -0,0 +1,290 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+*/
+
+#include "CUnit/Basic.h"
+
+#include "amdgpu_test.h"
+#include "amdgpu_drm.h"
+#include "amdgpu_internal.h"
+#include 
+
+static  amdgpu_device_hand

[PATCH libdrm 6/7] wrap transfer interfaces

2019-05-13 Thread Chunming Zhou

Signed-off-by: Chunming Zhou 
Acked-by: Christian König 
---
 amdgpu/amdgpu.h| 22 ++
 amdgpu/amdgpu_cs.c | 16 
 2 files changed, 38 insertions(+)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index d2480dbe..9d9b0832 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -1685,6 +1685,28 @@ int 
amdgpu_cs_syncobj_import_sync_file2(amdgpu_device_handle dev,
uint32_t syncobj,
uint64_t point,
int sync_file_fd);
+
+/**
+ *  transfer between syncbojs.
+ *
+ * \param   dev- \c [in] device handle
+ * \param   dst_handle - \c [in] sync object handle
+ * \param   dst_point  - \c [in] timeline point, 0 presents dst is binary
+ * \param   src_handle - \c [in] sync object handle
+ * \param   src_point  - \c [in] timeline point, 0 presents src is binary
+ * \param   flags  - \c [in] flags
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_transfer(amdgpu_device_handle dev,
+  uint32_t dst_handle,
+  uint64_t dst_point,
+  uint32_t src_handle,
+  uint64_t src_point,
+  uint32_t flags);
+
 /**
  * Export an amdgpu fence as a handle (syncobj or fd).
  *
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index daca4421..977fa3cf 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -817,6 +817,22 @@ out:
return ret;
 }
 
+drm_public int amdgpu_cs_syncobj_transfer(amdgpu_device_handle dev,
+ uint32_t dst_handle,
+ uint64_t dst_point,
+ uint32_t src_handle,
+ uint64_t src_point,
+ uint32_t flags)
+{
+   if (NULL == dev)
+   return -EINVAL;
+
+   return drmSyncobjTransfer(dev->fd,
+ dst_handle, dst_point,
+ src_handle, src_point,
+ flags);
+}
+
 drm_public int amdgpu_cs_submit_raw(amdgpu_device_handle dev,
amdgpu_context_handle context,
amdgpu_bo_list_handle bo_list_handle,
-- 
2.17.1


[PATCH libdrm 4/7] add timeline signal/transfer ioctls v2

2019-05-13 Thread Chunming Zhou

v2: use one transfer ioctl

Signed-off-by: Chunming Zhou 
---
 xf86drm.c | 33 +
 xf86drm.h |  6 ++
 2 files changed, 39 insertions(+)

diff --git a/xf86drm.c b/xf86drm.c
index 17e3d880..acd16fab 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -4257,6 +4257,21 @@ drm_public int drmSyncobjSignal(int fd, const uint32_t 
*handles,
 return ret;
 }
 
+drm_public int drmSyncobjTimelineSignal(int fd, const uint32_t *handles,
+   uint64_t *points, uint32_t handle_count)
+{
+struct drm_syncobj_timeline_array args;
+int ret;
+
+memclear(args);
+args.handles = (uintptr_t)handles;
+args.points = (uint64_t)(uintptr_t)points;
+args.count_handles = handle_count;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, &args);
+return ret;
+}
+
 drm_public int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t 
*points,
  unsigned num_handles,
  int64_t timeout_nsec, unsigned flags,
@@ -4299,4 +4314,22 @@ drm_public int drmSyncobjQuery(int fd, uint32_t 
*handles, uint64_t *points,
 return 0;
 }
 
+drm_public int drmSyncobjTransfer(int fd,
+ uint32_t dst_handle, uint64_t dst_point,
+ uint32_t src_handle, uint64_t src_point,
+ uint32_t flags)
+{
+struct drm_syncobj_transfer args;
+int ret;
+
+memclear(args);
+args.src_handle = src_handle;
+args.dst_handle = dst_handle;
+args.src_point = src_point;
+args.dst_point = dst_point;
+args.flags = flags;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TRANSFER, &args);
 
+return ret;
+}
diff --git a/xf86drm.h b/xf86drm.h
index 60c7a84f..3fb1d1ca 100644
--- a/xf86drm.h
+++ b/xf86drm.h
@@ -876,12 +876,18 @@ extern int drmSyncobjWait(int fd, uint32_t *handles, 
unsigned num_handles,
  uint32_t *first_signaled);
 extern int drmSyncobjReset(int fd, const uint32_t *handles, uint32_t 
handle_count);
 extern int drmSyncobjSignal(int fd, const uint32_t *handles, uint32_t 
handle_count);
+extern int drmSyncobjTimelineSignal(int fd, const uint32_t *handles,
+   uint64_t *points, uint32_t handle_count);
 extern int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points,
  unsigned num_handles,
  int64_t timeout_nsec, unsigned flags,
  uint32_t *first_signaled);
 extern int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points,
   uint32_t handle_count);
+extern int drmSyncobjTransfer(int fd,
+ uint32_t dst_handle, uint64_t dst_point,
+ uint32_t src_handle, uint64_t src_point,
+ uint32_t flags);
 
 #if defined(__cplusplus)
 }
-- 
2.17.1


[PATCH libdrm 5/7] expose timeline signal/export/import interfaces v2

2019-05-13 Thread Chunming Zhou

v2: adapt to new one transfer ioctl

Signed-off-by: Chunming Zhou 
Acked-by: Christian König 
---
 amdgpu/amdgpu-symbol-check |  3 ++
 amdgpu/amdgpu.h| 51 
 amdgpu/amdgpu_cs.c | 68 ++
 3 files changed, 122 insertions(+)

diff --git a/amdgpu/amdgpu-symbol-check b/amdgpu/amdgpu-symbol-check
index d3c5bb89..274b4c6d 100755
--- a/amdgpu/amdgpu-symbol-check
+++ b/amdgpu/amdgpu-symbol-check
@@ -52,10 +52,13 @@ amdgpu_cs_submit
 amdgpu_cs_submit_raw
 amdgpu_cs_submit_raw2
 amdgpu_cs_syncobj_export_sync_file
+amdgpu_cs_syncobj_export_sync_file2
 amdgpu_cs_syncobj_import_sync_file
+amdgpu_cs_syncobj_import_sync_file2
 amdgpu_cs_syncobj_query
 amdgpu_cs_syncobj_reset
 amdgpu_cs_syncobj_signal
+amdgpu_cs_syncobj_timeline_signal
 amdgpu_cs_syncobj_timeline_wait
 amdgpu_cs_syncobj_wait
 amdgpu_cs_wait_fences
diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index 5ebfe1e3..d2480dbe 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -1516,6 +1516,23 @@ int amdgpu_cs_syncobj_reset(amdgpu_device_handle dev,
 int amdgpu_cs_syncobj_signal(amdgpu_device_handle dev,
 const uint32_t *syncobjs, uint32_t syncobj_count);
 
+/**
+ * Signal kernel timeline sync objects.
+ *
+ * \param dev   - \c [in] device handle
+ * \param syncobjs  - \c [in] array of sync object handles
+ * \param points   - \c [in] array of timeline points
+ * \param syncobj_count - \c [in] number of handles in syncobjs
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_syncobj_timeline_signal(amdgpu_device_handle dev,
+ const uint32_t *syncobjs,
+ uint64_t *points,
+ uint32_t syncobj_count);
+
 /**
  *  Wait for one or all sync objects to signal.
  *
@@ -1633,7 +1650,41 @@ int 
amdgpu_cs_syncobj_export_sync_file(amdgpu_device_handle dev,
 int amdgpu_cs_syncobj_import_sync_file(amdgpu_device_handle dev,
   uint32_t syncobj,
   int sync_file_fd);
+/**
+ *  Export kernel timeline sync object to a sync_file.
+ *
+ * \param   dev- \c [in] device handle
+ * \param   syncobj- \c [in] sync object handle
+ * \param   point  - \c [in] timeline point
+ * \param   flags  - \c [in] flags
+ * \param   sync_file_fd - \c [out] sync_file file descriptor.
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_export_sync_file2(amdgpu_device_handle dev,
+   uint32_t syncobj,
+   uint64_t point,
+   uint32_t flags,
+   int *sync_file_fd);
 
+/**
+ *  Import kernel timeline sync object from a sync_file.
+ *
+ * \param   dev- \c [in] device handle
+ * \param   syncobj- \c [in] sync object handle
+ * \param   point  - \c [in] timeline point
+ * \param   sync_file_fd - \c [in] sync_file file descriptor.
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_import_sync_file2(amdgpu_device_handle dev,
+   uint32_t syncobj,
+   uint64_t point,
+   int sync_file_fd);
 /**
  * Export an amdgpu fence as a handle (syncobj or fd).
  *
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 9fcaf2c4..daca4421 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -674,6 +674,18 @@ drm_public int 
amdgpu_cs_syncobj_signal(amdgpu_device_handle dev,
return drmSyncobjSignal(dev->fd, syncobjs, syncobj_count);
 }
 
+drm_public int amdgpu_cs_syncobj_timeline_signal(amdgpu_device_handle dev,
+const uint32_t *syncobjs,
+uint64_t *points,
+uint32_t syncobj_count)
+{
+   if (NULL == dev)
+   return -EINVAL;
+
+   return drmSyncobjTimelineSignal(dev->fd, syncobjs,
+   points, syncobj_count);
+}
+
 drm_public int amdgpu_cs_syncobj_wait(amdgpu_device_handle dev,
  uint32_t *handles, unsigned num_handles,
  int64_t timeout_nsec, unsigned flags,
@@ -749,6 +761,62 @@ drm_public int 
amdgpu_cs_syncobj_import_sync_file(amdgpu_device_handle dev,
return drmSyncobjImportSyncFile(dev->fd, syncobj, sync_file_fd);
 }
 
+drm_public int amdgpu_cs_syncobj_export_sync_file2(amdgpu_device_handle dev,
+  uint32_t syncobj,
+

[PATCH libdrm 1/7] addr cs chunk for syncobj timeline

2019-05-13 Thread Chunming Zhou

Signed-off-by: Chunming Zhou 
---
 include/drm/amdgpu_drm.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index d0701ffc..3d0318e6 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -528,6 +528,8 @@ struct drm_amdgpu_gem_va {
 #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05
 #define AMDGPU_CHUNK_ID_BO_HANDLES  0x06
 #define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES 0x07
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT0x08
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
 
 struct drm_amdgpu_cs_chunk {
__u32   chunk_id;
@@ -608,6 +610,13 @@ struct drm_amdgpu_cs_chunk_sem {
__u32 handle;
 };
 
+struct drm_amdgpu_cs_chunk_syncobj {
+   __u32 handle;
+   __u32 flags;
+   __u64 point;
+};
+
+
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ 0
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD  1
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD2
-- 
2.17.1


[PATCH 1/2] drm/ttm: fix busy memory to fail other user v6

2019-05-07 Thread Chunming Zhou

A heavy GPU job can occupy memory for a long time, which causes other users to 
fail to get memory.

This basically picks up Christian's idea:

1. Reserve the BO in DC using a ww_mutex ticket (trivial).
2. If we then run into this EBUSY condition in TTM check if the BO we need 
memory for (or rather the ww_mutex of its reservation object) has a ticket 
assigned.
3. If we have a ticket we grab a reference to the first BO on the LRU, drop the 
LRU lock and try to grab the reservation lock with the ticket.
4. If getting the reservation lock with the ticket succeeded we check if the BO 
is still the first one on the LRU in question (the BO could have moved).
5. If the BO is still the first one on the LRU in question we try to evict it 
as we would evict any other BO.
6. If any of the "If's" above fail we just back off and return -EBUSY.
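
The steps above can be sketched roughly as follows. This is a single-threaded toy model, not TTM code: the lock is an integer holding the owning ticket, taking it never sleeps, and every failure backs off with -EBUSY as in step 6. The names toy_bo, toy_lru and toy_evict_first_busy are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_bo {
	int locked_by;	/* 0 = unlocked, otherwise the ticket of the holder */
	int evicted;
};

struct toy_lru {
	struct toy_bo *entries[4];
	unsigned count;
};

/* ticket == 0 means the caller has no ww_mutex ticket. */
static int toy_evict_first_busy(struct toy_lru *lru, int ticket)
{
	struct toy_bo *first;
	unsigned i;

	/* Step 2: without a ticket there is nothing we can wait with. */
	if (!ticket || !lru->count)
		return -EBUSY;

	/* Step 3: take the first BO on the LRU and try to lock it. */
	first = lru->entries[0];
	if (first->locked_by)
		return -EBUSY;	/* toy: a real ticket lock could sleep here */
	first->locked_by = ticket;

	/*
	 * Step 4: re-check that the BO is still first. Always true in
	 * this single-threaded toy; in the kernel the LRU lock was
	 * dropped in between, so the BO could have moved.
	 */
	if (lru->entries[0] != first) {
		first->locked_by = 0;
		return -EBUSY;
	}

	/* Step 5: evict it like any other BO and drop it from the LRU. */
	first->evicted = 1;
	for (i = 1; i < lru->count; i++)
		lru->entries[i - 1] = lru->entries[i];
	lru->count--;
	first->locked_by = 0;
	return 0;
}
```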

v2: fix some minor checks
v3: address Christian's v2 comments.
v4: fix some omissions
v5: handle first_bo unlock and bo_get/put
v6: abstract a unified iterate function, and handle all possible use cases, not 
only pinned BOs.

Change-Id: I21423fb922f885465f13833c41df1e134364a8e7
Signed-off-by: Chunming Zhou 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 113 ++-
 1 file changed, 97 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 8502b3ed2d88..bbf1d14d00a7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -766,11 +766,13 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
  * b. Otherwise, trylock it.
  */
 static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
-   struct ttm_operation_ctx *ctx, bool *locked)
+   struct ttm_operation_ctx *ctx, bool *locked, bool *busy)
 {
bool ret = false;
 
*locked = false;
+   if (busy)
+   *busy = false;
if (bo->resv == ctx->resv) {
reservation_object_assert_held(bo->resv);
if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
@@ -779,35 +781,45 @@ static bool ttm_bo_evict_swapout_allowable(struct 
ttm_buffer_object *bo,
} else {
*locked = reservation_object_trylock(bo->resv);
ret = *locked;
+   if (!ret && busy)
+   *busy = true;
}
 
return ret;
 }
 
-static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
-  uint32_t mem_type,
-  const struct ttm_place *place,
-  struct ttm_operation_ctx *ctx)
+static struct ttm_buffer_object*
+ttm_mem_find_evitable_bo(struct ttm_bo_device *bdev,
+struct ttm_mem_type_manager *man,
+const struct ttm_place *place,
+struct ttm_operation_ctx *ctx,
+struct ttm_buffer_object **first_bo,
+bool *locked)
 {
-   struct ttm_bo_global *glob = bdev->glob;
-   struct ttm_mem_type_manager *man = >man[mem_type];
struct ttm_buffer_object *bo = NULL;
-   bool locked = false;
-   unsigned i;
-   int ret;
+   int i;
 
-   spin_lock(>lru_lock);
+   if (first_bo)
+   *first_bo = NULL;
for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
list_for_each_entry(bo, >lru[i], lru) {
-   if (!ttm_bo_evict_swapout_allowable(bo, ctx, ))
+   bool busy = false;
+   if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked,
+   )) {
+   if (first_bo && !(*first_bo) && busy) {
+   ttm_bo_get(bo);
+   *first_bo = bo;
+   }
continue;
+   }
 
if (place && !bdev->driver->eviction_valuable(bo,
  place)) {
-   if (locked)
+   if (*locked)
reservation_object_unlock(bo->resv);
continue;
}
+
break;
}
 
@@ -818,9 +830,66 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
bo = NULL;
}
 
+   return bo;
+}
+
+static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
+  uint32_t mem_type,
+  const struct ttm_place *place,
+  struct ttm_operation_ctx *ctx)
+{
+   struct ttm_bo_global *glob = bdev->glob;
+   struct ttm_mem_type_manager *man = >man[mem_type];
+   struct ttm_buffer_object *bo = NULL, *first_bo = NUL

[PATCH 2/2] drm/amd/display: use ttm_eu_reserve_buffers instead of amdgpu_bo_reserve

2019-05-07 Thread Chunming Zhou

Add a ticket for the display BO so that it can preempt a busy BO.

Change-Id: I9f031cdcc8267de00e819ae303baa0a52df8ebb9
Signed-off-by: Chunming Zhou 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 22 ++-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index ac22f7351a42..8633d52e3fbe 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4176,6 +4176,9 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
struct amdgpu_device *adev;
struct amdgpu_bo *rbo;
struct dm_plane_state *dm_plane_state_new, *dm_plane_state_old;
+   struct list_head list, duplicates;
+   struct ttm_validate_buffer tv;
+   struct ww_acquire_ctx ticket;
uint64_t tiling_flags;
uint32_t domain;
int r;
@@ -4192,9 +4195,18 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
obj = new_state->fb->obj[0];
rbo = gem_to_amdgpu_bo(obj);
adev = amdgpu_ttm_adev(rbo->tbo.bdev);
-   r = amdgpu_bo_reserve(rbo, false);
-   if (unlikely(r != 0))
+   INIT_LIST_HEAD();
+   INIT_LIST_HEAD();
+
+   tv.bo = >tbo;
+   tv.num_shared = 1;
+   list_add(, );
+
+   r = ttm_eu_reserve_buffers(, , false, );
+   if (r) {
+   dev_err(adev->dev, "fail to reserve bo (%d)\n", r);
return r;
+   }
 
if (plane->type != DRM_PLANE_TYPE_CURSOR)
domain = amdgpu_display_supported_domains(adev);
@@ -4205,21 +4217,21 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
DRM_ERROR("Failed to pin framebuffer with error %d\n", 
r);
-   amdgpu_bo_unreserve(rbo);
+   ttm_eu_backoff_reservation(, );
return r;
}
 
r = amdgpu_ttm_alloc_gart(>tbo);
if (unlikely(r != 0)) {
amdgpu_bo_unpin(rbo);
-   amdgpu_bo_unreserve(rbo);
+   ttm_eu_backoff_reservation(, );
DRM_ERROR("%p bind failed\n", rbo);
return r;
}
 
amdgpu_bo_get_tiling_flags(rbo, _flags);
 
-   amdgpu_bo_unreserve(rbo);
+   ttm_eu_backoff_reservation(, );
 
afb->address = amdgpu_bo_gpu_offset(rbo);
 
-- 
2.17.1


Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-22 Thread Chunming Zhou

+Monk.

GPU reset is used widely in SR-IOV, so we need someone from the virtualization 
side to take a look.

But out of curiosity, why can a guilty job still signal if the job is already 
set to guilty? Was it set wrongly?


-David
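
The patch quoted below rejects a new timeout recovery (TDR) while another reset is already running by taking the reset lock with a trylock. A minimal user-space sketch of that pattern follows; it uses a C11 atomic_flag instead of a kernel mutex, and struct toy_dev, toy_try_begin_reset() and toy_end_reset() are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dev {
    atomic_flag reset_in_progress;
    int resets_done;
};

/* Returns true if the caller now owns the reset, false if another
 * reset is already running and this request should bail out. */
static bool toy_try_begin_reset(struct toy_dev *d)
{
    assert(d != NULL);
    /* test_and_set returns the previous value: true means busy. */
    return !atomic_flag_test_and_set(&d->reset_in_progress);
}

static void toy_end_reset(struct toy_dev *d)
{
    d->resets_done++;
    atomic_flag_clear(&d->reset_in_progress);
}
```

A second request arriving while the flag is set returns immediately instead of queueing another reset behind the first one.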

On 2019/4/18 23:00, Andrey Grodzovsky wrote:
> Also reject TDRs if another one already running.
>
> v2:
> Stop all schedulers across device and entire XGMI hive before
> force signaling HW fences.
> Avoid passing job_signaled to helper functions to keep all the decision
> making about skipping HW reset in one place.
>
> v3:
> Fix SW sched. hang after non HW reset. sched.hw_rq_count has to be balanced
> against its decrement in drm_sched_stop in non HW reset case.
> v4: rebase
> v5: Revert v3 as we do it now in scheduler code.
>
> Signed-off-by: Andrey Grodzovsky 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 143 
> +++--
>   1 file changed, 95 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a0e165c..85f8792 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3334,8 +3334,6 @@ static int amdgpu_device_pre_asic_reset(struct 
> amdgpu_device *adev,
>   if (!ring || !ring->sched.thread)
>   continue;
>   
> - drm_sched_stop(>sched, >base);
> -
>   /* after all hw jobs are reset, hw fence is meaningless, so 
> force_completion */
>   amdgpu_fence_driver_force_completion(ring);
>   }
> @@ -3343,6 +3341,7 @@ static int amdgpu_device_pre_asic_reset(struct 
> amdgpu_device *adev,
>   if(job)
>   drm_sched_increase_karma(>base);
>   
> + /* Don't suspend on bare metal if we are not going to HW reset the ASIC 
> */
>   if (!amdgpu_sriov_vf(adev)) {
>   
>   if (!need_full_reset)
> @@ -3480,37 +3479,21 @@ static int amdgpu_do_asic_reset(struct 
> amdgpu_hive_info *hive,
>   return r;
>   }
>   
> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev)
> +static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, bool trylock)
>   {
> - int i;
> -
> - for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> - struct amdgpu_ring *ring = adev->rings[i];
> -
> - if (!ring || !ring->sched.thread)
> - continue;
> -
> - if (!adev->asic_reset_res)
> - drm_sched_resubmit_jobs(>sched);
> + if (trylock) {
> + if (!mutex_trylock(>lock_reset))
> + return false;
> + } else
> + mutex_lock(>lock_reset);
>   
> - drm_sched_start(>sched, !adev->asic_reset_res);
> - }
> -
> - if (!amdgpu_device_has_dc_support(adev)) {
> - drm_helper_resume_force_mode(adev->ddev);
> - }
> -
> - adev->asic_reset_res = 0;
> -}
> -
> -static void amdgpu_device_lock_adev(struct amdgpu_device *adev)
> -{
> - mutex_lock(>lock_reset);
>   atomic_inc(>gpu_reset_counter);
>   adev->in_gpu_reset = 1;
>   /* Block kfd: SRIOV would do it separately */
>   if (!amdgpu_sriov_vf(adev))
>   amdgpu_amdkfd_pre_reset(adev);
> +
> + return true;
>   }
>   
>   static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
> @@ -3538,40 +3521,42 @@ static void amdgpu_device_unlock_adev(struct 
> amdgpu_device *adev)
>   int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> struct amdgpu_job *job)
>   {
> - int r;
> + struct list_head device_list, *device_list_handle =  NULL;
> + bool need_full_reset, job_signaled;
>   struct amdgpu_hive_info *hive = NULL;
> - bool need_full_reset = false;
>   struct amdgpu_device *tmp_adev = NULL;
> - struct list_head device_list, *device_list_handle =  NULL;
> + int i, r = 0;
>   
> + need_full_reset = job_signaled = false;
>   INIT_LIST_HEAD(_list);
>   
>   dev_info(adev->dev, "GPU reset begin!\n");
>   
> + hive = amdgpu_get_xgmi_hive(adev, false);
> +
>   /*
> -  * In case of XGMI hive disallow concurrent resets to be triggered
> -  * by different nodes. No point also since the one node already 
> executing
> -  * reset will also reset all the other nodes in the hive.
> +  * Here we trylock to avoid chain of resets executing from
> +  * either trigger by jobs on different adevs in XGMI hive or jobs on
> +  * different schedulers for same device while this TO handler is 
> running.
> +  * We always reset all schedulers for device and all devices for XGMI
> +  * hive so that should take care of them too.
>*/
> - hive = amdgpu_get_xgmi_hive(adev, 0);
> - if (hive && adev->gmc.xgmi.num_physical_nodes > 1 &&
> - !mutex_trylock(>reset_lock))
> +
> + if (hive && !mutex_trylock(>reset_lock)) {
> + DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another 
> already in

Re: [PATCH v5 4/6] drm/sched: Keep s_fence->parent pointer

2019-04-22 Thread Chunming Zhou

+Monk to respond to this patch.


On 2019/4/18 23:00, Andrey Grodzovsky wrote:
> For later driver's reference to see if the fence is signaled.
>
> v2: Move parent fence put to resubmit jobs.
>
> Signed-off-by: Andrey Grodzovsky 
> Reviewed-by: Christian König 
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 11 +--
>   1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 7816de7..03e6bd8 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -375,8 +375,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> struct drm_sched_job *bad)
>   if (s_job->s_fence->parent &&
>   dma_fence_remove_callback(s_job->s_fence->parent,
> _job->cb)) {
> - dma_fence_put(s_job->s_fence->parent);
> - s_job->s_fence->parent = NULL;

I vaguely remember that Monk set parent to NULL to avoid a potential free 
problem after callback removal.


-David
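
As a generic illustration of the concern, the sketch below shows why a reference is commonly dropped and the pointer cleared in the same step: a later put on the same field then becomes a no-op instead of a second decrement. This is an assumption about the rationale, not something confirmed in this thread, and toy_fence, toy_job and the helpers are hypothetical names rather than the dma_fence API.

```c
#include <assert.h>
#include <stddef.h>

struct toy_fence {
    int refcount;
};

struct toy_job {
    struct toy_fence *parent;
};

/* Drop one reference; a NULL pointer is ignored. */
static void toy_fence_put(struct toy_fence *f)
{
    if (f) {
        assert(f->refcount > 0);
        f->refcount--;
    }
}

/* Drop the job's reference and clear the pointer so that a second
 * call cannot put the same fence twice. */
static void toy_job_drop_parent(struct toy_job *job)
{
    toy_fence_put(job->parent);
    job->parent = NULL;
}
```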


>   atomic_dec(>hw_rq_count);
>   } else {
>   /*
> @@ -403,6 +401,14 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> struct drm_sched_job *bad)
>   sched->ops->free_job(s_job);
>   }
>   }
> +
> + /*
> +  * Stop pending timer in flight as we rearm it in  drm_sched_start. This
> +  * avoids the pending timeout work in progress to fire right away after
> +  * this TDR finished and before the newly restarted jobs had a
> +  * chance to complete.
> +  */
> + cancel_delayed_work(>work_tdr);
>   }
>   
>   EXPORT_SYMBOL(drm_sched_stop);
> @@ -477,6 +483,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler 
> *sched)
>   if (found_guilty && s_job->s_fence->scheduled.context == 
> guilty_context)
>   dma_fence_set_error(_fence->finished, -ECANCELED);
>   
> + dma_fence_put(s_job->s_fence->parent);
>   s_job->s_fence->parent = sched->ops->run_job(s_job);
>   }
>   }

Re: [PATCH v5 3/6] drm/scheduler: rework job destruction

2019-04-22 Thread Chunming Zhou

Hi Andrey,

static void drm_sched_process_job(struct dma_fence *f, struct 
dma_fence_cb *cb)
{
...
     spin_lock_irqsave(>job_list_lock, flags);
     /* remove job from ring_mirror_list */
     list_del_init(_job->node);
     spin_unlock_irqrestore(>job_list_lock, flags);
[David] How about just moving the above from the IRQ handler to the worker? 
Would that cause any problem? Maybe I missed your previous discussion, but I 
think removing the lock for the list is a risk for future maintenance, even 
though you make sure it is thread safe currently.

-David

...

     schedule_work(_job->finish_work);
}

On 2019/4/18 23:00, Andrey Grodzovsky wrote:
> From: Christian König 
>
> We now destroy finished jobs from the worker thread to make sure that
> we never destroy a job currently in timeout processing.
> By this we avoid holding lock around ring mirror list in drm_sched_stop
> which should solve a deadlock reported by a user.
>
> v2: Remove unused variable.
> v4: Move guilty job free into sched code.
> v5:
> Move sched->hw_rq_count to drm_sched_start to account for counter
> decrement in drm_sched_stop even when we don't call resubmit jobs
> if the guilty job did signal.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692
>
> Signed-off-by: Christian König 
> Signed-off-by: Andrey Grodzovsky 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   9 +-
>   drivers/gpu/drm/etnaviv/etnaviv_dump.c |   4 -
>   drivers/gpu/drm/etnaviv/etnaviv_sched.c|   2 +-
>   drivers/gpu/drm/lima/lima_sched.c  |   2 +-
>   drivers/gpu/drm/panfrost/panfrost_job.c|   2 +-
>   drivers/gpu/drm/scheduler/sched_main.c | 159 
> +
>   drivers/gpu/drm/v3d/v3d_sched.c|   2 +-
>   include/drm/gpu_scheduler.h|   6 +-
>   8 files changed, 102 insertions(+), 84 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7cee269..a0e165c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3334,7 +3334,7 @@ static int amdgpu_device_pre_asic_reset(struct 
> amdgpu_device *adev,
>   if (!ring || !ring->sched.thread)
>   continue;
>   
> - drm_sched_stop(>sched);
> + drm_sched_stop(>sched, >base);
>   
>   /* after all hw jobs are reset, hw fence is meaningless, so 
> force_completion */
>   amdgpu_fence_driver_force_completion(ring);
> @@ -3343,8 +3343,6 @@ static int amdgpu_device_pre_asic_reset(struct 
> amdgpu_device *adev,
>   if(job)
>   drm_sched_increase_karma(>base);
>   
> -
> -
>   if (!amdgpu_sriov_vf(adev)) {
>   
>   if (!need_full_reset)
> @@ -3482,8 +3480,7 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info 
> *hive,
>   return r;
>   }
>   
> -static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev,
> -   struct amdgpu_job *job)
> +static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev)
>   {
>   int i;
>   
> @@ -3623,7 +3620,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device 
> *adev,
>   
>   /* Post ASIC reset for all devs .*/
>   list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
> - amdgpu_device_post_asic_reset(tmp_adev, tmp_adev == adev ? job 
> : NULL);
> + amdgpu_device_post_asic_reset(tmp_adev);
>   
>   if (r) {
>   /* bad news, how to tell it to userspace ? */
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> index 33854c9..5778d9c 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> @@ -135,13 +135,11 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>   mmu_size + gpu->buffer.size;
>   
>   /* Add in the active command buffers */
> - spin_lock_irqsave(>sched.job_list_lock, flags);
>   list_for_each_entry(s_job, >sched.ring_mirror_list, node) {
>   submit = to_etnaviv_submit(s_job);
>   file_size += submit->cmdbuf.size;
>   n_obj++;
>   }
> - spin_unlock_irqrestore(>sched.job_list_lock, flags);
>   
>   /* Add in the active buffer objects */
>   list_for_each_entry(vram, >mmu->mappings, mmu_node) {
> @@ -183,14 +181,12 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
> gpu->buffer.size,
> etnaviv_cmdbuf_get_va(>buffer));
>   
> - spin_lock_irqsave(>sched.job_list_lock, flags);
>   list_for_each_entry(s_job, >sched.ring_mirror_list, node) {
>   submit = to_etnaviv_submit(s_job);
>   etnaviv_core_dump_mem(, ETDUMP_BUF_CMD,
> submit->cmdbuf.vaddr, submit->cmdbuf.size,
> etnaviv_cmdbuf_get_va(>cmdbuf));
>   }
> -

Re: dynamic DMA-buf sharing between devices

2019-04-17 Thread Chunming Zhou

I like that you are doing this step by step; you can ping me when they are ready.

-David

On 2019/4/17 21:59, Christian König wrote:
> On top of those I have 6 more patches in the pipeline to enable VRAM 
> P2P with DMA-buf.
>
> So that is not the end of the patch set :)
>
> Christian.
>
> On 17.04.19 at 15:52, Chunming Zhou wrote:
>> Thanks Christian, great job. I will verify it this week when I finish my
>> current work on hand.
>>
>> -David
>>
>> On 2019/4/17 2:38, Christian König wrote:
>>> Hi everybody,
>>>
>>> core idea in this patch set is that DMA-buf importers can now 
>>> provide an optional invalidate callback. Using this callback and the 
>>> reservation object exporters can now avoid pinning DMA-buf memory 
>>> for a long time while sharing it between devices.
>>>
>>> I already sent out an older version roughly a year ago, but 
>>> didn't have time to look further into cleaning this up.
>>>
>>> Last time, a major problem was that we would have had to fix up all 
>>> drivers implementing DMA-buf at once.
>>>
>>> Now I avoid this by allowing mappings to be cached in the DMA-buf 
>>> attachment, so drivers can optionally move over to the new 
>>> interface one by one.
>>>
>>> This is also a prerequisite to my patchset enabling sharing of 
>>> device memory with DMA-buf.
>>>
>>> Please review and/or comment,
>>> Christian.

Re: dynamic DMA-buf sharing between devices

2019-04-17 Thread Chunming Zhou

Thanks Christian, great job. I will verify it this week when I finish my 
current work on hand.

-David

On 2019/4/17 2:38, Christian König wrote:
> Hi everybody,
>
> core idea in this patch set is that DMA-buf importers can now provide an 
> optional invalidate callback. Using this callback and the reservation object 
> exporters can now avoid pinning DMA-buf memory for a long time while sharing 
> it between devices.
>
> I already sent out an older version roughly a year ago, but didn't have 
> time to look further into cleaning this up.
>
> Last time, a major problem was that we would have had to fix up all drivers 
> implementing DMA-buf at once.
>
> Now I avoid this by allowing mappings to be cached in the DMA-buf attachment, 
> so drivers can optionally move over to the new interface one by one.
>
> This is also a prerequisite to my patchset enabling sharing of device memory 
> with DMA-buf.
>
> Please review and/or comment,
> Christian.
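
The core idea of the cover letter, an optional invalidate callback supplied by the importer, can be sketched as a toy model. All names here (toy_attachment, toy_importer_invalidate, toy_exporter_try_move) are hypothetical and do not correspond to the DMA-buf interface in the patch set; the sketch only shows how an exporter might treat importers with and without the callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_attachment {
    /* Optional: NULL means the importer cannot cope with the buffer
     * moving, so the exporter has to keep it pinned. */
    void (*invalidate)(struct toy_attachment *attach);
    bool mapping_valid;
};

/* Importer side: forget the cached mapping; it is re-created on the
 * next use. */
static void toy_importer_invalidate(struct toy_attachment *attach)
{
    attach->mapping_valid = false;
}

/* Exporter side: returns true if the buffer may be moved now. */
static bool toy_exporter_try_move(struct toy_attachment *attach)
{
    assert(attach != NULL);
    if (!attach->invalidate)
        return false;           /* legacy importer: stay pinned */
    attach->invalidate(attach); /* tell the importer to drop its mapping */
    return true;
}
```

Because the callback is optional, importers that have not been converted keep the old pinned behaviour, which matches the one-by-one migration described in the cover letter.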

[PATCH libdrm 5/8] add timeline signal/transfer ioctls v2

2019-04-09 Thread Chunming Zhou

v2: use one transfer ioctl

Signed-off-by: Chunming Zhou 
---
 xf86drm.c | 33 +
 xf86drm.h |  6 ++
 2 files changed, 39 insertions(+)

diff --git a/xf86drm.c b/xf86drm.c
index 66e0c985..d57c4218 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -4280,6 +4280,21 @@ drm_public int drmSyncobjSignal(int fd, const uint32_t 
*handles,
 return ret;
 }
 
+drm_public int drmSyncobjTimelineSignal(int fd, const uint32_t *handles,
+   uint64_t *points, uint32_t handle_count)
+{
+struct drm_syncobj_timeline_array args;
+int ret;
+
+memclear(args);
+args.handles = (uintptr_t)handles;
+args.points = (uint64_t)(uintptr_t)points;
+args.count_handles = handle_count;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, );
+return ret;
+}
+
 drm_public int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t 
*points,
  unsigned num_handles,
  int64_t timeout_nsec, unsigned flags,
@@ -4322,4 +4337,22 @@ drm_public int drmSyncobjQuery(int fd, uint32_t 
*handles, uint64_t *points,
 return 0;
 }
 
+drm_public int drmSyncobjTransfer(int fd,
+ uint32_t dst_handle, uint64_t dst_point,
+ uint32_t src_handle, uint64_t src_point,
+ uint32_t flags)
+{
+struct drm_syncobj_transfer args;
+int ret;
+
+memclear(args);
+args.src_handle = src_handle;
+args.dst_handle = dst_handle;
+args.src_point = src_point;
+args.dst_point = dst_point;
+args.flags = flags;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TRANSFER, );
 
+return ret;
+}
diff --git a/xf86drm.h b/xf86drm.h
index 60c7a84f..3fb1d1ca 100644
--- a/xf86drm.h
+++ b/xf86drm.h
@@ -876,12 +876,18 @@ extern int drmSyncobjWait(int fd, uint32_t *handles, 
unsigned num_handles,
  uint32_t *first_signaled);
 extern int drmSyncobjReset(int fd, const uint32_t *handles, uint32_t 
handle_count);
 extern int drmSyncobjSignal(int fd, const uint32_t *handles, uint32_t 
handle_count);
+extern int drmSyncobjTimelineSignal(int fd, const uint32_t *handles,
+   uint64_t *points, uint32_t handle_count);
 extern int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points,
  unsigned num_handles,
  int64_t timeout_nsec, unsigned flags,
  uint32_t *first_signaled);
 extern int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points,
   uint32_t handle_count);
+extern int drmSyncobjTransfer(int fd,
+ uint32_t dst_handle, uint64_t dst_point,
+ uint32_t src_handle, uint64_t src_point,
+ uint32_t flags);
 
 #if defined(__cplusplus)
 }
-- 
2.17.1
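
For readers new to timeline sync objects, the following toy model sketches the semantics that the signal and transfer wrappers above expose. It is a simplification under stated assumptions: a timeline is modelled as a single monotonically increasing 64-bit payload, a point counts as signaled once the payload reaches it, and the transfer is done eagerly, whereas a real transfer hands over a fence that may signal later. The toy_* names are hypothetical and are not libdrm calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_timeline {
    uint64_t payload; /* highest point signaled so far */
};

/* Signal a point; the payload never moves backwards. */
static void toy_signal(struct toy_timeline *t, uint64_t point)
{
    assert(t != NULL);
    if (point > t->payload)
        t->payload = point;
}

/* A wait on `point` would be satisfied once this returns true. */
static bool toy_point_signaled(const struct toy_timeline *t, uint64_t point)
{
    return t->payload >= point;
}

/* Eager toy transfer: if src_point has signaled, signal dst_point.
 * Returns whether dst_point was signaled by this call. */
static bool toy_transfer(struct toy_timeline *dst, uint64_t dst_point,
                         const struct toy_timeline *src, uint64_t src_point)
{
    if (!toy_point_signaled(src, src_point))
        return false;
    toy_signal(dst, dst_point);
    return true;
}
```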


[PATCH libdrm 7/8] wrap transfer interfaces

2019-04-09 Thread Chunming Zhou

Signed-off-by: Chunming Zhou 
---
 amdgpu/amdgpu.h| 22 ++
 amdgpu/amdgpu_cs.c | 16 
 2 files changed, 38 insertions(+)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index b5bd3ed9..2350835b 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -1670,6 +1670,28 @@ int 
amdgpu_cs_syncobj_import_sync_file2(amdgpu_device_handle dev,
uint32_t syncobj,
uint64_t point,
int sync_file_fd);
+
+/**
+ *  Transfer between syncobjs.
+ *
+ * \param   dev- \c [in] device handle
+ * \param   dst_handle - \c [in] sync object handle
+ * \param   dst_point  - \c [in] timeline point, 0 means dst is binary
+ * \param   src_handle - \c [in] sync object handle
+ * \param   src_point  - \c [in] timeline point, 0 means src is binary
+ * \param   flags  - \c [in] flags
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_transfer(amdgpu_device_handle dev,
+  uint32_t dst_handle,
+  uint64_t dst_point,
+  uint32_t src_handle,
+  uint64_t src_point,
+  uint32_t flags);
+
 /**
  * Export an amdgpu fence as a handle (syncobj or fd).
  *
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 1c02d16f..a1c1af55 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -792,6 +792,22 @@ out:
return ret;
 }
 
+drm_public int amdgpu_cs_syncobj_transfer(amdgpu_device_handle dev,
+ uint32_t dst_handle,
+ uint64_t dst_point,
+ uint32_t src_handle,
+ uint64_t src_point,
+ uint32_t flags)
+{
+   if (NULL == dev)
+   return -EINVAL;
+
+   return drmSyncobjTransfer(dev->fd,
+ dst_handle, dst_point,
+ src_handle, src_point,
+ flags);
+}
+
 drm_public int amdgpu_cs_submit_raw(amdgpu_device_handle dev,
amdgpu_context_handle context,
amdgpu_bo_list_handle bo_list_handle,
-- 
2.17.1


[PATCH libdrm 3/8] add timeline wait/query ioctl v2

2019-04-09 Thread Chunming Zhou

v2: drop export/import

Signed-off-by: Chunming Zhou 
---
 xf86drm.c | 44 
 xf86drm.h |  6 ++
 2 files changed, 50 insertions(+)

diff --git a/xf86drm.c b/xf86drm.c
index 18ad7c58..66e0c985 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -4279,3 +4279,47 @@ drm_public int drmSyncobjSignal(int fd, const uint32_t 
*handles,
 ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_SIGNAL, );
 return ret;
 }
+
+drm_public int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t 
*points,
+ unsigned num_handles,
+ int64_t timeout_nsec, unsigned flags,
+ uint32_t *first_signaled)
+{
+struct drm_syncobj_timeline_wait args;
+int ret;
+
+memclear(args);
+args.handles = (uintptr_t)handles;
+args.points = (uint64_t)(uintptr_t)points;
+args.timeout_nsec = timeout_nsec;
+args.count_handles = num_handles;
+args.flags = flags;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, );
+if (ret < 0)
+return -errno;
+
+if (first_signaled)
+*first_signaled = args.first_signaled;
+return ret;
+}
+
+
+drm_public int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points,
+  uint32_t handle_count)
+{
+struct drm_syncobj_timeline_array args;
+int ret;
+
+memclear(args);
+args.handles = (uintptr_t)handles;
+args.points = (uint64_t)(uintptr_t)points;
+args.count_handles = handle_count;
+
+ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_QUERY, );
+if (ret)
+return ret;
+return 0;
+}
+
+
diff --git a/xf86drm.h b/xf86drm.h
index 887ecc76..60c7a84f 100644
--- a/xf86drm.h
+++ b/xf86drm.h
@@ -876,6 +876,12 @@ extern int drmSyncobjWait(int fd, uint32_t *handles, 
unsigned num_handles,
  uint32_t *first_signaled);
 extern int drmSyncobjReset(int fd, const uint32_t *handles, uint32_t 
handle_count);
 extern int drmSyncobjSignal(int fd, const uint32_t *handles, uint32_t 
handle_count);
+extern int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points,
+ unsigned num_handles,
+ int64_t timeout_nsec, unsigned flags,
+ uint32_t *first_signaled);
+extern int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points,
+  uint32_t handle_count);
 
 #if defined(__cplusplus)
 }
-- 
2.17.1
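
The wrappers above store user-space pointers in 64-bit ioctl fields with a cast of the form (uint64_t)(uintptr_t)ptr. The sketch below isolates that idiom: going through uintptr_t first converts the pointer to an integer of its own width, which is then widened to 64 bits, so the same code works on 32-bit and 64-bit builds. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
              "pointer must fit in a 64-bit ioctl field");

/* Pack a pointer into a 64-bit field, as done for args.points. */
static uint64_t toy_ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Reverse direction, roughly what the receiving side would do. */
static void *toy_u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```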


[PATCH libdrm 2/8] addr cs chunk for syncobj timeline

2019-04-09 Thread Chunming Zhou

Signed-off-by: Chunming Zhou 
---
 include/drm/amdgpu_drm.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index e3a97da4..ab53f2e0 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -528,6 +528,8 @@ struct drm_amdgpu_gem_va {
 #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05
 #define AMDGPU_CHUNK_ID_BO_HANDLES  0x06
 #define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES 0x07
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT0x08
+#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL  0x09
 
 struct drm_amdgpu_cs_chunk {
__u32   chunk_id;
@@ -608,6 +610,13 @@ struct drm_amdgpu_cs_chunk_sem {
__u32 handle;
 };
 
+struct drm_amdgpu_cs_chunk_syncobj {
+   __u32 handle;
+   __u32 flags;
+   __u64 point;
+};
+
+
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ 0
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD  1
 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD2
-- 
2.17.1


[PATCH libdrm 6/8] expose timeline signal/export/import interfaces v2

2019-04-09 Thread Chunming Zhou

v2: adapt to new one transfer ioctl

Signed-off-by: Chunming Zhou 
---
 amdgpu/amdgpu-symbol-check |  3 ++
 amdgpu/amdgpu.h| 51 
 amdgpu/amdgpu_cs.c | 68 ++
 3 files changed, 122 insertions(+)

diff --git a/amdgpu/amdgpu-symbol-check b/amdgpu/amdgpu-symbol-check
index 67ba3039..0cc54e5e 100755
--- a/amdgpu/amdgpu-symbol-check
+++ b/amdgpu/amdgpu-symbol-check
@@ -51,10 +51,13 @@ amdgpu_cs_submit
 amdgpu_cs_submit_raw
 amdgpu_cs_submit_raw2
 amdgpu_cs_syncobj_export_sync_file
+amdgpu_cs_syncobj_export_sync_file2
 amdgpu_cs_syncobj_import_sync_file
+amdgpu_cs_syncobj_import_sync_file2
 amdgpu_cs_syncobj_query
 amdgpu_cs_syncobj_reset
 amdgpu_cs_syncobj_signal
+amdgpu_cs_syncobj_timeline_signal
 amdgpu_cs_syncobj_timeline_wait
 amdgpu_cs_syncobj_wait
 amdgpu_cs_wait_fences
diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index dcf662e9..b5bd3ed9 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -1501,6 +1501,23 @@ int amdgpu_cs_syncobj_reset(amdgpu_device_handle dev,
 int amdgpu_cs_syncobj_signal(amdgpu_device_handle dev,
 const uint32_t *syncobjs, uint32_t syncobj_count);
 
+/**
+ * Signal kernel timeline sync objects.
+ *
+ * \param dev   - \c [in] device handle
+ * \param syncobjs  - \c [in] array of sync object handles
+ * \param points   - \c [in] array of timeline points
+ * \param syncobj_count - \c [in] number of handles in syncobjs
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_syncobj_timeline_signal(amdgpu_device_handle dev,
+ const uint32_t *syncobjs,
+ uint64_t *points,
+ uint32_t syncobj_count);
+
 /**
  *  Wait for one or all sync objects to signal.
  *
@@ -1618,7 +1635,41 @@ int 
amdgpu_cs_syncobj_export_sync_file(amdgpu_device_handle dev,
 int amdgpu_cs_syncobj_import_sync_file(amdgpu_device_handle dev,
   uint32_t syncobj,
   int sync_file_fd);
+/**
+ *  Export kernel timeline sync object to a sync_file.
+ *
+ * \param   dev- \c [in] device handle
+ * \param   syncobj- \c [in] sync object handle
+ * \param   point  - \c [in] timeline point
+ * \param   flags  - \c [in] flags
+ * \param   sync_file_fd - \c [out] sync_file file descriptor.
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_export_sync_file2(amdgpu_device_handle dev,
+   uint32_t syncobj,
+   uint64_t point,
+   uint32_t flags,
+   int *sync_file_fd);
 
+/**
+ *  Import kernel timeline sync object from a sync_file.
+ *
+ * \param   dev- \c [in] device handle
+ * \param   syncobj- \c [in] sync object handle
+ * \param   point  - \c [in] timeline point
+ * \param   sync_file_fd - \c [in] sync_file file descriptor.
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_import_sync_file2(amdgpu_device_handle dev,
+   uint32_t syncobj,
+   uint64_t point,
+   int sync_file_fd);
 /**
  * Export an amdgpu fence as a handle (syncobj or fd).
  *
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index b8b0d566..1c02d16f 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -649,6 +649,18 @@ drm_public int 
amdgpu_cs_syncobj_signal(amdgpu_device_handle dev,
return drmSyncobjSignal(dev->fd, syncobjs, syncobj_count);
 }
 
+drm_public int amdgpu_cs_syncobj_timeline_signal(amdgpu_device_handle dev,
+const uint32_t *syncobjs,
+uint64_t *points,
+uint32_t syncobj_count)
+{
+   if (NULL == dev)
+   return -EINVAL;
+
+   return drmSyncobjTimelineSignal(dev->fd, syncobjs,
+   points, syncobj_count);
+}
+
 drm_public int amdgpu_cs_syncobj_wait(amdgpu_device_handle dev,
  uint32_t *handles, unsigned num_handles,
  int64_t timeout_nsec, unsigned flags,
@@ -724,6 +736,62 @@ drm_public int 
amdgpu_cs_syncobj_import_sync_file(amdgpu_device_handle dev,
return drmSyncobjImportSyncFile(dev->fd, syncobj, sync_file_fd);
 }
 
+drm_public int amdgpu_cs_syncobj_export_sync_file2(amdgpu_device_handle dev,
+  uint32_t syncobj,
+  uint64_t point,
+

[PATCH libdrm 4/8] wrap syncobj timeline query/wait APIs for amdgpu v3

2019-04-09 Thread Chunming Zhou

v2: symbols are stored in lexical order.
v3: drop export/import and extra query indirection

Signed-off-by: Chunming Zhou 
Signed-off-by: Christian König 
---
 amdgpu/amdgpu-symbol-check |  2 ++
 amdgpu/amdgpu.h| 39 ++
 amdgpu/amdgpu_cs.c | 23 ++
 3 files changed, 64 insertions(+)

diff --git a/amdgpu/amdgpu-symbol-check b/amdgpu/amdgpu-symbol-check
index 96a44b40..67ba3039 100755
--- a/amdgpu/amdgpu-symbol-check
+++ b/amdgpu/amdgpu-symbol-check
@@ -52,8 +52,10 @@ amdgpu_cs_submit_raw
 amdgpu_cs_submit_raw2
 amdgpu_cs_syncobj_export_sync_file
 amdgpu_cs_syncobj_import_sync_file
+amdgpu_cs_syncobj_query
 amdgpu_cs_syncobj_reset
 amdgpu_cs_syncobj_signal
+amdgpu_cs_syncobj_timeline_wait
 amdgpu_cs_syncobj_wait
 amdgpu_cs_wait_fences
 amdgpu_cs_wait_semaphore
diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index d6de3b8d..dcf662e9 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -1521,6 +1521,45 @@ int amdgpu_cs_syncobj_wait(amdgpu_device_handle dev,
   int64_t timeout_nsec, unsigned flags,
   uint32_t *first_signaled);
 
+/**
+ *  Wait for one or all sync objects to signal at their given points.
+ *
+ * \param   dev - \c [in] self-explanatory
+ * \param   handles - \c [in] array of sync object handles
+ * \param   points - \c [in] array of sync points to wait on
+ * \param   num_handles - \c [in] self-explanatory
+ * \param   timeout_nsec - \c [in] self-explanatory
+ * \param   flags   - \c [in] a bitmask of DRM_SYNCOBJ_WAIT_FLAGS_*
+ * \param   first_signaled - \c [out] index of the first signaled handle
+ *
+ * \return   0 on success\n
+ *  -ETIME - Timeout
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_timeline_wait(amdgpu_device_handle dev,
+   uint32_t *handles, uint64_t *points,
+   unsigned num_handles,
+   int64_t timeout_nsec, unsigned flags,
+   uint32_t *first_signaled);
+/**
+ *  Query sync objects payloads.
+ *
+ * \param   dev - \c [in] self-explanatory
+ * \param   handles - \c [in] array of sync object handles
+ * \param   points - \c [out] array of sync points returned, which represent
+ * the syncobj payloads.
+ * \param   num_handles - \c [in] self-explanatory
+ *
+ * \return   0 on success\n
+ *  -ETIME - Timeout
+ *  <0 - Negative POSIX Error code
+ *
+ */
+int amdgpu_cs_syncobj_query(amdgpu_device_handle dev,
+   uint32_t *handles, uint64_t *points,
+   unsigned num_handles);
+
 /**
  *  Export kernel sync object to shareable fd.
  *
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 5bedf748..b8b0d566 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -661,6 +661,29 @@ drm_public int amdgpu_cs_syncobj_wait(amdgpu_device_handle dev,
  flags, first_signaled);
 }
 
+drm_public int amdgpu_cs_syncobj_timeline_wait(amdgpu_device_handle dev,
+  uint32_t *handles, uint64_t *points,
+  unsigned num_handles,
+  int64_t timeout_nsec, unsigned flags,
+  uint32_t *first_signaled)
+{
+   if (NULL == dev)
+   return -EINVAL;
+
+   return drmSyncobjTimelineWait(dev->fd, handles, points, num_handles,
+ timeout_nsec, flags, first_signaled);
+}
+
+drm_public int amdgpu_cs_syncobj_query(amdgpu_device_handle dev,
+  uint32_t *handles, uint64_t *points,
+  unsigned num_handles)
+{
+   if (NULL == dev)
+   return -EINVAL;
+
+   return drmSyncobjQuery(dev->fd, handles, points, num_handles);
+}
+
 drm_public int amdgpu_cs_export_syncobj(amdgpu_device_handle dev,
uint32_t handle,
int *shared_fd)
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
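
For readers new to timeline syncobjs, here is a rough sketch of the wait semantics that amdgpu_cs_syncobj_timeline_wait() exposes. It is a toy user-space model, not the kernel implementation: it assumes a point counts as signaled once the syncobj payload is greater than or equal to it, and it only models a non-blocking poll (timeout of 0). The names model_timeline_poll and MODEL_WAIT_ALL are made up for illustration; only the flag value is taken from DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; same bit as DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL. */
#define MODEL_WAIT_ALL (1u << 0)

/*
 * Toy model of a zero-timeout timeline wait. payloads[i] stands in for
 * the current payload of the i-th syncobj; no kernel or GPU is involved.
 * Assumption: point i is signaled once payloads[i] >= points[i].
 *
 * Returns 0 if the wait condition is met, -ETIME if it is not,
 * and -EINVAL on bad arguments.
 */
static int model_timeline_poll(const uint64_t *payloads,
                               const uint64_t *points,
                               unsigned num_handles, unsigned flags,
                               uint32_t *first_signaled)
{
    unsigned i, signaled = 0;

    if (!payloads || !points || num_handles == 0)
        return -EINVAL;

    for (i = 0; i < num_handles; i++) {
        if (payloads[i] < points[i])
            continue; /* this point has not been reached yet */

        if (!(flags & MODEL_WAIT_ALL)) {
            /* "wait any": report the first signaled index and stop */
            if (first_signaled)
                *first_signaled = i;
            return 0;
        }
        signaled++;
    }

    /* "wait all": succeed only if every point was reached */
    if ((flags & MODEL_WAIT_ALL) && signaled == num_handles)
        return 0;

    return -ETIME;
}
```

With payloads {5, 1} and points {7, 1}, a "wait any" poll succeeds and reports index 1, while a "wait all" poll returns -ETIME until the first payload reaches 7.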

[PATCH libdrm 8/8] add syncobj timeline tests v3

2019-04-09 Thread Chunming Zhou

v2: drop DRM_SYNCOBJ_CREATE_TYPE_TIMELINE, fix timeout calculation,
fix some warnings
v3: add export/import and cpu signal testing cases

Signed-off-by: Chunming Zhou 
Signed-off-by: Christian König 
---
 tests/amdgpu/Makefile.am |   3 +-
 tests/amdgpu/amdgpu_test.c   |  11 ++
 tests/amdgpu/amdgpu_test.h   |  21 +++
 tests/amdgpu/meson.build |   2 +-
 tests/amdgpu/syncobj_tests.c | 290 +++
 5 files changed, 325 insertions(+), 2 deletions(-)
 create mode 100644 tests/amdgpu/syncobj_tests.c

diff --git a/tests/amdgpu/Makefile.am b/tests/amdgpu/Makefile.am
index 48278848..920882d0 100644
--- a/tests/amdgpu/Makefile.am
+++ b/tests/amdgpu/Makefile.am
@@ -34,4 +34,5 @@ amdgpu_test_SOURCES = \
uve_ib.h \
deadlock_tests.c \
vm_tests.c  \
-   ras_tests.c
+   ras_tests.c \
+   syncobj_tests.c
diff --git a/tests/amdgpu/amdgpu_test.c b/tests/amdgpu/amdgpu_test.c
index 8fc7a0b9..214c7fce 100644
--- a/tests/amdgpu/amdgpu_test.c
+++ b/tests/amdgpu/amdgpu_test.c
@@ -57,6 +57,7 @@
 #define DEADLOCK_TESTS_STR "Deadlock Tests"
 #define VM_TESTS_STR "VM Tests"
 #define RAS_TESTS_STR "RAS Tests"
+#define SYNCOBJ_TIMELINE_TESTS_STR "SYNCOBJ TIMELINE Tests"
 
 /**
  *  Open handles for amdgpu devices
@@ -123,6 +124,12 @@ static CU_SuiteInfo suites[] = {
.pCleanupFunc = suite_ras_tests_clean,
.pTests = ras_tests,
},
+   {
+   .pName = SYNCOBJ_TIMELINE_TESTS_STR,
+   .pInitFunc = suite_syncobj_timeline_tests_init,
+   .pCleanupFunc = suite_syncobj_timeline_tests_clean,
+   .pTests = syncobj_timeline_tests,
+   },
 
CU_SUITE_INFO_NULL,
 };
@@ -176,6 +183,10 @@ static Suites_Active_Status suites_active_stat[] = {
.pName = RAS_TESTS_STR,
.pActive = suite_ras_tests_enable,
},
+   {
+   .pName = SYNCOBJ_TIMELINE_TESTS_STR,
+   .pActive = suite_syncobj_timeline_tests_enable,
+   },
 };
 
 
diff --git a/tests/amdgpu/amdgpu_test.h b/tests/amdgpu/amdgpu_test.h
index bcd0bc7e..36675ea3 100644
--- a/tests/amdgpu/amdgpu_test.h
+++ b/tests/amdgpu/amdgpu_test.h
@@ -216,6 +216,27 @@ CU_BOOL suite_ras_tests_enable(void);
 extern CU_TestInfo ras_tests[];
 
 
+/**
+ * Initialize syncobj timeline test suite
+ */
+int suite_syncobj_timeline_tests_init();
+
+/**
+ * Deinitialize syncobj timeline test suite
+ */
+int suite_syncobj_timeline_tests_clean();
+
+/**
+ * Decide if the suite is enabled by default or not.
+ */
+CU_BOOL suite_syncobj_timeline_tests_enable(void);
+
+/**
+ * Tests in syncobj timeline test suite
+ */
+extern CU_TestInfo syncobj_timeline_tests[];
+
+
 /**
  * Helper functions
  */
diff --git a/tests/amdgpu/meson.build b/tests/amdgpu/meson.build
index 95ed9305..1726cb43 100644
--- a/tests/amdgpu/meson.build
+++ b/tests/amdgpu/meson.build
@@ -24,7 +24,7 @@ if dep_cunit.found()
 files(
   'amdgpu_test.c', 'basic_tests.c', 'bo_tests.c', 'cs_tests.c',
   'vce_tests.c', 'uvd_enc_tests.c', 'vcn_tests.c', 'deadlock_tests.c',
-  'vm_tests.c', 'ras_tests.c',
+  'vm_tests.c', 'ras_tests.c', 'syncobj_tests.c',
 ),
 dependencies : [dep_cunit, dep_threads],
 include_directories : [inc_root, inc_drm, include_directories('../../amdgpu')],
diff --git a/tests/amdgpu/syncobj_tests.c b/tests/amdgpu/syncobj_tests.c
new file mode 100644
index ..a0c627d7
--- /dev/null
+++ b/tests/amdgpu/syncobj_tests.c
@@ -0,0 +1,290 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+*/
+
+#include "CUnit/Basic.h"
+
+#include "amdgpu_test.h"
+#include "amdgpu_drm.h"
+#include "amdgpu_internal.h"
+#include 
+
+static  amdgpu_device_hand

[PATCH libdrm 1/8] new syncobj extension v3

2019-04-09 Thread Chunming Zhou

v2: drop unimplemented IOCTLs and flags
v3: add transfer/signal ioctls

Signed-off-by: Chunming Zhou 
Signed-off-by: Christian König 
---
 include/drm/drm.h | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/include/drm/drm.h b/include/drm/drm.h
index 85c685a2..26f51bca 100644
--- a/include/drm/drm.h
+++ b/include/drm/drm.h
@@ -729,8 +729,18 @@ struct drm_syncobj_handle {
__u32 pad;
 };
 
+struct drm_syncobj_transfer {
+__u32 src_handle;
+__u32 dst_handle;
+__u64 src_point;
+__u64 dst_point;
+__u32 flags;
+__u32 pad;
+};
+
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE (1 << 2)
 struct drm_syncobj_wait {
__u64 handles;
/* absolute timeout */
@@ -741,12 +751,31 @@ struct drm_syncobj_wait {
__u32 pad;
 };
 
+struct drm_syncobj_timeline_wait {
+__u64 handles;
+/* wait on a specific timeline point for every handle */
+__u64 points;
+/* absolute timeout */
+__s64 timeout_nsec;
+__u32 count_handles;
+__u32 flags;
+__u32 first_signaled; /* only valid when not waiting all */
+__u32 pad;
+};
+
 struct drm_syncobj_array {
__u64 handles;
__u32 count_handles;
__u32 pad;
 };
 
+struct drm_syncobj_timeline_array {
+__u64 handles;
+__u64 points;
+__u32 count_handles;
+__u32 pad;
+};
+
 /* Query current scanout sequence number */
 struct drm_crtc_get_sequence {
__u32 crtc_id;  /* requested crtc_id */
@@ -903,6 +932,12 @@ extern "C" {
 #define DRM_IOCTL_MODE_GET_LEASE   DRM_IOWR(0xC8, struct drm_mode_get_lease)
 #define DRM_IOCTL_MODE_REVOKE_LEASE   DRM_IOWR(0xC9, struct drm_mode_revoke_lease)
 
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT DRM_IOWR(0xCA, struct drm_syncobj_timeline_wait)
+#define DRM_IOCTL_SYNCOBJ_QUERY DRM_IOWR(0xCB, struct drm_syncobj_timeline_array)
+#define DRM_IOCTL_SYNCOBJ_TRANSFER DRM_IOWR(0xCC, struct drm_syncobj_transfer)
+#define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL   DRM_IOWR(0xCD, struct drm_syncobj_timeline_array)
+
+
 /**
  * Device specific ioctls should only be in their respective headers
  * The device specific ioctl range is from 0x40 to 0x9f.
-- 
2.17.1
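
As a rough illustration of how user space might fill the new wait argument, the sketch below packs the handle and point arrays into the __u64 fields and turns a relative timeout into the absolute value the struct comment asks for. It uses a local mirror of struct drm_syncobj_timeline_wait so it builds without the DRM headers; real code would include drm.h. The helper names (timeline_wait_args, abs_timeout_ns, fill_timeline_wait) are hypothetical, and the use of CLOCK_MONOTONIC as the time base is an assumption that should be checked against the kernel side.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Local mirror of struct drm_syncobj_timeline_wait from this patch. */
struct timeline_wait_args {
    uint64_t handles;        /* user pointer to a uint32_t handle array */
    uint64_t points;         /* user pointer to a uint64_t point array */
    int64_t  timeout_nsec;   /* absolute timeout */
    uint32_t count_handles;
    uint32_t flags;
    uint32_t first_signaled; /* only valid when not waiting all */
    uint32_t pad;
};

/* The layout has no implicit padding: 3 * 8 + 4 * 4 bytes. */
_Static_assert(sizeof(struct timeline_wait_args) == 40,
               "unexpected struct layout");

/* Convert a relative timeout to an absolute one.
 * Assumption: the kernel measures the timeout against CLOCK_MONOTONIC. */
static int64_t abs_timeout_ns(int64_t rel_ns)
{
    struct timespec now;
    int64_t now_ns;

    clock_gettime(CLOCK_MONOTONIC, &now);
    now_ns = (int64_t)now.tv_sec * 1000000000 + now.tv_nsec;

    /* Saturate instead of overflowing for "wait forever" style values. */
    if (rel_ns > INT64_MAX - now_ns)
        return INT64_MAX;
    return now_ns + rel_ns;
}

/* Fill the ioctl argument; pointers travel as 64-bit integers so the
 * layout is the same for 32-bit and 64-bit user space. */
static void fill_timeline_wait(struct timeline_wait_args *args,
                               const uint32_t *handles,
                               const uint64_t *points,
                               uint32_t count, uint32_t flags,
                               int64_t timeout_abs_ns)
{
    memset(args, 0, sizeof(*args)); /* also clears pad and first_signaled */
    args->handles = (uint64_t)(uintptr_t)handles;
    args->points = (uint64_t)(uintptr_t)points;
    args->timeout_nsec = timeout_abs_ns;
    args->count_handles = count;
    args->flags = flags;
}
```

In practice the libdrm wrapper drmSyncobjTimelineWait() does this packing for the caller, so the sketch is mainly useful for reading the struct definition above.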

