Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-19 Thread Christian König

Am 19.09.2018 um 05:20 schrieb zhoucm1:




On 2018年09月18日 16:32, Christian König wrote:

+    for (i = 0; i < args->count_handles; i++) {
+    if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
+    DRM_ERROR("timeline syncobj cannot reset!\n");


Why not? I mean that should still work or do I miss anything?
timeline semaphore spec doesn't require reset interface, it says the 
timeline value only can be changed by signal operations.


Yeah, but we don't care about the timeline spec in the kernel.

Question is rather if that still makes sense to support that and as 
far as I can see it should be trivial to reinitialize the object. 

Hi Daniel Rakos,

Could you give a comment on this question? Is it necessary to support 
timeline reset interface?  I only see the timeline value can be 
changed by signal operations in Spec.


I think the Vulkan spec is rather irrelevant here. We added the reset 
operation because of drivers which wanted to re-use an existing syncobj.


That is still a valid use case even when Vulkan doesn't have an 
interface for it.


Christian.




Thanks,
David Zhou


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-18 Thread zhoucm1



On 2018年09月18日 16:32, Christian König wrote:

+    for (i = 0; i < args->count_handles; i++) {
+    if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
+    DRM_ERROR("timeline syncobj cannot reset!\n");


Why not? I mean that should still work or do I miss anything?
timeline semaphore spec doesn't require reset interface, it says the 
timeline value only can be changed by signal operations.


Yeah, but we don't care about the timeline spec in the kernel.

Question is rather if that still makes sense to support that and as 
far as I can see it should be trivial to reinitialize the object. 

Hi Daniel Rakos,

Could you give a comment on this question? Is it necessary to support 
timeline reset interface?  I only see the timeline value can be changed 
by signal operations in Spec.



Thanks,
David Zhou
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-18 Thread Christian König

Am 17.09.2018 um 11:33 schrieb zhoucm1:

[SNIP]

  +static struct dma_fence *
+drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64 
flags)

+{
+    struct dma_fence *fence;
+    int ret = 0;
+
+    if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+    ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
+    point <= syncobj->syncobj_timeline.signal_point,
+    msecs_to_jiffies(1)); /* wait 10s */


Either don't use a timeout or provide the timeout explicitely, but 
don't hard code it here.


Additional to that we probably need an incorruptible wait.
Initially, I used interruptible wait, I found it sometimes often 
return -ERESTARTSYS without waiting any second when I wrote unit test.

Any ideas for that?


Yeah, that is perfectly normal. The application just received a signal 
and we need to abort the IOCTL ASAP.



still need to interruptible wait?


Yes, that should certainly be an interruptible wait.

Otherwise an application could block on this forever when somebody 
forgets to supply a signal point.



  -    for (i = 0; i < args->count_handles; i++)
-    drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
-
+    for (i = 0; i < args->count_handles; i++) {
+    if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
+    DRM_ERROR("timeline syncobj cannot reset!\n");


Why not? I mean that should still work or do I miss anything?
timeline semaphore spec doesn't require reset interface, it says the 
timeline value only can be changed by signal operations.


Yeah, but we don't care about the timeline spec in the kernel.

Question is rather if that still makes sense to support that and as far 
as I can see it should be trivial to reinitialize the object.



+    ret = -EINVAL;
+    goto out;
+    }
+    drm_syncobj_timeline_fini(syncobjs[i],
+  [i]->syncobj_timeline);
+    drm_syncobj_timeline_init(syncobjs[i],
+  [i]->syncobj_timeline);
+    }
+out:
  drm_syncobj_array_free(syncobjs, args->count_handles);
  -    return 0;
+    return ret;
  }
    int
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c

index 0a8d2d64f380..579e91a5858b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
  if (!(flags & I915_EXEC_FENCE_WAIT))
  continue;
  -    fence = drm_syncobj_fence_get(syncobj);
+    drm_syncobj_search_fence(syncobj, 0,
+ DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,


Not sure if we can make DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT the 
default here.


Maybe better if drivers explicitly need to ask for it?
if you don't like the flag used in specific driver, I can wrap it to a 
new function.


Yeah, that is probably a good idea.




+ );
  if (!fence)
  return -EINVAL;
  diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 425432b85a87..9535521e6623 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -30,6 +30,25 @@
    struct drm_syncobj_cb;
  +enum drm_syncobj_type {
+    DRM_SYNCOBJ_TYPE_NORMAL,


Does anybody have a better suggestion for _TYPE_NORMAL? That sounds 
like timeline would be special.

How about _TYPE_INDIVIDUAL?


Oh, really good idea.

btw: I encounter a problem after removing advanced wait pt, I debug it 
two days, but still don't find reason, from log, it's timeount when 
wait_event_timeout, that means it don't receive signal operation.


Good question. Maybe a race condition? Have you tried to trace both 
threads to the trace buffer, e.g. use trace_printk?


"./deqp-vk -n dEQP-VK*semaphore*" will fail on 
'dEQP-VK.synchronization.basic.semaphore.chain' case, but can pass if 
only run that one case like "./deqp-vk -n 
dEQP-VK.synchronization.basic.semaphore.chain".


Both old and my previous implementation can pass and have no this 
problem.


Can you reproduce that as well with the unit tests in libdrm?

Christian.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-17 Thread zhoucm1



On 2018年09月17日 16:37, Christian König wrote:

Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
This patch is for VK_KHR_timeline_semaphore extension, semaphore is 
called syncobj in kernel side:
This extension introduces a new type of syncobj that has an integer 
payload

identifying a point in a timeline. Such timeline syncobjs support the
following operations:
    * CPU query - A host operation that allows querying the payload 
of the

  timeline syncobj.
    * CPU wait - A host operation that allows a blocking wait for a
  timeline syncobj to reach a specified value.
    * Device wait - A device operation that allows waiting for a
  timeline syncobj to reach a specified value.
    * Device signal - A device operation that allows advancing the
  timeline syncobj to a specified value.

Since it's a timeline, that means the front time point(PT) always is 
signaled before the late PT.

a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, 
when PT[N] fence is signaled,

the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when 
timeline is increasing, will compare
wait PTs value with new timeline value, if PT value is lower than 
timeline value, then wait PT will be
signaled, otherwise keep in list. syncobj wait operation can wait on 
any point of timeline,
so need a RB tree to order them. And wait PT could ahead of signal 
PT, we need a sumission fence to

perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate 
patch. (Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate 
patch.
5. drop the submission_fence implementation and instead use 
wait_event() for that. (Christian)

6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and 
Christian)
 a. normal syncobj signal op will create a signal PT to tail of 
signal pt list.
 b. normal syncobj wait op will create a wait pt with last signal 
point, and this wait PT is only signaled by related signal point PT.

2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)
4. fix timeline path issues.
5. write a timeline test in libdrm

v5: (Christian)
1. semaphore is called syncobj in kernel side.
2. don't need 'timeline' characters in some function name.
3. keep syncobj cb

normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
timeline syncobj is tested by ./amdgpu_test -s 9

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/drm_syncobj.c  | 294 ++---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
  include/drm/drm_syncobj.h  |  62 +++--
  include/uapi/drm/drm.h |   1 +
  4 files changed, 292 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c 
b/drivers/gpu/drm/drm_syncobj.c

index e9ce623d049e..e78d076f2703 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@
  #include "drm_internal.h"
  #include 
  +/* merge normal syncobj to timeline syncobj, the point interval is 
1 */

+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
  struct drm_syncobj_stub_fence {
  struct dma_fence base;
  spinlock_t lock;
@@ -82,6 +85,11 @@ static const struct dma_fence_ops 
drm_syncobj_stub_fence_ops = {

  .release = drm_syncobj_stub_fence_release,
  };
  +struct drm_syncobj_signal_pt {
+    struct dma_fence_array *base;
+    u64    value;
+    struct list_head list;
+};
    /**
   * drm_syncobj_find - lookup and reference a sync object.
@@ -124,7 +132,7 @@ static int 
drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,

  {
  int ret;
  -    *fence = drm_syncobj_fence_get(syncobj);
+    ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
  if (*fence)


Don't we need to check ret here instead?

both ok, if you like check ret, will update it in v6.




  return 1;
  @@ -133,10 +141,10 @@ static int 
drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,

   * have the lock, try one more time just to be sure we don't add a
   * callback when a fence has already been set.
   */
-    if (syncobj->fence) {
-    *fence = 
dma_fence_get(rcu_dereference_protected(syncobj->fence,

- lockdep_is_held(>lock)));
-    ret = 1;
+    if (fence) {
+    drm_syncobj_search_fence(syncobj, 0, 0, fence);
+    if (*fence)
+    ret = 1;


That doesn't looks correct to me, drm_syncobj_search_fence() would try 
to grab the lock once more.


That 

Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-17 Thread Christian König

Am 14.09.2018 um 12:37 schrieb Chunming Zhou:

This patch is for VK_KHR_timeline_semaphore extension, semaphore is called 
syncobj in kernel side:
This extension introduces a new type of syncobj that has an integer payload
identifying a point in a timeline. Such timeline syncobjs support the
following operations:
* CPU query - A host operation that allows querying the payload of the
  timeline syncobj.
* CPU wait - A host operation that allows a blocking wait for a
  timeline syncobj to reach a specified value.
* Device wait - A device operation that allows waiting for a
  timeline syncobj to reach a specified value.
* Device signal - A device operation that allows advancing the
  timeline syncobj to a specified value.

Since it's a timeline, that means the front time point(PT) always is signaled 
before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when 
PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when timeline is 
increasing, will compare
wait PTs value with new timeline value, if PT value is lower than timeline 
value, then wait PT will be
signaled, otherwise keep in list. syncobj wait operation can wait on any point 
of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT, we need 
a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
 a. normal syncobj signal op will create a signal PT to tail of signal pt 
list.
 b. normal syncobj wait op will create a wait pt with last signal point, 
and this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)
4. fix timeline path issues.
5. write a timeline test in libdrm

v5: (Christian)
1. semaphore is called syncobj in kernel side.
2. don't need 'timeline' characters in some function name.
3. keep syncobj cb

normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
timeline syncobj is tested by ./amdgpu_test -s 9

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/drm_syncobj.c  | 294 ++---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
  include/drm/drm_syncobj.h  |  62 +++--
  include/uapi/drm/drm.h |   1 +
  4 files changed, 292 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e9ce623d049e..e78d076f2703 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@
  #include "drm_internal.h"
  #include 
  
+/* merge normal syncobj to timeline syncobj, the point interval is 1 */

+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
  struct drm_syncobj_stub_fence {
struct dma_fence base;
spinlock_t lock;
@@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops 
= {
.release = drm_syncobj_stub_fence_release,
  };
  
+struct drm_syncobj_signal_pt {

+   struct dma_fence_array *base;
+   u64value;
+   struct list_head list;
+};
  
  /**

   * drm_syncobj_find - lookup and reference a sync object.
@@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct 
drm_syncobj *syncobj,
  {
int ret;
  
-	*fence = drm_syncobj_fence_get(syncobj);

+   ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
if (*fence)


Don't we need to check ret here instead?


return 1;
  
@@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,

 * have the lock, try one more time just to be sure we don't add a
 * callback when a fence has already been set.
 */
-   if (syncobj->fence) {
-   *fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
-
lockdep_is_held(>lock)));
-   ret = 1;
+   if (fence) {
+   drm_syncobj_search_fence(syncobj, 0, 0, fence);
+   if (*fence)
+   ret = 1;


That doesn't looks correct to me, drm_syncobj_search_fence() would try 
to grab the 

RE: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-16 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Daniel Vetter  On Behalf Of Daniel Vetter
> Sent: Saturday, September 15, 2018 12:11 AM
> To: Koenig, Christian 
> Cc: Zhou, David(ChunMing) ; dri-
> de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Dave Airlie
> ; Rakos, Daniel ; Daniel
> Vetter 
> Subject: Re: [PATCH] [RFC]drm: add syncobj timeline support v5
> 
> On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:
> > Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
> > > This patch is for VK_KHR_timeline_semaphore extension, semaphore is
> called syncobj in kernel side:
> > > This extension introduces a new type of syncobj that has an integer
> > > payload identifying a point in a timeline. Such timeline syncobjs
> > > support the following operations:
> > > * CPU query - A host operation that allows querying the payload of the
> > >   timeline syncobj.
> > > * CPU wait - A host operation that allows a blocking wait for a
> > >   timeline syncobj to reach a specified value.
> > > * Device wait - A device operation that allows waiting for a
> > >   timeline syncobj to reach a specified value.
> > > * Device signal - A device operation that allows advancing the
> > >   timeline syncobj to a specified value.
> > >
> > > Since it's a timeline, that means the front time point(PT) always is
> signaled before the late PT.
> > > a. signal PT design:
> > > Signal PT fence N depends on PT[N-1] fence and signal opertion
> > > fence, when PT[N] fence is signaled, the timeline will increase to value 
> > > of
> PT[N].
> > > b. wait PT design:
> > > Wait PT fence is signaled by reaching timeline point value, when
> > > timeline is increasing, will compare wait PTs value with new
> > > timeline value, if PT value is lower than timeline value, then wait
> > > PT will be signaled, otherwise keep in list. syncobj wait operation
> > > can wait on any point of timeline, so need a RB tree to order them. And
> wait PT could ahead of signal PT, we need a sumission fence to perform that.
> > >
> > > v2:
> > > 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2.
> move
> > > unexposed denitions to .c file. (Daniel Vetter) 3. split up the
> > > change to drm_syncobj_find_fence() in a separate patch. (Christian)
> > > 4. split up the change to drm_syncobj_replace_fence() in a separate
> patch.
> > > 5. drop the submission_fence implementation and instead use
> > > wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL
> > > type syncobj case. (Daniel Vetter)
> > >
> > > v3:
> > > 1. replace normal syncobj with timeline implemenation. (Vetter and
> Christian)
> > >  a. normal syncobj signal op will create a signal PT to tail of 
> > > signal pt list.
> > >  b. normal syncobj wait op will create a wait pt with last signal 
> > > point, and
> this wait PT is only signaled by related signal point PT.
> > > 2. many bug fix and clean up
> > > 3. stub fence moving is moved to other patch.
> > >
> > > v4:
> > > 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2.
> > > fix syncobj lifecycle. (Christian) 3. only enable_signaling when
> > > there is wait_pt. (Christian) 4. fix timeline path issues.
> > > 5. write a timeline test in libdrm
> > >
> > > v5: (Christian)
> > > 1. semaphore is called syncobj in kernel side.
> > > 2. don't need 'timeline' characters in some function name.
> > > 3. keep syncobj cb
> > >
> > > normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline
> > > syncobj is tested by ./amdgpu_test -s 9
> > >
> > > Signed-off-by: Chunming Zhou 
> > > Cc: Christian Konig 
> > > Cc: Dave Airlie 
> > > Cc: Daniel Rakos 
> > > Cc: Daniel Vetter 
> >
> > At least on first glance that looks like it should work, going to do a
> > detailed review on Monday.
> 
> Just for my understanding, it's all condensed down to 1 patch now?

Yes, Christian suggest that.

 >I kinda
> didn't follow the detailed discussion last few days at all :-/
> 
> Also, is there a testcase, igt highly preferred (because then we'll run it in 
> our
> intel-gfx CI, and a bunch of people outside of intel have already discovered
> that and are using it).


I already wrote the test in libdrm unit test, since I'm not familiar with IGT 
stuff.

Thanks,
David Zhou
> 
> Thanks, Daniel
> 
> >
>

Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-14 Thread Christian König

Am 14.09.2018 um 20:24 schrieb Daniel Vetter:

On Fri, Sep 14, 2018 at 6:43 PM, Christian König
 wrote:

Am 14.09.2018 um 18:10 schrieb Daniel Vetter:

On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:

Am 14.09.2018 um 12:37 schrieb Chunming Zhou:

This patch is for VK_KHR_timeline_semaphore extension, semaphore is
called syncobj in kernel side:
This extension introduces a new type of syncobj that has an integer
payload
identifying a point in a timeline. Such timeline syncobjs support the
following operations:
  * CPU query - A host operation that allows querying the payload of
the
timeline syncobj.
  * CPU wait - A host operation that allows a blocking wait for a
timeline syncobj to reach a specified value.
  * Device wait - A device operation that allows waiting for a
timeline syncobj to reach a specified value.
  * Device signal - A device operation that allows advancing the
timeline syncobj to a specified value.

Since it's a timeline, that means the front time point(PT) always is
signaled before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence,
when PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when
timeline is increasing, will compare
wait PTs value with new timeline value, if PT value is lower than
timeline value, then wait PT will be
signaled, otherwise keep in list. syncobj wait operation can wait on any
point of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT,
we need a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch.
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate
patch.
5. drop the submission_fence implementation and instead use wait_event()
for that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and
Christian)
   a. normal syncobj signal op will create a signal PT to tail of
signal pt list.
   b. normal syncobj wait op will create a wait pt with last signal
point, and this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)
4. fix timeline path issues.
5. write a timeline test in libdrm

v5: (Christian)
1. semaphore is called syncobj in kernel side.
2. don't need 'timeline' characters in some function name.
3. keep syncobj cb

normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
timeline syncobj is tested by ./amdgpu_test -s 9

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 

At least on first glance that looks like it should work, going to do a
detailed review on Monday.

Just for my understanding, it's all condensed down to 1 patch now? I kinda
didn't follow the detailed discussion last few days at all :-/


I've already committed all the cleanup/fix prerequisites to drm-misc-next.

The driver specific implementation needs to come on top and maybe a new CPU
wait IOCTL.

But essentially this patch is just the core of the kernel implementation.

Ah cool, missed that.


Also, is there a testcase, igt highly preferred (because then we'll run it
in our intel-gfx CI, and a bunch of people outside of intel have already
discovered that and are using it).


libdrm patches and I think amdgpu based test cases where already published
as well.

Not sure about igt testcases.

I guess we can write them when the intel implementation shows up. Just
kinda still hoping that we'd have a more unfified test suite. And not
really well-kept secret: We do have an amdgpu in our CI, in the form
of kbl-g :-) But unfortunately it's not running the full test set for
patches (only for drm-tip). But we could perhaps run more of the
amdgpu tests somehow, if there's serious interest.


Well I wouldn't mind if we sooner or later get rid of the amdgpu unit 
tests in libdrm.


They are more or less just a really bloody mess.

Christian.



Cheers, Daniel



Christian.



Thanks, Daniel


Christian.


---
drivers/gpu/drm/drm_syncobj.c  | 294
++---
drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
include/drm/drm_syncobj.h  |  62 +++--
include/uapi/drm/drm.h |   1 +
4 files changed, 292 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c
b/drivers/gpu/drm/drm_syncobj.c
index e9ce623d049e..e78d076f2703 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ 

Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-14 Thread Daniel Vetter
On Fri, Sep 14, 2018 at 6:43 PM, Christian König
 wrote:
> Am 14.09.2018 um 18:10 schrieb Daniel Vetter:
>>
>> On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:
>>>
>>> Am 14.09.2018 um 12:37 schrieb Chunming Zhou:

 This patch is for VK_KHR_timeline_semaphore extension, semaphore is
 called syncobj in kernel side:
 This extension introduces a new type of syncobj that has an integer
 payload
 identifying a point in a timeline. Such timeline syncobjs support the
 following operations:
  * CPU query - A host operation that allows querying the payload of
 the
timeline syncobj.
  * CPU wait - A host operation that allows a blocking wait for a
timeline syncobj to reach a specified value.
  * Device wait - A device operation that allows waiting for a
timeline syncobj to reach a specified value.
  * Device signal - A device operation that allows advancing the
timeline syncobj to a specified value.

 Since it's a timeline, that means the front time point(PT) always is
 signaled before the late PT.
 a. signal PT design:
 Signal PT fence N depends on PT[N-1] fence and signal opertion fence,
 when PT[N] fence is signaled,
 the timeline will increase to value of PT[N].
 b. wait PT design:
 Wait PT fence is signaled by reaching timeline point value, when
 timeline is increasing, will compare
 wait PTs value with new timeline value, if PT value is lower than
 timeline value, then wait PT will be
 signaled, otherwise keep in list. syncobj wait operation can wait on any
 point of timeline,
 so need a RB tree to order them. And wait PT could ahead of signal PT,
 we need a sumission fence to
 perform that.

 v2:
 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
 2. move unexposed denitions to .c file. (Daniel Vetter)
 3. split up the change to drm_syncobj_find_fence() in a separate patch.
 (Christian)
 4. split up the change to drm_syncobj_replace_fence() in a separate
 patch.
 5. drop the submission_fence implementation and instead use wait_event()
 for that. (Christian)
 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

 v3:
 1. replace normal syncobj with timeline implemenation. (Vetter and
 Christian)
   a. normal syncobj signal op will create a signal PT to tail of
 signal pt list.
   b. normal syncobj wait op will create a wait pt with last signal
 point, and this wait PT is only signaled by related signal point PT.
 2. many bug fix and clean up
 3. stub fence moving is moved to other patch.

 v4:
 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
 2. fix syncobj lifecycle. (Christian)
 3. only enable_signaling when there is wait_pt. (Christian)
 4. fix timeline path issues.
 5. write a timeline test in libdrm

 v5: (Christian)
 1. semaphore is called syncobj in kernel side.
 2. don't need 'timeline' characters in some function name.
 3. keep syncobj cb

 normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
 timeline syncobj is tested by ./amdgpu_test -s 9

 Signed-off-by: Chunming Zhou 
 Cc: Christian Konig 
 Cc: Dave Airlie 
 Cc: Daniel Rakos 
 Cc: Daniel Vetter 
>>>
>>> At least on first glance that looks like it should work, going to do a
>>> detailed review on Monday.
>>
>> Just for my understanding, it's all condensed down to 1 patch now? I kinda
>> didn't follow the detailed discussion last few days at all :-/
>
>
> I've already committed all the cleanup/fix prerequisites to drm-misc-next.
>
> The driver specific implementation needs to come on top and maybe a new CPU
> wait IOCTL.
>
> But essentially this patch is just the core of the kernel implementation.

Ah cool, missed that.

>> Also, is there a testcase, igt highly preferred (because then we'll run it
>> in our intel-gfx CI, and a bunch of people outside of intel have already
>> discovered that and are using it).
>
>
> libdrm patches and I think amdgpu based test cases where already published
> as well.
>
> Not sure about igt testcases.

I guess we can write them when the intel implementation shows up. Just
kinda still hoping that we'd have a more unfified test suite. And not
really well-kept secret: We do have an amdgpu in our CI, in the form
of kbl-g :-) But unfortunately it's not running the full test set for
patches (only for drm-tip). But we could perhaps run more of the
amdgpu tests somehow, if there's serious interest.

Cheers, Daniel


> Christian.
>
>
>>
>> Thanks, Daniel
>>
>>> Christian.
>>>
 ---
drivers/gpu/drm/drm_syncobj.c  | 294
 ++---
drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
include/drm/drm_syncobj.h  |  62 +++--

Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-14 Thread Christian König

Am 14.09.2018 um 18:10 schrieb Daniel Vetter:

On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:

Am 14.09.2018 um 12:37 schrieb Chunming Zhou:

This patch is for VK_KHR_timeline_semaphore extension, semaphore is called 
syncobj in kernel side:
This extension introduces a new type of syncobj that has an integer payload
identifying a point in a timeline. Such timeline syncobjs support the
following operations:
 * CPU query - A host operation that allows querying the payload of the
   timeline syncobj.
 * CPU wait - A host operation that allows a blocking wait for a
   timeline syncobj to reach a specified value.
 * Device wait - A device operation that allows waiting for a
   timeline syncobj to reach a specified value.
 * Device signal - A device operation that allows advancing the
   timeline syncobj to a specified value.

Since it's a timeline, that means the front time point(PT) always is signaled 
before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when 
PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when timeline is 
increasing, will compare
wait PTs value with new timeline value, if PT value is lower than timeline 
value, then wait PT will be
signaled, otherwise keep in list. syncobj wait operation can wait on any point 
of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT, we need 
a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
  a. normal syncobj signal op will create a signal PT to tail of signal pt 
list.
  b. normal syncobj wait op will create a wait pt with last signal point, 
and this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)
4. fix timeline path issues.
5. write a timeline test in libdrm

v5: (Christian)
1. semaphore is called syncobj in kernel side.
2. don't need 'timeline' characters in some function name.
3. keep syncobj cb

normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
timeline syncobj is tested by ./amdgpu_test -s 9

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 

At least on first glance that looks like it should work, going to do a
detailed review on Monday.

Just for my understanding, it's all condensed down to 1 patch now? I kinda
didn't follow the detailed discussion last few days at all :-/


I've already committed all the cleanup/fix prerequisites to drm-misc-next.

The driver specific implementation needs to come on top and maybe a new 
CPU wait IOCTL.


But essentially this patch is just the core of the kernel implementation.


Also, is there a testcase, igt highly preferred (because then we'll run it
in our intel-gfx CI, and a bunch of people outside of intel have already
discovered that and are using it).


libdrm patches and I think amdgpu based test cases where already 
published as well.


Not sure about igt testcases.

Christian.



Thanks, Daniel


Christian.


---
   drivers/gpu/drm/drm_syncobj.c  | 294 ++---
   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
   include/drm/drm_syncobj.h  |  62 +++--
   include/uapi/drm/drm.h |   1 +
   4 files changed, 292 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e9ce623d049e..e78d076f2703 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@either
   #include "drm_internal.h"
   #include 
+/* merge normal syncobj to timeline syncobj, the point interval is 1 */
+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
   struct drm_syncobj_stub_fence {
struct dma_fence base;
spinlock_t lock;
@@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops 
= {
.release = drm_syncobj_stub_fence_release,
   };
+struct drm_syncobj_signal_pt {
+   struct dma_fence_array *base;
+   u64value;
+   struct list_head list;
+};
   /**
* drm_syncobj_find - lookup and reference a sync object.
@@ -124,7 +132,7 @@ static int 

Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-14 Thread Daniel Vetter
On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:
> Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
> > This patch is for VK_KHR_timeline_semaphore extension, semaphore is called 
> > syncobj in kernel side:
> > This extension introduces a new type of syncobj that has an integer payload
> > identifying a point in a timeline. Such timeline syncobjs support the
> > following operations:
> > * CPU query - A host operation that allows querying the payload of the
> >   timeline syncobj.
> > * CPU wait - A host operation that allows a blocking wait for a
> >   timeline syncobj to reach a specified value.
> > * Device wait - A device operation that allows waiting for a
> >   timeline syncobj to reach a specified value.
> > * Device signal - A device operation that allows advancing the
> >   timeline syncobj to a specified value.
> > 
> > Since it's a timeline, that means the front time point(PT) always is 
> > signaled before the late PT.
> > a. signal PT design:
> > Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when 
> > PT[N] fence is signaled,
> > the timeline will increase to value of PT[N].
> > b. wait PT design:
> > Wait PT fence is signaled by reaching timeline point value, when timeline 
> > is increasing, will compare
> > wait PTs value with new timeline value, if PT value is lower than timeline 
> > value, then wait PT will be
> > signaled, otherwise keep in list. syncobj wait operation can wait on any 
> > point of timeline,
> > so need a RB tree to order them. And wait PT could ahead of signal PT, we 
> > need a sumission fence to
> > perform that.
> > 
> > v2:
> > 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
> > 2. move unexposed denitions to .c file. (Daniel Vetter)
> > 3. split up the change to drm_syncobj_find_fence() in a separate patch. 
> > (Christian)
> > 4. split up the change to drm_syncobj_replace_fence() in a separate patch.
> > 5. drop the submission_fence implementation and instead use wait_event() 
> > for that. (Christian)
> > 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)
> > 
> > v3:
> > 1. replace normal syncobj with timeline implemenation. (Vetter and 
> > Christian)
> >  a. normal syncobj signal op will create a signal PT to tail of signal 
> > pt list.
> >  b. normal syncobj wait op will create a wait pt with last signal 
> > point, and this wait PT is only signaled by related signal point PT.
> > 2. many bug fix and clean up
> > 3. stub fence moving is moved to other patch.
> > 
> > v4:
> > 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
> > 2. fix syncobj lifecycle. (Christian)
> > 3. only enable_signaling when there is wait_pt. (Christian)
> > 4. fix timeline path issues.
> > 5. write a timeline test in libdrm
> > 
> > v5: (Christian)
> > 1. semaphore is called syncobj in kernel side.
> > 2. don't need 'timeline' characters in some function name.
> > 3. keep syncobj cb
> > 
> > normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
> > timeline syncobj is tested by ./amdgpu_test -s 9
> > 
> > Signed-off-by: Chunming Zhou 
> > Cc: Christian Konig 
> > Cc: Dave Airlie 
> > Cc: Daniel Rakos 
> > Cc: Daniel Vetter 
> 
> At least on first glance that looks like it should work, going to do a
> detailed review on Monday.

Just for my understanding, it's all condensed down to 1 patch now? I kinda
didn't follow the detailed discussion last few days at all :-/

Also, is there a testcase, igt highly preferred (because then we'll run it
in our intel-gfx CI, and a bunch of people outside of intel have already
discovered that and are using it).

Thanks, Daniel

> 
> Christian.
> 
> > ---
> >   drivers/gpu/drm/drm_syncobj.c  | 294 ++---
> >   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
> >   include/drm/drm_syncobj.h  |  62 +++--
> >   include/uapi/drm/drm.h |   1 +
> >   4 files changed, 292 insertions(+), 69 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> > index e9ce623d049e..e78d076f2703 100644
> > --- a/drivers/gpu/drm/drm_syncobj.c
> > +++ b/drivers/gpu/drm/drm_syncobj.c
> > @@ -56,6 +56,9 @@either
> >   #include "drm_internal.h"
> >   #include 
> > +/* merge normal syncobj to timeline syncobj, the point interval is 1 */
> > +#define DRM_SYNCOBJ_NORMAL_POINT 1
> > +
> >   struct drm_syncobj_stub_fence {
> > struct dma_fence base;
> > spinlock_t lock;
> > @@ -82,6 +85,11 @@ static const struct dma_fence_ops 
> > drm_syncobj_stub_fence_ops = {
> > .release = drm_syncobj_stub_fence_release,
> >   };
> > +struct drm_syncobj_signal_pt {
> > +   struct dma_fence_array *base;
> > +   u64value;
> > +   struct list_head list;
> > +};
> >   /**
> >* drm_syncobj_find - lookup and reference a sync object.
> > @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct 
> > drm_syncobj *syncobj,
> >   {
> >  

Re: [PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-14 Thread Christian König

Am 14.09.2018 um 12:37 schrieb Chunming Zhou:

This patch is for VK_KHR_timeline_semaphore extension, semaphore is called 
syncobj in kernel side:
This extension introduces a new type of syncobj that has an integer payload
identifying a point in a timeline. Such timeline syncobjs support the
following operations:
* CPU query - A host operation that allows querying the payload of the
  timeline syncobj.
* CPU wait - A host operation that allows a blocking wait for a
  timeline syncobj to reach a specified value.
* Device wait - A device operation that allows waiting for a
  timeline syncobj to reach a specified value.
* Device signal - A device operation that allows advancing the
  timeline syncobj to a specified value.

Since it's a timeline, that means the front time point(PT) always is signaled 
before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when 
PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when timeline is 
increasing, will compare
wait PTs value with new timeline value, if PT value is lower than timeline 
value, then wait PT will be
signaled, otherwise keep in list. syncobj wait operation can wait on any point 
of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT, we need 
a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
 a. normal syncobj signal op will create a signal PT to tail of signal pt 
list.
 b. normal syncobj wait op will create a wait pt with last signal point, 
and this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)
4. fix timeline path issues.
5. write a timeline test in libdrm

v5: (Christian)
1. semaphore is called syncobj in kernel side.
2. don't need 'timeline' characters in some function name.
3. keep syncobj cb

normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
timeline syncobj is tested by ./amdgpu_test -s 9

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 


At least on first glance that looks like it should work, going to do a 
detailed review on Monday.


Christian.


---
  drivers/gpu/drm/drm_syncobj.c  | 294 ++---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
  include/drm/drm_syncobj.h  |  62 +++--
  include/uapi/drm/drm.h |   1 +
  4 files changed, 292 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e9ce623d049e..e78d076f2703 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@either
  #include "drm_internal.h"
  #include 
  
+/* merge normal syncobj to timeline syncobj, the point interval is 1 */

+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
  struct drm_syncobj_stub_fence {
struct dma_fence base;
spinlock_t lock;
@@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops 
= {
.release = drm_syncobj_stub_fence_release,
  };
  
+struct drm_syncobj_signal_pt {

+   struct dma_fence_array *base;
+   u64value;
+   struct list_head list;
+};
  
  /**

   * drm_syncobj_find - lookup and reference a sync object.
@@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct 
drm_syncobj *syncobj,
  {
int ret;
  
-	*fence = drm_syncobj_fence_get(syncobj);

+   ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
if (*fence)
return 1;
  
@@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,

 * have the lock, try one more time just to be sure we don't add a
 * callback when a fence has already been set.
 */
-   if (syncobj->fence) {
-   *fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
-
lockdep_is_held(>lock)));
-   ret = 1;
+   if (fence) {
+   drm_syncobj_search_fence(syncobj, 0, 0, fence);
+   if (*fence)
+   ret = 1;
} 

[PATCH] [RFC]drm: add syncobj timeline support v5

2018-09-14 Thread Chunming Zhou
This patch is for VK_KHR_timeline_semaphore extension, semaphore is called 
syncobj in kernel side:
This extension introduces a new type of syncobj that has an integer payload
identifying a point in a timeline. Such timeline syncobjs support the
following operations:
   * CPU query - A host operation that allows querying the payload of the
 timeline syncobj.
   * CPU wait - A host operation that allows a blocking wait for a
 timeline syncobj to reach a specified value.
   * Device wait - A device operation that allows waiting for a
 timeline syncobj to reach a specified value.
   * Device signal - A device operation that allows advancing the
 timeline syncobj to a specified value.

Since it's a timeline, that means the front time point(PT) always is signaled 
before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when 
PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when timeline is 
increasing, will compare
wait PTs value with new timeline value, if PT value is lower than timeline 
value, then wait PT will be
signaled, otherwise keep in list. syncobj wait operation can wait on any point 
of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT, we need 
a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
a. normal syncobj signal op will create a signal PT to tail of signal pt 
list.
b. normal syncobj wait op will create a wait pt with last signal point, and 
this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)
4. fix timeline path issues.
5. write a timeline test in libdrm

v5: (Christian)
1. semaphore is called syncobj in kernel side.
2. don't need 'timeline' characters in some function name.
3. keep syncobj cb

normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
timeline syncobj is tested by ./amdgpu_test -s 9

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_syncobj.c  | 294 ++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
 include/drm/drm_syncobj.h  |  62 +++--
 include/uapi/drm/drm.h |   1 +
 4 files changed, 292 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e9ce623d049e..e78d076f2703 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@
 #include "drm_internal.h"
 #include 
 
+/* merge normal syncobj to timeline syncobj, the point interval is 1 */
+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
 struct drm_syncobj_stub_fence {
struct dma_fence base;
spinlock_t lock;
@@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops 
= {
.release = drm_syncobj_stub_fence_release,
 };
 
+struct drm_syncobj_signal_pt {
+   struct dma_fence_array *base;
+   u64value;
+   struct list_head list;
+};
 
 /**
  * drm_syncobj_find - lookup and reference a sync object.
@@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct 
drm_syncobj *syncobj,
 {
int ret;
 
-   *fence = drm_syncobj_fence_get(syncobj);
+   ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
if (*fence)
return 1;
 
@@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct 
drm_syncobj *syncobj,
 * have the lock, try one more time just to be sure we don't add a
 * callback when a fence has already been set.
 */
-   if (syncobj->fence) {
-   *fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
-
lockdep_is_held(>lock)));
-   ret = 1;
+   if (fence) {
+   drm_syncobj_search_fence(syncobj, 0, 0, fence);
+   if (*fence)
+   ret = 1;
} else {
*fence = NULL;
drm_syncobj_add_callback_locked(syncobj, cb, func);
@@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct drm_syncobj 
*syncobj,