RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-14 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Koenig, Christian
> Sent: Friday, September 14, 2018 3:27 PM
> To: Zhou, David(ChunMing) ; Zhou,
> David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; amd-gfx@lists.freedesktop.org; Daniel Vetter
> 
> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> 
> Am 14.09.2018 um 05:59 schrieb zhoucm1:
> >
> >
> > On 2018年09月14日 11:14, zhoucm1 wrote:
> >>
> >>
> >> On 2018年09月13日 18:22, Christian König wrote:
> >>> Am 13.09.2018 um 11:35 schrieb Zhou, David(ChunMing):
> >>>>
> >>>>> -Original Message-
> >>>>> From: Koenig, Christian
> >>>>> Sent: Thursday, September 13, 2018 5:20 PM
> >>>>> To: Zhou, David(ChunMing) ; dri-
> >>>>> de...@lists.freedesktop.org
> >>>>> Cc: Dave Airlie ; Rakos, Daniel
> >>>>> ; amd-gfx@lists.freedesktop.org
> >>>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> >>>>>
> >>>>> Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):
> >>>>>>> -Original Message-----
> >>>>>>> From: Christian König 
> >>>>>>> Sent: Thursday, September 13, 2018 4:50 PM
> >>>>>>> To: Zhou, David(ChunMing) ; Koenig,
> >>>>>>> Christian ;
> >>>>>>> dri-de...@lists.freedesktop.org
> >>>>>>> Cc: Dave Airlie ; Rakos, Daniel
> >>>>>>> ; amd-gfx@lists.freedesktop.org
> >>>>>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support
> >>>>>>> v4
> >>>>>>>
> >>>>>>> Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):
> >>>>>>>>> -Original Message-
> >>>>>>>>> From: Koenig, Christian
> >>>>>>>>> Sent: Thursday, September 13, 2018 2:56 PM
> >>>>>>>>> To: Zhou, David(ChunMing) ; Zhou,
> >>>>>>>>> David(ChunMing) ; dri-
> >>>>>>>>> de...@lists.freedesktop.org
> >>>>>>>>> Cc: Dave Airlie ; Rakos, Daniel
> >>>>>>>>> ; amd-gfx@lists.freedesktop.org
> >>>>>>>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline
> >>>>>>>>> support v4
> >>>>>>>>>
> >>>>>>>>> Am 13.09.2018 um 04:15 schrieb zhoucm1:
> >>>>>>>>>> On 2018年09月12日 19:05, Christian König wrote:
> >>>>>>>>>>>>>>> [SNIP]
> >>>>>>>>>>>>>>> +static void
> >>>>>>>>>>>>>>> +drm_syncobj_find_signal_pt_for_wait_pt(struct
> >>>>>>>>>>>>>>> drm_syncobj *syncobj,
> >>>>>>>>>>>>>>> +   struct drm_syncobj_wait_pt
> >>>>>>>>>>>>>>> +*wait_pt) {
> >>>>>>>>>>>>>> That whole approach still looks horrible complicated to me.
> >>>>>>>>>>>> It's already very close to what you said before.
> >>>>>>>>>>>>
> >>>>>>>>>>>>>> Especially the separation of signal and wait pt is
> >>>>>>>>>>>>>> completely unnecessary as far as I can see.
> >>>>>>>>>>>>>> When a wait pt is requested we just need to search for
> >>>>>>>>>>>>>> the signal point which it will trigger.
> >>>>>>>>>>>> Yeah, I tried this, but when I implement cpu wait ioctl on
> >>>>>>>>>>>> specific point, we need a advanced wait pt fence,
> >>>>>>>>>>>> otherwise, we could still need old syncobj cb.
> >>>>>>>>>>> Why? I mean you just need to call drm_syncobj_find_fence()
> >>>>>>>>>>> and
> >>>>>>> when
> >>>>>>>>>>> that one returns NULL you use wait_event_*() to wait for a
> >>>>>>>>>>> signal point >= your wait point to appear and tr

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-14 Thread Christian König

Am 14.09.2018 um 05:59 schrieb zhoucm1:



On 2018年09月14日 11:14, zhoucm1 wrote:



On 2018年09月13日 18:22, Christian König wrote:

Am 13.09.2018 um 11:35 schrieb Zhou, David(ChunMing):



-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 5:20 PM
To: Zhou, David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):

-Original Message-
From: Christian König 
Sent: Thursday, September 13, 2018 4:50 PM
To: Zhou, David(ChunMing) ; Koenig, Christian
; dri-de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):

-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 2:56 PM
To: Zhou, David(ChunMing) ; Zhou,
David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline 
support v4


Am 13.09.2018 um 04:15 schrieb zhoucm1:

On 2018年09月12日 19:05, Christian König wrote:

[SNIP]
+static void drm_syncobj_find_signal_pt_for_wait_pt(struct
drm_syncobj *syncobj,
+   struct drm_syncobj_wait_pt
+*wait_pt) {

That whole approach still looks horrible complicated to me.

It's already very close to what you said before.

Especially the separation of signal and wait pt is 
completely

unnecessary as far as I can see.
When a wait pt is requested we just need to search for the
signal point which it will trigger.

Yeah, I tried this, but when I implement cpu wait ioctl on
specific point, we need a advanced wait pt fence, 
otherwise, we

could still need old syncobj cb.

Why? I mean you just need to call drm_syncobj_find_fence() and

when
that one returns NULL you use wait_event_*() to wait for a 
signal

point >= your wait point to appear and try again.
e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC 
have

no fence yet, as you said, during drm_syncobj_find_fence(A) is
working on wait_event, syncobjB and syncobjC could already be
signaled, then we don't know which one is first signaled, 
which is

need when wait ioctl returns.
I don't really see a problem with that. When you wait for the 
first

one you need to wait for A,B,C at the same time anyway.

So what you do is to register a fence callback on the fences you
already have and for the syncobj which doesn't yet have a 
fence you

make sure that they wake up your thread when they get one.

So essentially exactly what 
drm_syncobj_fence_get_or_add_callback()

already does today.

So do you mean we need still use old syncobj CB for that?

Yes, as far as I can see it should work.


    Advanced wait pt is bad?

Well it isn't bad, I just don't see any advantage in it.

The advantage is to replace old syncobj cb.


The existing mechanism
should already be able to handle that.
I thought more a bit, we don't that mechanism at all, if use 
advanced wait
pt, we can easily use fence array to achieve it for wait ioctl, we 
should use
kernel existing feature as much as possible, not invent another, 
shouldn't we?

I remember  you said  it before.

Yeah, but the syncobj cb is an existing feature.
This is obviously a workaround when doing for wait ioctl, Do you 
see it used in other place?



And I absolutely don't see a
need to modify that and replace it with something far more complex.
The wait ioctl is simplified much more by fence array, not complex, 
and we just need  to allocate a wait pt.  If keeping old syncobj cb 
workaround, all wait pt logic still is there, just save allocation 
and wait pt handling, in fact, which part isn't complex at all. But 
compare with ugly syncobj cb, which is simpler.


I strongly disagree on that. You just need to extend the syncobj cb 
with the sequence number and you are done.


We could clean that up in the long term by adding some wait_multi 
event macro, but for now just adding the sequence number should do 
the trick.


Quote from Daniel Vetter comment when v1, "

Specifically for this stuff here having unified future fence semantics
will allow drivers to do clever stuff with them.

"
I think the advanced wait pt is a similar concept as 'future fence' 
what Daniel Vetter said before, which obviously a right direction.



Anyway, I will change the patch as you like if no other comment, so 
that the patch can pass soon.
When I try to remove wait pt future fence, I encounter another 
problem, drm_syncobj_find_fence cannot get a fence if signal pt 
already is collected as garbage, then CS will report error, any idea 
for that?


Well when the signal pt is already garbage collected you know that it is 
already signaled. So you can just return a dummy fence.


I actually thought that this was 

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-14 Thread Christian König

Am 14.09.2018 um 09:46 schrieb Zhou, David(ChunMing):



-Original Message-
From: Koenig, Christian
Sent: Friday, September 14, 2018 3:27 PM
To: Zhou, David(ChunMing) ; Zhou,
David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org; Daniel Vetter

Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 14.09.2018 um 05:59 schrieb zhoucm1:


On 2018年09月14日 11:14, zhoucm1 wrote:


On 2018年09月13日 18:22, Christian König wrote:

Am 13.09.2018 um 11:35 schrieb Zhou, David(ChunMing):

-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 5:20 PM
To: Zhou, David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):

-Original Message-
From: Christian König 
Sent: Thursday, September 13, 2018 4:50 PM
To: Zhou, David(ChunMing) ; Koenig,
Christian ;
dri-de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support
v4

Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):

-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 2:56 PM
To: Zhou, David(ChunMing) ; Zhou,
David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline
support v4

Am 13.09.2018 um 04:15 schrieb zhoucm1:

On 2018年09月12日 19:05, Christian König wrote:

[SNIP]
+static void
+drm_syncobj_find_signal_pt_for_wait_pt(struct
drm_syncobj *syncobj,
+   struct drm_syncobj_wait_pt
+*wait_pt) {

That whole approach still looks horrible complicated to me.

It's already very close to what you said before.


Especially the separation of signal and wait pt is
completely unnecessary as far as I can see.
When a wait pt is requested we just need to search for
the signal point which it will trigger.

Yeah, I tried this, but when I implement cpu wait ioctl on
specific point, we need a advanced wait pt fence,
otherwise, we could still need old syncobj cb.

Why? I mean you just need to call drm_syncobj_find_fence()
and

when

that one returns NULL you use wait_event_*() to wait for a
signal point >= your wait point to appear and try again.

e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC
have no fence yet, as you said, during
drm_syncobj_find_fence(A) is working on wait_event,

syncobjB

and syncobjC could already be signaled, then we don't know
which one is first signaled, which is need when wait ioctl
returns.

I don't really see a problem with that. When you wait for the
first one you need to wait for A,B,C at the same time anyway.

So what you do is to register a fence callback on the fences
you already have and for the syncobj which doesn't yet have a
fence you make sure that they wake up your thread when they
get one.

So essentially exactly what
drm_syncobj_fence_get_or_add_callback()
already does today.

So do you mean we need still use old syncobj CB for that?

Yes, as far as I can see it should work.


     Advanced wait pt is bad?

Well it isn't bad, I just don't see any advantage in it.

The advantage is to replace old syncobj cb.


The existing mechanism
should already be able to handle that.

I thought more a bit, we don't that mechanism at all, if use
advanced wait

pt, we can easily use fence array to achieve it for wait ioctl, we
should use kernel existing feature as much as possible, not invent
another, shouldn't we?
I remember  you said  it before.

Yeah, but the syncobj cb is an existing feature.

This is obviously a workaround when doing for wait ioctl, Do you
see it used in other place?


And I absolutely don't see a
need to modify that and replace it with something far more complex.

The wait ioctl is simplified much more by fence array, not complex,
and we just need  to allocate a wait pt.  If keeping old syncobj cb
workaround, all wait pt logic still is there, just save allocation
and wait pt handling, in fact, which part isn't complex at all. But
compare with ugly syncobj cb, which is simpler.

I strongly disagree on that. You just need to extend the syncobj cb
with the sequence number and you are done.

We could clean that up in the long term by adding some wait_multi
event macro, but for now just adding the sequence number should do
the trick.

Quote from Daniel Vetter comment when v1, "

Specifically for this stuff here having unified future fence
semantics will allow drivers to do clever stuff with them.

"
I think the advanced wait pt is a similar concept as 'future fence'
what Daniel Vetter said before, which obviously a right direction.


Anyway, I will change the patch as you like if no other comment, so
that the patch can pass soon.

When I try to remove w

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-14 Thread zhoucm1



On 2018年09月13日 18:22, Christian König wrote:

Am 13.09.2018 um 11:35 schrieb Zhou, David(ChunMing):



-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 5:20 PM
To: Zhou, David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):

-Original Message-
From: Christian König 
Sent: Thursday, September 13, 2018 4:50 PM
To: Zhou, David(ChunMing) ; Koenig, Christian
; dri-de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):

-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 2:56 PM
To: Zhou, David(ChunMing) ; Zhou,
David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 04:15 schrieb zhoucm1:

On 2018年09月12日 19:05, Christian König wrote:

[SNIP]
+static void drm_syncobj_find_signal_pt_for_wait_pt(struct
drm_syncobj *syncobj,
+   struct drm_syncobj_wait_pt
+*wait_pt) {

That whole approach still looks horrible complicated to me.

It's already very close to what you said before.


Especially the separation of signal and wait pt is completely
unnecessary as far as I can see.
When a wait pt is requested we just need to search for the
signal point which it will trigger.

Yeah, I tried this, but when I implement cpu wait ioctl on
specific point, we need a advanced wait pt fence, otherwise, we
could still need old syncobj cb.

Why? I mean you just need to call drm_syncobj_find_fence() and

when

that one returns NULL you use wait_event_*() to wait for a signal
point >= your wait point to appear and try again.

e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have
no fence yet, as you said, during drm_syncobj_find_fence(A) is
working on wait_event, syncobjB and syncobjC could already be
signaled, then we don't know which one is first signaled, which is
need when wait ioctl returns.

I don't really see a problem with that. When you wait for the first
one you need to wait for A,B,C at the same time anyway.

So what you do is to register a fence callback on the fences you
already have and for the syncobj which doesn't yet have a fence you
make sure that they wake up your thread when they get one.

So essentially exactly what drm_syncobj_fence_get_or_add_callback()
already does today.

So do you mean we need still use old syncobj CB for that?

Yes, as far as I can see it should work.


    Advanced wait pt is bad?

Well it isn't bad, I just don't see any advantage in it.

The advantage is to replace old syncobj cb.


The existing mechanism
should already be able to handle that.
I thought more a bit, we don't that mechanism at all, if use 
advanced wait
pt, we can easily use fence array to achieve it for wait ioctl, we 
should use
kernel existing feature as much as possible, not invent another, 
shouldn't we?

I remember  you said  it before.

Yeah, but the syncobj cb is an existing feature.
This is obviously a workaround when doing for wait ioctl, Do you see 
it used in other place?



And I absolutely don't see a
need to modify that and replace it with something far more complex.
The wait ioctl is simplified much more by fence array, not complex, 
and we just need  to allocate a wait pt.  If keeping old syncobj cb 
workaround, all wait pt logic still is there, just save allocation 
and wait pt handling, in fact, which part isn't complex at all. But 
compare with ugly syncobj cb, which is simpler.


I strongly disagree on that. You just need to extend the syncobj cb 
with the sequence number and you are done.


We could clean that up in the long term by adding some wait_multi 
event macro, but for now just adding the sequence number should do the 
trick.


Quote from Daniel Vetter comment when v1, "

Specifically for this stuff here having unified future fence semantics
will allow drivers to do clever stuff with them.

"
I think the advanced wait pt is a similar concept as 'future fence' what 
Daniel Vetter said before, which obviously a right direction.



Anyway, I will change the patch as you like if no other comment, so that 
the patch can pass soon.


Thanks,
David Zhou


Regards,
Christian.



Thanks,
David Zhou

Regards,
Christian.


Thanks,
David Zhou

Christian.


Thanks,
David Zhou

Regards,
Christian.


Back to my implementation, it already fixes all your concerns
before, and can be able to easily used in wait_ioctl. When you
feel that is complicated, I guess that is because we merged all
logic to that and much clean up in one patch. In fact, it already
is very simple, timeline_init/fini, create signal/

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-14 Thread zhoucm1



On 2018年09月14日 11:14, zhoucm1 wrote:



On 2018年09月13日 18:22, Christian König wrote:

Am 13.09.2018 um 11:35 schrieb Zhou, David(ChunMing):



-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 5:20 PM
To: Zhou, David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):

-Original Message-
From: Christian König 
Sent: Thursday, September 13, 2018 4:50 PM
To: Zhou, David(ChunMing) ; Koenig, Christian
; dri-de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):

-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 2:56 PM
To: Zhou, David(ChunMing) ; Zhou,
David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 04:15 schrieb zhoucm1:

On 2018年09月12日 19:05, Christian König wrote:

[SNIP]
+static void drm_syncobj_find_signal_pt_for_wait_pt(struct
drm_syncobj *syncobj,
+   struct drm_syncobj_wait_pt
+*wait_pt) {

That whole approach still looks horrible complicated to me.

It's already very close to what you said before.


Especially the separation of signal and wait pt is completely
unnecessary as far as I can see.
When a wait pt is requested we just need to search for the
signal point which it will trigger.

Yeah, I tried this, but when I implement cpu wait ioctl on
specific point, we need a advanced wait pt fence, otherwise, we
could still need old syncobj cb.

Why? I mean you just need to call drm_syncobj_find_fence() and

when
that one returns NULL you use wait_event_*() to wait for a 
signal

point >= your wait point to appear and try again.
e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC 
have

no fence yet, as you said, during drm_syncobj_find_fence(A) is
working on wait_event, syncobjB and syncobjC could already be
signaled, then we don't know which one is first signaled, 
which is

need when wait ioctl returns.
I don't really see a problem with that. When you wait for the 
first

one you need to wait for A,B,C at the same time anyway.

So what you do is to register a fence callback on the fences you
already have and for the syncobj which doesn't yet have a fence 
you

make sure that they wake up your thread when they get one.

So essentially exactly what 
drm_syncobj_fence_get_or_add_callback()

already does today.

So do you mean we need still use old syncobj CB for that?

Yes, as far as I can see it should work.


    Advanced wait pt is bad?

Well it isn't bad, I just don't see any advantage in it.

The advantage is to replace old syncobj cb.


The existing mechanism
should already be able to handle that.
I thought more a bit, we don't that mechanism at all, if use 
advanced wait
pt, we can easily use fence array to achieve it for wait ioctl, we 
should use
kernel existing feature as much as possible, not invent another, 
shouldn't we?

I remember  you said  it before.

Yeah, but the syncobj cb is an existing feature.
This is obviously a workaround when doing for wait ioctl, Do you see 
it used in other place?



And I absolutely don't see a
need to modify that and replace it with something far more complex.
The wait ioctl is simplified much more by fence array, not complex, 
and we just need  to allocate a wait pt.  If keeping old syncobj cb 
workaround, all wait pt logic still is there, just save allocation 
and wait pt handling, in fact, which part isn't complex at all. But 
compare with ugly syncobj cb, which is simpler.


I strongly disagree on that. You just need to extend the syncobj cb 
with the sequence number and you are done.


We could clean that up in the long term by adding some wait_multi 
event macro, but for now just adding the sequence number should do 
the trick.


Quote from Daniel Vetter comment when v1, "

Specifically for this stuff here having unified future fence semantics
will allow drivers to do clever stuff with them.

"
I think the advanced wait pt is a similar concept as 'future fence' 
what Daniel Vetter said before, which obviously a right direction.



Anyway, I will change the patch as you like if no other comment, so 
that the patch can pass soon.
When I try to remove wait pt future fence, I encounter another problem, 
drm_syncobj_find_fence cannot get a fence if signal pt already is 
collected as garbage, then CS will report error, any idea for that?
I still think the future fence is right thing, Could you give futher 
thought on it again? Otherwise, we could need various workarounds.


Thanks,
David Zhou


Thanks,
David Zhou


Regards,
Christian.



Thank

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Christian König

Am 13.09.2018 um 11:35 schrieb Zhou, David(ChunMing):



-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 5:20 PM
To: Zhou, David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):

-Original Message-
From: Christian König 
Sent: Thursday, September 13, 2018 4:50 PM
To: Zhou, David(ChunMing) ; Koenig, Christian
; dri-de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):

-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 2:56 PM
To: Zhou, David(ChunMing) ; Zhou,
David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 04:15 schrieb zhoucm1:

On 2018年09月12日 19:05, Christian König wrote:

[SNIP]
+static void drm_syncobj_find_signal_pt_for_wait_pt(struct
drm_syncobj *syncobj,
+   struct drm_syncobj_wait_pt
+*wait_pt) {

That whole approach still looks horrible complicated to me.

It's already very close to what you said before.


Especially the separation of signal and wait pt is completely
unnecessary as far as I can see.
When a wait pt is requested we just need to search for the
signal point which it will trigger.

Yeah, I tried this, but when I implement cpu wait ioctl on
specific point, we need a advanced wait pt fence, otherwise, we
could still need old syncobj cb.

Why? I mean you just need to call drm_syncobj_find_fence() and

when

that one returns NULL you use wait_event_*() to wait for a signal
point >= your wait point to appear and try again.

e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have
no fence yet, as you said, during drm_syncobj_find_fence(A) is
working on wait_event, syncobjB and syncobjC could already be
signaled, then we don't know which one is first signaled, which is
need when wait ioctl returns.

I don't really see a problem with that. When you wait for the first
one you need to wait for A,B,C at the same time anyway.

So what you do is to register a fence callback on the fences you
already have and for the syncobj which doesn't yet have a fence you
make sure that they wake up your thread when they get one.

So essentially exactly what drm_syncobj_fence_get_or_add_callback()
already does today.

So do you mean we need still use old syncobj CB for that?

Yes, as far as I can see it should work.


Advanced wait pt is bad?

Well it isn't bad, I just don't see any advantage in it.

The advantage is to replace old syncobj cb.


The existing mechanism
should already be able to handle that.

I thought more a bit, we don't that mechanism at all, if use advanced wait

pt, we can easily use fence array to achieve it for wait ioctl, we should use
kernel existing feature as much as possible, not invent another, shouldn't we?
I remember  you said  it before.

Yeah, but the syncobj cb is an existing feature.

This is obviously a workaround when doing for wait ioctl, Do you see it used in 
other place?


And I absolutely don't see a
need to modify that and replace it with something far more complex.

The wait ioctl is simplified much more by fence array, not complex, and we just 
need  to allocate a wait pt.  If keeping old syncobj cb workaround, all wait pt 
logic still is there, just save allocation and wait pt handling, in fact, which 
part isn't complex at all. But compare with ugly syncobj cb, which is simpler.


I strongly disagree on that. You just need to extend the syncobj cb with 
the sequence number and you are done.


We could clean that up in the long term by adding some wait_multi event 
macro, but for now just adding the sequence number should do the trick.


Regards,
Christian.



Thanks,
David Zhou

Regards,
Christian.


Thanks,
David Zhou

Christian.


Thanks,
David Zhou

Regards,
Christian.


Back to my implementation, it already fixes all your concerns
before, and can be able to easily used in wait_ioctl. When you
feel that is complicated, I guess that is because we merged all
logic to that and much clean up in one patch. In fact, it already
is very simple, timeline_init/fini, create signal/wait_pt, find
signal_pt for wait_pt, garbage collection, just them.

Thanks,
David Zhou

Regards,
Christian.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Koenig, Christian
> Sent: Thursday, September 13, 2018 5:20 PM
> To: Zhou, David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> 
> Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):
> >
> >> -Original Message-
> >> From: Christian König 
> >> Sent: Thursday, September 13, 2018 4:50 PM
> >> To: Zhou, David(ChunMing) ; Koenig, Christian
> >> ; dri-de...@lists.freedesktop.org
> >> Cc: Dave Airlie ; Rakos, Daniel
> >> ; amd-gfx@lists.freedesktop.org
> >> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> >>
> >> Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):
> >>>> -Original Message-
> >>>> From: Koenig, Christian
> >>>> Sent: Thursday, September 13, 2018 2:56 PM
> >>>> To: Zhou, David(ChunMing) ; Zhou,
> >>>> David(ChunMing) ; dri-
> >>>> de...@lists.freedesktop.org
> >>>> Cc: Dave Airlie ; Rakos, Daniel
> >>>> ; amd-gfx@lists.freedesktop.org
> >>>> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> >>>>
> >>>> Am 13.09.2018 um 04:15 schrieb zhoucm1:
> >>>>> On 2018年09月12日 19:05, Christian König wrote:
> >>>>>>>>>> [SNIP]
> >>>>>>>>>> +static void drm_syncobj_find_signal_pt_for_wait_pt(struct
> >>>>>>>>>> drm_syncobj *syncobj,
> >>>>>>>>>> +   struct drm_syncobj_wait_pt
> >>>>>>>>>> +*wait_pt) {
> >>>>>>>>> That whole approach still looks horrible complicated to me.
> >>>>>>> It's already very close to what you said before.
> >>>>>>>
> >>>>>>>>> Especially the separation of signal and wait pt is completely
> >>>>>>>>> unnecessary as far as I can see.
> >>>>>>>>> When a wait pt is requested we just need to search for the
> >>>>>>>>> signal point which it will trigger.
> >>>>>>> Yeah, I tried this, but when I implement cpu wait ioctl on
> >>>>>>> specific point, we need a advanced wait pt fence, otherwise, we
> >>>>>>> could still need old syncobj cb.
> >>>>>> Why? I mean you just need to call drm_syncobj_find_fence() and
> >> when
> >>>>>> that one returns NULL you use wait_event_*() to wait for a signal
> >>>>>> point >= your wait point to appear and try again.
> >>>>> e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have
> >>>>> no fence yet, as you said, during drm_syncobj_find_fence(A) is
> >>>>> working on wait_event, syncobjB and syncobjC could already be
> >>>>> signaled, then we don't know which one is first signaled, which is
> >>>>> need when wait ioctl returns.
> >>>> I don't really see a problem with that. When you wait for the first
> >>>> one you need to wait for A,B,C at the same time anyway.
> >>>>
> >>>> So what you do is to register a fence callback on the fences you
> >>>> already have and for the syncobj which doesn't yet have a fence you
> >>>> make sure that they wake up your thread when they get one.
> >>>>
> >>>> So essentially exactly what drm_syncobj_fence_get_or_add_callback()
> >>>> already does today.
> >>> So do you mean we need still use old syncobj CB for that?
> >> Yes, as far as I can see it should work.
> >>
> >>>Advanced wait pt is bad?
> >> Well it isn't bad, I just don't see any advantage in it.
> >
> > The advantage is to replace old syncobj cb.
> >
> >> The existing mechanism
> >> should already be able to handle that.
> > I thought more a bit, we don't that mechanism at all, if use advanced wait
> pt, we can easily use fence array to achieve it for wait ioctl, we should use
> kernel existing feature as much as possible, not invent another, shouldn't we?
> I remember  you said  it before.
> 
> Yeah, but the syncobj cb is an existing feature.

This is obviously a workaround when doing for wait ioctl, Do you see it used in 
other place?

> And I absolutely don't

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Christian König

Am 13.09.2018 um 11:11 schrieb Zhou, David(ChunMing):



-Original Message-
From: Christian König 
Sent: Thursday, September 13, 2018 4:50 PM
To: Zhou, David(ChunMing) ; Koenig, Christian
; dri-de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):

-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 2:56 PM
To: Zhou, David(ChunMing) ; Zhou,
David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 04:15 schrieb zhoucm1:

On 2018年09月12日 19:05, Christian König wrote:

[SNIP]
+static void drm_syncobj_find_signal_pt_for_wait_pt(struct
drm_syncobj *syncobj,
+   struct drm_syncobj_wait_pt
+*wait_pt) {

That whole approach still looks horrible complicated to me.

It's already very close to what you said before.


Especially the separation of signal and wait pt is completely
unnecessary as far as I can see.
When a wait pt is requested we just need to search for the
signal point which it will trigger.

Yeah, I tried this, but when I implement cpu wait ioctl on
specific point, we need a advanced wait pt fence, otherwise, we
could still need old syncobj cb.

Why? I mean you just need to call drm_syncobj_find_fence() and

when

that one returns NULL you use wait_event_*() to wait for a signal
point >= your wait point to appear and try again.

e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have
no fence yet, as you said, during drm_syncobj_find_fence(A) is
working on wait_event, syncobjB and syncobjC could already be
signaled, then we don't know which one is first signaled, which is
need when wait ioctl returns.

I don't really see a problem with that. When you wait for the first
one you need to wait for A,B,C at the same time anyway.

So what you do is to register a fence callback on the fences you
already have and for the syncobj which doesn't yet have a fence you
make sure that they wake up your thread when they get one.

So essentially exactly what drm_syncobj_fence_get_or_add_callback()
already does today.

So do you mean we need still use old syncobj CB for that?

Yes, as far as I can see it should work.


   Advanced wait pt is bad?

Well it isn't bad, I just don't see any advantage in it.


The advantage is to replace old syncobj cb.


The existing mechanism
should already be able to handle that.

I thought more a bit, we don't that mechanism at all, if use advanced wait pt, 
we can easily use fence array to achieve it for wait ioctl, we should use 
kernel existing feature as much as possible, not invent another, shouldn't we?  
I remember  you said  it before.


Yeah, but the syncobj cb is an existing feature. And I absolutely don't 
see a need to modify that and replace it with something far more complex.


Regards,
Christian.



Thanks,
David Zhou

Christian.


Thanks,
David Zhou

Regards,
Christian.


Back to my implementation, it already fixes all your concerns
before, and can be able to easily used in wait_ioctl. When you feel
that is complicated, I guess that is because we merged all logic to
that and much clean up in one patch. In fact, it already is very
simple, timeline_init/fini, create signal/wait_pt, find signal_pt
for wait_pt, garbage collection, just them.

Thanks,
David Zhou

Regards,
Christian.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Christian König 
> Sent: Thursday, September 13, 2018 4:50 PM
> To: Zhou, David(ChunMing) ; Koenig, Christian
> ; dri-de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> 
> Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):
> >
> >> -Original Message-
> >> From: Koenig, Christian
> >> Sent: Thursday, September 13, 2018 2:56 PM
> >> To: Zhou, David(ChunMing) ; Zhou,
> >> David(ChunMing) ; dri-
> >> de...@lists.freedesktop.org
> >> Cc: Dave Airlie ; Rakos, Daniel
> >> ; amd-gfx@lists.freedesktop.org
> >> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> >>
> >> Am 13.09.2018 um 04:15 schrieb zhoucm1:
> >>> On 2018年09月12日 19:05, Christian König wrote:
> >>>>>>>> [SNIP]
> >>>>>>>> +static void drm_syncobj_find_signal_pt_for_wait_pt(struct
> >>>>>>>> drm_syncobj *syncobj,
> >>>>>>>> +   struct drm_syncobj_wait_pt
> >>>>>>>> +*wait_pt) {
> >>>>>>> That whole approach still looks horrible complicated to me.
> >>>>> It's already very close to what you said before.
> >>>>>
> >>>>>>> Especially the separation of signal and wait pt is completely
> >>>>>>> unnecessary as far as I can see.
> >>>>>>> When a wait pt is requested we just need to search for the
> >>>>>>> signal point which it will trigger.
> >>>>> Yeah, I tried this, but when I implement cpu wait ioctl on
> >>>>> specific point, we need a advanced wait pt fence, otherwise, we
> >>>>> could still need old syncobj cb.
> >>>> Why? I mean you just need to call drm_syncobj_find_fence() and
> when
> >>>> that one returns NULL you use wait_event_*() to wait for a signal
> >>>> point >= your wait point to appear and try again.
> >>> e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have
> >>> no fence yet, as you said, during drm_syncobj_find_fence(A) is
> >>> working on wait_event, syncobjB and syncobjC could already be
> >>> signaled, then we don't know which one is first signaled, which is
> >>> need when wait ioctl returns.
> >> I don't really see a problem with that. When you wait for the first
> >> one you need to wait for A,B,C at the same time anyway.
> >>
> >> So what you do is to register a fence callback on the fences you
> >> already have and for the syncobj which doesn't yet have a fence you
> >> make sure that they wake up your thread when they get one.
> >>
> >> So essentially exactly what drm_syncobj_fence_get_or_add_callback()
> >> already does today.
> > So do you mean we need still use old syncobj CB for that?
> 
> Yes, as far as I can see it should work.
> 
> >   Advanced wait pt is bad?
> 
> Well it isn't bad, I just don't see any advantage in it.


The advantage is to replace old syncobj cb.

> The existing mechanism
> should already be able to handle that.

I thought more a bit, we don't that mechanism at all, if use advanced wait pt, 
we can easily use fence array to achieve it for wait ioctl, we should use 
kernel existing feature as much as possible, not invent another, shouldn't we?  
I remember  you said  it before.

Thanks,
David Zhou
> 
> Christian.
> 
> >
> > Thanks,
> > David Zhou
> >> Regards,
> >> Christian.
> >>
> >>> Back to my implementation, it already fixes all your concerns
> >>> before, and can be able to easily used in wait_ioctl. When you feel
> >>> that is complicated, I guess that is because we merged all logic to
> >>> that and much clean up in one patch. In fact, it already is very
> >>> simple, timeline_init/fini, create signal/wait_pt, find signal_pt
> >>> for wait_pt, garbage collection, just them.
> >>>
> >>> Thanks,
> >>> David Zhou
> >>>> Regards,
> >>>> Christian.
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Christian König

Am 13.09.2018 um 09:43 schrieb Zhou, David(ChunMing):



-Original Message-
From: Koenig, Christian
Sent: Thursday, September 13, 2018 2:56 PM
To: Zhou, David(ChunMing) ; Zhou,
David(ChunMing) ; dri-
de...@lists.freedesktop.org
Cc: Dave Airlie ; Rakos, Daniel
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

Am 13.09.2018 um 04:15 schrieb zhoucm1:

On 2018年09月12日 19:05, Christian König wrote:

[SNIP]
+static void drm_syncobj_find_signal_pt_for_wait_pt(struct
drm_syncobj *syncobj,
+   struct drm_syncobj_wait_pt *wait_pt)
+{

That whole approach still looks horrible complicated to me.

It's already very close to what you said before.


Especially the separation of signal and wait pt is completely
unnecessary as far as I can see.
When a wait pt is requested we just need to search for the signal
point which it will trigger.

Yeah, I tried this, but when I implement cpu wait ioctl on specific
point, we need a advanced wait pt fence, otherwise, we could still
need old syncobj cb.

Why? I mean you just need to call drm_syncobj_find_fence() and when
that one returns NULL you use wait_event_*() to wait for a signal
point >= your wait point to appear and try again.

e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have no
fence yet, as you said, during drm_syncobj_find_fence(A) is working on
wait_event, syncobjB and syncobjC could already be signaled, then we
don't know which one is first signaled, which is need when wait ioctl
returns.

I don't really see a problem with that. When you wait for the first one you
need to wait for A,B,C at the same time anyway.

So what you do is to register a fence callback on the fences you already have
and for the syncobj which doesn't yet have a fence you make sure that they
wake up your thread when they get one.

So essentially exactly what drm_syncobj_fence_get_or_add_callback()
already does today.

So do you mean we need still use old syncobj CB for that?


Yes, as far as I can see it should work.


  Advanced wait pt is bad?


Well it isn't bad, I just don't see any advantage in it. The existing 
mechanism should already be able to handle that.


Christian.



Thanks,
David Zhou

Regards,
Christian.


Back to my implementation, it already fixes all your concerns before,
and can be able to easily used in wait_ioctl. When you feel that is
complicated, I guess that is because we merged all logic to that and
much clean up in one patch. In fact, it already is very simple,
timeline_init/fini, create signal/wait_pt, find signal_pt for wait_pt,
garbage collection, just them.

Thanks,
David Zhou

Regards,
Christian.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Zhou, David(ChunMing)


> -Original Message-
> From: Koenig, Christian
> Sent: Thursday, September 13, 2018 2:56 PM
> To: Zhou, David(ChunMing) ; Zhou,
> David(ChunMing) ; dri-
> de...@lists.freedesktop.org
> Cc: Dave Airlie ; Rakos, Daniel
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4
> 
> Am 13.09.2018 um 04:15 schrieb zhoucm1:
> > On 2018年09月12日 19:05, Christian König wrote:
> >>>>>
> >>>>>> [SNIP]
> >>>>>> +static void drm_syncobj_find_signal_pt_for_wait_pt(struct
> >>>>>> drm_syncobj *syncobj,
> >>>>>> +   struct drm_syncobj_wait_pt *wait_pt)
> >>>>>> +{
> >>>>>
> >>>>> That whole approach still looks horrible complicated to me.
> >>> It's already very close to what you said before.
> >>>
> >>>>>
> >>>>> Especially the separation of signal and wait pt is completely
> >>>>> unnecessary as far as I can see.
> >>>>> When a wait pt is requested we just need to search for the signal
> >>>>> point which it will trigger.
> >>> Yeah, I tried this, but when I implement cpu wait ioctl on specific
> >>> point, we need a advanced wait pt fence, otherwise, we could still
> >>> need old syncobj cb.
> >>
> >> Why? I mean you just need to call drm_syncobj_find_fence() and when
> >> that one returns NULL you use wait_event_*() to wait for a signal
> >> point >= your wait point to appear and try again.
> > e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have no
> > fence yet, as you said, during drm_syncobj_find_fence(A) is working on
> > wait_event, syncobjB and syncobjC could already be signaled, then we
> > don't know which one is first signaled, which is need when wait ioctl
> > returns.
> 
> I don't really see a problem with that. When you wait for the first one you
> need to wait for A,B,C at the same time anyway.
> 
> So what you do is to register a fence callback on the fences you already have
> and for the syncobj which doesn't yet have a fence you make sure that they
> wake up your thread when they get one.
> 
> So essentially exactly what drm_syncobj_fence_get_or_add_callback()
> already does today.

So do you mean we need still use old syncobj CB for that? Advanced wait pt is 
bad?

Thanks,
David Zhou
> 
> Regards,
> Christian.
> 
> >
> > Back to my implementation, it already fixes all your concerns before,
> > and can be able to easily used in wait_ioctl. When you feel that is
> > complicated, I guess that is because we merged all logic to that and
> > much clean up in one patch. In fact, it already is very simple,
> > timeline_init/fini, create signal/wait_pt, find signal_pt for wait_pt,
> > garbage collection, just them.
> >
> > Thanks,
> > David Zhou
> >>
> >> Regards,
> >> Christian.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-13 Thread Christian König

Am 13.09.2018 um 04:15 schrieb zhoucm1:

On 2018年09月12日 19:05, Christian König wrote:



[SNIP]
+static void drm_syncobj_find_signal_pt_for_wait_pt(struct 
drm_syncobj *syncobj,

+   struct drm_syncobj_wait_pt *wait_pt)
+{


That whole approach still looks horrible complicated to me.

It's already very close to what you said before.



Especially the separation of signal and wait pt is completely 
unnecessary as far as I can see.
When a wait pt is requested we just need to search for the signal 
point which it will trigger.
Yeah, I tried this, but when I implement cpu wait ioctl on specific 
point, we need a advanced wait pt fence, otherwise, we could still 
need old syncobj cb.


Why? I mean you just need to call drm_syncobj_find_fence() and when 
that one returns NULL you use wait_event_*() to wait for a signal 
point >= your wait point to appear and try again.
e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have no 
fence yet, as you said, during drm_syncobj_find_fence(A) is working on 
wait_event, syncobjB and syncobjC could already be signaled, then we 
don't know which one is first signaled, which is need when wait ioctl 
returns.


I don't really see a problem with that. When you wait for the first one 
you need to wait for A,B,C at the same time anyway.


So what you do is to register a fence callback on the fences you already 
have and for the syncobj which doesn't yet have a fence you make sure 
that they wake up your thread when they get one.


So essentially exactly what drm_syncobj_fence_get_or_add_callback() 
already does today.


Regards,
Christian.



Back to my implementation, it already fixes all your concerns before, 
and can be able to easily used in wait_ioctl. When you feel that is 
complicated, I guess that is because we merged all logic to that and 
much clean up in one patch. In fact, it already is very simple, 
timeline_init/fini, create signal/wait_pt, find signal_pt for wait_pt, 
garbage collection, just them.


Thanks,
David Zhou


Regards,
Christian.


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-12 Thread zhoucm1



On 2018年09月12日 19:05, Christian König wrote:

Am 12.09.2018 um 12:20 schrieb zhoucm1:

[SNIP]

Drop the term semaphore here, better use syncobj.
This is from VK_KHR_timeline_semaphore extension describe, not my 
invention, I just quote it. In kernel side, we call syncobj, in UMD, 
they still call semaphore.


Yeah, but we don't care about close source UMD names in the kernel and'
the open source UMD calls it syncobj as well.




[SNIP]
+static void drm_syncobj_find_signal_pt_for_wait_pt(struct 
drm_syncobj *syncobj,

+   struct drm_syncobj_wait_pt *wait_pt)
+{


That whole approach still looks horrible complicated to me.

It's already very close to what you said before.



Especially the separation of signal and wait pt is completely 
unnecessary as far as I can see.
When a wait pt is requested we just need to search for the signal 
point which it will trigger.
Yeah, I tried this, but when I implement cpu wait ioctl on specific 
point, we need a advanced wait pt fence, otherwise, we could still 
need old syncobj cb.


Why? I mean you just need to call drm_syncobj_find_fence() and when 
that one returns NULL you use wait_event_*() to wait for a signal 
point >= your wait point to appear and try again.
e.g. when there are 3 syncobjs(A,B,C) to wait, all syncobjABC have no 
fence yet, as you said, during drm_syncobj_find_fence(A) is working on 
wait_event, syncobjB and syncobjC could already be signaled, then we 
don't know which one is first signaled, which is need when wait ioctl 
returns.


Back to my implementation, it already fixes all your concerns before, 
and can be able to easily used in wait_ioctl. When you feel that is 
complicated, I guess that is because we merged all logic to that and 
much clean up in one patch. In fact, it already is very simple, 
timeline_init/fini, create signal/wait_pt, find signal_pt for wait_pt, 
garbage collection, just them.


Thanks,
David Zhou


Regards,
Christian.




Thanks,
David Zhou


Regards,
Christian.

+    struct drm_syncobj_timeline *timeline = 
>syncobj_timeline;

+    struct drm_syncobj_signal_pt *signal_pt;
+    int ret;
+
+    if (wait_pt->signal_pt_fence) {
+    return;
+    } else if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
+   (wait_pt->value <= timeline->timeline)) {
+    dma_fence_signal(_pt->base.base);
+    rb_erase(_pt->node,
+ >wait_pt_tree);
+    RB_CLEAR_NODE(_pt->node);
+    dma_fence_put(_pt->base.base);
+    return;
+    }
+
+    list_for_each_entry(signal_pt, >signal_pt_list, 
list) {

+    if (wait_pt->value < signal_pt->value)
+    continue;
+    if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
+    (wait_pt->value != signal_pt->value))
+    continue;
+    wait_pt->signal_pt_fence = 
dma_fence_get(_pt->base->base);

+    ret = dma_fence_add_callback(wait_pt->signal_pt_fence,
+ _pt->wait_cb,
+ wait_pt_func);
+    if (ret == -ENOENT) {
+    dma_fence_signal(_pt->base.base);
+    dma_fence_put(wait_pt->signal_pt_fence);
+    wait_pt->signal_pt_fence = NULL;
+    rb_erase(_pt->node,
+ >wait_pt_tree);
+    RB_CLEAR_NODE(_pt->node);
+    dma_fence_put(_pt->base.base);
+    } else if (ret < 0) {
+    dma_fence_put(wait_pt->signal_pt_fence);
+    DRM_ERROR("add callback error!");
+    } else {
+    /* after adding callback, remove from rb tree */
+    rb_erase(_pt->node,
+ >wait_pt_tree);
+    RB_CLEAR_NODE(_pt->node);
+    }
+    return;
+    }
+    /* signaled pt was released */
+    if (!wait_pt->signal_pt_fence && (wait_pt->value <=
+  timeline->signal_point)) {
+    dma_fence_signal(_pt->base.base);
+    rb_erase(_pt->node,
+ >wait_pt_tree);
+    RB_CLEAR_NODE(_pt->node);
+    dma_fence_put(_pt->base.base);
+    }
  }
  -void drm_syncobj_add_callback(struct drm_syncobj *syncobj,
-  struct drm_syncobj_cb *cb,
-  drm_syncobj_func_t func)
+static int drm_syncobj_timeline_create_signal_pt(struct 
drm_syncobj *syncobj,

+ struct dma_fence *fence,
+ u64 point)
  {
+    struct drm_syncobj_signal_pt *signal_pt =
+    kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
+    struct drm_syncobj_signal_pt *tail_pt;
+    struct dma_fence **fences;
+    struct rb_node *node;
+    struct drm_syncobj_wait_pt *tail_wait_pt = NULL;
+    int num_fences = 0;
+    int ret = 0, i;
+
+    if (!signal_pt)
+    return -ENOMEM;
+    if (syncobj->syncobj_timeline.signal_point >= point) {
+    DRM_WARN("A later signal is ready!");
+    goto out;
+    }
+    if (!fence)
+    goto out;
+
+    fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
+    if (!fences)
+    goto out;
+    fences[num_fences++] = 

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-12 Thread Christian König

Am 12.09.2018 um 12:20 schrieb zhoucm1:

[SNIP]

Drop the term semaphore here, better use syncobj.
This is from VK_KHR_timeline_semaphore extension describe, not my 
invention, I just quote it. In kernel side, we call syncobj, in UMD, 
they still call semaphore.


Yeah, but we don't care about close source UMD names in the kernel and 
the open source UMD calls it syncobj as well.





[SNIP]
+static void drm_syncobj_find_signal_pt_for_wait_pt(struct 
drm_syncobj *syncobj,

+   struct drm_syncobj_wait_pt *wait_pt)
+{


That whole approach still looks horrible complicated to me.

It's already very close to what you said before.



Especially the separation of signal and wait pt is completely 
unnecessary as far as I can see.
When a wait pt is requested we just need to search for the signal 
point which it will trigger.
Yeah, I tried this, but when I implement cpu wait ioctl on specific 
point, we need a advanced wait pt fence, otherwise, we could still 
need old syncobj cb.


Why? I mean you just need to call drm_syncobj_find_fence() and when that 
one returns NULL you use wait_event_*() to wait for a signal point >= 
your wait point to appear and try again.


Regards,
Christian.




Thanks,
David Zhou


Regards,
Christian.

+    struct drm_syncobj_timeline *timeline = 
>syncobj_timeline;

+    struct drm_syncobj_signal_pt *signal_pt;
+    int ret;
+
+    if (wait_pt->signal_pt_fence) {
+    return;
+    } else if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
+   (wait_pt->value <= timeline->timeline)) {
+    dma_fence_signal(_pt->base.base);
+    rb_erase(_pt->node,
+ >wait_pt_tree);
+    RB_CLEAR_NODE(_pt->node);
+    dma_fence_put(_pt->base.base);
+    return;
+    }
+
+    list_for_each_entry(signal_pt, >signal_pt_list, list) {
+    if (wait_pt->value < signal_pt->value)
+    continue;
+    if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
+    (wait_pt->value != signal_pt->value))
+    continue;
+    wait_pt->signal_pt_fence = 
dma_fence_get(_pt->base->base);

+    ret = dma_fence_add_callback(wait_pt->signal_pt_fence,
+ _pt->wait_cb,
+ wait_pt_func);
+    if (ret == -ENOENT) {
+    dma_fence_signal(_pt->base.base);
+    dma_fence_put(wait_pt->signal_pt_fence);
+    wait_pt->signal_pt_fence = NULL;
+    rb_erase(_pt->node,
+ >wait_pt_tree);
+    RB_CLEAR_NODE(_pt->node);
+    dma_fence_put(_pt->base.base);
+    } else if (ret < 0) {
+    dma_fence_put(wait_pt->signal_pt_fence);
+    DRM_ERROR("add callback error!");
+    } else {
+    /* after adding callback, remove from rb tree */
+    rb_erase(_pt->node,
+ >wait_pt_tree);
+    RB_CLEAR_NODE(_pt->node);
+    }
+    return;
+    }
+    /* signaled pt was released */
+    if (!wait_pt->signal_pt_fence && (wait_pt->value <=
+  timeline->signal_point)) {
+    dma_fence_signal(_pt->base.base);
+    rb_erase(_pt->node,
+ >wait_pt_tree);
+    RB_CLEAR_NODE(_pt->node);
+    dma_fence_put(_pt->base.base);
+    }
  }
  -void drm_syncobj_add_callback(struct drm_syncobj *syncobj,
-  struct drm_syncobj_cb *cb,
-  drm_syncobj_func_t func)
+static int drm_syncobj_timeline_create_signal_pt(struct 
drm_syncobj *syncobj,

+ struct dma_fence *fence,
+ u64 point)
  {
+    struct drm_syncobj_signal_pt *signal_pt =
+    kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
+    struct drm_syncobj_signal_pt *tail_pt;
+    struct dma_fence **fences;
+    struct rb_node *node;
+    struct drm_syncobj_wait_pt *tail_wait_pt = NULL;
+    int num_fences = 0;
+    int ret = 0, i;
+
+    if (!signal_pt)
+    return -ENOMEM;
+    if (syncobj->syncobj_timeline.signal_point >= point) {
+    DRM_WARN("A later signal is ready!");
+    goto out;
+    }
+    if (!fence)
+    goto out;
+
+    fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
+    if (!fences)
+    goto out;
+    fences[num_fences++] = dma_fence_get(fence);
+    /* timeline syncobj must take this dependency */
+    if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
+    spin_lock(>lock);
+    if (!list_empty(>syncobj_timeline.signal_pt_list)) {
+    tail_pt = 
list_last_entry(>syncobj_timeline.signal_pt_list,

+  struct drm_syncobj_signal_pt, list);
+    fences[num_fences++] = 
dma_fence_get(_pt->base->base);

+    }
+    spin_unlock(>lock);
+    }
+    signal_pt->base = dma_fence_array_create(num_fences, fences,
+ syncobj->syncobj_timeline.timeline_context,
+ point, false);
+    if (!signal_pt->base)
+    goto fail;
+
+    signal_pt->value = point;
  spin_lock(>lock);
-    

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-12 Thread zhoucm1



On 2018年09月12日 15:22, Christian König wrote:

Ping? Have you seen my comments here?

Sorry, I didn't see this reply.  inline...



Looks like you haven't addressed any of them in your last mail.

Christian.

Am 06.09.2018 um 09:25 schrieb Christian König:

Am 06.09.2018 um 08:25 schrieb Chunming Zhou:

VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an 
integer payload

identifying a point in a timeline. Such timeline semaphores support the


Drop the term semaphore here, better use syncobj.
This is from VK_KHR_timeline_semaphore extension describe, not my 
invention, I just quote it. In kernel side, we call syncobj, in UMD, 
they still call semaphore.





following operations:
    * CPU query - A host operation that allows querying the payload 
of the

  timeline semaphore.
    * CPU wait - A host operation that allows a blocking wait for a
  timeline semaphore to reach a specified value.
    * Device wait - A device operation that allows waiting for a
  timeline semaphore to reach a specified value.
    * Device signal - A device operation that allows advancing the
  timeline semaphore to a specified value.

Since it's a timeline, that means the front time point(PT) always is 
signaled before the late PT.

a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion 
fence, when PT[N] fence is signaled,

the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when 
timeline is increasing, will compare
wait PTs value with new timeline value, if PT value is lower than 
timeline value, then wait PT will be
signaled, otherwise keep in list. semaphore wait operation can wait 
on any point of timeline,
so need a RB tree to order them. And wait PT could ahead of signal 
PT, we need a sumission fence to

perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate 
patch. (Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate 
patch.
5. drop the submission_fence implementation and instead use 
wait_event() for that. (Christian)

6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and 
Christian)
 a. normal syncobj signal op will create a signal PT to tail of 
signal pt list.
 b. normal syncobj wait op will create a wait pt with last 
signal point, and this wait PT is only signaled by related signal 
point PT.

2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)

Tested by ./deqp-vk -n dEQP-VK*semaphore* for normal syncobj

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/drm_syncobj.c  | 516 
+

  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
  include/drm/drm_syncobj.h  |  78 ++--
  include/uapi/drm/drm.h |   1 +
  4 files changed, 448 insertions(+), 151 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c 
b/drivers/gpu/drm/drm_syncobj.c

index e9ce623d049e..94b31de23858 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@
  #include "drm_internal.h"
  #include 
  +/* merge normal syncobj to timeline syncobj, the point interval 
is 1 */

+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
  struct drm_syncobj_stub_fence {
  struct dma_fence base;
  spinlock_t lock;
@@ -82,6 +85,18 @@ static const struct dma_fence_ops 
drm_syncobj_stub_fence_ops = {

  .release = drm_syncobj_stub_fence_release,
  };
  +struct drm_syncobj_signal_pt {
+    struct dma_fence_array *base;
+    u64    value;
+    struct list_head list;
+};
+struct drm_syncobj_wait_pt {
+    struct drm_syncobj_stub_fence base;
+    struct dma_fence *signal_pt_fence;
+    struct dma_fence_cb wait_cb;
+    struct rb_node   node;
+    u64    value;
+};
    /**
   * drm_syncobj_find - lookup and reference a sync object.
@@ -109,61 +124,238 @@ struct drm_syncobj *drm_syncobj_find(struct 
drm_file *file_private,

  }
  EXPORT_SYMBOL(drm_syncobj_find);
  -static void drm_syncobj_add_callback_locked(struct drm_syncobj 
*syncobj,

-    struct drm_syncobj_cb *cb,
-    drm_syncobj_func_t func)
+static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
+  struct drm_syncobj_timeline *syncobj_timeline)


Since we merged timeline and singleton syncobj you can drop the extra 
_timeline_ part in the function name I think.

Will try in v5.




  {
-    cb->func = func;
-    list_add_tail(>node, >cb_list);
+    

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-12 Thread Christian König

Ping? Have you seen my comments here?

Looks like you haven't addressed any of them in your last mail.

Christian.

Am 06.09.2018 um 09:25 schrieb Christian König:

Am 06.09.2018 um 08:25 schrieb Chunming Zhou:

VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an integer 
payload

identifying a point in a timeline. Such timeline semaphores support the


Drop the term semaphore here, better use syncobj.


following operations:
    * CPU query - A host operation that allows querying the payload 
of the

  timeline semaphore.
    * CPU wait - A host operation that allows a blocking wait for a
  timeline semaphore to reach a specified value.
    * Device wait - A device operation that allows waiting for a
  timeline semaphore to reach a specified value.
    * Device signal - A device operation that allows advancing the
  timeline semaphore to a specified value.

Since it's a timeline, that means the front time point(PT) always is 
signaled before the late PT.

a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, 
when PT[N] fence is signaled,

the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when 
timeline is increasing, will compare
wait PTs value with new timeline value, if PT value is lower than 
timeline value, then wait PT will be
signaled, otherwise keep in list. semaphore wait operation can wait 
on any point of timeline,
so need a RB tree to order them. And wait PT could ahead of signal 
PT, we need a sumission fence to

perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate 
patch. (Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate 
patch.
5. drop the submission_fence implementation and instead use 
wait_event() for that. (Christian)

6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and 
Christian)
 a. normal syncobj signal op will create a signal PT to tail of 
signal pt list.
 b. normal syncobj wait op will create a wait pt with last signal 
point, and this wait PT is only signaled by related signal point PT.

2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)

Tested by ./deqp-vk -n dEQP-VK*semaphore* for normal syncobj

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/drm_syncobj.c  | 516 +
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
  include/drm/drm_syncobj.h  |  78 ++--
  include/uapi/drm/drm.h |   1 +
  4 files changed, 448 insertions(+), 151 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c 
b/drivers/gpu/drm/drm_syncobj.c

index e9ce623d049e..94b31de23858 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@
  #include "drm_internal.h"
  #include 
  +/* merge normal syncobj to timeline syncobj, the point interval is 
1 */

+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
  struct drm_syncobj_stub_fence {
  struct dma_fence base;
  spinlock_t lock;
@@ -82,6 +85,18 @@ static const struct dma_fence_ops 
drm_syncobj_stub_fence_ops = {

  .release = drm_syncobj_stub_fence_release,
  };
  +struct drm_syncobj_signal_pt {
+    struct dma_fence_array *base;
+    u64    value;
+    struct list_head list;
+};
+struct drm_syncobj_wait_pt {
+    struct drm_syncobj_stub_fence base;
+    struct dma_fence *signal_pt_fence;
+    struct dma_fence_cb wait_cb;
+    struct rb_node   node;
+    u64    value;
+};
    /**
   * drm_syncobj_find - lookup and reference a sync object.
@@ -109,61 +124,238 @@ struct drm_syncobj *drm_syncobj_find(struct 
drm_file *file_private,

  }
  EXPORT_SYMBOL(drm_syncobj_find);
  -static void drm_syncobj_add_callback_locked(struct drm_syncobj 
*syncobj,

-    struct drm_syncobj_cb *cb,
-    drm_syncobj_func_t func)
+static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
+  struct drm_syncobj_timeline *syncobj_timeline)


Since we merged timeline and singleton syncobj you can drop the extra 
_timeline_ part in the function name I think.



  {
-    cb->func = func;
-    list_add_tail(>node, >cb_list);
+    spin_lock(>lock);
+    syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
+    syncobj_timeline->timeline = 0;
+    syncobj_timeline->signal_point = 0;
+    init_waitqueue_head(_timeline->wq);
+
+    syncobj_timeline->wait_pt_tree = RB_ROOT;
+    

[PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-06 Thread Chunming Zhou
VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an integer payload
identifying a point in a timeline. Such timeline semaphores support the
following operations:
   * CPU query - A host operation that allows querying the payload of the
 timeline semaphore.
   * CPU wait - A host operation that allows a blocking wait for a
 timeline semaphore to reach a specified value.
   * Device wait - A device operation that allows waiting for a
 timeline semaphore to reach a specified value.
   * Device signal - A device operation that allows advancing the
 timeline semaphore to a specified value.

Since it's a timeline, that means the front time point(PT) always is signaled 
before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when 
PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when timeline is 
increasing, will compare
wait PTs value with new timeline value, if PT value is lower than timeline 
value, then wait PT will be
signaled, otherwise keep in list. semaphore wait operation can wait on any 
point of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT, we need 
a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
a. normal syncobj signal op will create a signal PT to tail of signal pt 
list.
b. normal syncobj wait op will create a wait pt with last signal point, and 
this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)

Tested by ./deqp-vk -n dEQP-VK*semaphore* for normal syncobj

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_syncobj.c  | 516 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
 include/drm/drm_syncobj.h  |  78 ++--
 include/uapi/drm/drm.h |   1 +
 4 files changed, 448 insertions(+), 151 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e9ce623d049e..94b31de23858 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@
 #include "drm_internal.h"
 #include 
 
+/* merge normal syncobj to timeline syncobj, the point interval is 1 */
+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
 struct drm_syncobj_stub_fence {
struct dma_fence base;
spinlock_t lock;
@@ -82,6 +85,18 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops 
= {
.release = drm_syncobj_stub_fence_release,
 };
 
+struct drm_syncobj_signal_pt {
+   struct dma_fence_array *base;
+   u64value;
+   struct list_head list;
+};
+struct drm_syncobj_wait_pt {
+   struct drm_syncobj_stub_fence base;
+   struct dma_fence *signal_pt_fence;
+   struct dma_fence_cb wait_cb;
+   struct rb_node   node;
+   u64value;
+};
 
 /**
  * drm_syncobj_find - lookup and reference a sync object.
@@ -109,61 +124,238 @@ struct drm_syncobj *drm_syncobj_find(struct drm_file 
*file_private,
 }
 EXPORT_SYMBOL(drm_syncobj_find);
 
-static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
-   struct drm_syncobj_cb *cb,
-   drm_syncobj_func_t func)
+static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
+ struct drm_syncobj_timeline 
*syncobj_timeline)
 {
-   cb->func = func;
-   list_add_tail(>node, >cb_list);
+   spin_lock(>lock);
+   syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
+   syncobj_timeline->timeline = 0;
+   syncobj_timeline->signal_point = 0;
+   init_waitqueue_head(_timeline->wq);
+
+   syncobj_timeline->wait_pt_tree = RB_ROOT;
+   INIT_LIST_HEAD(_timeline->signal_pt_list);
+   spin_unlock(>lock);
 }
 
-static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
-struct dma_fence **fence,
-struct drm_syncobj_cb *cb,
-

Re: [PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-06 Thread Christian König

Am 06.09.2018 um 08:25 schrieb Chunming Zhou:

VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an integer payload
identifying a point in a timeline. Such timeline semaphores support the


Drop the term semaphore here, better use syncobj.


following operations:
* CPU query - A host operation that allows querying the payload of the
  timeline semaphore.
* CPU wait - A host operation that allows a blocking wait for a
  timeline semaphore to reach a specified value.
* Device wait - A device operation that allows waiting for a
  timeline semaphore to reach a specified value.
* Device signal - A device operation that allows advancing the
  timeline semaphore to a specified value.

Since it's a timeline, that means the front time point(PT) always is signaled 
before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when 
PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when timeline is 
increasing, will compare
wait PTs value with new timeline value, if PT value is lower than timeline 
value, then wait PT will be
signaled, otherwise keep in list. semaphore wait operation can wait on any 
point of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT, we need 
a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
 a. normal syncobj signal op will create a signal PT to tail of signal pt 
list.
 b. normal syncobj wait op will create a wait pt with last signal point, 
and this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)

Tested by ./deqp-vk -n dEQP-VK*semaphore* for normal syncobj

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/drm_syncobj.c  | 516 +
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
  include/drm/drm_syncobj.h  |  78 ++--
  include/uapi/drm/drm.h |   1 +
  4 files changed, 448 insertions(+), 151 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e9ce623d049e..94b31de23858 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@
  #include "drm_internal.h"
  #include 
  
+/* merge normal syncobj to timeline syncobj, the point interval is 1 */

+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
  struct drm_syncobj_stub_fence {
struct dma_fence base;
spinlock_t lock;
@@ -82,6 +85,18 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops 
= {
.release = drm_syncobj_stub_fence_release,
  };
  
+struct drm_syncobj_signal_pt {

+   struct dma_fence_array *base;
+   u64value;
+   struct list_head list;
+};
+struct drm_syncobj_wait_pt {
+   struct drm_syncobj_stub_fence base;
+   struct dma_fence *signal_pt_fence;
+   struct dma_fence_cb wait_cb;
+   struct rb_node   node;
+   u64value;
+};
  
  /**

   * drm_syncobj_find - lookup and reference a sync object.
@@ -109,61 +124,238 @@ struct drm_syncobj *drm_syncobj_find(struct drm_file 
*file_private,
  }
  EXPORT_SYMBOL(drm_syncobj_find);
  
-static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,

-   struct drm_syncobj_cb *cb,
-   drm_syncobj_func_t func)
+static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
+ struct drm_syncobj_timeline 
*syncobj_timeline)


Since we merged timeline and singleton syncobj you can drop the extra 
_timeline_ part in the function name I think.



  {
-   cb->func = func;
-   list_add_tail(>node, >cb_list);
+   spin_lock(>lock);
+   syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
+   syncobj_timeline->timeline = 0;
+   syncobj_timeline->signal_point = 0;
+   init_waitqueue_head(_timeline->wq);
+
+   syncobj_timeline->wait_pt_tree = RB_ROOT;
+   INIT_LIST_HEAD(_timeline->signal_pt_list);
+   spin_unlock(>lock);
  }
  
-static int 

[PATCH 1/3] [RFC]drm: add syncobj timeline support v4

2018-09-06 Thread Chunming Zhou
VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an integer payload
identifying a point in a timeline. Such timeline semaphores support the
following operations:
   * CPU query - A host operation that allows querying the payload of the
 timeline semaphore.
   * CPU wait - A host operation that allows a blocking wait for a
 timeline semaphore to reach a specified value.
   * Device wait - A device operation that allows waiting for a
 timeline semaphore to reach a specified value.
   * Device signal - A device operation that allows advancing the
 timeline semaphore to a specified value.

Since it's a timeline, that means the front time point(PT) always is signaled 
before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when 
PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when timeline is 
increasing, will compare
wait PTs value with new timeline value, if PT value is lower than timeline 
value, then wait PT will be
signaled, otherwise keep in list. semaphore wait operation can wait on any 
point of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT, we need 
a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
a. normal syncobj signal op will create a signal PT to tail of signal pt 
list.
b. normal syncobj wait op will create a wait pt with last signal point, and 
this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)

Tested by ./deqp-vk -n dEQP-VK*semaphore* for normal syncobj

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_syncobj.c  | 516 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
 include/drm/drm_syncobj.h  |  78 ++--
 include/uapi/drm/drm.h |   1 +
 4 files changed, 448 insertions(+), 151 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e9ce623d049e..94b31de23858 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@
 #include "drm_internal.h"
 #include 
 
+/* merge normal syncobj to timeline syncobj, the point interval is 1 */
+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
 struct drm_syncobj_stub_fence {
struct dma_fence base;
spinlock_t lock;
@@ -82,6 +85,18 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops 
= {
.release = drm_syncobj_stub_fence_release,
 };
 
+struct drm_syncobj_signal_pt {
+   struct dma_fence_array *base;
+   u64value;
+   struct list_head list;
+};
+struct drm_syncobj_wait_pt {
+   struct drm_syncobj_stub_fence base;
+   struct dma_fence *signal_pt_fence;
+   struct dma_fence_cb wait_cb;
+   struct rb_node   node;
+   u64value;
+};
 
 /**
  * drm_syncobj_find - lookup and reference a sync object.
@@ -109,61 +124,238 @@ struct drm_syncobj *drm_syncobj_find(struct drm_file 
*file_private,
 }
 EXPORT_SYMBOL(drm_syncobj_find);
 
-static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
-   struct drm_syncobj_cb *cb,
-   drm_syncobj_func_t func)
+static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
+ struct drm_syncobj_timeline 
*syncobj_timeline)
 {
-   cb->func = func;
-   list_add_tail(>node, >cb_list);
+   spin_lock(>lock);
+   syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
+   syncobj_timeline->timeline = 0;
+   syncobj_timeline->signal_point = 0;
+   init_waitqueue_head(_timeline->wq);
+
+   syncobj_timeline->wait_pt_tree = RB_ROOT;
+   INIT_LIST_HEAD(_timeline->signal_pt_list);
+   spin_unlock(>lock);
 }
 
-static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
-struct dma_fence **fence,
-struct drm_syncobj_cb *cb,
-