Re: [PATCH 5/5] [RFC]drm: add syncobj timeline support v3

Christian König Mon, 03 Sep 2018 04:19:31 -0700

Am 03.09.2018 um 12:07 schrieb Chunming Zhou:

在 2018/9/3 16:50, Christian König 写道:
Am 03.09.2018 um 06:13 schrieb Chunming Zhou:
在 2018/8/30 19:32, Christian König 写道:
[SNIP]
+
+struct drm_syncobj_wait_pt {
+    struct drm_syncobj_stub_fence base;
+    u64    value;
+    struct rb_node   node;
+};
+struct drm_syncobj_signal_pt {
+    struct drm_syncobj_stub_fence base;
+    struct dma_fence *signal_fence;
+    struct dma_fence *pre_pt_base;
+    struct dma_fence_cb signal_cb;
+    struct dma_fence_cb pre_pt_cb;
+    struct drm_syncobj *syncobj;
+    u64    value;
+    struct list_head list;
+};
What are those two structures good for
They are used to record wait ops points and signal ops points.
For timeline, they are connected by timeline value, works like:
a. signal pt increase timeline value to signal_pt value whensignal_pt is signaled. signal_pt is depending on previous ptfence and itself signal fence from signal ops.
    b. wait pt tree checks if timeline value over itself value.

For normal, works like:
a. signal pt increase 1 for syncobj_timeline->signal_pointevery time when signal ops is performed. b. when wait ops is comming, wait pt will fetch above lastsignal pt value as its wait point. wait pt will be only signaledby equal point value signal_pt.
and why is the stub fence their base?
Good question, I tried to kzalloc for them as well when I debugthem, I encountered a problem:I lookup/find wait_pt or signal_pt successfully, but when Itried to use them, they are freed sometimes, and results in NULLpoint.and generally, when lookup them, we often need their stub fenceas well, and in the meantime, their lifecycle are same.
Above reason, I make stub fence as their base.
That sounds like you only did this because you messed up thelifecycle.
Additional to that I don't think you correctly considered thelifecycle of the waits and the sync object itself. E.g. blockingin drm_syncobj_timeline_fini() until all waits are done is not agood idea.
What you should do instead is to create a fence_array object withall the fence we need to wait for when a wait point is requested.
Yeah, this is our initial discussion result, but when I tried todo that, I found that cannot meet the advance signal requirement: a. We need to consider the wait and signal pt value are notone-to-one match, it's difficult to find dependent point, atleast, there is some overhead.
As far as I can see that is independent of using a fence arrayhere. See you can either use a ring buffer or an rb-tree, but whenyou want to wait for a specific point we need to condense the notyet signaled fences into an array.
again, need to find the range of where the specific point in, weshould close to timeline semantics, I also refered the sw_sync.ctimeline, usally wait_pt is signalled by timeline point. And I agreewe can implement it with several methods, but I don't think thereare basical differences.
The problem is that with your current approach you need the sync_objalive for the synchronization to work. That is most likely not a goodidea.
Indeed, I will fix that. How abount only creating fence array forevery wait pt when syncobj release? when syncobj release, wait pt musthave waited the signal opertation, then we can easily condense fencesfor every wait pt. And meantime, we can take timeline based wait ptadvantage.

That could work, but I don't see how you want to replace the alreadyissued fence with a fence_array when the sync object is destroyed.

Additional to that I would rather prefer a consistent handling, e.g.without extra rarely used code paths.

Additional to that you enable signaling without a need from thewaiting side. That is rather bad for implementations which need thatoptimization.
Do you mean increasing timeline based on signal fence is not better?only update timeline value when requested by a wait pt?


Yes, exactly.

This way, we will not update timeline value immidiately and cannotfree signal pt immidiately, and we also need to consider it to CPUquery and wait.

That is actually the better coding style. We usually try to avoid doingthings in interrupt handlers as much as possible.


How about this idea:
1. Each signaling point is a fence implementation with an rb node.
2. Each node keeps a reference to the last previously inserted node.
3. Each node is referenced by the sync object itself.

4. Before each signal/wait operation we do a garbage collection andremove the first node from the tree as long as it is signaled.

5. When enable_signaling is requested for a node we cascade that to theleft using rb_prev. This ensures that signaling is enabled for the current fence aswell as all previous fences.

6. A wait just looks into the tree for the signal point lower or equalof the requested sequence number.

7. When the sync object is released we userbtree_postorder_for_each_entry_safe() and drop the extra reference toeach node, but never call rb_erase! This way the rb_tree stays in memory, but without a root (e.g. thesync object). It only destructs itself when the looked up references tothe nodes are dropped.


Well that is quite a bunch of logic, but I think that should work fine.

Regards,
Christian.

Thanks,
David Zhou
I suggest to either condense all the fences you need to wait for inan array during the wait operation, or reference count the sync_objand only enable the signaling on the fences when requested by a wait.
b. because we allowed "wait-before-signal", if"wait-before-signal" happens, there isn't signal fence which canbe used to create fence array.
Well, again we DON'T support wait-before-signal here. I willcertainly NAK any implementation which tries to do this until wehaven't figured out all the resource management constraints and Istill don't see how we can do this.
yes, we don't support real wait-before-signal in patch now, just afake wait-before-signal, which still wait on CS submission untilsignal operation coming through wait_event, which is the conclusionwe disscussed before.
Well in this case we should call it wait-for-signal and notwait-before-signal :)
So timeline value is good to resolve that.
Otherwise somebody can easily construct a situation wheretimeline sync obj A waits on B and B waits on A.
Same situation also can happens with fence-array, we only can seethis is a programming bug with incorrect use.
No, fence-array is initialized only once with a static list offences. This way it is impossible to add the fence-array to itselffor example.
E.g. you can't build circle dependencies with that.
we already wait for signal operation event, so never build circledependencies with that. The theory is same.
Yeah, ok that is certainly true.

Regards,
Christian.
Better use rbtree_postorder_for_each_entry_safe() for this.
From the comments, seems we cannot use it here, we need erasenode here.
Comments:
* Note, however, that it cannot handle other modifications thatre-order the * rbtree it is iterating over. This includes calling rb_erase()on @pos, as * rb_erase() may rebalance the tree, causing us to miss somenodes.
 */
Yeah, but your current implementation has the same problem :)
You need to use something like "while (node = rb_first(...))"instead if you want to destruct the whole tree.
OK, got it, I will change it in v4.

Thanks,
David Zhou
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 5/5] [RFC]drm: add syncobj timeline support v3

Reply via email to