On Thu, Aug 14, 2014 at 07:03:44PM -0400, Jerome Glisse wrote:
> On Thu, Aug 14, 2014 at 11:23:01PM +0200, Daniel Vetter wrote:
> > On Thu, Aug 14, 2014 at 9:15 PM, Jerome Glisse <j.glisse at gmail.com> 
> > wrote:
> > > Costs 1 uint32 per buffer and a simple if, without locking, to check the
> > > status of a buffer.
> > 
> > Yeah well, except it doesn't, and that's why we switched to full-blown
> > fence objects internally instead of smacking seqno values all over the
> > place. At least in i915, because I have seen what this "simple"
> > solution looks like if you throw all the complexities into the bin and
> > mix it well.
> 
> I am kind of curious what can go wrong here? Unless you have thousands
> of different hw blocks inside your gpu, all of them with different cmd
> queues, I do not see any issue.

See my other reply: it's because we'll schedule software contexts here, and
we need to do that with implicit fencing because we can't just break the
existing world. The below is pretty much the design we currently have.
-Daniel

> 
> Note that the global seqno I am talking about is only useful for bind/unbind
> of buffers in an ideal world of explicit sync, not for synchronization between
> command buffers. So the pseudo code would be:
> 
> // If buffer_can_unbind(buf, dev) returns true then the buffer is no longer
> // in use and can be unbound. If false you can wait on the device wait queue.
> bool buffer_can_unbind(buf, dev)
> {
>   // Is the last gseqno associated with the buffer smaller than the current
>   // smallest signaled seqno?
>   if (buf->gseqno <= dev->gseqno_cmin)
>     return true;
>   hw_update_gseqno_min(dev);
>   if (buf->gseqno <= dev->gseqno_cmin)
>     return true;
>   for_each_hw_block(hwblock, dev) {
>     // If that hwblock has not signaled a gseqno bigger than the
>     // buffer's one but has work scheduled that might be using
>     // the buffer (ie the last scheduled gseqno is greater than the
>     // buffer gseqno), then the buffer might still be in use, so
>     // assume the worst.
>     if (hwblock->gseqno < buf->gseqno &&
>         hwblock->gseqno_last_scheduled >= buf->gseqno)
>       return false;
>     // This means either this block is already past buf->gseqno
>     // or this block has nothing in its pipeline that will ever
>     // use buf.
>     // As an extra optimization one might add a hwblock mask to buf
>     // and clear the bit for that hwblock, so further waits will not
>     // wait on this block as we know for sure it is not involved.
>   }
>   // Well, none of the hwblocks is still using that buffer.
>   return true;
> }
> 
> hw_update_gseqno_min(dev)
> {
>    unsigned nmin = -1;
> 
>    for_each_hw_block(hwblock, dev) {
>      nmin = min(nmin, hwblock->gseqno);
>    }
>    // Atomic compare and exchange; set the new min only if it is bigger
>    // than the current min.
>    atomic_cmp_xchg(dev->gseqno_cmin, nmin);
> }
> 
> 
> So this is how it looks in my mind: each hwblock writes to its dedicated
> ->gseqno, and for each hwblock you store the last gseqno that was scheduled.
> There is no need for any lock and this is outside any other sync mechanism.
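
For reference, here is a minimal compilable C11 sketch of the pseudocode
above. The struct layout (gpu_device, hw_block, gpu_buffer), the
MAX_HW_BLOCKS bound and the use of C11 atomics are illustrative assumptions
only, and seqno wraparound is ignored for brevity:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_HW_BLOCKS 8  /* arbitrary bound, purely for the sketch */

struct hw_block {
        _Atomic uint32_t gseqno;                /* last gseqno signaled by this block */
        _Atomic uint32_t gseqno_last_scheduled; /* last gseqno queued on this block */
};

struct gpu_device {
        struct hw_block blocks[MAX_HW_BLOCKS];
        unsigned int nblocks;
        _Atomic uint32_t gseqno_cmin;           /* cached smallest signaled gseqno */
};

struct gpu_buffer {
        uint32_t gseqno;  /* last gseqno of work that referenced this buffer */
};

/* Raise the cached minimum; it is never lowered, so concurrent callers
 * can only move it forward and no lock is needed. */
static void hw_update_gseqno_min(struct gpu_device *dev)
{
        uint32_t nmin = UINT32_MAX;
        uint32_t cur;

        for (unsigned int i = 0; i < dev->nblocks; i++) {
                uint32_t s = atomic_load(&dev->blocks[i].gseqno);

                if (s < nmin)
                        nmin = s;
        }

        cur = atomic_load(&dev->gseqno_cmin);
        while (nmin > cur &&
               !atomic_compare_exchange_weak(&dev->gseqno_cmin, &cur, nmin))
                ; /* a failed CAS reloads cur; retry only while we still raise it */
}

/* Returns true when no hwblock can still be using buf; lockless. */
static bool buffer_can_unbind(const struct gpu_buffer *buf,
                              struct gpu_device *dev)
{
        if (buf->gseqno <= atomic_load(&dev->gseqno_cmin))
                return true;

        hw_update_gseqno_min(dev);
        if (buf->gseqno <= atomic_load(&dev->gseqno_cmin))
                return true;

        for (unsigned int i = 0; i < dev->nblocks; i++) {
                struct hw_block *hwblock = &dev->blocks[i];

                /* Block has not passed buf->gseqno yet but has work queued
                 * that reaches it: assume the buffer is still in use. */
                if (atomic_load(&hwblock->gseqno) < buf->gseqno &&
                    atomic_load(&hwblock->gseqno_last_scheduled) >= buf->gseqno)
                        return false;
        }

        /* None of the hwblocks is still using that buffer. */
        return true;
}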
> 
> For hw with preemption, I assume you then have a hwcontext; well, you use
> the same code except that now you have a gseqno per context, which means
> you need to store a seqno per context per buffer. With some trickery this
> might be avoided, especially if the hw can do atomic cmp_xchg.
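
Purely as an illustration of that per-context variant (the names and the
fixed bound are again hypothetical), the buffer would then keep one gseqno
per hwcontext and the same check would run against each context's counters:

#define MAX_HW_CONTEXTS 16  /* arbitrary, for the sketch only */

struct gpu_buffer_percontext {
        /* Last gseqno of each context that touched the buffer; 0 means the
         * context never used it and can be skipped by the unbind check. */
        uint32_t gseqno[MAX_HW_CONTEXTS];
};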
> 
> Cheers,
> Jérôme
> 
> 
> > -Daniel
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
