On Thu, Sep 22, 2016 at 12:07:24AM +0900, Michel Dänzer wrote:
> On 21/09/16 09:56 PM, Daniel Vetter wrote:
> > On Wed, Sep 21, 2016 at 1:19 PM, Christian König
> > <deathsim...@vodafone.de> wrote:
> >> Am 21.09.2016 um 13:04 schrieb Daniel Vetter:
> >>> On Wed, Sep 21, 2016 at 12:30 PM, Christian König
> >>> <deathsim...@vodafone.de> wrote:
> >>>> Am 21.09.2016 um 11:56 schrieb Michel Dänzer:
> >>>>> Looks like there are different interpretations of the semantics of
> >>>>> exclusive vs. shared fences. Where are these semantics documented?
> >>>> Yeah, I think as well that this is the primary question here.
> >>>> IIRC the fences were explicitly called exclusive/shared instead of
> >>>> writing/reading on purpose.
> >>>> I absolutely don't mind switching them to writing/reading semantics,
> >>>> but amdgpu really needs multiple writers at the same time.
> >>>> So in this case the writing side of a reservation object needs to be a
> >>>> collection of fences as well.
> >>> You can't have multiple writers with implicit syncing. That confusion
> >>> is exactly why we called them shared/exclusive. Multiple writers
> >>> generally means that you do some form of fencing in userspace
> >>> (unsync'ed gl buffer access is the common one). What you do for
> >>> private buffers doesn't matter, but when you render into a
> >>> shared/winsys buffer you really need to set the exclusive fence (and
> >>> there can only ever be one). So probably needs some userspace
> >>> adjustments to make sure you don't accidentally set an exclusive write
> >>> hazard when you don't really want that implicit sync.
> >> Nope, that isn't true.
> >> We use multiple writers without implicit syncing between processes in the
> >> amdgpu stack perfectly fine.
> >> See amdgpu_sync.c for the implementation. What we do there is look at
> >> all the fences associated with a reservation object and only sync to
> >> those that are from another process.
> >> Then we use implicit syncing for command submissions in the form of
> >> "dependencies". E.g. for each CS we report back an identifier of that
> >> submission to user space and on the next submission you can give this
> >> identifier as dependency which needs to be satisfied before the command
> >> submission can start running.
> > This is called explicit fencing. Implemented with a driver-private
> > primitive (and not sync_file fds like on android), but still
> > conceptually explicit fencing. Implicit fencing really only can handle
> > one writer, at least as currently implemented by struct
> > reservation_object.
> >> This was done to allow multiple engines (3D, DMA, Compute) to compose a
> >> buffer while still allowing compatibility with protocols like DRI2/DRI3.
> > Instead of the current solution you need to stop attaching exclusive
> > fences to non-shared buffers (which are coordinated using the
> > driver-private explicit fencing you're describing),
> Err, the current issue is actually that amdgpu never sets an exclusive
> fence, only ever shared ones. :)
Well, since you sometimes sync and sometimes don't, it is kind of a special
case of a semi-exclusive fence (even if attached to the shared slots).
> > and only attach exclusive fences to shared buffers (DRI2/3, PRIME,
> > whatever).
> Still, it occurred to me in the meantime that amdgpu setting the
> exclusive fence for buffers shared via PRIME (no matter if it's a write
> or read operation) might be a solution. Christian, what do you think?
Yup, that's what I mean. And it shouldn't cause a problem, since for shared
buffers (at least for protocols where implicit fencing is required) you
really can't have multiple concurrent writers. And with the special checks
in amdgpu_sync.c that's what's happening in reality; the only difference is
that the filtering/selection of what is considered an exclusive fence
happens when you sync, and not when you attach them. And that breaks
reservation_object assumptions.
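
To make the attach-time vs. sync-time difference concrete, here is a toy
model in C. It is only a sketch under stated assumptions: every toy_* type
and function is hypothetical and is not the kernel's reservation_object or
fence API, and the "owner" field merely stands in for the "from another
process" check described above for amdgpu_sync.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, NOT the kernel API. */
#define TOY_MAX_SHARED 8

struct toy_fence {
	int owner;      /* id of the process that submitted the work */
	bool signaled;  /* true once the GPU work has completed */
};

struct toy_resv {
	struct toy_fence *excl;                    /* at most one exclusive fence */
	struct toy_fence *shared[TOY_MAX_SHARED];  /* any number of shared fences */
	size_t shared_count;
};

/* Attach-time selection: the submitter declares the fence exclusive.
 * In this model the new exclusive fence replaces the old one and
 * clears the shared slots. */
void toy_add_excl(struct toy_resv *r, struct toy_fence *f)
{
	r->excl = f;
	r->shared_count = 0;
}

void toy_add_shared(struct toy_resv *r, struct toy_fence *f)
{
	assert(r->shared_count < TOY_MAX_SHARED);
	r->shared[r->shared_count++] = f;
}

/* What a generic importer assumes: a reader only has to wait for the
 * exclusive fence. Returns the number of fences it would wait on. */
size_t toy_reader_deps(const struct toy_resv *r)
{
	return (r->excl && !r->excl->signaled) ? 1 : 0;
}

/* Sync-time selection: look at every fence and skip the ones that
 * belong to the caller itself. */
size_t toy_filtered_deps(const struct toy_resv *r, int self)
{
	size_t n = 0;

	if (r->excl && !r->excl->signaled && r->excl->owner != self)
		n++;
	for (size_t i = 0; i < r->shared_count; i++)
		if (!r->shared[i]->signaled && r->shared[i]->owner != self)
			n++;
	return n;
}
```

In this model, if process 1 renders into a buffer and only attaches a shared
fence, toy_reader_deps() sees nothing to wait for, while
toy_filtered_deps() called by process 2 sees one dependency. Attaching the
same fence with toy_add_excl() makes the generic reader wait too, which is
the behaviour an importer that knows nothing about the filtering relies on.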
Software Engineer, Intel Corporation
amd-gfx mailing list