On Mon, Sep 10, 2018 at 5:33 PM Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>
wrote:

> On Mon, Sep 10, 2018 at 8:05 PM Jason Ekstrand <ja...@jlekstrand.net>
> wrote:
> >
> > The instruction scheduler is re-ordering loads which is causing fence
> > values to be loaded after the value they're fencing.  In particular,
> > consider the following pseudocode:
> >
> >     void try_use_a_thing(int idx)
> >     {
> >         bool ready = ssbo.arr[idx].ready;
> >         vec4 data = ssbo.arr[idx].data;
> >         if (ready)
> >             use(data);
> >     }
> >
> >     void write_a_thing(int idx, vec4 data)
> >     {
> >         ssbo.arr[idx].data = data;
> >         ssbo.arr[idx].ready = true;
> >     }
> >
> > Our current instruction scheduling scheme doesn't see any problem with
> > re-ordering the load of "ready" with respect to the load of "data".
> > However, if try_use_a_thing is called in one thread and write_a_thing is
> > called in another thread, such re-ordering could cause invalid data to
> > be used.  Normally, some re-ordering of loads is fine; however, with the
> > Vulkan memory model, there are some additional guarantees that are
> > provided particularly in the case of atomic loads which we currently
> > don't differentiate in any way in the back-end.
> >
> > Obviously, we need to come up with something better for this than just
> > shutting off the scheduler but I wanted to send this out earlier rather
> > than later and provide the opportunity for a discussion.
>
> So how about adding a bitmask with flags for these to the load/store
> intrinsics? I remember we still need a coherent bit to implement
> coherent loads and stores for AMD. We could easily have a mask with
> coherent+atomic+whatever and then pass that to the back-end.
>

I think that's where we're going to end up.  In the current model, you just
have load/store ops and then barriers to order around them.  However, I
suspect that the long-term direction is going to be something more like
SPIR-V's access flags on each load/store, which provide some sort of partial
barrier and re-ordering restriction information.

> ---
> >  src/intel/compiler/brw_fs.cpp | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs.cpp
> b/src/intel/compiler/brw_fs.cpp
> > index 3f7f2b4c984..9df238a6f6a 100644
> > --- a/src/intel/compiler/brw_fs.cpp
> > +++ b/src/intel/compiler/brw_fs.cpp
> > @@ -6427,7 +6427,7 @@ fs_visitor::allocate_registers(unsigned
> min_dispatch_width, bool allow_spilling)
> >      * performance but increasing likelihood of allocating.
> >      */
> >     for (unsigned i = 0; i < ARRAY_SIZE(pre_modes); i++) {
> > -      schedule_instructions(pre_modes[i]);
> > +      //schedule_instructions(pre_modes[i]);
> >
> >        if (0) {
> >           assign_regs_trivial();
> > @@ -6478,7 +6478,7 @@ fs_visitor::allocate_registers(unsigned
> min_dispatch_width, bool allow_spilling)
> >
> >     opt_bank_conflicts();
> >
> > -   schedule_instructions(SCHEDULE_POST);
> > +   //schedule_instructions(SCHEDULE_POST);
> >
> >     if (last_scratch > 0) {
> >        MAYBE_UNUSED unsigned max_scratch_size = 2 * 1024 * 1024;
> > --
> > 2.17.1
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>