On Mon, Sep 10, 2018 at 5:38 PM Ian Romanick <i...@freedesktop.org> wrote:
> On 09/10/2018 11:04 AM, Jason Ekstrand wrote: > > The instruction scheduler is re-ordering loads which is causing fence > > values to be loaded after the value they're fencing. In particular, > > consider the following pseudocode: > > > > void try_use_a_thing(int idx) > > { > > bool ready = ssbo.arr[idx].ready; > > vec4 data = ssbo.arr[idx].data; > > if (ready) > > use(data); > > } > > > > void write_a_thing(int idx, vec4 data) > > { > > ssbo.arr[idx].data = data; > > ssbo.arr[idx].ready = true; > > } > > > > Our current instruction scheduling scheme doesn't see any problem with > > re-ordering the load of "ready" with respect to the load of "data". > > However, if try_use_a_thing is called in one thread and write_a_thing is > > called in another thread, such re-ordering could cause invalid data to > > be used. Normally, some re-ordering of loads is fine; however, with the > > Vulkan memory model, there are some additional guarantees that are > > provided particularly in the case of atomic loads which we currently > > don't differentiate in any way in the back-end. > > I have encountered similar problems with reordering in OpenGL land > reading a value that is also accessed using atomics. SPIR-V has an > atomic load instruction. I haven't looked at how we handle that in > spv_to_nir, but could we use something like that in these cases? > OpenGL only has load/store and no concept of an atomic load or store. I think the theory is that loads/stores of a 32-bit integer should already be atomic, right? Unfortunately, that leaves the ordering rules ambiguous. SPIR-V has AtomicLoad and AtomicStore opcodes which could imply the appropreate ordering constraings more-or-less. With the memory model, however, things get a lot more granular and you can start specifying very fine-grained ordering constraints on loads, stores, and atomics. The primary difference, in this brave new world, between atomic and not is that atomics get some of those constraints by default whereas you have to ask for it on a regular load/store. Currently SPIR-V's OpAtomicLoad is just translated to a regular load in NIR. --Jason > > Obviously, we need to come up with something better for this than just > > shutting off the scheduler but I wanted to send this out earlier rather > > than later and provide the opportunity for a discussion. > > --- > > src/intel/compiler/brw_fs.cpp | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/src/intel/compiler/brw_fs.cpp > b/src/intel/compiler/brw_fs.cpp > > index 3f7f2b4c984..9df238a6f6a 100644 > > --- a/src/intel/compiler/brw_fs.cpp > > +++ b/src/intel/compiler/brw_fs.cpp > > @@ -6427,7 +6427,7 @@ fs_visitor::allocate_registers(unsigned > min_dispatch_width, bool allow_spilling) > > * performance but increasing likelihood of allocating. > > */ > > for (unsigned i = 0; i < ARRAY_SIZE(pre_modes); i++) { > > - schedule_instructions(pre_modes[i]); > > + //schedule_instructions(pre_modes[i]); > > > > if (0) { > > assign_regs_trivial(); > > @@ -6478,7 +6478,7 @@ fs_visitor::allocate_registers(unsigned > min_dispatch_width, bool allow_spilling) > > > > opt_bank_conflicts(); > > > > - schedule_instructions(SCHEDULE_POST); > > + //schedule_instructions(SCHEDULE_POST); > > > > if (last_scratch > 0) { > > MAYBE_UNUSED unsigned max_scratch_size = 2 * 1024 * 1024; > > > >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev