On Fri, Apr 1, 2016 at 9:56 PM, Jason Ekstrand <[email protected]> wrote:
> > > On Tue, Mar 29, 2016 at 8:49 PM, Matt Turner <[email protected]> wrote: > >> On Fri, Mar 25, 2016 at 4:12 PM, Jason Ekstrand <[email protected]> >> wrote: >> > --- >> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32 >> ++++++++++++++++++++++++++++++ >> > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 32 >> ++++++++++++++++++++++++++++++ >> > 2 files changed, 64 insertions(+) >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> > index 14480fb..131f50e 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> > @@ -844,8 +844,40 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, >> nir_alu_instr *instr) >> > unreachable("Should have been lowered by borrow_to_arith()."); >> > >> > case nir_op_umod: >> > + case nir_op_irem: >> > + /* According to the sign table for INT DIV in the Ivy Bridge >> PRM, it >> > + * appears that our hardware just does the right thing for signed >> > + * remainder. >> > + */ >> > + bld.emit(SHADER_OPCODE_INT_REMAINDER, result, op[0], op[1]); >> > + break; >> > + >> > + case nir_op_imod: { >> > + /* Get a regular C-style remainder. If a % b == 0, set the >> predicate. */ >> > bld.emit(SHADER_OPCODE_INT_REMAINDER, result, op[0], op[1]); >> > + >> > + /* Math instructions don't support conditional mod */ >> > + inst = bld.MOV(bld.null_reg_d(), result); >> > + inst->conditional_mod = BRW_CONDITIONAL_NZ; >> > + >> > + /* Now, we need to determine if signs of the sources are >> different. >> > + * When we XOR the sources, the top bit is 0 if they are the >> same and 1 >> > + * if they are different. We can then use a conditional >> modifier to >> > + * turn that into a predicate. This leads us to an XOR.l >> instruction. >> > + */ >> > + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_D); >> > + inst = bld.XOR(tmp, op[0], op[1]); >> > + inst->predicate = BRW_PREDICATE_NORMAL; >> > + inst->conditional_mod = BRW_CONDITIONAL_L; >> >> This goes against the PRM: >> >> "This operation does not produce sign or overflow conditions. Only the >> .e/.z or .ne/.nz conditional modifiers should be used." >> > > So, interesting news: I wrote a Vulkan CTS test for both imod and irem > (they weren't tested by the CTS before) to see what the hardware does. > And, contrary to what the PRM might lead you to believe, it seems to work > just fine. I've only tested on SKL so far but I have tested in both FS and > vec4. I'll try it on BDW and HSW before declaring victory, but it looks > like XOR.l might be a well-defined thing after all. > More data: It appears to also work on Ivy Bridge. > At the very least, we should add a comment with the PRM citation and the > empirical results. > --Jason >
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
