On Sun, Apr 6, 2014 at 11:49 PM, Kenneth Graunke <kenn...@whitecape.org> wrote: > Integer shifts are basically always well supported and efficient; that > isn't always true of integer division, and sometimes even integer > multiplication isn't without issues. > > On some Intel hardware, INTDIV can't be used in SIMD16 mode. It also > doesn't support immediate operands (on any generation), while ASR can. > > On Haswell, this cuts the number of instructions in dolphin/efb2ram by > 7.94%. It also removes a single MOV in dolphin/realxfb (due to ASR > supporting immediates), and gains SIMD16 support (due to no INTDIV).
Just a note that the upstream shaders have already been changed to not do integer division. That's the reason I didn't bother fixing up my patch a couple of weeks ago. > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > --- > src/glsl/opt_algebraic.cpp | 70 > ++++++++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 68 insertions(+), 2 deletions(-) > > diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp > index 9d55392..eccc2eb 100644 > --- a/src/glsl/opt_algebraic.cpp > +++ b/src/glsl/opt_algebraic.cpp > @@ -34,6 +34,7 @@ > #include "ir_optimization.h" > #include "ir_builder.h" > #include "glsl_types.h" > +#include "main/macros.h" > > using namespace ir_builder; > > @@ -68,6 +69,8 @@ public: > int op2); > ir_rvalue *swizzle_if_required(ir_expression *expr, > ir_rvalue *operand); > + ir_rvalue *convert_int_math_to_shifts(ir_expression *ir, > + ir_constant *op_const_array[4]); > > void *mem_ctx; > > @@ -185,6 +188,59 @@ ir_algebraic_visitor::reassociate_constant(ir_expression > *ir1, int const_index, > return false; > } > > +/** > + * Transform integer multiplication/division by a constant power-of-two > + * factor into shift instructions. > + */ > +ir_rvalue * > +ir_algebraic_visitor::convert_int_math_to_shifts(ir_expression *ir, > + ir_constant > *op_const_array[4]) > +{ > + /* This optimization only makes sense for GPUs with native integers. */ > + if (!native_integers) > + return NULL; > + > + assert(ir->operation == ir_binop_mul || ir->operation == ir_binop_div); > + > + /* Shifts only work for integer types. */ > + if (!ir->type->is_integer()) > + return NULL; > + > + ir_constant *const_op; > + ir_rvalue *other_op; > + if (op_const_array[0]) { > + const_op = op_const_array[0]; > + other_op = ir->operands[1]; > + } else if (op_const_array[1]) { > + const_op = op_const_array[1]; > + other_op = ir->operands[0]; > + } else { > + /* If neither is a constant, we can't check for powers of two. */ > + return NULL; > + } > + > + ir_constant_data shift_data; > + for (int i = 0; i < const_op->type->vector_elements; i++) { > + if (const_op->type->base_type == GLSL_TYPE_INT && > + const_op->value.i[i] <= 0) { > + /* Negative values aren't powers of two. */ I was going to comment on the is_power_of_two() function returning true for 0 and how this was going to break things, but I see this check for <= 0 should handle it. I don't know what is_power_of_two() should return for zero. The comment "Negative values" would be more correct if we returned false from is_power_of_two() though. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev