Re: [Mesa-dev] [PATCH v3] glsl: Optimize min/max expression trees

2014-09-30 Thread Iago Toral Quiroga
On lun, 2014-09-29 at 13:19 -0400, Connor Abbott wrote:
 On Mon, Sep 29, 2014 at 7:49 AM, Iago Toral Quiroga ito...@igalia.com wrote:
  Original patch by Petri Latvala petri.latv...@intel.com:
 
  Add an optimization pass that drops min/max expression operands that
  can be proven to not contribute to the final result. The algorithm is
  similar to alpha-beta pruning on a minmax search, from the field of
  AI.
 
  This optimization pass can optimize min/max expressions where operands
  are min/max expressions. Such code can appear in shaders by itself, or
  as the result of clamp() or AMD_shader_trinary_minmax functions.
 
  This optimization pass improves the generated code for piglit's
  AMD_shader_trinary_minmax tests as follows:
 
  total instructions in shared programs: 75 - 67 (-10.67%)
  instructions in affected programs: 60 - 52 (-13.33%)
  GAINED:0
  LOST:  0
 
  All tests (max3, min3, mid3) improved.
 
  A full shader-db run:
 
  total instructions in shared programs: 4293603 - 4293575 (-0.00%)
  instructions in affected programs: 1188 - 1160 (-2.36%)
  GAINED:0
  LOST:  0
 
  Improvements happen in Guacamelee and Serious Sam 3. One shader from
  Dungeon Defenders is hurt by shader-db metrics (26 - 28), because of
  dropping of a (constant float (0.0)) operand, which was
  compiled to a saturate modifier.
 
  Version 2 by Iago Toral Quiroga ito...@igalia.com:
 
  Changes from review feedback:
  - Squashed various cosmetic changes sent by Matt Turner.
  - Make less_all_components return an enum rather than setting a class 
  member.
(Suggested by Mat Turner). Also, renamed it to compare_components.
  - Make less_all_components, smaller_constant and larger_constant static.
(Suggested by Mat Turner)
  - Change mixmax_range to call its limits low and high instead of
range[0] and range[1]. (Suggested by Connor Abbot).
  - Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
Connor Abbot).
  - Make the logic more clearer by rearrenging the code and commenting.
(Suggested by Connor Abbot).
  - Added comment to explain why we need to recurse twice. (Suggested by
Connor Abbot).
  - If we cannot prune an expression, do not return early. Instead, attempt
to prune its children. (Suggested by Connor Abbot).
 
  Other changes:
  - Instead of having a global valid visitor member, let the various 
  functions
that can determine this status return a boolean and check for its value
to decide what to do in each case. This is more flexible and allows to
recurse into children of parents that could not be prunned due to invalid
ranges (so related to the last bullet in the review feedback).
  - Make sure we always check if a range is valid before working with it. 
  Since
any use of get_range, combine_range or range_intersection can invalidate
a range we should check for this situation every time we use any of these
functions.
 
  Version 3 by Iago Toral Quiroga ito...@igalia.com:
 
  Changes from review feedback:
  - Now we can make get_range, combine_range and range_intersection static too
(suggested by Connor Abbot).
  - Do not return NULL when looking for the larger or greater constant into
mixed vector constants. Instead, produce a new constant by doing a
component-wise minmax. With this we can also remove of the validations 
  when
we call into these functions (suggested by Connor Abbot).
  - Add a comment explaining the meaning of the baserange argument in
prune_expression (suggested by Connor Abbot).
 
  Other changes:
  - Eliminate minmax expressions operating on constant vectors with mixed 
  values
by resolving them.
 
  No piglit regressions observed with Version 3.
 
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861
  ---
 
  Version 3 passes all minmax unit tests sent in the original series by Petri,
  except for the ones aimed at mixed vectors because this version produces
  better code for these than what is expected by these tests.
 
 Thanks for cleaning this up! This patch is
 
 Reviewed-by: Connor Abbott cwabbo...@gmail.com
 
 Dylan had some style comments on the Python patches IIRC, but other
 than that I think they looked good (assuming you fix the broken tests
 you mentioned) - it has been a while though.

Yes, there was some review feedback for those too, I'll look into them
and send updated versions here for review.

Thanks for reviewing this!

Iago

 
   src/glsl/Makefile.sources   |   1 +
   src/glsl/glsl_parser_extras.cpp |   1 +
   src/glsl/ir_optimization.h  |   1 +
   src/glsl/opt_minmax.cpp | 464 
  
   4 files changed, 467 insertions(+)
   create mode 100644 src/glsl/opt_minmax.cpp
 
  diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
  index cb8d5a6..1c08697 100644
  

Re: [Mesa-dev] [PATCH v3] glsl: Optimize min/max expression trees

2014-09-30 Thread Iago Toral Quiroga
I just noticed that we should add:

index a4fe2bd..ca53eb8 100644
--- a/src/glsl/opt_minmax.cpp
+++ b/src/glsl/opt_minmax.cpp
@@ -415,6 +415,17 @@ ir_minmax_visitor::prune_expression(ir_expression *expr, 
minmax_range baserange)
   }
}
 
+   /* If we got here we could not discard any of the operands of the minmax
+* expression, but we can still try to resolve the expression if both
+* operands are constant. We do this after the loop above, to make sure
+* that if our operands are minmax expressions we have tried to prune them
+* first (hopefully reducing them to constants).
+*/
+   ir_constant *a = expr-operands[0]-as_constant();
+   ir_constant *b = expr-operands[1]-as_constant();
+   if (a  b)
+  return combine_constant(ismin, a, b);
+
return expr;
 }

at the bottom of prune_expression. This makes sure that when we prune
the operands of a minmax expression to constants, we also resolve the
parent expression to a constant, otherwise we will leave the parent with
two constant arguments. I noticed this while reworking the unit tests
for mixed vectors.

Connor: if you give the okay to this change I will squash it in before
pushing.

Iago

On lun, 2014-09-29 at 13:19 -0400, Connor Abbott wrote:
 On Mon, Sep 29, 2014 at 7:49 AM, Iago Toral Quiroga ito...@igalia.com wrote:
  Original patch by Petri Latvala petri.latv...@intel.com:
 
  Add an optimization pass that drops min/max expression operands that
  can be proven to not contribute to the final result. The algorithm is
  similar to alpha-beta pruning on a minmax search, from the field of
  AI.
 
  This optimization pass can optimize min/max expressions where operands
  are min/max expressions. Such code can appear in shaders by itself, or
  as the result of clamp() or AMD_shader_trinary_minmax functions.
 
  This optimization pass improves the generated code for piglit's
  AMD_shader_trinary_minmax tests as follows:
 
  total instructions in shared programs: 75 - 67 (-10.67%)
  instructions in affected programs: 60 - 52 (-13.33%)
  GAINED:0
  LOST:  0
 
  All tests (max3, min3, mid3) improved.
 
  A full shader-db run:
 
  total instructions in shared programs: 4293603 - 4293575 (-0.00%)
  instructions in affected programs: 1188 - 1160 (-2.36%)
  GAINED:0
  LOST:  0
 
  Improvements happen in Guacamelee and Serious Sam 3. One shader from
  Dungeon Defenders is hurt by shader-db metrics (26 - 28), because of
  dropping of a (constant float (0.0)) operand, which was
  compiled to a saturate modifier.
 
  Version 2 by Iago Toral Quiroga ito...@igalia.com:
 
  Changes from review feedback:
  - Squashed various cosmetic changes sent by Matt Turner.
  - Make less_all_components return an enum rather than setting a class 
  member.
(Suggested by Mat Turner). Also, renamed it to compare_components.
  - Make less_all_components, smaller_constant and larger_constant static.
(Suggested by Mat Turner)
  - Change mixmax_range to call its limits low and high instead of
range[0] and range[1]. (Suggested by Connor Abbot).
  - Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
Connor Abbot).
  - Make the logic more clearer by rearrenging the code and commenting.
(Suggested by Connor Abbot).
  - Added comment to explain why we need to recurse twice. (Suggested by
Connor Abbot).
  - If we cannot prune an expression, do not return early. Instead, attempt
to prune its children. (Suggested by Connor Abbot).
 
  Other changes:
  - Instead of having a global valid visitor member, let the various 
  functions
that can determine this status return a boolean and check for its value
to decide what to do in each case. This is more flexible and allows to
recurse into children of parents that could not be prunned due to invalid
ranges (so related to the last bullet in the review feedback).
  - Make sure we always check if a range is valid before working with it. 
  Since
any use of get_range, combine_range or range_intersection can invalidate
a range we should check for this situation every time we use any of these
functions.
 
  Version 3 by Iago Toral Quiroga ito...@igalia.com:
 
  Changes from review feedback:
  - Now we can make get_range, combine_range and range_intersection static too
(suggested by Connor Abbot).
  - Do not return NULL when looking for the larger or greater constant into
mixed vector constants. Instead, produce a new constant by doing a
component-wise minmax. With this we can also remove of the validations 
  when
we call into these functions (suggested by Connor Abbot).
  - Add a comment explaining the meaning of the baserange argument in
prune_expression (suggested by Connor Abbot).
 
  Other changes:
  - Eliminate minmax expressions operating on constant vectors with mixed 
  values
by 

[Mesa-dev] [PATCH v3] glsl: Optimize min/max expression trees

2014-09-29 Thread Iago Toral Quiroga
Original patch by Petri Latvala petri.latv...@intel.com:

Add an optimization pass that drops min/max expression operands that
can be proven to not contribute to the final result. The algorithm is
similar to alpha-beta pruning on a minmax search, from the field of
AI.

This optimization pass can optimize min/max expressions where operands
are min/max expressions. Such code can appear in shaders by itself, or
as the result of clamp() or AMD_shader_trinary_minmax functions.

This optimization pass improves the generated code for piglit's
AMD_shader_trinary_minmax tests as follows:

total instructions in shared programs: 75 - 67 (-10.67%)
instructions in affected programs: 60 - 52 (-13.33%)
GAINED:0
LOST:  0

All tests (max3, min3, mid3) improved.

A full shader-db run:

total instructions in shared programs: 4293603 - 4293575 (-0.00%)
instructions in affected programs: 1188 - 1160 (-2.36%)
GAINED:0
LOST:  0

Improvements happen in Guacamelee and Serious Sam 3. One shader from
Dungeon Defenders is hurt by shader-db metrics (26 - 28), because of
dropping of a (constant float (0.0)) operand, which was
compiled to a saturate modifier.

Version 2 by Iago Toral Quiroga ito...@igalia.com:

Changes from review feedback:
- Squashed various cosmetic changes sent by Matt Turner.
- Make less_all_components return an enum rather than setting a class member.
  (Suggested by Mat Turner). Also, renamed it to compare_components.
- Make less_all_components, smaller_constant and larger_constant static.
  (Suggested by Mat Turner)
- Change mixmax_range to call its limits low and high instead of
  range[0] and range[1]. (Suggested by Connor Abbot).
- Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
  Connor Abbot).
- Make the logic more clearer by rearrenging the code and commenting.
  (Suggested by Connor Abbot).
- Added comment to explain why we need to recurse twice. (Suggested by
  Connor Abbot).
- If we cannot prune an expression, do not return early. Instead, attempt
  to prune its children. (Suggested by Connor Abbot).

Other changes:
- Instead of having a global valid visitor member, let the various functions
  that can determine this status return a boolean and check for its value
  to decide what to do in each case. This is more flexible and allows to
  recurse into children of parents that could not be prunned due to invalid
  ranges (so related to the last bullet in the review feedback).
- Make sure we always check if a range is valid before working with it. Since
  any use of get_range, combine_range or range_intersection can invalidate
  a range we should check for this situation every time we use any of these
  functions.

Version 3 by Iago Toral Quiroga ito...@igalia.com:

Changes from review feedback:
- Now we can make get_range, combine_range and range_intersection static too
  (suggested by Connor Abbot).
- Do not return NULL when looking for the larger or greater constant into
  mixed vector constants. Instead, produce a new constant by doing a
  component-wise minmax. With this we can also remove of the validations when
  we call into these functions (suggested by Connor Abbot).
- Add a comment explaining the meaning of the baserange argument in
  prune_expression (suggested by Connor Abbot).

Other changes:
- Eliminate minmax expressions operating on constant vectors with mixed values
  by resolving them.

No piglit regressions observed with Version 3.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861
---

Version 3 passes all minmax unit tests sent in the original series by Petri,
except for the ones aimed at mixed vectors because this version produces
better code for these than what is expected by these tests.

 src/glsl/Makefile.sources   |   1 +
 src/glsl/glsl_parser_extras.cpp |   1 +
 src/glsl/ir_optimization.h  |   1 +
 src/glsl/opt_minmax.cpp | 464 
 4 files changed, 467 insertions(+)
 create mode 100644 src/glsl/opt_minmax.cpp

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index cb8d5a6..1c08697 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -95,6 +95,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/opt_flip_matrices.cpp \
$(GLSL_SRCDIR)/opt_function_inlining.cpp \
$(GLSL_SRCDIR)/opt_if_simplification.cpp \
+   $(GLSL_SRCDIR)/opt_minmax.cpp \
$(GLSL_SRCDIR)/opt_noop_swizzle.cpp \
$(GLSL_SRCDIR)/opt_rebalance_tree.cpp \
$(GLSL_SRCDIR)/opt_redundant_jumps.cpp \
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 490c3c8..ae19ce4 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1586,6 +1586,7 @@ do_common_optimization(exec_list *ir, bool linked,
else
   progress = do_constant_variable_unlinked(ir) || progress;

Re: [Mesa-dev] [PATCH v3] glsl: Optimize min/max expression trees

2014-09-29 Thread Connor Abbott
On Mon, Sep 29, 2014 at 7:49 AM, Iago Toral Quiroga ito...@igalia.com wrote:
 Original patch by Petri Latvala petri.latv...@intel.com:

 Add an optimization pass that drops min/max expression operands that
 can be proven to not contribute to the final result. The algorithm is
 similar to alpha-beta pruning on a minmax search, from the field of
 AI.

 This optimization pass can optimize min/max expressions where operands
 are min/max expressions. Such code can appear in shaders by itself, or
 as the result of clamp() or AMD_shader_trinary_minmax functions.

 This optimization pass improves the generated code for piglit's
 AMD_shader_trinary_minmax tests as follows:

 total instructions in shared programs: 75 - 67 (-10.67%)
 instructions in affected programs: 60 - 52 (-13.33%)
 GAINED:0
 LOST:  0

 All tests (max3, min3, mid3) improved.

 A full shader-db run:

 total instructions in shared programs: 4293603 - 4293575 (-0.00%)
 instructions in affected programs: 1188 - 1160 (-2.36%)
 GAINED:0
 LOST:  0

 Improvements happen in Guacamelee and Serious Sam 3. One shader from
 Dungeon Defenders is hurt by shader-db metrics (26 - 28), because of
 dropping of a (constant float (0.0)) operand, which was
 compiled to a saturate modifier.

 Version 2 by Iago Toral Quiroga ito...@igalia.com:

 Changes from review feedback:
 - Squashed various cosmetic changes sent by Matt Turner.
 - Make less_all_components return an enum rather than setting a class member.
   (Suggested by Mat Turner). Also, renamed it to compare_components.
 - Make less_all_components, smaller_constant and larger_constant static.
   (Suggested by Mat Turner)
 - Change mixmax_range to call its limits low and high instead of
   range[0] and range[1]. (Suggested by Connor Abbot).
 - Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
   Connor Abbot).
 - Make the logic more clearer by rearrenging the code and commenting.
   (Suggested by Connor Abbot).
 - Added comment to explain why we need to recurse twice. (Suggested by
   Connor Abbot).
 - If we cannot prune an expression, do not return early. Instead, attempt
   to prune its children. (Suggested by Connor Abbot).

 Other changes:
 - Instead of having a global valid visitor member, let the various functions
   that can determine this status return a boolean and check for its value
   to decide what to do in each case. This is more flexible and allows to
   recurse into children of parents that could not be prunned due to invalid
   ranges (so related to the last bullet in the review feedback).
 - Make sure we always check if a range is valid before working with it. Since
   any use of get_range, combine_range or range_intersection can invalidate
   a range we should check for this situation every time we use any of these
   functions.

 Version 3 by Iago Toral Quiroga ito...@igalia.com:

 Changes from review feedback:
 - Now we can make get_range, combine_range and range_intersection static too
   (suggested by Connor Abbot).
 - Do not return NULL when looking for the larger or greater constant into
   mixed vector constants. Instead, produce a new constant by doing a
   component-wise minmax. With this we can also remove of the validations when
   we call into these functions (suggested by Connor Abbot).
 - Add a comment explaining the meaning of the baserange argument in
   prune_expression (suggested by Connor Abbot).

 Other changes:
 - Eliminate minmax expressions operating on constant vectors with mixed values
   by resolving them.

 No piglit regressions observed with Version 3.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861
 ---

 Version 3 passes all minmax unit tests sent in the original series by Petri,
 except for the ones aimed at mixed vectors because this version produces
 better code for these than what is expected by these tests.

Thanks for cleaning this up! This patch is

Reviewed-by: Connor Abbott cwabbo...@gmail.com

Dylan had some style comments on the Python patches IIRC, but other
than that I think they looked good (assuming you fix the broken tests
you mentioned) - it has been a while though.


  src/glsl/Makefile.sources   |   1 +
  src/glsl/glsl_parser_extras.cpp |   1 +
  src/glsl/ir_optimization.h  |   1 +
  src/glsl/opt_minmax.cpp | 464 
 
  4 files changed, 467 insertions(+)
  create mode 100644 src/glsl/opt_minmax.cpp

 diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
 index cb8d5a6..1c08697 100644
 --- a/src/glsl/Makefile.sources
 +++ b/src/glsl/Makefile.sources
 @@ -95,6 +95,7 @@ LIBGLSL_FILES = \
 $(GLSL_SRCDIR)/opt_flip_matrices.cpp \
 $(GLSL_SRCDIR)/opt_function_inlining.cpp \
 $(GLSL_SRCDIR)/opt_if_simplification.cpp \
 +   $(GLSL_SRCDIR)/opt_minmax.cpp \