Not much to say about the code (the theory sounds sane), but I was
wondering about the comment.
Why does GLSL really define this as x * (1 - a) + y * a?
The usual way to do a lerp would be (y - x) * a + x, i.e. two ops on most
GPUs (sub+mad), or three where there is no mad (sub+mul+add). But I'm
wondering whether that sacrifices precision or gets Infs wrong or something
(this is how the gallivm code implements TGSI_OPCODE_LRP). I guess strict
IEEE conformance would really forbid that optimization though...
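
To make the question concrete, here is a minimal sketch in plain C of the two
forms side by side. The function names lerp_glsl and lerp_sub_mad are made up
for illustration and are not taken from Mesa or gallivm; the sketch assumes
IEEE-754 single-precision floats and no -ffast-math.

```c
#include <math.h>

/* The form the mix() built-in was written as: x * (1 - a) + y * a.
 * Hypothetical helper name, for illustration only. */
static float lerp_glsl(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* The two-op form: (y - x) * a + x (sub + mad).
 * Hypothetical helper name, for illustration only. */
static float lerp_sub_mad(float x, float y, float a)
{
    return (y - x) * a + x;
}

/*
 * Two inputs where the forms disagree:
 *
 *   x = 1e30, y = 1, a = 1
 *     lerp_glsl:    1e30 * 0 + 1 * 1      = 1
 *     lerp_sub_mad: (1 - 1e30) * 1 + 1e30 = 0   (the 1 is absorbed by the sub)
 *
 *   x = +Inf, y = 1, a = 0
 *     lerp_glsl:    Inf * 1 + 1 * 0       = Inf
 *     lerp_sub_mad: (1 - Inf) * 0 + Inf   = NaN (-Inf * 0 is NaN)
 */
```

So at least for these inputs the two-op form does lose the exact endpoint at
a = 1 and turns an Inf into a NaN, which seems to be the kind of difference
strict IEEE conformance would not let a compiler introduce on its own.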
Roland
On 20.02.2013 02:03, Matt Turner wrote:
From: Kenneth Graunke kenn...@whitecape.org
Many GPUs have an instruction to do linear interpolation which is more
efficient than simply performing the algebra necessary (two multiplies,
an add, and a subtract).
Pattern matching or peepholing this is more desirable, but can be
tricky. By using an opcode, we can at least make shaders which use the
mix() built-in get the more efficient behavior.
Currently, all consumers lower ir_triop_lrp. Subsequent patches will
actually generate different code.
v2 [mattst88]:
- Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
subsequent patch and ir_triop_lrp translated directly.
Reviewed-by: Matt Turner matts...@gmail.com
---
src/glsl/builtins/ir/mix.ir                | 14 +-
src/glsl/ir.cpp                            |  4 +++
src/glsl/ir.h                              |  7 +
src/glsl/ir_constant_expression.cpp        | 13 ++
src/glsl/ir_optimization.h                 |  1 +
src/glsl/ir_validate.cpp                   |  6
src/glsl/lower_instructions.cpp            | 35
src/mesa/drivers/dri/i965/brw_shader.cpp   |  3 +-
src/mesa/program/ir_to_mesa.cpp            |  6 -
src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  1 +
10 files changed, 81 insertions(+), 9 deletions(-)
diff --git a/src/glsl/builtins/ir/mix.ir b/src/glsl/builtins/ir/mix.ir
index 70ae13c..e666532 100644
--- a/src/glsl/builtins/ir/mix.ir
+++ b/src/glsl/builtins/ir/mix.ir
@@ -4,49 +4,49 @@
(declare (in) float arg0)
(declare (in) float arg1)
(declare (in) float arg2))
- ((return (expression float + (expression float * (var_ref arg0)
(expression float - (constant float (1.00)) (var_ref arg2))) (expression
float * (var_ref arg1) (var_ref arg2))
+ ((return (expression float lrp (var_ref arg0) (var_ref arg1) (var_ref
arg2)
(signature vec2
(parameters
(declare (in) vec2 arg0)
(declare (in) vec2 arg1)
(declare (in) vec2 arg2))
- ((return (expression vec2 + (expression vec2 * (var_ref arg0)
(expression vec2 - (constant float (1.00)) (var_ref arg2))) (expression
vec2 * (var_ref arg1) (var_ref arg2))
+ ((return (expression vec2 lrp (var_ref arg0) (var_ref arg1) (var_ref
arg2)
(signature vec3
(parameters
(declare (in) vec3 arg0)
(declare (in) vec3 arg1)
(declare (in) vec3 arg2))
- ((return (expression vec3 + (expression vec3 * (var_ref arg0)
(expression vec3 - (constant float (1.00)) (var_ref arg2))) (expression
vec3 * (var_ref arg1) (var_ref arg2))
+ ((return (expression vec3 lrp (var_ref arg0) (var_ref arg1) (var_ref
arg2)
(signature vec4
(parameters
(declare (in) vec4 arg0)
(declare (in) vec4 arg1)
(declare (in) vec4 arg2))
- ((return (expression vec4 + (expression vec4 * (var_ref arg0)
(expression vec4 - (constant float (1.00)) (var_ref arg2))) (expression
vec4 * (var_ref arg1) (var_ref arg2))
+ ((return (expression vec4 lrp (var_ref arg0) (var_ref arg1) (var_ref
arg2)
(signature vec2
(parameters
(declare (in) vec2 arg0)
(declare (in) vec2 arg1)
(declare (in) float arg2))
- ((return (expression vec2 + (expression vec2 * (var_ref arg0)
(expression float - (constant float (1.00)) (var_ref arg2))) (expression
vec2 * (var_ref arg1) (var_ref arg2))
+ ((return (expression vec2 lrp (var_ref arg0) (var_ref arg1) (var_ref
arg2)
(signature vec3
(parameters
(declare (in) vec3 arg0)
(declare (in) vec3 arg1)
(declare (in) float arg2))
- ((return (expression vec3 + (expression vec3 * (var_ref arg0)
(expression float - (constant float (1.00)) (var_ref arg2))) (expression
vec3 * (var_ref arg1) (var_ref arg2))
+ ((return (expression vec3 lrp (var_ref arg0) (var_ref arg1) (var_ref
arg2)
(signature vec4
(parameters
(declare (in) vec4 arg0)
(declare (in) vec4 arg1)
(declare (in) float arg2))
- ((return (expression vec4 + (expression vec4 * (var_ref arg0)
(expression float - (constant float (1.00)) (var_ref arg2))) (expression
vec4 * (var_ref arg1) (var_ref arg2))
+ ((return (expression vec4 lrp (var_ref