Re: [Mesa-dev] [PATCH 3/9] glsl: Convert mix() to use a new ir_triop_lrp opcode.

2013-02-20 Thread Aras Pranckevicius
 Why did glsl implement this really as x * (1 - a) + y * a?
 The usual way for lerp would be (y - x) * a + x, i.e. two ops for most
 gpus (sub+mad, or sub+mul+add). But I'm wondering if that sacrifices
 precision


Yes.
http://fgiesen.wordpress.com/2012/08/15/linear-interpolation-past-present-and-future/



-- 
Aras Pranckevičius
work: http://unity3d.com
home: http://aras-p.info
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/9] glsl: Convert mix() to use a new ir_triop_lrp opcode.

2013-02-20 Thread Roland Scheidegger
On 20.02.2013 11:39, Aras Pranckevicius wrote:
 
 Why did glsl implement this really as x * (1 - a) + y * a?
 The usual way for lerp would be (y - x) * a + x, i.e. two ops for most
 gpus (sub+mad, or sub+mul+add). But I'm wondering if that sacrifices
 precision
 
 
 Yes.
 http://fgiesen.wordpress.com/2012/08/15/linear-interpolation-past-present-and-future/

Ah, OK. Also (from the comments), the other form could be better in
some circumstances, but I guess getting accurate results for weight ==
1.0 really is more important.
(It also looks like at least some infinity cases wouldn't give the same
result either: if y and a are ordinary numbers and x is +infinity, the
first equation would yield +/- infinity depending on a, but the second
would result (again depending on a) in either +infinity or NaN.)
If those properties are important, we might have to use that form for
TGSI_OPCODE_LRP at some point...

Roland
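
The infinity cases described above can be checked with a small sketch. It is written in Python, whose floats are IEEE 754 doubles rather than the 32-bit floats a GPU would use, and the names lerp_weighted and lerp_delta are made up for this illustration; neither is Mesa or gallivm code.

```python
import math

def lerp_weighted(x, y, a):
    # The "first equation" from the thread: x * (1 - a) + y * a
    return x * (1.0 - a) + y * a

def lerp_delta(x, y, a):
    # The "second equation" from the thread: (y - x) * a + x
    return (y - x) * a + x

# x is +infinity, y and a are ordinary numbers.
for a in (-0.5, 0.5, 1.5):
    print(a, lerp_weighted(math.inf, 2.0, a), lerp_delta(math.inf, 2.0, a))
```

For these three weights the first form returns +inf, +inf and -inf, while the second returns +inf, NaN and NaN, because (y - x) is already -inf before it is scaled and added back to x.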


Re: [Mesa-dev] [PATCH 3/9] glsl: Convert mix() to use a new ir_triop_lrp opcode.

2013-02-19 Thread Roland Scheidegger
Not much to say about the code (the theory sounds sane), but I was
wondering about the comment.
Why did GLSL really implement this as x * (1 - a) + y * a?
The usual way to do a lerp would be (y - x) * a + x, i.e. two ops on most
GPUs (sub+mad), or three without a mad (sub+mul+add). But I'm wondering if
that sacrifices precision or gets Infs wrong or something (this is the way
the gallivm code implements TGSI_OPCODE_LRP). I guess strict IEEE
conformance would really forbid that optimization though...

Roland
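
As a rough illustration of the precision question raised above, the sketch below evaluates both forms at a weight of exactly 1.0 with endpoints of very different magnitude. It uses Python doubles instead of 32-bit GPU floats, and the function names are hypothetical, chosen only for this example.

```python
def lerp_weighted(x, y, a):
    # x * (1 - a) + y * a: two multiplies, an add, and a subtract
    return x * (1.0 - a) + y * a

def lerp_delta(x, y, a):
    # (y - x) * a + x: a subtract followed by a multiply-add
    return (y - x) * a + x

x, y = 1e17, 1.0  # endpoints chosen so that y - x cannot be represented exactly

print(lerp_weighted(x, y, 1.0))  # 1.0: x * 0.0 vanishes and y * 1.0 is exact
print(lerp_delta(x, y, 1.0))     # 0.0: y - x rounds to -1e17, so y is lost
```

With these inputs the weighted form returns y exactly at a == 1.0, while the shorter form loses y to rounding in the subtraction. Other inputs can favor the shorter form, so this is one data point, not a general ranking.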


On 20.02.2013 02:03, Matt Turner wrote:
 From: Kenneth Graunke kenn...@whitecape.org
 
 Many GPUs have an instruction to do linear interpolation which is more
 efficient than simply performing the algebra necessary (two multiplies,
 an add, and a subtract).
 
 Pattern matching or peepholing this is more desirable, but can be
 tricky.  By using an opcode, we can at least make shaders which use the
 mix() built-in get the more efficient behavior.
 
 Currently, all consumers lower ir_triop_lrp.  Subsequent patches will
 actually generate different code.
 
 v2 [mattst88]:
- Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
  subsequent patch and ir_triop_lrp translated directly.
 
 Reviewed-by: Matt Turner matts...@gmail.com
 ---
  src/glsl/builtins/ir/mix.ir                |   14 +-
  src/glsl/ir.cpp                            |    4 +++
  src/glsl/ir.h                              |    7 +
  src/glsl/ir_constant_expression.cpp        |   13 ++
  src/glsl/ir_optimization.h                 |    1 +
  src/glsl/ir_validate.cpp                   |    6
  src/glsl/lower_instructions.cpp            |   35
  src/mesa/drivers/dri/i965/brw_shader.cpp   |    3 +-
  src/mesa/program/ir_to_mesa.cpp            |    6 -
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |    1 +
  10 files changed, 81 insertions(+), 9 deletions(-)
 
 diff --git a/src/glsl/builtins/ir/mix.ir b/src/glsl/builtins/ir/mix.ir
 index 70ae13c..e666532 100644
 --- a/src/glsl/builtins/ir/mix.ir
 +++ b/src/glsl/builtins/ir/mix.ir
 @@ -4,49 +4,49 @@
 (declare (in) float arg0)
 (declare (in) float arg1)
 (declare (in) float arg2))
 - ((return (expression float + (expression float * (var_ref arg0) (expression float - (constant float (1.00)) (var_ref arg2))) (expression float * (var_ref arg1) (var_ref arg2))))))
 + ((return (expression float lrp (var_ref arg0) (var_ref arg1) (var_ref arg2)))))
  
 (signature vec2
   (parameters
 (declare (in) vec2 arg0)
 (declare (in) vec2 arg1)
 (declare (in) vec2 arg2))
 - ((return (expression vec2 + (expression vec2 * (var_ref arg0) (expression vec2 - (constant float (1.00)) (var_ref arg2))) (expression vec2 * (var_ref arg1) (var_ref arg2))))))
 + ((return (expression vec2 lrp (var_ref arg0) (var_ref arg1) (var_ref arg2)))))
  
 (signature vec3
   (parameters
 (declare (in) vec3 arg0)
 (declare (in) vec3 arg1)
 (declare (in) vec3 arg2))
 - ((return (expression vec3 + (expression vec3 * (var_ref arg0) (expression vec3 - (constant float (1.00)) (var_ref arg2))) (expression vec3 * (var_ref arg1) (var_ref arg2))))))
 + ((return (expression vec3 lrp (var_ref arg0) (var_ref arg1) (var_ref arg2)))))
  
 (signature vec4
   (parameters
 (declare (in) vec4 arg0)
 (declare (in) vec4 arg1)
 (declare (in) vec4 arg2))
 - ((return (expression vec4 + (expression vec4 * (var_ref arg0) (expression vec4 - (constant float (1.00)) (var_ref arg2))) (expression vec4 * (var_ref arg1) (var_ref arg2))))))
 + ((return (expression vec4 lrp (var_ref arg0) (var_ref arg1) (var_ref arg2)))))
  
 (signature vec2
   (parameters
 (declare (in) vec2 arg0)
 (declare (in) vec2 arg1)
 (declare (in) float arg2))
 - ((return (expression vec2 + (expression vec2 * (var_ref arg0) (expression float - (constant float (1.00)) (var_ref arg2))) (expression vec2 * (var_ref arg1) (var_ref arg2))))))
 + ((return (expression vec2 lrp (var_ref arg0) (var_ref arg1) (var_ref arg2)))))
  
 (signature vec3
   (parameters
 (declare (in) vec3 arg0)
 (declare (in) vec3 arg1)
 (declare (in) float arg2))
 - ((return (expression vec3 + (expression vec3 * (var_ref arg0) (expression float - (constant float (1.00)) (var_ref arg2))) (expression vec3 * (var_ref arg1) (var_ref arg2))))))
 + ((return (expression vec3 lrp (var_ref arg0) (var_ref arg1) (var_ref arg2)))))
  
 (signature vec4
   (parameters
 (declare (in) vec4 arg0)
 (declare (in) vec4 arg1)
 (declare (in) float arg2))
 - ((return (expression vec4 + (expression vec4 * (var_ref arg0) (expression float - (constant float (1.00)) (var_ref arg2))) (expression vec4 * (var_ref arg1) (var_ref arg2))))))
 + ((return (expression vec4 lrp (var_ref