Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-20 Thread Roland Scheidegger
On 20.12.2016 at 22:12, Giuseppe Bilotta wrote:
> On Tue, Dec 20, 2016 at 2:17 AM, Matt Turner  wrote:
>> On Mon, Dec 19, 2016 at 5:12 PM, Giuseppe Bilotta
>>  wrote:
>>> Just one question though —not knowing much of the shader language, can
>>> I expect expm1 to be available?
>>
>> No, expm1 doesn't exist in GLSL.
> 
> This is extremely bothersome. Both the (exp(2x)-1)/(exp(2x)+1) and the
> 1-2/(exp(2x)+1) formulas give pretty good results when written
> in terms of expm1.
> 
> On Tue, Dec 20, 2016 at 3:48 AM, Roland Scheidegger  
> wrote:
>> Not sure it really matters though one way or another. If you wanted good
>> accuracy around 0, you'd have to use a different formula plus a select
>> (seems like libm implementations actually use 3 cases depending on input
>> value magnitude - not so hot with vectors, but thankfully glsl doesn't
>> require 1 ULP accuracy).
> 
> Brute-forcing over all floating points on CPU by switching between the
> two formulas above at appropriate thresholds gives a maximum relative
> error of the order of machine epsilon when using expm1, and the switch
> between the two formulas can be implemented with a select on two
> terms. However, this does require expm1.
> 
> Nelson Beebe has a very detailed description of how to achieve very
> accurate results for tanh here
> https://www.math.utah.edu/~beebe/software/ieee/tanh.pdf and the
> results are a bit depressing, in that multiple thresholds are
> necessary. I'm not sure if these are the same used by libm, but in any
> case neither lends itself well to vectorization (in contrast to the
> switch between the two formulas above).
> 
> An alternative approach could be to actually provide a software
> implementation of expm1 and use it to compute tanh. I wouldn't be
> surprised if this would turn out to not be slower than using exp
> itself, in fact.
> 

I'd venture a guess that you cannot beat the exp of the GPUs (exp2
actually, but it doesn't matter). Those are built to be fast (and not
necessarily 100% exact). OK, maybe on some Intel chips, which use the
famous mathbox, you could be competitive...
Now for something like llvmpipe, you could be right. I have no idea
whether exp or expm1 is more difficult to evaluate. But no one is going
to bother for that case, for an opcode where we don't even have any
evidence it's actually used anywhere (outside conformance tests). Well,
it probably is somewhere, but it's probably rare enough that it's not
exactly an interesting target for optimization.
So, I guess unless more accuracy around 0 is really needed, there's
really not much point investing time in this.

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-20 Thread Giuseppe Bilotta
On Tue, Dec 20, 2016 at 2:17 AM, Matt Turner  wrote:
> On Mon, Dec 19, 2016 at 5:12 PM, Giuseppe Bilotta
>  wrote:
>> Just one question though —not knowing much of the shader language, can
>> I expect expm1 to be available?
>
> No, expm1 doesn't exist in GLSL.

This is extremely bothersome. Both the (exp(2x)-1)/(exp(2x)+1) and the
1-2/(exp(2x)+1) formulas give pretty good results when written
in terms of expm1.

On Tue, Dec 20, 2016 at 3:48 AM, Roland Scheidegger  wrote:
> Not sure it really matters though one way or another. If you wanted good
> accuracy around 0, you'd have to use a different formula plus a select
> (seems like libm implementations actually use 3 cases depending on input
> value magnitude - not so hot with vectors, but thankfully glsl doesn't
> require 1 ULP accuracy).

Brute-forcing over all floating-point values on the CPU, switching
between the two formulas above at appropriate thresholds, gives a maximum
relative error of the order of machine epsilon when using expm1, and the
switch between the two formulas can be implemented with a select on two
terms. However, this does require expm1.
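
For readers following along, the select between the two expm1-based
formulas could look roughly like the CPU-side sketch below. The function
name tanh_expm1 and the switch point kSwitch are hypothetical; the
thresholds actually used in the brute-force run are not given in this
message, so 0.5 is only a placeholder.

```cpp
#include <cmath>

// Hypothetical switch point between the two formulas; the thread does not
// state the thresholds that were actually used.
constexpr float kSwitch = 0.5f;

// tanh(x) built from expm1: with e = exp(2x) - 1, exp(2x) + 1 equals e + 2.
float tanh_expm1(float x)
{
    const float e = std::expm1(2.0f * x);
    const float ratio = e / (e + 2.0f);               // (exp(2x)-1)/(exp(2x)+1)
    const float one_minus = 1.0f - 2.0f / (e + 2.0f); // 1 - 2/(exp(2x)+1)
    // Both terms are always computed and one is selected, which is the
    // shape a vector select would take.
    return (x < kSwitch) ? ratio : one_minus;
}
```

For large positive x the ratio term becomes inf/inf, but the select
discards it there, so no clamp is needed in this sketch.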

Nelson Beebe has a very detailed description of how to achieve very
accurate results for tanh here
https://www.math.utah.edu/~beebe/software/ieee/tanh.pdf and the
results are a bit depressing, in that multiple thresholds are
necessary. I'm not sure if these are the same used by libm, but in any
case neither lends itself well to vectorization (in contrast to the
switch between the two formulas above).

An alternative approach could be to actually provide a software
implementation of expm1 and use it to compute tanh. I wouldn't be
surprised if this would turn out to not be slower than using exp
itself, in fact.

-- 
Giuseppe "Oblomov" Bilotta


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-19 Thread Roland Scheidegger
On 20.12.2016 at 00:12, Giuseppe Bilotta wrote:
> Hello,
> 
> I realize that I'm a little late to comment about this, but I think
> the formula used for
> tanh should be changed again. Specifically, as suggested by Roland
> 
> On Fri, Dec 9, 2016 at 5:41 AM, Roland Scheidegger  wrote:
>> btw I'm wondering if some vendors wouldn't implement that with slightly
>> simplified formula, e.g. (e^2x - 1) / (e^2x + 1) (this is what nvidia
>> used for cg apparently according to docs, saving one of the
>> exponentials). Might be worse for accuracy though (and won't solve this
>> problem, though it would now only need a one-sided clamp).

It was changed to this formula.


> 
> Another option is 1 - 2/(1+expf(2x)), or even better 1 -
> 2/(2+expm1f(2x)). I've run some tests and this seems to have the same
> accuracy as the one mentioned by Roland, with the bonus benefit of not
> needing any clamping. The accuracy actually seems to be better than the
> direct evaluation (difference over sum of exps), except around 0 (say,
> when abs(x) < 1).

The 1 - 2/(1+expf(2x)) is worse for numbers close to zero (probably
provably so, I think you might have one bit more to play with there with
the other formula due to the division by essentially 2). e.g. if you
have 8e-8, libm tanhf() gives me 8e-8 as a result (it looks like it's
actually hard-coded to return the input as result for sufficiently small
values), the (e^2x - 1) / (e^2x + 1) formula gives 5.960464e-08 whereas
1 - 2/(1+expf(2x)) will give you back 0.0f (but, with even smaller
values like 2e-8, both methods will return 0.0f which is pretty wrong in
any case, the relative error can get to enormous levels there).
I'm not sure which method is better for larger values, I think they
might be about the same. Nvidia docs stating they use the slightly more
complex formula for Cg may be a hint, though, that it indeed has some
nice-to-have properties. Arguably it's not that much more complex anyway,
since the only part the other formula saves is the one-sided clamp - the
most expensive parts are the exp and the div, neither of which you can
get rid of.
Not sure it really matters though one way or another. If you wanted good
accuracy around 0, you'd have to use a different formula plus a select
(seems like libm implementations actually use 3 cases depending on input
value magnitude - not so hot with vectors, but thankfully glsl doesn't
require 1 ULP accuracy).
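
The cancellation near zero can be reproduced on the CPU with a small
sketch like the one below. The helper names (tanh_ratio, tanh_one_minus,
rel_err) are made up for illustration, and the exact values printed will
depend on the expf implementation; with an accurate single-precision exp,
the inputs 8e-8 and 2e-8 should behave as described above.

```cpp
#include <cmath>

// (e^2x - 1) / (e^2x + 1), evaluated entirely in single precision.
float tanh_ratio(float x)
{
    const float e = std::exp(2.0f * x);
    return (e - 1.0f) / (e + 1.0f);
}

// 1 - 2 / (1 + e^2x), evaluated entirely in single precision.
float tanh_one_minus(float x)
{
    const float e = std::exp(2.0f * x);
    return 1.0f - 2.0f / (1.0f + e);
}

// Relative error against the double-precision libm tanh.
double rel_err(float approx, float x)
{
    const double exact = std::tanh(static_cast<double>(x));
    return std::fabs((approx - exact) / exact);
}
```

Calling rel_err(tanh_ratio(8e-8f), 8e-8f) and the tanh_one_minus
equivalent shows how large the relative error gets once exp(2x) rounds
to a value within an ulp or two of 1.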

Roland



> 
> I've found the relative error away from 0 to be typically of the same
> order of magnitude as the error in tanhf() itself (compared to tanh()),
> and generally less than machine epsilon. I'm currently looking at
> options to improve the accuracy without clamping and without excessive
> additional computations, and might propose a patch in the next couple of
> days.
> 
> Just one question though —not knowing much of the shader language, can
> I expect expm1 to be available?
> 



Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-19 Thread Ilia Mirkin
On Mon, Dec 19, 2016 at 8:17 PM, Matt Turner  wrote:
> On Mon, Dec 19, 2016 at 5:12 PM, Giuseppe Bilotta
>  wrote:
>> Just one question though —not knowing much of the shader language, can
>> I expect expm1 to be available?
>
> No, expm1 doesn't exist in GLSL.

And - more importantly - not in the hardware either. At least not on
NVIDIA, and on a brief scan, not Intel either (SKL "math"
instruction).

  -ilia


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-19 Thread Matt Turner
On Mon, Dec 19, 2016 at 5:12 PM, Giuseppe Bilotta
 wrote:
> Just one question though —not knowing much of the shader language, can
> I expect expm1 to be available?

No, expm1 doesn't exist in GLSL.


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-19 Thread Giuseppe Bilotta
Hello,

I realize that I'm a little late to comment on this, but I think the
formula used for tanh should be changed again. Specifically, as
suggested by Roland:

On Fri, Dec 9, 2016 at 5:41 AM, Roland Scheidegger  wrote:
> btw I'm wondering if some vendors wouldn't implement that with slightly
> simplified formula, e.g. (e^2x - 1) / (e^2x + 1) (this is what nvidia
> used for cg apparently according to docs, saving one of the
> exponentials). Might be worse for accuracy though (and won't solve this
> problem, though it would now only need a one-sided clamp).

Another option is 1 - 2/(1+expf(2x)), or even better 1 -
2/(2+expm1f(2x)). I've run some tests and this seems to have the same
accuracy as the one mentioned by Roland, with the bonus benefit of not
needing any clamping. The accuracy actually seems to be better than the
direct evaluation (difference over sum of exps), except around 0 (say,
when abs(x) < 1).
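
A minimal CPU-side sketch of why this form needs no clamp, assuming
IEEE-754 overflow and underflow behaviour for exp (a GPU's exp may not
behave identically). The function name tanh_no_clamp is only
illustrative.

```cpp
#include <cmath>

// 1 - 2 / (1 + e^2x), with no clamp on the input.
float tanh_no_clamp(float x)
{
    // For large x, exp overflows to +inf and 2/inf is 0, giving 1.
    // For very negative x, exp underflows to 0, giving 1 - 2 = -1.
    return 1.0f - 2.0f / (1.0f + std::exp(2.0f * x));
}
```

Neither extreme produces an inf/inf or 0/0 quotient, so the result
saturates at +1 or -1 instead of becoming NaN.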

I've found the relative error away from 0 to be typically of the same
order of magnitude as the error in tanhf() itself (compared to tanh()),
and generally less than machine epsilon. I'm currently looking at
options to improve the accuracy without clamping and without excessive
additional computations, and might propose a patch in the next couple of
days.

Just one question, though: not knowing much of the shader language, can
I expect expm1 to be available?

-- 
Giuseppe "Oblomov" Bilotta


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-09 Thread Jason Ekstrand
On Thu, Dec 8, 2016 at 5:50 PM, Kenneth Graunke 
wrote:

> On Thursday, December 8, 2016 5:41:02 PM PST Haixia Shi wrote:
> > Clamp input scalar value to range [-10, +10] to avoid precision problems
> > when the absolute value of input is too large.
> >
> > Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test
> > failures.
> >
> > v2: added more explanation in the comment.
> > v3: fixed a typo in the comment.
> >
> > Signed-off-by: Haixia Shi 
> > Cc: Jason Ekstrand ,
> > Cc: Stéphane Marchesin ,
> > Cc: Kenneth Graunke 
> >
> > Change-Id: I324c948b3323ff8107127c42934f14459e124b95
> > ---
> >  src/compiler/glsl/builtin_functions.cpp | 13 +++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
> > index 3e4bcbb..0bacffb 100644
> > --- a/src/compiler/glsl/builtin_functions.cpp
> > +++ b/src/compiler/glsl/builtin_functions.cpp
> > @@ -3563,9 +3563,18 @@ builtin_builder::_tanh(const glsl_type *type)
> >     ir_variable *x = in_var(type, "x");
> >     MAKE_SIG(type, v130, 1, x);
> >
> > +   /*
>
> For future reference, /* doesn't go on its own line in Mesa.
> (We can fix that when pushing, no big deal.)
>
> Thanks for fixing this.  The explanation makes sense.
>
> Reviewed-by: Kenneth Graunke 
>

I just pushed this with my and Ken's reviews and the comment change Ken
suggested.


> > +    * Clamp x to [-10, +10] to avoid precision problems.
> > +    * When x > 10, e^(-x) is so small relative to e^x that it gets flushed to
> > +    * zero in the computation e^x + e^(-x). The same happens in the other
> > +    * direction when x < -10.
> > +    */
> > +   ir_variable *t = body.make_temp(type, "tmp");
> > +   body.emit(assign(t, min2(max2(x, imm(-10.0f)), imm(10.0f))));
> > +
> >     /* (e^x - e^(-x)) / (e^x + e^(-x)) */
> > -   body.emit(ret(div(sub(exp(x), exp(neg(x))),
> > -                     add(exp(x), exp(neg(x))))));
> > +   body.emit(ret(div(sub(exp(t), exp(neg(t))),
> > +                     add(exp(t), exp(neg(t))))));
> >
> >     return sig;
> >  }
> >
>
>


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-08 Thread Kenneth Graunke
On Thursday, December 8, 2016 9:17:04 PM PST Jason Ekstrand wrote:
> On Thu, Dec 8, 2016 at 8:41 PM, Roland Scheidegger 
> wrote:
> 
> > I'm wondering, isn't that actually a problem of the test, that is it
> > can't actually expect reasonable results with such input values?
> > Since within the shader languages those functions which are composed of
> > multiple other functions are usually allowed to basically accumulate all
> > the errors of said functions. Though I agree that results outside [-1,1]
> > would be odd...
> >
> 
> No, not really.  tanh() is well defined on the entire real line and always
> stays inside the interval (-1, 1).  The problem is just that floating-point
> arithmetic explodes once x gets large enough.  However, long before that
> point, it's flattened out to +- 1 (Not mathematically, but as far as
> floating-point precision is concerned).
> 
> 
> > btw I'm wondering if some vendors wouldn't implement that with slightly
> > simplified formula, e.g. (e^2x - 1) / (e^2x + 1) (this is what nvidia
> > used for cg apparently according to docs, saving one of the
> > exponentials). Might be worse for accuracy though (and won't solve this
> > problem, though it would now only need a one-sided clamp).
> >
> 
> I would be interested to know if that formula would pass the dEQP precision
> tests...  It would be simpler.

It does, if you also include Haixia's clamping code.




Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-08 Thread Jason Ekstrand
On Thu, Dec 8, 2016 at 8:41 PM, Roland Scheidegger 
wrote:

> I'm wondering, isn't that actually a problem of the test, that is it
> can't actually expect reasonable results with such input values?
> Since within the shader languages those functions which are composed of
> multiple other functions are usually allowed to basically accumulate all
> the errors of said functions. Though I agree that results outside [-1,1]
> would be odd...
>

No, not really.  tanh() is well defined on the entire real line and always
stays inside the interval (-1, 1).  The problem is just that floating-point
arithmetic explodes once x gets large enough.  However, long before that
point, it's flattened out to +-1 (not mathematically, but as far as
floating-point precision is concerned).
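
Both effects can be seen in a small single-precision sketch on the CPU,
assuming IEEE-754 float semantics. The function names tanh_direct and
tanh_clamped are hypothetical; the [-10, +10] bounds are the ones from
the patch.

```cpp
#include <algorithm>
#include <cmath>

// Direct evaluation: (e^x - e^-x) / (e^x + e^-x).
float tanh_direct(float x)
{
    const float p = std::exp(x);
    const float n = std::exp(-x);
    // Once exp(x) overflows to +inf this becomes inf / inf, i.e. NaN.
    return (p - n) / (p + n);
}

// Same formula with the input clamped to [-10, +10], as in the patch.
float tanh_clamped(float x)
{
    const float t = std::min(std::max(x, -10.0f), 10.0f);
    return tanh_direct(t);
}
```

At |x| = 10 the smaller exponential is already below half an ulp of the
larger one, so the clamped quotient comes out as exactly +1 or -1 in
single precision.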


> btw I'm wondering if some vendors wouldn't implement that with slightly
> simplified formula, e.g. (e^2x - 1) / (e^2x + 1) (this is what nvidia
> used for cg apparently according to docs, saving one of the
> exponentials). Might be worse for accuracy though (and won't solve this
> problem, though it would now only need a one-sided clamp).
>

I would be interested to know if that formula would pass the dEQP precision
tests...  It would be simpler.


> Roland
>
> On 09.12.2016 at 02:41, Haixia Shi wrote:
> > Clamp input scalar value to range [-10, +10] to avoid precision problems
> > when the absolute value of input is too large.
> >
> > Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test
> > failures.
> >
> > v2: added more explanation in the comment.
> > v3: fixed a typo in the comment.
> >
> > Signed-off-by: Haixia Shi 
> > Cc: Jason Ekstrand ,
> > Cc: Stéphane Marchesin ,
> > Cc: Kenneth Graunke 
> >
> > Change-Id: I324c948b3323ff8107127c42934f14459e124b95
> > ---
> >  src/compiler/glsl/builtin_functions.cpp | 13 +++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
> > index 3e4bcbb..0bacffb 100644
> > --- a/src/compiler/glsl/builtin_functions.cpp
> > +++ b/src/compiler/glsl/builtin_functions.cpp
> > @@ -3563,9 +3563,18 @@ builtin_builder::_tanh(const glsl_type *type)
> >     ir_variable *x = in_var(type, "x");
> >     MAKE_SIG(type, v130, 1, x);
> >
> > +   /*
> > +    * Clamp x to [-10, +10] to avoid precision problems.
> > +    * When x > 10, e^(-x) is so small relative to e^x that it gets flushed to
> > +    * zero in the computation e^x + e^(-x). The same happens in the other
> > +    * direction when x < -10.
> > +    */
> > +   ir_variable *t = body.make_temp(type, "tmp");
> > +   body.emit(assign(t, min2(max2(x, imm(-10.0f)), imm(10.0f))));
> > +
> >     /* (e^x - e^(-x)) / (e^x + e^(-x)) */
> > -   body.emit(ret(div(sub(exp(x), exp(neg(x))),
> > -                     add(exp(x), exp(neg(x))))));
> > +   body.emit(ret(div(sub(exp(t), exp(neg(t))),
> > +                     add(exp(t), exp(neg(t))))));
> >
> >     return sig;
> >  }
> >
>


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-08 Thread Roland Scheidegger
I'm wondering, isn't that actually a problem of the test, that is it
can't actually expect reasonable results with such input values?
Since within the shader languages those functions which are composed of
multiple other functions are usually allowed to basically accumulate all
the errors of said functions. Though I agree that results outside [-1,1]
would be odd...

btw I'm wondering if some vendors wouldn't implement that with slightly
simplified formula, e.g. (e^2x - 1) / (e^2x + 1) (this is what nvidia
used for cg apparently according to docs, saving one of the
exponentials). Might be worse for accuracy though (and won't solve this
problem, though it would now only need a one-sided clamp).
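
A rough CPU-side sketch of the single-exponential form with a one-sided
clamp, assuming IEEE-754 float semantics. The name tanh_exp2x is
hypothetical, and the upper bound of 10 is simply borrowed from the
patch rather than anything stated for this formula.

```cpp
#include <algorithm>
#include <cmath>

// (e^2x - 1) / (e^2x + 1) with only an upper clamp on x.
float tanh_exp2x(float x)
{
    // The bound of 10 is borrowed from the patch; it is an assumption here.
    const float t = std::min(x, 10.0f);
    const float e = std::exp(2.0f * t);
    // Very negative x needs no clamp: e underflows to 0 and the result is -1.
    return (e - 1.0f) / (e + 1.0f);
}
```

Without the upper clamp, exp(2x) would overflow to +inf for large x and
the quotient would turn into inf / inf.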

Roland

On 09.12.2016 at 02:41, Haixia Shi wrote:
> Clamp input scalar value to range [-10, +10] to avoid precision problems
> when the absolute value of input is too large.
> 
> Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test
> failures.
> 
> v2: added more explanation in the comment.
> v3: fixed a typo in the comment.
> 
> Signed-off-by: Haixia Shi 
> Cc: Jason Ekstrand ,
> Cc: Stéphane Marchesin ,
> Cc: Kenneth Graunke 
> 
> Change-Id: I324c948b3323ff8107127c42934f14459e124b95
> ---
>  src/compiler/glsl/builtin_functions.cpp | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
> index 3e4bcbb..0bacffb 100644
> --- a/src/compiler/glsl/builtin_functions.cpp
> +++ b/src/compiler/glsl/builtin_functions.cpp
> @@ -3563,9 +3563,18 @@ builtin_builder::_tanh(const glsl_type *type)
>     ir_variable *x = in_var(type, "x");
>     MAKE_SIG(type, v130, 1, x);
>  
> +   /*
> +    * Clamp x to [-10, +10] to avoid precision problems.
> +    * When x > 10, e^(-x) is so small relative to e^x that it gets flushed to
> +    * zero in the computation e^x + e^(-x). The same happens in the other
> +    * direction when x < -10.
> +    */
> +   ir_variable *t = body.make_temp(type, "tmp");
> +   body.emit(assign(t, min2(max2(x, imm(-10.0f)), imm(10.0f))));
> +
>     /* (e^x - e^(-x)) / (e^x + e^(-x)) */
> -   body.emit(ret(div(sub(exp(x), exp(neg(x))),
> -                     add(exp(x), exp(neg(x))))));
> +   body.emit(ret(div(sub(exp(t), exp(neg(t))),
> +                     add(exp(t), exp(neg(t))))));
>  
>     return sig;
>  }
> 



Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-08 Thread Kenneth Graunke
On Thursday, December 8, 2016 5:41:02 PM PST Haixia Shi wrote:
> Clamp input scalar value to range [-10, +10] to avoid precision problems
> when the absolute value of input is too large.
> 
> Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test
> failures.
> 
> v2: added more explanation in the comment.
> v3: fixed a typo in the comment.
> 
> Signed-off-by: Haixia Shi 
> Cc: Jason Ekstrand ,
> Cc: Stéphane Marchesin ,
> Cc: Kenneth Graunke 
> 
> Change-Id: I324c948b3323ff8107127c42934f14459e124b95
> ---
>  src/compiler/glsl/builtin_functions.cpp | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
> index 3e4bcbb..0bacffb 100644
> --- a/src/compiler/glsl/builtin_functions.cpp
> +++ b/src/compiler/glsl/builtin_functions.cpp
> @@ -3563,9 +3563,18 @@ builtin_builder::_tanh(const glsl_type *type)
>     ir_variable *x = in_var(type, "x");
>     MAKE_SIG(type, v130, 1, x);
>  
> +   /*

For future reference, /* doesn't go on its own line in Mesa.
(We can fix that when pushing, no big deal.)

Thanks for fixing this.  The explanation makes sense.

Reviewed-by: Kenneth Graunke 

> +    * Clamp x to [-10, +10] to avoid precision problems.
> +    * When x > 10, e^(-x) is so small relative to e^x that it gets flushed to
> +    * zero in the computation e^x + e^(-x). The same happens in the other
> +    * direction when x < -10.
> +    */
> +   ir_variable *t = body.make_temp(type, "tmp");
> +   body.emit(assign(t, min2(max2(x, imm(-10.0f)), imm(10.0f))));
> +
>     /* (e^x - e^(-x)) / (e^x + e^(-x)) */
> -   body.emit(ret(div(sub(exp(x), exp(neg(x))),
> -                     add(exp(x), exp(neg(x))))));
> +   body.emit(ret(div(sub(exp(t), exp(neg(t))),
> +                     add(exp(t), exp(neg(t))))));
>  
>     return sig;
>  }
> 





[Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-08 Thread Haixia Shi
Clamp input scalar value to range [-10, +10] to avoid precision problems
when the absolute value of input is too large.

Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test
failures.

v2: added more explanation in the comment.
v3: fixed a typo in the comment.

Signed-off-by: Haixia Shi 
Cc: Jason Ekstrand ,
Cc: Stéphane Marchesin ,
Cc: Kenneth Graunke 

Change-Id: I324c948b3323ff8107127c42934f14459e124b95
---
 src/compiler/glsl/builtin_functions.cpp | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
index 3e4bcbb..0bacffb 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3563,9 +3563,18 @@ builtin_builder::_tanh(const glsl_type *type)
    ir_variable *x = in_var(type, "x");
    MAKE_SIG(type, v130, 1, x);
 
+   /*
+    * Clamp x to [-10, +10] to avoid precision problems.
+    * When x > 10, e^(-x) is so small relative to e^x that it gets flushed to
+    * zero in the computation e^x + e^(-x). The same happens in the other
+    * direction when x < -10.
+    */
+   ir_variable *t = body.make_temp(type, "tmp");
+   body.emit(assign(t, min2(max2(x, imm(-10.0f)), imm(10.0f))));
+
    /* (e^x - e^(-x)) / (e^x + e^(-x)) */
-   body.emit(ret(div(sub(exp(x), exp(neg(x))),
-                     add(exp(x), exp(neg(x))))));
+   body.emit(ret(div(sub(exp(t), exp(neg(t))),
+                     add(exp(t), exp(neg(t))))));
 
    return sig;
 }
-- 
2.8.0.rc3.226.g39d4020
