Jason Clamping to +- 10 sounds good to me. I have sent a patch to the mesa-dev list.
Please port spirv_to_nir in a separate patch if you could, I'm afraid I'm not really familiar with that part and don't know how to test it. On Thu, Dec 8, 2016 at 4:54 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote: > On Thu, Dec 8, 2016 at 3:31 PM, Haixia Shi <h...@chromium.org> wrote: > >> We're encountering failures in the latest version of dEQP (specifically, >> dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.highp_*) >> on all Intel devices. >> >> The problem is that when the abs value of input is too large (say 80), >> the function should return +/-1, but the actual shader output is 0. >> >> The following patch in glsl compiler fixes the problem, but I'm not >> really sure if a fix in the compiler frontend is the right place. On the >> other hand, once the intermediate code is emitted, it's not really feasible >> for drivers to handle precision problems in backend. Any ideas? >> > > I think this is the correct way to handle this. It isn't really a > back-end precision problem. The problem is that, once x gets large enough, > exp(x) tends towards infinity and you may end up with inf/inf. > > >> Here's the patch >> >> diff --git a/src/compiler/glsl/builtin_functions.cpp >> b/src/compiler/glsl/builtin_functions.cpp >> index 3e4bcbb..6cfeb1b 100644 >> --- a/src/compiler/glsl/builtin_functions.cpp >> +++ b/src/compiler/glsl/builtin_functions.cpp >> @@ -3563,9 +3563,13 @@ builtin_builder::_tanh(const glsl_type *type) >> ir_variable *x = in_var(type, "x"); >> MAKE_SIG(type, v130, 1, x); >> >> + /* clamp x to [-20, +20] to avoid numeric problems */ >> + ir_variable *t = body.make_temp(type, "tmp"); >> + body.emit(assign(t, min2(max2(x, imm(-20.0f)), imm(20.0f)))); >> > > I think I would feel more comfortable if we clamped to +- 10 instead. I'm > a bit more comfortable with the stability of the calculation at lower > values. Also, when x > 10, e^(-x) is so small relative to e^x that it > won't actually count in the floating-point calculation. In particular > e^(-10)/e^(10) = e^(-20) = 2^-27.182 and you only have 23 mantissa bits. > > Also, would you mind fixing spirv_to_nir while you're at it? It should be > in vtn_glsl450.c. I can port it myself if you'd prefer. > > >> + >> /* (e^x - e^(-x)) / (e^x + e^(-x)) */ >> - body.emit(ret(div(sub(exp(x), exp(neg(x))), >> - add(exp(x), exp(neg(x)))))); >> + body.emit(ret(div(sub(exp(t), exp(neg(t))), >> + add(exp(t), exp(neg(t)))))); >> >> return sig; >> } >> >> >> >> >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev