Re: [Mesa-dev] [PATCH v2 11/53] compiler/spirv: use 32-bit polynomial approximation for 16-bit asin()

2018-12-19 Thread Jason Ekstrand
I've sent a nit or two but the first 11 patches are

Reviewed-by: Jason Ekstrand 

I'd be very happy for you to push them ahead of the rest of the series so
they don't end up in a v3 unless someone else requests significant changes.

On Wed, Dec 19, 2018 at 5:51 AM Iago Toral Quiroga 
wrote:

> The 16-bit polynomial execution doesn't meet Khronos precision
> requirements.
> Also, the half-float denorm range starts at 2^(-14) and with asin taking
> input
> values in the range [0, 1], polynomial approximations can lead to flushing
> relatively easy.
>
> An alternative is to use the atan2 formula to compute asin, which is the
> reference taken by Khronos to determine precision requirements, but that
> ends up generating too many additional instructions when compared to the
> polynomial approximation. Specifically, for the Intel case, doing this
> adds +41 instructions to the program for each asin/acos call, which looks
> like an undesirable trade off.
>
> So for now we take the easy way out and fallback to using the 32-bit
> polynomial approximation, which is better (faster) than the 16-bit atan2
> implementation and gives us better precision that matches Khronos
> requirements.
>
> v2:
>  - Fallback to 32-bit using recursion (Jason).
> ---
>  src/compiler/spirv/vtn_glsl450.c | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/src/compiler/spirv/vtn_glsl450.c
> b/src/compiler/spirv/vtn_glsl450.c
> index 2540331b6cc..0a641077513 100644
> --- a/src/compiler/spirv/vtn_glsl450.c
> +++ b/src/compiler/spirv/vtn_glsl450.c
> @@ -202,6 +202,20 @@ build_log(nir_builder *b, nir_ssa_def *x)
>  static nir_ssa_def *
>  build_asin(nir_builder *b, nir_ssa_def *x, float p0, float p1)
>  {
> +   if (x->bit_size == 16) {
> +  /* The polynomial approximation isn't precise enough to meet
> half-float
> +   * precision requirements. Alternatively, we could implement this
> using
> +   * the formula:
> +   *
> +   * asin(x) = atan2(x, sqrt(1 - x*x))
> +   *
> +   * But that is very expensive, so instead we just do the polynomial
> +   * approximation in 32-bit math and then we convert the result back
> to
> +   * 16-bit.
> +   */
> +  return nir_f2f16(b, build_asin(b, nir_f2f32(b, x), p0, p1));
> +   }
> +
> nir_ssa_def *one = nir_imm_floatN_t(b, 1.0f, x->bit_size);
> nir_ssa_def *abs_x = nir_fabs(b, x);
>
> --
> 2.17.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 11/53] compiler/spirv: use 32-bit polynomial approximation for 16-bit asin()

2018-12-19 Thread Iago Toral Quiroga
The 16-bit polynomial execution doesn't meet Khronos precision requirements.
Also, the half-float denorm range starts at 2^(-14) and with asin taking input
values in the range [0, 1], polynomial approximations can lead to flushing
relatively easy.

An alternative is to use the atan2 formula to compute asin, which is the
reference taken by Khronos to determine precision requirements, but that
ends up generating too many additional instructions when compared to the
polynomial approximation. Specifically, for the Intel case, doing this
adds +41 instructions to the program for each asin/acos call, which looks
like an undesirable trade off.

So for now we take the easy way out and fallback to using the 32-bit
polynomial approximation, which is better (faster) than the 16-bit atan2
implementation and gives us better precision that matches Khronos
requirements.

v2:
 - Fallback to 32-bit using recursion (Jason).
---
 src/compiler/spirv/vtn_glsl450.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 2540331b6cc..0a641077513 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -202,6 +202,20 @@ build_log(nir_builder *b, nir_ssa_def *x)
 static nir_ssa_def *
 build_asin(nir_builder *b, nir_ssa_def *x, float p0, float p1)
 {
+   if (x->bit_size == 16) {
+  /* The polynomial approximation isn't precise enough to meet half-float
+   * precision requirements. Alternatively, we could implement this using
+   * the formula:
+   *
+   * asin(x) = atan2(x, sqrt(1 - x*x))
+   *
+   * But that is very expensive, so instead we just do the polynomial
+   * approximation in 32-bit math and then we convert the result back to
+   * 16-bit.
+   */
+  return nir_f2f16(b, build_asin(b, nir_f2f32(b, x), p0, p1));
+   }
+
nir_ssa_def *one = nir_imm_floatN_t(b, 1.0f, x->bit_size);
nir_ssa_def *abs_x = nir_fabs(b, x);
 
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev