Re: [Mesa-dev] [PATCH] i965: Work around SIN/COS output range problem.
On 22/03/16 23:31, Ian Romanick wrote: On 03/17/2016 09:18 AM, Martin Peres wrote: On 16/03/16 19:33, Kenneth Graunke wrote: The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.27 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.7, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.6 to about 0.3. I chose to apply this only when not saturating, as saturate already clamps to 1.0. This may or may not be a good idea. Fixes 16 dEQP precision tests dEQP-GLES31.functional.shaders.builtin_functions.precision. {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. at the cost of making every sin and cos call more expensive. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 26 -- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 26 -- 2 files changed, 40 insertions(+), 12 deletions(-) This has been in the Vulkan tree for a while - we needed it to pass the Vulkan CTS, as it contains these same dEQP tests. I haven't run shader-db yet, but I don't expect we'll like the results. The patch is pretty sketchy, too. I'm sort of tempted to hide it behind an INTEL_STRICT_CONFORMANCE=1 option, like we had way back in the day... FYI, a quick run on hsw_gt2 shows -10.45% on Gputest:voplosion. Can you explain this result? -10.45% of what? Instructions? FPS? And this is comparing what to what? Before this patch to after? So sorry Ian, I completely missed your email! Yes, it was just applying this patch and I was talking about FPS. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Work around SIN/COS output range problem.
On Tuesday, March 22, 2016 2:31:42 PM PDT Ian Romanick wrote: > On 03/17/2016 09:18 AM, Martin Peres wrote: > > On 16/03/16 19:33, Kenneth Graunke wrote: > >> The SIN and COS instructions on Intel hardware can produce values > >> slightly outside of the [-1.0, 1.0] range for a small set of values. > >> Obviously, this can break everyone's expectations about trig functions. > >> > >> According to an internal presentation, the COS instruction can produce > >> a value up to 1.27 for inputs in the range (0.08296, 0.09888). One > >> suggested workaround is to multiply by 0.7, scaling down the > >> amplitude slightly. Apparently this also minimizes the error function, > >> reducing the maximum error from 0.6 to about 0.3. > >> > >> I chose to apply this only when not saturating, as saturate already > >> clamps to 1.0. This may or may not be a good idea. > >> > >> Fixes 16 dEQP precision tests > >> > >> dEQP-GLES31.functional.shaders.builtin_functions.precision. > >> {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. > >> > >> at the cost of making every sin and cos call more expensive. > >> > >> Signed-off-by: Kenneth Graunke > >> --- > >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 26 > >> -- > >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 26 > >> -- > >> 2 files changed, 40 insertions(+), 12 deletions(-) > >> > >> This has been in the Vulkan tree for a while - we needed it to pass the > >> Vulkan CTS, as it contains these same dEQP tests. > >> > >> I haven't run shader-db yet, but I don't expect we'll like the results. > >> > >> The patch is pretty sketchy, too. I'm sort of tempted to hide it behind > >> an INTEL_STRICT_CONFORMANCE=1 option, like we had way back in the day... > > > > FYI, a quick run on hsw_gt2 shows -10.45% on Gputest:voplosion. > > Can you explain this result? -10.45% of what? Instructions? FPS? And > this is comparing what to what? Before this patch to after? I think FPS goes down by 10.45% when applying this patch. Pretty dire. --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Work around SIN/COS output range problem.
On 03/17/2016 09:18 AM, Martin Peres wrote: > On 16/03/16 19:33, Kenneth Graunke wrote: >> The SIN and COS instructions on Intel hardware can produce values >> slightly outside of the [-1.0, 1.0] range for a small set of values. >> Obviously, this can break everyone's expectations about trig functions. >> >> According to an internal presentation, the COS instruction can produce >> a value up to 1.27 for inputs in the range (0.08296, 0.09888). One >> suggested workaround is to multiply by 0.7, scaling down the >> amplitude slightly. Apparently this also minimizes the error function, >> reducing the maximum error from 0.6 to about 0.3. >> >> I chose to apply this only when not saturating, as saturate already >> clamps to 1.0. This may or may not be a good idea. >> >> Fixes 16 dEQP precision tests >> >> dEQP-GLES31.functional.shaders.builtin_functions.precision. >> {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. >> >> at the cost of making every sin and cos call more expensive. >> >> Signed-off-by: Kenneth Graunke >> --- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 26 >> -- >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 26 >> -- >> 2 files changed, 40 insertions(+), 12 deletions(-) >> >> This has been in the Vulkan tree for a while - we needed it to pass the >> Vulkan CTS, as it contains these same dEQP tests. >> >> I haven't run shader-db yet, but I don't expect we'll like the results. >> >> The patch is pretty sketchy, too. I'm sort of tempted to hide it behind >> an INTEL_STRICT_CONFORMANCE=1 option, like we had way back in the day... > > FYI, a quick run on hsw_gt2 shows -10.45% on Gputest:voplosion. Can you explain this result? -10.45% of what? Instructions? FPS? And this is comparing what to what? Before this patch to after? > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Work around SIN/COS output range problem.
The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.27 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.7, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.6 to about 0.3. I chose to apply this only when not saturating, as saturate already clamps to 1.0. This may or may not be a good idea. Fixes 16 dEQP precision tests dEQP-GLES31.functional.shaders.builtin_functions.precision. {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. at the cost of making every sin and cos call more expensive. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 26 -- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 26 -- 2 files changed, 40 insertions(+), 12 deletions(-) This has been in the Vulkan tree for a while - we needed it to pass the Vulkan CTS, as it contains these same dEQP tests. I haven't run shader-db yet, but I don't expect we'll like the results. The patch is pretty sketchy, too. I'm sort of tempted to hide it behind an INTEL_STRICT_CONFORMANCE=1 option, like we had way back in the day... diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index cde8f0b..2f56f01 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -774,15 +774,29 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr) inst->saturate = instr->dest.saturate; break; - case nir_op_fsin: - inst = bld.emit(SHADER_OPCODE_SIN, result, op[0]); - inst->saturate = instr->dest.saturate; + case nir_op_fsin: { + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F); + inst = bld.emit(SHADER_OPCODE_SIN, tmp, op[0]); + if (instr->dest.saturate) { + inst->dst = result; + inst->saturate = true; + } else { + bld.MUL(result, tmp, brw_imm_f(0.7)); + } break; + } - case nir_op_fcos: - inst = bld.emit(SHADER_OPCODE_COS, result, op[0]); - inst->saturate = instr->dest.saturate; + case nir_op_fcos: { + fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F); + inst = bld.emit(SHADER_OPCODE_COS, tmp, op[0]); + if (instr->dest.saturate) { + inst->dst = result; + inst->saturate = true; + } else { + bld.MUL(result, tmp, brw_imm_f(0.7)); + } break; + } case nir_op_fddx: if (fs_key->high_quality_derivatives) { diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 52977f1..c7a2408 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -1100,15 +1100,29 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) inst->saturate = instr->dest.saturate; break; - case nir_op_fsin: - inst = emit_math(SHADER_OPCODE_SIN, dst, op[0]); - inst->saturate = instr->dest.saturate; + case nir_op_fsin: { + src_reg tmp = src_reg(this, glsl_type::vec4_type); + inst = emit_math(SHADER_OPCODE_SIN, dst_reg(tmp), op[0]); + if (instr->dest.saturate) { + inst->dst = dst; + inst->saturate = true; + } else { + emit(MUL(dst, tmp, brw_imm_f(0.7))); + } break; + } - case nir_op_fcos: - inst = emit_math(SHADER_OPCODE_COS, dst, op[0]); - inst->saturate = instr->dest.saturate; + case nir_op_fcos: { + src_reg tmp = src_reg(this, glsl_type::vec4_type); + inst = emit_math(SHADER_OPCODE_COS, dst_reg(tmp), op[0]); + if (instr->dest.saturate) { + inst->dst = dst; + inst->saturate = true; + } else { + emit(MUL(dst, tmp, brw_imm_f(0.7))); + } break; + } case nir_op_idiv: case nir_op_udiv: -- 2.7.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Work around SIN/COS output range problem.
On Thursday, March 17, 2016 6:18:39 PM PDT Martin Peres wrote: > On 16/03/16 19:33, Kenneth Graunke wrote: > > The SIN and COS instructions on Intel hardware can produce values > > slightly outside of the [-1.0, 1.0] range for a small set of values. > > Obviously, this can break everyone's expectations about trig functions. > > > > According to an internal presentation, the COS instruction can produce > > a value up to 1.27 for inputs in the range (0.08296, 0.09888). One > > suggested workaround is to multiply by 0.7, scaling down the > > amplitude slightly. Apparently this also minimizes the error function, > > reducing the maximum error from 0.6 to about 0.3. > > > > I chose to apply this only when not saturating, as saturate already > > clamps to 1.0. This may or may not be a good idea. > > > > Fixes 16 dEQP precision tests > > > > dEQP-GLES31.functional.shaders.builtin_functions.precision. > > {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. > > > > at the cost of making every sin and cos call more expensive. > > > > Signed-off-by: Kenneth Graunke > > --- > > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 26 +++ +-- > > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 26 +++ +-- > > 2 files changed, 40 insertions(+), 12 deletions(-) > > > > This has been in the Vulkan tree for a while - we needed it to pass the > > Vulkan CTS, as it contains these same dEQP tests. > > > > I haven't run shader-db yet, but I don't expect we'll like the results. > > > > The patch is pretty sketchy, too. I'm sort of tempted to hide it behind > > an INTEL_STRICT_CONFORMANCE=1 option, like we had way back in the day... > > FYI, a quick run on hsw_gt2 shows -10.45% on Gputest:voplosion. Thanks! It sounds like we probably should hide this behind some kind of INTEL_STRICT_CONFORMANCE option or build time flag. sin and cos are really common, and making them cost effectively twice as much is really painful. I'll come up with a v2 at some point, let's NAK this for now. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Work around SIN/COS output range problem.
On 16/03/16 19:33, Kenneth Graunke wrote: The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.27 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.7, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.6 to about 0.3. I chose to apply this only when not saturating, as saturate already clamps to 1.0. This may or may not be a good idea. Fixes 16 dEQP precision tests dEQP-GLES31.functional.shaders.builtin_functions.precision. {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. at the cost of making every sin and cos call more expensive. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 26 -- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 26 -- 2 files changed, 40 insertions(+), 12 deletions(-) This has been in the Vulkan tree for a while - we needed it to pass the Vulkan CTS, as it contains these same dEQP tests. I haven't run shader-db yet, but I don't expect we'll like the results. The patch is pretty sketchy, too. I'm sort of tempted to hide it behind an INTEL_STRICT_CONFORMANCE=1 option, like we had way back in the day... FYI, a quick run on hsw_gt2 shows -10.45% on Gputest:voplosion. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev