On Thu, 2016-02-25 at 18:20 -0500, Ilia Mirkin wrote: > On Thu, Feb 25, 2016 at 6:16 PM, Francisco Jerez <[email protected]> > wrote: > > Ian Romanick <[email protected]> writes: > > > >> On 02/25/2016 12:13 PM, Francisco Jerez wrote: > >>> Ian Romanick <[email protected]> writes: > >>> > >>>> On 02/25/2016 08:46 AM, Roland Scheidegger wrote: > >>>>> Am 25.02.2016 um 11:15 schrieb Iago Toral Quiroga: > >>>>>> From the OpenGL 4.2 spec: > >>>>>> > >>>>>> "When a constructor is used to convert any integer or floating-point > >>>>>> type to a > >>>>>> bool, 0 and 0.0 are converted to false, and non-zero values are > >>>>>> converted to > >>>>>> true." > >>>>>> > >>>>>> Thus, even the smallest non-zero floating value should be translated > >>>>>> to true. > >>>>>> This behavior has been verified also with proprietary NVIDIA drivers. > >>>>>> > >>>>>> Currently, we implement this conversion as a cmp.nz operation with > >>>>>> floats, > >>>>>> subject to floating-point precision limitations, and as a result, > >>>>>> relatively > >>>>>> small non-zero floating point numbers return false instead of true. > >>>>>> > >>>>>> This patch fixes the problem by getting rid of the sign bit (to cover > >>>>>> the case > >>>>>> of -0.0) and testing the result against 0u using an integer comparison > >>>>>> instead. > >>>>>> --- > >>>>>> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 ++++++++++++--- > >>>>>> 1 file changed, 12 insertions(+), 3 deletions(-) > >>>>>> > >>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >>>>>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >>>>>> index db20c71..7d62d7e 100644 > >>>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >>>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >>>>>> @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, > >>>>>> nir_alu_instr *instr) > >>>>>> bld.MOV(result, negate(op[0])); > >>>>>> break; > >>>>>> > >>>>>> - case nir_op_f2b: > >>>>>> - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); > >>>>>> - break; > >>>>>> + case nir_op_f2b: { > >>>>>> + /* Because comparing to 0.0f is subject to precision > >>>>>> limitations, do the > >>>>>> + * comparison using integers (we need to get rid of the sign > >>>>>> bit for that) > >>>>>> + */ > >>>>>> + if (devinfo->gen >= 8) > >>>>>> + op[0] = resolve_source_modifiers(op[0]); > >>>>>> + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); > >>>>>> + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFFFFFu)); > >>>>>> + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); > >>>>>> + break; > >>>>>> + } > >>>>>> + > >>>>>> case nir_op_i2b: > >>>>>> bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); > >>>>>> break; > >>>>>> > >>>>> > >>>>> Does that fix anything? I don't really see a problem with the existing > >>>>> logic. Yes any "non-zero value" should be converted to true. But surely > >>>>> that definition cannot include denorms, which you are always allowed to > >>>>> flush to zero. > >>>>> (Albeit I can't tell what the result would be with NaNs with the float > >>>>> compare, nor what the result actually should be in this case since glsl > >>>>> doesn't require NaNs neither.) > >>>> > >>>> Based on this and Jason's comments, I think we need a bunch of new tests. > >>>> > >>>> - smallest positive normal number > >>>> - abs of smallest positive normal number > >>>> - neg of " " " " > >>>> - largest positive subnormal number > >>>> - abs of largest positive subnormal number > >>>> - neg of " " " " > >>>> - all of the above with negative numbers > >>>> - NaN > >>>> - abs of NaN > >>>> - neg of " > >>>> > >>>> Perhaps others? +/-Inf just for kicks? > >>> > >>> What's the point? The result of most of the above (except possibly > >>> bool(NaN)) is undefined by the spec: > >>> > >>> "Any denormalized value input into a shader or potentially generated by > >>> any operation in a shader can be flushed to 0. [...] NaNs are not > >>> required to be generated. [...] Operations and built-in functions that > >>> operate on a NaN are not required to return a NaN as the result." > >> > >> Except that apparently one major OpenGL vendor does something well > >> defined that's different than what we do. > > > > I'm skeptical that nVidia would treat single-precision denorms > > inconsistently between datatype constructors and other floating-point > > arithmetic, but assuming that's the case it would be an argument for > > proposing the spec change to Khronos rather than introducing a dubiously > > compliant change into the back-end. I think I would argue against > > making such a change in the spec in any case, because even though it's > > implementation-defined whether denorms are flushed or not, the following > > is guaranteed by the spec AFAIUI: > > > > | if (bool(f)) > > | random_arithmetic_on(f /* Guaranteed not to be zero here even if > > | denorms are flushed */); > > > > With this change in, bool(f) would evaluate to true even if f is a > > denorm which is flushed to zero for all subsequent arithmetic in the > > block. For that reason this seems more likely to break stuff than to > > fix stuff, it's not like applications can expect denorms to be > > representable at all regardless of what nVidia does, but they can expect > > the above GLSL source to work. > > I'd be curious to see a shader test that triggers the relevant > behaviour -- AFAIK nvidia always flushes denorms. The flush is a > per-instruction setting though [since Fermi], so it's technically > feasible not to flush sometimes and to flush other times.
This one: https://github.com/Igalia/piglit/blob/arb_gpu_shader_fp64/tests/spec/arb_gpu_shader_fp64/execution/fs-conversion-explicit-double-bool-bad.shader_test That test is for doubles. The author of the test told me that the equivalent version for float also passed, but I have just verified that he made a mistake. In summary: that test passes for double and fails for float... The quote from the spec shared by Curro says: "Any denormalized value input into a shader or potentially generated by any operation in a shader can be flushed to 0" It says _can_ instead of _must_, so I share Curro's point of view that this is in fact undefined behavior by the spec and f2b of a denorm could return true or false and both are valid implementations. I wonder if nvidia's case of flushing floats but not doubles is also acceptable though, sounds like a bug to me. In any case, based on this I think we can drop the patches. Iago _______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
