Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
On Fri, Feb 23, 2018 at 8:18 PM, Bas Nieuwenhuizenwrote: > On Fri, Feb 23, 2018 at 6:39 PM, Connor Abbott wrote: >> On Fri, Feb 23, 2018 at 8:30 AM, Bas Nieuwenhuizen >> wrote: >>> >>> >>> On Thu, Feb 15, 2018 at 8:54 AM, Connor Abbott wrote: On Wed, Feb 14, 2018 at 11:53 PM, Timothy Arceri wrote: > > > On 15/02/18 04:39, Marek Olšák wrote: >> >> Reviewed-by: Marek Olšák >> >> Marek >> >> On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri >> wrote: >>> >>> Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for >>> radeonsi and should fix SPIRV errors when LLVM optimises away >>> the workarounds in vtn_handle_alu() for handling ordered >>> comparisons. >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 >>> --- >>> src/amd/common/ac_nir_to_llvm.c | 8 >>> 1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> diff --git a/src/amd/common/ac_nir_to_llvm.c >>> b/src/amd/common/ac_nir_to_llvm.c >>> index a0c5680205..e81f86bb08 100644 >>> --- a/src/amd/common/ac_nir_to_llvm.c >>> +++ b/src/amd/common/ac_nir_to_llvm.c >>> @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context >>> *ctx, >>> const nir_alu_instr *instr) >>> result = emit_int_cmp(>ac, LLVMIntUGE, src[0], >>> src[1]); >>> break; >>> case nir_op_feq: >>> - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], >>> src[1]); >>> + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], >>> src[1]); >>> break; >>> case nir_op_fne: >>> - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], >>> src[1]); >>> + result = emit_float_cmp(>ac, LLVMRealONE, src[0], >>> src[1]); > > > It seems we need to leave this one as is to avoid regressions. This is > also > what radeonsi does. So, the thing you have to understand is that in LLVM unordered comparisons are precisely the inverse of the ordered comparisons. That is, (a <=(ordered) b) == !(a >(unordered b), (a ==(ordered) b) == !(a !=(unordered) b), and so on. C defines that all comparsions are ordered except !=, so that (a == b) == !(a != b) always holds true. Most hardware follows this convention -- offhand, x86 SSE is the only ISA I know of with separate ordered and unordered comparisons, and LLVM appears to have copied the distinction from them, but no one else has both. I'm not even sure if it's in the IEEE spec. GLSL follows the C convention, so glsl_to_nir just uses nir_op_fne to mean unordered not-equal. spirv_to_nir generates some extra instructions, which then get stripped away later... sigh. I think the right way to untangle this mess is to define that the NIR opcodes should always match the C convention. The separate ordered and unordered opcodes are unnecesary, since one is just the logical negation of the other, and LLVM was a little overzealous -- I'm sure they would get rid of the distinction if they had the chance -- and then they were blindly copied to SPIR-V. spirv_to_nir should just negate the result if necessary rather than emitting the extra code to handle NaN, and ac should use ordered except for not-equals. >>> >>> >>> I think we should also use ordered for not-equal. Otherwise we have no way >>> to contruct an unordered equal or ordered not-equal using the not-trick. I >>> think that would be more important than trying to keep it in sync with C? >> >> I was thinking about that too... but all the backends (except for >> radv), frontends, opt_algebraic patterns, etc. currently assume fne >> means unordered not-equals. We'd have to rewrite a lot of stuff to >> flip the meaning. But if you're willing to do all the mechanical >> rewriting, sure :). > > Well, nir_opt_algebraic completely disregards whether it is ordered > and will happily flip it the wrong way > (https://cgit.freedesktop.org/mesa/mesa/tree/src/compiler/nir/nir_opt_algebraic.py#n137) > and it seems like spirv does try to be extra careful by doing the NaN > checks manually for all the comparisons (but in the process still > relies on feq being ordered). So I'd argue that we need some > non-backend work either way. > > I agree that switching over all backends is annoying though, > especially because not all of them have a great instruction selector > that we can trust to optimize out any of the not's if we don't lower > them in nir_opt_algebraic to instructions with explicit orderedness. > So radv switching over to what intel uses seems OK for now. > > btw
Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
On Fri, Feb 23, 2018 at 6:39 PM, Connor Abbottwrote: > On Fri, Feb 23, 2018 at 8:30 AM, Bas Nieuwenhuizen wrote: >> >> >> On Thu, Feb 15, 2018 at 8:54 AM, Connor Abbott wrote: >>> >>> On Wed, Feb 14, 2018 at 11:53 PM, Timothy Arceri >>> wrote: >>> > >>> > >>> > On 15/02/18 04:39, Marek Olšák wrote: >>> >> >>> >> Reviewed-by: Marek Olšák >>> >> >>> >> Marek >>> >> >>> >> On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri >>> >> wrote: >>> >>> >>> >>> Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for >>> >>> radeonsi and should fix SPIRV errors when LLVM optimises away >>> >>> the workarounds in vtn_handle_alu() for handling ordered >>> >>> comparisons. >>> >>> >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 >>> >>> --- >>> >>> src/amd/common/ac_nir_to_llvm.c | 8 >>> >>> 1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> >>> >>> diff --git a/src/amd/common/ac_nir_to_llvm.c >>> >>> b/src/amd/common/ac_nir_to_llvm.c >>> >>> index a0c5680205..e81f86bb08 100644 >>> >>> --- a/src/amd/common/ac_nir_to_llvm.c >>> >>> +++ b/src/amd/common/ac_nir_to_llvm.c >>> >>> @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context >>> >>> *ctx, >>> >>> const nir_alu_instr *instr) >>> >>> result = emit_int_cmp(>ac, LLVMIntUGE, src[0], >>> >>> src[1]); >>> >>> break; >>> >>> case nir_op_feq: >>> >>> - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], >>> >>> src[1]); >>> >>> + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], >>> >>> src[1]); >>> >>> break; >>> >>> case nir_op_fne: >>> >>> - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], >>> >>> src[1]); >>> >>> + result = emit_float_cmp(>ac, LLVMRealONE, src[0], >>> >>> src[1]); >>> > >>> > >>> > It seems we need to leave this one as is to avoid regressions. This is >>> > also >>> > what radeonsi does. >>> >>> So, the thing you have to understand is that in LLVM unordered >>> comparisons are precisely the inverse of the ordered comparisons. That >>> is, (a <=(ordered) b) == !(a >(unordered b), (a ==(ordered) b) == !(a >>> !=(unordered) b), and so on. C defines that all comparsions are >>> ordered except !=, so that (a == b) == !(a != b) always holds true. >>> Most hardware follows this convention -- offhand, x86 SSE is the only >>> ISA I know of with separate ordered and unordered comparisons, and >>> LLVM appears to have copied the distinction from them, but no one else >>> has both. I'm not even sure if it's in the IEEE spec. GLSL follows the >>> C convention, so glsl_to_nir just uses nir_op_fne to mean unordered >>> not-equal. spirv_to_nir generates some extra instructions, which then >>> get stripped away later... sigh. >>> >>> I think the right way to untangle this mess is to define that the NIR >>> opcodes should always match the C convention. The separate ordered and >>> unordered opcodes are unnecesary, since one is just the logical >>> negation of the other, and LLVM was a little overzealous -- I'm sure >>> they would get rid of the distinction if they had the chance -- and >>> then they were blindly copied to SPIR-V. spirv_to_nir should just >>> negate the result if necessary rather than emitting the extra code to >>> handle NaN, and ac should use ordered except for not-equals. >> >> >> I think we should also use ordered for not-equal. Otherwise we have no way >> to contruct an unordered equal or ordered not-equal using the not-trick. I >> think that would be more important than trying to keep it in sync with C? > > I was thinking about that too... but all the backends (except for > radv), frontends, opt_algebraic patterns, etc. currently assume fne > means unordered not-equals. We'd have to rewrite a lot of stuff to > flip the meaning. But if you're willing to do all the mechanical > rewriting, sure :). Well, nir_opt_algebraic completely disregards whether it is ordered and will happily flip it the wrong way (https://cgit.freedesktop.org/mesa/mesa/tree/src/compiler/nir/nir_opt_algebraic.py#n137) and it seems like spirv does try to be extra careful by doing the NaN checks manually for all the comparisons (but in the process still relies on feq being ordered). So I'd argue that we need some non-backend work either way. I agree that switching over all backends is annoying though, especially because not all of them have a great instruction selector that we can trust to optimize out any of the not's if we don't lower them in nir_opt_algebraic to instructions with explicit orderedness. So radv switching over to what intel uses seems OK for now. btw should we document this somewhere? Either in the instruction name, or at least in the definition in nir_opcodes.py? > >>> >>> >>> > >>> > >>> >>> break; >>> >>> case nir_op_flt:
Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
On Fri, Feb 23, 2018 at 6:39 PM, Connor Abbottwrote: > On Fri, Feb 23, 2018 at 8:30 AM, Bas Nieuwenhuizen wrote: >> >> >> On Thu, Feb 15, 2018 at 8:54 AM, Connor Abbott wrote: >>> >>> On Wed, Feb 14, 2018 at 11:53 PM, Timothy Arceri >>> wrote: >>> > >>> > >>> > On 15/02/18 04:39, Marek Olšák wrote: >>> >> >>> >> Reviewed-by: Marek Olšák >>> >> >>> >> Marek >>> >> >>> >> On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri >>> >> wrote: >>> >>> >>> >>> Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for >>> >>> radeonsi and should fix SPIRV errors when LLVM optimises away >>> >>> the workarounds in vtn_handle_alu() for handling ordered >>> >>> comparisons. >>> >>> >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 >>> >>> --- >>> >>> src/amd/common/ac_nir_to_llvm.c | 8 >>> >>> 1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> >>> >>> diff --git a/src/amd/common/ac_nir_to_llvm.c >>> >>> b/src/amd/common/ac_nir_to_llvm.c >>> >>> index a0c5680205..e81f86bb08 100644 >>> >>> --- a/src/amd/common/ac_nir_to_llvm.c >>> >>> +++ b/src/amd/common/ac_nir_to_llvm.c >>> >>> @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context >>> >>> *ctx, >>> >>> const nir_alu_instr *instr) >>> >>> result = emit_int_cmp(>ac, LLVMIntUGE, src[0], >>> >>> src[1]); >>> >>> break; >>> >>> case nir_op_feq: >>> >>> - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], >>> >>> src[1]); >>> >>> + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], >>> >>> src[1]); >>> >>> break; >>> >>> case nir_op_fne: >>> >>> - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], >>> >>> src[1]); >>> >>> + result = emit_float_cmp(>ac, LLVMRealONE, src[0], >>> >>> src[1]); >>> > >>> > >>> > It seems we need to leave this one as is to avoid regressions. This is >>> > also >>> > what radeonsi does. >>> >>> So, the thing you have to understand is that in LLVM unordered >>> comparisons are precisely the inverse of the ordered comparisons. That >>> is, (a <=(ordered) b) == !(a >(unordered b), (a ==(ordered) b) == !(a >>> !=(unordered) b), and so on. C defines that all comparsions are >>> ordered except !=, so that (a == b) == !(a != b) always holds true. >>> Most hardware follows this convention -- offhand, x86 SSE is the only >>> ISA I know of with separate ordered and unordered comparisons, and >>> LLVM appears to have copied the distinction from them, but no one else >>> has both. I'm not even sure if it's in the IEEE spec. GLSL follows the >>> C convention, so glsl_to_nir just uses nir_op_fne to mean unordered >>> not-equal. spirv_to_nir generates some extra instructions, which then >>> get stripped away later... sigh. >>> >>> I think the right way to untangle this mess is to define that the NIR >>> opcodes should always match the C convention. The separate ordered and >>> unordered opcodes are unnecesary, since one is just the logical >>> negation of the other, and LLVM was a little overzealous -- I'm sure >>> they would get rid of the distinction if they had the chance -- and >>> then they were blindly copied to SPIR-V. spirv_to_nir should just >>> negate the result if necessary rather than emitting the extra code to >>> handle NaN, and ac should use ordered except for not-equals. >> >> >> I think we should also use ordered for not-equal. Otherwise we have no way >> to contruct an unordered equal or ordered not-equal using the not-trick. I >> think that would be more important than trying to keep it in sync with C? > > I was thinking about that too... but all the backends (except for > radv), frontends, opt_algebraic patterns, etc. currently assume fne > means unordered not-equals. We'd have to rewrite a lot of stuff to > flip the meaning. But if you're willing to do all the mechanical > rewriting, sure :). not-equal is unordered in DirectX, GLSL, and C. Let's keep it that way. Ordered not-equal can't be generated as far as I know. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
On Fri, Feb 23, 2018 at 8:30 AM, Bas Nieuwenhuizenwrote: > > > On Thu, Feb 15, 2018 at 8:54 AM, Connor Abbott wrote: >> >> On Wed, Feb 14, 2018 at 11:53 PM, Timothy Arceri >> wrote: >> > >> > >> > On 15/02/18 04:39, Marek Olšák wrote: >> >> >> >> Reviewed-by: Marek Olšák >> >> >> >> Marek >> >> >> >> On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri >> >> wrote: >> >>> >> >>> Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for >> >>> radeonsi and should fix SPIRV errors when LLVM optimises away >> >>> the workarounds in vtn_handle_alu() for handling ordered >> >>> comparisons. >> >>> >> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 >> >>> --- >> >>> src/amd/common/ac_nir_to_llvm.c | 8 >> >>> 1 file changed, 4 insertions(+), 4 deletions(-) >> >>> >> >>> diff --git a/src/amd/common/ac_nir_to_llvm.c >> >>> b/src/amd/common/ac_nir_to_llvm.c >> >>> index a0c5680205..e81f86bb08 100644 >> >>> --- a/src/amd/common/ac_nir_to_llvm.c >> >>> +++ b/src/amd/common/ac_nir_to_llvm.c >> >>> @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context >> >>> *ctx, >> >>> const nir_alu_instr *instr) >> >>> result = emit_int_cmp(>ac, LLVMIntUGE, src[0], >> >>> src[1]); >> >>> break; >> >>> case nir_op_feq: >> >>> - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], >> >>> src[1]); >> >>> + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], >> >>> src[1]); >> >>> break; >> >>> case nir_op_fne: >> >>> - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], >> >>> src[1]); >> >>> + result = emit_float_cmp(>ac, LLVMRealONE, src[0], >> >>> src[1]); >> > >> > >> > It seems we need to leave this one as is to avoid regressions. This is >> > also >> > what radeonsi does. >> >> So, the thing you have to understand is that in LLVM unordered >> comparisons are precisely the inverse of the ordered comparisons. That >> is, (a <=(ordered) b) == !(a >(unordered b), (a ==(ordered) b) == !(a >> !=(unordered) b), and so on. C defines that all comparsions are >> ordered except !=, so that (a == b) == !(a != b) always holds true. >> Most hardware follows this convention -- offhand, x86 SSE is the only >> ISA I know of with separate ordered and unordered comparisons, and >> LLVM appears to have copied the distinction from them, but no one else >> has both. I'm not even sure if it's in the IEEE spec. GLSL follows the >> C convention, so glsl_to_nir just uses nir_op_fne to mean unordered >> not-equal. spirv_to_nir generates some extra instructions, which then >> get stripped away later... sigh. >> >> I think the right way to untangle this mess is to define that the NIR >> opcodes should always match the C convention. The separate ordered and >> unordered opcodes are unnecesary, since one is just the logical >> negation of the other, and LLVM was a little overzealous -- I'm sure >> they would get rid of the distinction if they had the chance -- and >> then they were blindly copied to SPIR-V. spirv_to_nir should just >> negate the result if necessary rather than emitting the extra code to >> handle NaN, and ac should use ordered except for not-equals. > > > I think we should also use ordered for not-equal. Otherwise we have no way > to contruct an unordered equal or ordered not-equal using the not-trick. I > think that would be more important than trying to keep it in sync with C? I was thinking about that too... but all the backends (except for radv), frontends, opt_algebraic patterns, etc. currently assume fne means unordered not-equals. We'd have to rewrite a lot of stuff to flip the meaning. But if you're willing to do all the mechanical rewriting, sure :). >> >> >> > >> > >> >>> break; >> >>> case nir_op_flt: >> >>> - result = emit_float_cmp(>ac, LLVMRealULT, src[0], >> >>> src[1]); >> >>> + result = emit_float_cmp(>ac, LLVMRealOLT, src[0], >> >>> src[1]); >> >>> break; >> >>> case nir_op_fge: >> >>> - result = emit_float_cmp(>ac, LLVMRealUGE, src[0], >> >>> src[1]); >> >>> + result = emit_float_cmp(>ac, LLVMRealOGE, src[0], >> >>> src[1]); >> >>> break; >> >>> case nir_op_ufeq: >> >>> result = emit_float_cmp(>ac, LLVMRealUEQ, >> >>> src[0], >> >>> src[1]); >> >>> -- >> >>> 2.14.3 >> >>> >> >>> ___ >> >>> mesa-dev mailing list >> >>> mesa-dev@lists.freedesktop.org >> >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >> > >> > ___ >> > mesa-dev mailing list >> > mesa-dev@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev >> ___ >> mesa-dev mailing
Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
On Thu, Feb 15, 2018 at 8:54 AM, Connor Abbottwrote: > On Wed, Feb 14, 2018 at 11:53 PM, Timothy Arceri > wrote: > > > > > > On 15/02/18 04:39, Marek Olšák wrote: > >> > >> Reviewed-by: Marek Olšák > >> > >> Marek > >> > >> On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri > >> wrote: > >>> > >>> Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for > >>> radeonsi and should fix SPIRV errors when LLVM optimises away > >>> the workarounds in vtn_handle_alu() for handling ordered > >>> comparisons. > >>> > >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 > >>> --- > >>> src/amd/common/ac_nir_to_llvm.c | 8 > >>> 1 file changed, 4 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/src/amd/common/ac_nir_to_llvm.c > >>> b/src/amd/common/ac_nir_to_llvm.c > >>> index a0c5680205..e81f86bb08 100644 > >>> --- a/src/amd/common/ac_nir_to_llvm.c > >>> +++ b/src/amd/common/ac_nir_to_llvm.c > >>> @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context > *ctx, > >>> const nir_alu_instr *instr) > >>> result = emit_int_cmp(>ac, LLVMIntUGE, src[0], > >>> src[1]); > >>> break; > >>> case nir_op_feq: > >>> - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], > >>> src[1]); > >>> + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], > >>> src[1]); > >>> break; > >>> case nir_op_fne: > >>> - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], > >>> src[1]); > >>> + result = emit_float_cmp(>ac, LLVMRealONE, src[0], > >>> src[1]); > > > > > > It seems we need to leave this one as is to avoid regressions. This is > also > > what radeonsi does. > > So, the thing you have to understand is that in LLVM unordered > comparisons are precisely the inverse of the ordered comparisons. That > is, (a <=(ordered) b) == !(a >(unordered b), (a ==(ordered) b) == !(a > !=(unordered) b), and so on. C defines that all comparsions are > ordered except !=, so that (a == b) == !(a != b) always holds true. > Most hardware follows this convention -- offhand, x86 SSE is the only > ISA I know of with separate ordered and unordered comparisons, and > LLVM appears to have copied the distinction from them, but no one else > has both. I'm not even sure if it's in the IEEE spec. GLSL follows the > C convention, so glsl_to_nir just uses nir_op_fne to mean unordered > not-equal. spirv_to_nir generates some extra instructions, which then > get stripped away later... sigh. > > I think the right way to untangle this mess is to define that the NIR > opcodes should always match the C convention. The separate ordered and > unordered opcodes are unnecesary, since one is just the logical > negation of the other, and LLVM was a little overzealous -- I'm sure > they would get rid of the distinction if they had the chance -- and > then they were blindly copied to SPIR-V. spirv_to_nir should just > negate the result if necessary rather than emitting the extra code to > handle NaN, and ac should use ordered except for not-equals. > I think we should also use ordered for not-equal. Otherwise we have no way to contruct an unordered equal or ordered not-equal using the not-trick. I think that would be more important than trying to keep it in sync with C? > > > > > > >>> break; > >>> case nir_op_flt: > >>> - result = emit_float_cmp(>ac, LLVMRealULT, src[0], > >>> src[1]); > >>> + result = emit_float_cmp(>ac, LLVMRealOLT, src[0], > >>> src[1]); > >>> break; > >>> case nir_op_fge: > >>> - result = emit_float_cmp(>ac, LLVMRealUGE, src[0], > >>> src[1]); > >>> + result = emit_float_cmp(>ac, LLVMRealOGE, src[0], > >>> src[1]); > >>> break; > >>> case nir_op_ufeq: > >>> result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], > >>> src[1]); > >>> -- > >>> 2.14.3 > >>> > >>> ___ > >>> mesa-dev mailing list > >>> mesa-dev@lists.freedesktop.org > >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
On 15/02/18 20:06, Bas Nieuwenhuizen wrote: On Thu, Feb 15, 2018 at 8:54 AM, Connor Abbottwrote: On Wed, Feb 14, 2018 at 11:53 PM, Timothy Arceri wrote: On 15/02/18 04:39, Marek Olšák wrote: Reviewed-by: Marek Olšák Marek On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri wrote: Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for radeonsi and should fix SPIRV errors when LLVM optimises away the workarounds in vtn_handle_alu() for handling ordered comparisons. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 --- src/amd/common/ac_nir_to_llvm.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index a0c5680205..e81f86bb08 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr) result = emit_int_cmp(>ac, LLVMIntUGE, src[0], src[1]); break; case nir_op_feq: - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], src[1]); break; case nir_op_fne: - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealONE, src[0], src[1]); It seems we need to leave this one as is to avoid regressions. This is also what radeonsi does. So, the thing you have to understand is that in LLVM unordered comparisons are precisely the inverse of the ordered comparisons. That is, (a <=(ordered) b) == !(a >(unordered b), (a ==(ordered) b) == !(a !=(unordered) b), and so on. C defines that all comparsions are ordered except !=, so that (a == b) == !(a != b) always holds true. Most hardware follows this convention -- offhand, x86 SSE is the only ISA I know of with separate ordered and unordered comparisons, and LLVM appears to have copied the distinction from them, but no one else has both. I'm not even sure if it's in the IEEE spec. GLSL follows the C convention, so glsl_to_nir just uses nir_op_fne to mean unordered not-equal. spirv_to_nir generates some extra instructions, which then get stripped away later... sigh. I think the right way to untangle this mess is to define that the NIR opcodes should always match the C convention. The separate ordered and unordered opcodes are unnecesary, since one is just the logical negation of the other, and LLVM was a little overzealous -- I'm sure they would get rid of the distinction if they had the chance -- and then they were blindly copied to SPIR-V. spirv_to_nir should just negate the result if necessary rather than emitting the extra code to handle NaN, and ac should use ordered except for not-equals. GCN hardware actually has both ordered and unordered instructions, though I think it could be fair to only introduce them during instruction selection (or conversion to LLVM) and keep a canonical ordered comparison + not in nir. I think the most important part would be to firmly define which one the nir instructions are and then make nir_opt_algebraic not break that. Well you guys seem to understand the issue and solutions better then I do so I think I'll step back from this for the time being. So feel free to tackle the issue. I've also noticed what seems to be mishandling of nans elsewhere. For example on radeonsi the following test produces "vec1 32 ssa_5 = load_const (0x /* -nan */)" which seems to incorrectly handled and optimised out at some point along the way. ./glcts --deqp-case=KHR-GL43.compute_shader.atomic-case1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
On Thu, Feb 15, 2018 at 8:54 AM, Connor Abbottwrote: > On Wed, Feb 14, 2018 at 11:53 PM, Timothy Arceri > wrote: >> >> >> On 15/02/18 04:39, Marek Olšák wrote: >>> >>> Reviewed-by: Marek Olšák >>> >>> Marek >>> >>> On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri >>> wrote: Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for radeonsi and should fix SPIRV errors when LLVM optimises away the workarounds in vtn_handle_alu() for handling ordered comparisons. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 --- src/amd/common/ac_nir_to_llvm.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index a0c5680205..e81f86bb08 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr) result = emit_int_cmp(>ac, LLVMIntUGE, src[0], src[1]); break; case nir_op_feq: - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], src[1]); break; case nir_op_fne: - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealONE, src[0], src[1]); >> >> >> It seems we need to leave this one as is to avoid regressions. This is also >> what radeonsi does. > > So, the thing you have to understand is that in LLVM unordered > comparisons are precisely the inverse of the ordered comparisons. That > is, (a <=(ordered) b) == !(a >(unordered b), (a ==(ordered) b) == !(a > !=(unordered) b), and so on. C defines that all comparsions are > ordered except !=, so that (a == b) == !(a != b) always holds true. > Most hardware follows this convention -- offhand, x86 SSE is the only > ISA I know of with separate ordered and unordered comparisons, and > LLVM appears to have copied the distinction from them, but no one else > has both. I'm not even sure if it's in the IEEE spec. GLSL follows the > C convention, so glsl_to_nir just uses nir_op_fne to mean unordered > not-equal. spirv_to_nir generates some extra instructions, which then > get stripped away later... sigh. > > I think the right way to untangle this mess is to define that the NIR > opcodes should always match the C convention. The separate ordered and > unordered opcodes are unnecesary, since one is just the logical > negation of the other, and LLVM was a little overzealous -- I'm sure > they would get rid of the distinction if they had the chance -- and > then they were blindly copied to SPIR-V. spirv_to_nir should just > negate the result if necessary rather than emitting the extra code to > handle NaN, and ac should use ordered except for not-equals. GCN hardware actually has both ordered and unordered instructions, though I think it could be fair to only introduce them during instruction selection (or conversion to LLVM) and keep a canonical ordered comparison + not in nir. I think the most important part would be to firmly define which one the nir instructions are and then make nir_opt_algebraic not break that. > >> >> break; case nir_op_flt: - result = emit_float_cmp(>ac, LLVMRealULT, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOLT, src[0], src[1]); break; case nir_op_fge: - result = emit_float_cmp(>ac, LLVMRealUGE, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOGE, src[0], src[1]); break; case nir_op_ufeq: result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], src[1]); -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
On Wed, Feb 14, 2018 at 11:53 PM, Timothy Arceriwrote: > > > On 15/02/18 04:39, Marek Olšák wrote: >> >> Reviewed-by: Marek Olšák >> >> Marek >> >> On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri >> wrote: >>> >>> Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for >>> radeonsi and should fix SPIRV errors when LLVM optimises away >>> the workarounds in vtn_handle_alu() for handling ordered >>> comparisons. >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 >>> --- >>> src/amd/common/ac_nir_to_llvm.c | 8 >>> 1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> diff --git a/src/amd/common/ac_nir_to_llvm.c >>> b/src/amd/common/ac_nir_to_llvm.c >>> index a0c5680205..e81f86bb08 100644 >>> --- a/src/amd/common/ac_nir_to_llvm.c >>> +++ b/src/amd/common/ac_nir_to_llvm.c >>> @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context *ctx, >>> const nir_alu_instr *instr) >>> result = emit_int_cmp(>ac, LLVMIntUGE, src[0], >>> src[1]); >>> break; >>> case nir_op_feq: >>> - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], >>> src[1]); >>> + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], >>> src[1]); >>> break; >>> case nir_op_fne: >>> - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], >>> src[1]); >>> + result = emit_float_cmp(>ac, LLVMRealONE, src[0], >>> src[1]); > > > It seems we need to leave this one as is to avoid regressions. This is also > what radeonsi does. So, the thing you have to understand is that in LLVM unordered comparisons are precisely the inverse of the ordered comparisons. That is, (a <=(ordered) b) == !(a >(unordered b), (a ==(ordered) b) == !(a !=(unordered) b), and so on. C defines that all comparsions are ordered except !=, so that (a == b) == !(a != b) always holds true. Most hardware follows this convention -- offhand, x86 SSE is the only ISA I know of with separate ordered and unordered comparisons, and LLVM appears to have copied the distinction from them, but no one else has both. I'm not even sure if it's in the IEEE spec. GLSL follows the C convention, so glsl_to_nir just uses nir_op_fne to mean unordered not-equal. spirv_to_nir generates some extra instructions, which then get stripped away later... sigh. I think the right way to untangle this mess is to define that the NIR opcodes should always match the C convention. The separate ordered and unordered opcodes are unnecesary, since one is just the logical negation of the other, and LLVM was a little overzealous -- I'm sure they would get rid of the distinction if they had the chance -- and then they were blindly copied to SPIR-V. spirv_to_nir should just negate the result if necessary rather than emitting the extra code to handle NaN, and ac should use ordered except for not-equals. > > >>> break; >>> case nir_op_flt: >>> - result = emit_float_cmp(>ac, LLVMRealULT, src[0], >>> src[1]); >>> + result = emit_float_cmp(>ac, LLVMRealOLT, src[0], >>> src[1]); >>> break; >>> case nir_op_fge: >>> - result = emit_float_cmp(>ac, LLVMRealUGE, src[0], >>> src[1]); >>> + result = emit_float_cmp(>ac, LLVMRealOGE, src[0], >>> src[1]); >>> break; >>> case nir_op_ufeq: >>> result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], >>> src[1]); >>> -- >>> 2.14.3 >>> >>> ___ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
On 15/02/18 04:39, Marek Olšák wrote: Reviewed-by: Marek OlšákMarek On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri wrote: Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for radeonsi and should fix SPIRV errors when LLVM optimises away the workarounds in vtn_handle_alu() for handling ordered comparisons. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 --- src/amd/common/ac_nir_to_llvm.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index a0c5680205..e81f86bb08 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr) result = emit_int_cmp(>ac, LLVMIntUGE, src[0], src[1]); break; case nir_op_feq: - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], src[1]); break; case nir_op_fne: - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealONE, src[0], src[1]); It seems we need to leave this one as is to avoid regressions. This is also what radeonsi does. break; case nir_op_flt: - result = emit_float_cmp(>ac, LLVMRealULT, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOLT, src[0], src[1]); break; case nir_op_fge: - result = emit_float_cmp(>ac, LLVMRealUGE, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOGE, src[0], src[1]); break; case nir_op_ufeq: result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], src[1]); -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
Reviewed-by: Marek OlšákMarek On Wed, Feb 14, 2018 at 7:29 AM, Timothy Arceri wrote: > Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for > radeonsi and should fix SPIRV errors when LLVM optimises away > the workarounds in vtn_handle_alu() for handling ordered > comparisons. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 > --- > src/amd/common/ac_nir_to_llvm.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c > index a0c5680205..e81f86bb08 100644 > --- a/src/amd/common/ac_nir_to_llvm.c > +++ b/src/amd/common/ac_nir_to_llvm.c > @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context *ctx, > const nir_alu_instr *instr) > result = emit_int_cmp(>ac, LLVMIntUGE, src[0], src[1]); > break; > case nir_op_feq: > - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], > src[1]); > + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], > src[1]); > break; > case nir_op_fne: > - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], > src[1]); > + result = emit_float_cmp(>ac, LLVMRealONE, src[0], > src[1]); > break; > case nir_op_flt: > - result = emit_float_cmp(>ac, LLVMRealULT, src[0], > src[1]); > + result = emit_float_cmp(>ac, LLVMRealOLT, src[0], > src[1]); > break; > case nir_op_fge: > - result = emit_float_cmp(>ac, LLVMRealUGE, src[0], > src[1]); > + result = emit_float_cmp(>ac, LLVMRealOGE, src[0], > src[1]); > break; > case nir_op_ufeq: > result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], > src[1]); > -- > 2.14.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/5] ac: use correct LLVM opcodes for ordered comparisons
Fixes glsl-1.30/execution/isinf-and-isnan* piglit tests for radeonsi and should fix SPIRV errors when LLVM optimises away the workarounds in vtn_handle_alu() for handling ordered comparisons. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 --- src/amd/common/ac_nir_to_llvm.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index a0c5680205..e81f86bb08 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -1792,16 +1792,16 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr) result = emit_int_cmp(>ac, LLVMIntUGE, src[0], src[1]); break; case nir_op_feq: - result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOEQ, src[0], src[1]); break; case nir_op_fne: - result = emit_float_cmp(>ac, LLVMRealUNE, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealONE, src[0], src[1]); break; case nir_op_flt: - result = emit_float_cmp(>ac, LLVMRealULT, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOLT, src[0], src[1]); break; case nir_op_fge: - result = emit_float_cmp(>ac, LLVMRealUGE, src[0], src[1]); + result = emit_float_cmp(>ac, LLVMRealOGE, src[0], src[1]); break; case nir_op_ufeq: result = emit_float_cmp(>ac, LLVMRealUEQ, src[0], src[1]); -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev