[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #20 from Richard Biener --- Fixed.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #19 from CVS Commits --- The master branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:7797f5ec58078523a452e5cf239596e13d77d885 commit r11-535-g7797f5ec58078523a452e5cf239596e13d77d885 Author: Uros Bizjak Date: Thu May 21 01:53:09 2020 +0200 i386: Do not use commutative operands with (use) RTX [PR95238] 2020-05-21 Uroš Bizjak gcc/ChangeLog: PR target/95218 * config/i386/mmx.md (*mmx_v2sf): Do not mark operands 1 and 2 commutative. Manually swap operands. (*mmx_nabsv2sf2): Ditto. Partially revert: * config/i386/i386.md (*tf2_1): Mark operands 1 and 2 commutative. (*nabstf2_1): Ditto. * config/i386/sse.md (*2): Mark operands 1 and 2 commutative. Do not swap operands. (*nabs2): Ditto.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #18 from Uroš Bizjak --- Created attachment 48575 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48575=edit Patch in testing.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #17 from Uroš Bizjak --- The problem is with commutative operands, these somehow confuse postreload pass. I'll commit partial revert that basically puts back: (define_insn_and_split "*2" - [(set (match_operand:VF 0 "register_operand" "=x,v") + [(set (match_operand:VF 0 "register_operand" "=x,x,v,v") (absneg:VF - (match_operand:VF 1 "vector_operand" "%0,v"))) - (use (match_operand:VF 2 "vector_operand" "xBm,vm"))] + (match_operand:VF 1 "vector_operand" "0,xBm,v,m"))) + (use (match_operand:VF 2 "vector_operand" "xBm,0,vm,v"))] with manual swapping of operands.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 Jeffrey A. Law changed: What|Removed |Added CC||law at redhat dot com --- Comment #16 from Jeffrey A. Law --- A "naked" use as an INSN should only occur inside the pattern of an insn (there are exceptins that exist internally within reload and delay slot scheduling). It's only supposed to be used to mark that a particular object is live and used by the insn. I'll note the following: ... This means that @code{use} can @emph{only} be used to describe that the register is live. You should think twice before adding @code{use} statements, more often you will want to use @code{unspec} instead. The @code{use} RTX is most commonly useful to describe that a fixed register is implicitly used in an insn. It is also safe to use in patterns where the compiler knows for other reasons that the result of the whole pattern is variable, such as @samp{cpymem@var{m}} or @samp{call} patterns. I think one could reasonably think that we should support (use (mem)) with the same semantics -- it means that the value in memory is used which is helpful in dependency tracking.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #15 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #13) > So perhaps pre-reload splitter of that into the UNSPEC form? Vector insns should be able to use pre-reload splitter, but scalar instructions depend on post-reload splitter, because they are split in a different way, depending on a register set of allocated register (FP, XMM or even integer). So, it really needs to be a post-reload splitter.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 Uroš Bizjak changed: What|Removed |Added CC|uros at gcc dot gnu.org| --- Comment #14 from Uroš Bizjak --- Its interesting to note, that only *some* of insns with memory uses get removed.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #13 from Jakub Jelinek --- So perhaps pre-reload splitter of that into the UNSPEC form?
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #12 from Uroš Bizjak --- (In reply to Richard Biener from comment #11) > Note a 'use' is not something that needs to be preserved, so > > (define_insn_and_split "*2" > [(set (match_operand:VF 0 "register_operand" "=x,v") > (absneg:VF > (match_operand:VF 1 "vector_operand" "%0,v"))) >(use (match_operand:VF 2 "vector_operand" "xBm,vm"))] > "TARGET_SSE" > "#" > "&& reload_completed" > [(set (match_dup 0) > (:VF (match_dup 1) (match_dup 2)))] > "" > [(set_attr "isa" "noavx,avx")]) > > doesn't make much sense (before reload). To me, that is. Why do > we go that obfuscated way at all? I think a clean solution is to > use an UNSPEC here (well, "clean"...). The reason for this approach was, that combine still processes the pattern as abs/nop. Please see how *nabstf2_1 is defined.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #11 from Richard Biener --- Note a 'use' is not something that needs to be preserved, so (define_insn_and_split "*2" [(set (match_operand:VF 0 "register_operand" "=x,v") (absneg:VF (match_operand:VF 1 "vector_operand" "%0,v"))) (use (match_operand:VF 2 "vector_operand" "xBm,vm"))] "TARGET_SSE" "#" "&& reload_completed" [(set (match_dup 0) (:VF (match_dup 1) (match_dup 2)))] "" [(set_attr "isa" "noavx,avx")]) doesn't make much sense (before reload). To me, that is. Why do we go that obfuscated way at all? I think a clean solution is to use an UNSPEC here (well, "clean"...).
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #10 from rguenther at suse dot de --- On Wed, 20 May 2020, ubizjak at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 > > Uroš Bizjak changed: > >What|Removed |Added > > CC||law at gcc dot gnu.org, >||rsandifo at gcc dot gnu.org > > --- Comment #9 from Uroš Bizjak --- > (In reply to Uroš Bizjak from comment #7) > > Ooh, yes :( > > > > '(use X)' > > Represents the use of the value of X. It indicates that the value > > in X at this point in the program is needed, even though it may not > > be apparent why this is so. Therefore, the compiler will not > > attempt to delete previous instructions whose only effect is to > > store a value in X. X must be a 'reg' expression. > > > > Partial revert is in works. > > Actually, no. The above applies to single (use ...) RTX, not (use ...) as part > of a parallel. There are plenty of uses of memory_operands in i386.md: > > (define_insn "fix_truncdi_i387" > [(set (match_operand:DI 0 "nonimmediate_operand" "=m") > (fix:DI (match_operand 1 "register_operand" "f"))) >(use (match_operand:HI 2 "memory_operand" "m")) >(use (match_operand:HI 3 "memory_operand" "m")) >(clobber (match_scratch:XF 4 "="))] > > Let's ask experts. The question is what should the (use ...) do? Allow the splitter to use its contents? I guess that's reasonable interpretation though I thought (use ...) apply only to register liveness computation and not to memory. But what do I know about RTL ;)
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 Uroš Bizjak changed: What|Removed |Added CC||law at gcc dot gnu.org, ||rsandifo at gcc dot gnu.org --- Comment #9 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #7) > Ooh, yes :( > > '(use X)' > Represents the use of the value of X. It indicates that the value > in X at this point in the program is needed, even though it may not > be apparent why this is so. Therefore, the compiler will not > attempt to delete previous instructions whose only effect is to > store a value in X. X must be a 'reg' expression. > > Partial revert is in works. Actually, no. The above applies to single (use ...) RTX, not (use ...) as part of a parallel. There are plenty of uses of memory_operands in i386.md: (define_insn "fix_truncdi_i387" [(set (match_operand:DI 0 "nonimmediate_operand" "=m") (fix:DI (match_operand 1 "register_operand" "f"))) (use (match_operand:HI 2 "memory_operand" "m")) (use (match_operand:HI 3 "memory_operand" "m")) (clobber (match_scratch:XF 4 "="))] Let's ask experts.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #8 from Martin Liška --- There's partially reduced test-case: $ cat fma.i double res_test0101[] = { -3,1, 17,51,109, 197, 321, 487, 701, 969, 1297, 1691, 2157, 2701, 3329, 4047, 4861, 5777, 6801, 7939, 9197, 10581, 12097, 13751, 15549, 17497, 19601, 21867, 24301, 26909, 29697, 32671}; double res_test0110[] = {3, -1, -17,-51,-109, -197, -321, -487, -701, -969, -1297, -1691, -2157, -2701, -3329, -4047, -4861, -5777, -6801, -7939, -9197, -10581, -12097, -13751, -15549, -17497, -19601, -21867, -24301, -26909, -29697, -32671}; extern void abort() __attribute__(()) __attribute__(()); static __inline int __get_cpuid(unsigned int __leaf, unsigned int *__eax, unsigned int *__ebx, unsigned int *__ecx, unsigned int *__edx) { __asm__("cpuid\n\t" : "=a"(*__eax), "=b"(*__ebx), "=c"(*__ecx), "=d"(*__edx) : "0"(__leaf)); } static void fma_test(); int main() { unsigned int eax, ebx, ecx, edx; if (!__get_cpuid(1, , , , )) 0; if (ecx & (1 << 12)) fma_test(); return 0; } double m1[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32}; double m2[] = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33}; double m3[] = {3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34}; double m4[32]; int test_fails = 0; void compare_result(double *res) { int i; int good = 1; i = 0; for (i; i < 32; i++) if (m4[i] != res[i]) if (good) good = 0; if (!good) test_fails = 1; } static void fma_test() { double __trans_tmp_3; double __trans_tmp_2; double __trans_tmp_1; int i; for (i = 0; i < 32; i++) m4[i] = 0; i = 0; for (i; i < 32; i++) { double a = m1[i]; double b = m2[i]; double c = m3[i]; __trans_tmp_1 = ((a * b) - c) * a - b; m4[i] = __trans_tmp_1; } compare_result(res_test0101); i = 0; for (i; i < 32; i++) { { double a = m1[i]; double b = m2[i]; double c = m3[i]; __trans_tmp_3 = -((a * b) - c) * a + b; } m4[i] = __trans_tmp_3; } compare_result(res_test0110); i = 0; for (i; i < 32; i++) { double a = m1[i]; double b = m2[i]; double c = m3[i]; __trans_tmp_2 = -((a * b) - c) * a - b; m4[i] = __trans_tmp_2; } if (test_fails) abort(); } $ gcc -O3 -Wno-attributes -mfpmath=sse -mfma fma.i && ./a.out Aborted (core dumped)
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 Uroš Bizjak changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #7 from Uroš Bizjak --- Ooh, yes :( '(use X)' Represents the use of the value of X. It indicates that the value in X at this point in the program is needed, even though it may not be apparent why this is so. Therefore, the compiler will not attempt to delete previous instructions whose only effect is to store a value in X. X must be a 'reg' expression. Partial revert is in works.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #6 from Uroš Bizjak --- I think I found the issue. Before the patch, we had: (insn 375 373 2574 7 (parallel [ (set (reg:V4DF 21 xmm1 [orig:1681 vect__45.441 ] [1681]) (neg:V4DF (mem/c:V4DF (plus:DI (reg/f:DI 7 sp) (const_int 160 [0xa0])) [3 %sfp+-1184 S32 A256]))) (use (reg:V4DF 20 xmm0 [3332])) ]) "fma_1.h":20:10 1487 {*negv4df2} (nil)) after the patch, reload is free to create: (insn 375 3216 2578 7 (parallel [ (set (reg:V4DF 21 xmm1 [orig:1681 vect__45.441 ] [1681]) (neg:V4DF (reg:V4DF 20 xmm0 [3332]))) (use (mem/c:V4DF (plus:DI (reg/f:DI 7 sp) (const_int 160 [0xa0])) [3 %sfp+-1184 S32 A256])) ]) "fma_1.h":20:10 1487 {*negv4df2} (nil)) which postreload pass does not like, and simply deletes it: deleting insn with uid = 375. Just like that. No substitution whatsoever. So, is there some limitation with (use) RTX, so we can't have memory here?
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #5 from Martin Liška --- Sure, doing that.
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 --- Comment #4 from Uroš Bizjak --- (In reply to Martin Liška from comment #3) > Started with r11-455-g94f687bd9ae37ece. It is not obvious from the referred patch what is going wrong here. Unfortunately, I have no FMA capable machine, can someone please isolate one small test that fails?
[Bug target/95218] [11 Regression] FAIL: gcc.target/i386/fma_run_double_1.c execution test since r11-455-g94f687bd9ae37ece
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95218 Martin Liška changed: What|Removed |Added CC||marxin at gcc dot gnu.org, ||rguenth at gcc dot gnu.org, ||uros at gcc dot gnu.org Summary|[11 Regression] FAIL: |[11 Regression] FAIL: |gcc.target/i386/fma_run_dou |gcc.target/i386/fma_run_dou |ble_1.c execution test |ble_1.c execution test ||since ||r11-455-g94f687bd9ae37ece --- Comment #3 from Martin Liška --- Started with r11-455-g94f687bd9ae37ece.