[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

Rainer Orth changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ro at gcc dot gnu.org

--- Comment #21 from Rainer Orth ---
Created attachment 57437
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57437=edit
32-bit i386-pc-solaris2.11 pr108357.c.120t.threadfull1

The test also FAILs on 32 and 64-bit Solaris/SPARC since it was committed.
Changing b from char to unsigned char lets it PASS.

Again, this is weird insofar as char is signed on both Solaris/SPARC and
Solaris/x86, but the test PASSes on x86 already.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #20 from rguenther at suse dot de ---
On Fri, 14 Apr 2023, xry111 at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
>
> --- Comment #18 from Xi Ruoyao ---
> (In reply to Richard Biener from comment #17)
> > Isn't this the same issue as seen in another bug, most targets defining
> > TARGET_PROMOTE_PROTOTYPES to hook_bool_const_tree_true but loongarch not?
> > That will cause those conversions to be missed.
>
> Looks like we should define it, as our psABI says:
>
> In most cases, the unsigned integer data types are zero-extended when
> stored in general-purpose register, and the signed integer data types are
> sign-extended. However, in the LP64D ABI, unsigned 32-bit types, such as
> unsigned int, are stored in general-purpose registers as proper sign
> extensions of their 32-bit values.
>
> IIUC it matches the semantics of TARGET_PROMOTE_PROTOTYPES.

TARGET_PROMOTE_PROTOTYPES is about foo (signed char) or foo (unsigned short),
thus argument types less than int.  With TARGET_PROMOTE_PROTOTYPES defined to
true they will get promoted to integer so you'll see foo ((int)x) when 'x' is
of type signed char or unsigned short for the above cases.
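To make the promotion described in comment #20 concrete, here is a minimal C sketch. It is not GCC internals, and the names takes_schar, call_plain and call_promoted are hypothetical. call_promoted writes out by hand the foo ((int)x) shape mentioned above; at the C language level both calls pass the same value, so the difference only shows up in the IL.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical callee whose parameter is narrower than int. */
static int takes_schar(signed char x) { return x; }

/* Call as written in the source: the argument keeps its narrow type. */
static int call_plain(signed char x) { return takes_schar(x); }

/* Call with the promotion spelled out, mirroring the foo ((int)x) shape.
   The value is converted back to signed char at the callee, so the
   result is unchanged. */
static int call_promoted(signed char x) { return takes_schar((int)x); }

/* Both call shapes must agree for every signed char value. */
static void check_promotion(void)
{
    for (int v = SCHAR_MIN; v <= SCHAR_MAX; v++)
        assert(call_plain((signed char)v) == call_promoted((signed char)v));
}
```

Calling check_promotion() from a test driver exercises all 256 argument values.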
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #19 from chenglulu ---
(In reply to Xi Ruoyao from comment #18)
> (In reply to Richard Biener from comment #17)
> > Isn't this the same issue as seen in another bug, most targets defining
> > TARGET_PROMOTE_PROTOTYPES to hook_bool_const_tree_true but loongarch not?
> > That will cause those conversions to be missed.
>
> Looks like we should define it, as our psABI says:
>
> In most cases, the unsigned integer data types are zero-extended when
> stored in general-purpose register, and the signed integer data types are
> sign-extended. However, in the LP64D ABI, unsigned 32-bit types, such as
> unsigned int, are stored in general-purpose registers as proper sign
> extensions of their 32-bit values.
>
> IIUC it matches the semantics of TARGET_PROMOTE_PROTOTYPES.

I also think this should be considered.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #18 from Xi Ruoyao ---
(In reply to Richard Biener from comment #17)
> Isn't this the same issue as seen in another bug, most targets defining
> TARGET_PROMOTE_PROTOTYPES to hook_bool_const_tree_true but loongarch not?
> That will cause those conversions to be missed.

Looks like we should define it, as our psABI says:

In most cases, the unsigned integer data types are zero-extended when
stored in general-purpose register, and the signed integer data types are
sign-extended. However, in the LP64D ABI, unsigned 32-bit types, such as
unsigned int, are stored in general-purpose registers as proper sign
extensions of their 32-bit values.

IIUC it matches the semantics of TARGET_PROMOTE_PROTOTYPES.
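The register images described by the psABI text quoted in comment #18 can be sketched in portable C. This models only the sentences quoted there; the function names are hypothetical and nothing here is taken from the LoongArch backend.

```c
#include <assert.h>
#include <stdint.h>

/* unsigned int under LP64D, per the quoted text: the 64-bit register holds
   the sign extension of the 32-bit value (bit 31 copied into bits 63..32). */
static uint64_t reg_image_u32(uint32_t v)
{
    uint64_t r = v;
    if (v & UINT32_C(0x80000000))
        r |= UINT64_C(0xFFFFFFFF00000000);
    return r;
}

/* Other unsigned types are zero-extended. */
static uint64_t reg_image_u16(uint16_t v) { return (uint64_t)v; }

/* Signed types are sign-extended. */
static uint64_t reg_image_s8(int8_t v) { return (uint64_t)(int64_t)v; }

static void check_reg_images(void)
{
    assert(reg_image_u32(UINT32_C(0x80000000)) == UINT64_C(0xFFFFFFFF80000000));
    assert(reg_image_u32(1) == 1);
    assert(reg_image_u16(0xFFFF) == UINT64_C(0xFFFF));
    assert(reg_image_s8(-1) == UINT64_C(0xFFFFFFFFFFFFFFFF));
}
```

The 32-bit case is written with an explicit mask rather than a cast through int32_t so that the sketch does not depend on implementation-defined narrowing.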
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357 --- Comment #17 from Richard Biener --- Isn't this the same issue as seen in another bug, most targets defining TARGET_PROMOTE_PROTOTYPES to hook_bool_const_tree_true but loongarch not? That will cause those conversions to be missed.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #16 from chenglulu ---
(In reply to rguent...@suse.de from comment #15)
> On Thu, 13 Apr 2023, xry111 at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
> >
> > --- Comment #14 from Xi Ruoyao ---
> > (In reply to rguent...@suse.de from comment #13)
> > > On Thu, 13 Apr 2023, chenglulu at loongson dot cn wrote:
> > >
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
> > > >
> > > > --- Comment #10 from chenglulu ---
> > > > (In reply to Xi Ruoyao from comment #5)
> > > > > The test fails on loongarch64-linux-gnu.  foo is kept in
> > > > > 114t.threadfull1, but removed in 135t.forwprop3.
> > > > >
> > > > > Does this mean something is wrong for LoongArch, or we should
> > > > > simply check the tree dump in a later pass (for e.g.
> > > > > 254t.optimized)?
> > > >
> > > > If the definition of the macro DEFAULT_SIGNED_CHAR is changed to 0,
> > > > the test case can pass the test. I guess it is because the
> > > > definition of DEFAULT_SIGNED_CHAR affects the optimization of the
> > > > ccp pass, resulting in some blocks that cannot be removed, resulting
> > > > in the failure of this test case.
> > >
> > > Can you check if making b unsigned fixes the test for you?  If so
> > > that's what we should do.
> >
> > It works:
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c b/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> > index 44c457b7a97..79cf371ef28 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> > @@ -1,7 +1,7 @@
> >  /* { dg-do compile } */
> >  /* { dg-options "-O2 -fdump-tree-threadfull1" } */
> >
> > -static char b;
> > +static unsigned char b;
> >  static unsigned c;
> >  void foo();
> >  short(a)(short d, short e) { return d * e; }
> >
> > But I'm still wondering why this is not an issue for x86_64.
>
> Yes, that's interesting to see.  It does change how b is extended
> in b ^ 9854 (but for the value zero it doesn't matter).

I think the problem is here. In adjust_alignment, the intermediate output
for LoongArch and x86 is as follows:

LoongArch:
  ...
  b.2_1 = bD.2176;
  # RANGE [irange] short int [-128, 127]
  _2 = (short intD.12) b.2_1;
  # RANGE [irange] short int [-16384, -1][1, 16383]
  _3 = _2 ^ 9854;
  # RANGE [irange] unsigned short [1, 16383][49152, +INF]
  e.1_6 = (unsigned short) _3;
  _7 = e.1_6 * 5;
  _8 = (short intD.12) _7;
  # .MEM_15 = VDEF <.MEM_4(D)>
  bD.2176 = 0;
  if (_8 != 0)
    goto ; [67.00%]
  else
    goto ; [33.00%]
  ...
  c.4_9 = 0;
  _10 = c.4_9 == 0;
  # RANGE [irange] int [0, 1] NONZERO 0x1
  _11 = (intD.1) _10;
  # RANGE [irange] int [-32768, -1][1, 32767]
  _12 = (intD.1) _8;
  ...

X86:
  ...
  b.2_1 = bD.2738;
  # RANGE [irange] short int [-128, 127]
  _2 = (short intD.17) b.2_1;
  # RANGE [irange] short int [-16384, -1][1, 16383]
  _3 = _2 ^ 9854;
  # RANGE [irange] unsigned short [1, 16383][49152, +INF]
  e.1_7 = (unsigned short) _3;
  _8 = e.1_7 * 5;
  _9 = (short intD.17) _8;
  # RANGE [irange] int [-32768, 32767]
  _4 = (intD.6) _9;
  d_10 = (short intD.17) _4;
  # .MEM_17 = VDEF <.MEM_5(D)>
  bD.2738 = 0;
  if (d_10 != 0)
    goto ; [67.00%]
  else
    goto ; [33.00%]
  ...

On x86 there is an additional intermediate variable after _9 (the promotion
to int in _4, narrowed again into d_10) that LoongArch does not have.
LoongArch uses _8 directly, and _8 is used twice, so

  if (_8 != 0)
    goto ; [67.00%]
  else
    goto ; [33.00%]

is not deleted by the ccp2 pass. That's why the test case fails. I think
that if LoongArch generated an intermediate variable like x86 does, the
test would pass.
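The value ranges in the dumps above can be cross-checked with a small brute-force C sketch. It models only the arithmetic shown in the dump (the XOR with 9854, the multiply by 5, and the narrowing to short) and says nothing about how any GCC pass behaves. The function names are hypothetical, and the final narrowing to short assumes GCC's modulo-2^16 behavior, which ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <limits.h>

/* _2 = (short int) b.2_1;  _3 = _2 ^ 9854; */
static short xor_step(signed char b)
{
    return (short)((short)b ^ 9854);
}

/* e.1_6 = (unsigned short) _3;  _7 = e.1_6 * 5;  _8 = (short int) _7;
   The last narrowing is implementation-defined in ISO C; GCC reduces
   modulo 2^16, which is what this sketch assumes. */
static short mul_step(short x3)
{
    unsigned short e = (unsigned short)x3;
    unsigned short m = (unsigned short)(e * 5u);
    return (short)m;
}

/* Returns 1 when, for every signed char b, _3 lies in the dumped range
   [-16384, -1][1, 16383] and the tested value _8 is nonzero. */
static int condition_always_true(void)
{
    for (int b = SCHAR_MIN; b <= SCHAR_MAX; b++) {
        short x3 = xor_step((signed char)b);
        int in_range = (x3 >= -16384 && x3 <= -1) || (x3 >= 1 && x3 <= 16383);
        if (!in_range || mul_step(x3) == 0)
            return 0;
    }
    return 1;
}

/* Self-check over the whole signed char range. */
static void check_ranges(void) { assert(condition_always_true()); }
```

Since 5 is odd, multiplying by it modulo 2^16 cannot turn a nonzero value into zero, so the tested value is nonzero whenever _3 is.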
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #15 from rguenther at suse dot de ---
On Thu, 13 Apr 2023, xry111 at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
>
> --- Comment #14 from Xi Ruoyao ---
> (In reply to rguent...@suse.de from comment #13)
> > On Thu, 13 Apr 2023, chenglulu at loongson dot cn wrote:
> >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
> > >
> > > --- Comment #10 from chenglulu ---
> > > (In reply to Xi Ruoyao from comment #5)
> > > > The test fails on loongarch64-linux-gnu.  foo is kept in
> > > > 114t.threadfull1, but removed in 135t.forwprop3.
> > > >
> > > > Does this mean something is wrong for LoongArch, or we should simply
> > > > check the tree dump in a later pass (for e.g. 254t.optimized)?
> > >
> > > If the definition of the macro DEFAULT_SIGNED_CHAR is changed to 0, the
> > > test case can pass the test. I guess it is because the definition of
> > > DEFAULT_SIGNED_CHAR affects the optimization of the ccp pass, resulting
> > > in some blocks that cannot be removed, resulting in the failure of this
> > > test case.
> >
> > Can you check if making b unsigned fixes the test for you?  If so
> > that's what we should do.
>
> It works:
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c b/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> index 44c457b7a97..79cf371ef28 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
> @@ -1,7 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fdump-tree-threadfull1" } */
>
> -static char b;
> +static unsigned char b;
>  static unsigned c;
>  void foo();
>  short(a)(short d, short e) { return d * e; }
>
> But I'm still wondering why this is not an issue for x86_64.

Yes, that's interesting to see.  It does change how b is extended
in b ^ 9854 (but for the value zero it doesn't matter).
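The remark in comment #15 about how b is extended in b ^ 9854 can be illustrated with a short C sketch. The two helper names are hypothetical; they stand in for the two declarations of b from the diff (char, which is signed on the targets discussed, and unsigned char).

```c
#include <assert.h>

/* b declared as a signed char type: sign-extended before the XOR. */
static int xor_sign_extended(signed char b) { return (short)b ^ 9854; }

/* b declared as unsigned char: zero-extended before the XOR. */
static int xor_zero_extended(unsigned char b) { return (short)b ^ 9854; }

static void check_extension(void)
{
    /* For the value zero the extension does not matter. */
    assert(xor_sign_extended(0) == 9854);
    assert(xor_zero_extended(0) == 9854);

    /* Same bit pattern 0x80, different extension, different result. */
    assert(xor_sign_extended((signed char)-128) == -9730);
    assert(xor_zero_extended(0x80) == 9982);
}
```

The results differ only when the top bit of b is set, which matches the observation that the value zero is unaffected.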
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #14 from Xi Ruoyao ---
(In reply to rguent...@suse.de from comment #13)
> On Thu, 13 Apr 2023, chenglulu at loongson dot cn wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
> >
> > --- Comment #10 from chenglulu ---
> > (In reply to Xi Ruoyao from comment #5)
> > > The test fails on loongarch64-linux-gnu.  foo is kept in
> > > 114t.threadfull1, but removed in 135t.forwprop3.
> > >
> > > Does this mean something is wrong for LoongArch, or we should simply
> > > check the tree dump in a later pass (for e.g. 254t.optimized)?
> >
> > If the definition of the macro DEFAULT_SIGNED_CHAR is changed to 0, the
> > test case can pass the test. I guess it is because the definition of
> > DEFAULT_SIGNED_CHAR affects the optimization of the ccp pass, resulting
> > in some blocks that cannot be removed, resulting in the failure of this
> > test case.
>
> Can you check if making b unsigned fixes the test for you?  If so
> that's what we should do.

It works:

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c b/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
index 44c457b7a97..79cf371ef28 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr108357.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-threadfull1" } */
 
-static char b;
+static unsigned char b;
 static unsigned c;
 void foo();
 short(a)(short d, short e) { return d * e; }

But I'm still wondering why this is not an issue for x86_64.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #13 from rguenther at suse dot de ---
On Thu, 13 Apr 2023, chenglulu at loongson dot cn wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357
>
> --- Comment #10 from chenglulu ---
> (In reply to Xi Ruoyao from comment #5)
> > The test fails on loongarch64-linux-gnu.  foo is kept in
> > 114t.threadfull1, but removed in 135t.forwprop3.
> >
> > Does this mean something is wrong for LoongArch, or we should simply
> > check the tree dump in a later pass (for e.g. 254t.optimized)?
>
> If the definition of the macro DEFAULT_SIGNED_CHAR is changed to 0, the
> test case can pass the test. I guess it is because the definition of
> DEFAULT_SIGNED_CHAR affects the optimization of the ccp pass, resulting
> in some blocks that cannot be removed, resulting in the failure of this
> test case.

Can you check if making b unsigned fixes the test for you?  If so
that's what we should do.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #12 from chenglulu ---
(In reply to Xi Ruoyao from comment #11)
> (In reply to chenglulu from comment #10)
> > (In reply to Xi Ruoyao from comment #5)
> > > The test fails on loongarch64-linux-gnu.  foo is kept in
> > > 114t.threadfull1, but removed in 135t.forwprop3.
> > >
> > > Does this mean something is wrong for LoongArch, or we should simply
> > > check the tree dump in a later pass (for e.g. 254t.optimized)?
> >
> > If the definition of the macro DEFAULT_SIGNED_CHAR is changed to 0, the
> > test case can pass the test. I guess it is because the definition of
> > DEFAULT_SIGNED_CHAR affects the optimization of the ccp pass, resulting
> > in some blocks that cannot be removed, resulting in the failure of this
> > test case.
>
> Hmm, but we cannot change DEFAULT_SIGNED_CHAR or we'll break ABI and API
> everywhere.  And x86_64-linux-gnu also uses DEFAULT_SIGNED_CHAR=1.

Uh, I didn't notice this, I'll keep looking.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #11 from Xi Ruoyao ---
(In reply to chenglulu from comment #10)
> (In reply to Xi Ruoyao from comment #5)
> > The test fails on loongarch64-linux-gnu.  foo is kept in
> > 114t.threadfull1, but removed in 135t.forwprop3.
> >
> > Does this mean something is wrong for LoongArch, or we should simply
> > check the tree dump in a later pass (for e.g. 254t.optimized)?
>
> If the definition of the macro DEFAULT_SIGNED_CHAR is changed to 0, the
> test case can pass the test. I guess it is because the definition of
> DEFAULT_SIGNED_CHAR affects the optimization of the ccp pass, resulting
> in some blocks that cannot be removed, resulting in the failure of this
> test case.

Hmm, but we cannot change DEFAULT_SIGNED_CHAR or we'll break ABI and API
everywhere.  And x86_64-linux-gnu also uses DEFAULT_SIGNED_CHAR=1.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #10 from chenglulu ---
(In reply to Xi Ruoyao from comment #5)
> The test fails on loongarch64-linux-gnu.  foo is kept in
> 114t.threadfull1, but removed in 135t.forwprop3.
>
> Does this mean something is wrong for LoongArch, or we should simply
> check the tree dump in a later pass (for e.g. 254t.optimized)?

If the definition of the macro DEFAULT_SIGNED_CHAR is changed to 0, the test
case can pass the test. I guess it is because the definition of
DEFAULT_SIGNED_CHAR affects the optimization of the ccp pass, resulting in
some blocks that cannot be removed, resulting in the failure of this test
case.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #9 from Richard Biener ---
(In reply to Xi Ruoyao from comment #7)
> (In reply to Richard Biener from comment #6)
> > (In reply to Xi Ruoyao from comment #5)
> > > The test fails on loongarch64-linux-gnu.  foo is kept in
> > > 114t.threadfull1, but removed in 135t.forwprop3.
> > >
> > > Does this mean something is wrong for LoongArch, or we should simply
> > > check the tree dump in a later pass (for e.g. 254t.optimized)?
> >
> > I guess it depends on LOGICAL_OP_NON_SHORT_CIRCUIT, can you try
> > --param logical-op-non-short-circuit=1 and see if that helps?
>
> Nope, the result is same.

Aha, the issue is missing promotions, already in .original:

-short int g = a (5, (int) ((short int) b ^ 9854));
- f ((int) g);
+short int g = a (5, (short int) b ^ 9854);
+ f (g);

(+ is loongarch, - is x86_64)

That results in different IL into threadfull1.  On loongarch forwprop3
elides the branch, probably with the help of nonzero bits set by CCP.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357 --- Comment #8 from Xi Ruoyao --- Created attachment 54783 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54783=edit threadfull1 dump on LoongArch
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #7 from Xi Ruoyao ---
(In reply to Richard Biener from comment #6)
> (In reply to Xi Ruoyao from comment #5)
> > The test fails on loongarch64-linux-gnu.  foo is kept in
> > 114t.threadfull1, but removed in 135t.forwprop3.
> >
> > Does this mean something is wrong for LoongArch, or we should simply
> > check the tree dump in a later pass (for e.g. 254t.optimized)?
>
> I guess it depends on LOGICAL_OP_NON_SHORT_CIRCUIT, can you try
> --param logical-op-non-short-circuit=1 and see if that helps?

Nope, the result is same.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #6 from Richard Biener ---
(In reply to Xi Ruoyao from comment #5)
> The test fails on loongarch64-linux-gnu.  foo is kept in
> 114t.threadfull1, but removed in 135t.forwprop3.
>
> Does this mean something is wrong for LoongArch, or we should simply
> check the tree dump in a later pass (for e.g. 254t.optimized)?

I guess it depends on LOGICAL_OP_NON_SHORT_CIRCUIT, can you try
--param logical-op-non-short-circuit=1 and see if that helps?
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

Xi Ruoyao changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chenglulu at loongson dot cn,
                   |                            |xry111 at gcc dot gnu.org

--- Comment #5 from Xi Ruoyao ---
The test fails on loongarch64-linux-gnu.  foo is kept in 114t.threadfull1,
but removed in 135t.forwprop3.

Does this mean something is wrong for LoongArch, or we should simply check
the tree dump in a later pass (for e.g. 254t.optimized)?
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek ---
(In reply to Richard Biener from comment #2)
> This was very recently fixed.  I'll add the testcase.

By r13-6834-g41ade3399bd1ec9927be indeed.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #3 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:ce4a00e29c71f2f51d52f407ecd265fa40688586

commit r13-6878-gce4a00e29c71f2f51d52f407ecd265fa40688586
Author: Richard Biener
Date:   Mon Mar 27 14:22:56 2023 +0200

    tree-optimization/108357 - add testcase

    The following adds the testcase for the bug which was recently fixed.

            PR tree-optimization/108357
            * gcc.dg/tree-ssa/pr108357.c: New testcase.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #2 from Richard Biener ---
This was very recently fixed.  I'll add the testcase.
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

--- Comment #1 from Richard Biener ---
The CCP hunk causes a condition to be optimized away which then results in
different jump threading and different VRP.  I didn't analyze further, but
the first difference is good:

@@ -253,31 +256,25 @@
   _9 = (short int) _8;
   _4 = (int) _9;
   b = 0;
-  if (_9 != 0)
-goto ; [67.00%]
-  else
-goto ; [33.00%]
-
-   [local count: 719407025]:
   _14 = (int) _9;
   if (_14 > 1)
-goto ; [50.00%]
-  else
     goto ; [50.00%]
+  else
+goto ; [50.00%]
 
-   [local count: 359703512]:
+   [local count: 359703512]:
 
-   [local count: 719407025]:
-  # iftmp.3_15 = PHI <1(3), 0(4)>
+   [local count: 719407025]:
+  # iftmp.3_15 = PHI <1(2), 0(3)>
   if (_14 != iftmp.3_15)
-goto ; [66.00%]
+goto ; [66.00%]
   else
-goto ; [34.00%]
+goto ; [34.00%]
 
-   [local count: 598933192]:
+   [local count: 598933192]:
   foo ();
 
-   [local count: 1073741825]:
+   [local count: 1073741825]:
   return 0;
[Bug tree-optimization/108357] [13 Regression] Dead Code Elimination Regression at -O2 since r13-4607-g2dc5d6b1e7ec88
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108357

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Dead Code Elimination       |[13 Regression] Dead Code
                   |Regression at -O2 (trunk    |Elimination Regression at
                   |vs. 12.2.0)                 |-O2 since
                   |                            |r13-4607-g2dc5d6b1e7ec88
             Status|UNCONFIRMED                 |NEW
                 CC|                            |marxin at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1
   Target Milestone|---                         |13.0
   Last reconfirmed|                            |2023-01-10