[Bug rtl-optimization/114902] [14/15 Regression] wrong code at -O3 with "-fno-tree-vrp -fno-expensive-optimizations -fno-tree-dominator-opts" on x86_64-linux-gnu

2024-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #13 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:0b93a0ae153ef70a82ff63e67926a01fdab9956b

commit r15-520-g0b93a0ae153ef70a82ff63e67926a01fdab9956b
Author: Jakub Jelinek 
Date:   Wed May 15 18:37:17 2024 +0200

combine: Fix up simplify_compare_const [PR115092]

The following testcases are miscompiled (with tons of GIMPLE
optimization disabled) because combine sees GE comparison of
1-bit sign_extract (i.e. something with [-1, 0] value range)
with (const_int -1) (which is always true) and optimizes it into
NE comparison of 1-bit zero_extract ([0, 1] value range) against
(const_int 0).
The reason is that simplify_compare_const first (correctly)
simplifies the comparison to
GE (ashift:SI something (const_int 31)) (const_int -2147483648)
and then an optimization for when the second operand is power of 2
triggers.  That optimization is fine for power of 2s which aren't
the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
against the signed minimum of the mode, but for GE or LT optimizing
it into NE (or EQ) against const0_rtx is wrong, those cases
are always true or always false (but the function doesn't have
a standardized way to tell callers the comparison is now unconditional).

The following patch just disables the optimization in that case.

2024-05-15  Jakub Jelinek  

PR rtl-optimization/114902
PR rtl-optimization/115092
* combine.cc (simplify_compare_const): Don't optimize
GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
EQ op0 const0_rtx.

* gcc.dg/pr114902.c: New test.
* gcc.dg/pr115092.c: New test.
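
The range argument in the commit message can be checked with a small C model. This is only a sketch, not the code in combine.cc: it assumes op0 is the shifted value, which can only be 0 or the signed minimum, and all function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* op0 models (ashift:SI something (const_int 31)): only the sign bit can
   be set, so op0 is either 0 or INT32_MIN.  */
static const int32_t possible_op0[2] = { 0, INT32_MIN };

/* Signed and unsigned comparisons against the signed minimum.  */
bool ge_signed_min(int32_t op0)  { return op0 >= INT32_MIN; }
bool lt_signed_min(int32_t op0)  { return op0 < INT32_MIN; }
bool geu_signed_min(int32_t op0) { return (uint32_t) op0 >= 0x80000000u; }
bool ltu_signed_min(int32_t op0) { return (uint32_t) op0 < 0x80000000u; }

/* The forms the power-of-2 rewrite produces.  */
bool ne_zero(int32_t op0) { return op0 != 0; }
bool eq_zero(int32_t op0) { return op0 == 0; }

/* Returns true if f and g agree on every value op0 can take.  */
bool same_on_all(bool (*f)(int32_t), bool (*g)(int32_t))
{
  for (int i = 0; i < 2; i++)
    if (f(possible_op0[i]) != g(possible_op0[i]))
      return false;
  return true;
}
```

Under this model, GEU agrees with NE and LTU agrees with EQ, while GE (always true) and LT (always false) disagree with NE and EQ when op0 is 0.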

2024-05-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #12 from Andrew Pinski  ---
*** Bug 115092 has been marked as a duplicate of this bug. ***

2024-05-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #11 from Segher Boessenkool  ---
So, is there a simplified testcase that *actually* shows any *actual* problem?

2024-05-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #10 from Segher Boessenkool  ---
(_extract, btw.)

2024-05-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #9 from Segher Boessenkool  ---
(In reply to Andrew Pinski from comment #2)
> We go from CCGC with a sign_extend to a zero_extend with CCZ. that can't be
> right.

Why not?  We prefer zero_extend whenever it has the same result.

2024-05-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

Richard Biener  changed:

              What|Removed |Added
----------------------------------
  Target Milestone|14.0    |14.2

--- Comment #8 from Richard Biener  ---
GCC 14.1 is being released, retargeting bugs to GCC 14.2.

2024-05-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #7 from Andrew Pinski  ---
(In reply to Segher Boessenkool from comment #6)
> (In reply to Andrew Pinski from comment #2)
> > Looks like the issue is during combine.
> > 
> > We go from CCGC with a sign_extend to a zero_extend with CCZ. that can't be
> > right.
> 
> Why is that not correct?  zero_extend is preferred over sign_extend, and both
> are equivalent when only checking for zero.

For equality they are equivalent, yes. But for a signed comparison such as
`a >=s 0`, a sign extend/extract gives different results from a zero
extend/extract.
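
A minimal C sketch of that difference, assuming a 1-bit field at bit 0 (the helper names are hypothetical and only model the RTL operations):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models a 1-bit zero_extract of bit 0: value range [0, 1].  */
int32_t zero_extract_bit0(int32_t x) { return x & 1; }

/* Models a 1-bit sign_extract of bit 0: value range [-1, 0].  */
int32_t sign_extract_bit0(int32_t x) { return -(x & 1); }

/* Equality against zero gives the same answer for both extractions.  */
bool same_for_eq_zero(int32_t x)
{
  return (sign_extract_bit0(x) == 0) == (zero_extract_bit0(x) == 0);
}

/* A signed >= 0 test can give different answers.  */
bool same_for_ge_zero(int32_t x)
{
  return (sign_extract_bit0(x) >= 0) == (zero_extract_bit0(x) >= 0);
}
```

With bit 0 set, the sign-extracted value is -1 and fails `>= 0`, while the zero-extracted value is 1 and passes it.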

> Is there something wrong in target code here, perhaps?

For arm, x86 and mips?

For the testcase in comment #4 on x86_64, before combine we start with:
```
(insn 16 15 17 2 (parallel [
(set (reg:SI 106 [ t_4 ])
(and:SI (reg:SI 105 [ tt1_3 ])
(const_int 1 [0x1])))
(clobber (reg:CC 17 flags))
]) "/app/example.cpp":6:9 617 {*andsi_1}
 (expr_list:REG_DEAD (reg:SI 105 [ tt1_3 ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil))))
(insn 17 16 20 2 (parallel [
(set (reg:SI 107 [ e_5 ])
(neg:SI (reg:SI 106 [ t_4 ])))
(clobber (reg:CC 17 flags))
]) "/app/example.cpp":7:9 804 {*negsi_1}
 (expr_list:REG_DEAD (reg:SI 106 [ t_4 ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil))))
(insn 20 17 21 2 (set (reg:CCGC 17 flags)
(compare:CCGC (reg:SI 107 [ e_5 ])
(const_int -1 [0xffffffffffffffff]))) "/app/example.cpp":8:16 11 {*cmpsi_1}
 (expr_list:REG_DEAD (reg:SI 107 [ e_5 ])
(nil)))
(insn 21 20 22 2 (set (reg:QI 109)
(ge:QI (reg:CCGC 17 flags)
(const_int 0 [0]))) "/app/example.cpp":8:16 1125 {*setcc_qi}
 (expr_list:REG_DEAD (reg:CCGC 17 flags)
(nil)))
(insn 22 21 23 2 (set (reg:SI 108 [ _1 ])
(zero_extend:SI (reg:QI 109))) "/app/example.cpp":8:16 169 {*zero_extendqisi2}
 (expr_list:REG_DEAD (reg:QI 109)
(nil)))
(insn 23 22 24 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 108 [ _1 ])
(const_int 0 [0]))) "/app/example.cpp":9:8 7 {*cmpsi_ccno_1}
 (expr_list:REG_DEAD (reg:SI 108 [ _1 ])
(nil)))
(jump_insn 24 23 30 2 (set (pc)
(if_then_else (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(label_ref 30)
(pc))) "/app/example.cpp":9:8 1130 {*jcc}
 (expr_list:REG_DEAD (reg:CCZ 17 flags)
(int_list:REG_BR_PROB 7 (nil)))
 -> 30)
```

We first combine 16->17 into:
```
(parallel [
(set (reg:SI 107 [ e_5 ])
(sign_extract:SI (reg:SI 105 [ tt1_3 ])
(const_int 1 [0x1])
(const_int 0 [0])))
(clobber (reg:CC 17 flags))
])
```
which is correct and good
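
As a quick sanity check of that first combination, here is a C sketch (helper names are illustrative only) comparing "and with 1, then negate" against a direct model of a 1-bit sign extract:

```c
#include <assert.h>
#include <stdint.h>

/* Models insns 16 and 17 before combine: and with 1, then negate.  */
int32_t and_then_neg(int32_t x) { return -(x & 1); }

/* Models a 1-bit sign_extract at position 0: a set bit gives -1,
   a clear bit gives 0.  */
int32_t sign_extract_1bit(int32_t x) { return (x & 1) ? -1 : 0; }
```

Both functions return the same value for every input, since `x & 1` is either 0 or 1.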

And then when combining 17 -> 20 combine does:
Trying 17 -> 20:
   17: {r107:SI=sign_extract(r105:SI,0x1,0);clobber flags:CC;}
  REG_DEAD r105:SI
  REG_UNUSED flags:CC
   20: flags:CCGC=cmp(r107:SI,0xffffffffffffffff)
  REG_DEAD r107:SI
Successfully matched this instruction:
(set (reg:CCZ 17 flags)
(compare:CCZ (zero_extract:SI (reg:SI 105 [ tt1_3 ])
(const_int 1 [0x1])
(const_int 0 [0]))
(const_int 0 [0])))
Successfully matched this instruction:
(set (reg:QI 109)
(ne:QI (reg:CCZ 17 flags)
(const_int 0 [0])))

This also replaces insn 21 incorrectly.
We go from `-(a&1) >= -1` (which is always true) to `(a&1) != 0`.
Maybe we first go to `(a&1) <= 1` (still always true) and then somehow mess
that up into `(a&1) != 0`.

2024-05-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #6 from Segher Boessenkool  ---
(In reply to Andrew Pinski from comment #2)
> Looks like the issue is during combine.
> 
> We go from CCGC with a sign_extend to a zero_extend with CCZ. that can't be
> right.

Why is that not correct?  zero_extend is preferred over sign_extend, and both
are equivalent when only checking for zero.

Is there something wrong in target code here, perhaps?

2024-04-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #5 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #4)
> here is a reduced testcase:

> Note ` -O1 -fno-tree-fre -fno-tree-forwprop -fno-tree-ccp 
> -fno-tree-dominator-opts`


This testcase is broken in GCC 13 for mips64-linux-gnu with the added option
-march=octeon.
And it has been broken since at least 4.9.4.
andi    $4,$4,0x1
xori    $4,$4,0x1
teq     $4,0
j       $31
move    $2,$0

That is:
$4 = $4 & 0x1
$4 = $4 ^ 1
trapif $4 == 0

That is the earliest compiler version I could test where I know that
sign_extract shows up in RTL.
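
The MIPS sequence above can be modelled in C and compared with what the source in comment #4 asks for. This is a sketch only: the function names are made up, and the register is modelled as a plain `int`.

```c
#include <assert.h>
#include <stdbool.h>

/* What the source asks for: e is in [-1, 0], so e >= -1 always holds
   and f never reaches __builtin_trap.  */
bool source_traps(int b)
{
  int t = 1 & ~b;
  int e = -t;
  return !(e >= -1);
}

/* What the emitted code does: andi, xori, then teq against zero.  */
bool emitted_code_traps(int b)
{
  int r4 = b & 0x1;   /* andi $4,$4,0x1 */
  r4 = r4 ^ 0x1;      /* xori $4,$4,0x1 */
  return r4 == 0;     /* teq  $4,0: trap if $4 == 0 */
}
```

For the inputs -1, 0 and 1 used by `main` in the testcase, the source never traps, while the modelled emitted code traps for the odd values -1 and 1.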

2024-04-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #4 from Andrew Pinski  ---
here is a reduced testcase:
```

[[gnu::noipa]]
int f(int b)
{
  int tt1 = ~b;
  int t = 1 & tt1;
  int e = -t;
  int tt = e >= -1;
  if (tt) return 0;
  __builtin_trap();
}

int main()
{
  for(int i = -1;i < 2; i++)
    f(i);
}
```

Note ` -O1 -fno-tree-fre -fno-tree-forwprop -fno-tree-ccp
-fno-tree-dominator-opts` is needed to reproduce it with this one. The
generated GIMPLE is the same between GCC 13 and 14 here.

But the first difference is in combine:
```
Trying 7 -> 8:
7: {r106:SI=r105:SI&0x1;clobber flags:CC;}
  REG_DEAD r105:SI
  REG_UNUSED flags:CC
8: {r107:SI=-r106:SI;clobber flags:CC;}
  REG_DEAD r106:SI
  REG_UNUSED flags:CC
Successfully matched this instruction:
(parallel [
(set (reg:SI 107 [ e_5 ])
(sign_extract:SI (reg:SI 105 [ tt1_3 ])
(const_int 1 [0x1])
(const_int 0 [0])))
(clobber (reg:CC 17 flags))
])
allowing combination of insns 7 and 8
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 7.
modifying insn i3     8: {r107:SI=sign_extract(r105:SI,0x1,0);clobber flags:CC;}
  REG_DEAD r105:SI

```

This is correct, but it goes downhill afterwards, as I mentioned in comment #2.

So it does look like a latent bug after all.


If someone bisects this testcase, I am 99% sure they will find that
r14-4810-ge28869670c9879 is where the failure was introduced. For the original
testcase and the one in comment #1, a bisect might find a different commit
because the GIMPLE level is different.

2024-04-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

--- Comment #3 from Andrew Pinski  ---
Note this is almost definitely a latent bug exposed by some change. It might be
interesting to see which change exposed it, but that is not very important.

2024-04-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114902

Andrew Pinski  changed:

       What|Removed |Added
---------------------------------------
  Component|target  |rtl-optimization

--- Comment #2 from Andrew Pinski  ---
Looks like the issue is during combine.

After combine we have:
```
   12: r113:SI=[`b']
   13: r112:SI=~r113:SI
  REG_DEAD r113:SI
  REG_EQUAL ~[`b']
   14: NOTE_INSN_DELETED
   15: {r109:SI=sign_extract(r112:SI,0x1,0);clobber flags:CC;}
  REG_UNUSED flags:CC
   18: NOTE_INSN_DELETED
   19: NOTE_INSN_DELETED
   22: r117:SI=0x1
   21: flags:CCZ=cmp(zero_extract(r112:SI,0x1,0),0)
  REG_DEAD r112:SI
   23: r106:SI={(flags:CCZ==0)?r109:SI:r117:SI}
  REG_DEAD r117:SI
  REG_DEAD r109:SI
  REG_DEAD flags:CCZ
  REG_EQUAL {(flags:CCZ==0)?r109:SI:0x1}
```

insn 21 is wrong.



```

Trying 15 -> 18:
   15: {r109:SI=sign_extract(r112:SI,0x1,0);clobber flags:CC;}
  REG_DEAD r112:SI
  REG_UNUSED flags:CC
   18: flags:CCGC=cmp(r109:SI,0xffffffffffffffff)
Failed to match this instruction:
(parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (zero_extract:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1])
(const_int 0 [0]))
(const_int 0 [0])))
(set (reg/v:SI 109 [ eD.2798 ])
(sign_extract:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1])
(const_int 0 [0])))
])
Failed to match this instruction:
(parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (zero_extract:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1])
(const_int 0 [0]))
(const_int 0 [0])))
(set (reg/v:SI 109 [ eD.2798 ])
(sign_extract:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1])
(const_int 0 [0])))
])
Failed to match this instruction:
(parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (and:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1]))
(const_int 0 [0])))
(set (reg/v:SI 109 [ eD.2798 ])
(sign_extract:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1])
(const_int 0 [0])))
])
Failed to match this instruction:
(parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (and:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1]))
(const_int 0 [0])))
(set (reg/v:SI 109 [ eD.2798 ])
(sign_extract:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1])
(const_int 0 [0])))
])
Successfully matched this instruction:
(set (reg/v:SI 109 [ eD.2798 ])
(sign_extract:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1])
(const_int 0 [0])))
Successfully matched this instruction:
(set (reg:CCZ 17 flags)
(compare:CCZ (zero_extract:SI (reg:SI 112 [ _2 ])
(const_int 1 [0x1])
(const_int 0 [0]))
(const_int 0 [0])))
Successfully matched this instruction:
(set (reg:QI 115 [ _10 ])
(ne:QI (reg:CCZ 17 flags)
(const_int 0 [0])))
allowing combination of insns 15 and 18
original costs 4 + 4 = 12
replacement costs 4 + 4 = 12
modifying other_insn    19: r115:QI=flags:CCZ!=0
  REG_DEAD flags:CCGC
deferring rescan insn with uid = 19.
modifying insn i2    15: {r109:SI=sign_extract(r112:SI,0x1,0);clobber flags:CC;}
  REG_UNUSED flags:CC
deferring rescan insn with uid = 15.
modifying insn i3    18: flags:CCZ=cmp(zero_extract(r112:SI,0x1,0),0)
  REG_DEAD r112:SI
deferring rescan insn with uid = 18.
```

We go from CCGC with a sign_extend to a zero_extend with CCZ. that can't be
right.