[Bug rtl-optimization/84780] [8 Regression] wrong code aarch64 with -O3 --param=tree-reassoc-width=32

2018-03-27 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Segher Boessenkool  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from Segher Boessenkool  ---
Patch for the original problem went in as r258452:
(I accidentally deleted the changelog from the commit message, so BZ didn't
pick this up).

combine: Fix PR84780 (more LOG_LINKS trouble)

There still are situations where we have stale LOG_LINKS.  This causes
combine to try two-insn combinations I2->I3 where the register set by
I2 is used before I3 as well.  Not good.

This patch fixes it by checking for this situation in can_combine_p
(similar to what we already do for three and four insn combinations).
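
The kind of check described above can be sketched with a toy model.  This is
only an illustration under stated assumptions: struct toy_insn and the toy_*
functions are hypothetical names, not GCC's data structures, and the real test
in can_combine_p works on RTL (the backtrace later in this thread shows it
calling reg_used_between_p).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction: sets at most one register and uses up to two.
   A register number of -1 means "none".  Hypothetical model only,
   not GCC's rtx_insn.  */
struct toy_insn {
  int set_reg;
  int use_reg[2];
};

/* Return true if REG is used by any insn strictly between positions
   FROM and TO of STREAM (loosely modelled on reg_used_between_p).  */
static bool
toy_reg_used_between (const struct toy_insn *stream, size_t from, size_t to,
                      int reg)
{
  assert (from < to);
  for (size_t i = from + 1; i < to; i++)
    if (stream[i].use_reg[0] == reg || stream[i].use_reg[1] == reg)
      return true;
  return false;
}

/* Reject an I2 -> I3 combination when the register I2 sets is also
   used by an insn between them: folding I2 into I3 would remove the
   value that the intermediate insn still needs.  */
static bool
toy_can_combine_two (const struct toy_insn *stream, size_t i2, size_t i3)
{
  int reg = stream[i2].set_reg;
  if (reg < 0)
    return false;
  return !toy_reg_used_between (stream, i2, i3, reg);
}

/* Scenario in the spirit of this PR: one insn sets cc, an intermediate
   insn uses cc, and a much later insn carries a stale link back to the
   setter.  Register numbers are illustrative.  */
static bool
toy_stale_link_is_rejected (void)
{
  enum { CC = 66 };
  const struct toy_insn stream[] = {
    { CC,  { 165, 2 } },    /* I2: sets cc.  */
    { 3,   { CC, -1 } },    /* Intermediate user of cc.  */
    { 200, { 178, 198 } },  /* I3: stale link points back at I2.  */
  };
  return !toy_can_combine_two (stream, 0, 2);
}
```

In this model toy_stale_link_is_rejected () returns true, because the
intermediate insn still reads cc, while a pair of adjacent insns with nothing
in between is still allowed to combine.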




Patch for #c10 went in as r258523.

combine: Don't make log_links for pc_rtx (PR84780 #c10)

distribute_links tries to place a log_link for whatever the destination
of the modified instruction is.  It shouldn't do that when that dest
is pc_rtx, which isn't actually a register.


* combine.c (distribute_links): Don't make a link based on pc_rtx.
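
As a rough illustration of the idea (not the actual change to
distribute_links), the guard amounts to refusing to key a link on a
destination that is not a register.  The toy_dest type and
toy_dest_wants_link below are hypothetical names invented for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn destination: either a real
   register or the program counter (what a jump "sets").  */
enum toy_dest_kind { TOY_DEST_REG, TOY_DEST_PC };

struct toy_dest {
  enum toy_dest_kind kind;
  int regno;  /* Meaningful only for TOY_DEST_REG.  */
};

/* Decide whether a log link keyed on DEST makes sense.  The program
   counter is not a register, so there is no register number to hang
   a link on and later register-based queries would be meaningless.  */
static bool
toy_dest_wants_link (const struct toy_dest *dest)
{
  return dest != NULL && dest->kind == TOY_DEST_REG;
}
```

With this model a destination such as register 178 gets a link, while a
destination of kind TOY_DEST_PC (or no destination at all) does not.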


Closing as fixed.

2018-03-27 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

2018-03-13 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #11 from Segher Boessenkool  ---
That is a separate issue, not caused by the previous patch.

I have a patch for this, too.

2018-03-13 Thread zsojka at seznam dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #10 from Zdenek Sojka  ---
(In reply to Segher Boessenkool from comment #8)
> Created attachment 43631 [details]
> proposed patch
> 
> I cannot reproduce that exact generated code; maybe it needs tuning for some
> particular CPU?
> 
> Could you try the attached patch?  Thanks!

This is causing a segfault:

$ cat testcase.c 
int a, d;
long b;
__int128 c;
void foo(void)
{
  a &= 0 < b;
  do {
    c = b >>= 63;
    c -= __builtin_add_overflow_p(0, b, d);
  } while (a >= 255);
}

$ aarch64-unknown-linux-gnu-gcc -Og testcase.c
during RTL pass: combine
testcase.c: In function 'foo':
testcase.c:11:1: internal compiler error: Segmentation fault
 }
 ^
0xc927bf crash_signal
/repo/gcc-trunk/gcc/toplev.c:325
0xc11391 reg_used_between_p(rtx_def const*, rtx_insn const*, rtx_insn const*)
/repo/gcc-trunk/gcc/rtlanal.c:1128
0x13c1486 can_combine_p
/repo/gcc-trunk/gcc/combine.c:1993
...

2018-03-12 Thread zsojka at seznam dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #9 from Zdenek Sojka  ---
(In reply to Segher Boessenkool from comment #8)
> Created attachment 43631 [details]
> proposed patch
> 
> I cannot reproduce that exact generated code; maybe it needs tuning for some
> particular CPU?
> 
> Could you try the attached patch?  Thanks!

I can confirm that r258444 FAILs, but r258449+patch PASSes the testcases (both
original and reduced).

2018-03-12 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Segher Boessenkool  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |segher at gcc dot gnu.org

--- Comment #8 from Segher Boessenkool  ---
Created attachment 43631
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43631&action=edit
proposed patch

I cannot reproduce that exact generated code; maybe it needs tuning for some
particular CPU?

Could you try the attached patch?  Thanks!

2018-03-11 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #7 from Segher Boessenkool  ---
I have a patch.

2018-03-10 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #6 from Segher Boessenkool  ---
And the actual problem happens earlier: the earlier 63, 70 -> 71 combination
links the much later insn 100 to 70, for cc, but there are plenty of other
setters and users of cc earlier.

2018-03-10 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #5 from Segher Boessenkool  ---
Insn 55 is a parallel, and that is split into two insns i1 and i2, both
numbered as 55.  The i1 will never become part of the insn stream.  It is
this insn that is deleted.

Later on insn 55 is combined into insn 100:

   55: cc:CC_C=zero_extend(r165:DI)+zero_extend(x2:DI)!=zero_extend(r165:DI+x2:DI)
  100: {cc:CC_C=zero_extend(r178:DI)+zero_extend(r198:DI)!=zero_extend(r178:DI+r198:DI);r200:DI=r178:DI+r198:DI;}
      REG_DEAD r198:DI
      REG_DEAD r178:DI

becomes

  100: {cc:CC_C=zero_extend(r178:DI)+zero_extend(r198:DI)!=zero_extend(r178:DI+r198:DI);r200:DI=r178:DI+r198:DI;}
      REG_DEAD r178:DI
      REG_DEAD r198:DI

and that seems fine, too?  Or does something in between use cc?  Ah yes,
insn 71 does.  Somehow insn 100 has a LOG_LINK to 55 though (for cc).  This
happens at the 55 -> 70 combination.

2018-03-09 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #4 from ktkachov at gcc dot gnu.org ---
A carry-setting instruction gets deleted. The disassembly of the non-failing
code has this:

        cmp     x13, 0
        asr     w0, w0, w4
        cset    w4, ne
        sxtw    x0, w0
        adrp    x12, d
        sub     w4, w7, w4
        sxtw    x9, w7
        add     w4, w4, w6
        asr     x6, x0, 63
        cmn     x4, x2          // sets the carry flag
        asr     x1, x9, 63
        cinc    x3, x3, cs      // uses the carry flag

The bad disassembly has this:

        cmp     x4, 0
        adrp    x13, d
        cset    w4, ne
        asr     x7, x0, 63
        sxtw    x6, w8
        cinc    x1, x3, cs      // use of carry flag, but the setter was eliminated

Combine ends up eliminating the carry-setting instruction:
(insn 55 52 56 2 (set (reg:CC_C 66 cc)
        (ne:CC_C (plus:TI (zero_extend:TI (reg:DI 165))
                (zero_extend:TI (reg:DI 2 x2 [ h ])))
            (zero_extend:TI (plus:DI (reg:DI 165)
                    (reg:DI 2 x2 [ h ]))))) "bad.c":23 104 {*adddi3_compareC_cconly}
     (nil))
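
For readers not used to RTL, the comparison in that insn can be written out in
C.  This is only a sketch of what the pattern computes, using the GCC
__int128 extension; the function names add64_carries and inc_if_carry are made
up for illustration and do not appear in the testcase.

```c
#include <stdbool.h>
#include <stdint.h>

/* Carry out of a 64-bit addition, written the way the CC_C pattern
   expresses it: the 128-bit sum of the zero-extended operands differs
   from the zero-extended (wrapped) 64-bit sum exactly when the
   addition carries.  */
static bool
add64_carries (uint64_t a, uint64_t b)
{
  unsigned __int128 wide = (unsigned __int128) a + (unsigned __int128) b;
  unsigned __int128 wrapped = (unsigned __int128) (uint64_t) (a + b);
  return wide != wrapped;
}

/* Conditional increment on carry, roughly what "cinc x3, x3, cs" does
   with the flag set by the preceding "cmn".  */
static uint64_t
inc_if_carry (uint64_t x, uint64_t a, uint64_t b)
{
  return add64_carries (a, b) ? x + 1 : x;
}
```

If the flag-setting instruction is removed, the cinc still executes but reads
whatever an earlier instruction left in the carry flag, which is why the
result is wrong code rather than a compile-time failure.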

From what I can see in the combine logs:

Trying 55 -> 70:
   55: {cc:CC_C=zero_extend(r165:DI)+zero_extend(x2:DI)!=zero_extend(r165:DI+x2:DI);r167:DI=r165:DI+x2:DI;}
      REG_DEAD x2:DI
      REG_DEAD r165:DI
   70: r178:DI=r167:DI
      REG_DEAD r167:DI

After a few failed PARALLEL formations it succeeds twice with:
Successfully matched this instruction:
(set (reg:CC_C 66 cc)
    (ne:CC_C (plus:TI (zero_extend:TI (reg:DI 165))
            (zero_extend:TI (reg:DI 2 x2 [ h ])))
        (zero_extend:TI (plus:DI (reg:DI 165)
                (reg:DI 2 x2 [ h ])))))
Successfully matched this instruction:
(set (reg:DI 178)
    (plus:DI (reg:DI 165)
        (reg:DI 2 x2 [ h ])))
allowing combination of insns 55 and 70
original costs 0 + 0 = 0
replacement costs 24 + 4 = 28
deferring deletion of insn with uid = 55.
modifying insn i2    55: cc:CC_C=zero_extend(r165:DI)+zero_extend(x2:DI)!=zero_extend(r165:DI+x2:DI)
deferring rescan insn with uid = 55.
modifying insn i3    70: r178:DI=r165:DI+x2:DI
      REG_DEAD r165:DI
      REG_DEAD x2:DI
deferring rescan insn with uid = 70.

so it seems like it deletes insn 55 but then also modifies it?

2018-03-09 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||segher at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Started with r257644, but I haven't analyzed whether things go wrong during
combine or whether that revision just made a latent bug reproducible.

2018-03-09 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

--- Comment #2 from ktkachov at gcc dot gnu.org ---
Fails for me with -O2 --param=tree-reassoc-width=4.

With -fno-if-conversion it doesn't fail, but I don't see what the
if-conversion passes do wrong, if anything.

2018-03-09 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

ktkachov at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-03-09
 CC||ktkachov at gcc dot gnu.org
  Known to work||7.3.1
 Ever confirmed|0   |1

--- Comment #1 from ktkachov at gcc dot gnu.org ---
Confirmed

2018-03-09 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84780

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |8.0