[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-11-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |jakub at gcc dot gnu.org
         Resolution|---                         |FIXED

--- Comment #20 from Jakub Jelinek  ---
Assuming fixed then, please reopen if not.

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-16 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #19 from Bernd Edlinger  ---
Hope all is now working again.

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-16 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #18 from Bernd Edlinger  ---
Author: edlinger
Date: Fri Aug 16 16:37:04 2019
New Revision: 274578

URL: https://gcc.gnu.org/viewcvs?rev=274578&root=gcc&view=rev
Log:
2019-08-16  Bernd Edlinger  

Backport from mainline
2019-08-16  Bernd Edlinger  

PR tree-optimization/91109
* lra-int.h (lra_need_for_scratch_reg_p): Declare.
* lra.c (lra): Use lra_need_for_scratch_reg_p.
* lra-spills.c (lra_need_for_scratch_reg_p): New function.

Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/lra-int.h
branches/gcc-9-branch/gcc/lra-spills.c
branches/gcc-9-branch/gcc/lra.c

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-16 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #17 from Bernd Edlinger  ---
Author: edlinger
Date: Fri Aug 16 16:31:13 2019
New Revision: 274577

URL: https://gcc.gnu.org/viewcvs?rev=274577&root=gcc&view=rev
Log:
2019-08-16  Bernd Edlinger  

Backport from mainline
2019-08-07  Bernd Edlinger  

PR tree-optimization/91109
* lra-remat.c (update_scratch_ops): Remove assignment of the
hard register.

Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/lra-remat.c

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-16 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #16 from Bernd Edlinger  ---
Author: edlinger
Date: Fri Aug 16 15:34:47 2019
New Revision: 274573

URL: https://gcc.gnu.org/viewcvs?rev=274573&root=gcc&view=rev
Log:
2019-08-16  Bernd Edlinger  

PR tree-optimization/91109
* lra-int.h (lra_need_for_scratch_reg_p): Declare.
* lra.c (lra): Use lra_need_for_scratch_reg_p.
* lra-spills.c (lra_need_for_scratch_reg_p): New function.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/lra-int.h
trunk/gcc/lra-spills.c
trunk/gcc/lra.c

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-16 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #15 from Christophe Lyon  ---
Since r274532 (gcc-9-branch), I am seeing:
FAIL: gcc.c-torture/execute/20040709-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  execution test

target arm-none-linux-gnueabi
--with-mode arm
--with-cpu cortex-a9

The same test passes on arm-none-linux-gnueabihf, or using --with-mode thumb

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-13 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #14 from Bernd Edlinger  ---
I can reproduce with trunk:
arm-linux-gnueabihf-gcc -S -O2 -mthumb -flto -fno-use-linker-plugin
20040709-1.c

but not with -O3 -g, nor with gcc-9 with my fix applied.

Nevertheless it is quite obvious that the second patch is needed to handle
the case when rematerialized instructions have scratches but nothing needs
to be spilled, so the loop needs to continue with lra_assign instead of
lra_spill.
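
To illustrate the control flow described above, here is a minimal sketch in C.
It is a toy model, not the GCC code: struct toy_pseudo, enum toy_step and all
toy_* functions are hypothetical names invented for this example, and the only
thing taken from the thread is the idea that a scratch pseudo without a hard
register should trigger another assignment round rather than a spill.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not GCC's data structures.  */
struct toy_pseudo {
  bool from_scratch; /* created for a (clobber (scratch)) operand */
  int hard_regno;    /* assigned hard register, or -1 if none yet */
};

enum toy_step { TOY_DONE, TOY_ASSIGN, TOY_SPILL };

/* True if an ordinary pseudo is still without a hard register and
   would therefore have to go to a stack slot.  */
static bool
toy_need_for_spills_p (const struct toy_pseudo *p, int n)
{
  for (int i = 0; i < n; i++)
    if (!p[i].from_scratch && p[i].hard_regno < 0)
      return true;
  return false;
}

/* True if a pseudo that stands in for a scratch operand is still
   without a hard register.  */
static bool
toy_need_for_scratch_reg_p (const struct toy_pseudo *p, int n)
{
  for (int i = 0; i < n; i++)
    if (p[i].from_scratch && p[i].hard_regno < 0)
      return true;
  return false;
}

/* Decide what the allocation loop should do next.  */
enum toy_step
toy_next_step (const struct toy_pseudo *p, int n)
{
  if (toy_need_for_spills_p (p, n))
    return TOY_SPILL;
  if (toy_need_for_scratch_reg_p (p, n))
    return TOY_ASSIGN;  /* another assignment round, nothing to spill */
  return TOY_DONE;
}
```

With one assigned ordinary pseudo and one unassigned scratch pseudo,
toy_next_step returns TOY_ASSIGN. Without the second check the loop would stop
and leave the scratch operand without a register.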

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-12 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #13 from Bernd Edlinger  ---
Created attachment 46704
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46704&action=edit
another untested patch

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-12 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #12 from Christophe Lyon  ---
Indeed, although r274163 fixes the problem I reported, it also introduces a
regression when compiling the very same testcase but adding -mthumb:

FAIL: gcc.c-torture/execute/20040709-1.c   -O2  (internal compiler error)
FAIL: gcc.c-torture/execute/20040709-1.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error)
FAIL: gcc.c-torture/execute/20040709-3.c   -O2  (internal compiler error)
FAIL: gcc.c-torture/execute/20040709-3.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error)
FAIL: gcc.c-torture/execute/20040709-3.c   -O3 -g  (internal compiler error)

My gcc.log says:
/gcc/testsuite/gcc.c-torture/execute/20040709-1.c: In function 'retmeD':
/gcc/testsuite/gcc.c-torture/execute/20040709-1.c:19:10: note: parameter
passing for argument of type 'struct D' changed in GCC 9.1
/gcc/testsuite/gcc.c-torture/execute/20040709-1.c:95:64: note: in expansion of
macro 'T'
/gcc/testsuite/gcc.c-torture/execute/20040709-1.c: In function 'testI':
/gcc/testsuite/gcc.c-torture/execute/20040709-1.c:100:75: error: insn does not
satisfy its constraints:
/gcc/testsuite/gcc.c-torture/execute/20040709-1.c:55:10: note: in definition of
macro 'T'
(insn 311 122 309 8 (parallel [
(set (reg:SI 3 r3 [266])
(truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (reg:SI 10
r10 [265]))
(zero_extend:DI (reg:SI 8 r8 [267])))
(const_int 32 [0x20]))))
(clobber (scratch:SI))
]) "/gcc/testsuite/gcc.c-torture/execute/20040709-1.c":100:73 70
{*umulsi3_highpart_v6}
 (nil))
during RTL pass: reload
/gcc/testsuite/gcc.c-torture/execute/20040709-1.c:100:75: internal compiler
error: in extract_constrain_insn, at recog.c:2211
/gcc/testsuite/gcc.c-torture/execute/20040709-1.c:55:10: note: in definition of
macro 'T'
0x5a7d5d _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/gcc/rtl-error.c:108
0x5a7d83 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/gcc/rtl-error.c:119
0xb7b85d extract_constrain_insn(rtx_insn*)
/gcc/recog.c:2211
0xa629b7 check_rtl
/gcc/lra.c:2184
0xa67def lra(_IO_FILE*)
/gcc/lra.c:2622
0xa19f49 do_reload
/gcc/ira.c:5522
0xa19f49 execute
/gcc/ira.c:5706

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-12 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #11 from Bernd Edlinger  ---
No, it needs to be back-ported to gcc-9.3 (I am still reg-testing),
and Vladimir Makarov wrote the following:
https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00463.html

> Still I think more work on the PR is needed.  If subsequent LRA sub-pass
> spills some pseudo to assign a hard register to the scratch of the
> rematerialized insn as it was in the original insn, it might make this
> rematerialization unprofitable.  So I'll think how to avoid the
> unprofitable rematerialization in such cases and would like to work on
> this  PR more.
> 
> Please, do not close the PR after committing the patch.  I am going to
> work on it more when stage3 starts.

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-12 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #10 from Martin Liška  ---
Bernd: Can the bug be marked as resolved?

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-07 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #9 from Bernd Edlinger  ---
Author: edlinger
Date: Wed Aug  7 13:45:06 2019
New Revision: 274163

URL: https://gcc.gnu.org/viewcvs?rev=274163&root=gcc&view=rev
Log:
2019-08-07  Bernd Edlinger  

PR tree-optimization/91109
* lra-remat.c (update_scratch_ops): Remove assignment of the
hard register.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/lra-remat.c

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-05 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #8 from Bernd Edlinger  ---
Patch is posted here: https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00305.html

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-01 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #7 from Bernd Edlinger  ---
I can reproduce this defect with gcc-9 (!)

$ ../gcc-9-branch/configure --prefix=/home/ed/gnu/arm-linux-gnueabihf-linux64-1
--target=arm-linux-gnueabihf --enable-languages=c,c++ --with-arch=armv7-a
--with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard

$ TMP=. arm-linux-gnueabihf-gcc -O2 -flto -save-temps -fdump-rtl-all-all
20040709-1.c

$ grep same *.reload
 Assigning the same 6155 to r11
$ vi *.ltrans0.s
look for the last umull (it is always the last one):

str r5, [fp]
umull   fp, r3, r7, r8
[...]
str r6, [fp]

But the same does not happen for gcc-8:

$ grep same *.reload

and the assembler listing looks okay.
But update_scratch_ops is exactly identical in gcc-8,
so the issue is likely present there as well, just hidden.
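
When checking many dump files, the grep step above could be replaced by a small
parser. The following is only a sketch in C: toy_parse_same_assignment is a
hypothetical helper, and the line format is assumed from the single example
" Assigning the same 6155 to r11" shown above. Reading the first number as the
pseudo and the second as the hard register is likewise an assumption, though
it is consistent with fp (r11) being overwritten in the listing.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper.  Returns 1 and fills *PSEUDO and *HARD_REGNO
   if LINE has the assumed form " Assigning the same <n> to r<m>",
   and returns 0 otherwise.  */
int
toy_parse_same_assignment (const char *line, int *pseudo, int *hard_regno)
{
  /* A leading blank in a scanf format skips any amount of white
     space, so the indentation of the dump line does not matter.  */
  return sscanf (line, " Assigning the same %d to r%d",
                 pseudo, hard_regno) == 2;
}
```

Calling it on each line of a *.reload file and counting the matches gives the
same yes/no answer as the grep, plus the register numbers involved.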

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-08-01 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #6 from Bernd Edlinger  ---
With this patch the relevant part of the reload dump file looks different:

(insn 3414 6591 6682 129 (set (mem/c:SI (reg/f:SI 5 r5 [5715]) [1
s.5566D.5531+0 S4 A32])
(reg:SI 6 r6 [orig:828 _821 ] [828])) "20040709-1.c":13:5 654
{*arm_movsi_vfp}
 (nil))
[...]
(insn 6826 3453 6816 129 (parallel [
(set (reg:SI 3 r3 [4187])
(truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (reg:SI 11
fp [4186]))
(zero_extend:DI (reg:SI 7 r7 [4188])))
(const_int 32 [0x20]))))
(clobber (reg:SI 1 r1 [5970]))
]) "20040709-1.c":108:291 70 {*umulsi3_highpart_v6}
 (nil))
[...]
(insn 3509 3530 3531 132 (set (mem/c:SI (reg/f:SI 5 r5 [5715]) [1
s.5566D.5531+0 S4 A32])
(reg:SI 14 lr [orig:692 D.6083 ] [692])) "20040709-1.c":13:5 654
{*arm_movsi_vfp}
 (nil))


This time insn 6826 is able to choose a different register than r5,
and most importantly the live-range info is correct, since the
old register r5970 is renamed to r6374 temporarily:

  Creating newreg=6374 from oldreg=5970, assigning class GENERAL_REGS to
scratch pseudo copy r6374
 6816: r6364:SI=r4187:SI
  REG_DEAD r4187:SI
Inserting rematerialization insn before:
 6826: {r6364:SI=trunc(zero_extend(r4186:SI)*zero_extend(r4188:SI)
0>>0x20);clobber r6374:SI;}
  REG_UNUSED r6374:SI

This is visible in the live ranges (the r6374 entry was not there before):

 r6007: [59..59]
 r6374: [1572..1572]
Compressing live ranges: from 3802 to 75 - 1%
Ranges after the compression:
[...]
 r6007: [1..1]
 r6374: [40..40]

However since the re-materialized instruction is able to use r1
there is no conflict any more.  So I believe the patch is a
straight improvement over the previous state of affairs.
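
Why correct live-range info matters can be shown with a small sketch in C.
This is a toy model under stated assumptions, not LRA's algorithm: struct
toy_reg, toy_overlap and toy_pick_hard_reg are hypothetical names, each
register has a single closed range of program points, and the lowest free
hard register is picked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model: a register with one closed live range.  */
struct toy_reg {
  int hard_regno;  /* -1 while unassigned */
  int start, end;  /* live at program points start..end */
};

/* Two closed ranges overlap if each one starts before the other ends.  */
static bool
toy_overlap (const struct toy_reg *a, const struct toy_reg *b)
{
  return a->start <= b->end && b->start <= a->end;
}

/* Return the lowest hard register in 0..N_HARD-1 that is not used by
   any register in OTHERS whose live range overlaps R, or -1.  */
int
toy_pick_hard_reg (const struct toy_reg *r, const struct toy_reg *others,
                   int n_others, int n_hard)
{
  for (int h = 0; h < n_hard; h++)
    {
      bool free_p = true;
      for (int i = 0; i < n_others; i++)
        if (others[i].hard_regno == h && toy_overlap (r, &others[i]))
          free_p = false;
      if (free_p)
        return h;
    }
  return -1;
}
```

A scratch copy with a real one-point range is steered away from registers that
are live at that point. A copy with an empty range (start > end) overlaps
nothing, so any hard register looks free for it, including one that holds a
live address.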


So it looks like this is a potentially catastrophic bug, and not
related to -flto or to any specific target architecture.
From my testing it is likely that it was already there in gcc-9.0.1.

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-07-31 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #5 from Bernd Edlinger  ---
Created attachment 46654
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46654&action=edit
untested patch

It looks like update_scratch_ops creates a copy of the original scratch
register, but the new scratch register has no working live range info.
I don't know a correct solution for the underlying problem, but
removing the assignment to reg_renumber seems to fix the test case.

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-07-31 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

Bernd Edlinger  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bernd.edlinger at hotmail dot de

--- Comment #4 from Bernd Edlinger  ---
Hmm, funny, I have seen this test case failing since February at least:

https://gcc.gnu.org/ml/gcc-testresults/2019-02/msg02686.html

FAIL: gcc.c-torture/execute/20040709-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  execution test

--with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16
--with-float=hard


I have not looked into it before, but
to me it looks like a reload bug:

str r6, [r5] <= r5 still valid
stm r9, {r0, r1, r2, r3}
umull   r5, r3, r7, fp <= r5 clobbered
ldr r2, [r4, #176]
lsr r9, r3, #3
mov r3, r0
eor r3, r3, r2
rsb r9, r9, r9, lsl #4
tst r3, r10
sub r9, fp, r9
bne .L29
ldrh r2, [sp, #176]
ldrh r3, [r4, #176]
eor r2, r2, r3
ubfx r2, r2, #0, #12
cmp r2, #0
bne .L29
cmp r9, r9
bne .L29
mla r6, r8, r6, lr
ldr fp, .L79+36
mla lr, r8, r6, lr
ubfx r6, r6, #16, #11
bfi r3, r6, #0, #12
strh r3, [r4, #176]  @ movhi
uxth r8, r3
ldm fp, {r0, r1, r2, r3}
ubfx ip, lr, #16, #11
add r7, r6, ip
add ip, sp, #176
bfi r8, r7, #0, #12
str lr, [r5]   <= r5 invalid

reload:
(insn 6826 3453 6816 129 (parallel [
(set (reg:SI 3 r3 [4187])
(truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (reg:SI 11
fp [4186]))
(zero_extend:DI (reg:SI 7 r7 [4188])))
(const_int 32 [0x20]))))
(clobber (reg:SI 5 r5 [5970]))
]) "20040709-1.c":108:291 70 {*umulsi3_highpart_v6}
 (nil))
[...]
(insn 3509 3530 3531 132 (set (mem/c:SI (reg/f:SI 5 r5 [5715]) [1
s.5566D.5531+0 S4 A32])
(reg:SI 14 lr [orig:692 D.6083 ] [692])) "20040709-1.c":13:5 654
{*arm_movsi_vfp}
 (nil))
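
The effect of the str / umull / str sequence above can be replayed on a toy
machine. This is only a sketch in C: struct toy_cpu and the toy_* functions
are hypothetical, memory is 8 words indexed by register value, and the
register contents are made up. Only the instruction pattern and the register
roles (r5 as address, low half of umull as the scratch result) come from the
listing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine: 16 registers and 8 words of memory.
   A register used as an address is taken modulo 8 as a word index.  */
struct toy_cpu {
  uint32_t r[16];
  uint32_t mem[8];
};

/* str rs, [ra]  */
static void
toy_str (struct toy_cpu *c, int rs, int ra)
{
  c->mem[c->r[ra] % 8] = c->r[rs];
}

/* umull rlo, rhi, rn, rm: 64-bit product split over two registers.  */
static void
toy_umull (struct toy_cpu *c, int rlo, int rhi, int rn, int rm)
{
  uint64_t p = (uint64_t) c->r[rn] * c->r[rm];
  c->r[rlo] = (uint32_t) p;
  c->r[rhi] = (uint32_t) (p >> 32);
}

/* Replay the pattern with SCRATCH receiving the unwanted low half.
   Returns the final content of the word that r5 pointed to.  */
uint32_t
toy_replay (int scratch)
{
  struct toy_cpu c = { { 0 }, { 0 } };
  c.r[5] = 2;                        /* address register */
  c.r[6] = 111;                      /* first value to store */
  c.r[14] = 222;                     /* lr: second value to store */
  c.r[7] = 3;
  c.r[11] = 5;                       /* fp */
  toy_str (&c, 6, 5);                /* str r6, [r5] */
  toy_umull (&c, scratch, 3, 7, 11); /* umull <scratch>, r3, r7, fp */
  toy_str (&c, 14, 5);               /* str lr, [r5] */
  return c.mem[2];
}
```

toy_replay (1) returns 222 because the second store reaches the intended
word. toy_replay (5) returns 111: the multiply overwrites r5 with 15, so the
second store lands in a different word and the intended one keeps the stale
value.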

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-07-09 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #3 from rguenther at suse dot de  ---
On Mon, 8 Jul 2019, clyon at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109
> 
> --- Comment #2 from Christophe Lyon  ---
> Removing the test*() calls from the end, the first failing one is testX().
> However, if I remove all the preceding ones, the test passes.

Ugh.  Not very much simplification.  I suppose trying to trim the
number of test() calls before testX() isn't possible?

> Using -fwhole-program instead of -flto has no effect: the test still fails.

That's good news OTOH and simplifies analysis.

> Adding a printf() call in check() also makes the test pass.

test##S you mean probably.  But yes, that's expected.

Given there's no regression with hard float having testW () might
be important (uses long double).  There may be also ABI differences
(sizeof (long double)) when switching between hard-float and soft-float?
Looking at a cross compiler, long double == double == 8 bytes.

Again I'm expecting a target issue here.

The rev. made a difference in inlining because it removes fewer
stores as redundant during early optimizations.  testN is no
longer inlined.

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-07-08 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

--- Comment #2 from Christophe Lyon  ---
Removing the test*() calls from the end, the first failing one is testX().
However, if I remove all the preceding ones, the test passes.

Using -fwhole-program instead of -flto has no effect: the test still fails.

Adding a printf() call in check() also makes the test pass.

[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135

2019-07-08 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
   Target Milestone|---                         |10.0

--- Comment #1 from Richard Biener  ---
Can you help and check which test* () call fails?  Also check whether
-fwhole-program instead of -flto makes it fail.  Does it still fail when you
comment out all but the failing test* () call?