[Bug target/92922] [10 regression] [ilp32] FAIL: gcc.target/aarch64/sve/acle/asm/ldnt1_u32.c -std=c90 -O1 -g -DTEST_FULL (internal compiler error)

2020-02-26 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92922

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||sudi at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #3 from sudi at gcc dot gnu.org ---
Fixed as commented

[Bug other/92870] new test case gcc.dg/vect/vect-shift-5.c fails starting with its introduction in r279114

2019-12-12 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92870

--- Comment #3 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Thu Dec 12 18:01:18 2019
New Revision: 279310

URL: https://gcc.gnu.org/viewcvs?rev=279310&root=gcc&view=rev
Log:
[Committed, testsuite] Fix PR92870

With my recent commit, I added a test that is not passing on all targets.
My change was valid for targets that have a vector/scalar shift/rotate optabs
(optab that supports vector shifted by scalar).

Since it does not seem to be easy to find out which targets would support it,
I am limiting the test to the targets that I know pass.

gcc/testsuite/ChangeLog

2019-12-12  Sudakshina Das  

PR testsuite/92870
* gcc.dg/vect/vect-shift-5.c: Add target to scan-tree-dump.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/vect/vect-shift-5.c
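
For readers unfamiliar with the term, "vector shifted by scalar" in the log above means that every element is shifted by the same loop-invariant amount. A minimal sketch of such a loop follows; the function name `shift_all` is hypothetical and this is not the content of vect-shift-5.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: each element of SRC is shifted left by the
   same scalar AMOUNT.  When a target provides a vector/scalar shift
   optab, a vectorizer can shift a whole vector by that one scalar.  */
void shift_all(uint32_t *restrict dst, const uint32_t *restrict src,
               size_t n, unsigned amount)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] << (amount & 31); /* keep the shift count in range */
}
```

Whether a given compiler actually vectorizes this loop depends on the target and options, which is the portability problem the commit works around.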

[Bug other/92870] new test case gcc.dg/vect/vect-shift-5.c fails starting with its introduction in r279114

2019-12-10 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92870

--- Comment #2 from sudi at gcc dot gnu.org ---
So I am not sure how to get a list of targets that would support a particular
optab. I guess I can introduce a new effective target check with only the
targets that I know pass? Would that be ok?

[Bug other/92870] new test case gcc.dg/vect/vect-shift-5.c fails starting with its introduction in r279114

2019-12-10 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92870

--- Comment #1 from sudi at gcc dot gnu.org ---
Ah, I think I need a better effective target check. This test would only pass
for targets that have a vector/scalar shift/rotate optab.

[Bug other/92870] new test case gcc.dg/vect/vect-shift-5.c fails starting with its introduction in r279114

2019-12-10 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92870

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |sudi at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug target/91816] [7/8/9/10 Regression] Arm generates out of range conditional branches in Thumb2

2019-10-02 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91816

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||sudi at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |sudi at gcc dot gnu.org
    Summary|Arm generates out of range conditional branches in Thumb2 |[7/8/9/10 Regression] Arm generates out of range conditional branches in Thumb2

[Bug target/88620] [7/8/9 Regression] ICE in assign_stack_temp_for_type, at function.c:837

2018-12-28 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88620

--- Comment #2 from sudi at gcc dot gnu.org ---
I haven't looked very closely at PR82564, but it was marked as a possible
duplicate of an already resolved ticket a year ago. In any case, I can confirm
this failure is still occurring for aarch64 trunk and the last couple of release
branches.

[Bug target/88620] [7/8/9 Regression] ICE in assign_stack_temp_for_type, at function.c:837

2018-12-28 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88620

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

[Bug target/88616] ICE in gimplify_expr at gcc/gimplify.c:13363

2018-12-28 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88616

--- Comment #1 from sudi at gcc dot gnu.org ---
Started somewhere between r264874 and r266250. (I know the window is too big
:()

[Bug c/85870] [6/7/8/9 Regression][LTO1] ICE in linemap_line_start, at libcpp/line-map.c:794

2018-10-09 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85870

--- Comment #11 from sudi at gcc dot gnu.org ---
Yes I remember spending a while to get it to reduce further. But it needs a big
constructor to fail.

[Bug c/85870] [6/7/8/9 Regression][LTO1] ICE in linemap_line_start, at libcpp/line-map.c:794

2018-10-09 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85870

--- Comment #7 from sudi at gcc dot gnu.org ---
It no longer fails on x86_64 trunk, but it still fails with 8.0.1:

+ TARGET=x86_64-pc-linux-gnu
+ GCC_INSTALL=/work/x86-trunk/bld
+ GCC=/work/x86-trunk/bld/bin/x86_64-pc-linux-gnu-gcc-8.0.1
+ LTO1=/work/x86-trunk/bld/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/lto1
+ CFLAGS=-O2 -flto
+ /work/x86-trunk/bld/bin/x86_64-pc-linux-gnu-gcc-8.0.1 -O2 -flto -c test_1.i -o test_1.o
+ /work/x86-trunk/bld/bin/x86_64-pc-linux-gnu-gcc-8.0.1 -O2 -flto -c test_2.i -o test_2.o
+ /work/x86-trunk/bld/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/lto1 test_1.o test_2.o
Reading object files: test_1.o test_2.olto1: internal compiler error: in
linemap_line_start, at libcpp/line-map.c:794
0x14a025b linemap_line_start(line_maps*, unsigned int, unsigned int)
../../src/gcc/libcpp/line-map.c:794
0xa8c893 lto_location_cache::apply_location_cache()
../../src/gcc/gcc/lto-streamer-in.c:194
0x76bc54 lto_read_decls
../../src/gcc/gcc/lto/lto.c:1816
0x76e221 lto_file_finalize
../../src/gcc/gcc/lto/lto.c:2076
0x76e221 lto_create_files_from_ids
../../src/gcc/gcc/lto/lto.c:2086
0x76e221 lto_file_read
../../src/gcc/gcc/lto/lto.c:2127
0x76e221 read_cgraph_and_symbols
../../src/gcc/gcc/lto/lto.c:2839
0x76e221 lto_main()
../../src/gcc/gcc/lto/lto.c:3356
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug c/85870] [6/7/8/9 Regression][LTO1] ICE in linemap_line_start, at libcpp/line-map.c:794

2018-10-09 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85870

--- Comment #6 from sudi at gcc dot gnu.org ---
Still fails for me on aarch64-none-linux-gnu-gcc and aarch64-none-elf-gcc on
trunk and gcc-8.2.1 with the same error

Reading object files: test_1.o test_2.olto1: internal compiler error: in
linemap_line_start, at libcpp/line-map.c:794
0x1414d7b linemap_line_start(line_maps*, unsigned int, unsigned int)
/aarch64-none-elf/build/src/gcc/libcpp/line-map.c:794
0x9a264f lto_location_cache::apply_location_cache()
/aarch64-none-elf/build/src/gcc/gcc/lto-streamer-in.c:194
0x5e946c lto_read_decls
/aarch64-none-elf/build/src/gcc/gcc/lto/lto.c:1852
0x5ea533 lto_file_finalize
/aarch64-none-elf/build/src/gcc/gcc/lto/lto.c:2121
0x5ea533 lto_create_files_from_ids
/aarch64-none-elf/build/src/gcc/gcc/lto/lto.c:2131
0x5ea533 lto_file_read
/aarch64-none-elf/build/src/gcc/gcc/lto/lto.c:2172
0x5ea533 read_cgraph_and_symbols
/aarch64-none-elf/build/src/gcc/gcc/lto/lto.c:2845
0x5ea533 lto_main()
/aarch64-none-elf/build/src/gcc/gcc/lto/lto.c:3362
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

aarch64-none-linux-gnu-gcc --version
aarch64-none-linux-gnu-gcc (fsf-trunk.1693) 9.0.0 20181005 (experimental)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

aarch64-none-linux-gnu-gcc --version
aarch64-none-linux-gnu-gcc (fsf-8.90) 8.2.1 20181007
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

aarch64-none-elf-gcc --version
aarch64-none-elf-gcc (fsf-trunk.1693) 9.0.0 20181005 (experimental)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

aarch64-none-elf-gcc --version
aarch64-none-elf-gcc (fsf-8.90) 8.2.1 20181007
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[Bug target/87511] New: [9 Regression][AArch64] UBFIZ instruction with invalid immediate emitted

2018-10-04 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87511

Bug ID: 87511
   Summary: [9 Regression][AArch64] UBFIZ instruction with invalid
immediate emitted
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sudi at gcc dot gnu.org
  Target Milestone: ---

When compiling the code below with aarch64 and -Os

int a, d;
struct {
  signed f5 : 26;
  signed f6 : 12;
} b;
signed char c;
void fn1() {
  signed char *e = &c;
  d = a * 10;
  *e = d;
  b.f6 = c;
  b.f5 = 8 <= 3;
}

We get:
$ aarch64-none-elf-gcc -march=armv8-a -c test.c -o /dev/null -Os -Wall
/tmp/ccVimNZB.s: Assembler messages:
/tmp/ccVimNZB.s:20: Error: immediate value out of range 1 to 32 at operand 4 -- `ubfiz x0,x0,32,38'

This started somewhere between r260322 and r261702.

Seems to be incorrectly matching the below in IRA
//(insn:TI 30 22 35 (set (reg:DI 0 x0 [120])
// (and:DI (ashift:DI (reg:DI 0 x0 [orig:92 _3 ] [92])
// (const_int 32 [0x20]))
// (const_int 17587958185983 [0xfff03ffffff]))) "bfiz.c":12 786
{*andim_ashiftdi_bfiz}
// (nil))
 ubfiz x0, x0, 32, 38 // 30 [c=4 l=4] *andim_ashiftdi_bfiz
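
The assembler message is consistent with the UBFIZ operand rules: for a 64-bit register, lsb must be in 0..63 and width in 1..(64 - lsb), so with lsb = 32 the width cannot exceed 32. The sketch below encodes that check. The helper names are hypothetical and are not GCC or binutils functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: are #lsb and #width valid for
   UBFIZ <Rd>, <Rn>, #lsb, #width on a REGSIZE-bit register?  */
bool ubfiz_operands_valid(unsigned regsize, unsigned lsb, unsigned width)
{
    if (regsize != 32 && regsize != 64)
        return false;
    if (lsb >= regsize)
        return false;
    return width >= 1 && width <= regsize - lsb;
}

/* Hypothetical helper: the result bits UBFIZ can set, that is WIDTH ones
   starting at bit LSB.  Assumes the operands passed the check above.  */
uint64_t ubfiz_result_mask(unsigned lsb, unsigned width)
{
    uint64_t ones = (width == 64) ? UINT64_MAX
                                  : ((UINT64_C(1) << width) - 1);
    return ones << lsb;
}
```

Here `ubfiz_operands_valid(64, 32, 38)` is false, matching the error. As a side observation (mine, not from the report): the low 32 bits of a value shifted left by 32 are zero, so only the 0xfff00000000 part of the mask 17587958185983 can matter, and that equals `ubfiz_result_mask(32, 12)`.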

[Bug target/84521] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-08-01 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #28 from sudi at gcc dot gnu.org ---
Created attachment 44478
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44478&action=edit
Failing test case

As advised by James on the mailing list, I am adding the test case that is
failing on at least AArch64 and x86. My proposed patch fixes this on AArch64,
but I would like to add this test in the general gcc.c-torture/execute/ folder
so that other targets can also check.

[Bug tree-optimization/86489] ICE in gimple_phi_arg starting with r261682 when building 531.deepsjeng_r with FDO + LTO

2018-07-12 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86489

sudi at gcc dot gnu.org changed:

   What|Removed |Added

   Last reconfirmed||2018-7-12
 CC||sudi at gcc dot gnu.org

--- Comment #6 from sudi at gcc dot gnu.org ---
(In reply to kugan from comment #1)
> Sorry about the breakage, I am trying to reproduce it on x86-64. Please let
> me know if you have testcase.

This can reproduce the failure:

int a = 0, b = 0;
void fn1() {
  int c = 0;
  for (; a; a--)
    c += b;
  while ((c - 1) & c)
    ;
}

[Bug c/85925] New: [ARM][7/8/9 Regression] Mis-compilation at -O2, masking with 257 goes wrong in combine

2018-05-25 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85925

Bug ID: 85925
   Summary: [ARM][7/8/9 Regression] Mis-compilation at -O2,
masking with 257 goes wrong in combine
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sudi at gcc dot gnu.org
  Target Milestone: ---

The following test case:

#include <stdio.h>

int a, c, d;
volatile int b;
int *e = &d;

union U1 {
  unsigned f0;
  unsigned f1 : 15;
};

int main() {
  for (c = 0; c <= 1; c++) {
    union U1 f = {0x10101};
    if (c == 1)
      b;
    *e = f.f1;
  }

  printf("checksum = %X\n", d);
}

which should print "checksum = 101", but when compiled at -O2 for an aarch32
target it prints "checksum = 10101"

arm-none-eabi-gcc -march=armv7-a -c test.c -o test.o -O2

Compiles correctly for -march=armv8-a.
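
To spell out the expected value: reading the 15-bit field keeps only the low 15 bits of 0x10101, which is 0x101 (257, the constant that shows up in the combine dumps below). A small sketch of that arithmetic, with a hypothetical helper name:

```c
#include <stdint.h>

/* Hypothetical helper: keep the low WIDTH bits of VALUE (0 < WIDTH < 32),
   which is what reading an unsigned WIDTH-bit field amounts to here.  */
uint32_t low_bits(uint32_t value, unsigned width)
{
    return value & ((UINT32_C(1) << width) - 1);
}
```

`low_bits(0x10101, 15)` gives 0x101, whereas the miscompiled program prints the unmasked 0x10101.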

Kyrill helped to show the difference between armv7-a and armv8-a starts at
combine where the good dump shows:
Trying 22 -> 23:
   22: r123:HI#0=zero_extract(r117:SI,0xf,0)
  REG_DEAD r117:SI
   23: r124:SI=zero_extend(r123:HI)
  REG_DEAD r123:HI
Successfully matched this instruction:
(set (reg:SI 124)
(and:SI (reg/v:SI 117 [ f ])
(const_int 257 [0x101])))
allowing combination of insns 22 and 23
original costs 8 + 8 = 16
replacement cost 12
deferring deletion of insn with uid = 22.
modifying insn i3    23: r124:SI=r117:SI&0x101
  REG_DEAD r117:SI
deferring rescan insn with uid = 23.

Trying 23 -> 24:
   23: r124:SI=r117:SI&0x101
  REG_DEAD r117:SI
   24: [r116:SI]=r124:SI
  REG_DEAD r124:SI
Failed to match this instruction:
(set (mem:SI (reg/f:SI 116 [ pretmp_20 ]) [1 *pretmp_20+0 S4 A32])
(and:SI (reg/v:SI 117 [ f ])
(const_int 257 [0x101])))

but the bad shows:
Trying 22 -> 23:
   22: r123:HI#0=zero_extract(r117:SI,0xf,0)
  REG_DEAD r117:SI
   23: r124:SI=zero_extend(r123:HI)
  REG_DEAD r123:HI
Successfully matched this instruction:
(set (reg:SI 124)
(and:SI (reg/v:SI 117 [ f ])
(const_int 257 [0x101])))
rejecting combination of insns 22 and 23
original costs 4 + 4 = 8
replacement cost 12

Trying 23 -> 24:
   23: r124:SI=zero_extend(r123:HI)
  REG_DEAD r123:HI
   24: [r116:SI]=r124:SI
  REG_DEAD r124:SI
Successfully matched this instruction:
(set (mem:SI (reg/f:SI 116 [ pretmp_20 ]) [1 *pretmp_20+0 S4 A32])
(subreg:SI (reg:HI 123) 0))
allowing combination of insns 23 and 24
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 23.
modifying insn i3    24: [r116:SI]=r123:HI#0
  REG_DEAD r123:HI
deferring rescan insn with uid = 24.

Trying 22 -> 24:
   22: r123:HI#0=zero_extract(r117:SI,0xf,0)
  REG_DEAD r117:SI
   24: [r116:SI]=r123:HI#0
  REG_DEAD r123:HI
Successfully matched this instruction:
(set (mem:SI (reg/f:SI 116 [ pretmp_20 ]) [1 *pretmp_20+0 S4 A32])
(reg/v:SI 117 [ f ]))
allowing combination of insns 22 and 24
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 22.
modifying insn i3    24: [r116:SI]=r117:SI
  REG_DEAD r117:SI
deferring rescan insn with uid = 24.

and thus eats out the masking (and the zero_extend).

[Bug target/84882] -mstrict-align on aarch64 should not be RejectNegative

2018-05-23 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84882

--- Comment #3 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Wed May 23 11:33:09 2018
New Revision: 260604

URL: https://gcc.gnu.org/viewcvs?rev=260604&root=gcc&view=rev
Log:
[AArch64][PR target/84882] Add mno-strict-align

*** gcc/ChangeLog ***

2018-05-23  Sudakshina Das  <sudi@arm.com>

PR target/84882
* common/config/aarch64/aarch64-common.c (aarch64_handle_option):
Check val before adding MASK_STRICT_ALIGN to opts->x_target_flags.
* config/aarch64/aarch64.opt (mstrict-align): Remove RejectNegative.
* config/aarch64/aarch64.c (aarch64_attributes): Mark allow_neg
as true for strict-align.
(aarch64_can_inline_p): Perform checks even when callee has no
attributes to check for strict alignment.
* doc/extend.texi (AArch64 Function Attributes): Document
no-strict-align.
* doc/invoke.texi: (AArch64 Options): Likewise.

*** gcc/testsuite/ChangeLog ***

2018-05-23  Sudakshina Das  <sudi@arm.com>

PR target/84882
* gcc.target/aarch64/pr84882.c: New test.
* gcc.target/aarch64/target_attr_18.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr84882.c
trunk/gcc/testsuite/gcc.target/aarch64/target_attr_18.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/common/config/aarch64/aarch64-common.c
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/config/aarch64/aarch64.opt
trunk/gcc/doc/extend.texi
trunk/gcc/doc/invoke.texi
trunk/gcc/testsuite/ChangeLog
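
As a usage sketch for the function attribute documented by this commit: the code below applies `no-strict-align` to one function. The guard macro `NO_STRICT_ALIGN` and the function `load_u32` are hypothetical, and the version check assumes the change first ships in GCC 9. On other compilers or targets the macro expands to nothing.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: only GCC 9 or later on AArch64 accepts this attribute
   string, so everything else gets an empty macro.  */
#if defined(__aarch64__) && defined(__GNUC__) && !defined(__clang__) \
    && __GNUC__ >= 9
#define NO_STRICT_ALIGN __attribute__((target("no-strict-align")))
#else
#define NO_STRICT_ALIGN
#endif

/* Load a 32-bit value from a possibly unaligned address.  With strict
   alignment turned off for this function, the compiler is free to use a
   single unaligned load for the memcpy.  */
NO_STRICT_ALIGN uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The command-line counterpart added by the same commit is -mno-strict-align.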

[Bug c/85870] [6/7/8/9 Regression][LTO1] ICE in linemap_line_start, at libcpp/line-map.c:794

2018-05-22 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85870

sudi at gcc dot gnu.org changed:

   What|Removed |Added

   Keywords|ice-on-invalid-code |ice-on-valid-code

--- Comment #2 from sudi at gcc dot gnu.org ---
Sorry, that was my mistake. It is valid code.

[Bug c/85870] New: [6/7/8/9 Regression][LTO1] ICE in linemap_line_start, at libcpp/line-map.c:794

2018-05-22 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85870

Bug ID: 85870
   Summary: [6/7/8/9 Regression][LTO1] ICE in linemap_line_start,
at libcpp/line-map.c:794
   Product: gcc
   Version: lto
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sudi at gcc dot gnu.org
  Target Milestone: ---

Created attachment 44160
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44160&action=edit
Reproducer

Hi

Please find the attached tar to show the following failure

Reading object files: test_1.o test_2.olto1: internal compiler error:
in linemap_line_start, at libcpp/line-map.c:794
0x134278f linemap_line_start(line_maps*, unsigned int, unsigned int)
/work/trunk/src/gcc/libcpp/line-map.c:794
0x97fea1 lto_location_cache::apply_location_cache()
/work/trunk/src/gcc/gcc/lto-streamer-in.c:194
0x5dd948 lto_read_decls
/work/trunk/src/gcc/gcc/lto/lto.c:1852
0x5dee7d lto_file_finalize
/work/trunk/src/gcc/gcc/lto/lto.c:2121
0x5dee7d lto_create_files_from_ids
/work/trunk/src/gcc/gcc/lto/lto.c:2131
0x5dee7d lto_file_read
/work/trunk/src/gcc/gcc/lto/lto.c:2172
0x5dee7d read_cgraph_and_symbols
/work/trunk/src/gcc/gcc/lto/lto.c:2845
0x5dee7d lto_main()
/work/trunk/src/gcc/gcc/lto/lto.c:3362
Please submit a full bug report,
with preprocessed source if appropriate.

Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

This is failing on (at least) AArch64 and x86 targets.
On AArch64, this is failing on gcc-6, gcc-7, gcc-8 and trunk.
On x86, this is failing on at least gcc-8 and trunk.

The folder contains the following:
  1) A README file.
  2) Two test files : test_1.i and test_2.i
  3) One script to reproduce the failure.

To be able to run the script, it needs the following to be defined:
  1) TARGET (aarch64-none-linux-gnu or x86_64-pc-linux-gnu)
  2) GCC_INSTALL (/path/to/install/directory)

[Bug libstdc++/85818] [8/9 Regression] undefined reference to `std::experimental::filesystem::v1::__cxx11::path::preferred_separator'

2018-05-21 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85818

--- Comment #9 from sudi at gcc dot gnu.org ---
Thanks!

[Bug libstdc++/85818] [8/9 Regression] undefined reference to `std::experimental::filesystem::v1::__cxx11::path::preferred_separator'

2018-05-21 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85818

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sudi at gcc dot gnu.org

--- Comment #4 from sudi at gcc dot gnu.org ---
I have observed the following failure in the nightly tests for the bare-metal
targets aarch64-none-elf, aarch64_be-none-elf and arm-none-eabi, on both trunk
and the gcc-8 branch.

FAIL: experimental/filesystem/path/preferred_separator.cc (test for excess
errors)

with

compilation terminated.

compiler exited with status 1
output is:
/build/src/gcc/libstdc++-v3/testsuite/experimental/filesystem/path/preferred_separator.cc:21:
fatal error: experimental/filesystem: No such file or directory

compilation terminated.

[Bug tree-optimization/85804] [8/9 Regression][AArch64] Mis-compilation of loop with strided array access and xor reduction

2018-05-16 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85804

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Target||aarch64-none-linux-gnu
   Target Milestone|--- |8.2

[Bug tree-optimization/85804] New: [8/9 Regression][AArch64] Mis-compilation of loop with strided array access and xor reduction

2018-05-16 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85804

Bug ID: 85804
   Summary: [8/9 Regression][AArch64] Mis-compilation of loop with
strided array access and xor reduction
   Product: gcc
   Version: 8.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sudi at gcc dot gnu.org
  Target Milestone: ---

The following test case:

#include <stdio.h>

long d[32] = {0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

int main() {
  int b = 0;
  for (int c = 0; c <= 5; c++)
    b ^= d[c * 5 + 1];
  printf("checksum = %x\n", b);
}

when compiled with:
aarch64-none-linux-gnu-gcc -O3 f.c
prints "checksum = 1".

All the elements being xor'd (1,6,11,16,21,26) are 0s and thus the result
should also be 0.

The assembly for main looks like:
main:
.LFB11:
        .cfi_startproc
        adrp    x1, .LANCHOR0
        add     x1, x1, :lo12:.LANCHOR0
        stp     x29, x30, [sp, -16]!
        .cfi_def_cfa_offset 16
        .cfi_offset 29, -16
        .cfi_offset 30, -8
        adrp    x0, .LC0
        add     x0, x0, :lo12:.LC0
        mov     x29, sp
        ldr     q1, [x1, 8]
        ldr     q2, [x1, 24]
        ldr     q0, [x1, 40]
        xtn     v2.2s, v2.2d
        xtn     v1.2s, v1.2d
        xtn     v0.2s, v0.2d
        eor     v1.8b, v1.8b, v2.8b
        eor     v0.8b, v0.8b, v1.8b
        ushr    d1, d0, 32
        eor     v0.8b, v0.8b, v1.8b
        umov    w1, v0.s[0]
        bl      printf
        mov     w0, 0
        ldp     x29, x30, [sp], 16
        .cfi_restore 30
        .cfi_restore 29
        .cfi_def_cfa_offset 0
        ret
        .cfi_endproc


This goes away with -fno-tree-loop-vectorize.
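
A scalar reference for the reduction makes the expected result easy to check. The sketch below uses a hypothetical helper, `strided_xor`. It also lets one test a reading of the assembly above, which is my interpretation rather than something stated in the report: with 8-byte `long`, the three `ldr q` loads at byte offsets 8, 24 and 40 appear to cover d[1] through d[6] contiguously instead of with stride 5.

```c
/* Hypothetical helper: XOR of COUNT elements of D, starting at index
   OFFSET and stepping by STRIDE.  */
long strided_xor(const long *d, int count, int stride, int offset)
{
    long acc = 0;
    for (int c = 0; c < count; c++)
        acc ^= d[c * stride + offset];
    return acc;
}
```

With the array from the test case, `strided_xor(d, 6, 5, 1)` is 0, the expected checksum. `strided_xor(d, 6, 1, 1)` is 1 because it includes d[2], which would match the wrong output if the contiguous reading is right.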

[Bug tree-optimization/85794] New: [8/9 Regression][AArch64] ICE in expand_vector_condition in GIMPLE pass: veclower2

2018-05-15 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85794

Bug ID: 85794
   Summary: [8/9 Regression][AArch64] ICE in
expand_vector_condition in GIMPLE pass: veclower2
   Product: gcc
   Version: 8.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sudi at gcc dot gnu.org
  Target Milestone: ---

The following code ICEs with -O3 with aarch64-none-linux-gnu

int a, b, *c, d, *e;
int *f[6];
void fn1() {
  b = 1;
  for (; b >= 0; b--) {
    d = 0;
    for (; d <= 3; d++) {
      int **g =
      *g =
      *e |= f[b * 5] != c;
    }
  }
}

./build-aarch64-none-linux-gnu/install/bin/aarch64-none-linux-gnu-gcc f.c -O3 -S -fdump-tree-all
during GIMPLE pass: veclower2
dump file: f.c.173t.veclower21
f.c: In function ‘fn1’:
f.c:6:6: internal compiler error: Segmentation fault
 void fn1() {
  ^~~
0xbd57ff crash_signal
/work/trunk/src/gcc/gcc/toplev.c:325
0xe55d90 tree_class_check
/work/trunk/src/gcc/gcc/tree.h:3257
0xe55d90 expand_vector_condition
/work/trunk/src/gcc/gcc/tree-vect-generic.c:901
0xe587c1 expand_vector_operations_1
/work/trunk/src/gcc/gcc/tree-vect-generic.c:1582
0xe587c1 expand_vector_operations
/work/trunk/src/gcc/gcc/tree-vect-generic.c:1829
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug c/85793] New: [8/9 Regression][AARCH64] ICE in verify_gimple during GIMPLE pass vect.

2018-05-15 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85793

Bug ID: 85793
   Summary: [8/9 Regression][AARCH64] ICE in verify_gimple during
GIMPLE pass vect.
   Product: gcc
   Version: 8.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sudi at gcc dot gnu.org
  Target Milestone: ---

The following test case fails with aarch64-none-linux-gnu-gcc -O3

int a, c, d;
long b[6];
void fn1() {
  for (; a < 2; a++) {
    c = 0;
    for (; c <= 5; c++)
      d &= b[a * 3];
  }
}

$ ./build-aarch64-none-linux-gnu/install/bin/aarch64-none-linux-gnu-gcc f.c -O3 -fdump-tree-all-all
f.c: In function ‘fn1’:
f.c:7:6: error: type mismatch in vector pack expression
 void fn1() {
  ^~~
vector(2) int
long int
long int
vect__3.16_63 = VEC_PACK_TRUNC_EXPR <_59, _61>;
during GIMPLE pass: vect
dump file: f.c.161t.vect
f.c:7:6: internal compiler error: verify_gimple failed
0xc2dcb8 verify_gimple_in_cfg(function*, bool)
/work/trunk/src/gcc/gcc/tree-cfg.c:5355
0xaee93a execute_function_todo
/work/trunk/src/gcc/gcc/passes.c:1994
0xaef385 execute_todo
/work/trunk/src/gcc/gcc/passes.c:2048
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug c++/85600] [9 Regression] CPU2006 471.omnetpp fails starting with r259771

2018-05-03 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85600

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sudi at gcc dot gnu.org

--- Comment #7 from sudi at gcc dot gnu.org ---
Also failing on aarch64 targets.

[Bug target/82989] [6/7 regression] Inexplicable use of NEON for 64-bit math

2018-04-04 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82989

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #26 from sudi at gcc dot gnu.org ---
Should be fixed on gcc-7 and gcc-6

[Bug target/81647] inconsistent LTGT behavior at different optimization levels on AArch64.

2018-04-04 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81647

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from sudi at gcc dot gnu.org ---
should be fixed now in both trunk and gcc-7-branch

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-29 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

--- Comment #14 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Thu Mar 29 09:27:53 2018
New Revision: 258949

URL: https://gcc.gnu.org/viewcvs?rev=258949&root=gcc&view=rev
Log:
[ARM][PR target/84826] Fix ICE in extract_insn, at recog.c:2304 on
arm-linux-gnueabihf

This patch backports r258777 and r258805 to gcc-7-branch
and gcc-6-branch. The same ICE occurs in both the branches with
-fstack-check. Thus the test case directive has been changed.

The discussion on the patch that went into trunk is:
https://gcc.gnu.org/ml/gcc-patches/2018-03/msg01120.html

ChangeLog entries:

*** gcc/ChangeLog ***

2018-03-29  Sudakshina Das  <sudi@arm.com>

Backport from mainline
2018-03-22  Sudakshina Das  <sudi@arm.com>

PR target/84826
* config/arm/arm.h (machine_function): Add static_chain_stack_bytes.
* config/arm/arm.c (arm_compute_static_chain_stack_bytes): Avoid
re-computing once computed.
(arm_expand_prologue): Compute machine->static_chain_stack_bytes.
(arm_init_machine_status): Initialize
machine->static_chain_stack_bytes.

*** gcc/testsuite/ChangeLog ***

2018-03-29  Sudakshina Das  <sudi@arm.com>

* gcc.target/arm/pr84826.c: Change dg-option to -fstack-check.

Backport from mainline
2018-03-23  Sudakshina Das  <sudi@arm.com>

PR target/84826
* gcc.target/arm/pr84826.c: Add dg directive.

Backport from mainline
2018-03-22  Sudakshina Das  <sudi@arm.com>

PR target/84826
* gcc.target/arm/pr84826.c: New test.

Added:
branches/gcc-6-branch/gcc/testsuite/gcc.target/arm/pr84826.c
Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/config/arm/arm.c
branches/gcc-6-branch/gcc/config/arm/arm.h
branches/gcc-6-branch/gcc/testsuite/ChangeLog

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-29 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

--- Comment #13 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Thu Mar 29 09:19:45 2018
New Revision: 258948

URL: https://gcc.gnu.org/viewcvs?rev=258948&root=gcc&view=rev
Log:
[ARM][PR target/84826] Fix ICE in extract_insn, at recog.c:2304 on
arm-linux-gnueabihf

This patch backports r258777 and r258805 to gcc-7-branch
and gcc-6-branch. The same ICE occurs in both the branches with
-fstack-check. Thus the test case directive has been changed.

The discussion on the patch that went into trunk is:
https://gcc.gnu.org/ml/gcc-patches/2018-03/msg01120.html

ChangeLog entries:

*** gcc/ChangeLog ***

2018-03-29  Sudakshina Das  <sudi@arm.com>

Backport from mainline
2018-03-22  Sudakshina Das  <sudi@arm.com>

PR target/84826
* config/arm/arm.h (machine_function): Add static_chain_stack_bytes.
* config/arm/arm.c (arm_compute_static_chain_stack_bytes): Avoid
re-computing once computed.
(arm_expand_prologue): Compute machine->static_chain_stack_bytes.
(arm_init_machine_status): Initialize
machine->static_chain_stack_bytes.

*** gcc/testsuite/ChangeLog ***

2018-03-29  Sudakshina Das  <sudi@arm.com>

* gcc.target/arm/pr84826.c: Change dg-option to -fstack-check.

Backport from mainline
2018-03-23  Sudakshina Das  <sudi@arm.com>

PR target/84826
* gcc.target/arm/pr84826.c: Add dg directive.

Backport from mainline
2018-03-22  Sudakshina Das  <sudi@arm.com>

PR target/84826
* gcc.target/arm/pr84826.c: New test.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/arm/pr84826.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/arm/arm.c
branches/gcc-7-branch/gcc/config/arm/arm.h
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-29 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

sudi at gcc dot gnu.org changed:

   What|Removed |Added

  Known to fail||6.4.1, 7.3.1

--- Comment #12 from sudi at gcc dot gnu.org ---
The same failure is happening on gcc-7 and gcc-6 with -fstack-check. Have sent
a backport request

[Bug target/81647] inconsistent LTGT behavior at different optimization levels on AArch64.

2018-03-28 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81647

--- Comment #11 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Wed Mar 28 10:15:47 2018
New Revision: 258917

URL: https://gcc.gnu.org/viewcvs?rev=258917&root=gcc&view=rev
Log:
[PR81647][AARCH64] Fix handling of Unordered Comparisons in aarch64-simd.md

This is a backport of r258653 and r258672.

ChangeLog Entries:

*** gcc/ChangeLog ***

2018-03-28  Sudakshina Das  <sudi@arm.com>

2018-03-19  Sudakshina Das  <sudi@arm.com>
PR target/81647

* config/aarch64/aarch64-simd.md (vec_cmp): Modify
instructions for UNLT, UNLE, UNGT, UNGE, UNEQ, UNORDERED and ORDERED.

*** gcc/testsuite/ChangeLog ***

2018-03-28  Sudakshina Das  <sudi@arm.com>
Christophe Lyon  <christophe.l...@linaro.org>

2018-03-20  Christophe Lyon  <christophe.l...@linaro.org>

PR target/81647
* gcc.target/aarch64/pr81647.c: Require fenv_exceptions.

2018-03-19  Sudakshina Das  <sudi@arm.com>

PR target/81647
* gcc.target/aarch64/pr81647.c: New.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/aarch64/pr81647.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/aarch64/aarch64-simd.md
branches/gcc-7-branch/gcc/testsuite/ChangeLog
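
For context on the comparison named in the bug title: LTGT is the "less than or greater than" comparison, which at the C level corresponds to the C99 macro `islessgreater`. It is false when the operands are equal and also when either is a NaN. A minimal scalar sketch follows; the wrapper name `ltgt` is hypothetical and this is not the content of pr81647.c.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical wrapper: true only if X < Y or X > Y.  Equal or unordered
   (NaN) operands give false.  */
bool ltgt(double x, double y)
{
    return islessgreater(x, y);
}
```

For example, `ltgt(1.0, 2.0)` is true, while `ltgt(2.0, 2.0)` and `ltgt(NAN, 1.0)` are both false. The vector patterns fixed by the commit need to give the same answers at every optimization level.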

[Bug target/82989] [6/7 regression] Inexplicable use of NEON for 64-bit math

2018-03-27 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82989

--- Comment #25 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Tue Mar 27 13:40:56 2018
New Revision: 258884

URL: https://gcc.gnu.org/viewcvs?rev=258884&root=gcc&view=rev
Log:
[ARM][PR82989] Fix unexpected use of NEON instructions for shifts

This is a backport of r258677 and r258723 of trunk.

*** gcc/ChangeLog ***

2018-03-27  Sudakshina Das  <sudi@arm.com>

Backport from mainline:
2018-03-20  Sudakshina Das  <sudi@arm.com>

PR target/82989
* config/arm/neon.md (ashldi3_neon): Update ?s for constraints
to favor GPR over NEON registers.
(di3_neon): Likewise.

*** gcc/testsuite/ChangeLog ***

2018-03-27  Sudakshina Das  <sudi@arm.com>

Backport from mainline:
2018-03-20  Sudakshina Das  <sudi@arm.com>

PR target/82989
* gcc.target/arm/pr82989.c: New test.

Backport from mainline:
2018-03-21  Sudakshina Das  <sudi@arm.com>

PR target/82989
* gcc.target/arm/pr82989.c: Change dg scan-assembly directives.

Added:
branches/gcc-6-branch/gcc/testsuite/gcc.target/arm/pr82989.c
Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/config/arm/neon.md
branches/gcc-6-branch/gcc/testsuite/ChangeLog

[Bug target/82989] [6/7 regression] Inexplicable use of NEON for 64-bit math

2018-03-27 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82989

--- Comment #24 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Tue Mar 27 13:26:56 2018
New Revision: 258883

URL: https://gcc.gnu.org/viewcvs?rev=258883=gcc=rev
Log:
[ARM][PR82989] Fix unexpected use of NEON instructions for shifts

This is a backport of r258677 and r258723 of trunk.

*** gcc/ChangeLog ***

2018-03-27  Sudakshina Das  <sudi@arm.com>

Backport from mainline:
2018-03-20  Sudakshina Das  <sudi@arm.com>

PR target/82989
* config/arm/neon.md (ashldi3_neon): Update ?s for constraints
to favor GPR over NEON registers.
(di3_neon): Likewise.

*** gcc/testsuite/ChangeLog ***

2018-03-27  Sudakshina Das  <sudi@arm.com>

Backport from mainline:
2018-03-20  Sudakshina Das  <sudi@arm.com>

PR target/82989
* gcc.target/arm/pr82989.c: New test.

Backport from mainline:
2018-03-21  Sudakshina Das  <sudi@arm.com>

PR target/82989
* gcc.target/arm/pr82989.c: Change dg scan-assembly directives.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/arm/pr82989.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/arm/neon.md
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug target/84882] -mstrict-align on aarch64 should not be RejectNegative

2018-03-27 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84882

--- Comment #2 from sudi at gcc dot gnu.org ---
Proposed patch

https://gcc.gnu.org/ml/gcc-patches/2018-03/msg01439.html

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-23 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

--- Comment #11 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Fri Mar 23 13:57:28 2018
New Revision: 258805

URL: https://gcc.gnu.org/viewcvs?rev=258805=gcc=rev
Log:
[ARM] Fix pr84826.c failure for thumb1

*** gcc/testsuite/ChangeLog ***

2018-03-23  Sudakshina Das  <sudi@arm.com>

PR target/84826
* gcc.target/arm/pr84826.c: Add dg directive.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/arm/pr84826.c

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-22 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

--- Comment #10 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Thu Mar 22 17:24:41 2018
New Revision: 258777

URL: https://gcc.gnu.org/viewcvs?rev=258777=gcc=rev
Log:
[ARM][PR target/84826] Fix ICE in extract_insn, at recog.c:2304 on
arm-linux-gnueabi

The ICE in the bug report was happening because the macro
USE_RETURN_INSN (FALSE) was returning different values at different points
in the compilation. This was internally occurring because the function
arm_compute_static_chain_stack_bytes () which was dependent on
arm_r3_live_at_start_p () was giving a different value after the cond_exec
instructions were created in ce3 causing the liveness of r3 to escape up
to the start block.

The function arm_compute_static_chain_stack_bytes () should really only
compute the value once during the prologue/epilogue stage. This patch introduces
a new member 'static_chain_stack_bytes' to the target definition of the
struct machine_function which gets calculated in expand_prologue and is the
value that is returned by arm_compute_static_chain_stack_bytes () beyond that.
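
For readers who want the caching idea in isolation, here is a minimal sketch in plain C. It is not the committed arm.c code: the struct, the helper names and the use of -1 as a "not computed yet" sentinel are assumptions made for illustration, and only the field name static_chain_stack_bytes comes from the log above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-function target state.  Only the
   field name is taken from the commit message.  */
struct machine_function_sketch
{
  int static_chain_stack_bytes;   /* -1 means "not computed yet" (assumed).  */
};

/* Hypothetical stand-in for an analysis result that may drift between
   passes, the way arm_r3_live_at_start_p () did after ce3.  */
static bool r3_looks_live = false;

/* Recompute from whatever the analysis currently says.  */
static int
compute_uncached (void)
{
  return r3_looks_live ? 4 : 0;
}

/* Return the frozen value if there is one, otherwise recompute.  */
int
static_chain_stack_bytes (struct machine_function_sketch *m)
{
  if (m->static_chain_stack_bytes != -1)
    return m->static_chain_stack_bytes;
  return compute_uncached ();
}

/* Called once at prologue time: freeze the answer.  */
void
expand_prologue_sketch (struct machine_function_sketch *m)
{
  m->static_chain_stack_bytes = compute_uncached ();
}

/* Later liveness changes must not alter the frozen value.  */
int
value_after_liveness_change (void)
{
  struct machine_function_sketch m = { -1 };
  r3_looks_live = false;
  expand_prologue_sketch (&m);
  assert (static_chain_stack_bytes (&m) == 0);
  r3_looks_live = true;           /* Simulates the drift after ce3.  */
  return static_chain_stack_bytes (&m);
}
```

Calling value_after_liveness_change () returns the value frozen at prologue time even though the simulated liveness query changes afterwards, which is the property the fix relies on.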

ChangeLog entries:

*** gcc/ChangeLog ***

2018-03-22  Sudakshina Das  <sudi@arm.com>

PR target/84826
* config/arm/arm.h (machine_function): Add static_chain_stack_bytes.
* config/arm/arm.c (arm_compute_static_chain_stack_bytes): Avoid
re-computing once computed.
(arm_expand_prologue): Compute machine->static_chain_stack_bytes.
(arm_init_machine_status): Initialize
machine->static_chain_stack_bytes.

*** gcc/testsuite/ChangeLog ***

2018-03-22  Sudakshina Das  <sudi@arm.com>

PR target/84826
* gcc.target/arm/pr84826.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/arm/pr84826.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c
trunk/gcc/config/arm/arm.h
trunk/gcc/testsuite/ChangeLog

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-21 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

--- Comment #9 from sudi at gcc dot gnu.org ---
Proposed patch
https://gcc.gnu.org/ml/gcc-patches/2018-03/msg01120.html

[Bug target/82989] [6/7/8 regression] Inexplicable use of NEON for 64-bit math

2018-03-21 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82989

--- Comment #22 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Wed Mar 21 17:14:48 2018
New Revision: 258723

URL: https://gcc.gnu.org/viewcvs?rev=258723=gcc=rev
Log:
[ARM] Fix test pr82989.c for big endian and mthumb

The test pr82989.c, which was added in one of the previous commits, is failing for
mthumb and big-endian configurations. The aim of this test was to check that
NEON instructions are not being used for simple shift operations. The scanning
of lsl and lsr instructions and checking their counts was just too restrictive
for different configurations. So I have now simplified the test to only check
for the absence of NEON instructions.
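
To illustrate what a test of this shape can look like, here is a hypothetical sketch. It is not the contents of pr82989.c: the function, the options and the mnemonics being scanned for are assumptions. Only the directive forms (dg-do, dg-options, dg-final with scan-assembler-not) are the usual ones from the GCC testsuite.

```c
/* { dg-do compile } */
/* { dg-options "-O2" } */

/* Hypothetical 64-bit shift that should stay in core registers.  */
unsigned long long
shift_left_by_one (unsigned long long x)
{
  return x << 1;
}

/* Only check that no NEON shift shows up, instead of counting lsl/lsr,
   so the test does not depend on endianness or ARM/Thumb mode.
   The mnemonics below are assumed for illustration.  */
/* { dg-final { scan-assembler-not "vshl" } } */
/* { dg-final { scan-assembler-not "vshr" } } */
```

Scanning for the absence of a pattern is less sensitive to configuration than counting the expected instructions, which is the trade-off described above.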

*** gcc/testsuite/ChangeLog ***

2018-03-21  Sudakshina Das  <sudi@arm.com>

PR target/82989
* gcc.target/arm/pr82989.c: Change dg scan-assembly directives.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/arm/pr82989.c

[Bug target/84882] -mstrict-align on aarch64 should not be RejectNegative

2018-03-21 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84882

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||sudi at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |sudi at gcc dot gnu.org

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-20 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

--- Comment #8 from sudi at gcc dot gnu.org ---
(In reply to Wilco from comment #5)
> It seems a latent bug in arm_r3_live_at_start_p which now triggers much more
> often due to stack clash protection:
> 
>   if (IS_NESTED (arm_current_func_type ())
>   && ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
>   || ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
>|| flag_stack_clash_protection)
>   && !df_regs_ever_live_p (LR_REGNUM)))
>   && arm_r3_live_at_start_p ()
>   && crtl->args.pretend_args_size == 0)
> 
> Given that liveness can't guarantee dead registers won't look live at start,
> the r3_live_at_start should really be about function parameters which is a
> fixed concept. Is there no query that can accurately tell you which
> registers are used for parameters in the current function?
> 
> For GCC9 we need to redesign this whole area - most of the above checks are
> quite inaccurate (for example a temporary is only used for stack checking if
> the stack size is > 16KB), copy and pasted multiple times in slightly
> different ways, and not cached when computing the frame layout like on
> AArch64.
> 
> However a quick workaround for GCC8 would be to assume
> arm_r3_live_at_start_p is always true in the above code. Also we should
> never change the generated code in functions which do not require stack
> checking, so changing the stack checking enabled test to framesize > 16KB
> would be the right thing to do.

I have created a new report PR 85005 for this cleanup. For now I am only making
enough changes to get rid of the ICE.

[Bug target/85005] Redesign and cleanup arm.c wrt to flag_stack_clash_protection and flag_stack_check

2018-03-20 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85005

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Target||arm*-*-*
 CC||wilco at gcc dot gnu.org
   Target Milestone|--- |9.0

[Bug target/85005] New: Redesign and cleanup arm.c wrt to flag_stack_clash_protection and flag_stack_check

2018-03-20 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85005

Bug ID: 85005
   Summary: Redesign and cleanup arm.c wrt to
flag_stack_clash_protection and flag_stack_check
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sudi at gcc dot gnu.org
  Target Milestone: ---

I am creating this for GCC9 as a follow-up on PR 84826 comment 5 by Wilco.
There are several places where the following code is checked.

  if (IS_NESTED (arm_current_func_type ())
  && ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
  || ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
   || flag_stack_clash_protection)
  && !df_regs_ever_live_p (LR_REGNUM)))
  && arm_r3_live_at_start_p ()
  && crtl->args.pretend_args_size == 0)

Most of the time there are also slight variations on these checks. The flags
being checked (flag_stack_check == STATIC_BUILTIN_STACK_CHECK ||
flag_stack_clash_protection) are also probably not used correctly. This should
be tightened to only have any effect if the frame size is more than 16KB. In
all other cases these flags do not matter. This piece of code is also not
tested, given the number of very specific checks involved and also the fact
that check_effective_target_supports_stack_clash_protection does not list any
Arm backends.
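
As a rough illustration of the cleanup being asked for, the repeated condition could be factored into one predicate that also applies the frame-size cut-off. This is only a sketch in standalone C: the struct, the field names, the helper name and the threshold macro are all hypothetical and do not exist in arm.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the 16KB figure from the report, in bytes.  */
#define STACK_CHECK_FRAME_THRESHOLD (16 * 1024)

/* Hypothetical snapshot of the facts the repeated condition inspects.  */
struct frame_facts
{
  bool is_nested;               /* IS_NESTED (arm_current_func_type ())  */
  bool apcs_frame_in_arm_mode;  /* TARGET_APCS_FRAME && frame_pointer_needed
                                   && TARGET_ARM  */
  bool stack_check_enabled;     /* Static stack check or stack clash
                                   protection requested.  */
  bool lr_ever_live;            /* df_regs_ever_live_p (LR_REGNUM)  */
  bool r3_live_at_start;        /* arm_r3_live_at_start_p ()  */
  long pretend_args_size;
  long frame_size;
};

/* One place for the check, so all callers agree.  */
bool
static_chain_needs_stack_slot (const struct frame_facts *f)
{
  /* The stack checking flags only matter for large frames.  */
  bool stack_check_matters = f->stack_check_enabled
                             && f->frame_size > STACK_CHECK_FRAME_THRESHOLD;

  return f->is_nested
         && (f->apcs_frame_in_arm_mode
             || (stack_check_matters && !f->lr_ever_live))
         && f->r3_live_at_start
         && f->pretend_args_size == 0;
}

/* Usage: a small frame makes the stack checking flags irrelevant.  */
void
demo_frame_threshold (void)
{
  struct frame_facts f = { true, false, true, false, true, 0, 4096 };
  assert (!static_chain_needs_stack_slot (&f));
  f.frame_size = 32 * 1024;
  assert (static_chain_needs_stack_slot (&f));
}
```

Keeping the condition in a single helper would also make it easier to cover with a dedicated test.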

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-20 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

--- Comment #6 from sudi at gcc dot gnu.org ---
(In reply to Eric Botcazou from comment #4)
> > So I looked into this. Turns out the actual issue is that USE_RETURN_INSN
> > (FALSE) changes its value and becomes false after pass ce3.
> > 
> > According to what I can see, arm_r3_live_at_start_p() starts to return true
> > after ce3.
> 
> Right and I think that we can live that.
> 
> > My question is:
> > 1) Is there any easy way to avoid the false positives from
> > arm_r3_live_at_start_p()
> 
> I don't think so.
> 
> > 2) Why is r3 still live at IN of BB2 when there is no reaching definition of
> > it? I mean also there will never be any reaching definition at IN of BB2.
> 
> Because r3 is an argument register so it has got an artificial def on entry:
> 
> ;;  entry block defs   0 [r0] 1 [r1] 2 [r2] 3 [r3] 12 [ip] 13 [sp] 14 [lr]
> 

Oops, I think I missed the artificial defs. Then the liveness makes sense. Out
of curiosity, why are all the argument registers defined? This function, for
instance, does not need 4 argument registers.

> 
> IMO the issue is that arm_compute_static_chain_stack_bytes changes its value
> and this doesn't make sense since whether or not the static chain is pushed
> onto the stack is decided once for all at prologue time.  So the fix is
> probably to cache the value of the function in the machine_function
> structure and return this cached value after prologue/epilogue generation
> (e.g. epilogue_completed == 1).

Thanks, I think this approach solves the problem. I am currently testing a patch
for this.

[Bug target/82989] [6/7/8 regression] Inexplicable use of NEON for 64-bit math

2018-03-20 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82989

--- Comment #21 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Tue Mar 20 10:54:42 2018
New Revision: 258677

URL: https://gcc.gnu.org/viewcvs?rev=258677=gcc=rev
Log:
[ARM][PR82989] Fix unexpected use of NEON instructions for shifts

This patch fixes PR82989 so that we avoid NEON instructions when
-mneon-for-64bits is not enabled. This is more of a short term fix
for the real deeper problem of making an early decision of choosing
or rejecting NEON instructions. There is now a new ticket PR84467 to
deal with the longer term solution.
(Please refer to the discussion in the bug report for more details).

Sudi

*** gcc/ChangeLog ***

2018-03-20  Sudakshina Das  <sudi@arm.com>

PR target/82989
* config/arm/neon.md (ashldi3_neon): Update ?s for constraints
to favor GPR over NEON registers.
(di3_neon): Likewise.

*** gcc/testsuite/ChangeLog ***

2018-03-20  Sudakshina Das  <sudi@arm.com>

PR target/82989
* gcc.target/arm/pr82989.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/arm/pr82989.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/neon.md
trunk/gcc/testsuite/ChangeLog

[Bug target/81647] inconsistent LTGT behavior at different optimization levels on AArch64.

2018-03-19 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81647

--- Comment #9 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Mon Mar 19 18:50:32 2018
New Revision: 258653

URL: https://gcc.gnu.org/viewcvs?rev=258653=gcc=rev
Log:
[PR81647][AARCH64] Fix handling of Unordered Comparisons in aarch64-simd.md

This patch fixes the inconsistent behavior observed at -O3 for the unordered
comparisons. According to the online docs
(https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gccint/Unary-and-Binary-Expressions.html),
all of the following should not raise an FP exception:
- UNGE_EXPR
- UNGT_EXPR
- UNLE_EXPR
- UNLT_EXPR
- UNEQ_EXPR
Also ORDERED_EXPR and UNORDERED_EXPR should only return zero or one.

The aarch64-simd.md handling of these was generating exception-raising
instructions such as fcmgt. This patch changes the instructions that are
emitted so that they do not raise the exceptions. We first check each
operand for NaNs and force any elements containing NaN to zero before using
them in the compare.

Example: UN<cc> (a, b) -> UNORDERED (a, b)
  | (cm<cc> (isnan (a) ? 0.0 : a, isnan (b) ? 0.0 : b))


The ORDERED_EXPR is now handled as (cmeq (a, a) & cmeq (b, b)) and
UNORDERED_EXPR as ~ORDERED_EXPR and UNEQ as (~ORDERED_EXPR | cmeq (a,b)).
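
The scheme can be modelled on scalars in plain C. This is only a sketch of the logic, not the RTL emitted by aarch64-simd.md: the function names are made up for illustration, and the self-comparison (x == x) stands in for the cmeq-based NaN test.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* ORDERED: each operand equals itself, which is false only for NaN.
   Mirrors (cmeq (a, a) & cmeq (b, b)).  */
bool
ordered (double a, double b)
{
  return (a == a) & (b == b);
}

/* UNORDERED is the complement of ORDERED.  */
bool
unordered (double a, double b)
{
  return !ordered (a, b);
}

/* Force a NaN operand to 0.0 so the ordering compare never sees a NaN.  */
static double
zero_if_nan (double x)
{
  return (x == x) ? x : 0.0;
}

/* UNGT: true if unordered, or if a > b.  */
bool
ungt (double a, double b)
{
  return unordered (a, b) | (zero_if_nan (a) > zero_if_nan (b));
}

/* UNEQ: true if unordered, or if a == b.  */
bool
uneq (double a, double b)
{
  return unordered (a, b) | (a == b);
}

/* Usage: a NaN operand makes every UN* predicate true.  */
void
demo_unordered (void)
{
  assert (ungt (NAN, 1.0));
  assert (uneq (1.0, NAN));
  assert (!ungt (1.0, 2.0));
}
```

When both operands are ordered, zero_if_nan is the identity, so the result matches the plain comparison. When either is NaN, the unordered term already decides the result and the zeroed compare cannot change it.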

ChangeLog Entries:

*** gcc/ChangeLog ***

2018-03-19  Sudakshina Das  <sudi@arm.com>

PR target/81647
* config/aarch64/aarch64-simd.md (vec_cmp): Modify
instructions for UNLT, UNLE, UNGT, UNGE, UNEQ, UNORDERED and ORDERED.

*** gcc/testsuite/ChangeLog ***

2018-03-19  Sudakshina Das  <sudi@arm.com>

PR target/81647
* gcc.target/aarch64/pr81647.c: New.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr81647.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64-simd.md
trunk/gcc/testsuite/ChangeLog

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-19 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ebotcazou at gcc dot gnu.org

--- Comment #3 from sudi at gcc dot gnu.org ---
So I looked into this. Turns out the actual issue is that USE_RETURN_INSN
(FALSE) changes its value and becomes false after pass ce3.

According to what I can see, arm_r3_live_at_start_p() starts to return true
after ce3.

+static bool
+arm_r3_live_at_start_p (void)
+{
+  /* Just look at cfg info, which is still close enough to correct at this
+ point.  This gives false positives for broken functions that might use
+ uninitialized data that happens to be allocated in r3, but who cares?  */

This particular test cares :P

+  return REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)), 3);
+}

r3 in this test case is being allocated for b and after ifcvt, it becomes
partially defined and thus the liveness does not get killed. (Look below for
excerpt from ce3)

;; lr  in  0 [r0] 3 [r3] 12 [ip] 13 [sp] 14 [lr]
;; lr  use   0 [r0] 3 [r3] 12 [ip] 13 [sp]
;; lr  def   100 [cc]
;; live  in  0 [r0] 3 [r3] 12 [ip] 13 [sp] 14 [lr]
;; live  gen 3 [r3] 100 [cc]
;; live  kill
(note 5 1 47 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 47 5 2 2 NOTE_INSN_PROLOGUE_END)
(note 2 47 4 2 NOTE_INSN_DELETED)
(note 4 2 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 4 10 2 (set (reg:CC 100 cc)
(compare:CC (reg:SI 0 r0 [ cD.5556 ])
(const_int 0 [0]))) "ice.i":6 193 {*arm_cmpsi_insn}
 (expr_list:REG_DEAD (reg:SI 0 r0 [ cD.5556 ])
(nil)))
(insn 10 7 11 2 (cond_exec (ne (reg:CC 100 cc)
(const_int 0 [0]))
(set (reg:SI 3 r3 [orig:115 CHAIN.1_5(D)->bD.5567 ] [115])
(mem/j:SI (reg/f:SI 12 ip [orig:113 CHAIN.1D.5566 ] [113]) [1
CHAIN.1_5(D)->bD.5567+0 S4 A32]))) "ice.i":7 4068 {*p *arm_movsi_vfp}
 (expr_list:REG_EQUIV (mem/j:SI (reg/f:SI 12 ip [orig:113 CHAIN.1D.5566 ]
[113]) [1 CHAIN.1_5(D)->bD.5567+0 S4 A32])
(nil)))

My question is:
1) Is there any easy way to avoid the false positives from
arm_r3_live_at_start_p()
2) Why is r3 still live at IN of BB2 when there is no reaching definition of
it? I mean also there will never be any reaching definition at IN of BB2. So
shouldn't all the liveness (barring the artificially created ones for the
prologues/stack requirements) be killed there? I know I may be oversimplifying
all the intricate details of liveness analysis when I ask this question, but I
am looking for some help here!

[Bug target/84826] ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabi

2018-03-19 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84826

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||sudi at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |sudi at gcc dot gnu.org

[Bug target/84521] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-03-14 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #25 from sudi at gcc dot gnu.org ---
Proposed patch. This obviously does not solve all the issues

https://gcc.gnu.org/ml/gcc-patches/2018-03/msg00668.html

[Bug target/82989] [6/7/8 regression] Inexplicable use of NEON for 64-bit math

2018-03-14 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82989

--- Comment #20 from sudi at gcc dot gnu.org ---
Proposed patch
https://gcc.gnu.org/ml/gcc-patches/2018-03/msg00644.html

[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-03-07 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #17 from sudi at gcc dot gnu.org ---
I looked up what other targets were doing and one thing I found interesting
was that a lot of them are defining the target hook
TARGET_BUILTIN_SETJMP_FRAME_VALUE. In the AArch64 case I am suggesting to define it
to return the hard frame pointer. That seems to solve the issue with both the
attached test case and the test that Wilco mentioned.

Does this look like it solves the "mid-end versus back-end: who fixes this issue"
problem?

I am still pretty new to knowing how the stack should actually look. So calling
for some help!

Sudi

[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-03-06 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

--- Comment #16 from sudi at gcc dot gnu.org ---
So I think I would go with Jakub's suggestion of defining calls_builtin_setjmp
and using that in aarch64_layout_frame for cfun->machine->frame.emit_frame_chain.
I am still investigating Wilco's concern over pr60003.c

Also I would like to ask if we can pick up the discussions on PR59039 and on
the documentation patch. Some definition would be a lot more helpful for
someone like me who had no idea about __builtin_setjmp/longjmp to wrap my head
around what it actually is.

The macro DONT_USE_BUILTIN_SETJMP made me very confused for a while until Ramana
pointed out to me that it's only used in the context of the compiler's exception
unwinding machinery and does not apply to a case like this where the user calls
the builtin function. With this context in mind, when I went back and read the
document, it made sense. Can we possibly tweak the wording to make it clearer?
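
For anyone else new to these builtins, this is roughly what the user-level case looks like. It is a minimal sketch, assuming a compiler that provides __builtin_setjmp and __builtin_longjmp; the function and variable names are made up. It is not the test case attached to this PR.

```c
#include <assert.h>

/* The builtins use a five-word buffer rather than the library jmp_buf.  */
static void *jump_buffer[5];

/* Counts how far execution got.  Volatile because it is read again
   after the non-local jump.  */
static volatile int steps;

/* __builtin_longjmp has to be called from a different function than
   the one that called __builtin_setjmp, and its second argument
   must be 1.  */
__attribute__ ((noinline)) static void
jump_back (void)
{
  steps++;
  __builtin_longjmp (jump_buffer, 1);
}

int
run_setjmp_demo (void)
{
  steps = 0;
  if (__builtin_setjmp (jump_buffer) == 0)
    {
      /* Direct return from __builtin_setjmp.  */
      steps++;
      jump_back ();
      assert (0 && "not reached");
    }
  /* Reached through __builtin_longjmp.  */
  return steps;
}
```

The frame pointer value saved in the buffer is what the jump restores, so the code after the jump is only correct if that value matches what the function body expects.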

[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer

2018-03-06 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug target/81228] [7 Regression] ICE in gen_vec_cmpv2dfv2di, at config/aarch64/aarch64-simd.md:2508

2018-02-22 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81228

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from sudi at gcc dot gnu.org ---
Backported to gcc-7 as r257901

[Bug target/81228] [7 Regression] ICE in gen_vec_cmpv2dfv2di, at config/aarch64/aarch64-simd.md:2508

2018-02-22 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81228

--- Comment #10 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Thu Feb 22 15:01:05 2018
New Revision: 257901

URL: https://gcc.gnu.org/viewcvs?rev=257901=gcc=rev
Log:
Adding the missing LTGT to plug the ICE in PR81228.
This is a backport of r255625 of trunk.

*** gcc/ChangeLog ***

2018-02-22  Sudakshina Das  <sudi@arm.com>
Bin Cheng  <bin.ch...@arm.com>

Backport from mainline:
2017-12-14  Sudakshina Das  <sudi@arm.com>
Bin Cheng  <bin.ch...@arm.com>

PR target/81228
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Move LTGT to
CCFPEmode.
* config/aarch64/aarch64-simd.md (vec_cmp): Add
LTGT.

*** gcc/testsuite/ChangeLog ***

2017-02-22  Sudakshina Das  <sudi@arm.com>

Backport from mainline:
2017-12-14  Sudakshina Das  <sudi@arm.com>

PR target/81228
* gcc.dg/pr81228.c: New.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.dg/pr81228.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/aarch64/aarch64-simd.md
branches/gcc-7-branch/gcc/config/aarch64/aarch64.c
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug target/82096] [6 Regression] ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2018-02-21 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82096

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #14 from sudi at gcc dot gnu.org ---
Backported to gcc-6

[Bug target/82096] [6 Regression] ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2018-02-21 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82096

--- Comment #13 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Wed Feb 21 12:50:31 2018
New Revision: 257871

URL: https://gcc.gnu.org/viewcvs?rev=257871=gcc=rev
Log:
Fix emit_store_flag_force () function to fix ICE in int_mode_for_mode,
at stor-layout.c:403 with arm-linux-gnueabi.

*** gcc/ChangeLog ***

2018-02-21  Sudakshina Das  <sudi@arm.com>

Backport from trunk
2018-01-10  Sudakshina Das  <sudi@arm.com>

PR target/82096
* expmed.c (emit_store_flag_force): Swap if const op0
and change VOIDmode to mode of op0.

*** gcc/testsuite/ChangeLog ***

2018-02-21  Sudakshina Das  <sudi@arm.com>

Backport from trunk
2018-01-12  Sudakshina Das  <sudi@arm.com>

* gcc.c-torture/compile/pr82096.c: Add dg-skip-if
directive.

Backport from trunk
2018-01-10  Sudakshina Das  <sudi@arm.com>

PR target/82096
* gcc.c-torture/compile/pr82096.c: New test.

Added:
branches/gcc-6-branch/gcc/testsuite/gcc.c-torture/compile/pr82096.c
Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/expmed.c
branches/gcc-6-branch/gcc/testsuite/ChangeLog

[Bug target/82989] [7/8 regression] Inexplicable use of NEON for 64-bit math

2018-02-19 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82989

--- Comment #18 from sudi at gcc dot gnu.org ---
Created bug 84467 to continue discussions about the early expand phase
decisions to choose or reject NEON operations

[Bug target/84467] Choosing between Integer and NEON for 64-bit operations

2018-02-19 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84467

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ktkachov at gcc dot gnu.org,
   ||matthijsvanduin at gmail dot com,
   ||wilco at gcc dot gnu.org
   Target Milestone|--- |9.0

[Bug target/84467] New: Choosing between Integer and NEON for 64-bit operations

2018-02-19 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84467

Bug ID: 84467
   Summary: Choosing between Integer and NEON for 64-bit
operations
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sudi at gcc dot gnu.org
  Target Milestone: ---

This is a follow-up report to bug 82989.
Bug 82989, comment 12 details the need for early decisions to
be made about choosing to take either NEON code or ARM code. This means that at
the expand phase, we should be able to make a clear choice and avoid mixing the
two.

[Bug target/82989] [7/8 regression] Inexplicable use of NEON for 64-bit math

2018-02-19 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82989

--- Comment #17 from sudi at gcc dot gnu.org ---
Since this looks like a pretty invasive problem, according to my discussions
with Wilco and Kyrill, I think I will try to propose a smaller, but temporary
fix using the ?s and special casing 32 for this PR (which could go in sooner).
I will also open a new PR to handle this at the expand phase and clean up the
code aimed at gcc 9.

[Bug target/82096] [6/7 Regression] ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2018-02-16 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82096

--- Comment #11 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Fri Feb 16 15:37:35 2018
New Revision: 257741

URL: https://gcc.gnu.org/viewcvs?rev=257741=gcc=rev
Log:
Fix emit_store_flag_force () function to fix ICE in int_mode_for_mode,
at stor-layout.c:403 with arm-linux-gnueabi.

*** gcc/ChangeLog ***

2018-02-16  Sudakshina Das  <sudi@arm.com>

Backport from trunk
2018-01-10  Sudakshina Das  <sudi@arm.com>

PR target/82096
* expmed.c (emit_store_flag_force): Swap if const op0
and change VOIDmode to mode of op0.

*** gcc/testsuite/ChangeLog ***

2018-02-16  Sudakshina Das  <sudi@arm.com>

Backport from trunk
2018-01-12  Sudakshina Das  <sudi@arm.com>

* gcc.c-torture/compile/pr82096.c: Add dg-skip-if
directive.

Backport from trunk
2018-01-10  Sudakshina Das  <sudi@arm.com>

PR target/82096
* gcc.c-torture/compile/pr82096.c: New test.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.c-torture/compile/pr82096.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/expmed.c
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug target/82989] [7/8 regression ] Inexplicable use of NEON for 64-bit math

2018-02-16 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82989

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug target/84363] [8 Regression] Assembler error in stage1/libgcc: Error: view number mismatch

2018-02-13 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84363

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sudi at gcc dot gnu.org

--- Comment #1 from sudi at gcc dot gnu.org ---
This looks similar to PR84342

[Bug target/83915] FAIL: gcc.target/aarch64/sve/extract_1.c -march=armv8.2-a+sve (internal compiler error)

2018-02-13 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83915

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sudi at gcc dot gnu.org

--- Comment #1 from sudi at gcc dot gnu.org ---
This should have been fixed with r257178

[Bug target/83712] [6/7/8 Regression] "Unable to find a register to spill" when compiling for thumb1

2018-01-12 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83712

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||law at redhat dot com,
   ||vmakarov at redhat dot com

--- Comment #4 from sudi at gcc dot gnu.org ---
What I have observed so far is that the failure occurs based on how the
scheduler (sched1) chooses to schedule the movmem12b instructions (insn 16 in
all the cases below). If that instruction is scheduled a bit later (even by one
instruction), it's all good!

Even though the movmem12b instruction has a very heavy demand on the registers,
shouldn't the allocator and/or the scheduler be able to detect that? Is this a
scheduler problem or an allocator problem or neither?

Example Passing cases:

-mfloat-abi=softfp -mthumb -march=armv6 m-bug.c -O2 -S -fdump-rtl-sched1
;; Pressure summary: GENERAL_REGS:8

;;0--> b  0: i  13 r119=[`*.LC1'] 
:(l_a+e_1),l_dc1,l_dc2,l_wb:GENERAL_REGS+1(1)
;;1--> b  0: i  12 r118=sfp-0x10  
:e_1,e_2,e_3,e_wb:@GENERAL_REGS+1(1)
;;2--> b  0: i   2 r111=r0
:e_1,e_2,e_3,e_wb:@GENERAL_REGS+1(0):model 0
;;3--> b  0: i  16
{[r118]=[r119];[r118+0x4]=[r119+0x4];[r118+0x8]=[r119+0x8];r120=r118+0xc;r121=r119+0xc;clobber
scratch;clobber scratch;clobber
scratch;}:(l_a+e_1),l_dc1*2,l_dc2,l_wb:GENERAL_REGS+2(1)
...
...

Example Failing case:

-mfloat-abi=softfp -mthumb -march=armv6 m-bug.c -O2 -S -fdump-rtl-sched1
-mtune=cortex-m0plus
;; Pressure summary: GENERAL_REGS:8

;;0--> b  0: i  13 r119=[`*.LC1'] 
:core:GENERAL_REGS+1(1)
;;1--> b  0: i  12 r118=sfp-0x10  
:core:@GENERAL_REGS+1(1)
;;2--> b  0: i  16
{[r118]=[r119];[r118+0x4]=[r119+0x4];[r118+0x8]=[r119+0x8];r120=r118+0xc;r121=r119+0xc;clobber
scratch;clobber scratch;clobber scratch;}:core*4:GENERAL_REGS+2(1)
...
...

Other passing option:
-mfloat-abi=softfp -mthumb -march=armv6 m-bug.c -O2 -S -fdump-rtl-sched1
-mtune=cortex-m7

Other failing option:
-mfloat-abi=softfp -mthumb -march=armv6 m-bug.c -O2 -S -fdump-rtl-sched1
-mtune=cortex-m4

[Bug target/83712] [6/7/8 Regression] "Unable to find a register to spill" when compiling for thumb1

2018-01-11 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83712

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||sudi at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |sudi at gcc dot gnu.org

[Bug target/82096] ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2018-01-11 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82096

--- Comment #9 from sudi at gcc dot gnu.org ---
Yes, at least 6 and 7 need backports. Haven't gone beyond that yet.

[Bug target/82096] ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2018-01-11 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82096

--- Comment #7 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Thu Jan 11 10:46:59 2018
New Revision: 256526

URL: https://gcc.gnu.org/viewcvs?rev=256526=gcc=rev
Log:
[PR82096] Fix ICE in int_mode_for_mode with arm-linux-gnueabi

The bug reported a particular test di-longlong64-sync-1.c failing when run
on arm-linux-gnueabi with options -mthumb -march=armv5t -O[g,1,2,3] and
-mthumb -march=armv6 -O[g,1,2,3].

The crash was caused by the explicit VOIDmode argument that is passed to
emit_store_flag_force (), which was not handling VOIDmode adequately. This
patch fixes that.

ChangeLog entries:

*** gcc/ChangeLog ***

2017-01-11  Sudakshina Das  <sudi@arm.com>

PR target/82096
* expmed.c (emit_store_flag_force): Swap if const op0
and change VOIDmode to mode of op0.

*** gcc/testsuite/ChangeLog ***

2017-01-11  Sudakshina Das  <sudi@arm.com>

PR target/82096
* gcc.c-torture/compile/pr82096.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/compile/pr82096.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/expmed.c
trunk/gcc/testsuite/ChangeLog

[Bug target/82439] Missing (x | y) == x simplifications

2018-01-05 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82439

--- Comment #7 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Fri Jan  5 10:45:37 2018
New Revision: 256275

URL: https://gcc.gnu.org/viewcvs?rev=256275=gcc=rev
Log:
[PATCH PR82439][simplify-rtx] Simplify (x | y) == x -> (y & ~x) == 0

This patch adds support for the missing transformation
(x | y) == x -> (y & ~x) == 0. The transformation for the (x & y) == x case
has existed in simplify-rtx.c since 2014 (r218503), and this patch
only adds a couple of extra patterns for the IOR case. This lets
targets that have the BICS instruction generate better code. For
targets that do not have the BICS instruction, code generation is
no worse and still takes 2 instructions.

ChangeLog Entries:

*** gcc/ChangeLog ***

2018-01-05  Sudakshina Das  <sudi@arm.com>

PR target/82439
* simplify-rtx.c (simplify_relational_operation_1): Add simplifications
of (x|y) == x for BICS pattern.

*** gcc/testsuite/ChangeLog ***

2018-01-05  Sudakshina Das  <sudi@arm.com>

PR target/82439
* gcc.target/aarch64/bics_5.c: New test.
* gcc.target/arm/bics_5.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/bics_5.c
trunk/gcc/testsuite/gcc.target/arm/bics_5.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/simplify-rtx.c
trunk/gcc/testsuite/ChangeLog

[Bug target/82096] ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2018-01-04 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82096

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed|2017-12-19 00:00:00 |2018-01-04
 Ever confirmed|0   |1

--- Comment #6 from sudi at gcc dot gnu.org ---
Patch submitted 

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00219.html

[Bug target/82439] Missing (x | y) == x simplifications

2018-01-03 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82439

--- Comment #5 from sudi at gcc dot gnu.org ---
Patch submitted

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00139.html

[Bug target/81472] gcc.dg/torture/pr52028.c failed on armeb big-endian

2017-12-29 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81472

--- Comment #1 from sudi at gcc dot gnu.org ---
I see all execute tests failing with r256033

FAIL: gcc.dg/torture/pr52028.c   -O1  execution test
FAIL: gcc.dg/torture/pr52028.c   -O2  execution test
FAIL: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
FAIL: gcc.dg/torture/pr52028.c   -O3 -g  execution test
FAIL: gcc.dg/torture/pr52028.c   -Os  execution test
FAIL: gcc.dg/torture/pr52028.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  execution test
FAIL: gcc.dg/torture/pr52028.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  execution test

[Bug target/82074] [aarch64] vmlsq_f32 compiled into 2 instructions

2017-12-29 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82074

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sudi at gcc dot gnu.org

--- Comment #3 from sudi at gcc dot gnu.org ---
I think this is fixed on trunk as of r254905 and needs a backport to older
branches.

2017-11-17  Steve Ellcey  <sell...@cavium.com>

* config/aarch64/aarch64-simd.md (fnma4): Move neg operator
to canonical location.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@254905
138bc75d-0d04-0410-961f-82ee72b054a4

[Bug target/82439] Missing (x | y) == x simplifications

2017-12-29 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82439

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sudi at gcc dot gnu.org

--- Comment #4 from sudi at gcc dot gnu.org ---
The transformation
(x & y) == y -> (y & ~x) == 0
has existed since 2014 (r218503).
It is done in simplify-rtx.c. If that looks like a reasonable approach, I am
testing a similar patch for the (x | y) == x -> (y & ~x) == 0 transformation.
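
For illustration only, the two identities mentioned above can be checked
exhaustively on 8-bit values. This is a minimal sketch and not part of the
patch; the function names (ior_form, and_form, bic_form, count_mismatches) are
hypothetical and do not appear in simplify-rtx.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* IOR form: OR-ing y into x leaves x unchanged.  */
bool ior_form(uint8_t x, uint8_t y) { return (x | y) == x; }

/* AND form: masking y with x leaves y unchanged.  */
bool and_form(uint8_t x, uint8_t y) { return (x & y) == y; }

/* Rewritten form: y has no bits set outside x (bit-clear, then test).  */
bool bic_form(uint8_t x, uint8_t y) { return (y & (uint8_t)~x) == 0; }

/* Compare all three forms over every pair of 8-bit values and return
   the number of pairs on which they disagree.  */
unsigned count_mismatches(void)
{
  unsigned bad = 0;
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        bool expected = bic_form((uint8_t)x, (uint8_t)y);
        if (ior_form((uint8_t)x, (uint8_t)y) != expected
            || and_form((uint8_t)x, (uint8_t)y) != expected)
          bad++;
      }
  return bad;
}
```

All three forms ask the same question, namely whether every bit set in y is
also set in x, so a target with a flag-setting bit-clear instruction can
answer it directly.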

[Bug target/82096] ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2017-12-27 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82096

--- Comment #5 from sudi at gcc dot gnu.org ---
As far as I can see, the function expand_atomic_compare_and_swap() calls
emit_store_flag_force() with an explicit VOIDmode and, because of this,
do_compare_rtx_and_jump() fails to split the instruction using any
do_jump_by_parts_* (which needs an int_mode). The VOIDmode then propagates
further down into the forced move instructions, which causes the ICE. The
attached patch fixes this, but I am not sure why the explicit VOIDmode was used
in the first place, or whether the proposed patch is the right approach here.
Does anyone have any ideas?

Sudi

diff --git a/gcc/optabs.c b/gcc/optabs.c
index 3354e40..efc95f7 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6312,7 +6312,7 @@ expand_atomic_compare_and_swap (rtx *ptarget_bool, rtx *ptarget_oval,

  success_bool_from_val:
target_bool = emit_store_flag_force (target_bool, EQ, target_oval,
-   expected, VOIDmode, 1, 1);
+   expected, mode, 1, 1);
  success:
   /* Make sure that the oval output winds up where the caller asked.  */
   if (ptarget_oval)
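
For context, the kind of source that exercises this path is a 64-bit
compare-and-swap whose boolean result is used. The snippet below is a
hypothetical reduced sketch, not the test case from the bug report or the one
later committed; the names slot, try_swap and current_value are made up for
illustration. It is an assumption that, on the affected Thumb-1 configurations,
the boolean result is derived from the returned old value, which is where
emit_store_flag_force() gets called.

```c
/* Hypothetical 64-bit object updated with a compare-and-swap.  */
static long long slot = 3;

/* Returns nonzero if slot held EXPECTED and was replaced by DESIRED.
   __sync_bool_compare_and_swap is a GCC builtin; for a 64-bit operand
   on a target without a native instruction it may become a library
   call.  */
int try_swap(long long expected, long long desired)
{
  return __sync_bool_compare_and_swap(&slot, expected, desired);
}

/* Read back the current value for checking.  */
long long current_value(void)
{
  return slot;
}
```

On a host with native 64-bit atomics this compiles and runs without going
through the problematic path; it only shows the shape of the source involved.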

[Bug target/83335] [8 regression][aarch64,ilp32] gcc.target/aarch64/asm-2.c ICEs since 255481

2017-12-22 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83335

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sudi at gcc dot gnu.org

--- Comment #2 from sudi at gcc dot gnu.org ---
Confirmed

[Bug target/82096] ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2017-12-21 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82096

--- Comment #4 from sudi at gcc dot gnu.org ---
I can see this failing with:

$./arm-none-linux-gnueabi-gcc
./src/gcc/gcc/testsuite/gcc.dg/di-longlong64-sync-1.c -mthumb -march=armv5t
-O[g,1,2,3]

and

$./arm-none-linux-gnueabi-gcc
./src/gcc/gcc/testsuite/gcc.dg/di-longlong64-sync-1.c -mthumb -march=armv6
-O[g,1,2,3]

[Bug target/82096] ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2017-12-19 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82096

--- Comment #3 from sudi at gcc dot gnu.org ---
(In reply to Martin Liška from comment #2)
> I can with:
> 
> commit 23298f15ba71145bae317e9c07f7078663dbd923 (HEAD, parent/trunk,
> parent/master)
> Author: rguenth <rguenth@138bc75d-0d04-0410-961f-82ee72b054a4>
> Date:   Mon Dec 18 08:35:23 2017 +
> 
> 2017-12-18  Richard Biener  <rguent...@suse.de>
> 
> PR tree-optimization/81877
> * tree-ssa-loop-im.c (ref_indep_loop_p): Remove safelen
> parameters.
> (outermost_indep_loop): Adjust.
> (ref_indep_loop_p_1): Likewise.  Remove safelen handling again.
> (can_sm_ref_p): Adjust.
> 
> * g++.dg/torture/pr81877.C: New testcase.
> * g++.dg/vect/pr70729.cc: XFAIL.
> * g++.dg/vect/pr70729-nest.cc: XFAIL.
> 
> 
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@255776
> 138bc75d-0d04-0410-961f-82ee72b054a4
> 
> $  ./xgcc -v
> Using built-in specs.
> COLLECT_GCC=./xgcc
> Target: arm-unknown-linux-gnueabi
> Configured with: ../configure --enable-languages=c,c++ --disable-multilib
> --disable-bootstrap --target=arm-unknown-linux-gnueabi
> Thread model: posix
> gcc version 8.0.0 20171218 (experimental) (GCC) 
> 
> $ ./cc1 -fpreprocessed di-longlong64-sync-1.i  -mflip-thumb -mcpu=arm10tdmi
> -mtls-dialect=gnu -marm -march=armv5t -Og

I can reproduce it with this.

[Bug target/81228] [7/8 Regression] ICE in gen_vec_cmpv2dfv2di, at config/aarch64/aarch64-simd.md:2508

2017-12-14 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81228

--- Comment #7 from sudi at gcc dot gnu.org ---
Author: sudi
Date: Thu Dec 14 10:35:38 2017
New Revision: 255625

URL: https://gcc.gnu.org/viewcvs?rev=255625=gcc=rev
Log:
[PATCH PR81228][AARCH64]Fix ICE by adding LTGT in vec_cmp

This patch is a follow up to the existing discussions on
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01904.html
Bin had earlier submitted this patch to fix the ICE that
occurs because of the missing LTGT in aarch64-simd.md.
That discussion opened up a new bug report PR81647 for
an inconsistent behavior.

As discussed earlier on the gcc-patches discussion and on
the bug report, PR81647 was occurring because of how UNEQ
was handled in aarch64-simd.md rather than LTGT.
Since __builtin_islessgreater is guaranteed to not give an
FP exception but LTGT might, __builtin_islessgreater gets
converted to ~UNEQ very early on in fold_builtin_unordered_cmp.
Thus I will post a separate patch for correcting how UNEQ and
other unordered comparisons are handled in aarch64-simd.md.

This patch is only adding the missing LTGT to plug the ICE.

Testing done: Checked for regressions on bootstrapped
aarch64-none-linux-gnu and added a new compile-time test case
that generates LTGT to make sure it doesn't ICE.

*** gcc/ChangeLog ***

2017-12-14  Sudakshina Das  <sudi@arm.com>
Bin Cheng  <bin.ch...@arm.com>

PR target/81228
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Move LTGT
to CCFPEmode.
* config/aarch64/aarch64-simd.md (vec_cmp): Add
LTGT.

*** gcc/testsuite/ChangeLog ***

2017-12-14  Sudakshina Das  <sudi@arm.com>

PR target/81228
* gcc.dg/pr81228.c: New.

Added:
trunk/gcc/testsuite/gcc.dg/pr81228.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64-simd.md
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/testsuite/ChangeLog

[Bug target/81228] [7/8 Regression] ICE in gen_vec_cmpv2dfv2di, at config/aarch64/aarch64-simd.md:2508

2017-12-13 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81228

--- Comment #6 from sudi at gcc dot gnu.org ---
Submitted for review

https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00838.html

[Bug target/81228] [7/8 Regression] ICE in gen_vec_cmpv2dfv2di, at config/aarch64/aarch64-simd.md:2508

2017-12-12 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81228

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sudi at gcc dot gnu.org

--- Comment #5 from sudi at gcc dot gnu.org ---
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01904.html

A patch already exists for this. The discussion there led to PR81647, but that
is not exactly related to this bug. IMHO the existing patch still applies,
perhaps with some changes to rebase. Since I am working on PR81647, I can
rework this as well.

[Bug target/81647] inconsistent LTGT behavior at different optimization levels on AArch64.

2017-12-08 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81647

--- Comment #8 from sudi at gcc dot gnu.org ---
For the inconsistent behavior on AArch64, I will try to write a patch.

[Bug target/81647] inconsistent LTGT behavior at different optimization levels on AArch64.

2017-12-08 Thread sudi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81647

sudi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sudi at gcc dot gnu.org

--- Comment #7 from sudi at gcc dot gnu.org ---
(In reply to amker from comment #1)
> According to thread https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00583.html
> it's still not clear if LTGT should be quiet or signaling, but inconsistent
> behavior seems not correct here.

According to what I noticed by running this example, the
fold_builtin_unordered_cmp function folds __builtin_islessgreater into ~UNEQ
for most combinations of options (except -fno-trapping-math), even at -O0.
Since UNEQ is guaranteed not to throw an exception, this conversion presumably
ensures that __builtin_islessgreater does not throw an exception either.
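
As a source-level illustration of that folding, islessgreater can be written
as the negation of "unordered or equal". This is a minimal sketch of the
semantics only, not GCC's internal representation; the helper names uneq and
lessgreater_via_uneq are hypothetical.

```c
#include <math.h>
#include <stdbool.h>

/* UNEQ: true when the operands are unordered (at least one is a NaN)
   or compare equal.  isunordered() is a quiet comparison, and the
   short-circuit keeps NaN operands away from the == test.  */
bool uneq(double a, double b)
{
  return isunordered(a, b) || a == b;
}

/* islessgreater expressed as the negation of UNEQ.  */
bool lessgreater_via_uneq(double a, double b)
{
  return !uneq(a, b);
}
```

For ordered, unequal operands both forms are true; for equal operands or when
a NaN is involved both are false.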