[Bug c++/97399] g++ 9.3 cannot compile SFINAE code with separated declaration and definition, g++ 7.3 compiles

2020-10-13 Thread renlin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97399

--- Comment #1 from Renlin Li  ---
Created attachment 49363
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49363&action=edit
test case 2

[Bug c++/97399] New: g++ 9.3 cannot compile SFINAE code with separated declaration and definition, g++ 7.3 compiles

2020-10-13 Thread renlin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97399

Bug ID: 97399
   Summary: g++ 9.3 cannot compile SFINAE code with separated
declaration and definition, g++ 7.3 compiles
   Product: gcc
   Version: 9.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: renlin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 49362
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49362&action=edit
test case 1

For gcc_1.c++:
gcc 7.3 compiles this code.
clang 7 compiles this code.
gcc 9.3 fails to compile it with the following message.

Not sure whether this is gcc's issue or clang's.

```
:29:16: error: no declaration matches 'constexpr
enable_if_t<((tmp*)this)->is_integral(), bool> tmp::func(E, E) const'
   29 | constexpr auto tmp::func(E f_lhs, E f_rhs)
      |                ^~~
:18:27: note: candidate is: 'template static constexpr
enable_if_t<((tmp*)this)->is_integral(), bool> tmp::func(E, E)'
   18 |     static constexpr auto func(E f_lhs, E f_rhs)
      |                           ^~~~
:12:8: note: 'struct tmp' defined here
   12 | struct tmp
```

Meanwhile for gcc_2.c++
gcc compiles without any issue.
clang gives the following error message
```
:27:28: error: template parameter redefines default argument

template (), bool>>

   ^

:17:32: note: previous default template argument defined here

template (), bool>>
```
It seems this is not a new issue, and it might be a duplicate.

[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT

2018-11-21 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877

Renlin Li  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Renlin Li  ---
Mark it as fixed.

[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT

2018-11-21 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877

--- Comment #4 from Renlin Li  ---
Author: renlin
Date: Wed Nov 21 14:29:19 2018
New Revision: 266345

URL: https://gcc.gnu.org/viewcvs?rev=266345&root=gcc&view=rev
Log:
[PATCH][PR84877]Dynamically align the address for local parameter copy on the
stack when required alignment is larger than MAX_SUPPORTED_STACK_ALIGNMENT


As described in PR84877 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877),
the local copy of a parameter on the stack is not aligned.

For BLKmode parameters, a local copy on the stack will be saved.
There are three cases:
1) arguments passed partially on the stack, partially via registers.
2) arguments passed fully on the stack.
3) arguments passed via registers.

After the change here, in all three cases, the stack slot for the local
parameter copy is aligned by the data type.
The stack slot is the DECL_RTL of the parameter. All the references thereafter
in the function will refer to this RTL.

To populate the local copy on the stack,
For case 1) and 2), there are operations to move data from the caller's stack
(from incoming rtl) into callee's stack.
For case 3), the registers are directly saved into the stack slot.

In all cases, the destination address is properly aligned.
But for case 1) and case 2), the source address is not aligned by the type.
How the arguments are prepared is defined by the PCS.
The block move operation is performed by emit_block_move (). As far as I can see,
it will use the smaller alignment of source and destination.
This looks fine as long as we don't use instructions which strictly require a
larger alignment than the address actually has.

Here, it only changes receiving parameters.
The function assign_stack_local_1 will be called in various places.
Usually, the caller will constrain the ALIGN parameter,
for example via the STACK_SLOT_ALIGNMENT macro.
assign_parm_setup_block will call assign_stack_local () with the alignment from
the parameter type, which in this case could be
larger than MAX_SUPPORTED_STACK_ALIGNMENT.

The alignment operation for the parameter copy on the stack is similar to that
for stack vars.
First, enough space is reserved on the stack. The size is fixed at compile time.
Instructions are emitted to dynamically get an aligned address at runtime
within this piece of memory.

This will unavoidably increase stack usage. However, how much really depends on
how many over-aligned parameters are passed by value.
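
The reserve-then-align scheme described above can be sketched in plain C. This is only an illustration of the arithmetic, not the code GCC emits; the helper names align_up, reserved_size and aligned_slot are hypothetical and do not exist in GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round ADDR up to the next multiple of ALIGN.
   ALIGN must be a non-zero power of two.  */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Number of bytes to reserve so that an ALIGN-aligned object of SIZE
   bytes fits no matter where the reservation happens to start.
   This value is known at compile time.  */
size_t reserved_size(size_t size, size_t align)
{
    return size + align - 1;
}

/* Pick the aligned slot inside a reservation that starts at BASE.
   This is the part that has to be computed at run time.  */
void *aligned_slot(void *base, size_t align)
{
    return (void *)align_up((uintptr_t)base, (uintptr_t)align);
}
```

For the 8-byte, 16-byte-aligned struct from the PR, this model reserves 23 bytes, which is the extra stack usage mentioned above.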


gcc/

2018-11-21  Renlin Li  

PR middle-end/84877
* explow.h (get_dynamic_stack_size): Declare it as external.
* explow.c (record_new_stack_level): Remove function static attribute.
* function.c (assign_stack_local_1): Dynamically align the stack slot
addr for parameter copy on the stack.

gcc/testsuite/

2018-11-21  Renlin Li  

PR middle-end/84877
* gcc.dg/pr84877.c: New.


Added:
trunk/gcc/testsuite/gcc.dg/pr84877.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/explow.c
trunk/gcc/explow.h
trunk/gcc/function.c
trunk/gcc/testsuite/ChangeLog

[Bug target/87815] ICE in DSE with -march=armv8-a+sve while trying to replace load with previously stored value

2018-11-12 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87815

Renlin Li  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Renlin Li  ---
Fixed by r266033.

[Bug target/87815] ICE in DSE with -march=armv8-a+sve while trying to replace load with previously stored value

2018-11-12 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87815

--- Comment #1 from Renlin Li  ---
Author: renlin
Date: Mon Nov 12 16:47:24 2018
New Revision: 266033

URL: https://gcc.gnu.org/viewcvs?rev=266033&root=gcc&view=rev
Log:
[PR87815]Don't generate shift sequence for load replacement in DSE when the
mode size is not compile-time constant

The patch adds a check that the gap is a compile-time constant.

The ICE happens when dse decides to replace the load with the previously stored
value. The problem is that the shift sequence cannot accept a mode operand whose
size is not a compile-time constant.
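
As a rough illustration of why the gap has to be known up front, here is a small C model of replacing a narrow load with a shift and mask of a previously stored wider value. It assumes a little-endian layout and a 64-bit store; load_from_stored is a hypothetical helper, not a GCC function, and it does not reflect how dse.c is actually structured.

```c
#include <assert.h>
#include <stdint.h>

/* Value that a READ_BYTES-wide load at byte OFFSET would see after a
   64-bit store of STORED, assuming a little-endian layout.  */
uint64_t load_from_stored(uint64_t stored, unsigned offset, unsigned read_bytes)
{
    assert(read_bytes >= 1 && offset + read_bytes <= 8);

    /* The shift amount is derived from the gap between the store and the
       load.  It must be a known constant to pick a shift sequence.  */
    unsigned shift = 8 * offset;
    uint64_t shifted = stored >> shift;

    if (read_bytes == 8)
        return shifted;  /* offset is 0 here; avoids shifting by 64 below */

    uint64_t mask = (UINT64_C(1) << (8 * read_bytes)) - 1;
    return shifted & mask;
}
```

With SVE, mode sizes can depend on the runtime vector length, so a shift amount like the one above is not available at compile time.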

gcc/

2018-11-12  Renlin Li  

PR target/87815
* dse.c (get_stored_val): Add check for compile-time
constantness of gap.

gcc/testsuite/

2018-11-12  Renlin Li  

PR target/87815
* gcc.target/aarch64/sve/pr87815.c: New.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/sve/pr87815.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/dse.c
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/87899] [9 regression]r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-08 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87899

--- Comment #6 from Renlin Li  ---
Created attachment 44975
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44975&action=edit
IRA dump

[Bug middle-end/87899] [9 regression]r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-08 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87899

--- Comment #5 from Renlin Li  ---
Created attachment 44974
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44974&action=edit
IRA dump

The code you want to check is the following in ira pass:
insn 10905: r1 = r2040
insn 208: use and update r1 with pre_modify
insn 191: use pseudo r2040

[Bug middle-end/87899] [9 regression]r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-06 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87899

--- Comment #3 from Renlin Li  ---
(In reply to Renlin Li from comment #1)
> In tree-loop-distribution.c, in the distribute_loop function, I got the
> following code snippets.
> 
>  30386: 0103cff4 4 OBJECT  LOCAL  DEFAULT   25 _ZL23bb_top_order_index_s
>  30387: 0103cff8 4 OBJECT  LOCAL  DEFAULT   25 _ZL18bb_top_order_index
>  30388: 0103cffc 4 OBJECT  LOCAL  DEFAULT   25 _ZL10ddrs_table
>  30389: 0103d000 4 OBJECT  LOCAL  DEFAULT   25 _ZL9loop_nest
>  30390: 0103d004 4 OBJECT  LOCAL  DEFAULT   25 _ZL12datarefs_vec
> 
> 
> r1 = 0x103cff4, which points to the local anchor area.
> r4 is the dynamically allocated hash_table pointer, which is supposed to be
> stored into ddrs_table, i.e. 0103cffc.
> 
>    0x61a346 <+90>:  strb    r7, [r2, #0]
>    0x61a348 <+92>:  str.w   r7, [r8]
> 1=>0x61a34c <+96>:  str.w   r7, [r1, #12]!
>    0x61a350 <+100>: mov     r5, r1
> 2=>0x61a352 <+102>: str     r4, [r1, #8]
>    0x61a354 <+104>: str     r0, [r4, #0]
>    0x61a356 <+106>: mov     r0, r9
> 
> However, r1 is changed by the previous pre-indexed store at 0x61a34c (marked
> as 1).
> This makes the store later store the pointer in the wrong position.
> Later when accessing ddrs_table, it got a null pointer, eventually resulting
> in the ICE observed here.
> 
> The full assembly is attached.


Before the change:

   0x0061a746 <+26>:    bl      0xc86134
   0x0061a74a <+30>:    movw    r2, #57316      ; 0xdfe4
   0x0061a74e <+34>:    movt    r2, #259        ; 0x103
   0x0061a752 <+38>:    str     r2, [sp, #28]
   0x0061a754 <+40>:    mov     r4, r0
   0x0061a756 <+42>:    movw    r0, #389        ; 0x185
   0x0061a75a <+46>:    str     r7, [r4, #8]
   0x0061a75c <+48>:    str     r7, [r4, #12]
   0x0061a75e <+50>:    strd    r7, r7, [r4, #16]
   0x0061a762 <+54>:    strh    r7, [r4, #28]
   0x0061a764 <+56>:    bl      0xc2bc50
   0x0061a768 <+60>:    movw    r3, #8452       ; 0x2104
   0x0061a76c <+64>:    movt    r3, #242        ; 0xf2
   0x0061a770 <+68>:    lsls    r2, r0, #4
   0x0061a772 <+70>:    mov     r5, r0
   0x0061a774 <+72>:    mov     r0, r4
   0x0061a776 <+74>:    ldr     r6, [r3, r2]
   0x0061a778 <+76>:    mov     r1, r6
   0x0061a77a <+78>:    bl      0x61d1b4 ::alloc_entries(unsigned int) const>
   0x0061a77e <+82>:    ldr.w   r12, [sp, #28]
   0x0061a782 <+86>:    ldr     r2, [sp, #296]  ; 0x128
   0x0061a784 <+88>:    str     r5, [r4, #24]
   0x0061a786 <+90>:    mov     r1, r12
   0x0061a788 <+92>:    str     r6, [r4, #4]
   0x0061a78a <+94>:    strb    r7, [r2, #0]
   0x0061a78c <+96>:    mov     r5, r12
   0x0061a78e <+98>:    str.w   r7, [r8]
   0x0061a792 <+102>:   str.w   r7, [r1, #12]!
   0x0061a796 <+106>:   str.w   r4, [r12, #8]

We can see that r4 is stored to [r12, #8], which does not use the updated r1 above.

[Bug middle-end/87899] [9 regression]r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-06 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87899

--- Comment #2 from Renlin Li  ---
Created attachment 44965
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44965&action=edit
disassembly of distribute_loop

disassembly of wrongly compiled distribute_loop function

[Bug middle-end/87899] [9 regression]r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-06 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87899

--- Comment #1 from Renlin Li  ---

In tree-loop-distribution.c, in the distribute_loop function, I got the following
code snippets.

 30386: 0103cff4 4 OBJECT  LOCAL  DEFAULT   25 _ZL23bb_top_order_index_s
 30387: 0103cff8 4 OBJECT  LOCAL  DEFAULT   25 _ZL18bb_top_order_index
 30388: 0103cffc 4 OBJECT  LOCAL  DEFAULT   25 _ZL10ddrs_table
 30389: 0103d000 4 OBJECT  LOCAL  DEFAULT   25 _ZL9loop_nest
 30390: 0103d004 4 OBJECT  LOCAL  DEFAULT   25 _ZL12datarefs_vec


r1 = 0x103cff4, which points to the local anchor area.
r4 is the dynamically allocated hash_table pointer, which is supposed to be
stored into ddrs_table, i.e. 0103cffc.

   0x61a346 <+90>:  strb    r7, [r2, #0]
   0x61a348 <+92>:  str.w   r7, [r8]
1=>0x61a34c <+96>:  str.w   r7, [r1, #12]!
   0x61a350 <+100>: mov     r5, r1
2=>0x61a352 <+102>: str     r4, [r1, #8]
   0x61a354 <+104>: str     r0, [r4, #0]
   0x61a356 <+106>: mov     r0, r9

However, r1 is changed by the previous pre-indexed store at 0x61a34c (marked as
1).
As a result, the later store (marked as 2) writes the pointer to the wrong
location.
Later, when ddrs_table is accessed, it holds a null pointer, which eventually
results in the ICE observed here.

The full assembly is attached.
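
To make the effect of the pre-indexed store concrete, here is a tiny C model of the two addressing forms. It is only a sketch: struct machine, word_at, str_offset and str_preindex are hypothetical names for this illustration, and the model covers just a small window of 32-bit words.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 16

/* Tiny model: MEM holds 32-bit words, BASE is the byte address of mem[0].  */
struct machine {
    uint32_t base;
    uint32_t mem[MEM_WORDS];
};

static uint32_t *word_at(struct machine *m, uint32_t addr)
{
    assert(addr >= m->base && (addr - m->base) % 4 == 0);
    assert((addr - m->base) / 4 < MEM_WORDS);
    return &m->mem[(addr - m->base) / 4];
}

/* str value, [reg, #off]    : plain offset store, the register is unchanged.  */
void str_offset(struct machine *m, uint32_t reg, uint32_t off, uint32_t value)
{
    *word_at(m, reg + off) = value;
}

/* str value, [*reg, #off]!  : pre-indexed store, *reg is updated first and
   keeps the new value afterwards (writeback).  */
void str_preindex(struct machine *m, uint32_t *reg, uint32_t off, uint32_t value)
{
    *reg += off;
    *word_at(m, *reg) = value;
}
```

Starting from r1 = 0x103cff4, the pre-indexed store with #12 leaves r1 at 0x103d000, so a following store to [r1, #8] lands at 0x103d008 rather than at ddrs_table (0x103cffc).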

[Bug middle-end/87899] New: [9 regression]r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-06 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87899

Bug ID: 87899
   Summary: [9 regression]r264897 cause mis-compiled native
arm-linux-gnueabihf toolchain
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: renlin at gcc dot gnu.org
  Target Milestone: ---

Since r264897, the native arm-linux-gnueabihf toolchain has been mis-compiled.
Somehow, it survives bootstrap.

It ICEs when compiling a lot of test cases. They fail with similar messages.
For example:

./gcc/cc1 ~/gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c  -O3
 test main
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data>   
  Streaming LTO
   
Assembling
functions:
  testduring GIMPLE pass: ldist

gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c: In function ‘test’:
gcc/./gcc/testsuite/gcc.c-torture/execute/pr36034-1.c:9:1: internal compiler
error: Segmentation fault
9 | test (void)
  | ^~~~
0x5c3a37 crash_signal
../../gcc/gcc/toplev.c:325
0x63ef6b inchash::hash::add(void const*, unsigned int)
../../gcc/gcc/inchash.h:100
0x63ef6b inchash::hash::add_ptr(void const*)
../../gcc/gcc/inchash.h:94
0x63ef6b ddr_hasher::hash(data_dependence_relation const*)
../../gcc/gcc/tree-loop-distribution.c:143
0x63ef6b hash_table::find_slot(data_dependence_relation* const&, insert_option)
../../gcc/gcc/hash-table.h:414
0x63ef6b get_data_dependence
../../gcc/gcc/tree-loop-distribution.c:1184
0x63f2bd data_dep_in_cycle_p
../../gcc/gcc/tree-loop-distribution.c:1210
0x63f2bd update_type_for_merge
../../gcc/gcc/tree-loop-distribution.c:1255
0x64064b build_rdg_partition_for_vertex
../../gcc/gcc/tree-loop-distribution.c:1302
0x64064b rdg_build_partitions
../../gcc/gcc/tree-loop-distribution.c:1754
0x64064b distribute_loop
../../gcc/gcc/tree-loop-distribution.c:2795
0x642299 execute
../../gcc/gcc/tree-loop-distribution.c:3133
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug target/87815] ICE in DSE with -march=armv8-a+sve while trying to replace load with previously stored value

2018-10-30 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87815

Renlin Li  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
 Target||aarch64-none-elf
Version|8.0 |9.0
   Target Milestone|--- |9.0
  Known to fail||9.0

[Bug target/87815] ICE in DSE with -march=armv8-a+sve while trying to replace load with previously stored value

2018-10-30 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87815

Renlin Li  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2018-10-30
 Ever confirmed|0   |1

[Bug target/87815] New: ICE in DSE with -march=armv8-a+sve while trying to replace load with previously stored value

2018-10-30 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87815

Bug ID: 87815
   Summary: ICE in DSE with -march=armv8-a+sve while trying to
replace load with previously stored value
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: renlin at gcc dot gnu.org
  Target Milestone: ---

The following test case ICEs with:
-march=armv8.2-a+sve -O3 and -Ofast 

int a, b, d;
short e;
void f() {
  for (int i = 0; i < 8; i++) {
    e = b >= 2 ?: a >> b;
    d = e && b;
  }
}


test.c: In function 'f':
test.c:8:1: internal compiler error: in smallest_mode_for_size, at
stor-layout.c:355
8 | }
  | ^
0x1048b4a smallest_mode_for_size(poly_int<2u, unsigned long>, mode_class)
src/gcc/gcc/stor-layout.c:355
0xa1a14e smallest_int_mode_for_size(poly_int<2u, unsigned long>)
src/gcc/gcc/machmode.h:838
0x1a93f86 find_shift_sequence
src/gcc/gcc/dse.c:1704
0x1a9497b get_stored_val
 src/gcc/gcc/dse.c:1850
0x1a94dae replace_read
src/gcc/gcc/dse.c:1955
0x1a958db check_mem_read_rtx
src/gcc/gcc/dse.c:2187
0x1a95dfc check_mem_read_use
src/gcc/gcc/dse.c:2293
0xfd0fd9 note_uses(rtx_def**, void (*)(rtx_def**, void*), void*)
src/gcc/gcc/rtlanal.c:2005
0x1a9660d scan_insn
src/gcc/gcc/dse.c:2401
0x1a972f3 dse_step1
src/gcc/gcc/dse.c:2659
0x1a9968b rest_of_handle_dse
src/gcc/gcc/dse.c:3576
0x1a9981e execute
src/gcc/gcc/dse.c:3634
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug target/87563] [9 regression ] ICE with -march=armv8-a+sve

2018-10-30 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87563

Renlin Li  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Renlin Li  ---
fix committed as r265172. Close it.

[Bug target/87563] [9 regression ] ICE with -march=armv8-a+sve

2018-10-15 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87563

--- Comment #4 from Renlin Li  ---
Author: renlin
Date: Mon Oct 15 16:49:05 2018
New Revision: 265172

URL: https://gcc.gnu.org/viewcvs?rev=265172&root=gcc&view=rev
Log:
[PR87563][AARCH64-SVE]: Don't keep ifcvt loop when COND_ ifn could not be
vectorized.

ifcvt will create a versioned loop and it will permissively generate
scalar COND_ ifn.

If, in the loop vectorize pass, COND_ could not get vectorized,
the if-converted loop should be abandoned when the target doesn't support
such ifn.


gcc/

2018-10-12  Renlin Li  

PR target/87563
* tree-vectorizer.c (try_vectorize_loop_1): Don't use
if-conversioned loop when it contains ifn with types not
supported by backend.
* internal-fn.c (expand_direct_optab_fn): Add an assert.
(direct_internal_fn_supported_p): New helper function.
* internal-fn.h (direct_internal_fn_supported_p): Declare.

gcc/testsuite/

2018-10-12  Renlin Li  

PR target/87563
* gcc.target/aarch64/sve/pr87563.c: New.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/sve/pr87563.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/internal-fn.c
trunk/gcc/internal-fn.h
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vectorizer.c

[Bug tree-optimization/87562] [9 Regression] ICE in in linemap_position_for_line_and_column, at libcpp/line-map.c:848

2018-10-15 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87562

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #2 from Renlin Li  ---
(In reply to David Malcolm from comment #1)
> linemap_position_for_line_and_column(line_maps*, line_map_ordinary const*,
> unsigned int, unsigned int) at libcpp/line-map.c:848
> is:
>   linemap_assert (ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map) <= line);
> 
> I wonder if I introduced this in r264887 with the changes to input.c
> (macro-handling and concatenated strings), which touched the function in the
> next frame.
> 
> I'll see if I can reproduce it.

Hi David,

I checked: the ICE starts with r264887.

[Bug target/87563] [9 regression ] ICE with -march=armv8-a+sve

2018-10-09 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87563

Renlin Li  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||renlin at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |renlin at gcc dot gnu.org

[Bug middle-end/84877] New: Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT

2018-03-15 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877

Bug ID: 84877
   Summary: Local stack copy of BLKmode parameter on the stack is
not aligned when the requested alignment exceeds
MAX_SUPPORTED_STACK_ALIGNMENT
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: renlin at gcc dot gnu.org
  Target Milestone: ---

For a test case like this,

#include <stdint.h>
struct U {
    uint32_t M0;
    uint32_t M1;
} __attribute((aligned(16)));

void tmp (struct U *);
void foo(struct U P0)
{
  struct U P1 = P0;
  tmp (&P1);
}

void bar(struct U P0)
{
  tmp (&P0);
}

The required alignment of a BLKmode parameter is truncated to
MAX_SUPPORTED_STACK_ALIGNMENT when it exceeds that limit.
On the other hand, the compiler will try to dynamically align
the stack slot for a local variable.

For example, on an arm-gcc toolchain,
the function foo () will pass a 16-byte aligned address to tmp ().
However, P0 is temporarily stored on the stack at an unaligned address.
Function bar () will pass an unaligned address, which is the address
of the local stack copy of P0.

A warning could be emitted when the alignment cannot be fulfilled, or the slot
could be dynamically aligned, though that will waste stack space.

[Bug target/83370] [AARCH64]Tailcall register may be corrupted by epilogue code

2018-02-02 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83370

Renlin Li  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Renlin Li  ---
(In reply to Richard Earnshaw from comment #3)
> Doesn't this need backporting?


Yes, it is needed. The same problem happens in gcc-6 and gcc-7.
The backport has been approved and committed now.

[Bug target/83370] [AARCH64]Tailcall register may be corrupted by epilogue code

2018-02-01 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83370

--- Comment #5 from Renlin Li  ---
Author: renlin
Date: Thu Feb  1 21:33:05 2018
New Revision: 257315

URL: https://gcc.gnu.org/viewcvs?rev=257315&root=gcc&view=rev
Log:
[PR83370][AARCH64]Use tighter register constraint for sibcall patterns.

gcc/

backport from mainline
2018-02-01  Renlin Li  

PR target/83370
* config/aarch64/aarch64.c (aarch64_class_max_nregs): Handle
TAILCALL_ADDR_REGS.
(aarch64_register_move_cost): Likewise.
* config/aarch64/aarch64.h (reg_class): Rename CALLER_SAVE_REGS to
TAILCALL_ADDR_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Rename CALLER_SAVE_REGS to
TAILCALL_ADDR_REGS. Remove IP registers.
* config/aarch64/aarch64.md (Ucs): Update register constraint.

gcc/testsuite/

backport from mainline
2018-02-01  Richard Sandiford  

PR target/83370
* gcc.target/aarch64/pr83370.c: New.


Added:
branches/gcc-6-branch/gcc/testsuite/gcc.target/aarch64/pr83370.c
Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/config/aarch64/aarch64.c
branches/gcc-6-branch/gcc/config/aarch64/aarch64.h
branches/gcc-6-branch/gcc/config/aarch64/constraints.md
branches/gcc-6-branch/gcc/testsuite/ChangeLog

[Bug target/83370] [AARCH64]Tailcall register may be corrupted by epilogue code

2018-02-01 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83370

--- Comment #4 from Renlin Li  ---
Author: renlin
Date: Thu Feb  1 21:09:06 2018
New Revision: 257314

URL: https://gcc.gnu.org/viewcvs?rev=257314&root=gcc&view=rev
Log:
[PR83370][AARCH64]Use tighter register constraint for sibcall patterns.

gcc/

backport from mainline
2018-02-01  Renlin Li  

PR target/83370
* config/aarch64/aarch64.c (aarch64_class_max_nregs): Handle
TAILCALL_ADDR_REGS.
(aarch64_register_move_cost): Likewise.
* config/aarch64/aarch64.h (reg_class): Rename CALLER_SAVE_REGS to
TAILCALL_ADDR_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Rename CALLER_SAVE_REGS to
TAILCALL_ADDR_REGS. Remove IP registers.
* config/aarch64/aarch64.md (Ucs): Update register constraint.

gcc/testsuite/

backport from mainline
2018-02-01  Richard Sandiford  

PR target/83370
* gcc.target/aarch64/pr83370.c: New.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/aarch64/pr83370.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/aarch64/aarch64.c
branches/gcc-7-branch/gcc/config/aarch64/aarch64.h
branches/gcc-7-branch/gcc/config/aarch64/constraints.md
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug target/83370] [AARCH64]Tailcall register may be corrupted by epilogue code

2018-02-01 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83370

Renlin Li  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Renlin Li  ---
The fix has been committed to trunk.

[Bug target/83370] [AARCH64]Tailcall register may be corrupted by epilogue code

2018-02-01 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83370

--- Comment #1 from Renlin Li  ---
Author: renlin
Date: Thu Feb  1 13:02:24 2018
New Revision: 257294

URL: https://gcc.gnu.org/viewcvs?rev=257294&root=gcc&view=rev
Log:
[PR83370][AARCH64]Use tighter register constraint for sibcall patterns.

In the aarch64 backend, the ip0/ip1 registers are used in the prologue/epilogue
as temporary registers.

When the compiler performs sibcall optimization, it may use an
ip0/ip1 register to hold the address for an indirect function call. However,
those two registers might be clobbered by the epilogue code, which makes the
final sibcall instruction invalid.

The patch here renames the register class CALLER_SAVE_REGS to TAILCALL_ADDR_REGS
to reflect its usage, and removes the IP registers from this class.
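
A register class can be pictured as a bitmask with one bit per register, so removing the IP registers amounts to clearing two bits. The sketch below assumes, purely for illustration, that the old class covered x0 to x18; the helper names are hypothetical and the real REG_CLASS_CONTENTS entries in aarch64.h may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Bit N set means general register xN belongs to the class.  */
#define REG_BIT(n) (UINT32_C(1) << (n))

/* ip0 and ip1 are x16 and x17.  */
enum { IP0_REGNUM = 16, IP1_REGNUM = 17 };

/* Mask covering x0 .. x(COUNT-1).  */
uint32_t first_n_regs(unsigned count)
{
    assert(count <= 31);
    return (UINT32_C(1) << count) - 1;
}

/* Drop the two IP registers from CLASS_MASK.  */
uint32_t without_ip_regs(uint32_t class_mask)
{
    return class_mask & ~(REG_BIT(IP0_REGNUM) | REG_BIT(IP1_REGNUM));
}

/* Non-zero if register REGNO is a member of CLASS_MASK.  */
int class_contains(uint32_t class_mask, unsigned regno)
{
    assert(regno < 32);
    return (class_mask & REG_BIT(regno)) != 0;
}
```

Under that assumption the mask goes from 0x7ffff to 0x4ffff, so the register allocator can no longer choose x16 or x17 to hold the sibcall address.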

gcc/

2018-02-01  Renlin Li  

PR target/83370
* config/aarch64/aarch64.c (aarch64_class_max_nregs): Handle
TAILCALL_ADDR_REGS.
(aarch64_register_move_cost): Likewise.
* config/aarch64/aarch64.h (reg_class): Rename CALLER_SAVE_REGS to
TAILCALL_ADDR_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Rename CALLER_SAVE_REGS to
TAILCALL_ADDR_REGS. Remove IP registers.
* config/aarch64/aarch64.md (Ucs): Update register constraint.

gcc/testsuite/

2018-02-01  Richard Sandiford  

PR target/83370
* gcc.target/aarch64/pr83370.c: New.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr83370.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/config/aarch64/aarch64.h
trunk/gcc/config/aarch64/constraints.md
trunk/gcc/testsuite/ChangeLog

[Bug target/83370] New: [AARCH64]Tailcall register may be corrupted by epilogue code

2017-12-11 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83370

Bug ID: 83370
   Summary: [AARCH64]Tailcall register may be corrupted by
epilogue code
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: renlin at gcc dot gnu.org
  Target Milestone: ---

The following example generates incorrect code:

void (*f)();
int xx;
void tailcall (int i)
{
   int arr[5000];
   xx = arr[i];
   f();
}

When built with -O2 -ffixed-x0 -ffixed-x1 -ffixed-x2 -ffixed-x3 -ffixed-x4
-ffixed-x5 -ffixed-x6 -ffixed-x7 -ffixed-x8 -ffixed-x9 -ffixed-x10 -ffixed-x11
-ffixed-x12 -ffixed-x13 -ffixed-x14 -ffixed-x15 -ffixed-x17 -ffixed-x18

tailcall:
        mov     x16, 20016
        sub     sp, sp, x16
        adrp    x16, .LANCHOR0
        stp     x19, x30, [sp]
        add     x19, sp, 16
        ldr     s0, [x19, w0, sxtw 2]
        ldp     x19, x30, [sp]
        str     s0, [x16, #:lo12:.LANCHOR0]
        mov     x16, 20016
        add     sp, sp, x16
        br      x16   // oops

So the issue is there is nothing in the tail call instruction that prevents it
from using IP0/IP1 which are used as temporaries in the epilogue. We use the
temporary for frames of 4-64KB, so this issue is more likely today (previously
temporary was used only in frames larger than 16MBytes).

The problem appears to be that while we have explicit clobbers in a tailcall,
they are after the call, not before it:

(call_insn/j 16 12 17 2 (parallel [
(call (mem:DI (reg/f:DI 84 [ f ]) [0 *f.0_2 S8 A8])
(const_int 0 [0]))
(return)
]) "tailcall.c":13 42 {*sibcall_insn}
 (expr_list:REG_DEAD (reg/f:DI 84 [ f ])
(expr_list:REG_CALL_DECL (nil)
(nil)))
(expr_list (clobber (reg:DI 17 x17))
(expr_list (clobber (reg:DI 16 x16))
(nil

This issue affects gcc-5, gcc-6, gcc-7 and current trunk.

[Bug lto/81351] [8 regression] Many LTO testcases FAIL

2017-10-09 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81351

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #5 from Renlin Li  ---
Similar failures happen on aarch64-linux-gnu & arm-linux-gnueabihf.

[Bug testsuite/81179] [8 regression] gcc.dg/vect/pr65947-9.c and gcc.dg/vect/pr65947-14.c fail starting with r249553

2017-06-23 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81179

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #2 from Renlin Li  ---
The same failures are observed on all arm and aarch64 targets.

FAIL: gcc.dg/vect/pr65947-9.c -flto -ffat-lto-objects  scan-tree-dump vect
"loop size is greater than data size"
FAIL: gcc.dg/vect/pr65947-9.c scan-tree-dump vect "loop size is greater than
data size"
FAIL: gcc.dg/vect/pr65947-14.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/pr65947-14.c execution test

[Bug c++/81067] [8 regression] g++.dg/template/nontype10.C FAILs

2017-06-12 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81067

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #1 from Renlin Li  ---
I confirm I noticed the same regressions on arm targets.

[Bug tree-optimization/80948] [8 regression] test case gcc.dg/torture/pr68017.c fails with ICE starting with r248771

2017-06-02 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80948

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #3 from Renlin Li  ---
I saw this ICE on arm and aarch64 targets as well.

[Bug tree-optimization/78529] [7 Regression] gcc.c-torture/execute/builtins/strcat-chk.c failed with lto/O2

2017-01-06 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78529

--- Comment #25 from Renlin Li  ---
Created attachment 40474
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40474&action=edit
reduced objdump assembler file

[Bug tree-optimization/78529] [7 Regression] gcc.c-torture/execute/builtins/strcat-chk.c failed with lto/O2

2017-01-06 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78529

--- Comment #24 from Renlin Li  ---
Created attachment 40473
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40473&action=edit
memset.c

[Bug tree-optimization/78529] [7 Regression] gcc.c-torture/execute/builtins/strcat-chk.c failed with lto/O2

2017-01-06 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78529

--- Comment #23 from Renlin Li  ---
Created attachment 40472
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40472&action=edit
test case

[Bug tree-optimization/78529] [7 Regression] gcc.c-torture/execute/builtins/strcat-chk.c failed with lto/O2

2017-01-06 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78529

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #22 from Renlin Li  ---
(In reply to James Greenhalgh from comment #19)

> That would be an error:
> 
> /tmp/ccpefK3l.ltrans0.ltrans.o: In function `memset':
> :(.text+0x4a0): multiple definition of `memset'
> .../aarch64-none-elf/lib/libc.a(lib_a-memset.o):
> .../newlib/libc/machine/aarch64/memset.S:90: first defined here
> 
> Were it not for the flag added to resolve PR55994
> -Wl,--allow-multiple-definition .
> 
> So, in my opinion, the testcase is broken and could always have failed in
> this way. The combination of register allocation, LTO and order the linker
> sees symbols explains why this is hard to reproduce.

I hit exactly the same errors today.
I reduced it to a minimal test case; please check the new attachments.

The build command line is:
aarch64-none-elf-gcc -O2 -specs=aem-ve.specs -Wl,--allow-multiple-definition
-lm -flto main.c memset.c  -o new.exe

The expected output should be "A A A 2"

80001038 :
80001038:   a9bf7bfd    stp x29, x30, [sp,#-16]!
8000103c:   9123    adrp    x3, 80025000 <__global_locale+0x68>
80001040:   52800044    mov w4, #0x2    // #2
80001044:   91060060    add x0, x3, #0x180
80001048:   910003fd    mov x29, sp
8000104c:   b9018064    str w4, [x3,#384]
80001050:   d2800402    mov x2, #0x20   // #32
80001054:   52800821    mov w1, #0x41   // #65
80001058:   91002000    add x0, x0, #0x8

# x4 is not saved across this call, because LTO assumes the local memset
# implementation will not clobber it. However, the libc version of memset is
# linked into the final binary, and that implementation does clobber x4. This
# causes the run-time data corruption shown here.

8000105c:   94000a39    bl  80003940
80001060:   a8c17bfd    ldp x29, x30, [sp],#16
80001064:   52800823    mov w3, #0x41   // #65
80001068:   9080    adrp    x0, 80011000 <__swbuf_r+0x70>
8000106c:   2a0303e2    mov w2, w3
80001070:   2a0303e1    mov w1, w3
80001074:   91152000    add x0, x0, #0x548
80001078:   140015c0    b   80006778
8000107c:   .inst   0x ; undefined



This is mentioned above, but allow me to ask again:
"aarch64-none-elf-gcc -O2  main.c memset.c  -o new.o -specs=aem-ve.specs -lm
-flto"
gives the "multiple definition of `memset'" error,
while
"aarch64-none-elf-gcc -O2  main.c memset.c  -o new.o -specs=aem-ve.specs -lm"
does not.

Should they behave the same? Adding "-Wl,--allow-multiple-definition" does
fix this error. But why is it the test case that is broken, rather than the
LTO pass?

[Bug c++/71913] [5/6/7 Regression] Missing copy elision with operator new

2016-08-30 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71913

--- Comment #11 from Renlin Li  ---
(In reply to Christophe Lyon from comment #10)
> I've noticed that something similar to what Renlin suggested was committed
> to trunk as r238728.
> 
> Could this testcase fix be backported to the release branches too?

Yes, the failure can still be observed on the 4.9 and 5 branches.
It would be good to backport the fix to those branches.

[Bug middle-end/64971] [5 Regression] gcc.c-torture/compile/pr37433.c ICEs with -mabi=ilp32

2016-08-09 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64971

--- Comment #17 from Renlin Li  ---
Author: renlin
Date: Tue Aug  9 17:20:14 2016
New Revision: 239300

URL: https://gcc.gnu.org/viewcvs?rev=239300=gcc=rev
Log:
[PATCH][PR64971]Convert function pointer to Pmode when emit call.

gcc/

2016-08-04  Renlin Li  

PR middle-end/64971
* calls.c (prepare_call_address): Convert funexp to Pmode when
necessary.
* config/aarch64/aarch64.md (sibcall): Remove fix for PR 64971.
(sibcall_value): Likewise.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/calls.c
trunk/gcc/config/aarch64/aarch64.md

[Bug fortran/71961] [7 Regression] 178.galgel in SPEC CPU 2000 is miscompiled

2016-07-28 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71961

--- Comment #15 from Renlin Li  ---
The change r238497 has been reverted as r238815.

I confirmed that, after the revert, the 178.galgel miscompare
is fixed in the aarch64-linux environment.

PR 71902 is reopened as well.

[Bug fortran/71902] [5/6 Regression] Unneeded temporary on reallocatable character assignment

2016-07-28 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71902

--- Comment #6 from Renlin Li  ---
Author: renlin
Date: Thu Jul 28 11:21:53 2016
New Revision: 238815

URL: https://gcc.gnu.org/viewcvs?rev=238815=gcc=rev
Log:
[PATCH] Revert Revert r238497 because of PR 71961.

This patch reverts the change for PR 71902 since it causes 178.gagel
miscompile in spec2000 as reported in PR 71961 which was observed in
x86_64, aarch64, powerpc64.

gcc/fortran/ChangeLog:

2016-07-28  Renlin Li  

Revert
2016-07-19  Thomas Koenig  

PR fortran/71902
* dependency.c (gfc_check_dependency): Use dep_ref.  Handle case
if identical is true and two array element references differ.
(gfc_dep_resovler):  Move most of the code to dep_ref.
(dep_ref):  New function.
* frontend-passes.c (realloc_string_callback):  Name temporary
variable "realloc_string".

gcc/testsuite/ChangeLog:

2016-07-28  Renlin Li  

Revert
2016-07-19  Thomas Koenig  

PR fortran/71902
* gfortran.dg/dependency_47.f90:  New test.


Removed:
trunk/gcc/testsuite/gfortran.dg/dependency_47.f90
Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/dependency.c
trunk/gcc/fortran/frontend-passes.c
trunk/gcc/testsuite/ChangeLog

[Bug fortran/71961] [7 Regression] 178.galgel in SPEC CPU 2000 is miscompiled

2016-07-28 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71961

--- Comment #14 from Renlin Li  ---
Author: renlin
Date: Thu Jul 28 11:21:53 2016
New Revision: 238815

URL: https://gcc.gnu.org/viewcvs?rev=238815=gcc=rev
Log:
[PATCH] Revert Revert r238497 because of PR 71961.

This patch reverts the change for PR 71902 since it causes 178.gagel
miscompile in spec2000 as reported in PR 71961 which was observed in
x86_64, aarch64, powerpc64.

gcc/fortran/ChangeLog:

2016-07-28  Renlin Li  

Revert
2016-07-19  Thomas Koenig  

PR fortran/71902
* dependency.c (gfc_check_dependency): Use dep_ref.  Handle case
if identical is true and two array element references differ.
(gfc_dep_resovler):  Move most of the code to dep_ref.
(dep_ref):  New function.
* frontend-passes.c (realloc_string_callback):  Name temporary
variable "realloc_string".

gcc/testsuite/ChangeLog:

2016-07-28  Renlin Li  

Revert
2016-07-19  Thomas Koenig  

PR fortran/71902
* gfortran.dg/dependency_47.f90:  New test.


Removed:
trunk/gcc/testsuite/gfortran.dg/dependency_47.f90
Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/dependency.c
trunk/gcc/fortran/frontend-passes.c
trunk/gcc/testsuite/ChangeLog

[Bug c++/71913] [5/6/7 Regression] Missing copy elision with operator new

2016-07-25 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71913

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #9 from Renlin Li  ---
g++.dg/init/elide5.C fails on targets whose SIZE_TYPE is not "long unsigned
int".

testsuite/g++.dg/init/elide5.C:4:42: error: 'operator new' takes type 'size_t'
('unsigned int') as first parameter [-fpermissive]

I have checked: for most 32-bit architectures or ABIs, SIZE_TYPE is
"unsigned int". arm is one of them.

To make this test case portable, the first parameter of operator new should be
__SIZE_TYPE__ rather than "unsigned long".


> void* operator new(unsigned long, void* p) { return p; }
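
As a rough illustration of the portability point (a sketch in plain C, not the
actual elide5.C test): GCC predefines the __SIZE_TYPE__ macro as the type that
size_t has on the current target, so a declaration written with it matches on
both 32-bit and 64-bit targets. The names portable_size_t and place_at are made
up for this example.

```c
#include <stddef.h>

/* __SIZE_TYPE__ is predefined by GCC; it expands to the type that
   size_t is defined as for the current target ("unsigned int" on arm,
   "long unsigned int" on aarch64-linux).  */
typedef __SIZE_TYPE__ portable_size_t;

/* Holds on every target, unlike a hard-coded "unsigned long".  */
_Static_assert (sizeof (portable_size_t) == sizeof (size_t),
                "__SIZE_TYPE__ should match size_t");

/* Hypothetical helper with the same shape as the placement operator
   new in the test: it ignores the size and hands back the pointer.  */
void *
place_at (portable_size_t size, void *p)
{
  (void) size;
  return p;
}
```

In the C++ test itself, the same spelling would be used for the first
parameter of operator new.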

[Bug fortran/71961] [7 Regression] 178.galgel in SPEC CPU 2000 is miscompiled

2016-07-22 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71961

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #2 from Renlin Li  ---
The miscompare of 178.galgel is observed in aarch64-linux as well.

[Bug rtl-optimization/70030] [LRA]ICE when reload insn with output scratch operand

2016-06-28 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70030

--- Comment #8 from Renlin Li  ---
(In reply to Vladimir Makarov from comment #6)
> Created attachment 38033 [details]
> A patch
> 
> Here is the patch which might solve the problem.

Hi Vladimir,

Do you plan to check this patch in?

Thanks!

[Bug middle-end/71625] missing strlen optimization on different array initialization style

2016-06-24 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

--- Comment #9 from Renlin Li  ---
(In reply to nsz from comment #8)
> (In reply to Jakub Jelinek from comment #6)
> > (In reply to Marc Glisse from comment #1)
> > > Or we could do like clang and improve alias analysis. We should know that
> > > array doesn't escape and thus that hallo() cannot write to it.
> > 
> > The strlen pass uses the alias oracle, so the question is why it thinks the
> > call might affect the array.
> 
> the optimization fails with
> 
>  const char array[] = "abc";
> 
> too (which is why i thought it was about pure strlen depending on global
> state
> other than the argument.. static const array works though).

char *array = "abc";

works; however, this places the string literal in a read-only section.

[Bug middle-end/71625] New: missing strlen optimization on different array initialization style

2016-06-22 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625

Bug ID: 71625
   Summary: missing strlen optimization on different array
initialization style
   Product: gcc
   Version: tree-ssa
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: renlin at gcc dot gnu.org
  Target Milestone: ---

Hi,

The following two functions should both return 3.
Currently, foo () is optimized to return a constant.
bar (), however, still contains a function call to strlen, which is sub-optimal.

int foo ()
{
  char array[] = "abc";
  return __builtin_strlen (array);
}

int bar ()
{
  char array[] = {'a', 'b', 'c', '\0'};
  return __builtin_strlen (array);
}


Clang 3.8 produces optimal code for both cases.
In addition, I have another case here:

int hallo ();
int dummy ()
{
  char array[] = "abc";
  return hallo () + __builtin_strlen (array);
}

Here __builtin_strlen is not folded into a constant as it is in foo () above.
Presumably, gcc is too conservative about what the hallo () function can do.
After adding a pure attribute to hallo (), gcc generates optimal code.

Clang 3.8 gives optimal code in this case as well.
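
For reference, a minimal sketch of the pure-attribute variant described above.
The definition of hallo () and its return value are stand-ins added only so
the example links; whether the strlen call is actually folded depends on the
compiler version and optimization level.

```c
/* "pure" promises that hallo () has no side effects other than its
   return value, so it cannot write to the local array in dummy ().  */
int hallo (void) __attribute__ ((pure));

int
dummy (void)
{
  char array[] = "abc";
  /* With the attribute in place, the compiler is free to fold the
     strlen call to the constant 3.  */
  return hallo () + (int) __builtin_strlen (array);
}

/* Stand-in definition (an assumption, not part of the report).  */
int
hallo (void)
{
  return 39;
}
```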

[Bug rtl-optimization/70030] [LRA]ICE when reload insn with output scratch operand

2016-04-12 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70030

--- Comment #7 from Renlin Li  ---
(In reply to Vladimir Makarov from comment #6)
> Created attachment 38033 [details]
> A patch
> 
> Here is the patch which might solve the problem.

Hi Vladimir, sorry for the late reply. I am just back from holiday.

Thanks for the patch. I have tested that it fixes the ICE reported here!

Scratch registers in reload instructions are now replaced by pseudo registers,
just as in other instructions feeding into LRA.

I have also done a regression test and a bootstrap check. All is good for the
aarch64-none-linux-gnu toolchain.

Are you going to post it?

[Bug rtl-optimization/70030] [LRA]ICE when reload insn with output scratch operand

2016-03-19 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70030

--- Comment #5 from Renlin Li  ---
(In reply to Vladimir Makarov from comment #3)
> (In reply to Ramana Radhakrishnan from comment #2)
> > Waiting.
> 
> Actually, I have a candidate patch to deal with scratches created during
> LRA.  But I can not test it as I have no "local change to gcc", a test case
> and used option set.
> 
> In any case, if this problem is solved by other means (e.g. using another
> patterns), we should probably close the bug.

Yes, it's possible to circumvent this bug by slightly adjusting the patterns.
For example, instead of relying on LRA to create a pseudo (by using
match_scratch), pseudo registers can be created explicitly during the expand
stage, and used as a normal early-clobber register operand in the complex
pattern.

However, the problem described here is still there.

If it's okay for you to share your change, I am quite happy to test it.

[Bug rtl-optimization/70030] New: [LRA]ICE when reload insn with output scratch operand

2016-03-01 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70030

Bug ID: 70030
   Summary: [LRA]ICE when reload insn with output scratch operand
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: renlin at gcc dot gnu.org
  Target Milestone: ---

The ICE is triggered when building a Linux toolchain with a local change to the
gcc aarch64 backend.

vfprintf.c: In function ‘_IO_vfwprintf’:
vfprintf.c:1689:1: internal compiler error: in lra_set_insn_recog_data, at
lra.c:964
 }
 ^
0x952998 lra_set_insn_recog_data(rtx_insn*)
src/gcc/gcc/lra.c:962
0x9537b6 lra_get_insn_recog_data
src/gcc/gcc/lra-int.h:486
0x9537b6 lra_update_insn_regno_info
src/gcc/gcc/lra.c:1584
0x9537b6 lra_update_insn_regno_info
src/gcc/gcc/lra.c:1574
0x953a82 lra_push_insn_1
src/gcc/gcc/lra.c:1649
0x953a82 lra_push_insn(rtx_insn*)
src/gcc/gcc/lra.c:1657
0x953cb7 push_insns
gcc/gcc/lra.c:1700
0x954191 lra_process_new_insns(rtx_insn*, rtx_insn*, rtx_insn*, char const*)
gcc/gcc/lra.c:1754
0x9670e5 curr_insn_transform
src/gcc/gcc/lra-constraints.c:3962
0x968866 lra_constraints(bool)
src/gcc/gcc/lra-constraints.c:4450
0x954cb2 lra(_IO_FILE*)
src/gcc/gcc/lra.c:2277
0x90cfa9 do_reload
src/gcc/gcc/ira.c:5395
0x90cfa9 execute
src/gcc/gcc/ira.c:5566

Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.


The situation is as follows.
To make insn_1 strict, lra generates a new insn_1_reload insn.
In insn_1_reload, there is a scratch operand of this form:
clobber (match_scratch:MODE x "=r")
It's written this way to reserve a pseudo register which will
be used as a temporary within the pattern.

When lra tries to reload insn_1_reload in a later iteration, a new pseudo
register (say RXX) is created to replace this scratch operand in-place.
Additionally, a new insn is generated and inserted after insn_1_reload to
finish the reload. It has this form:
(set scratch, RXX)
This instruction is invalid: no target implements this kind of pattern.
LRA will ICE because of this.


(1)   if (get_reload_reg (type, mode, old, goal_alt[i],
                          loc != curr_id->operand_loc[i], "", &new_reg)
          && type != OP_OUT)
        {
          push_to_sequence (before);
          lra_emit_move (new_reg, old);
          before = get_insns ();
          end_sequence ();
        }
(2)   *loc = new_reg;
      if (type != OP_IN
          && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX)
        {
          start_sequence ();
(3)       lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);
          emit_insn (after);
          after = get_insns ();
          end_sequence ();
          *loc = new_reg;
        }

(1) a reload pseudo register is generated: RXX
(2) replace original operand in-place: (clobber RXX)
(3) insert insn to set output operand: (set scratch, RXX)

[Bug target/63634] Compiler generated R_AARCH64_TLSLE_ADD_TPREL_HI12/LO12 pair overflowed by large TP offset

2016-02-08 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63634

Renlin Li  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 CC||renlin at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #1 from Renlin Li  ---
r227215 [AArch64][TLSLE][3/3] Implement local executable mode for all memory
model
r227213 [AArch64][TLSLE][2/3] Rename SYMBOL_TLSLE to SYMBOL_TLSLE24
r227212 [AArch64][TLSLE][1/3] Add the option "-mtls-size"


Those three patches implemented TLS local executable mode for all memory
models.

I have double-checked that, if -mtls-size is specified properly, the correct
access sequence and relocations are emitted.

For example, in this case -mtls-size=32 should generate a movz/movk pair to
give a 32-bit TP offset.

So I will close this ticket now.
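
A minimal C sketch of the kind of access these patches affect, assuming a
thread-local variable that uses the local-exec model. The variable and
function names are hypothetical, and the command in the comment is only an
example invocation built from the option named above.

```c
/* Thread-local variable forced to the local-exec TLS model, so it is
   addressed as an offset from the thread pointer (TP).  */
static __thread int counter __attribute__ ((tls_model ("local-exec")));

/* Each access to "counter" needs the TP offset of the variable; the
   -mtls-size option tells the compiler how many bits that offset may
   need.  Example build (an assumption, not taken from the report):
     aarch64-none-linux-gnu-gcc -O2 -mtls-size=32 -S tls.c  */
int
bump_counter (void)
{
  return ++counter;
}
```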

[Bug target/64152] internal compiler error: in gen_add2_insn

2016-02-02 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64152

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #3 from Renlin Li  ---
The -mno-lra option has been deprecated since GCC 5 for the arm/aarch64
backends, so this ICE has not manifested since then.

I came across an insn canonicalization problem which I think is similar to this
one, so I spent some time understanding what's going on in both cases.  For the
record, below is what I found.

To reload the following insn,

(set (reg:DI 5 x5)
 (plus:DI (reg/f:DI 31 sp)
  (mem/u/c:DI (symbol_ref/u:DI ("*.LC201") [flags 0x2]) [0  S8 A64])))


two insns are generated by reload:

(set (reg:DI 5 x5)
 (mem/u/c:DI (symbol_ref/u:DI ("*.LC201"

(set (reg:DI 5 x5)
 (plus:DI (reg:DI 5 x5)
  (reg/f:DI 31 sp))) {*adddi3_aarch64}

The second insn here is not a strict rtx, because the rtx pattern defined in
the backend doesn't allow the third operand to be the SP register.

However, at this stage, the rtx pattern is required to be strict.
So this reload is rejected, forcing the reload pass to try other possibilities.
This eventually leads to the ICEs observed here.

I have checked that there is no insn canonicalization rule for this scenario.
Either the target should provide a more relaxed add pattern, or the reload pass
could try to swap the source operands of this commutative operator.
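
To make the second option concrete, here is a toy model in C of "try the
operands in the other order before rejecting". It is not GCC code: the
struct, the register numbers used as plain ints, and both function names are
invented for illustration, and the constraint check is reduced to "the second
source operand must not be SP".

```c
#include <stdbool.h>

enum { REG_X5 = 5, REG_SP = 31 };

/* Toy stand-in for (set dest (plus src1 src2)).  */
struct add_insn
{
  int dest, src1, src2;
};

/* Hypothetical strictness check: models a backend add pattern whose
   last operand does not accept the stack pointer.  */
bool
add_pattern_ok (const struct add_insn *insn)
{
  return insn->src2 != REG_SP;
}

/* Try the insn as written; if it is rejected, swap the two source
   operands (plus is commutative) and try again.  Leaves the insn
   unchanged and returns false when neither order is acceptable.  */
bool
make_add_strict (struct add_insn *insn)
{
  if (add_pattern_ok (insn))
    return true;

  struct add_insn swapped = { insn->dest, insn->src2, insn->src1 };
  if (add_pattern_ok (&swapped))
    {
      *insn = swapped;
      return true;
    }
  return false;
}
```

Applied to the second insn above (x5 = x5 + sp), the swap yields
x5 = sp + x5, which this toy check accepts.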

[Bug rtl-optimization/64895] RA picks the wrong register for -fipa-ra

2016-01-20 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64895

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #12 from Renlin Li  ---
The same happens for aarch64.

> [hjl@gnu-tools-1 gcc]$ cat /tmp/x.c 
> static int __attribute__((noinline))
> bar (int x)
> {
>   if (x > 4)
> return bar (x - 3);
>   return 0;
> }
> 
> int __attribute__((noinline))
> foo (int y)
> {
>   return y + bar (y);
> }
> 

There is another problem here actually.

In this particular case, bar() is a static function which is not exported.
Although -fpic option is provided, pic_offset_table_rtx is not used at all in
function foo().

In this case, pic_offset_table_rtx may not be initialized at all. The target
hook TARGET_INIT_PIC_REG can be improved to initialize pic register only when
necessary.


On the other hand, if pic_offset_table_rtx is not used at all,
lra_risky_transformations_p should not be true. Does that make sense?

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index a78edd8..d4a950f 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4221,7 +4221,8 @@ lra_constraints (bool first_p)
 lra_constraint_iter);
   changed_p = false;
   if (pic_offset_table_rtx
-  && REGNO (pic_offset_table_rtx) >= FIRST_PSEUDO_REGISTER)
+  && (i = REGNO (pic_offset_table_rtx)) >= FIRST_PSEUDO_REGISTER
+  && lra_reg_info[i].nrefs > 0)
 lra_risky_transformations_p = true;
   else
 lra_risky_transformations_p = false;

[Bug target/69008] gcc emits unneeded memory access when passing trivial structs by value (ARM)

2016-01-14 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69008

Renlin Li  changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #2 from Renlin Li  ---
This relates to strict alignment on the aarch32 target. The structure is
treated as BLKmode and will be stored on the stack first.

However, I believe that this can actually be optimized by the DSE pass, which
would forward the value to the ADD operation directly and eliminate the store.
It seems, though, that it is unable to recognize the opportunity here.

For example the following modified test case:

struct Trivial {
short i1;
short i2;
};

int foo(Trivial t)
{
return t.i1 + t.i2;
}

Expand emits the following code, which still stores the structure onto the
stack first. However, DSE can optimize it and remove insn 2.

(insn 2 4 3 2 (set (mem/c:SI (plus:SI (reg/f:SI 105 virtual-stack-vars)
(const_int -4 [0xfffc])) [1  S4 A32])
(reg:SI 0 r0)) test.c:7 -1
 (nil))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI 116)
(sign_extend:SI (mem/c:HI (plus:SI (reg/f:SI 105 virtual-stack-vars)
(const_int -4 [0xfffc])) [2 t.i1+0 S2 A32])))
test.c:8 -1
 (nil))
(insn 7 6 8 2 (set (reg:SI 117)
(sign_extend:SI (mem/c:HI (plus:SI (reg/f:SI 105 virtual-stack-vars)
(const_int -2 [0xfffe])) [2 t.i2+0 S2 A16])))
test.c:8 -1
 (nil))
(insn 8 7 9 2 (set (reg:SI 115)
(plus:SI (reg:SI 116)
(reg:SI 117))) test.c:8 -1
 (nil))


On the other hand, if the original test case is compiled with -mabi=apcs-gnu,
it produces exactly the same code-gen as clang does.
"-mabi=apcs-gnu" changes the target BIGGEST_ALIGNMENT macro to 32.
In this case, the structure is treated as scalar DImode and is no longer
stored on the stack. Expand emits different code from the very beginning.

(insn 6 3 7 2 (set (reg:SI 114)
(plus:SI (subreg:SI (reg/v:DI 113 [ t ]) 0)
(subreg:SI (reg/v:DI 113 [ t ]) 4))) new.c:8 -1
 (nil))
(insn 7 6 11 2 (set (reg:SI 112 [  ])
(reg:SI 114)) new.c:8 -1
 (nil))
(insn 11 7 12 2 (set (reg/i:SI 0 r0)
(reg:SI 112 [  ])) new.c:9 -1
 (nil))
(insn 12 11 0 2 (use (reg/i:SI 0 r0)) new.c:9 -1
 (nil))

[Bug target/69082] Final link fails on ARM using lto

2016-01-12 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69082

--- Comment #14 from Renlin Li  ---
Author: renlin
Date: Tue Jan 12 17:32:18 2016
New Revision: 232284

URL: https://gcc.gnu.org/viewcvs?rev=232284=gcc=rev
Log:
[Backport][PR69082][ARM]Backport "[PATCH][ARM]Tighten the conditions for
arm_movw, arm_movt".

gcc/

2016-01-12  Renlin Li  

PR target/69082
Backport from mainline.
2015-08-24  Renlin Li  

* config/arm/arm-protos.h (arm_valid_symbolic_address_p): Declare.
* config/arm/arm.c (arm_valid_symbolic_address_p): Define.
* config/arm/arm.md (arm_movt): Use arm_valid_symbolic_address_p.
* config/arm/constraints.md ("j"): Add check for high code.


Modified:
branches/gcc-4_9-branch/gcc/ChangeLog
branches/gcc-4_9-branch/gcc/config/arm/arm-protos.h
branches/gcc-4_9-branch/gcc/config/arm/arm.c
branches/gcc-4_9-branch/gcc/config/arm/arm.md
branches/gcc-4_9-branch/gcc/config/arm/constraints.md

[Bug target/69082] Final link fails on ARM using lto

2016-01-12 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69082

Renlin Li  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Renlin Li  ---
Patch backported. It should be fixed then. I will mark it as resolved.

[Bug target/69082] Final link fails on ARM using lto

2016-01-08 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69082

Renlin Li  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |renlin at gcc dot gnu.org

--- Comment #12 from Renlin Li  ---
(In reply to Richard Earnshaw from comment #11)
> Looks like 
> https://gcc.gnu.org/ml/gcc-cvs/2015-08/msg00665.html
> 
> would be an appropriate fix for this.

I verified that this patch fixes the problem described here.

I will do a full regression test first. If nothing is broken, I will send a
backport patch for branch 4.9.

[Bug target/69082] Final link fails on ARM using lto

2016-01-08 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69082

--- Comment #13 from Renlin Li  ---
This problem can be reproduced using gcc 4.9.3 (r225077), and can be fixed by
r227129.

However, on branch 4.9 with the latest code, this bug can no longer be
triggered. I have done a quick bisect and found that it is r231177 which masks
this error.
r231177 changes the register allocation result.

Presumably the problem is still there, as the initial patch was made to fix
exactly the same problem observed on trunk code.

arm-none-linux-gnueabihf tested without any new failures. I will send a
backport patch to the mailing list.

[Bug rtl-optimization/67477] [6 Regression] ICE in cselib_record_set, at cselib.c:2388

2015-12-15 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67477

--- Comment #7 from Renlin Li  ---
(In reply to Jakub Jelinek from comment #4)
> The ICE has been on
> (insn 105 746 971 5 (parallel [
> (set (reg:V16QI 60 d22 [720])
> (unspec:V16QI [
> (reg:V16QI 60 d22 [720])
> (reg:V16QI 60 d22 [720])
> ] UNSPEC_VTRN1))
> (set (reg:V16QI 60 d22 [720])
> (unspec:V16QI [
> (reg:V16QI 60 d22 [720])
> (reg:V16QI 60 d22 [720])
> ] UNSPEC_VTRN2))
> ]) pr67477.c:63 1972 {*neon_vtrnv16qi_insn}
>  (nil))
> which was clearly invalid RTL, multiple sets of the same register.  The insn
> was still ok in the *.ira dump and broken in *.reload dump.
> (define_insn "*neon_vtrn_insn"
>   [(set (match_operand:VDQW 0 "s_register_operand" "=w")
> (unspec:VDQW [(match_operand:VDQW 1 "s_register_operand" "0")
>   (match_operand:VDQW 3 "s_register_operand" "2")]
>  UNSPEC_VTRN1))
>(set (match_operand:VDQW 2 "s_register_operand" "=w")
>  (unspec:VDQW [(match_dup 1) (match_dup 3)]
>  UNSPEC_VTRN2))]
>   "TARGET_NEON"
>   "vtrn.\t%0, %2"
>   [(set_attr "type" "neon_permute")]
> doesn't look like a target bug that would allow 2 same set destinations.

That's exactly what I have observed. r228662 fixes that by adding an
early-clobber modifier to the operand, so that the register allocator can
assign a different register.

[Bug target/67383] reload_cse_simplify_operands fails on ARMV7-M

2015-12-02 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67383

Renlin Li  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Renlin Li  ---
Backport committed as r231177. It should fix the ICE in the particular case.

However, this is not the whole story. I just found another problem.

In the test case, there is a code structure like this.


#include <stdint.h>

uint64_t callee (int a, int b, int c, int d);
uint64_t caller (int a, int b, int c, int d)
{
  uint64_t res;
  int tmp = a;  /* placeholder for the result of the processing below */
/*
a single BB containing complicated data processing which requires register
pairs
*/

  res = callee (tmp, b, c, d);
  return res;
}

The CES pass in this case will extend the hard register live range across the
whole BB up to the callee. As a result, r1, r2, r3 are excluded from the
allocatable registers.


There are places in CES which prevent extending a hard register's live
range, for example for hard registers which fulfil
small_register_classes_for_mode_p() or class_likely_spilled_p(). However,
argument registers satisfy neither of them.


I tried to stop CES from extending the live range of argument registers.
However, the scheduler later jumps in and re-orders the instructions to reduce
the pseudo register pressure, which in effect extends the argument registers'
live range again.

[Bug target/66776] [AArch64] zero-extend version of csel not matching properly

2015-11-17 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66776

Renlin Li  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Renlin Li  ---
resolved.

[Bug rtl-optimization/66556] Wrong code-generation for armv7-a big-endian at -Os

2015-11-17 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66556

Renlin Li  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Renlin Li  ---
resolved

[Bug target/68286] [6 Regression] ICE: in wide_int_to_tree, at tree.c:1468

2015-11-11 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68286

Renlin Li  changed:

   What|Removed |Added

 Target|powerpc64le-unknown-linux-gnu |powerpc64le-unknown-linux-gnu, arm-none-linux-gnueabihf
 CC||renlin at gcc dot gnu.org

--- Comment #3 from Renlin Li  ---
The same issue happens on the arm-none-linux-gnueabihf toolchain.

[Bug tree-optimization/67794] [6 regression] internal compiler error: Segmentation fault

2015-10-27 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67794

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #10 from renlin at gcc dot gnu.org ---
 (In reply to Martin Jambor from comment #9)
> Author: jamborm
> Date: Mon Oct 26 14:36:43 2015
> New Revision: 229367
> 
> URL: https://gcc.gnu.org/viewcvs?rev=229367=gcc=rev
> Log:
> Also remap SSA_NAMEs of PARM_DECLs in IPA-SRA
> 
> 2015-10-26  Martin Jambor  <mjam...@suse.cz>
> 
>   PR tree-optimization/67794
>   * tree-sra.c (replace_removed_params_ssa_names): Do not distinguish
>   between types of statements but accept original definitions as a
>   parameter.
>   (ipa_sra_modify_function_body): Use FOR_EACH_SSA_DEF_OPERAND to
>   iterate over definitions.
> 
> testsuite/
> * gcc.dg/ipa/ipa-sra-10.c: New test.
> * gcc.dg/torture/pr67794.c: Likewise.
> 
> 
> Added:
> branches/gcc-5-branch/gcc/testsuite/gcc.dg/ipa/ipa-sra-10.c
> branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr67794.c
> Modified:
> branches/gcc-5-branch/gcc/ChangeLog
> branches/gcc-5-branch/gcc/testsuite/ChangeLog
> branches/gcc-5-branch/gcc/tree-sra.c

Hi Martin,

After the backport patch to branch 5, aarch-none-elf fails to build because of
the following ICEs.

gcc/gcc/tree-sra.c: In function ‘tree_node*
replace_removed_params_ssa_names(tree, gimple_statement_base**,
ipa_parm_adjustment_vec)’:
gcc/gcc/tree-sra.c:4609:39: error: cannot convert ‘gimple_statement_base**’ to
‘gimple’ for argument ‘2’ to ‘tree_node* make_ssa_name(tree, gimple)’
gcc/gcc/tree-sra.c: In function ‘bool
ipa_sra_modify_function_body(ipa_parm_adjustment_vec)’:
gcc/gcc/tree-sra.c:4703:73: error: cannot convert ‘gphi*’ to
‘gimple_statement_base**’ for argument ‘2’ to ‘tree_node*
replace_removed_params_ssa_names(tree, gimple_statement_base**,
ipa_parm_adjustment_vec)’
gcc/gcc/tree-sra.c:4772:23: error: cannot convert ‘gimple’ to
‘gimple_statement_base**’ for argument ‘2’ to ‘tree_node*
replace_removed_params_ssa_names(tree, gimple_statement_base**,
ipa_parm_adjustment_vec)’

[Bug tree-optimization/67794] [6 regression] internal compiler error: Segmentation fault

2015-10-27 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67794

--- Comment #11 from renlin at gcc dot gnu.org ---
> 
> Hi Martin,
> 
> After the backport patch to branch 5, aarch-none-elf fails to build because
> of the following ICEs.
> 

I mean "aarch64-none-elf" here, sorry for the typo.


[Bug target/67383] reload_cse_simplify_operands fails on ARMV7-M

2015-10-14 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67383

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #5 from renlin at gcc dot gnu.org ---
(In reply to Vladimir Makarov from comment #4)
> I've tried to reproduce it on gcc-4.9 branch as of today but failed.  The
> problem with constraints and overlapped hard regs was probably fixed by
> backported patches.
> 
> Still I have another problem:
> 
> ../lib/mm/mm.c: In function ‘chunk_node’:
> ../lib/mm/mm.c:430:1: internal compiler error: in assign_by_spills, at
> lra-assigns.c:1357
> 0x853dd5 assign_by_spills
>
> /home/cygnus/vmakarov/build1/gcc-4.9-branch/gcc/gcc/lra-assigns.c:1357
> 0x854617 lra_assign()
>
> /home/cygnus/vmakarov/build1/gcc-4.9-branch/gcc/gcc/lra-assigns.c:1503
> 0x84de9c lra(_IO_FILE*)
> /home/cygnus/vmakarov/build1/gcc-4.9-branch/gcc/gcc/lra.c:2388
> 0x80ca16 do_reload
> /home/cygnus/vmakarov/build1/gcc-4.9-branch/gcc/gcc/ira.c:5474
> 0x80ca16 rest_of_handle_reload
> /home/cygnus/vmakarov/build1/gcc-4.9-branch/gcc/gcc/ira.c:5615
> 0x80ca16 execute
> /home/cygnus/vmakarov/build1/gcc-4.9-branch/gcc/gcc/ira.c:5644
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <http://gcc.gnu.org/bugs.html> for instructions.
> 
> The problem is in assigning a hard reg to reload pseudo 442 for insns
> 
>  Choosing alt 0 in insn 153:  (0) =  (1) %0  (2) r {*arm_adddi3}
>   Creating newreg=441, assigning class GENERAL_REGS to r441
>   Creating newreg=442 from oldreg=268, assigning class GENERAL_REGS to
> r442
>   153: {r441:DI=r441:DI+r442:DI;clobber cc:CC;}
>   REG_DEAD r268:DI
>   REG_UNUSED cc:CC
>   REG_EQUIV [sp:SI+0x10]
> Inserting insn reload before:
>   642: r441:DI=[sp:SI+0x8]
>   644: r442:DI=r268:DI
> Inserting insn reload after:
>   643: [sp:SI+0x10]=r441:DI
> 
> We cannot use hard reg 0, 1, 2 as they live through insn 153:
> 
>  ...
>   153: {r272:DI=r268:DI+r159:DI;clobber cc:CC;}
>   REG_DEAD r268:DI
>   REG_UNUSED cc:CC
>   REG_EQUIV [sp:SI+0x10]
>   ...
>   159: call [`debug_printf'] argc:0x20
>   REG_DEAD r1:SI
>   REG_DEAD r0:SI
>   REG_DEAD r2:DI
> 
> Hard reg 7 (FP), 9 (thread), 10 (pic), 13 (sp), 15 (pc) are fixed.  So
> we have only one hole for DI value containing 2 regs (4, 5) and pair
> (4,5) is assigned to 441 and there are no regs for 442.


In this particular case, hard register 12 is free, and hard register 11 can be
spilled to accommodate this DImode pseudo register.

However, the target hook HARD_REGNO_MODE_OK rejects register pairs that start
at an odd-numbered register (11 in this case), so find_hard_regno_for() fails.

I have found that r209615 relaxes the target hook: in Thumb-2 mode, any
register pair is allowed. I have verified that it fixes this ICE.
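
The effect of the even-start restriction can be illustrated with a toy model. This is only a sketch, not GCC source: `find_di_pair` and `pair_start_ok` are hypothetical names, and the bit mask stands in for the set of hard registers that are free or can be spilled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not GCC code: bit n of free_mask set means hard register rn
   is available (free, or acceptable to spill).  */

/* Hypothetical stand-in for the target hook: may a two-register (DImode)
   value start at REGNO?  even_only models the stricter rule that rejects
   pairs starting at an odd-numbered register.  */
static bool pair_start_ok(int regno, bool even_only)
{
    return !even_only || (regno % 2 == 0);
}

/* Return the first register of a usable consecutive pair, or -1 if none.  */
int find_di_pair(uint32_t free_mask, bool even_only)
{
    for (int r = 0; r + 1 < 16; r++) {
        bool both_free = ((free_mask >> r) & 1u) && ((free_mask >> (r + 1)) & 1u);
        if (both_free && pair_start_ok(r, even_only))
            return r;
    }
    return -1;
}
```

With only r11 and r12 available, `find_di_pair((1u << 11) | (1u << 12), true)` finds no pair, while the relaxed rule (`even_only` set to `false`) returns 11.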


I will run a full regression test first. If there are no new issues, I will
backport it to branch 4.9.

[Bug rtl-optimization/67715] [6 Regression][ARM] ICE in cselib.c during reload_cse_regs

2015-10-12 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67715

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #2 from renlin at gcc dot gnu.org ---
I have checked that this ICE has been fixed by the target patch here:
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00609.html

It's exactly the same type of error. The patch has already been committed on
trunk as r228662.


[Bug target/66776] [AArch64] zero-extend version of csel not matching properly

2015-10-02 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66776

--- Comment #1 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Fri Oct  2 11:55:04 2015
New Revision: 228384

URL: https://gcc.gnu.org/viewcvs?rev=228384&root=gcc&view=rev
Log:
[PATCH][AARCH64][PR66776]Add cmovdi_insn_uxtw pattern.

gcc/

2015-10-02  Renlin Li  <renlin...@arm.com>

PR target/66776
* config/aarch64/aarch64.md (cmovdi_insn_uxtw): New pattern.

gcc/testsuite/

2015-10-02  Renlin Li  <renlin...@arm.com>

PR target/66776
* gcc.target/aarch64/pr66776.c: New.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr66776.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.md
trunk/gcc/testsuite/ChangeLog


[Bug target/66776] [AArch64] zero-extend version of csel not matching properly

2015-10-01 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66776

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2015-10-01
 Ever confirmed|0   |1


[Bug rtl-optimization/67028] New: combine bug. Different assumptions about subreg in different places.

2015-07-27 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67028

Bug ID: 67028
   Summary: combine bug. Different assumptions about subreg in
different places.
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: renlin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 36067
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36067&action=edit
test case

This is a combine bug that manifests on the arm target. A test case is attached.

The expected output of the test case should be:
checksum = 1

However, with the following command line:

arm-none-eabi-gcc -march=armv7-a -O3 test.c -specs=rdimon.specs -o a.out

the output is:
checksum = 0

Wrong code is generated at optimization levels -O2, -O3 and -Os;
-O0 and -O1 work fine.


[Bug rtl-optimization/67028] combine bug. Different assumptions about subreg in different places.

2015-07-27 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67028

--- Comment #2 from renlin at gcc dot gnu.org ---
(In reply to Segher Boessenkool from comment #1)
> I have a hard time reproducing this.  Could you show the generated
> assembler code, and say why you think it is a combine bug?

This is the asm generated with this command: cc1 -O3 -march=armv7-a test.c

stmfd   sp!, {r4, lr}
mov     r1, #0
movw    r0, #:lower16:.LC0
movt    r0, #:upper16:.LC0
bl      printf
mov     r0, #0
ldmfd   sp!, {r4, pc}


In simplify_comparison(), for the following rtx pattern,

(and:M1 (subreg:M2 X 0) (const_int C1))

the code will try to permute the SUBREG and AND when WORD_REGISTER_OPERATIONS
is defined and the subreg here is paradoxical. There is an assumption here: the
upper bits of the subreg should all be zeros.

However, this is not always true. In this particular test case, the AND
operation, which ensures the higher bits are zero, is removed. The register
here has two CONST_INT values in an if-then-else pattern. When further
simplifying this if-then-else pattern, subreg is applied to those two CONST_INT
values.

In simplify_immed_subreg, CONST_INT is always sign-extended to a larger mode.
The different assumptions cause the wrong code generation.
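
The mismatch between the two assumptions can be sketched in plain C. This only illustrates the general hazard (zero upper bits versus sign extension). It is not the RTL from the test case, and the function names are hypothetical.

```c
#include <stdint.h>

/* Widen a 16-bit value to 32 bits assuming the upper bits are zero,
   which is what the SUBREG/AND permutation relies on.  */
uint32_t widen_zero(uint16_t v)
{
    return (uint32_t)v;
}

/* Widen a 16-bit constant by sign extension, which is how a CONST_INT
   is widened to a larger mode.  */
uint32_t widen_sign(uint16_t v)
{
    return (v & 0x8000u) ? ((uint32_t)v | 0xFFFF0000u) : (uint32_t)v;
}

/* A masked comparison in the wider mode: nonzero if (wide & mask) == c.  */
int cmp_after_mask(uint32_t wide, uint32_t mask, uint32_t c)
{
    return (wide & mask) == c;
}
```

For a value with the top bit set, such as 0x8000, the two widenings differ in the upper half, so a comparison that looks at those bits gives different answers depending on which assumption was applied.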


What's more, the GCC internals documentation says that subregs of subregs are
not supported. However, a subreg-of-subreg pattern will be generated by the
combine pass and simplified by simplify_subreg().

For example:
subreg:SI (subreg:HI reg:SI r10) is simplified to reg:SI r10


[Bug rtl-optimization/66556] Wrong code-generation for armv7-a big-endian at -Os

2015-07-15 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66556

--- Comment #4 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Wed Jul 15 15:13:36 2015
New Revision: 225835

URL: https://gcc.gnu.org/viewcvs?rev=225835&root=gcc&view=rev
Log:
[PATCH]Fix PR66556. Don't drop side-effect in
simplify_const_relational_operation function.

gcc/

Backport from mainline.
2015-07-13  Renlin Li  <renlin...@arm.com>

PR rtl/66556
* simplify-rtx.c (simplify_const_relational_operation): Add
side_effects_p checks.

gcc/testsuite/

Backport from mainline.
2015-07-13  Renlin Li  <renlin...@arm.com>

PR rtl/66556
* gcc.c-torture/execute/pr66556.c: New.


Added:
branches/gcc-5-branch/gcc/testsuite/gcc.c-torture/execute/pr66556.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/simplify-rtx.c
branches/gcc-5-branch/gcc/testsuite/ChangeLog


[Bug rtl-optimization/66556] Wrong code-generation for armv7-a big-endian at -Os

2015-07-13 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66556

--- Comment #3 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Mon Jul 13 08:29:46 2015
New Revision: 225729

URL: https://gcc.gnu.org/viewcvs?rev=225729&root=gcc&view=rev
Log:
[PATCH]Fix PR66556. Don't drop side-effect in
simplify_const_relational_operation function.

gcc/

2015-07-13  Renlin Li  <renlin...@arm.com>

PR rtl/66556
* simplify-rtx.c (simplify_const_relational_operation): Add
side_effects_p checks.

gcc/testsuite/

2015-07-13  Renlin Li  <renlin...@arm.com>

PR rtl/66556
* gcc.c-torture/execute/pr66556.c: New.

Added:
trunk/gcc/testsuite/gcc.c-torture/execute/pr66556.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/simplify-rtx.c
trunk/gcc/testsuite/ChangeLog


[Bug rtl-optimization/66556] Wrong code-generation for armv7-a big-endian at -Os

2015-06-16 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66556

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2015-06-16
 Ever confirmed|0   |1


[Bug rtl-optimization/66556] Wrong code-generation for armv7-a big-endian at -Os

2015-06-16 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66556

--- Comment #1 from renlin at gcc dot gnu.org ---
(insn 22 94 24 4 (set (reg:SI 140 [ g+2 ])
(zero_extend:SI (mem/c:HI (post_modify:SI (reg/f:SI 156)
(plus:SI (reg/f:SI 156)
(const_int 20 [0x14]))) [5 g+4 S2 A32]))) test.c:36 159
{*arm_zero_extendhisi2_v6}
 (expr_list:REG_INC (reg/f:SI 156)
(expr_list:REG_EQUAL (zero_extend:SI (mem/c:HI (const:SI (plus:SI
(symbol_ref:SI (*.LANCHOR0) [flags 0x182])
(const_int 256 [0x100]))) [5 g+4 S2 A32]))
(nil
(insn 24 22 25 4 (set (subreg:SI (reg:HI 141 [ D.4259 ]) 0)
(zero_extract:SI (reg:SI 140 [ g+2 ])
(const_int 15 [0xf])
(const_int 1 [0x1]))) test.c:36 138 {extzv_t2}
 (expr_list:REG_DEAD (reg:SI 140 [ g+2 ])
(nil)))
(insn 25 24 27 4 (set (reg:SI 142 [ D.4255 ])
(zero_extend:SI (reg:HI 141 [ D.4259 ]))) test.c:36 159
{*arm_zero_extendhisi2_v6}
 (expr_list:REG_DEAD (reg:HI 141 [ D.4259 ])
(nil)))
(insn 33 32 34 4 (set (reg:CC 100 cc)
(compare:CC (reg:SI 142 [ D.4255 ])
(reg:SI 150 [ D.4255 ]))) test.c:36 188 {*arm_cmpsi_insn}
 (expr_list:REG_DEAD (reg:SI 150 [ D.4255 ])
(expr_list:REG_DEAD (reg:SI 142 [ D.4255 ])
(nil
(insn 34 33 36 4 (set (reg:SI 152)
(ltu:SI (reg:CC 100 cc)
(const_int 0 [0]))) test.c:36 198 {*mov_scc}
 (expr_list:REG_DEAD (reg:CC 100 cc)
(nil)))

In the combine pass, the above rtx are simplified and combined, and insns 22,
24, 25 and 33 are marked as deleted. However, the side effect of insn 22, the
post_modify, is not preserved.

(insn 43 41 45 4 (set (mem/c:HI (plus:SI (reg/f:SI 156)
(const_int 8 [0x8])) [4 MEM[(short int *)i + 8B]+0 S2 A16])
So for insn 43, the data is stored in the wrong place.
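
A toy model of why dropping the post_modify moves the later store. This is a sketch only: the step of 20 and the offset of 8 are taken from the dump above, while the function names and the buffer are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of a post-modify load: read a halfword at *base, then advance
   the base register by step bytes (the side effect).  */
static uint16_t load_post_modify(const uint8_t **base, ptrdiff_t step)
{
    uint16_t v = (uint16_t)(((*base)[0] << 8) | (*base)[1]);
    *base += step;
    return v;
}

/* Byte offset, from the start of the buffer, that a later store to
   [base + 8] would hit.  keep_side_effect == 0 models the load being
   deleted together with its address update.  */
ptrdiff_t store_offset(int keep_side_effect)
{
    static const uint8_t buf[64] = {0};
    const uint8_t *base = buf;

    if (keep_side_effect)
        (void)load_post_modify(&base, 20);

    return (base + 8) - buf;
}
```

With the side effect kept, the store lands at offset 28; with it dropped, the store lands at offset 8, which is the wrong place.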


[Bug rtl-optimization/66556] New: Wrong code-generation for armv7-a big-endian at -Os

2015-06-16 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66556

Bug ID: 66556
   Summary: Wrong code-generation for armv7-a big-endian at -Os
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: renlin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 35789
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35789&action=edit
test case

The test case is attached.

Toolchains built from the latest trunk code and from branch 5 generate wrong
code with the following command-line options.

arm-none-eabi-gcc -march=armv7-a -mbig-endian -Os test.c -o test.out

The correct output should be:
checksum = ff

However, the result is:
checksum = 7

The testcase is correctly compiled at -O1, which gives the right execution
result. The test case works fine for little-endian at any optimization level.


[Bug target/65326] LRA missing a Thumb optimization.

2015-05-07 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65326

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #1 from renlin at gcc dot gnu.org ---
In this specific case, thumb_legitimize_address will generate an
ldr r0, [r9, r10] pattern (after IRA). However, this pattern only allows
LO_REGS. During reload, r9 and r10 will be spilled into LO_REGS; that is where
those two mov instructions come from.

(In reply to Matthew Wahab from comment #0)
> Created attachment 34964 [details]
> Testcase showing change in behaviour.
> 
> The ARM backend no longer supports -mno-lra so only the LRA is available.
> This has also removed the Thumb mode optimization introduced in
> https://gcc.gnu.org/ml/gcc-patches/2005-08/msg01140.html to fix PR 23436.
> 
> This turns sequences like
>   mov r3, r9
>   mov r2, r10
>   ldr r0, [r3, r2]
> into
>   mov r3, r9
>   add r3, r3, r10
>   ldr r0, [r3]
> which saves a register.
> 
> Attached is a contrived test case. Compiling with gcc-4.9 with -mthumb
> -mno-lra (at -O1 and higher) produces the second (better) sequence.
> Compiling with gcc-4.9 or gcc-trunk with -mthumb (at -O1 and higher)
> produces the first sequence. The sequences appear after the 'nop'.
> 
> gcc-4.9 is
> arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 4.9.3 20141119
> (release) [ARM/embedded-4_9-branch revision 218278]
> 
> trunk is:
> arm-none-eabi-gcc (unknown) 5.0.0 20150217 (experimental)

[Bug target/65459] SLOW_UNALIGNED_ACCESS unconditionally set to 1 for ARM targets

2015-03-20 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65459

renlin at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |renlin at gcc dot gnu.org

--- Comment #3 from renlin at gcc dot gnu.org ---
Confirmed; assigning it to myself.


[Bug tree-optimization/46038] Vectorizer generates misaligned address for vld1 qn, [rn:alignment]

2015-03-12 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46038

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 CC||renlin at gcc dot gnu.org

--- Comment #1 from renlin at gcc dot gnu.org ---
I cannot reproduce the fault in 4.9 or trunk.


[Bug libstdc++/64467] [5 Regression] 28_regex/traits/char/isctype.cc and wchar_t/isctype.cc

2015-02-04 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64467

--- Comment #7 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Wed Feb  4 09:24:56 2015
New Revision: 220392

URL: https://gcc.gnu.org/viewcvs?rev=220392&root=gcc&view=rev
Log:
[PATCH][libstdc++][Testsuite] isctype test fails for newlib.

libstdc++-v3/
2015-02-02  Matthew Wahab  <matthew.wa...@arm.com>

PR libstdc++/64467
* testsuite/28_regex/traits/char/isctype.cc (test01): Add newlib
special case for '\n'.
* testsuite/28_regex/traits/wchar_t/isctype.cc (test01): Likewise.


Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc
trunk/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc


[Bug target/64149] -mno-lra bitrots, suggest to remove for GCC 5

2015-01-20 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64149

--- Comment #7 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Tue Jan 20 10:26:18 2015
New Revision: 219884

URL: https://gcc.gnu.org/viewcvs?rev=219884&root=gcc&view=rev
Log:
[ARM] PR 64149: Remove -mlra/-mno-lra option for ARM.

gcc/
2015-01-20  Matthew Wahab  <matthew.wa...@arm.com>

PR target/64149
* config/arm/arm.opt: Remove lra option and arm_lra_flag variable.
* config/arm/arm.h (MODE_BASE_REG_CLASS): Remove use of arm_lra_flag,
replace the conditional with its true branch.
* config/arm/arm.c (TARGET_LRA_P): Set to hook_bool_void_true.
(arm_lra_p): Remove.

gcc/testsuite/
2015-01-20  Matthew Wahab  <matthew.wa...@arm.com>

PR target/64149
* gcc.target/arm/thumb1-far-jump-3.c: Remove.


Removed:
trunk/gcc/testsuite/gcc.target/arm/thumb1-far-jump-3.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c
trunk/gcc/config/arm/arm.h
trunk/gcc/config/arm/arm.opt
trunk/gcc/testsuite/ChangeLog


[Bug target/61413] __ARM_SIZEOF_WCHAR_T is constant 32 -- should be 4 or 2

2015-01-14 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61413

--- Comment #4 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Wed Jan 14 11:02:24 2015
New Revision: 219587

URL: https://gcc.gnu.org/viewcvs?rev=219587&root=gcc&view=rev
Log:
[ARM]Fix definition of __ARM_SIZEOF_WCHAR_T.

Backport from mainline:
2014-08-12  Ramana Radhakrishnan  <ramana.radhakrish...@arm.com>

PR target/61413
* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Fix definition
of __ARM_SIZEOF_WCHAR_T.

Modified:
branches/gcc-4_8-branch/gcc/ChangeLog
branches/gcc-4_8-branch/gcc/config/arm/arm.h


[Bug target/61413] __ARM_SIZEOF_WCHAR_T is constant 32 -- should be 4 or 2

2015-01-14 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61413

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 CC||renlin at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #5 from renlin at gcc dot gnu.org ---
Backported to branches 4.8 and 4.9.
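
A quick consistency check for the macro can be sketched as follows. It assumes the ACLE meaning of `__ARM_SIZEOF_WCHAR_T` (the size of `wchar_t` in bytes, so 2 or 4). The function name is hypothetical, and on targets that do not define the macro there is nothing to check.

```c
#include <stddef.h>

/* Return 1 if __ARM_SIZEOF_WCHAR_T agrees with sizeof(wchar_t), or if the
   macro is not defined on this target; return 0 on a mismatch (for example
   when the macro expands to 32 instead of 4).  */
int arm_wchar_macro_consistent(void)
{
#ifdef __ARM_SIZEOF_WCHAR_T
    return __ARM_SIZEOF_WCHAR_T == (int)sizeof(wchar_t);
#else
    return 1; /* macro not defined: nothing to check here */
#endif
}
```

Building this with and without -fshort-wchar on an ARM toolchain exercises both expected values.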


[Bug target/63424] [4.9 regression] Octave -O3 build: internal compiler error: in prepare_cmp_insn, at optabs.c:4237

2015-01-13 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63424

--- Comment #6 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Tue Jan 13 16:25:00 2015
New Revision: 219540

URL: https://gcc.gnu.org/viewcvs?rev=219540&root=gcc&view=rev
Log:
[AArch64] Implement <su><maxmin>v2di3 pattern

Backport from mainline
2014-11-19  Renlin Li  <renlin...@arm.com>

gcc/:
PR target/63424
* config/aarch64/aarch64-simd.md (<su><maxmin>v2di3): New.

gcc/testsuite/:
PR target/63424
* gcc.target/aarch64/pr63424.c: New test.


Added:
branches/gcc-4_9-branch/gcc/testsuite/gcc.target/aarch64/pr63424.c
Modified:
branches/gcc-4_9-branch/gcc/ChangeLog
branches/gcc-4_9-branch/gcc/config/aarch64/aarch64-simd.md
branches/gcc-4_9-branch/gcc/testsuite/ChangeLog


[Bug target/63424] [4.9 regression] Octave -O3 build: internal compiler error: in prepare_cmp_insn, at optabs.c:4237

2015-01-13 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63424

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 CC||renlin at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #7 from renlin at gcc dot gnu.org ---
Backported to 4.9.


[Bug ipa/64551] Segfault in target_opts_for_fn (from ipa_icf::sem_function::equals_private)

2015-01-09 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64551

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 Target|alpha-linux-gnu |alpha-linux-gnu,
   ||arm-none-linux-gnueabi
 CC||renlin at gcc dot gnu.org

--- Comment #1 from renlin at gcc dot gnu.org ---
I observed the same issue on the arm-none-linux-gnueabi target.


[Bug middle-end/64552] Build broken for cris-elf and others

2015-01-09 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64552

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||renlin at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #1 from renlin at gcc dot gnu.org ---
presumably a duplicate of 64551

*** This bug has been marked as a duplicate of bug 64551 ***


[Bug ipa/64551] Segfault in target_opts_for_fn (from ipa_icf::sem_function::equals_private)

2015-01-09 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64551

renlin at gcc dot gnu.org changed:

   What|Removed |Added

 CC||hp at gcc dot gnu.org

--- Comment #2 from renlin at gcc dot gnu.org ---
*** Bug 64552 has been marked as a duplicate of this bug. ***


[Bug target/61413] __ARM_SIZEOF_WCHAR_T is constant 32 -- should be 4 or 2

2015-01-09 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61413

--- Comment #3 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Fri Jan  9 13:55:16 2015
New Revision: 219386

URL: https://gcc.gnu.org/viewcvs?rev=219386&root=gcc&view=rev
Log:
[ARM]Fix definition of __ARM_SIZEOF_WCHAR_T.

Backport from mainline:
2014-08-12  Ramana Radhakrishnan  <ramana.radhakrish...@arm.com>

PR target/61413
* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Fix definition
of __ARM_SIZEOF_WCHAR_T.

Modified:
branches/gcc-4_9-branch/gcc/ChangeLog
branches/gcc-4_9-branch/gcc/config/arm/arm.h


[Bug target/63661] [4.9 Regression] -O2 miscompiles with -mtune=nehalem or corei7

2014-12-03 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63661

--- Comment #29 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Wed Dec  3 11:13:50 2014
New Revision: 218306

URL: https://gcc.gnu.org/viewcvs?rev=218306&root=gcc&view=rev
Log:
Backported from mainline

gcc/

2014-12-03  Renlin Li  <renlin...@arm.com>

PR middle-end/63762
PR target/63661
* ira.c (ira): Update preferred class.

gcc/testsuite/

2014-12-03  Renlin Li  <renlin...@arm.com>
H.J. Lu  <hongjiu...@intel.com>

PR middle-end/63762
PR target/63661
* gcc.dg/pr63762.c: New test.
* gcc.target/i386/pr63661.c: New test.

Added:
branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/pr63762.c
branches/gcc-4_9-branch/gcc/testsuite/gcc.target/i386/pr63661.c
Modified:
branches/gcc-4_9-branch/gcc/ChangeLog
branches/gcc-4_9-branch/gcc/ira.c
branches/gcc-4_9-branch/gcc/testsuite/ChangeLog


[Bug middle-end/63762] [ARM]GCC generates UNPREDICTABLE STR with Rn = Rt when hard-float abi is used

2014-12-03 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63762

--- Comment #8 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Wed Dec  3 11:13:50 2014
New Revision: 218306

URL: https://gcc.gnu.org/viewcvs?rev=218306&root=gcc&view=rev
Log:
Backported from mainline

gcc/

2014-12-03  Renlin Li  <renlin...@arm.com>

PR middle-end/63762
PR target/63661
* ira.c (ira): Update preferred class.

gcc/testsuite/

2014-12-03  Renlin Li  <renlin...@arm.com>
H.J. Lu  <hongjiu...@intel.com>

PR middle-end/63762
PR target/63661
* gcc.dg/pr63762.c: New test.
* gcc.target/i386/pr63661.c: New test.

Added:
branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/pr63762.c
branches/gcc-4_9-branch/gcc/testsuite/gcc.target/i386/pr63661.c
Modified:
branches/gcc-4_9-branch/gcc/ChangeLog
branches/gcc-4_9-branch/gcc/ira.c
branches/gcc-4_9-branch/gcc/testsuite/ChangeLog


[Bug target/63661] [4.9/5 Regression] -O2 miscompiles with -mtune=nehalem or corei7

2014-11-28 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63661

--- Comment #24 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Fri Nov 28 11:01:27 2014
New Revision: 218143

URL: https://gcc.gnu.org/viewcvs?rev=218143&root=gcc&view=rev
Log:
Add testcase for PR63661.

2014-11-28  Renlin Li  <renlin...@arm.com>

PR target/63661
* gcc.target/i386/pr63661.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr63661.c
Modified:
trunk/gcc/testsuite/ChangeLog


[Bug target/63661] [4.9/5 Regression] -O2 miscompiles with -mtune=nehalem or corei7

2014-11-28 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63661

--- Comment #25 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Fri Nov 28 11:18:47 2014
New Revision: 218144

URL: https://gcc.gnu.org/viewcvs?rev=218144&root=gcc&view=rev
Log:
Use native tune. nehalem is not able to trigger the issue in trunk any more.

2014-11-28  Renlin Li  <renlin...@arm.com>

PR target/63661
* gcc.target/i386/pr63661.c: Use native tune.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/i386/pr63661.c


[Bug middle-end/63762] [ARM]GCC generates UNPREDICTABLE STR with Rn = Rt when hard-float abi is used

2014-11-19 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63762

--- Comment #5 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Wed Nov 19 15:15:51 2014
New Revision: 217783

URL: https://gcc.gnu.org/viewcvs?rev=217783&root=gcc&view=rev
Log:
2014-11-19  Renlin Li  <renlin...@arm.com>

PR middle-end/63762
* ira.c (ira): Update preferred class. 

* gcc.dg/pr63762.c: New test. 

Added:
trunk/gcc/testsuite/gcc.dg/pr63762.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ira.c
trunk/gcc/testsuite/ChangeLog


[Bug target/63424] Octave -O3 build: internal compiler error: in prepare_cmp_insn, at optabs.c:4237

2014-11-19 Thread renlin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63424

--- Comment #4 from renlin at gcc dot gnu.org ---
Author: renlin
Date: Wed Nov 19 16:34:38 2014
New Revision: 217786

URL: https://gcc.gnu.org/viewcvs?rev=217786&root=gcc&view=rev
Log:
[AArch64] Implement <su><maxmin>v2di3 pattern

gcc/:
PR target/63424
* config/aarch64/aarch64-simd.md (<su><maxmin>v2di3): New.

gcc/testsuite/:
PR target/63424
* gcc.target/aarch64/pr63424.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr63424.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64-simd.md
trunk/gcc/testsuite/ChangeLog