[Bug testsuite/112728] gcc.dg/scantest-lto.c FAILs

2023-11-30 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112728

--- Comment #3 from Jorn Wolfgang Rennecke  ---
(In reply to Rainer Orth from comment #0)
> The gcc.dg/scantest-lto.c FAILs on quite a number of targets:
... 
> * On Darwin, the __TEXT,__eh_frame contains .ascii because the assembler
>   lacks support for cfi directives.

I suppose we could handle the darwin case by:

- not doing the common scan-assembler* tests for darwin
- doing a scan-assembler-times test that expects exactly as many .ascii
  directives as are emitted for cfi.

[Bug target/112651] RISC-V Vector new option -mvect-lmul required to force LMUL values (rather than --param=riscv-autovec-lmul to hint at values)

2023-11-21 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112651

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #2 from Jorn Wolfgang Rennecke  ---
We can in fact have vector code without the intervention of the autovectorizer:
if the user writes explicitly vectorized code in GNU C, code generation will
simply translate it to target instructions if the modes are available.
Where the mode is too wide for the hardware because it doesn't support LMUL > 1,
we want the vector lowering to kick in.
I think we should achieve this aim by disabling vector modes altogether that
are too wide for the hardware.
That alone is not a full solution, though, since a number of vector modes
can be obtained with more than one LMUL value.  Often, the higher LMUL values
appear to be more efficient when just counting instructions because they allow
vsetivli to be used for larger vectors, thus reducing the need to load
constants into general purpose registers first.
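
For illustration, explicitly vectorized GNU C of the kind mentioned above might look like the following. This is only a minimal sketch: the names v8si, add_v8si and sum_v8si are made up for the example, and whether the 256-bit type maps to hardware vector instructions or gets lowered depends on which vector modes the target enables.

```c
#include <stdint.h>

/* Eight 32-bit lanes, i.e. a 256-bit vector type, using the GNU C
   vector_size attribute.  The type name is an assumption for this sketch.  */
typedef int32_t v8si __attribute__ ((vector_size (32)));

/* Element-wise addition written directly by the user; the autovectorizer
   is not involved.  Operands are passed by pointer to avoid depending on
   the target's vector argument-passing ABI.  */
void
add_v8si (const v8si *a, const v8si *b, v8si *out)
{
  *out = *a + *b;
}

/* Hypothetical helper for checking results: sum all lanes of a vector.  */
int32_t
sum_v8si (const v8si *v)
{
  int32_t sum = 0;
  for (int i = 0; i < 8; i++)
    sum += (*v)[i];
  return sum;
}
```

If the V8SI mode is not available, GCC's generic vector lowering is expected to split the addition into narrower operations, which is the behaviour asked for above when LMUL > 1 is not supported.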

[Bug target/112537] Is there a way to disable cpymem pass for rvv

2023-11-17 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112537

--- Comment #13 from Jorn Wolfgang Rennecke  ---
Before we can consider any costs, we first have to know what they are.  Is
there any manual for a hardware implementation that specifies costs?

[Bug target/112537] Is there a way to disable cpymem pass for rvv

2023-11-17 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112537

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #12 from Jorn Wolfgang Rennecke  ---
(In reply to JuzheZhong from comment #2)
> Currently, we don't have a compile option to disable cpymem by RVV.

If you don't want any vector instructions to be emitted, why do you tell the
compiler to enable the 'v' extension of the architecture?

[Bug testsuite/111298] time-profiler-2.c flaky on glibc RISC-V

2023-11-08 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111298

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #3 from Jorn Wolfgang Rennecke  ---
(In reply to Patrick O'Neill from comment #0)
> I'm guessing that this is likely due to some conflict between
> time-profiler-1.c and time-profiler-2.c and filing this under testsuite
> framework issue, but feel free to move it if it's likely caused by a
> specific component.

My guess is that the atomic fetch-and-update emitted by
gimple_gen_time_profiler is not actually atomic (at least under RISC-V Qemu).
Note that in time-profiler-2.c, there is a parent and a child process that
access the same gcov data.
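
As a rough sketch of the kind of update in question (not the code GCC actually emits), a time-profiler style counter records the order in which functions are first executed. The names global_time_counter and record_first_run are hypothetical; __atomic_add_fetch is the GCC builtin.

```c
#include <stdint.h>

/* Hypothetical stand-in for the global time-profiler counter.  */
static int64_t global_time_counter;

/* Record the order of first execution: if *slot is still zero, store the
   next value of the global counter in it.  The increment itself is atomic,
   but the zero check and the store to *slot are not part of that atomic
   operation.  */
void
record_first_run (int64_t *slot)
{
  if (*slot == 0)
    *slot = __atomic_add_fetch (&global_time_counter, 1, __ATOMIC_RELAXED);
}
```

If the increment is not really atomic on the execution platform, two functions can end up with the same ordinal, which would be one way for an ordering test to become flaky.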

[Bug testsuite/111658] New: test-function-bodies fails to find functions with single-letter names

2023-10-01 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111658

Bug ID: 111658
   Summary: test-function-bodies fails to find functions with
single-letter names
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org
  Target Milestone: ---

When you use check-function-bodies with a function that has a single-letter
name, the start regexp set by configure_check-function-bodies and used by
parse_function_bodies to find function starts fails to match, making the
test always fail.
There is no mention of such a restriction in sourcebuild.texi.

[Bug testsuite/110951] [13/14] RISCV: rv32 newlib gcc.c-torture testsuite fails with xgcc: fatal error: Cannot find suitable multilib set for '-march=rv32imafdc_zicsr_zifencei'/'-mabi=ilp32d'

2023-10-01 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110951

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #3 from Jorn Wolfgang Rennecke  ---
I see something like this come up randomly (i.e. not strictly reproducible)
with gcc14 about one to three times per million tests, in parts like gcc.dg.
I wonder if it could be related?  What was your testing environment problem?

[Bug target/111566] RISC-V Vector Fortran: ICE in final_scan_insn_1 (final RTL pass)

2023-09-29 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111566

--- Comment #5 from Jorn Wolfgang Rennecke  ---
I had a look at riscv_legitimize_move.  It doesn't seem to suffer from quite
the same problem as legitimize_move does, but it could if another problem was
fixed: riscv_legitimize_move changes the rtl it's passed.  That can lead to
trouble if this is shared rtl.

[Bug target/111566] RISC-V Vector Fortran: ICE in final_scan_insn_1 (final RTL pass)

2023-09-29 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111566

--- Comment #4 from Jorn Wolfgang Rennecke  ---
Also, the GET_MODE_BITSIZE (mode).to_constant () <= MAX_BITS_PER_WORD
in the *mov_mem_to_mem splitter can generate unaligned accesses, yet it
is not guarded by a check that the target supports them.

[Bug target/111566] RISC-V Vector Fortran: ICE in final_scan_insn_1 (final RTL pass)

2023-09-29 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111566

--- Comment #3 from Jorn Wolfgang Rennecke  ---
riscv-v.cc:legitimize_move has:

  if (MEM_P (dest) && !REG_P (src))
    src = force_reg (mode, src);

  return false;

Since src is passed by value, this is pointless.  The caller still has src
as a MEM.
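
The effect can be shown in isolation with a small stand-alone C sketch. The struct and function names are invented for the example and are not GCC code; the second function shows one possible way to make the update visible to the caller, not necessarily the fix that suits riscv-v.cc.

```c
#include <stddef.h>

/* Hypothetical stand-in for an rtx: just a tag we can inspect.  */
struct node
{
  int is_reg;
};

static struct node a_register = { 1 };

/* Same shape as the quoted code: src is a by-value copy of the caller's
   pointer, so assigning to it changes nothing for the caller.  */
int
legitimize_by_value (struct node *src)
{
  if (!src->is_reg)
    src = &a_register;		/* Lost on return.  */
  return 0;
}

/* One possible alternative: take the address of the caller's pointer so
   the replacement is visible after the call.  */
int
legitimize_by_pointer (struct node **src)
{
  if (!(*src)->is_reg)
    *src = &a_register;
  return 0;
}
```

After legitimize_by_value the caller's pointer still refers to the original non-register node; only legitimize_by_pointer replaces it.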

[Bug target/111566] RISC-V Vector Fortran: ICE in final_scan_insn_1 (final RTL pass)

2023-09-29 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111566

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #2 from Jorn Wolfgang Rennecke  ---
This also causes trouble with my cpymem patch.

With the *movv8si_mem_to_mem pattern, ira.cc:combine_and_move_insns
will eagerly transform

(insn 1606 1603 1608 77 (set (reg/f:SI 1187)
(plus:SI (reg/f:SI 65 frame)
(const_int -1248 [0xfb20])))
"/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0
discrim 126 4 {*addsi3}
 (nil))

(insn 1608 1606 1609 77 (set (reg:V8SI 1189)
(mem/u/c:V8SI (reg/f:SI 5064) [0  S32 A128]))
"/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0
discrim 126 1151 {*movv8si}
 (expr_list:REG_DEAD (reg/f:SI 5064)
(expr_list:REG_EQUAL (mem/u/c:V8SI (const:SI (plus:SI (symbol_ref:SI
("*.LANCHOR0") [flags 0x182])
(const_int 64 [0x40]))) [0  S32 A128])
(nil

(insn 1609 1608 12961 77 (set (mem/v/c:V8SI (reg/f:SI 1187) [1  S32 A128])
(reg:V8SI 1189))
"/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0
discrim 126 1151 {*movv8si}
 (expr_list:REG_DEAD (reg:V8SI 1189)
(expr_list:REG_DEAD (reg/f:SI 1187)
(nil

into


(insn 1608 1603 16000 77 (set (reg:V8SI 1189)
(mem/u/c:V8SI (reg/f:SI 5064) [0  S32 A128]))
"/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0
discrim 126 1151 {*movv8si}
 (expr_list:REG_EQUIV (mem/u/c:V8SI (const:SI (plus:SI (symbol_ref:SI
("*.LANCHOR0") [flags 0x182])
(const_int 64 [0x40]))) [0  S32 A128])
(expr_list:REG_DEAD (reg/f:SI 5064)
(nil

(insn 16000 1608 1609 77 (set (reg/f:SI 1187)
(plus:SI (reg/f:SI 65 frame)
(const_int -1248 [0xfb20])))
"/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0
discrim 126 4 {*addsi3}
 (expr_list:REG_EQUIV (plus:SI (reg/f:SI 65 frame)
(const_int -1248 [0xfb20]))
(nil)))

(insn 1609 16000 12961 77 (set (mem/v/c:V8SI (reg/f:SI 1187) [1  S32 A128])
(mem/u/c:V8SI (reg/f:SI 5064) [0  S32 A128]))
"/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0
discrim 126 -1
 (expr_list:REG_DEAD (reg:V8SI 1189)
(expr_list:REG_DEAD (reg/f:SI 1187)
(nil

during compilation of check_add_long_double.

When a pattern with a mandatory split is recognized, you must make sure it can
be split.  If the pattern ceases to be valid at some point during the
compilation, you must make sure it can be split or otherwise transformed
before another attempt to recognize it is made.

[Bug target/111020] RFE: RISC-V: ability to cherry-pick additional instructions

2023-08-15 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111020

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #6 from Jorn Wolfgang Rennecke  ---
(In reply to H. Peter Anvin from comment #5)

> 2. It seems like it almost would require an implementation-specific
> performance model. Now, one can validly argue that by setting the cost of
> unimplemented instructions to a (near-)infinite value such instructions
> should never be generated even if they are "enabled". That might also be a
> possible avenue for achieving this.

Yes, that makes it possible to implement the interface without actually having
a dedicated mask table.  However, you still have the headache of how to get
code generation to use this effectively.  A lot of code generation strategies
are basically canned solutions that a skilled assembler programmer has devised;
you can theoretically use the superoptimizer to find linear sequences for
arbitrary instruction sets, but the compilation time cost and the limit to
linear sequences makes this impractical.
Therefore, as you want to co-develop architecture and software, you likely also
have to hack the compiler to make effective use of your architecture.
FWIW, 'infinite' cost seems unnecessarily high, considering you could make your
assembler replace missing instructions with function calls, and these functions
can get linked from a library.  So you have a finite cost per-call for the call
site size (static instruction count) & time (dynamic instruction count), and a
one-time size cost per-object for each function used.  Such a library and
assembler modification could be prepared for specific extensions that you want
to deconstruct, and then used flexibly.

[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’

2021-05-19 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361

--- Comment #8 from Jorn Wolfgang Rennecke  ---
Bootstrapped and regression tested on x86_64-pc-linux-gnu.

[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’

2021-05-18 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

  Attachment #50837|0   |1
is obsolete||

--- Comment #7 from Jorn Wolfgang Rennecke  ---
Created attachment 50839
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50839&action=edit
Amended patch

This patch also disables the affected tests.

[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’

2021-05-18 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361

--- Comment #5 from Jorn Wolfgang Rennecke  ---
(In reply to Patrick Palka from comment #3)

> Btw, we already disable the floating-point to_chars on targets without a
> binary64 double.  So is our test for detecting binary64 not accurate enough,
> or are these 16-bit targets whose double type really is binary64?

At least in the case of eSi-RISC, it is the latter.

[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’

2021-05-18 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361

--- Comment #4 from Jorn Wolfgang Rennecke  ---
Created attachment 50837
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50837&action=edit
Proposed patch

This patch fixes the problem for eSi-RISC and bootstraps on
x86_64-pc-linux-gnu, with floating_to_chars.o properly built in each stage.

Could you check that this also works for msp430?

[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’

2021-05-18 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2021-05-18

--- Comment #1 from Jorn Wolfgang Rennecke  ---
I also see this for 16 bit eSi-RISC targets.  This array can't fit into a 16
bit address space that addresses 8 bit units.
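
The size constraint is plain arithmetic. A minimal sketch, using the 32767-byte limit from the error message in the bug title; the function name and the table dimensions in the usage note are illustrative assumptions, not the real dimensions of POW10_SPLIT_2.

```c
#include <stdint.h>

/* Largest object size reported in the error message for these targets.  */
#define MAX_OBJECT_SIZE 32767u

/* Return nonzero if a rows x cols table of elem_size-byte entries fits
   within max_size bytes.  The product is computed in 64 bits so that it
   does not wrap for realistic inputs.  */
int
table_fits (uint64_t rows, uint64_t cols, uint64_t elem_size,
	    uint64_t max_size)
{
  return rows * cols * elem_size <= max_size;
}
```

For example, a hypothetical 2000 x 3 table of 8-byte entries needs 48000 bytes and does not fit, while a 1000 x 4 table of the same entries (32000 bytes) does.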

[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT

2019-07-08 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

  Attachment #46574|0   |1
is obsolete||

--- Comment #18 from Jorn Wolfgang Rennecke  ---
Created attachment 46577
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46577&action=edit
patch for aligned stack - but clamping max alignment at
MAX_SUPPORTED_STACK_ALIGNMENT

(In reply to r...@cebitec.uni-bielefeld.de from comment #17)
> > --- Comment #15 from Jorn Wolfgang Rennecke  ---
> > Created attachment 46574 [details]
> >   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46574&action=edit
> > patch for the case that the stack is sufficiently aligned
> [...]
> > I have attached a patch to preserve the alignment of the passed type for the
> > case that the stack is already sufficiently aligned.
> 
> This patch breaks sparc-sun-solaris2.11 bootstrap with an ICE while
> compiling stage2 function.c:
> 
> during RTL pass: expand
> /vol/gcc/src/hg/trunk/local/gcc/function.c: In function 'void
> assign_parm_find_data_types(assign_parm_data_all*, tree,
> assign_parm_data_one*)':
> /vol/gcc/src/hg/trunk/local/gcc/function.c:2426:49: internal compiler error:

This location doesn't make much sense to me.  Maybe some artefact from
optimized compilation and register windows?

> in assign_stack_temp_for_type, at function.c:880
>  2426 |   else if (targetm.calls.strict_argument_naming (all->args_so_far))
>   |~^~
> 0x11bc22f assign_stack_temp_for_type(machine_mode, poly_int<1u, long long>,
> tree_node*)
>   /vol/gcc/src/hg/trunk/local/gcc/function.c:878
> 0x11bc963 assign_temp(tree_node*, int, int)

This looks like the modified assert there has triggered.  It'd be interesting
to know why - i.e. what variable does want more alignment than
MAX_SUPPORTED_STACK_ALIGNMENT - during bootstrap?  Or is this a BLKmode
variable with less alignment than BIGGEST_ALIGNMENT?
User code could specify silly alignments which we couldn't provide with
ordinary allocation (using a fixed offset from sp/fp) and which could also
blow up the frame size too much if we tried, so it makes sense to clamp the
alignment to MAX_SUPPORTED_STACK_ALIGNMENT in get_stack_local_alignment.
The other side is that the code in assign_stack_temp_for_type seems to require
BIGGEST_ALIGNMENT for BLKmode; I'm not sure about assign_stack_local_1
slots.  It seems a bit wasteful, but trying to reduce waste of space in the
stack frame is really a different issue, so I also modified the patch to use
at least BIGGEST_ALIGNMENT for BLKmode so that it's (bug-?)compatible in that
aspect with the previous code - see attached modified patch.
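
The policy described above (at least BIGGEST_ALIGNMENT for BLKmode, at most MAX_SUPPORTED_STACK_ALIGNMENT overall) can be sketched as a stand-alone function. This is not the attached patch; the SKETCH_ constants are illustrative values in bits, whereas the real macros are target-defined.

```c
/* Illustrative values in bits; the real macros are target-defined.  */
#define SKETCH_BIGGEST_ALIGNMENT 128u
#define SKETCH_MAX_SUPPORTED_STACK_ALIGNMENT 512u

/* Return the alignment to use for a stack slot, given the alignment the
   type asks for and whether the slot is BLKmode.  */
unsigned
sketch_stack_local_alignment (unsigned requested, int is_blkmode)
{
  unsigned alignment = requested;

  /* Keep the previous behaviour of never going below BIGGEST_ALIGNMENT
     for BLKmode slots.  */
  if (is_blkmode && alignment < SKETCH_BIGGEST_ALIGNMENT)
    alignment = SKETCH_BIGGEST_ALIGNMENT;

  /* Clamp silly user-specified alignments to what the stack supports.  */
  if (alignment > SKETCH_MAX_SUPPORTED_STACK_ALIGNMENT)
    alignment = SKETCH_MAX_SUPPORTED_STACK_ALIGNMENT;

  return alignment;
}
```

With these values, a BLKmode request for 64 bits becomes 128, a request for 256 is preserved, and a request for 4096 is clamped to 512.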

[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT

2019-07-07 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877

--- Comment #15 from Jorn Wolfgang Rennecke  ---
Created attachment 46574
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46574&action=edit
patch for the case that the stack is sufficiently aligned

(In reply to dave.anglin from comment #11)
> $sp is aligned on entry to main:
> (gdb) p/x $sp
> $1 = 0xf8d02300
> 
> However, the invisible reference is a $sp - 0x78.  That's not sufficiently
> aligned.

I've built a cross compiler to take a closer look.
MAX_SUPPORTED_STACK_ALIGNMENT is 512, so the problem is completely different
for this target.  Looking at pa.h, the value comes from
PREFERRED_STACK_BOUNDARY:

/* Boundary (in *bits*) on which stack pointer is always aligned;
   certain optimizations in combine depend on this.

   The HP-UX runtime documents mandate 64-byte and 16-byte alignment for
   the stack on the 32 and 64-bit ports, respectively.  However, we
   are only guaranteed that the stack is aligned to BIGGEST_ALIGNMENT
   in main.  Thus, we treat the former as the preferred alignment.  */
#define STACK_BOUNDARY BIGGEST_ALIGNMENT
#define PREFERRED_STACK_BOUNDARY (TARGET_64BIT ? 128 : 512)
...
/* No data type wants to be aligned rounder than this.  The long double
   type has 16-byte alignment on the 64-bit target even though it was never
   implemented in hardware.  The software implementation only needs 8-byte
   alignment.  This matches the biggest alignment of the HP compilers.  */
#define BIGGEST_ALIGNMENT (2 * BITS_PER_WORD)


Even with TARGET_64BIT, we get a PREFERRED_STACK_BOUNDARY of 128.

However:
#define UNITS_PER_WORD (TARGET_64BIT ? 8 : 4)

It seems suspicious that PREFERRED_STACK_BOUNDARY is smaller for TARGET_64BIT.

Be this as it may, the problem for the 84877 testcase is not that the stack has
insufficient alignment, but that the stack slot doesn't have an aligned offset.

The alignment gets pruned in function.c:get_stack_local_alignment:

  if (mode == BLKmode)
    alignment = BIGGEST_ALIGNMENT;

I have attached a patch to preserve the alignment of the passed type for the
case that the stack is already sufficiently aligned.

To test the case where the stack is insufficiently aligned, for hppa we should
use a different testcase with > 512 bit alignment of the type.
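
The difference between an aligned stack pointer and an aligned slot offset can be checked with simple address arithmetic. A minimal sketch, using the $sp value and the 0x78 offset quoted from comment #11 and assuming a 64-byte requested alignment (the alignment requested by the testcase is not stated in this thread); the helper names are invented.

```c
#include <stdint.h>

/* Return nonzero if addr is a multiple of align_bytes, which must be a
   power of two.  */
int
is_aligned (uint64_t addr, uint64_t align_bytes)
{
  return (addr & (align_bytes - 1)) == 0;
}

/* Round addr down to the nearest multiple of align_bytes (power of two).  */
uint64_t
round_down (uint64_t addr, uint64_t align_bytes)
{
  return addr & ~(align_bytes - 1);
}
```

Under that assumption, 0xf8d02300 is 64-byte aligned but 0xf8d02300 - 0x78 = 0xf8d02288 is not; the nearest aligned address below it is 0xf8d02280.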

[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT

2019-07-07 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877

--- Comment #13 from Jorn Wolfgang Rennecke  ---
(In reply to Hans-Peter Nilsson from comment #12)
> (In reply to Jorn Wolfgang Rennecke from comment #10)
> > Created attachment 46567 [details]
> > Fix for targets that pass the argument by invisible reference
> 
> Thanks for your efforts.  This *may* have affected the code generated by
> gcc.dg/pr84877.c; that test now passes, but that's unreliable as I've seen
> the outcome depends on random stack alignment of the context, and my
> baseline is from a context different enough.  I believe inspecting the
> generated code isn't of much interest given David Anglin's observations for
> hppa and...
> 
> However, it introduces these regressions:
> +gcc.sum gcc.dg/pr80286.c
> +gcc.sum gcc.dg/torture/pr78542.c
> +gcc.sum gcc.dg/torture/pr86363.c
> +gcc.sum gcc.dg/torture/va-arg-25.c

I tried to reproduce this with a cross-compiler built for
--target=hppa-linux-gnu; the va-arg-25.c test case needs headers, but the
others can be compiled just using xgcc & cc1.  I tried with the options in
dg-options, and for pr78542.c / pr86363.c I also tried additional -O options.
However, I don't see any ICE.  Is there a special configuration or set of
options needed, or is this just impossible with a cross compiler?

[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code

2019-07-06 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073

--- Comment #16 from Jorn Wolfgang Rennecke  ---
Going from gcc 8.2 to gcc 9.1, I find the following two test cases are now
autovectorized:

/* { dg-do compile } */
/* { dg-options "-O3" } */

/* Test auto-vectorization */

#include "vector-types.h"

#define LENGTH 256

__attribute__((aligned (VECTOR_SIZE))) short a[LENGTH], b[LENGTH];
short c;

void foo (void) {
  int i;

  for (i=0; i<LENGTH; i++) {
    a[i] = b[i] >> (c & 0xf);
  }
}




/* { dg-do compile } */
/* { dg-options "-O3" } */

/* Test auto-vectorization */

#include "vector-types.h"

#define LENGTH 256

__attribute__((aligned (VECTOR_SIZE))) unsigned short a[LENGTH], b[LENGTH];
unsigned short c;

void foo (void) {
  int i;

  for (i=0; i<LENGTH; i++) {
    a[i] = b[i] >> (c & 0xf);
  }
}

[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT

2019-07-06 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #10 from Jorn Wolfgang Rennecke  ---
Created attachment 46567
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46567&action=edit
Fix for targets that pass the argument by invisible reference

I also observe this problem on esirisc.

assign_parm is only relevant for the testcase if the argument is passed by
value, where the copy is made in foo.
If the argument is passed by invisible reference, we have instead during
compilation of main expand_call calling initialize_argument_information, which
calls assign_temp, which calls assign_stack_temp_for_type, which calls
assign_stack_local_1.
The attached patch changes initialize_argument_information to use the same
code path as for variable-sized arguments; it's a bit more overhead, but I
would think that excess alignment is a relatively rare case.  If performance
for this alignment were really important, you could change the stack
management so that the alignment can be provided more cheaply.

Since the esirisc port is not in the FSF tree, it doesn't really count for
testing; also, the behaviour will vary depending on argument passing of the
target, so we need to test a variety of targets.

[Bug testsuite/91065] gcc.dg/plugin/start_unit_plugin.c uses ggc memory without registering a root_tab

2019-07-02 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91065

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Jorn Wolfgang Rennecke  ---
Patch applied, not a regression, since the test was like this from the start.

[Bug testsuite/91065] gcc.dg/plugin/start_unit_plugin.c uses ggc memory without registering a root_tab

2019-07-02 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91065

--- Comment #2 from Jorn Wolfgang Rennecke  ---
Author: amylaar
Date: Wed Jul  3 00:22:53 2019
New Revision: 272954

URL: https://gcc.gnu.org/viewcvs?rev=272954&root=gcc&view=rev
Log:
PR testsuite/91065
* testsuite/gcc.dg/plugin/start_unit_plugin.c: Register a root tab
to reference fake_var.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/gcc.dg/plugin/start_unit_plugin.c

[Bug ipa/91062] gcc.dg/ipa/ipa-pta-1.c dump contains garbage when gcc was configured with --enable-checking=all

2019-07-02 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91062

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #2 from Jorn Wolfgang Rennecke  ---
varmap is allocated on the heap, and lives across passes.  Yes, it references
a name that is sometimes in static storage, but mostly in ggc-allocated memory.
I suppose inhibiting garbage collection during ipa would be no good, so
either the names should be allocated on the heap (ironically, often the name is
generated on the heap and later copied to ggc memory), or be reachable from a
ggc root.
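
The failure mode can be imitated outside GCC with a toy arena. This is a sketch only: the arena_ and heap_ functions are invented and have nothing to do with the real ggc API. The 0xa5 fill byte is chosen because it is the '\245' byte that shows up in the gdb output for this bug.

```c
#include <stdlib.h>
#include <string.h>

#define ARENA_SIZE 64

/* Toy stand-in for collectible memory.  */
static char arena[ARENA_SIZE];
static size_t arena_used;

/* Copy a string into the arena; returns NULL if there is no room.  */
char *
arena_strdup (const char *s)
{
  size_t len = strlen (s) + 1;
  if (len > ARENA_SIZE - arena_used)
    return NULL;
  char *p = arena + arena_used;
  memcpy (p, s, len);
  arena_used += len;
  return p;
}

/* Simulate a collection that finds no live references: everything in the
   arena is released and overwritten with a poison byte.  */
void
arena_collect (void)
{
  memset (arena, 0xa5, ARENA_SIZE);
  arena_used = 0;
}

/* Copy a string to the heap, where the toy collector never touches it.
   The caller is responsible for freeing the result.  */
char *
heap_strdup (const char *s)
{
  size_t len = strlen (s) + 1;
  char *p = malloc (len);
  if (p)
    memcpy (p, s, len);
  return p;
}
```

A pointer obtained from arena_strdup reads back as poison after arena_collect, whereas a copy made with heap_strdup beforehand still holds the name. Making the name reachable from a root would be the other way out, which this toy does not model.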

I have traced the output of one garbage string emitted in the dump file for
gcc.dg/torture/ipa-pta-1.c back to its origin (index is 9 in new_var_info,
and the string is in "name"; gcc source svn revision is 272931):

#0  new_var_info (t=0x0, name=0x7fffefba2050 "test4.clobber", add_id=false)
at ../../gcc/gcc/tree-ssa-structalias.c:383
#1  0x00fa2d81 in create_function_info_for (decl=0x7fffefce8700, 
name=0x7fffefb9ff40 "test4", add_id=false, nonlocal_p=true)
at ../../gcc/gcc/tree-ssa-structalias.c:5785
#2  0x00fa9725 in ipa_pta_execute ()
at ../../gcc/gcc/tree-ssa-structalias.c:8095
#3  0x00faab71 in (anonymous namespace)::pass_ipa_pta::execute (
this=0x271a9e0) at ../../gcc/gcc/tree-ssa-structalias.c:8493
#4  0x00c5e991 in execute_one_pass (pass=pass@entry=0x271a9e0)
at ../../gcc/gcc/passes.c:2473
#5  0x00c5fa32 in execute_ipa_pass_list (pass=0x271a9e0)
at ../../gcc/gcc/passes.c:2913
#6  0x008918e9 in symbol_table::compile (
this=this@entry=0x7fffefb9e100) at ../../gcc/gcc/cgraphunit.c:2648
#7  0x00894b08 in symbol_table::compile (this=0x7fffefb9e100)
at ../../gcc/gcc/cgraphunit.c:2825
#8  symbol_table::finalize_compilation_unit (this=0x7fffefb9e100)
at ../../gcc/gcc/cgraphunit.c:2861
#9  0x00d8d544 in compile_file () at ../../gcc/gcc/toplev.c:481
#10 0x006b8919 in do_compile () at ../../gcc/gcc/toplev.c:2209
#11 toplev::main (this=this@entry=0x7fffddf0, argc=<optimized out>, 
argc@entry=22, argv=<optimized out>, argv@entry=0x7fffdef8)

... at the end of the pass ...

#0  ggc_collect () at ../../gcc/gcc/ggc-page.c:2174
#1  0x00c5e6fb in execute_one_ipa_transform_pass (ipa_pass=0x271a2e0, 
node=0x7fffefb9d708) at ../../gcc/gcc/passes.c:2232
#2  execute_all_ipa_transforms () at ../../gcc/gcc/passes.c:2250
#3  0x00882662 in cgraph_node::get_body (this=0x7fffefb9d708)
at ../../gcc/gcc/cgraph.c:3621
#4  0x00fa9633 in ipa_pta_execute ()
at ../../gcc/gcc/tree-ssa-structalias.c:8077
#5  0x00faab71 in (anonymous namespace)::pass_ipa_pta::execute (
this=0x271a9e0) at ../../gcc/gcc/tree-ssa-structalias.c:8493
#6  0x00c5e991 in execute_one_pass (pass=pass@entry=0x271a9e0)
at ../../gcc/gcc/passes.c:2473
#7  0x00c5fa32 in execute_ipa_pass_list (pass=0x271a9e0)
at ../../gcc/gcc/passes.c:2913
#8  0x008918e9 in symbol_table::compile (
this=this@entry=0x7fffefb9e100) at ../../gcc/gcc/cgraphunit.c:2648
#9  0x00894b08 in symbol_table::compile (this=0x7fffefb9e100)
at ../../gcc/gcc/cgraphunit.c:2825
#10 symbol_table::finalize_compilation_unit (this=0x7fffefb9e100)
at ../../gcc/gcc/cgraphunit.c:2861
#11 0x00d8d544 in compile_file () at ../../gcc/gcc/toplev.c:481
#12 0x006b8919 in do_compile () at ../../gcc/gcc/toplev.c:2209
#13 toplev::main (this=this@entry=0x7fffddf0, argc=<optimized out>, 


... lots of garbage collections and constraint dumpings later...


#0  __GI__IO_fputs (str=0x7fffefba2050 '\245' , "\"",
fp=0x27fd8e0) at iofputs.c:32
Breakpoint 8, dump_constraint (file=0x27fd8e0, c=0x281fc38)
at ../../gcc/gcc/tree-ssa-structalias.c:678
678   fprintf (file, "%s", get_varinfo (c->lhs.var)->name);
(gdb) s
get_varinfo (n=9) at ../../gcc/gcc/tree-ssa-structalias.c:346
346   return varmap[n];
(gdb) fin
Run till exit from #0  get_varinfo (n=9)
at ../../gcc/gcc/tree-ssa-structalias.c:346
0x00f9570e in dump_constraint (file=0x27fd8e0, c=0x281fc38)
at ../../gcc/gcc/tree-ssa-structalias.c:678
678   fprintf (file, "%s", get_varinfo (c->lhs.var)->name);
Value returned is $27 = (variable_info *) 0x27cd7b0
(gdb) s
__GI__IO_fputs (str=0x7fffefba2050 '\245' , "\"", 
fp=0x27fd8e0) at iofputs.c:32
32  {
(gdb) p $22
$28 = 0x7fffefba2050 '\245' , "\""
(gdb) bt
#0  __GI__IO_fputs (str=0x7fffefba2050 '\245' , "\"", 
fp=0x27fd8e0) at iofputs.c:32
#1  0x00f95721 in dump_constraint (file=0x27fd8e0, c=0x281fc38)
at ../../gcc/gcc/tree-ssa-structalias.c:678
#2  0x00f958db in dump_constraints (file=0x27fd8e0, from=44)
at ../../gcc/gcc/tree-ssa-structalias.c:723
#3  0x00fa9d22 in ipa_pta_execute ()
at ../../gcc/gcc/tree-ssa-structalias.c:8193
#4 

[Bug testsuite/91065] gcc.dg/plugin/start_unit_plugin.c uses ggc memory without registering a root_tab

2019-07-02 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91065

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #1 from Jorn Wolfgang Rennecke  ---
I've posted a patch: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00187.html

[Bug testsuite/91065] New: gcc.dg/plugin/start_unit_plugin.c uses ggc memory without registering a root_tab

2019-07-02 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91065

Bug ID: 91065
   Summary: gcc.dg/plugin/start_unit_plugin.c uses ggc memory
without registering a root_tab
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Keywords: GC
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu (probably doesn't really matter)
Target: native or cross

gcc.dg/plugin/start_unit_plugin.c sets fake_var to ggc-allocated memory,
without registering a root_tab that references fake_var.
This causes gcc.dg/plugin/start_unit-test-1.c to fail when the compiler is
configured with --enable-checking=all.

[Bug ipa/91062] gcc.dg/ipa/ipa-pta-1.c dump contains garbage when gcc was configured with --enable-checking=all

2019-07-02 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91062

--- Comment #1 from Jorn Wolfgang Rennecke  ---
Similarly, gcc.dg/torture/ipa-pta-1.c fails four scan tests because
ipa-pta-1.c.083i.pta2 gets corrupted in the ENABLE_GC_ALWAYS_COLLECT scenario.

[Bug ipa/91062] New: gcc.dg/ipa/ipa-pta-1.c dump contains garbage when gcc was configured with --enable-checking=all

2019-07-02 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91062

Bug ID: 91062
   Summary: gcc.dg/ipa/ipa-pta-1.c dump contains garbage when gcc
was configured with --enable-checking=all
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Keywords: GC
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu (probably doesn't really matter)
Target: native or cross

A number of symbol names in the dump file have been replaced by what looks like
ggc erased memory.  The problem can be hidden by adding a suitable min_expand
value, e.g. (for native unix):

make check-gcc RUNTESTFLAGS='--target_board=unix/--param=ggc-min-expand=30 ipa.exp=ipa-pta-1.c'

on a machine with 16 GB RAM + 8 GB swap.

OTOH, I haven't been able to reproduce this using a compiler that hasn't been
configured with --enable-checking, or merely with --enable-checking=yes, even
when adding --param=ggc-min-expand=0 .

We originally observed this in a variant of gcc 8.2, so this bug has
probably been around for a while.

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2019-07-01 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726

--- Comment #21 from Jorn Wolfgang Rennecke  ---
Author: amylaar
Date: Mon Jul  1 21:48:55 2019
New Revision: 272911

URL: https://gcc.gnu.org/viewcvs?rev=272911&root=gcc&view=rev
Log:
PR middle-end/66726
* tree-ssa-phiopt.c (factor_out_conditional_conversion):
Tune heuristic from PR71016 to allow MIN / MAX.
* testsuite/gcc.dg/tree-ssa/pr66726-4.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr66726-4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-phiopt.c

[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code

2018-11-24 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073

--- Comment #14 from Jorn Wolfgang Rennecke  ---
(In reply to Jorn Wolfgang Rennecke from comment #12)
> If we are right shifting a signed type, we could apply a MAX operation to the
> shift count.

Oops, I mean MIN of course.  So that we can guarantee that the maximum
applied shift count is one less than the bitsize of the shifted value.
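
To illustrate the MIN idea in scalar form, here is a minimal sketch, assuming a 16-bit element type, a 32-bit int, and a shift count in the range 0..31. The function names are made up for this example. It also assumes that >> on a negative int is an arithmetic shift, which is implementation-defined in C but is what GCC does.

```c
#include <stdint.h>

/* Hypothetical helper: arithmetic right shift of a 16-bit value with the
   count clamped to 15 (bitsize - 1).  Assumes count >= 0. */
int16_t sra16_clamped(int16_t x, int count)
{
    int c = count < 15 ? count : 15;   /* the MIN operation */
    return (int16_t)(x >> c);
}

/* Reference: what the C source computes after integer promotion.
   Assumes a 32-bit int and 0 <= count < 32. */
int16_t sra16_promoted(int16_t x, int count)
{
    return (int16_t)((int)x >> count);
}
```

Under those assumptions both functions agree for every 16-bit x, because any count of 15 or more just replicates the sign bit across the whole value.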

[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code

2018-11-24 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073

--- Comment #13 from Jorn Wolfgang Rennecke  ---
If the shifted value is 16 bit and int is 32 bit wide, then, depending
on target costs, instead of a vector compare, we might decide to use
a sign extract of bit 4 of the shift count instead.
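
A scalar sketch of the bit-4 variant, assuming an unsigned 16-bit value, a 32-bit int, and a shift count in 0..31 (so bit 4 is set exactly when the count is 16 or more). The function names are hypothetical and only serve this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: logical right shift of a 16-bit lane where the C
   source shifts the promoted 32-bit value, so counts 0..31 are valid. */
uint16_t srl16_bit4(uint16_t x, unsigned count)
{
    /* Sign-extract bit 4 of the count: all ones if count >= 16, else zero. */
    uint16_t out_of_range = (uint16_t)(0u - ((count >> 4) & 1u));
    /* Shift within the lane, using only the low four bits of the count. */
    uint16_t in_lane = (uint16_t)(x >> (count & 15u));
    /* Zero the lane if the count was too large for it. */
    return (uint16_t)(in_lane & (uint16_t)~out_of_range);
}

/* Reference: the result of shifting the promoted value. */
uint16_t srl16_promoted(uint16_t x, unsigned count)
{
    return (uint16_t)((unsigned)x >> count);
}
```

Whether this beats a compare depends on target costs, as said above; the sketch only shows that the two give the same lane value for counts below 32.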

[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code

2018-11-24 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073

--- Comment #12 from Jorn Wolfgang Rennecke  ---
If we are left shifting a narrow signed type for the result, and no defined
overflow semantics are in place, it should be OK to just vectorize the code
using the result type.

If we are right shifting a signed type, we could apply a MAX operation to the
shift count.
If we are shifting an unsigned type, we can do a vector compare to check
if the shift count exceeds the range, and use an AND to zero the result if
that is the case.
If we are doing a shift right of a signed value where -fwrapv semantics
are required or allowed, we can do the same as for unsigned shift.

Thus, a shift is replaced by two or three vector operations, which should be
a win if the vectorization factor is four or more.
The MAX and compare operations might subsequently be eliminated if value range
propagation finds that the value can't be out of range.

[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code

2018-11-24 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #11 from Jorn Wolfgang Rennecke  ---
Created attachment 45079
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45079&action=edit
testcase using restricted shift count

Even if the shift count is restricted in range by applying an AND first, which
also further boosts the optimization potential for SHIFT_COUNT_TRUNCATED
targets, the code is not vectorized.

[Bug tree-optimization/44976] reductions with short variables do not get vectorized

2018-11-22 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44976

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #3 from Jorn Wolfgang Rennecke  ---
Ironically, this is a case where -fwrapv improves optimization.

[Bug other/39363] [meta-bug] pending patches from ARC International (UK) Ltd

2018-10-21 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39363
Bug 39363 depends on bug 39302, which changed state.

Bug 39302 Summary: [meta-bug] bugs waiting for Copyright Assignment 
acknowledgemt for ARC International (UK) Ltd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39302

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug other/39302] [meta-bug] bugs waiting for Copyright Assignment acknowledgemt for ARC International (UK) Ltd

2018-10-21 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39302

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Jorn Wolfgang Rennecke  ---
(In reply to Eric Gallager from comment #2)
> (In reply to Jorn Wolfgang Rennecke from comment #1)
> > Confirmation received.  I'll have to send out the patches now.
> 
> Have you done this yet?

Yes, see other/39363 and the various ARC branches from that time.

> Also does this need to keep the "meta-bug" label

Yes.  This 'bug' describes and tracks state of a set of other bugs, it is not
a GNU software bug in its own right.

OTOH, the issue being tracked by this meta-bug - need for (verification of)
 Copyright assignment for patches from ARC International (UK) Ltd - has been
resolved, and the dependent bugs are thus no longer blocked (since comment #1),
so moving this to FIXED.

[Bug rtl-optimization/55531] peephole2 pattern with multiple insns with match_parallel insn causes corrupted peephole2_insns matching function

2018-10-21 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55531

--- Comment #2 from Jorn Wolfgang Rennecke  ---
(In reply to Eric Gallager from comment #1)
> so this is... what, wrong-code? ice-on-valid-code? build? 
> 
> (I should go to bed instead of trying to figure this out...)

ice-on-valid-code, and consequently a build issue.

[Bug target/85993] config/sh/sh.c:10878: suspicious if .. else chain

2018-05-30 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85993

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||olegendo at gcc dot gnu.org

--- Comment #2 from Jorn Wolfgang Rennecke  ---
(In reply to David Binderman from comment #0)
> config/sh/sh.c:10878:12: warning: duplicated ‘if’ condition
> [-Wduplicated-cond]
> 
> Source code is
> 
>  else if (scratch0 != scratch1)
> {
>   emit_move_insn (scratch1, GEN_INT (vcall_offset));
>   emit_insn (gen_add2_insn (scratch0, scratch1));
>   offset_addr = scratch0;
> }
> 
> but earlier is code
> 
>  else if (scratch0 != scratch1)
> {
>   /* scratch0 != scratch1, and we have indexed loads.  Get better
>  schedule by loading the offset into r1 and using an indexed
>  load - then the load of r1 can issue before the load from
>  (this_rtx + delta) finishes.  */
>   emit_move_insn (scratch1, GEN_INT (vcall_offset));
>   offset_addr = gen_rtx_PLUS (Pmode, scratch0, scratch1);
> }

The condition for this block used to be:
  else if (! TARGET_SH5 && scratch0 != scratch1)
because the SH5 SHcompact indexed addressing doesn't actually work
the way GCC expects indexed addressing to work.

Thus, the second block (quoted first) is SH5 code.

[Bug other/44032] internals documentation is not legally safe to use

2018-03-17 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44032

--- Comment #4 from Jorn Wolfgang Rennecke  ---
(In reply to Eric Gallager from comment #3)
> Is this fixed in the same way that bug 44035 was fixed?

No. 44035 was about the inability to fix, 44032 is about the
actual licensing state of the documentation.  A brief look at
gccint.texi shows that this file remains purely GFDL.
I suppose there are numerous other files likewise affected.

It can only be considered fixed if all the parts of existing
documentation that you might conceivably want to cut & paste into
GPLed code are suitably re-licensed, and we have put something in
place that the issue will generally not appear with new GCC
documentation.

If all documentation files that come with GCC were patched as
suggested in comment #2, that could be considered a solution,
as people who cut & paste the copyright blurb for new files
would pick up the new text.  Well, there might be a transition
period when backed-up patches and patches made with using older
baselines need to be vetted for necessary adjustments.

If only some documentation files are patched to have the
amended copyright blurb, as others have no applicable code
samples, the others should have a warning not to copy them to
new files that will have such samples.

[Bug other/44035] internals documentation cannot be fixed without new GFDL license grants

2018-03-15 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44035

--- Comment #7 from Jorn Wolfgang Rennecke  ---
(In reply to jos...@codesourcery.com from comment #6)
> Since we have docstring relicensing maintainers, I don't think this is an 
> issue now.

Oops, that slipped my mind.  Indeed, we can consider this arrangement
to have fixed this issue.

[Bug other/44035] internals documentation cannot be fixed without new GFDL license grants

2018-03-09 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44035

--- Comment #5 from Jorn Wolfgang Rennecke  ---
(In reply to Eric Gallager from comment #4)
> Does this really need to have "blocker" importance? It has gone several
> years without actually blocking any releases.

The license issue has blocked a comprehensive consolidation of the target
description.

The question if it's currently blocking is a bit philosophical.  If the
license issue was resolved, would there be anyone right now with the time
and motivation to take up the work?

OTOH, we generally accept that there can be multiple blocking issues,
all of which have to be resolved to allow a certain task to proceed.

[Bug tree-optimization/38785] [6/7/8 Regression] huge performance regression on EEMBC bitmnp01

2018-02-08 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785

--- Comment #50 from Jorn Wolfgang Rennecke  ---
It certainly is the case that the merit of an optimization can often not be
evaluated until further optimization passes are done.  In fact, as an assembly
programmer, evaluating potential alternative code transformations, and
selecting the most suitable, or backtracking altogether, are a common modus
operandi.
Where pre creates a lot of new phi-nodes, in the hope that subsequently there
will be a commensurate pay-off, this should be evaluated at a later point down
the chain of optimization passes, either on a per-function, or on a
per-SESE-region basis.
In obvious cases, it might be enough to have a certain number of deletions of
code / phi nodes relative to phi nodes previously created, or of overall cost
decrease for the function / SESE region, while in more complicated cases (or
just because you choose a higher optimization level),
you want to actually compare the code with and without the aggressive pre
optimization, or compare various levels of aggressiveness of pre optimizations.
We have long limited GCC to following a static pass phasing and making
decisions one at a time, not to be reconsidered, but maybe undone by a
subsequent pass, if possible and deemed suitable at that time.
As long as we don't allow GCC to consider doing alternative transformations,
and backtracking, it will forever be limited.
I wonder if people would consider to use an operating-system dependent
operation - namely fork - to get the ball rolling.  I am aware that we'd
eventually need a further pointer abstraction for cross-pass persistent memory
to support compiler instance duplication on systems that can't fork,
and with GTY and C++ copy constructors we should be half-way there, but I think
we should first explore what we can do with compiler instance duplication on
systems where we can have it essentially for free.
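
A minimal sketch of what fork-based exploration could look like on a POSIX system. Nothing here is GCC code: cost_after_passes and pick_best_level are hypothetical names, and the cost function is a toy stand-in for "run the remaining passes at this level of pre aggressiveness and measure the result". Each child works on its own copy-on-write copy of the process state and reports a single integer through a pipe.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for running the later passes and costing the
   result; a toy formula with its minimum at level 2. */
int cost_after_passes(int aggressiveness)
{
    int d = aggressiveness - 2;
    return 100 + 10 * d * d;
}

/* Try each level in a forked copy of the process and return the level
   with the lowest reported cost, or -1 on failure. */
int pick_best_level(int n_levels)
{
    int best_level = -1;
    int best_cost = 0;

    for (int level = 0; level < n_levels; level++) {
        int fds[2];
        if (pipe(fds) != 0)
            return -1;

        pid_t pid = fork();
        if (pid < 0) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
        if (pid == 0) {
            /* Child: free to transform its private copy of all state. */
            close(fds[0]);
            int cost = cost_after_passes(level);
            ssize_t w = write(fds[1], &cost, sizeof cost);
            _exit(w == (ssize_t) sizeof cost ? 0 : 1);
        }

        /* Parent: its state is untouched; collect the child's verdict. */
        close(fds[1]);
        int cost = 0;
        int status = 0;
        ssize_t r = read(fds[0], &cost, sizeof cost);
        close(fds[0]);
        if (waitpid(pid, &status, 0) < 0
            || !WIFEXITED(status) || WEXITSTATUS(status) != 0
            || r != (ssize_t) sizeof cost)
            return -1;

        if (best_level < 0 || cost < best_cost) {
            best_level = level;
            best_cost = cost;
        }
    }
    return best_level;
}
```

In this sketch the parent only learns the cost and would have to redo the winning transformation itself; letting the winning child carry on as the compiler instance is the other obvious design, at the price of more bookkeeping.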

[Bug rtl-optimization/29854] reload_combine looses track of uses

2016-03-08 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29854

--- Comment #8 from Jorn Wolfgang Rennecke  ---
revision 149282:

2009-07-06  J"orn Rennecke  
Kaz Kojima  

PR rtl-optimization/30807
* postreload.c (reload_combine): For every new use of REG_SUM,
record the use of BASE.

[Bug tree-optimization/28144] floating point constant -> byte/char/short conversion is wrong for java

2016-03-08 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28144

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
   Last reconfirmed||2016-03-08
 Resolution|INVALID |---
 Ever confirmed|0   |1

--- Comment #7 from Jorn Wolfgang Rennecke  ---
PR 27394 was closed on the grounds that the code exhibited undefined
behaviour and that alternate facilities had been added in the meantime
which mitigate the impact of the inconsistently implemented behaviour on
debugging.

However, this PR (28144) is about the impact on Java; an updated link
to the quoted spec above is:

http://docs.oracle.com/javase/specs/jls/se8/html/jls-5.html#jls-5.1.3

where it defines the exact behaviour of conversions.

The comment at the start of fold_convert_const_int_from_real 
claims that the code implements the floating point to integer
conversion rules required by the Java Language Specification,
but due to the problem discussed here, that is not true when
it comes to conversion to types narrower than int.
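
For reference, a sketch of the two-step rule as I read JLS 5.1.3, written in C. This is not the code in fold_convert_const_int_from_real; java_d2i and java_d2b are names invented for this example. The value is first converted to int (NaN gives 0, out-of-range values saturate), and only then narrowed to byte by keeping the low 8 bits.

```c
#include <stdint.h>

/* Step 1: double -> int.  NaN becomes 0, values outside the int range
   saturate, everything else is truncated toward zero. */
int32_t java_d2i(double d)
{
    if (d != d)                      /* NaN compares unequal to itself */
        return 0;
    if (d >= (double) INT32_MAX)
        return INT32_MAX;
    if (d <= (double) INT32_MIN)
        return INT32_MIN;
    return (int32_t) d;
}

/* Step 2: int -> byte keeps the low 8 bits, reinterpreted as a signed
   two's complement value.  Returned as int to stay portable. */
int java_d2b(double d)
{
    uint32_t low = (uint32_t) java_d2i(d) & 0xffu;
    return low >= 0x80u ? (int) low - 0x100 : (int) low;
}
```

With this rule 300.0 converts to the byte value 44, whereas saturating directly to the narrow type would give 127; that difference is what matters for types narrower than int.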

[Bug other/29842] [meta-bug] outstanding patches / issues from STMicroelectronics

2016-03-08 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29842
Bug 29842 depends on bug 28144, which changed state.

Bug 28144 Summary: floating point constant -> byte/char/short conversion is 
wrong for java
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28144

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|INVALID |---

[Bug tree-optimization/27394] double -> char conversion varies with optimization level

2016-03-08 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=27394
Bug 27394 depends on bug 28144, which changed state.

Bug 28144 Summary: floating point constant -> byte/char/short conversion is 
wrong for java
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28144

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|INVALID |---

[Bug c++/68767] [6 regression] spurious warning: null argument where non-null required

2016-01-15 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68767

--- Comment #11 from Jorn Wolfgang Rennecke  ---
(In reply to Jakub Jelinek from comment #10)
> Of course, the question is if the warning isn't really desirable, the user
> should really just choose some non-NULL magic value to pass in the
> impossible cases.

Are you saying the *_TYPE definitions in newlib-stdint.h should not use 0 in
any branches of their expressions?

[Bug middle-end/68767] spurious warning: null argument where non-null required

2015-12-09 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68767

--- Comment #3 from Jorn Wolfgang Rennecke  ---
(In reply to Manuel López-Ibáñez from comment #2)
> I don't understand. It is indeed passing NULL to a non-null function. What
> is wrong with the warning?

When you look at the original testcase closely, you'll see that it can never
(unless there is a race condition, invoking undefined behaviour) pass NULL.
In fact, it always passes "lstr" .

The reduced testcase from comment #1 is more ambiguous.  Whether it can or
cannot pass NULL depends on values that the variable might attain.

[Bug middle-end/68767] New: spurious warning: null argument where non-null required

2015-12-07 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68767

Bug ID: 68767
   Summary: spurious warning: null argument where non-null
required
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org
  Target Milestone: ---

This test, compiled with g++ -c -Werror -Wall:

// { dg-do compile }
// { dg-options "-Werror -Wall" }

extern int len (const char *__s)
 throw () __attribute__ ((__pure__)) __attribute__ ((__nonnull__ (1)));

extern int num;

int
f (void)
{
  int i;

  i = len ((((num != 2) ? "lstr" : num == 1 ? "str" : 0) ? ((num != 2) ? "lstr"
: num == 1 ? "str" : 0) : "lstr" ));
  return i;
}

gets the spurious warning:
tmp.C:14:115: error: null argument where non-null required (argument 1)
[-Werror=nonnull]
 m == 1 ? "str" : 0) ? ((num != 2) ? "lstr" : num == 1 ? "str" : 0) : "lstr"
));
  ^
Ironically, this is condensed down from c-common.c complaining about itself
when building gcc for a target with a variable BITS_PER_UNIT, which also uses
newlib-stdint.h .

Originally observed with g++ (GCC) 5.1.1 20150618 (Red Hat 5.1.1-4), but also
reproduced with g++ (GCC) 6.0.0 20151207 (experimental) .

[Bug libgcc/66883] config/epiphany/udivsi3-float.c:52: bad if test ?

2015-10-23 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66883

--- Comment #2 from Jorn Wolfgang Rennecke  ---
Author: amylaar
Date: Fri Oct 23 11:57:26 2015
New Revision: 229236

URL: https://gcc.gnu.org/viewcvs?rev=229236&root=gcc&view=rev
Log:
PR libgcc/66883
* config/epiphany/udivsi3-float.c: Fix CONCISE test, and comment typo.

N.B., this is not active code, just documenting a previous approach for this
function in C.

Modified:
trunk/libgcc/ChangeLog
trunk/libgcc/config/epiphany/udivsi3-float.c


[Bug other/39374] reload is too earer to re-use reload registers

2015-03-11 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39374

--- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 35011
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35011&action=edit
gcc14:/home/amylaar/pr39374/pr39374-diff


[Bug other/39374] reload is too earer to re-use reload registers

2015-03-11 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39374

--- Comment #2 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 35012
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35012&action=edit
gcc14:/home/amylaar/pr39374/pr39374-r14476


[Bug target/64003] valgrind complains about get_attr_length_nobnd in insn-attrtab.c from i386.md

2014-12-04 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003

--- Comment #13 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to David Malcolm from comment #6)
> If I'm reading things right, this loop in shorten_branches populates
> insn_lengths[uid] in order of the NEXT_INSN () iteration:
> 
>   int (*length_fun) (rtx_insn *) = increasing ? insn_min_length :
> insn_default_length;
> 
>   for (insn_current_address = 0, insn = first;
>        insn != 0;
>        insn_current_address += insn_lengths[uid], insn = NEXT_INSN (insn))
>     {
>       uid = INSN_UID (insn);
> 
>       insn_lengths[uid] = 0;
> 
>       /* lots of logic, which can call length_fun, and hence
>          insn_min_length.  */
>     }
> 
> and length_fun can call into insn_min_length, and hence this calls into
> the get_attr_length_nobnd, which AIUI for this case is accessing lengths of
> other insns before they've been populated: presumably for a jump forwards?

insn_min_length is not supposed to use current insn lengths.
genattrtab does not follow attributes for the purposes of determining
insn current length dependence.
So far we consider it the job of the port to provide
a length attribute that allows the calculation of minimum/maximum instruction
lengths with this limitation in mind.
That means the length attribute in i386.md is broken.
The get_attr_length_nobnd attribute needs to be either inlined, or its use
guarded in a clause that appears to be length dependent and supplies minimum
and maximum values.

AFAICS, the length attribute was broken in r217125
https://gcc.gnu.org/ml/gcc-cvs/2014-11/msg00133.html


[Bug target/64003] valgrind complains about get_attr_length_nobnd in insn-attrtab.c from i386.md

2014-12-04 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #16 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Jeffrey A. Law from comment #14)

> Is this documented anywhere?  I certainly don't recall this restriction, but
> it does answer one of the questions I'd been kicking around in my head.

I've put a comment into sh.md to that effect - can't put a link to the
gcc-cvs archive here because the code is from 1998, but here's an excerpt:

;; ??? This should use something like *branch_p (minus (match_dup 0) (pc)),
;; but getattrtab doesn't understand this.
(define_attr "length" ""
  (cond [(eq_attr "type" "cbranch")
 (cond [(eq_attr "short_cbranch_p" "yes")
(const_int 2)
(eq_attr "med_cbranch_p" "yes")
(const_int 6)
(eq_attr "braf_cbranch_p" "yes")
(const_int 12)
;; ??? using pc is not computed transitively.
(ne (match_dup 0) (match_dup 0))
(const_int 14)
...

The (ne (match_dup 0) (match_dup 0)) clause tells genattrtab that this
cond form is length-varying.

I had a patch to clear this up with a usable & documented interface:
https://gcc.gnu.org/ml/gcc-patches/2012-11/msg00473.html
It got stuck in code review, so it's now a local patch in the
Synopsys toolchains.


[Bug target/64003] valgrind complains about get_attr_length_nobnd in insn-attrtab.c from i386.md

2014-12-04 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003

--- Comment #18 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Ilya Enkovich from comment #17)
> If I understand the problem correctly the root is in attempt to get length
> of following instructions computing length for forward jump instruction. 
> How comes r217125 is guilty for that? It doesn't introduce such
> computations, it just renames length attribute into length_nobnd for
> mentioned jump patterns.  Do I miss something here?

The length attribute is treated specially by genattrtab.


[Bug other/39363] [meta-bug] pending patches from ARC International (UK) Ltd

2014-11-09 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39363

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 Depends on|31634   |

--- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
31634 used to be relevant for ARC, but that port has since ceased to
support changing the name of TEXT_SECTION_ASM_OP etc. by command line
option, and now uses a string literal, precisely in order to work around
this bug.


[Bug pch/31634] *_SECTION_ASM_OP storage has undocumented constraints

2014-11-09 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31634

--- Comment #4 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
31634 used to be relevant for ARC, but that port has since ceased to
support changing the name of TEXT_SECTION_ASM_OP etc. by command line
option, and now uses a string literal, precisely in order to work around
this bug.
Hence, this no longer blocks other/39363 .


[Bug target/39346] no mxp target port

2014-11-09 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39346

--- Comment #2 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
target/39346, other/39347 and other/39348 are no longer relevant to
other/39363,
because the Successor of ARC International (UK) Ltd, Synopsys, does not offer
an mxp option in its DesignWare ARC Processor Cores lineup.


[Bug other/39363] [meta-bug] pending patches from ARC International (UK) Ltd

2014-11-09 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39363

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 Depends on|39347, 39348|

--- Comment #2 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
target/39346, other/39347 and other/39348 are no longer relevant to
other/39363,
because the Successor of ARC International (UK) Ltd, Synopsys, does not offer
an mxp option in its DesignWare ARC Processor Cores lineup.


[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=

2014-10-21 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223

--- Comment #9 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Georg-Johann Lay from comment #8)
> (In reply to Jorn Wolfgang Rennecke from comment #4)
> > (In reply to Georg-Johann Lay from comment #1)
> > do_global_dtors is supposed to start at the start and increment from there.
> > I see it used to be half-way wrong and half-way correct.
> > (Starting at the start, decrementing for __AVR_HAVE_ELPM__, incrementing
> > otherwise.)
> > However, you now made it all the way use an incorrect order - starting at
> > the end and incrementing from there.
> > Is there a rationale for this?
> 
> The old code was broken as it decremented beginning at the start address. 
> The flaw never came apparent for __dtors_start = __dtors_end or with
> simulators that terminated in exit.
> 
> The new code uses the same traverse direction like __do_global_ctors.
> 
> Is the order of .ctors, .dtors defined in any way?  I.e. how do you express
> that constructor A must run before constructor B in the C program?  Same for
> destructors.

The C++ standard says that destructors have to run in reverse order of
completion of constructors.
crtstuff.c:__do_global_ctors_aux starts at the first constructor, and
increments from there;
crtstuff.c:__do_global_dtors_aux starts at the last destructor, and decrements
from there.
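
To make the ordering requirement concrete, here is a small C sketch. It is neither the crtstuff.c nor the avr libgcc.S code; the tables, the trace buffer and all function names are invented for illustration, and it assumes the destructor table lists its entries in the same order as the constructor table.

```c
#include <stddef.h>

typedef void (*init_fn)(void);

/* Records the order in which the hypothetical ctors/dtors ran. */
char trace[16];
size_t trace_len;

void log_char(char c)
{
    if (trace_len < sizeof trace - 1)
        trace[trace_len++] = c;
}

void ctor_a(void) { log_char('A'); }
void ctor_b(void) { log_char('B'); }
void dtor_a(void) { log_char('a'); }
void dtor_b(void) { log_char('b'); }

/* Start at the first entry and increment. */
void run_ctors(const init_fn *start, const init_fn *end)
{
    for (const init_fn *p = start; p != end; p++)
        (*p)();
}

/* Start at the last entry and decrement. */
void run_dtors(const init_fn *start, const init_fn *end)
{
    for (const init_fn *p = end; p != start; )
        (*--p)();
}

/* Runs both tables and returns the recorded order. */
const char *demo(void)
{
    static const init_fn ctors[] = { ctor_a, ctor_b };
    static const init_fn dtors[] = { dtor_a, dtor_b };

    trace_len = 0;
    run_ctors(ctors, ctors + 2);
    run_dtors(dtors, dtors + 2);
    trace[trace_len] = '\0';
    return trace;
}
```

Here demo() records A, B, b, a: the object constructed last is destroyed first.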


[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=

2014-10-21 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223

--- Comment #10 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 33768
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33768&action=edit
patch for dtor direction

I have this patch for fixing the direction of the dtor execution,
but I got stuck trying to write a testcase.


[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=

2014-10-17 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #4 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Georg-Johann Lay from comment #1)
> Author: gjl
> Date: Thu Sep 11 08:08:17 2014
> New Revision: 215152
> 
> URL: https://gcc.gnu.org/viewcvs?rev=215152&root=gcc&view=rev
> Log:
> gcc/
>   PR target/63223
>   * config/avr/avr.md (*tablejump.3byte-pc): New insn.
>   (*tablejump): Restrict to !AVR_HAVE_EIJMP_EICALL.  Add void clobber.
>   (casesi): Expand to *tablejump.3byte-pc if AVR_HAVE_EIJMP_EICALL.
> libgcc/
>   PR target/63223
>   * config/avr/libgcc.S (__tablejump2__): Rewrite to use RAMPZ, ELPM
>   and R24 as needed.  Make work for all devices and .text locations.
>   (__do_global_ctors, __do_global_dtors): Use word addresses.

do_global_dtors is supposed to start at the start and increment from there.
I see it used to be half-way wrong and half-way correct.
(Starting at the start, decrementing for __AVR_HAVE_ELPM__, incrementing
otherwise.)
However, you now made it all the way use an incorrect order - starting at the
end and incrementing from there.
Is there a rationale for this?

[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=

2014-10-17 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223

--- Comment #5 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
I also observe that the cpi/cpc/brne idiom that is used throughout -
before and after your patch - is nonsensical.


[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=

2014-10-17 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223

--- Comment #6 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Jorn Wolfgang Rennecke from comment #4)
> However, you now made it all the way use an incorrect order - starting at the
> end and incrementing from there.
Oops, I mean decrementing from there.  But the point still stands.


[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=

2014-10-17 Thread amylaar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223

--- Comment #7 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Jorn Wolfgang Rennecke from comment #5)
> I also observe that the cpi/cpc/brne idiom that is used throughout -
> before and after your patch - is nonsensical.

Oops, I drew conclusions from the operation short description of CPC that are
not borne out by the detailed flag setting description.


[Bug rtl-optimization/61017] New: lra aborts on optional match_scratch

2014-04-30 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61017

Bug ID: 61017
   Summary: lra aborts on optional match_scratch
   Product: gcc
   Version: 4.10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org
CC: vmakarov at gcc dot gnu.org

lra is still not able to compile libgcc2 for ARC:

./cc1 libgcc2.i -O2 -mlra

../../../../unisrc-209293-arc/libgcc/libgcc2.c:2105:1: internal compiler error:
in curr_insn_transform, at lra-constraints.c:3492

The abort happens for the doloop_end_i pattern.
It contains
(clobber (match_scratch:SI 3 "=X,X,r"))

and for that, a register is allocated in advance without regard to need:

lra.c:remove_scratches 1992ff
  if (GET_CODE (*id->operand_loc[i]) == SCRATCH
      && GET_MODE (*id->operand_loc[i]) != VOIDmode)
    {
      insn_changed_p = true;
      *id->operand_loc[i] = reg
        = lra_create_new_reg (static_id->operand[i].mode,
                              *id->operand_loc[i], ALL_REGS, NULL);

As process_alt_operands finds that the alternative uses X for that operand,
it sets this alternative to NO_REGS:

lra-constraints.c:process_alt_operands 1608ff
  if (curr_static_id->operand_alternative[opalt_num].anything_ok)
{
  /* Fast track for no constraints at all.  */
  curr_alt[nop] = NO_REGS;
  CLEAR_HARD_REG_SET (curr_alt_set[nop]);
  curr_alt_win[nop] = true;
  curr_alt_match_win[nop] = false;
  curr_alt_offmemok[nop] = false;
  curr_alt_matches[nop] = -1;
  continue;
}

which causes an abort later:

lra-constraints.c:curr_insn_transform 3486ff
if (REG_P (reg) && (regno = REGNO (reg)) >= FIRST_PSEUDO_REGISTER)
  {
    bool ok_p = in_class_p (reg, goal_alt[i], &new_class);

    if (new_class != NO_REGS && get_reg_class (regno) != new_class)
      {
        lra_assert (ok_p);
        lra_change_class (regno, new_class, "  Change to", true);
      }
  }


[Bug rtl-optimization/61017] lra aborts on optional match_scratch

2014-04-30 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61017

--- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 32717
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32717&action=edit
preprocessed libgcc file


[Bug other/60824] New: meta-bug: issues waiting for gcc 4.10 phase 1

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60824

Bug ID: 60824
   Summary: meta-bug: issues waiting for gcc 4.10 phase 1
   Product: gcc
   Version: 4.10.0
Status: UNCONFIRMED
  Keywords: meta-bug
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org


[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651

--- Comment #3 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
This patch:
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00091.html
has been approved for gcc4.10, modulo one spelling fix:
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00263.html


[Bug middle-end/59049] Two VOIDmode constant in comparison passed to cstoresi4

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59049

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Problem has been fixed for 4.9 with the commit shown in comment #9.


[Bug target/60811] arc/arc.c:2135: possible bad argument to abs

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60811

--- Comment #3 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Author: amylaar
Date: Fri Apr 11 18:04:43 2014
New Revision: 209311

URL: http://gcc.gnu.org/viewcvs?rev=209311&root=gcc&view=rev
Log:
PR target/60811
* config/arc/arc.c (arc_save_restore): Fix assert typo.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arc/arc.c


[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651

--- Comment #4 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Author: amylaar
Date: Fri Apr 11 18:12:53 2014
New Revision: 209312

URL: http://gcc.gnu.org/viewcvs?rev=209312&root=gcc&view=rev
Log:
gcc:
PR rtl-optimization/60651
* mode-switching.c (optimize_mode_switching): Make sure to emit
sets of a lower numbered entity before sets of a higher numbered
entity to a mode of the same or lower priority.
When creating a seginfo for a basic block that starts with a code
label, move the insertion point past the code label.
(new_seginfo): Document and enforce requirement that
NOTE_INSN_BASIC_BLOCK only appears for empty blocks.
* doc/tm.texi.in: Document ordering constraint for emitted mode sets.
* doc/tm.texi: Regenerate.
gcc/testsuite:
PR rtl-optimization/60651
* gcc.target/epiphany/mode-switch.c: New test.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/doc/tm.texi
trunk/gcc/doc/tm.texi.in
trunk/gcc/mode-switching.c
trunk/gcc/testsuite/ChangeLog


[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651

--- Comment #5 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Author: amylaar
Date: Fri Apr 11 18:27:45 2014
New Revision: 209318

URL: http://gcc.gnu.org/viewcvs?rev=209318&root=gcc&view=rev
Log:
gcc/testsuite:
PR rtl-optimization/60651
* gcc.target/epiphany/mode-switch.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/epiphany/mode-switch.c


[Bug other/60824] meta-bug: issues waiting for gcc 4.10 phase 1

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60824
Bug 60824 depends on bug 60651, which changed state.

Bug 60651 Summary: Mode switching instructions are sometimes emitted in the 
wrong order
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED


[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
  Known to work||4.10.0
 Resolution|--- |FIXED
  Known to fail||4.9.0

--- Comment #6 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Fixed with commits of comment #4/#5.


[Bug target/60811] arc/arc.c:2135: possible bad argument to abs

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60811

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Fixed with commit of comment #3.


[Bug other/60824] meta-bug: issues waiting for gcc 4.10 phase 1

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60824
Bug 60824 depends on bug 60811, which changed state.

Bug 60811 Summary: arc/arc.c:2135: possible bad argument to abs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60811

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED


[Bug other/60824] meta-bug: issues waiting for gcc 4.10 phase 1

2014-04-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60824

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
gcc 4.10 phase 1 is now open.


[Bug rtl-optimization/60757] combine uses exponential time in nonzero_bits1 recursion

2014-04-04 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60757

--- Comment #3 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 32544
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32544&action=edit
typescript with backtrace

It appears that some other epiphany patches I had in my tree I thought were
unrelated are, in fact, also relevant.

The exact version I've been using can be retrieved with:

git clone g...@github.com:adapteva/epiphany-gcc.git
cd epiphany-gcc
git checkout ee67b804bd922ddcc72695973bed4641ba29801c


[Bug rtl-optimization/60757] combine uses exponential time in nonzero_bits1 recursion

2014-04-04 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60757

--- Comment #4 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Jorn Wolfgang Rennecke from comment #3)
> Created attachment 32544 [details]
> typescript with backtrace
> 
> It appears that some other epiphany patches I had in my tree I thought were
> unrelated are, in fact, also relevant.
> 
> The exact version I've been using can be retrieved with:
> 
> git clone g...@github.com:adapteva/epiphany-gcc.git
> cd epiphany-gcc
> git checkout ee67b804bd922ddcc72695973bed4641ba29801c

P.S.: that version sits on branch epiphany-gcc-4.8, so it should be sufficient
to clone that branch.  And it's based on the gcc git mirror, so if you have
a git local repo with gcc git mirror contents, most of the objects should
already be there.


[Bug rtl-optimization/60749] New: combine is overly cautious when operating on volatile memory references

2014-04-03 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60749

Bug ID: 60749
   Summary: combine is overly cautious when operating on volatile
memory references
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org
Blocks: 53938

Courtesy of volatile_ok / init_recog_no_volatile, combine will
reject any combination that involves a volatile memref in the combined
pattern.

In particular, if any narrow memory location is read on a
WORD_REGISTER_OPERATIONS target, the zero/sign extension can't be combined
with a memory read, even if a suitably extending memory load instruction is
available - unless that pattern gets specifically written to accept
volatile memrefs, shunning the standard memory_operand and
general_operand predicates.

combine already needs to do special checks to make sure it doesn't
slip up when handling such patterns (e.g. see PR51374), so what good
does init_recog_no_volatile do combine these days?

At the very least, I think we should allow combinations involving a single
memref with unchanged mode before and after combination - that would cover
the zero and sign extending loads.
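
For illustration only, here is a minimal sketch of the kind of source that
runs into this: a narrow volatile read whose value is widened to a full word.
The function names are hypothetical and not taken from any testcase; whether
the extension is folded into the load depends on the target and its patterns.

```c
#include <assert.h>
#include <stdint.h>

/* Narrow volatile read, zero extended to a full word.  On a
   WORD_REGISTER_OPERATIONS target this is the load + zero_extend pair
   that one would like combine to merge into one extending load.  */
unsigned int
load_zext (const volatile uint8_t *p)
{
  assert (p);
  return *p;
}

/* The same for a signed byte: load + sign_extend.  */
int
load_sext (const volatile int8_t *p)
{
  assert (p);
  return *p;
}
```

Comparing the combine dump for these functions with and without the volatile
qualifier is one way to see whether the extension was merged into the load.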


[Bug rtl-optimization/60757] New: combine uses exponential time in nonzero_bits1 recursion

2014-04-03 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60757

Bug ID: 60757
   Summary: combine uses exponential time in nonzero_bits1
recursion
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org

Created attachment 32540
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32540&action=edit
pruned down testcase

With a small fix to the rtx_costs for epiphany, gcc.c-torture/compile/pr43415.c
times out compiling at -O3.
Even when the loop iteration counts are pruned, it's still too much,
as nonzero_bits recurses for both operands of a binary operator...
going through 40 operations means 2^40 paths being followed...
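
The blow-up can be reproduced in miniature. The following is a toy model
under stated assumptions, not combine's code: struct expr, build_chain,
nz_bits_naive and nz_bits_memo are hypothetical names, and the cached variant
is just one generic way to avoid revisiting shared operands, not necessarily
what a fix in combine would look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical expression node; a leaf has both operands NULL.  */
struct expr
{
  struct expr *op0, *op1;
  uint64_t leaf_bits;   /* Possibly-nonzero bits of a leaf.  */
  int visited;          /* Set once CACHED is valid (memoized walk only).  */
  uint64_t cached;
};

/* Build DEPTH binary operations on top of one leaf, where each operation
   uses the previous result for both operands.  NODES must have room for
   DEPTH + 1 elements.  */
void
build_chain (struct expr *nodes, int depth)
{
  assert (nodes != NULL && depth >= 0);
  nodes[0] = (struct expr) { NULL, NULL, 0xff, 0, 0 };
  for (int i = 1; i <= depth; i++)
    nodes[i] = (struct expr) { &nodes[i - 1], &nodes[i - 1], 0, 0, 0 };
}

/* Naive walk: recurses into both operands every time, so the number of
   calls doubles with every level.  */
uint64_t
nz_bits_naive (const struct expr *e, uint64_t *calls)
{
  ++*calls;
  if (e->op0 == NULL)
    return e->leaf_bits;
  return nz_bits_naive (e->op0, calls) | nz_bits_naive (e->op1, calls);
}

/* Memoized walk: each node is evaluated once, later visits hit the cache.  */
uint64_t
nz_bits_memo (struct expr *e, uint64_t *calls)
{
  ++*calls;
  if (e->op0 == NULL)
    return e->leaf_bits;
  if (!e->visited)
    {
      e->cached = nz_bits_memo (e->op0, calls) | nz_bits_memo (e->op1, calls);
      e->visited = 1;
    }
  return e->cached;
}
```

With a chain of depth 20 the naive walk already makes 2^21 - 1 calls, while
the memoized walk makes 41; at depth 40 the naive count is out of reach.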


[Bug rtl-optimization/60757] combine uses exponential time in nonzero_bits1 recursion

2014-04-03 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60757

--- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 32541
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32541&action=edit
epiphany cost fix that triggers combine exponential behaviour


[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order

2014-04-02 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651

--- Comment #2 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 32526
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32526&action=edit
preprocessed libjava file

With the latest proposed patch, we get an assertion failure building libjava
during the i686-pc-linux-gnu bootstrap; this is the command line:

./cc1plus -fpreprocessed interpret.ii -quiet -dumpbase interpret.cc
-mtune=generic -march=pentiumpro -auxbase-strip .libs/interpret.o -g -O2
-Wswitch-enum -Wextra -Wall -version -fno-rtti -fnon-call-exceptions
-fdollars-in-identifiers -ffloat-store -fomit-frame-pointer -fwrapv -fPIC -o
interpret.s

The block in question looks like this:

(code_label/s 9087 9590 9090 17 990 "" [1 uses])

(note 9090 9087 9088 17 [bb 17] NOTE_INSN_BASIC_BLOCK)

where the BB_HEAD is the CODE_LABEL, and the BB_END is the
NOTE_INSN_BASIC_BLOCK.

The caller of new_seginfo is the abnormal-edge code that I've patched to
handle non-empty blocks differently; this block is mistaken for a non-empty
block.

Now, interestingly, the pre-existing code already handles this incorrectly,
by inserting instructions between the CODE_LABEL and the NOTE_INSN_BASIC_BLOCK.


[Bug rtl-optimization/60651] New: Mode switching instructions are sometimes emitted in the wrong order

2014-03-25 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651

Bug ID: 60651
   Summary: Mode switching instructions are sometimes emitted in
the wrong order
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org
Target: epiphany-*-*

As discussed at
http://forums.parallella.org/viewtopic.php?f=13&t=1053&sid=2d28ee29b5dd3c591d947074f46ac752&p=6654#p6654,
this code:

int a;
int c;

void __attribute__((interrupt))
misc_handler (void) {
   a*= c;
}

is compiled into code that uses an uninitialized register.
As it turns out, the interrupt attribute is actually a red herring (as long as
you use the default of -mfp-mode=caller).
The problem is that, after emitting the mask-loading instruction, mode
switching emits the mode switch to the caller's mode which uses that mask
*before* the load of the mask, thus using the register uninitialized.
The mask loading instruction, thus rendered useless, is later deleted.

The thing with lcm is that we have an algorithm that can be a bit expensive,
but we can process multiple entities at almost no extra cost.
The epiphany needs to load constants to do its mode switching; these constants
can be anticipated further up in the dominance graph.  This can be modelled
as having a different entity for each mask needed, the need for which is
indicated at the same point as the mode switch itself.  Because the mask
load entities are not subject to transparency issues (except in the unfortunate
case of abnormal edges), lcm can move the loads up in suitable dominator
positions.
The mode priorities on the epiphany are also such that the mask loads have a
mode with the same or higher priority as the mask uses.
Also, the mask loads have lower numbered entities than the mask uses.
As the lcm part of optimize_mode_switching inserts, for each priority, the
mode settings in ascending order of entities, and insert_insn_on_edge
appends to the currently registered sequence, this works fine there.
The segment-based code also preserves entity order when inserting before an
instruction.
However, when inserting after a basic block head, later inserted mode
switch instructions end up prior to ones earlier inserted into the insn
stream.
To preserve the order, in the case of an initially empty basic block,
what we have to do is append the new instructions at the end of the basic
block.
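
The ordering problem can be shown with a toy insn list. This is only a
sketch under stated assumptions: struct insn, emit_after,
insert_all_after_head and append_all are hypothetical stand-ins, not the
functions used in mode-switching.c.

```c
#include <assert.h>
#include <stddef.h>

/* Toy insn: just an id and a link, not GCC's rtx_insn.  */
struct insn
{
  int id;
  struct insn *next;
};

/* Link NEW_INSN directly after POS and return NEW_INSN.  */
static struct insn *
emit_after (struct insn *pos, struct insn *new_insn)
{
  assert (pos != NULL && new_insn != NULL);
  new_insn->next = pos->next;
  pos->next = new_insn;
  return new_insn;
}

/* Insert every insn right after the block head.  Each later insertion
   lands in front of the earlier ones, so the order is reversed.  */
void
insert_all_after_head (struct insn *head, struct insn *insns, size_t n)
{
  for (size_t i = 0; i < n; i++)
    emit_after (head, &insns[i]);
}

/* Append every insn at the current end of the block, moving the
   insertion point along.  The order of insertion is preserved.  */
void
append_all (struct insn *head, struct insn *insns, size_t n)
{
  struct insn *last = head;

  while (last->next != NULL)
    last = last->next;
  for (size_t i = 0; i < n; i++)
    last = emit_after (last, &insns[i]);
}
```

If insn 1 stands for the mask load and insn 2 for the mode switch that uses
the mask, the first function yields head, 2, 1 (use before load), while the
second yields head, 1, 2.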


[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order

2014-03-25 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651

--- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 32447
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32447&action=edit
patch

The attached patch implements this aforementioned insertion at the end
of an (initially) empty basic block.
I'm currently bootstrapping/regtesting this on i686-pc-linux-gnu.


[Bug other/60040] AVR: error: unable to find a register to spill in class 'POINTER_REGS'

2014-03-17 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60040

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #5 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 32372
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32372&action=edit
tentative patch for tentative reloads

In this case, reload already knows that it has to re-do the reloads, but it
goes ahead anyway and computes reload registers for this iteration.
Unfortunately, when find_reload_regs fails, it then calls spill_failure,
giving a hard error for a reload that we don't need in the first place.

The patch in this attachment passes down something_changed from reload
as tentative to select_reload_regs and then on to find_reload_regs to
not worry about the failure.
Also, in reload, I made it not 'goto failure' in that case.


[Bug target/58400] gcc for h8300 internal compiler error: insn does not satisfy its constraints at fs/ext4/mballoc.c: In function 'mb_free_blocks':

2014-03-06 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58400

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #8 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Created attachment 32285
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32285&action=edit
patch made as an example how to debug gcc

Here is a patch - not regtested.
You might also consider moving the three non-constraint uses of
[satisfies_constraint_]U in predicates.md into a different
constraint / predicate, and deleting the unused fix_bit_operand.

[Bug c++/2316] g++ fails to overload on language linkage

2014-03-02 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2316

--- Comment #50 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Marc Glisse from comment #49)
> large pieces of my patch as nonsense). Fixing this particular issue should
> not be too hard, there must be a place in the compiler that merges a number
> of properties from the early declaration into the definition, and we need to
> add extern C to that list.

It's not exactly a single place. For C, in c/c-decl.c, we got
duplicate_decls, which uses merge_decls.

For C++, in cp/decl.c, we got another function called duplicate_decls.


[Bug other/50925] [4.7/4.8/4.9 Regression][avr] ICE at spill_failure, at reload1.c:2118

2014-02-19 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50925

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #28 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
I can't reproduce this with the current trunk.  Can we mark this
as known to work for 4.9?


[Bug ipa/58253] IPA-SRA creates calls with different arguments that the callee accepts

2013-12-03 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58253

--- Comment #8 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Martin Jambor from comment #7)
> Thanks I have posted the updated patch (which checks for
> gimple_register_type rather than non-BLKmode)

FWIW, it is possible to have a BLKmode struct passed in a register.
The compat testcases have a number of those.  Not sure if it's
possible to craft a testcase that also triggers this ipa path.

> Computing, storing and re-using the types would certainly be too
> invasive a change for stage 3.  Moreover, it would basically mean
> passing the PARM_DECL types as types of actual arguments and I am not
> even sure that it is correct, the back-end should probably see the
> actual arguments as exactly what they are in the callers.

The idea of a function is that there can be multiple callers, using
different actual arguments, thus you should pick one formal argument type
for each argument, and stick with it for all callers and the callee.
The formal argument type determines how the argument is passed.
Now, I understand that with ipa, you will often have only a single
caller, and the compiler can change the types with consideration of the
passed actual arguments to fit various optimization purposes, but
it still has to pick one list of formal parameter types for each specialized
callee, and stick to this list at the corresponding call site(s).


[Bug tree-optimization/58253] IPA-SRA creates calls with different arguments that the callee accepts

2013-12-02 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58253

--- Comment #6 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Martin Jambor from comment #1)
> But again, I am not really sure what the semantics of alignment of
> scalar PARM_DECL is.

The relevance of various type properties will vary from target to target.
The only safe way for the caller to receive the arguments as passed is to
have caller and callee agree on the types passed.

It would seem to me that computing the types once and then storing them
somewhere, so that identical argument lists are used when processing caller
and callee, is the safest way to make argument lists agree.
However, if you can make sure that you compute the same types in both
places, I suppose that should work too.

From a performance point of view, alignment to the natural alignment of
an integral mode is generally better than a lesser alignment, because it
allows efficient loads / stores to stack slots, should any become necessary.

> Nevertheless, can you please check if the patch
> indeed fixes the bug?  If so, I'll post it to the mailing list for
> review/further discussion.  Thanks.

The patch gets rid of the gcc.dg/torture/pr52402.c execution failures.

The only other difference observed with/without the patch is
8192 vs. 8173 tests being run in the libstdc++-v3 testsuite; the number
of tests run there under Fedora 19/20 appears to vary from time to time
independently of the compiler under test, so without running a
statistically significant number of test runs (which would take a few
months), I wouldn't draw any conclusion regarding the compiler from these
differences.


[Bug middle-end/59327] New: warning in expand_used_vars

2013-11-28 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59327

Bug ID: 59327
   Summary: warning in expand_used_vars
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Keywords: build
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org
CC: jakub at redhat dot com

Code added this morning to cfgexpand.c:expand_used_vars causes a warning:

g++ -c   -g  -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common 
-DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/.
-I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include 
-I../../gcc/gcc/../libdecnumber -I../../gcc/gcc/../libdecnumber/dpd
-I../libdecnumber -I../../gcc/gcc/../libbacktrace -o cfgexpand.o -MT
cfgexpand.o -MMD -MP -MF ./.deps/cfgexpand.TPo ../../gcc/gcc/cfgexpand.c
../../gcc/gcc/cfgexpand.c: In function ‘rtx_def* expand_used_vars()’:
../../gcc/gcc/cfgexpand.c:1836:35: error: comparison between signed and
unsigned integer expressions [-Werror=sign-compare]
 && sz + ASAN_RED_ZONE_SIZE >= data.asan_alignb)
   ^
cc1plus: all warnings being treated as errors
make: *** [cfgexpand.o] Error 1


Seen for target arc-elf.
[amylaar@rowan gcc]$ g++ --version
g++ (GCC) 4.9.0 20131126 (experimental)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[amylaar@rowan gcc]$ uname -a
Linux rowan 3.11.7-200.fc19.i686.PAE #1 SMP Mon Nov 4 14:22:33 UTC 2013 i686
i686 i386 GNU/Linux

[Bug middle-end/59327] warning in expand_used_vars

2013-11-28 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59327

--- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
The warning also happens when using g++ (GCC) 4.9.0 20131128 (experimental),
and when building gcc for target epiphany-elf.


[Bug middle-end/59327] warning in expand_used_vars

2013-11-28 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59327

--- Comment #2 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
sz is HOST_WIDE_INT, ASAN_RED_ZONE_SIZE is an int literal, and data.asan_alignb
is an unsigned int.

With 32 bit int and HOST_WIDE_INT, this results in a 32 bit signed/unsigned
comparison.

When building a target with need_64bit_hwint (according to config.gcc),
on a host with 32 bit int, the right hand side of the comparison gets
zero extended to HOST_WIDE_INT (it is unsigned), thus the warning will not show up when
testing such a combination / bootstrapping such a host/target.
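
To make the difference concrete, here is a small sketch with the implicit
conversions written out as casts. The names ge_narrow and ge_wide are
hypothetical, and int32_t / int64_t merely stand in for a 32 bit and a 64 bit
HOST_WIDE_INT; the sketch assumes a 32 bit int.

```c
#include <assert.h>
#include <stdint.h>

/* The casts below mirror the usual arithmetic conversions only when
   int is 32 bits wide.  */
static_assert (sizeof (int) == 4, "sketch assumes 32 bit int");

/* 32 bit signed left hand side: it is converted to unsigned before the
   comparison, so a negative value compares as a huge positive one.
   This is the case -Wsign-compare warns about.  */
int
ge_narrow (int32_t sz, uint32_t alignb)
{
  return (uint32_t) sz >= alignb;
}

/* 64 bit signed left hand side: the unsigned operand is widened to the
   signed 64 bit type with its value preserved, so the comparison is a
   signed one and there is nothing to warn about.  */
int
ge_wide (int64_t sz, uint32_t alignb)
{
  return sz >= (int64_t) alignb;
}
```

For sz == -1 and alignb == 16 the narrow variant yields 1 and the wide
variant yields 0; for non-negative sz both agree.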


[Bug middle-end/59327] [4.9 Regression] warning in expand_used_vars

2013-11-28 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59327

--- Comment #4 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 31318 [details]
> gcc49-pr59327.patch
> 
> Untested fix.

This allows arc-elf and arc-epiphany configured with --enable-werror-always
to build on i686-pc-linux-gnu.


[Bug target/18335] [4.7/4.8/4.9 regression] mmix-knuth-mmixware testsuite failure: gcc.dg/debug/debug-1.c and debug-2 xyzzy

2013-11-22 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18335

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #15 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
Looking at the assembly output, this uses conditional execution like the MIPS,
so this is a testsuite bug.


[Bug middle-end/59049] Two VOIDmode constant in comparison passed to cstoresi4

2013-11-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59049

Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed:

   What|Removed |Added

   Keywords||patch

--- Comment #5 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
A patch is here: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00931.html


[Bug middle-end/59049] Two VOIDmode constant in comparison passed to cstoresi4

2013-11-11 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59049

--- Comment #8 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org ---
(In reply to Richard Biener from comment #7)
> That is, sth like
> 
> Index: gcc/tree-ssa-ter.c
> ===
> --- gcc/tree-ssa-ter.c  (revision 204664)
> +++ gcc/tree-ssa-ter.c  (working copy)
> @@ -438,6 +439,12 @@ ter_is_replaceable_p (gimple stmt)
>           && !is_gimple_val (gimple_assign_rhs1 (stmt)))
>         return false;
>  
> +      /* Do not propagate modeless constants - we may end up confusing the RTL
> +        expanders.  Leave the optimization to RTL CCP.  */
> +      if (gimple_assign_single_p (stmt)
> +         && CONSTANT_CLASS_P (gimple_assign_rhs1 (stmt)))
> +       return false;
> +
>        return true;
>      }
>    return false;

Constants are often very valuable for rtl expansion, allowing cheaper
patterns to be used.
And some constant propagations are impossible in rtl because of mode
oddities.  E.g. when you have a mulsidi3 pattern, you generally
have a sign_extend - you can't have a VOIDmode constant inside that.
Therefore, I would rather have the middle-end move the constants
to registers only when necessary to preserve the mode, and preferably
fold instead in the first place when optimizing.

