[Bug testsuite/112728] gcc.dg/scantest-lto.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112728 --- Comment #3 from Jorn Wolfgang Rennecke --- (In reply to Rainer Orth from comment #0) > The gcc.dg/scantest-lto.c FAILs on quite a number of targets: ... > * On Darwin, the __TEXT,__eh_frame contains .ascii because the assembler > lacks support for cfi directives. I suppose we could handle the darwin case by: - Not doing the common scan-assembler* tests for darwin - doing a scan-assembler-times test that expects exactly how many .ascii are emitted for cfi.
[Bug target/112651] RISC-V Vector new option -mvect-lmul required to force LMUL values (rather than --param=riscv-autovec-lmul to hint at values)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112651 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #2 from Jorn Wolfgang Rennecke --- We can in fact have vector code without the intervention of the autovectorizer, if the user uses GNU C to write explicitly vectorized code, which code generation will simply translate to target instructions if the modes are available. Where the mode is too wide for the hardware because it doesn't support LMUL > 1, we want the vector lowering to kick in. I think we should achieve this aim by disabling vector modes altogether that are too wide for the hardware. That alone is not a full solution, though, since a number of vector modes can be obtained with more than one LMUL value. Often, the higher LMUL values appear to be more efficient when just counting instructions because they allow vsetivli to be used for larger vectors, thus reducing the need to load constants into general purpose registers first.
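To illustrate what "explicitly vectorized GNU C" means here, the following is a minimal sketch using the GNU C vector extension. The type name v8si and the helper add_v8si are hypothetical and not taken from the bug report. The addition is written directly on a 256-bit vector type, so no autovectorizer is involved; if the target has no mode that wide, the compiler's generic vector lowering is expected to split the operation into narrower pieces.

```c
#include <stdint.h>

/* Eight 32-bit lanes, 32 bytes in total (GNU C vector extension).
   The name v8si is chosen for illustration only.  */
typedef int32_t v8si __attribute__ ((vector_size (32)));

/* Element-wise addition written directly on the vector type.
   Operands are passed by pointer to avoid depending on the
   target's vector argument-passing ABI.  */
void
add_v8si (const v8si *a, const v8si *b, v8si *out)
{
  *out = *a + *b;
}
```

Whether this becomes one wide instruction, several narrower ones, or scalar code depends on which vector modes the target enables, which is the point made in the comment above.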
[Bug target/112537] Is there a way to disable cpymem pass for rvv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112537 --- Comment #13 from Jorn Wolfgang Rennecke --- Before we can consider any costs, we first have to know what they are. Is there any manual for a hardware implementation that specifies costs?
[Bug target/112537] Is there a way to disable cpymem pass for rvv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112537 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #12 from Jorn Wolfgang Rennecke --- (In reply to JuzheZhong from comment #2) > Currently, we don't have a compile option to disable cpymem by RVV. If you don't want any vector instructions to be emitted, why do you tell the compiler to enable the 'v' extension of the architecture?
[Bug testsuite/111298] time-profiler-2.c flaky on glibc RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111298 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #3 from Jorn Wolfgang Rennecke --- (In reply to Patrick O'Neill from comment #0) > I'm guessing that this is likely due to some conflict between > time-profiler-1.c and time-profiler-2.c and filing this under testsuite > framework issue, but feel free to move it if it's likely caused by a > specific component. My guess is that the atomic fetch-and-update emitted by gimple_gen_time_profiler is not actually atomic (at least under RISC-V Qemu). Note that in time-profiler-2.c, there is a parent and a child process that access the same gcov data.
[Bug testsuite/111658] New: test-function-bodies fails to find functions with single-letter names
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111658 Bug ID: 111658 Summary: test-function-bodies fails to find functions with single-letter names Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org Target Milestone: --- When you use check-function-bodies with a function that has a single-letter name, the start regexp set by configure_check-function-bodies and used by parse_function_bodies to find function starts fails to match, making the test always fail. There is no mention of such a restriction in sourcebuild.texi.
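The failure mode can be reproduced outside the testsuite with a small sketch. It assumes the start pattern has the shape "one identifier-start character, then one or more non-space characters, then a colon"; the exact regexp in the testsuite support code is not quoted in this report and may differ. The helper label_matches and both pattern strings are hypothetical; regcomp, regexec and regfree are the POSIX regex functions.

```c
#include <regex.h>
#include <stddef.h>

/* Assumed shape of the failing pattern: the '+' demands at least one
   more character after the first, so a one-letter name cannot match.  */
const char *const strict_start = "^[a-zA-Z_][^[:space:]]+:$";

/* Relaxed variant: '*' also accepts single-letter names.  */
const char *const relaxed_start = "^[a-zA-Z_][^[:space:]]*:$";

/* Return 1 if LINE matches the POSIX extended regexp PATTERN,
   0 if it does not, and -1 if PATTERN fails to compile.  */
int
label_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int matched = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

With these assumed patterns, a label line such as "foo:" matches both, while "f:" only matches the relaxed one.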
[Bug testsuite/110951] [13/14] RISCV: rv32 newlib gcc.c-torture testsuite fails with xgcc: fatal error: Cannot find suitable multilib set for '-march=rv32imafdc_zicsr_zifencei'/'-mabi=ilp32d'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110951 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #3 from Jorn Wolfgang Rennecke --- I see something like this come up randomly (i.e. not strictly reproducible) with gcc14 about one to three times per million tests, in parts like gcc.dg. I wonder if it could be related? What was your testing environment problem?
[Bug target/111566] RISC-V Vector Fortran: ICE in final_scan_insn_1 (final RTL pass)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111566 --- Comment #5 from Jorn Wolfgang Rennecke --- I had a look at riscv_legitimize_move. It doesn't seem to suffer from quite the same problem as legitimize_move does, but it could if another problem was fixed: riscv_legitimize_move changes the rtl it's passed. That can lead to trouble if this is shared rtl.
[Bug target/111566] RISC-V Vector Fortran: ICE in final_scan_insn_1 (final RTL pass)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111566 --- Comment #4 from Jorn Wolfgang Rennecke --- Also, the GET_MODE_BITSIZE (mode).to_constant () <= MAX_BITS_PER_WORD in the *mov_mem_to_mem splitter can generate unaligned accesses, yet it is not guarded by a check that the target supports them.
[Bug target/111566] RISC-V Vector Fortran: ICE in final_scan_insn_1 (final RTL pass)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111566 --- Comment #3 from Jorn Wolfgang Rennecke --- riscv-v.cc:legitimize_move has: if (MEM_P (dest) && !REG_P (src)) src = force_reg (mode, src); return false; since src is passed by value, this is pointless. The caller still had src as a MEM.
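The by-value pitfall described here can be shown in isolation. This is a sketch with hypothetical stand-ins (struct operand, to_reg, and both legitimize_* functions are invented for illustration and are not GCC internals): assigning to a pointer parameter only changes the callee's copy, so the caller still sees the original operand unless the parameter is passed by reference.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an rtx: only records whether it is a register.  */
struct operand { bool is_reg; };

static struct operand the_reg = { true };

/* Hypothetical stand-in for force_reg: returns a register operand.  */
static struct operand *
to_reg (struct operand *x)
{
  return x->is_reg ? x : &the_reg;
}

/* SRC is a copy of the caller's pointer: the assignment is lost on return.  */
void
legitimize_by_value (struct operand *src)
{
  if (!src->is_reg)
    src = to_reg (src);
  (void) src;  /* The new value is never seen by the caller.  */
}

/* SRC points at the caller's pointer: the caller's variable is updated.  */
void
legitimize_by_reference (struct operand **src)
{
  if (!(*src)->is_reg)
    *src = to_reg (*src);
}
```

After legitimize_by_value the caller's operand is still the non-register one; only the by-reference variant changes what the caller holds.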
[Bug target/111566] RISC-V Vector Fortran: ICE in final_scan_insn_1 (final RTL pass)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111566 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #2 from Jorn Wolfgang Rennecke --- This also causes trouble with my cpymem patch. With the *movv8si_mem_to_mem pattern, ira.cc:combine_and_move_insns will eagerly transform (insn 1606 1603 1608 77 (set (reg/f:SI 1187) (plus:SI (reg/f:SI 65 frame) (const_int -1248 [0xfb20]))) "/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0 discrim 126 4 {*addsi3} (nil)) (insn 1608 1606 1609 77 (set (reg:V8SI 1189) (mem/u/c:V8SI (reg/f:SI 5064) [0 S32 A128])) "/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0 discrim 126 1151 {*movv8si} (expr_list:REG_DEAD (reg/f:SI 5064) (expr_list:REG_EQUAL (mem/u/c:V8SI (const:SI (plus:SI (symbol_ref:SI ("*.LANCHOR0") [flags 0x182]) (const_int 64 [0x40]))) [0 S32 A128]) (nil (insn 1609 1608 12961 77 (set (mem/v/c:V8SI (reg/f:SI 1187) [1 S32 A128]) (reg:V8SI 1189)) "/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0 discrim 126 1151 {*movv8si} (expr_list:REG_DEAD (reg:V8SI 1189) (expr_list:REG_DEAD (reg/f:SI 1187) (nil into (insn 1608 1603 16000 77 (set (reg:V8SI 1189) (mem/u/c:V8SI (reg/f:SI 5064) [0 S32 A128])) "/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0 discrim 126 1151 {*movv8si} (expr_list:REG_EQUIV (mem/u/c:V8SI (const:SI (plus:SI (symbol_ref:SI ("*.LANCHOR0") [flags 0x182]) (const_int 64 [0x40]))) [0 S32 A128]) (expr_list:REG_DEAD (reg/f:SI 5064) (nil (insn 16000 1608 1609 77 (set (reg/f:SI 1187) (plus:SI (reg/f:SI 65 frame) (const_int -1248 [0xfb20]))) "/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0 discrim 126 4 {*addsi3} (expr_list:REG_EQUIV (plus:SI (reg/f:SI 65 frame) (const_int -1248 [0xfb20])) (nil))) (insn 1609 16000 12961 77 (set (mem/v/c:V8SI 
(reg/f:SI 1187) [1 S32 A128]) (mem/u/c:V8SI (reg/f:SI 5064) [0 S32 A128])) "/home/amylaar/embecosm/fsf-cme3/gcc/gcc/testsuite/c-c++-common/torture/complex-sign-add.c":44:0 discrim 126 -1 (expr_list:REG_DEAD (reg:V8SI 1189) (expr_list:REG_DEAD (reg/f:SI 1187) (nil during compilation of check_add_long_double. When a pattern with a mandatory split is recognized, you must make sure it can be split. If the pattern ceases to be valid at some point during the compilation, you must make sure it can be split or otherwise transformed before another attempt to recognize it is made.
[Bug target/111020] RFE: RISC-V: ability to cherry-pick additional instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111020 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #6 from Jorn Wolfgang Rennecke --- (In reply to H. Peter Anvin from comment #5) > 2. It seems like it almost would require an implementation-specific > performance model. Now, one can validly argue that by setting the cost of > unimplemented instructions to a (near-)infinite value such instructions > should never be generated even if they are "enabled". That might also be a > possible avenue for achieving this. Yes, that makes it possible to implement the interface without actually having a dedicated mask table. However, you still have the headache of how to get code generation to use this effectively. A lot of code generation strategies are basically canned solutions that a skilled assembler programmer has devised; you can theoretically use the superoptimizer to find linear sequences for arbitrary instruction sets, but the compilation time cost and the limit to linear sequences make this impractical. Therefore, as you want to co-develop architecture and software, you likely also have to hack the compiler to make effective use of your architecture. FWIW, 'infinite' cost seems unnecessarily high, considering you could make your assembler replace missing instructions with function calls, and these functions can get linked from a library. So you have a finite cost per-call for the call site size (static instruction count) & time (dynamic instruction count), and a one-time size cost per-object for each function used. Such a library and assembler modification could be prepared for specific extensions that you want to deconstruct, and then used flexibly.
[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361 --- Comment #8 from Jorn Wolfgang Rennecke --- Bootstrapped and regression tested on x86_64-pc-linux-gnu.
[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361 Jorn Wolfgang Rennecke changed: What|Removed |Added Attachment #50837|0 |1 is obsolete|| --- Comment #7 from Jorn Wolfgang Rennecke --- Created attachment 50839 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50839=edit Amended patch This patch also disables the affected tests.
[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361 --- Comment #5 from Jorn Wolfgang Rennecke --- (In reply to Patrick Palka from comment #3) > Btw, we already disable the floating-point to_chars on targets without a > binary64 double. So is our test for detecting binary64 not accurate enough, > or are these 16-bit targets whose double type really is binary64? At least in the case of eSi-RISC, it is the latter.
[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361 --- Comment #4 from Jorn Wolfgang Rennecke --- Created attachment 50837 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50837=edit Proposed patch This patch fixes the problem for eSi-RISC and bootstraps on x86_64-pc-linux-gnu , with floating_to_chars.o properly built in each stage. Could you check that this also works for msp430?
[Bug libstdc++/100361] gcc-11 for msp430-elf fails to build: src/c++17/floating_to_chars.cc:107: d2fixed_full_table.h:1283:23: error: size of array ‘POW10_SPLIT_2’ exceeds maximum object size ‘32767’
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100361 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2021-05-18 --- Comment #1 from Jorn Wolfgang Rennecke --- I also see this for 16 bit eSi-RISC targets. This array can't fit into a 16 bit address space that addresses 8 bit units.
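The size argument can be checked with simple arithmetic. The sketch below takes the 32767-byte limit from the error message in the bug title; the table shape (rows of three 64-bit words) and the row count 3133 used in the usage note are assumptions that should be checked against d2fixed_full_table.h. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Maximum object size quoted in the error message for the 16-bit target.  */
#define MAX_OBJECT_SIZE_16BIT 32767u

/* Bytes needed for a table of ROWS entries, assuming each entry
   consists of three 64-bit words (an assumption, see above).  */
size_t
table_bytes (size_t rows)
{
  return rows * 3 * sizeof (uint64_t);
}

/* Nonzero if an object of BYTES bytes fits under the 16-bit limit.  */
int
fits_16bit (size_t bytes)
{
  return bytes <= MAX_OBJECT_SIZE_16BIT;
}
```

Under these assumptions, table_bytes (3133) is 75192 bytes, which exceeds not only the 32767-byte object limit but the whole 65536-byte address space.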
[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877 Jorn Wolfgang Rennecke changed: What|Removed |Added Attachment #46574|0 |1 is obsolete|| --- Comment #18 from Jorn Wolfgang Rennecke --- Created attachment 46577 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46577=edit patch for aligned stack - but clamping max alignment at MAX_SUPPORTED_STACK_ALIGNMENT (In reply to r...@cebitec.uni-bielefeld.de from comment #17) > > --- Comment #15 from Jorn Wolfgang Rennecke --- > > Created attachment 46574 [details] > > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46574=edit > > patch for the case that the stack is sufficiently aligned > [...] > > I have attached a patch to preserve the alignment of the passed type for the > > case that the stack is already sufficiently aligned. > > This patch breaks sparc-sun-solaris2.11 bootstrap with an ICE while > compiling stage2 function.c: > > during RTL pass: expand > /vol/gcc/src/hg/trunk/local/gcc/function.c: In function 'void > assign_parm_find_data_types(assign_parm_data_all*, tree, > assign_parm_data_one*)': > /vol/gcc/src/hg/trunk/local/gcc/function.c:2426:49: internal compiler error: This location doesn't make much sense to me. Maybe some artefact from optimized compilation and register windows? > in assign_stack_temp_for_type, at function.c:880 > 2426 | else if (targetm.calls.strict_argument_naming (all->args_so_far)) > |~^~ > 0x11bc22f assign_stack_temp_for_type(machine_mode, poly_int<1u, long long>, > tree_node*) > /vol/gcc/src/hg/trunk/local/gcc/function.c:878 > 0x11bc963 assign_temp(tree_node*, int, int) This looks like the modified assert there has triggered. It'd be interesting to know why - i.e. what variable does want more alignment than MAX_SUPPORTED_STACK_ALIGNMENT - during bootstrap? Or is this a BLKmode variable with less alignment than BIGGEST_ALIGNMENT? 
User code could specify silly alignments which we couldn't provide with ordinary allocation (using a fixed offset from sp/fp) and which could also blow up the frame size too much if we tried, so it makes sense to clamp the alignment to MAX_SUPPORTED_STACK_ALIGNMENT in get_stack_local_alignment. The other side is that the code in assign_stack_temp_for_type seems to require BIGGEST_ALIGNMENT for BLKmode; I'm not sure about assign_stack_local_1 slots. It seems a bit wasteful, but trying to reduce waste of space in the stack frame is really a different issue, so I also modified the patch to use at least BIGGEST_ALIGNMENT for BLKmode so that it's (bug-?)compatible in that aspect with the previous code - see attached modified patch.
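The clamping policy described in this comment can be sketched as a small function. This is not the patch itself: the function name, the two constants and their values (in bits) are illustrative assumptions, since real targets define their own BIGGEST_ALIGNMENT and MAX_SUPPORTED_STACK_ALIGNMENT.

```c
/* Illustrative values in bits; real targets define their own.  */
#define SKETCH_BIGGEST_ALIGNMENT 64u
#define SKETCH_MAX_SUPPORTED_STACK_ALIGNMENT 512u

/* Alignment to use for a stack slot, given the alignment requested
   by the type and whether the slot is BLKmode.  */
unsigned
clamp_slot_alignment (unsigned requested_bits, int is_blkmode)
{
  unsigned alignment = requested_bits;

  /* Stay compatible with the old behaviour: BLKmode slots get at
     least the biggest ordinary alignment.  */
  if (is_blkmode && alignment < SKETCH_BIGGEST_ALIGNMENT)
    alignment = SKETCH_BIGGEST_ALIGNMENT;

  /* Never promise more than fixed-offset allocation can deliver.  */
  if (alignment > SKETCH_MAX_SUPPORTED_STACK_ALIGNMENT)
    alignment = SKETCH_MAX_SUPPORTED_STACK_ALIGNMENT;

  return alignment;
}
```

The design choice is that over-aligned requests degrade silently to the supported maximum instead of tripping an assert, while the BLKmode floor keeps existing frame layouts unchanged.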
[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877 --- Comment #15 from Jorn Wolfgang Rennecke --- Created attachment 46574 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46574=edit patch for the case that the stack is sufficiently aligned (In reply to dave.anglin from comment #11) > $sp is aligned on entry to main: > (gdb) p/x $sp > $1 = 0xf8d02300 > > However, the invisible reference is a $sp - 0x78. That's not sufficiently > aligned. I've built a cross compiler to take a closer look. MAX_SUPPORTED_STACK_ALIGNMENT is 512, so the problem is completely different for this target. Looking at pa.h, the value comes from PREFERRED_STACK_BOUNDARY : /* Boundary (in *bits*) on which stack pointer is always aligned; certain optimizations in combine depend on this. The HP-UX runtime documents mandate 64-byte and 16-byte alignment for the stack on the 32 and 64-bit ports, respectively. However, we are only guaranteed that the stack is aligned to BIGGEST_ALIGNMENT in main. Thus, we treat the former as the preferred alignment. */ #define STACK_BOUNDARY BIGGEST_ALIGNMENT #define PREFERRED_STACK_BOUNDARY (TARGET_64BIT ? 128 : 512) ... /* No data type wants to be aligned rounder than this. The long double type has 16-byte alignment on the 64-bit target even though it was never implemented in hardware. The software implementation only needs 8-byte alignment. This matches the biggest alignment of the HP compilers. */ #define BIGGEST_ALIGNMENT (2 * BITS_PER_WORD) Even with TARGET_64_BIT, we got a PREFERRED_STACK_BOUNDARY of 128 . However: #define UNITS_PER_WORD (TARGET_64BIT ? 8 : 4) It seems suspicious that PREFERRED_STACK_BOUNDARY is smaller for TARGET_64BIT ? Be this as it may, the problem for the 84877 testcase is not that the stack has insufficient alignment, but that the stack slot doesn't have an aligned offset. 
The alignment gets pruned in function.c:get_stack_local_alignment : if (mode == BLKmode) alignment = BIGGEST_ALIGNMENT; I have attached a patch to preserve the alignment of the passed type for the case that the stack is already sufficiently aligned. To test the case where the stack is insufficiently aligned, for hppa we should use a different testcase with > 512 bit alignment of the type.
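The difference between an aligned stack pointer and an aligned slot offset can be checked with the values quoted from gdb above. The helper address_alignment is hypothetical; it returns the largest power of two (in bytes) that divides an address.

```c
#include <stdint.h>

/* Largest power of two dividing ADDR, i.e. its lowest set bit.
   Returns 0 for a zero address.  */
uint32_t
address_alignment (uint32_t addr)
{
  return addr & (~addr + 1u);
}
```

For $sp = 0xf8d02300 this gives 256 bytes, but for the slot at $sp - 0x78 = 0xf8d02288 it gives only 8 bytes, so the slot is under-aligned even though the stack pointer itself is not.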
[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877 --- Comment #13 from Jorn Wolfgang Rennecke --- (In reply to Hans-Peter Nilsson from comment #12) > (In reply to Jorn Wolfgang Rennecke from comment #10) > > Created attachment 46567 [details] > > Fix for targets that pass the argument by invisible reference > > Thanks for your efforts. This *may* have affected the code generated by > gcc.dg/pr84877.c; that test now passes, but that's unreliable as I've seen > the outcome depends on random stack alignment of the context, and my > baseline is from a context different enough. I believe inspecting the > generated code isn't of much interest given David Anglin's observations for > hppa and... > > However, it introduces these regressions: > +gcc.sum gcc.dg/pr80286.c > +gcc.sum gcc.dg/torture/pr78542.c > +gcc.sum gcc.dg/torture/pr86363.c > +gcc.sum gcc.dg/torture/va-arg-25.c I tried if I could reproduce this with a cross-compiler built for --target=hppa-linux-gnu; the va-arg-25.c test case needs headers, but the others can be compiled just using xgcc & cc1. I tried with the options in dg-options, and for pr78542.c / pr86363.c I also tried additional -O options. However, I don't see any ICE. Is there a special configuration or set of options needed, or is this just impossible with a cross compiler?
[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073 --- Comment #16 from Jorn Wolfgang Rennecke --- Going from gcc 8.2 to gcc 9.1, I find the following two test cases are now autovectorized: /* { dg-do compile } */ /* { dg-options "-O3" } */ /* Test auto-vectorization */ #include "vector-types.h" #define LENGTH 256 __attribute__((aligned (VECTOR_SIZE))) short a[LENGTH], b[LENGTH]; short c; void foo (void) { int i; for (i=0; i> (c & 0xf); } } /* { dg-do compile } */ /* { dg-options "-O3" } */ /* Test auto-vectorization */ #include "vector-types.h" #define LENGTH 256 __attribute__((aligned (VECTOR_SIZE))) unsigned short a[LENGTH], b[LENGTH]; unsigned short c; void foo (void) { int i; for (i=0; i> (c & 0xf); } }
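The loop bodies of the two test cases above were truncated in this archive. The following is a minimal sketch of the kind of loop involved, not the original test text: it assumes the body stores b[i], shifted right by the masked count, into a[i], and it omits the vector-types.h include and the alignment attribute.

```c
#define LENGTH 256

short a[LENGTH], b[LENGTH];
short c;

/* Shift every element right by a count restricted to 0..15.  */
void
foo (void)
{
  int i;
  for (i = 0; i < LENGTH; i++)
    a[i] = b[i] >> (c & 0xf);
}
```

The mask on c keeps the count below the 16-bit width of the element type, which is what allows the shift to be done on short vector lanes instead of widened int lanes.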
[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #10 from Jorn Wolfgang Rennecke --- Created attachment 46567 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46567=edit Fix for targets that pass the argument by invisible reference I also observe this problem on esirisc. assign_parm is only relevant for the testcase if the argument is passed by value, where the copy is made in foo. If the argument is passed by invisible reference, we have instead during compilation of main expand_call calling initialize_argument_information, which calls assign_temp, which calls assign_temp_for_type, which calls assign_stack_local_1 . The attached patch changes initialize_argument_information to use the same code path as for variable-sized arguments; it's a bit more overhead, but I would think that excess alignment is a relatively rare case. If performance for this alignment were really important, you could change the stack management so that the alignment can be provided more cheaply. Since the esirisc port is not in the FSF tree, it doesn't really count for testing; also, the behaviour will vary depending on argument passing of the target, so we need to test a variety of targets.
[Bug testsuite/91065] gcc.dg/plugin/start_unit_plugin.c uses ggc memory without registering a root_tab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91065 Jorn Wolfgang Rennecke changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Jorn Wolfgang Rennecke --- Patch applied, not a regression, since the test was like this from the start.
[Bug testsuite/91065] gcc.dg/plugin/start_unit_plugin.c uses ggc memory without registering a root_tab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91065 --- Comment #2 from Jorn Wolfgang Rennecke --- Author: amylaar Date: Wed Jul 3 00:22:53 2019 New Revision: 272954 URL: https://gcc.gnu.org/viewcvs?rev=272954=gcc=rev Log: PR testsuite/91065 * testsuite/gcc.dg/plugin/start_unit_plugin.c: Register a root tab to reference fake_var. Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/gcc.dg/plugin/start_unit_plugin.c
[Bug ipa/91062] gcc.dg/ipa/ipa-pta-1.c dump contains garbage when gcc was configured with --enable-checking=all
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91062 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #2 from Jorn Wolfgang Rennecke --- varmap is allocated on the heap, and lives across passes. Yes it references a name that is sometimes in static storage, but mostly in ggc-allocated memory. I suppose inhibiting garbage collection during ipa would be no good, so either the names should be allocated on the heap (ironically, often the name is generated on the heap and later copied to ggc memory), or be reachable from a ggc root. I have traced the output of one garbage string emitted in the dump file for gcc.dg/torture/ipa-pta-1.c back to its origin (index is 9 in new_var_info, and the string is in "name"; gcc source svn revision is 272931): #0 new_var_info (t=0x0, name=0x7fffefba2050 "test4.clobber", add_id=false) at ../../gcc/gcc/tree-ssa-structalias.c:383 #1 0x00fa2d81 in create_function_info_for (decl=0x7fffefce8700, name=0x7fffefb9ff40 "test4", add_id=false, nonlocal_p=true) at ../../gcc/gcc/tree-ssa-structalias.c:5785 #2 0x00fa9725 in ipa_pta_execute () at ../../gcc/gcc/tree-ssa-structalias.c:8095 #3 0x00faab71 in (anonymous namespace)::pass_ipa_pta::execute ( this=0x271a9e0) at ../../gcc/gcc/tree-ssa-structalias.c:8493 #4 0x00c5e991 in execute_one_pass (pass=pass@entry=0x271a9e0) at ../../gcc/gcc/passes.c:2473 #5 0x00c5fa32 in execute_ipa_pass_list (pass=0x271a9e0) at ../../gcc/gcc/passes.c:2913 #6 0x008918e9 in symbol_table::compile ( this=this@entry=0x7fffefb9e100) at ../../gcc/gcc/cgraphunit.c:2648 #7 0x00894b08 in symbol_table::compile (this=0x7fffefb9e100) at ../../gcc/gcc/cgraphunit.c:2825 #8 symbol_table::finalize_compilation_unit (this=0x7fffefb9e100) at ../../gcc/gcc/cgraphunit.c:2861 #9 0x00d8d544 in compile_file () at ../../gcc/gcc/toplev.c:481 #10 0x006b8919 in do_compile () at ../../gcc/gcc/toplev.c:2209 #11 toplev::main (this=this@entry=0x7fffddf0, argc=, argc@entry=22, argv=, argv@entry=0x7fffdef8) ... 
at the end of the pass ... #0 ggc_collect () at ../../gcc/gcc/ggc-page.c:2174 #1 0x00c5e6fb in execute_one_ipa_transform_pass (ipa_pass=0x271a2e0, node=0x7fffefb9d708) at ../../gcc/gcc/passes.c:2232 #2 execute_all_ipa_transforms () at ../../gcc/gcc/passes.c:2250 #3 0x00882662 in cgraph_node::get_body (this=0x7fffefb9d708) at ../../gcc/gcc/cgraph.c:3621 #4 0x00fa9633 in ipa_pta_execute () at ../../gcc/gcc/tree-ssa-structalias.c:8077 #5 0x00faab71 in (anonymous namespace)::pass_ipa_pta::execute ( this=0x271a9e0) at ../../gcc/gcc/tree-ssa-structalias.c:8493 #6 0x00c5e991 in execute_one_pass (pass=pass@entry=0x271a9e0) at ../../gcc/gcc/passes.c:2473 #7 0x00c5fa32 in execute_ipa_pass_list (pass=0x271a9e0) at ../../gcc/gcc/passes.c:2913 #8 0x008918e9 in symbol_table::compile ( this=this@entry=0x7fffefb9e100) at ../../gcc/gcc/cgraphunit.c:2648 #9 0x00894b08 in symbol_table::compile (this=0x7fffefb9e100) at ../../gcc/gcc/cgraphunit.c:2825 #10 symbol_table::finalize_compilation_unit (this=0x7fffefb9e100) at ../../gcc/gcc/cgraphunit.c:2861 #11 0x00d8d544 in compile_file () at ../../gcc/gcc/toplev.c:481 #12 0x006b8919 in do_compile () at ../../gcc/gcc/toplev.c:2209 #13 toplev::main (this=this@entry=0x7fffddf0, argc=, ... lots of garbage collections and constraint dumpings later... 
#0 __GI__IO_fputs (str=0x7fffefba2050 '\245' , "\"", fp=0x27fd8e0) at iofputs.c:32Breakpoint 8, dump_constraint (file=0x27fd8e0, c=0x281fc38) at ../../gcc/gcc/tree-ssa-structalias.c:678 678 fprintf (file, "%s", get_varinfo (c->lhs.var)->name); (gdb) s get_varinfo (n=9) at ../../gcc/gcc/tree-ssa-structalias.c:346 346 return varmap[n]; (gdb) fin Run till exit from #0 get_varinfo (n=9) at ../../gcc/gcc/tree-ssa-structalias.c:346 0x00f9570e in dump_constraint (file=0x27fd8e0, c=0x281fc38) at ../../gcc/gcc/tree-ssa-structalias.c:678 678 fprintf (file, "%s", get_varinfo (c->lhs.var)->name); Value returned is $27 = (variable_info *) 0x27cd7b0 (gdb) s __GI__IO_fputs (str=0x7fffefba2050 '\245' , "\"", fp=0x27fd8e0) at iofputs.c:32 32 { (gdb) p $22 $28 = 0x7fffefba2050 '\245' , "\"" (gdb) bt #0 __GI__IO_fputs (str=0x7fffefba2050 '\245' , "\"", fp=0x27fd8e0) at iofputs.c:32 #1 0x00f95721 in dump_constraint (file=0x27fd8e0, c=0x281fc38) at ../../gcc/gcc/tree-ssa-structalias.c:678 #2 0x00f958db in dump_constraints (file=0x27fd8e0, from=44) at ../../gcc/gcc/tree-ssa-structalias.c:723 #3 0x00fa9d22 in ipa_pta_execute () at ../../gcc/gcc/tree-ssa-structalias.c:8193 #4
[Bug testsuite/91065] gcc.dg/plugin/start_unit_plugin.c uses ggc memory without registering a root_tab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91065 Jorn Wolfgang Rennecke changed: What|Removed |Added Keywords||patch --- Comment #1 from Jorn Wolfgang Rennecke --- I've posted a patch: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00187.html
[Bug testsuite/91065] New: gcc.dg/plugin/start_unit_plugin.c uses ggc memory without registering a root_tab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91065 Bug ID: 91065 Summary: gcc.dg/plugin/start_unit_plugin.c uses ggc memory without registering a root_tab Product: gcc Version: 10.0 Status: UNCONFIRMED Keywords: GC Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org Target Milestone: --- Host: x86_64-pc-linux-gnu (probably doesn't really matter) Target: native or cross gcc.dg/plugin/start_unit_plugin.c sets fake_var to ggc-allocated memory, without registering a root_tab that references fake_var. This causes gcc.dg/plugin/start_unit-test-1.c to fail when the compiler is configured with --enable-checking=all
[Bug ipa/91062] gcc.dg/ipa/ipa-pta-1.c dump contains garbage when gcc was configured with --enable-checking=all
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91062 --- Comment #1 from Jorn Wolfgang Rennecke --- Similarly, gcc.dg/torture/ipa-pta-1.c fails four scan tests because ipa-pta-1.c.083i.pta2 gets corrupted in the ENABLE_GC_ALWAYS_COLLECT scenario.
[Bug ipa/91062] New: gcc.dg/ipa/ipa-pta-1.c dump contains garbage when gcc was configured with --enable-checking=all
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91062 Bug ID: 91062 Summary: gcc.dg/ipa/ipa-pta-1.c dump contains garbage when gcc was configured with --enable-checking=all Product: gcc Version: 10.0 Status: UNCONFIRMED Keywords: GC Severity: normal Priority: P3 Component: ipa Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org CC: marxin at gcc dot gnu.org Target Milestone: --- Host: x86_64-pc-linux-gnu (probably doesn't really matter) Target: native or cross A number of symbol names in the dump file have been replaced by what looks like ggc erased memory. The problem can be hidden by adding a suitable min_expand value, e.g. (for native unix): make check-gcc RUNTESTFLAGS='--target_board=unix/--param=ggc-min-expand=30 ipa.exp=ipa-pta-1.c' on a machine with 16 GB RAM + 8 GB swap. OTOH, I haven't been able to reproduce this using a compiler that hasn't been configured with --enable-checking, or merely with --enable-checking=yes, even when adding --param=ggc-min-expand=0 . We've originally observed this in a variant of gcc 8.2, so this bug has probably been around for a while.
[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726 --- Comment #21 from Jorn Wolfgang Rennecke --- Author: amylaar Date: Mon Jul 1 21:48:55 2019 New Revision: 272911 URL: https://gcc.gnu.org/viewcvs?rev=272911=gcc=rev Log: PR middle-end/66726 * tree-ssa-phiopt.c (factor_out_conditional_conversion): Tune heuristic from PR71016 to allow MIN / MAX. * testsuite/gcc.dg/tree-ssa/pr66726-4.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/pr66726-4.c Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-phiopt.c
[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073 --- Comment #14 from Jorn Wolfgang Rennecke --- (In reply to Jorn Wolfgang Rennecke from comment #12) > If we are right shifting a signed type, we could apply a MAX operation to the > shift count. Oops, I mean MIN of course. So that we can guarantee that the maximum applied shift count is one less than the bitsize of the shifted value.
[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073 --- Comment #13 from Jorn Wolfgang Rennecke --- If the shifted value is 16 bit and int is 32 bit wide, then, depending on target costs, instead of a vector compare, we might decide to use a sign extract of bit 4 of the shift count instead.
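A scalar sketch of the bit-4 idea, for one unsigned 16-bit lane and a count in 0..31 (the function name is hypothetical, and a real implementation would apply the same steps to whole vectors): bit 4 of the count is set exactly when the count is 16 or more, so sign-extracting that bit yields an all-ones mask for out-of-range counts without a compare.

```c
#include <stdint.h>

/* Shift X right by COUNT (0..31), giving the same result as doing the
   shift in 32-bit int and truncating back to 16 bits.  */
uint16_t
shr_u16_bit4 (uint16_t x, unsigned count)
{
  /* Sign-extract bit 4: all ones when count >= 16, zero otherwise.  */
  uint16_t out_of_range = (uint16_t) -(int) ((count >> 4) & 1u);

  /* In-lane shift by the low four bits of the count.  */
  uint16_t shifted = (uint16_t) (x >> (count & 15u));

  /* Zero the lane when the count was out of range.  */
  return shifted & (uint16_t) ~out_of_range;
}
```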
[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073 --- Comment #12 from Jorn Wolfgang Rennecke --- If we are left shifting a narrow signed type for the result, and no defined overflow semantics are in place, it should be OK to just vectorize the code using the result type. If we are right shifting a signed type, we could apply a MAX operation to the shift count. If we are shifting an unsigned type, we can do a vector compare to check if the shift count exceeds the range, and use an AND to zero the result if that is the case. If we are doing a shift right of a signed value where -fwrapv semantics are required or allowed, we can do the same as for unsigned shift. Thus, a shift is replaced by two or three vector operations, which should be a win if the vectorization factor is four or more. The MAX and compare operations might subsequently be eliminated if value range propagation finds that the value can't be out of range.
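The two right-shift strategies can be sketched per lane in scalar C. The clamp uses MIN, following the correction in comment #14; the function names are hypothetical, counts are assumed to lie in 0..31, and the signed variant relies on right shifts of negative values being arithmetic, as they are with GCC.

```c
#include <stdint.h>

/* Signed lane: clamp the count to 15 so the narrow shift yields the
   same 0 or -1 that the widened 32-bit shift would produce.  */
int16_t
shr_s16_clamped (int16_t x, unsigned count)
{
  unsigned clamped = count < 15u ? count : 15u;  /* MIN (count, 15) */
  return (int16_t) (x >> clamped);
}

/* Unsigned lane: compare the count against the lane width and AND
   the shifted value with the resulting all-ones or all-zeros mask.  */
uint16_t
shr_u16_masked (uint16_t x, unsigned count)
{
  uint16_t in_range = count < 16u ? 0xffffu : 0u;
  return (uint16_t) ((x >> (count & 15u)) & in_range);
}
```

Each variant replaces one widened shift by two or three narrow operations, matching the operation count discussed in the comment.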
[Bug tree-optimization/40073] Vector short/char shifts generate sub-optimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40073 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #11 from Jorn Wolfgang Rennecke --- Created attachment 45079 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45079&action=edit testcase using restricted shift count Even if the shift count is restricted in range by applying an AND first, which also further boosts the optimization potential for SHIFT_COUNT_TRUNCATED targets, the code is not vectorized.
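The kind of loop meant here might look like the following. This is a hypothetical reconstruction for readers without access to the attachment, not the attached testcase itself; the function and parameter names are assumptions.

```c
#include <stddef.h>

/* Shift each 16-bit element by a per-element count that has been
   restricted to 0..15 with an AND, so no out-of-range count can occur.  */
void shift_masked(unsigned short *restrict dst,
                  const unsigned short *restrict src,
                  const unsigned char *restrict cnt,
                  size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned short)(src[i] >> (cnt[i] & 15));
}
```

With the AND in place, the promoted and the narrow shift give the same result for every input, so no compare or clamp is needed to preserve the semantics.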
[Bug tree-optimization/44976] reductions with short variables do not get vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44976 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #3 from Jorn Wolfgang Rennecke --- Ironically, this is a case where -fwrapv improves optimization.
[Bug other/39363] [meta-bug] pending patches from ARC International (UK) Ltd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39363 Bug 39363 depends on bug 39302, which changed state. Bug 39302 Summary: [meta-bug] bugs waiting for Copyright Assignment acknowledgemt for ARC International (UK) Ltd https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39302 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug other/39302] [meta-bug] bugs waiting for Copyright Assignment acknowledgemt for ARC International (UK) Ltd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39302 Jorn Wolfgang Rennecke changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #3 from Jorn Wolfgang Rennecke --- (In reply to Eric Gallager from comment #2) > (In reply to Jorn Wolfgang Rennecke from comment #1) > > Confirmation received. I'll have to send out the patches now. > > Have you done this yet? Yes, see other/39363 and the various ARC branches from that time. > Also does this need to keep the "meta-bug" label Yes. This 'bug' describes and tracks state of a set of other bugs, it is not a GNU software bug in its own right. OTOH, the issue being tracked by this meta-bug - need for (verification of) Copyright assignment for patches from ARC International (UK) Ltd - has been resolved, and the dependent bugs are thus no longer blocked (since comment #1), so moving this to FIXED.
[Bug rtl-optimization/55531] peephole2 pattern with multiple insns with match_parallel insn causes corrupted peephole2_insns matching function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55531 --- Comment #2 from Jorn Wolfgang Rennecke --- (In reply to Eric Gallager from comment #1) > so this is... what, wrong-code? ice-on-valid-code? build? > > (I should go to bed instead of trying to figure this out...) ice-on-valid-code, and consequently a build issue.
[Bug target/85993] config/sh/sh.c:10878: suspicious if .. else chain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85993 Jorn Wolfgang Rennecke changed: What|Removed |Added CC||olegendo at gcc dot gnu.org --- Comment #2 from Jorn Wolfgang Rennecke --- (In reply to David Binderman from comment #0) > config/sh/sh.c:10878:12: warning: duplicated ‘if’ condition > [-Wduplicated-cond] > > Source code is > > else if (scratch0 != scratch1) > { > emit_move_insn (scratch1, GEN_INT (vcall_offset)); > emit_insn (gen_add2_insn (scratch0, scratch1)); > offset_addr = scratch0; > } > > but earlier is code > > else if (scratch0 != scratch1) > { > /* scratch0 != scratch1, and we have indexed loads. Get better > schedule by loading the offset into r1 and using an indexed > load - then the load of r1 can issue before the load from > (this_rtx + delta) finishes. */ > emit_move_insn (scratch1, GEN_INT (vcall_offset)); > offset_addr = gen_rtx_PLUS (Pmode, scratch0, scratch1); > } The condition for this block used to be: else if (! TARGET_SH5 && scratch0 != scratch1) because the SH5 SHcompact indexed addressing doesn't actually work the way GCC expects indexed addressing to work. Thus, the second block (quoted first) is SH5 code.
[Bug other/44032] internals documentation is not legally safe to use
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44032 --- Comment #4 from Jorn Wolfgang Rennecke --- (In reply to Eric Gallager from comment #3) > Is this fixed in the same way that bug 44035 was fixed? No. 44035 was about the inability to fix, 44032 is about the actual licensing state of the documentation. A brief look at gccint.texi shows that this file remains purely GFDL. I suppose there are numerous other files likewise affected. It can only be considered fixed if all the parts of existing documentation that you might conceivably want to cut & paste into GPLed code are suitably re-licensed, and we have put something in place so that the issue will generally not appear with new GCC documentation. If all documentation files that come with GCC were patched as suggested in comment #2, that could be considered a solution, as people who cut & paste the copyright blurb for new files would pick up the new text. Well, there might be a transition period when backed-up patches and patches made using older baselines need to be vetted for necessary adjustments. If only some documentation files are patched to have the amended copyright blurb, as others have no applicable code samples, the others should have a warning not to copy them to new files that will have such samples.
[Bug other/44035] internals documentation cannot be fixed without new GFDL license grants
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44035 --- Comment #7 from Jorn Wolfgang Rennecke --- (In reply to jos...@codesourcery.com from comment #6) > Since we have docstring relicensing maintainers, I don't think this is an > issue now. Oops, that slipped my mind. Indeed, we can consider this arrangement to have fixed this issue.
[Bug other/44035] internals documentation cannot be fixed without new GFDL license grants
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44035 --- Comment #5 from Jorn Wolfgang Rennecke --- (In reply to Eric Gallager from comment #4) > Does this really need to have "blocker" importance? It has gone several > years without actually blocking any releases. The license issue has blocked a comprehensive consolidation of the target description. The question if it's currently blocking is a bit philosophical. If the license issue was resolved, would there be anyone right now with the time and motivation to take up the work? OTOH, we generally accept that there can be multiple blocking issues, all of which have to be resolved to allow a certain task to proceed.
[Bug tree-optimization/38785] [6/7/8 Regression] huge performance regression on EEMBC bitmnp01
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785 --- Comment #50 from Jorn Wolfgang Rennecke --- It certainly is the case that the merit of an optimization can often not be evaluated until further optimization passes are done. In fact, for an assembly programmer, evaluating potential alternative code transformations, and selecting the most suitable, or backtracking altogether, is a common modus operandi. Where pre creates a lot of new phi-nodes, in the hope that subsequently there will be a commensurate pay-off, this should be evaluated at a later point down the chain of optimization passes, either on a per-function, or on a per-SESE-region basis. In obvious cases, it might be enough to have a certain number of deletions of code / phi nodes relative to the phi nodes previously created, or an overall cost decrease for the function / SESE region, while in more complicated cases (or just because you choose a higher optimization level), you want to actually compare the code with and without the aggressive pre optimization, or compare various levels of aggressiveness of pre optimizations. We have long limited GCC to only follow a static pass phasing and to make decisions one at a time, not to be reconsidered, but maybe undone by a subsequent pass, if possible and deemed suitable at that time. As long as we don't allow GCC to consider doing alternative transformations, and backtracking, it will forever be limited. I wonder if people would consider using an operating-system dependent operation - namely fork - to get the ball rolling. I am aware that we'd eventually need a further pointer abstraction for cross-pass persistent memory to support compiler instance duplication on systems that can't fork, and with GTY and C++ copy constructors we should be half-way there, but I think we should first explore what we can do with compiler instance duplication on systems where we can have it essentially for free.
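To make the fork idea concrete, here is a toy POSIX sketch; nothing in it is GCC code. The "compiler state" is a single integer cost, toy_apply stands in for an irreversible transformation, and all names are hypothetical. Each alternative is tried in a forked child, which reports the resulting cost through a pipe; the parent's state is untouched until it commits the winner.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Toy stand-in for the whole compiler state.  */
static int toy_ir_cost = 100;

/* Hypothetical transformation that mutates the state in place.
   Valid aggressiveness levels are 0..2.  */
static void toy_apply(int aggressiveness)
{
    static const int delta[] = { 0, -15, 20 };
    toy_ir_cost += delta[aggressiveness];
}

/* Try one alternative in a forked copy of the process.
   Returns the resulting cost, or -1 on failure.  */
static int toy_cost_in_child(int aggressiveness)
{
    int fd[2];
    if (pipe(fd) != 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: works on its own copy of the state.  */
        close(fd[0]);
        toy_apply(aggressiveness);
        ssize_t w = write(fd[1], &toy_ir_cost, sizeof toy_ir_cost);
        _exit(w == (ssize_t)sizeof toy_ir_cost ? 0 : 1);
    }
    close(fd[1]);
    int cost = -1;
    ssize_t r = read(fd[0], &cost, sizeof cost);
    close(fd[0]);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (r != (ssize_t)sizeof cost || !WIFEXITED(status)
        || WEXITSTATUS(status) != 0)
        return -1;
    return cost;
}

/* Evaluate all alternatives, then commit the cheapest one in the parent.
   Returns the chosen alternative, or -1 if none could be evaluated.  */
static int toy_pick_best(int n_alternatives)
{
    int best = -1, best_cost = 0;
    for (int a = 0; a < n_alternatives; a++) {
        int cost = toy_cost_in_child(a);
        if (cost >= 0 && (best < 0 || cost < best_cost)) {
            best = a;
            best_cost = cost;
        }
    }
    if (best >= 0)
        toy_apply(best);
    return best;
}
```

The sketch re-runs the winning transformation in the parent; an alternative would be to let the winning child carry on and have the parent exit, which avoids the repeated work.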
[Bug rtl-optimization/29854] reload_combine looses track of uses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29854 --- Comment #8 from Jorn Wolfgang Rennecke --- revision 149282: 2009-07-06 J"orn Rennecke, Kaz Kojima PR rtl-optimization/30807 * postreload.c (reload_combine): For every new use of REG_SUM, record the use of BASE.
[Bug tree-optimization/28144] floating point constant -> byte/char/short conversion is wrong for java
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28144 Jorn Wolfgang Rennecke changed: What|Removed |Added Status|RESOLVED|REOPENED Last reconfirmed||2016-03-08 Resolution|INVALID |--- Ever confirmed|0 |1 --- Comment #7 from Jorn Wolfgang Rennecke --- PR 27394 was closed on the grounds that the code exhibited undefined behaviour and that alternate facilities had been added in the meantime which mitigate the impact of the inconsistently implemented behaviour on debugging. However, this PR (28144) is about the impact on Java; an updated link to the quoted spec above is: http://docs.oracle.com/javase/specs/jls/se8/html/jls-5.html#jls-5.1.3 where it defines the exact behaviour of conversions. The comment at the start of fold_convert_const_int_from_real claims that the code implements the floating point to integer conversion rules required by the Java Language Specification, but due to the problem discussed here, that is not true when it comes to conversion to types narrower than int.
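For reference, the two-step rule of JLS 5.1.3 for conversions to byte, short and char can be written out in C as follows. This is a sketch of the specified behaviour, not of GCC's folding code; the function names are invented. The value is first converted to int (NaN gives 0, out-of-range values saturate, everything else rounds toward zero), and the int is then narrowed by discarding the high-order bits.

```c
#include <stdint.h>

/* Step 1: floating point to int as JLS 5.1.3 specifies.  */
static int32_t jls_d2i(double d)
{
    if (d != d)                      /* NaN */
        return 0;
    if (d >= 2147483647.0)
        return INT32_MAX;
    if (d <= -2147483648.0)
        return INT32_MIN;
    return (int32_t)d;               /* C also rounds toward zero */
}

/* Step 2: keep only the low-order bits, then reinterpret as signed.
   The sign is reconstructed by hand to stay within defined C behaviour.  */
static int8_t jls_d2b(double d)
{
    uint32_t low = (uint32_t)jls_d2i(d) & 0xffu;
    return (int8_t)(low < 0x80u ? (int32_t)low : (int32_t)low - 0x100);
}

static int16_t jls_d2s(double d)
{
    uint32_t low = (uint32_t)jls_d2i(d) & 0xffffu;
    return (int16_t)(low < 0x8000u ? (int32_t)low : (int32_t)low - 0x10000);
}

/* Java char is an unsigned 16-bit type.  */
static uint16_t jls_d2c(double d)
{
    return (uint16_t)((uint32_t)jls_d2i(d) & 0xffffu);
}
```

Under this rule 300.0 converts to the byte value 44, and 1e10 converts to the byte value -1; a conversion that saturated directly to the range of the narrow type would give 127 in both cases.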
[Bug other/29842] [meta-bug] outstanding patches / issues from STMicroelectronics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29842 Bug 29842 depends on bug 28144, which changed state. Bug 28144 Summary: floating point constant -> byte/char/short conversion is wrong for java https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28144 What|Removed |Added Status|RESOLVED|REOPENED Resolution|INVALID |---
[Bug tree-optimization/27394] double -> char conversion varies with optimization level
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=27394 Bug 27394 depends on bug 28144, which changed state. Bug 28144 Summary: floating point constant -> byte/char/short conversion is wrong for java https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28144 What|Removed |Added Status|RESOLVED|REOPENED Resolution|INVALID |---
[Bug c++/68767] [6 regression] spurious warning: null argument where non-null required
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68767 --- Comment #11 from Jorn Wolfgang Rennecke --- (In reply to Jakub Jelinek from comment #10) > Of course, the question is if the warning isn't really desirable, the user > should really just choose some non-NULL magic value to pass in the > impossible cases. Are you saying the *_TYPE definitions in newlib-stdint.h should not use 0 in any branches of their expressions?
[Bug middle-end/68767] spurious warning: null argument where non-null required
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68767 --- Comment #3 from Jorn Wolfgang Rennecke --- (In reply to Manuel López-Ibáñez from comment #2) > I don't understand. It is indeed passing NULL to a non-null function. What > is wrong with the warning? When you look at the original testcase closely, you'll see that it can never (unless there is a race condition, invoking undefined behaviour) pass NULL. In fact, it always passes "lstr" . The reduced testcase from comment #1 is more ambiguous. Whether it can pass NULL depends on the values that the variable might attain.
[Bug middle-end/68767] New: spurious warning: null argument where non-null required
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68767 Bug ID: 68767 Summary: spurious warning: null argument where non-null required Product: gcc Version: 6.0 Status: UNCONFIRMED Keywords: diagnostic Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org Target Milestone: --- This test, compiled with g++ -c -Werror -Wall: // { dg-do compile } // { dg-options "-Werror -Wall" } extern int len (const char *__s) throw () __attribute__ ((__pure__)) __attribute__ ((__nonnull__ (1))); extern int num; int f (void) { int i; i = len ((((num != 2) ? "lstr" : num == 1 ? "str" : 0) ? ((num != 2) ? "lstr" : num == 1 ? "str" : 0) : "lstr" )); return i; } gets the spurious warning: tmp.C:14:115: error: null argument where non-null required (argument 1) [-Werror=nonnull] m == 1 ? "str" : 0) ? ((num != 2) ? "lstr" : num == 1 ? "str" : 0) : "lstr" )); ^ Ironically, this is condensed down from c-common.c complaining about itself when building gcc for a target with a variable BITS_PER_UNIT, which also uses newlib-stdint.h . Originally observed with g++ (GCC) 5.1.1 20150618 (Red Hat 5.1.1-4), but also reproduced with g++ (GCC) 6.0.0 20151207 (experimental) .
[Bug libgcc/66883] config/epiphany/udivsi3-float.c:52: bad if test ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66883 --- Comment #2 from Jorn Wolfgang Rennecke --- Author: amylaar Date: Fri Oct 23 11:57:26 2015 New Revision: 229236 URL: https://gcc.gnu.org/viewcvs?rev=229236&root=gcc&view=rev Log: PR libgcc/66883 * config/epiphany/udivsi3-float.c: Fix CONCISE test, and comment typo. N.B., this is not active code, just documenting a previous approach for this function in C. Modified: trunk/libgcc/ChangeLog trunk/libgcc/config/epiphany/udivsi3-float.c
[Bug other/39374] reload is too earer to re-use reload registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39374 --- Comment #1 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- Created attachment 35011 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35011&action=edit gcc14:/home/amylaar/pr39374/pr39374-diff
[Bug other/39374] reload is too earer to re-use reload registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39374 --- Comment #2 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- Created attachment 35012 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35012&action=edit gcc14:/home/amylaar/pr39374/pr39374-r14476
[Bug target/64003] valgrind complains about get_attr_length_nobnd in insn-attrtab.c from i386.md
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003 --- Comment #13 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- (In reply to David Malcolm from comment #6) If I'm reading things right, this loop in shorten_branches populates insn_lengths[uid] in order of the NEXT_INSN () iteration: int (*length_fun) (rtx_insn *) = increasing ? insn_min_length : insn_default_length; for (insn_current_address = 0, insn = first; insn != 0; insn_current_address += insn_lengths[uid], insn = NEXT_INSN (insn)) { uid = INSN_UID (insn); insn_lengths[uid] = 0; /* lots of logic, which can call length_fun, and hence insn_min_length. */ } and length_fun can call into insn_min_length, and hence this calls into the get_attr_length_nobnd, which AIUI for this case is accessing lengths of other insns before they've been populated: presumably for a jump forwards? insn_min_length is not supposed to use current insn lengths. genattrtab does not follow attributes for the purposes of determining insn current length dependence. So far we consider it the job of the port to provide a length attribute that allows the calculation of minimum/maximum instruction lengths with this limitation in mind. That means the length attribute in i386.md is broken. The get_attr_length_nobnd attribute needs to be either inlined, or its use guarded in a clause that appears to be length dependent and supplies minimum and maximum values. AFAICS, the length attribute was broken in r217125 https://gcc.gnu.org/ml/gcc-cvs/2014-11/msg00133.html
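The constraint can be illustrated with a toy branch-shortening loop. This is a deliberately simplified model, not the shorten_branches implementation: the struct, the constants and the function names are all made up. The point is the split between the initial pass, which assigns each insn a minimum length from the insn alone, and the later passes, which are the only ones allowed to look at addresses and thereby at the lengths of other insns.

```c
enum { TOY_SHORT = 2, TOY_LONG = 6, TOY_RANGE = 8 };

struct toy_insn {
    int is_branch;   /* nonzero for a branch */
    int target;      /* index of the branch target insn */
    int len;         /* current length; preset for non-branches */
};

/* Address of insn IDX given the current lengths of all earlier insns.  */
static int toy_address(const struct toy_insn *v, int idx)
{
    int addr = 0;
    for (int i = 0; i < idx; i++)
        addr += v[i].len;
    return addr;
}

/* Returns the number of passes over the insn list.  */
static int toy_shorten(struct toy_insn *v, int n)
{
    /* Initial pass: minimum lengths only.  No other insn's length is
       consulted, so nothing uninitialized can be read.  */
    for (int i = 0; i < n; i++)
        if (v[i].is_branch)
            v[i].len = TOY_SHORT;

    /* Later passes: lengths of all insns are now populated, so distances
       may be computed.  Branches only ever grow, so this terminates.  */
    int passes = 0, changed;
    do {
        changed = 0;
        for (int i = 0; i < n; i++) {
            if (!v[i].is_branch || v[i].len == TOY_LONG)
                continue;
            int dist = toy_address(v, v[i].target) - toy_address(v, i);
            if (dist < 0)
                dist = -dist;
            if (dist > TOY_RANGE) {
                v[i].len = TOY_LONG;
                changed = 1;
            }
        }
        passes++;
    } while (changed);
    return passes;
}
```

A minimum-length function that computed a branch distance during the initial pass would read lengths of later insns that have not been set yet, which is the situation valgrind reports in this bug.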
[Bug target/64003] valgrind complains about get_attr_length_nobnd in insn-attrtab.c from i386.md
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003 Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #16 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- (In reply to Jeffrey A. Law from comment #14) Is this documented anywhere? I certainly don't recall this restriction, but it does answer one of the questions I'd been kicking around in my head. I've put a comment into sh.md to that effect - can't put a link to the gcc-cvs archive here because the code is from 1998, but here's an excerpt: ;; ??? This should use something like *branch_p (minus (match_dup 0) (pc)), ;; but getattrtab doesn't understand this. (define_attr "length" "" (cond [(eq_attr "type" "cbranch") (cond [(eq_attr "short_cbranch_p" "yes") (const_int 2) (eq_attr "med_cbranch_p" "yes") (const_int 6) (eq_attr "braf_cbranch_p" "yes") (const_int 12) ;; ??? using pc is not computed transitively. (ne (match_dup 0) (match_dup 0)) (const_int 14) ... The (ne (match_dup 0) (match_dup 0)) clause tells genattrtab that this cond form is length-varying. I had a patch to clear this up with a usable documented interface: https://gcc.gnu.org/ml/gcc-patches/2012-11/msg00473.html It got stuck in code review, so it's now a local patch in the Synopsys toolchains.
[Bug target/64003] valgrind complains about get_attr_length_nobnd in insn-attrtab.c from i386.md
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003 --- Comment #18 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- (In reply to Ilya Enkovich from comment #17) If I understand the problem correctly the root is in attempt to get length of following instructions computing length for forwrad jump instruction. How comes r217125 is guilty for that? It doesn't introduce such computations, it just renames length attribute into length_nobnd for mentioned jump patterns. Do I miss something here? The length attribute is treated specially by genattrtab.
[Bug other/39363] [meta-bug] pending patches from ARC International (UK) Ltd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39363 Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> changed: What|Removed |Added Depends on|31634 | --- Comment #1 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- 31634 used to be relevant for ARC, but that port has since ceased to support changing the name of TEXT_SECTION_ASM_OP etc. by command line option, and now uses a string literal, precisely in order to work around this bug.
[Bug pch/31634] *_SECTION_ASM_OP storage has undocumented constraints
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31634 --- Comment #4 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- 31634 used to be relevant for ARC, but that port has since ceased to support changing the name of TEXT_SECTION_ASM_OP etc. by command line option, and now uses a string literal, precisely in order to work around this bug. Hence, this no longer blocks other/39363 .
[Bug target/39346] no mxp target port
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39346 --- Comment #2 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- target/39346, other/39347 and other/39348 are no longer relevant to other/39363, because the Successor of ARC International (UK) Ltd, Synopsys, does not offer an mxp option in its DesignWare ARC Processor Cores lineup.
[Bug other/39363] [meta-bug] pending patches from ARC International (UK) Ltd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39363 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added Depends on|39347, 39348| --- Comment #2 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- target/39346, other/39347 and other/39348 are no longer relevant to other/39363, because the Successor of ARC International (UK) Ltd, Synopsys, does not offer an mxp option in its DesignWare ARC Processor Cores lineup.
[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223 --- Comment #9 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- (In reply to Georg-Johann Lay from comment #8) > (In reply to Jorn Wolfgang Rennecke from comment #4) > > (In reply to Georg-Johann Lay from comment #1) > > do_global_dtors is supposed to start at the start and increment from there. > > I see it used to be half-way wrong and half-way correct. (Starting at the start, decrementing for __AVR_HAVE_ELPM__, incrementing otherwise.) > > However, you now made it all the way use an incorrect order - starting at the end and incrementing from there. Is there a rationale for this? > The old code was broken as it decremented beginning at the start address. The flaw never became apparent for __dtors_start = __dtors_end or with simulators that terminated in exit. The new code uses the same traverse direction like __do_global_ctors. > Is the order of .ctors, .dtors defined in any way? I.e. how do you express that constructor A must run before constructor B in the C program? Same for destructors. The C++ standard says that destructors have to run in reverse order of completion of constructors. crtstuff.c:__do_global_ctors_aux starts at the first constructor, and increments from there; crtstuff.c:__do_global_dtors_aux starts at the last destructor, and decrements from there.
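The ordering requirement described in the reply can be shown with two small function tables. This is an illustrative model only, not the crtstuff.c or AVR libgcc code; the tables, the log buffer and all names are invented. Constructors are walked from the first entry upwards, destructors from the last entry downwards, so that teardown happens in reverse order of construction.

```c
#include <stddef.h>

typedef void (*toy_fn)(void);

/* Small log so the call order can be inspected afterwards.  */
static char toy_log[16];
static size_t toy_log_len;

static void toy_note(char c)
{
    if (toy_log_len + 1 < sizeof toy_log) {
        toy_log[toy_log_len++] = c;
        toy_log[toy_log_len] = '\0';
    }
}

static void ctor_a(void) { toy_note('A'); }
static void ctor_b(void) { toy_note('B'); }
static void dtor_a(void) { toy_note('a'); }
static void dtor_b(void) { toy_note('b'); }

/* Entries appear in the same order in both tables.  */
static const toy_fn toy_ctors[] = { ctor_a, ctor_b };
static const toy_fn toy_dtors[] = { dtor_a, dtor_b };

#define TOY_COUNT (sizeof toy_ctors / sizeof toy_ctors[0])

/* Start at the first constructor and increment.  */
static void toy_run_ctors(void)
{
    for (size_t i = 0; i < TOY_COUNT; i++)
        toy_ctors[i]();
}

/* Start at the last destructor and decrement.  */
static void toy_run_dtors(void)
{
    for (size_t i = TOY_COUNT; i-- > 0; )
        toy_dtors[i]();
}
```

Running toy_run_ctors and then toy_run_dtors logs A, B, b, a; walking the destructor table in the same direction as the constructor table would tear down A before B.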
[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223 --- Comment #10 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- Created attachment 33768 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33768&action=edit patch for dtor direction I have this patch for fixing the direction of the dtor execution, but I got stuck trying to write a testcase.
[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223 Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #4 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- (In reply to Georg-Johann Lay from comment #1) Author: gjl Date: Thu Sep 11 08:08:17 2014 New Revision: 215152 URL: https://gcc.gnu.org/viewcvs?rev=215152&root=gcc&view=rev Log: gcc/ PR target/63223 * config/avr/avr.md (*tablejump.3byte-pc): New insn. (*tablejump): Restrict to !AVR_HAVE_EIJMP_EICALL. Add void clobber. (casesi): Expand to *tablejump.3byte-pc if AVR_HAVE_EIJMP_EICALL. libgcc/ PR target/63223 * config/avr/libgcc.S (__tablejump2__): Rewrite to use RAMPZ, ELPM and R24 as needed. Make work for all devices and .text locations. (__do_global_ctors, __do_global_dtors): Use word addresses. do_global_dtors is supposed to start at the start and increment from there. I see it used to be half-way wrong and half-way correct. (Starting at the start, decrementing for __AVR_HAVE_ELPM__, incrementing otherwise.) However, you now made it all the way use an incorrect order - starting at the end and incrementing from there. Is there a rationale for this?
[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223 --- Comment #5 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- I also observe that the cpi/cpc/brne idiom that is used throughout - before and after your patch - is nonsensical.
[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223 --- Comment #6 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- (In reply to Jorn Wolfgang Rennecke from comment #4) However, you now made it all the way use an incorrect order - starting at the end and incrementing from there. Oops, I mean decrementing from there. But the point still stands.
[Bug target/63223] [avr] Make jumptables work with -Wl,--section-start,.text=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63223 --- Comment #7 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- (In reply to Jorn Wolfgang Rennecke from comment #5) > I also observe that the cpi/cpc/brne idiom that is used throughout - before and after your patch - is nonsensical. Oops, I drew conclusions from the operation short description of CPC that are not borne out by the detailed flag setting description.
[Bug rtl-optimization/61017] New: lra aborts on optional match_scratch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61017 Bug ID: 61017 Summary: lra aborts on optional match_scratch Product: gcc Version: 4.10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org CC: vmakarov at gcc dot gnu.org lra is still not able to compile libgcc2 for ARC: ./cc1 libgcc2.i -O2 -mlra ../../../../unisrc-209293-arc/libgcc/libgcc2.c:2105:1: internal compiler error: in curr_insn_transform, at lra-constraints.c:3492 The abort happens for the doloop_end_i pattern. It contains (clobber (match_scratch:SI 3 "=X,X,r")) and for that, a register is allocated in advance without regard to need: lra.c:remove_scratches 1992ff if (GET_CODE (*id->operand_loc[i]) == SCRATCH && GET_MODE (*id->operand_loc[i]) != VOIDmode) { insn_changed_p = true; *id->operand_loc[i] = reg = lra_create_new_reg (static_id->operand[i].mode, *id->operand_loc[i], ALL_REGS, NULL); As process_alt_operands finds that the alternative uses X for that operand, it sets this alternative to NO_REGS: lra-constraints.c:process_alt_operands 1608ff if (curr_static_id->operand_alternative[opalt_num].anything_ok) { /* Fast track for no constraints at all. */ curr_alt[nop] = NO_REGS; CLEAR_HARD_REG_SET (curr_alt_set[nop]); curr_alt_win[nop] = true; curr_alt_match_win[nop] = false; curr_alt_offmemok[nop] = false; curr_alt_matches[nop] = -1; continue; } which causes an abort later: lra-constraints.c:curr_insn_transform 3486ff if (REG_P (reg) && (regno = REGNO (reg)) >= FIRST_PSEUDO_REGISTER) { bool ok_p = in_class_p (reg, goal_alt[i], &new_class); if (new_class != NO_REGS && get_reg_class (regno) != new_class) { lra_assert (ok_p); lra_change_class (regno, new_class, "Change to", true); } }
[Bug rtl-optimization/61017] lra aborts on optional match_scratch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61017 --- Comment #1 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- Created attachment 32717 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32717&action=edit preprocessed libgcc file
[Bug other/60824] New: meta-bug: issues waiting for gcc 4.10 phase 1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60824 Bug ID: 60824 Summary: meta-bug: issues waiting for gcc 4.10 phase 1 Product: gcc Version: 4.10.0 Status: UNCONFIRMED Keywords: meta-bug Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org
[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651 --- Comment #3 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- This patch: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00091.html has been approved for gcc4.10, modulo one spelling fix: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00263.html
[Bug middle-end/59049] Two VOIDmode constant in comparison passed to cstoresi4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59049 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #10 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- Problem has been fixed for 4.9 with the commit shown in comment #9.
[Bug target/60811] arc/arc.c:2135: possible bad argument to abs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60811 --- Comment #3 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- Author: amylaar Date: Fri Apr 11 18:04:43 2014 New Revision: 209311 URL: http://gcc.gnu.org/viewcvs?rev=209311&root=gcc&view=rev Log: PR target/60811 * config/arc/arc.c (arc_save_restore): Fix assert typo. Modified: trunk/gcc/ChangeLog trunk/gcc/config/arc/arc.c
[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651 --- Comment #4 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- Author: amylaar Date: Fri Apr 11 18:12:53 2014 New Revision: 209312 URL: http://gcc.gnu.org/viewcvs?rev=209312&root=gcc&view=rev Log: gcc: PR rtl-optimization/60651 * mode-switching.c (optimize_mode_switching): Make sure to emit sets of a lower numbered entity before sets of a higher numbered entity to a mode of the same or lower priority. When creating a seginfo for a basic block that starts with a code label, move the insertion point past the code label. (new_seginfo): Document and enforce requirement that NOTE_INSN_BASIC_BLOCK only appears for empty blocks. * doc/tm.texi.in: Document ordering constraint for emitted mode sets. * doc/tm.texi: Regenerate. gcc/testsuite: PR rtl-optimization/60651 * gcc.target/epiphany/mode-switch.c: New test. Modified: trunk/gcc/ChangeLog trunk/gcc/doc/tm.texi trunk/gcc/doc/tm.texi.in trunk/gcc/mode-switching.c trunk/gcc/testsuite/ChangeLog
[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651 --- Comment #5 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- Author: amylaar Date: Fri Apr 11 18:27:45 2014 New Revision: 209318 URL: http://gcc.gnu.org/viewcvs?rev=209318&root=gcc&view=rev Log: gcc/testsuite: PR rtl-optimization/60651 * gcc.target/epiphany/mode-switch.c: New test. Added: trunk/gcc/testsuite/gcc.target/epiphany/mode-switch.c
[Bug other/60824] meta-bug: issues waiting for gcc 4.10 phase 1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60824 Bug 60824 depends on bug 60651, which changed state. Bug 60651 Summary: Mode switching instructions are sometimes emitted in the wrong order http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Known to work||4.10.0 Resolution|--- |FIXED Known to fail||4.9.0 --- Comment #6 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- Fixed with commits of comment #4/#5.
[Bug target/60811] arc/arc.c:2135: possible bad argument to abs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60811 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #4 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- Fixed with commit of comment #3.
[Bug other/60824] meta-bug: issues waiting for gcc 4.10 phase 1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60824 Bug 60824 depends on bug 60811, which changed state. Bug 60811 Summary: arc/arc.c:2135: possible bad argument to abs http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60811 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug other/60824] meta-bug: issues waiting for gcc 4.10 phase 1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60824 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |WORKSFORME --- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- gcc 4.10 phase 1 is now open.
[Bug rtl-optimization/60757] combine uses exponential time in nonzero_bits1 recursion
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60757 --- Comment #3 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> --- Created attachment 32544 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32544&action=edit typescript with backtrace It appears that some other epiphany patches I had in my tree I thought were unrelated are, in fact, also relevant. The exact version I've been using can be retrieved with: git clone g...@github.com:adapteva/epiphany-gcc.git cd epiphany-gcc git checkout ee67b804bd922ddcc72695973bed4641ba29801c
[Bug rtl-optimization/60757] combine uses exponential time in nonzero_bits1 recursion
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60757 --- Comment #4 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- (In reply to Jorn Wolfgang Rennecke from comment #3) > Created attachment 32544 [details] typescript with backtrace > It appears that some other epiphany patches I had in my tree I thought were unrelated are, in fact, also relevant. > The exact version I've been using can be retrieved with: > git clone g...@github.com:adapteva/epiphany-gcc.git > cd epiphany-gcc > git checkout ee67b804bd922ddcc72695973bed4641ba29801c P.S.: That version sits on branch epiphany-gcc-4.8, so it should be sufficient to clone that branch. It is also based on the gcc git mirror, so if you have a local git repository with the gcc git mirror contents, most of the objects should already be there.
[Bug rtl-optimization/60749] New: combine is overly cautious when operating on volatile memory references
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60749 Bug ID: 60749 Summary: combine is overly cautious when operating on volatile memory references Product: gcc Version: 4.9.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org Blocks: 53938 Courtesy of volatile_ok / init_recog_no_volatile, combine will reject any combination that involves a volatile memref in the combined pattern. In particular, if any narrow memory location is read on a WORD_REGISTER_OPERATIONS target, the zero/sign extension can't be combined with the memory read, even if a suitably extending memory load instruction is available - unless that pattern gets specifically written to accept volatile memrefs, shunning the standard memory_operand and general_operand predicates. combine already needs to do special checks to make sure it doesn't slip up when handling such patterns (e.g. see PR51374), so what good does init_recog_no_volatile do combine these days? At the very least, I think we should allow combinations involving a single memref with unchanged mode before and after combination - that would cover the zero and sign extending loads.
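For illustration only, the kind of source the report has in mind might look like the sketch below: a narrow volatile object is read and immediately widened, which is where an extending load would be wanted. The names status_reg, offset_reg, read_status and read_offset are hypothetical and do not come from the report. Whether a given port combines the extension with the load is exactly the open question of this PR.

```c
#include <stdint.h>

/* Hypothetical device registers, for illustration only.  */
volatile uint8_t status_reg;
volatile int8_t offset_reg;

/* A narrow volatile read widened to a full word: a candidate for a
   zero-extending load.  */
unsigned int
read_status (void)
{
  return status_reg;
}

/* The signed counterpart: a candidate for a sign-extending load.  */
int
read_offset (void)
{
  return offset_reg;
}
```

In both functions the memory reference keeps its mode (QImode on typical targets) before and after the extension is folded in, which matches the "single memref with unchanged mode" case suggested above.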
[Bug rtl-optimization/60757] New: combine uses exponential time in nonzero_bits1 recursion
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60757 Bug ID: 60757 Summary: combine uses exponential time in nonzero_bits1 recursion Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org Created attachment 32540 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32540&action=edit pruned down testcase With a small fix to the rtx_costs for epiphany, gcc.c-torture/compile/pr43415.c times out compiling at -O3. Even when the loop iteration counts are pruned, it's still too much, as nonzero_bits recurses for both operands of a binary operator... going through 40 operations means 2^40 paths being followed...
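The blow-up described above can be reproduced in miniature. The sketch below is not GCC code: it assumes a hypothetical chain of binary operations in which both operands of node i are node i - 1, and counts how often a walk that recurses into both operands is entered, with and without remembering which nodes were already processed. All names (naive_visits, memo_visits, MAX_NODES) are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical expression DAG: node i (i > 0) is a binary operation whose
   two operands are both node i - 1; node 0 is a leaf.  */

/* Number of calls made by a walk that recurses into both operands
   without caching anything.  */
uint64_t
naive_visits (int node)
{
  assert (node >= 0);
  if (node == 0)
    return 1;
  return 1 + naive_visits (node - 1) + naive_visits (node - 1);
}

#define MAX_NODES 64
static unsigned char seen[MAX_NODES];

/* Number of nodes actually processed when each node is handled once.  */
static uint64_t
memo_walk (int node)
{
  if (seen[node])
    return 0;	/* Already processed; no further recursion.  */
  seen[node] = 1;
  if (node == 0)
    return 1;
  return 1 + memo_walk (node - 1) + memo_walk (node - 1);
}

uint64_t
memo_visits (int node)
{
  assert (node >= 0 && node < MAX_NODES);
  memset (seen, 0, sizeof seen);
  return memo_walk (node);
}
```

For a chain of depth n the uncached walk makes 2^(n+1) - 1 calls, while the cached walk processes n + 1 nodes. This only shows the shape of the problem; it makes no claim about how the actual fix in combine should look.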
[Bug rtl-optimization/60757] combine uses exponential time in nonzero_bits1 recursion
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60757 --- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- Created attachment 32541 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32541&action=edit epiphany cost fix that triggers combine exponential behaviour
[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651 --- Comment #2 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- Created attachment 32526 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32526&action=edit preprocessed libjava file With the latest proposed patch, we get an assertion failure building libjava during the i686-pc-linux-gnu bootstrap; this is the command line: ./cc1plus -fpreprocessed interpret.ii -quiet -dumpbase interpret.cc -mtune=generic -march=pentiumpro -auxbase-strip .libs/interpret.o -g -O2 -Wswitch-enum -Wextra -Wall -version -fno-rtti -fnon-call-exceptions -fdollars-in-identifiers -ffloat-store -fomit-frame-pointer -fwrapv -fPIC -o interpret.s The block in question looks like this: (code_label/s 9087 9590 9090 17 990 [1 uses]) (note 9090 9087 9088 17 [bb 17] NOTE_INSN_BASIC_BLOCK) where the BB_HEAD is the CODE_LABEL, and the BB_END is the NOTE_INSN_BASIC_BLOCK. The caller of new_seginfo is the abnormal-edge code that I've patched to handle non-empty blocks differently; this block is mistaken for a non-empty block. Now, interestingly, the pre-existing code already handles this incorrectly, by inserting instructions between the CODE_LABEL and the NOTE_INSN_BASIC_BLOCK.
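The block shown above has a distinct head and end, yet contains no real instruction. A minimal sketch of an emptiness test that does not rely on comparing head and end could look as follows. The enum insn_kind and the function block_effectively_empty are hypothetical stand-ins invented for this illustration; they are not GCC's rtx codes or any existing GCC helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical insn kinds for this sketch; not GCC's rtx codes.  */
enum insn_kind
{
  KIND_CODE_LABEL,
  KIND_BASIC_BLOCK_NOTE,
  KIND_REAL_INSN
};

/* Return 1 if the block holds nothing but labels and basic block notes,
   i.e. no instruction that a mode switch would have to be ordered
   against, and 0 otherwise.  */
int
block_effectively_empty (const enum insn_kind *insns, size_t n)
{
  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (insns[i] == KIND_REAL_INSN)
      return 0;
  return 1;
}
```

Applied to the block from the comment (a code label followed by a basic block note), such a test reports the block as empty even though its head and end differ.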
[Bug rtl-optimization/60651] New: Mode switching instructions are sometimes emitted in the wrong order
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651 Bug ID: 60651 Summary: Mode switching instructions are sometimes emitted in the wrong order Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org Target: epiphany-*-* As discussed at http://forums.parallella.org/viewtopic.php?f=13&t=1053&sid=2d28ee29b5dd3c591d947074f46ac752&p=6654#p6654, this code: int a; int c; void __attribute__((interrupt)) misc_handler (void) { a*= c; } is compiled into code that uses an uninitialized register. As it turns out, the interrupt attribute is actually a red herring (as long as you use the default of -mfp-mode=caller). The problem is that, after emitting the mask-loading instruction, mode switching emits the mode switch to the caller's mode, which uses that mask, *before* the load of the mask, thus using the register uninitialized. The mask loading instruction, thus rendered useless, is later deleted. The thing with lcm is that we have an algorithm that can be a bit expensive, but we can process multiple entities at almost no extra cost. The epiphany needs to load constants to do its mode switching; these constants can be anticipated further up in the dominance graph. This can be modelled as having a different entity for each mask needed, the need for which is indicated at the same point as the mode switch itself. Because the mask load entities are not subject to transparency issues (except in the unfortunate case of abnormal edges), lcm can move the loads up into suitable dominator positions. The mode priorities on the epiphany are also such that the mask loads have a mode with the same or higher priority as the mask uses. Also, the mask loads have lower-numbered entities than the mask uses.
As the lcm part of optimize_mode_switching inserts, for each priority, the mode setting in ascending order of entities, and insert_insn_on_edge appends to the currently registered sequence, this works fine there. The segment-based code also preserves entity order when inserting before an instruction. However, when inserting after a basic block head, later inserted mode switch instructions end up prior to ones inserted earlier into the insn stream. To preserve the order, in the case of an initially empty basic block, what we have to do is append the new instructions at the end of the basic block.
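The ordering effect described above can be shown with a toy model. The sketch below is not GCC code: struct block and the functions block_init, insert_after_head and append_at_end are hypothetical, and the instruction names are plain strings. It only demonstrates that repeated insertion right after the head reverses the order, whereas appending at the end keeps it.

```c
#include <assert.h>
#include <string.h>

#define MAX_INSNS 8

/* Hypothetical stand-in for a basic block's insn stream; slot 0 is the
   block head.  */
struct block
{
  const char *insn[MAX_INSNS];
  int count;
};

void
block_init (struct block *bb)
{
  bb->insn[0] = "head";
  bb->count = 1;
}

/* Insert directly after the head: later insertions land in front of
   earlier ones.  */
void
insert_after_head (struct block *bb, const char *insn)
{
  assert (bb->count < MAX_INSNS);
  memmove (&bb->insn[2], &bb->insn[1],
	   (bb->count - 1) * sizeof bb->insn[0]);
  bb->insn[1] = insn;
  bb->count++;
}

/* Append at the end of the block: insertion order is preserved.  */
void
append_at_end (struct block *bb, const char *insn)
{
  assert (bb->count < MAX_INSNS);
  bb->insn[bb->count++] = insn;
}
```

Inserting "load_mask" and then "set_mode" with insert_after_head leaves "set_mode" ahead of "load_mask", which mirrors the use of the mask before it is loaded. With append_at_end, the load stays ahead of the mode switch.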
[Bug rtl-optimization/60651] Mode switching instructions are sometimes emitted in the wrong order
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60651 --- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- Created attachment 32447 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32447&action=edit patch The attached patch implements the aforementioned insertion at the end of an (initially) empty basic block. I'm currently bootstrapping/regtesting this on i686-pc-linux-gnu.
[Bug other/60040] AVR: error: unable to find a register to spill in class 'POINTER_REGS'
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60040 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #5 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- Created attachment 32372 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32372&action=edit tentative patch for tentative reloads In this case, reload already knows that it has to re-do the reloads, but it goes ahead anyway and computes reload registers for this iteration. Unfortunately, when find_reload_regs fails, it then calls spill_failure, giving a hard error for a reload that we don't need in the first place. The patch in this attachment passes down something_changed from reload as tentative to select_reload_regs and then on to find_reload_regs, so that the latter does not worry about the failure. Also, in reload, I made it not 'goto failure' in that case.
[Bug target/58400] gcc for h8300 internal compiler error: insn does not satisfy its constraints at fs/ext4/mballoc.c: In function 'mb_free_blocks':
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58400 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #8 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- Created attachment 32285 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32285&action=edit patch made as an example how to debug gcc Here is a patch - not regtested. You might also consider putting the three non-constraint uses of [satisfies_constraint_]U in predicates.md into a different constraint / predicate. And delete the unused fix_bit_operand,
[Bug c++/2316] g++ fails to overload on language linkage
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=2316 --- Comment #50 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- (In reply to Marc Glisse from comment #49) > large pieces of my patch as nonsense). > Fixing this particular issue should not be too hard, there must be a place in the compiler that merges a number of properties from the early declaration into the definition, and we need to add extern "C" to that list. It's not exactly a single place. For C, in c/c-decl.c, we have duplicate_decls, which uses merge_decls. For C++, in cp/decl.c, we have another function called duplicate_decls.
[Bug other/50925] [4.7/4.8/4.9 Regression][avr] ICE at spill_failure, at reload1.c:2118
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50925 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #28 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- I can't reproduce this with the current trunk. Can we mark this as known to work for 4.9?
[Bug ipa/58253] IPA-SRA creates calls with different arguments that the callee accepts
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58253 --- Comment #8 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- (In reply to Martin Jambor from comment #7) > Thanks I have posted the updated patch (which checks for gimple_register_type rather than non-BLKmode) FWIW, it is possible to have a BLKmode struct passed in a register. The compat testcases have a number of those. Not sure if it's possible to craft a testcase that also triggers this ipa path. > Computing, storing and re-using the types would certainly be too invasive a change for stage 3. > Moreover, it would basically mean passing the PARM_DECL types as types of actual arguments and I am not even sure that it is correct, the back-end should probably see the actual arguments as exactly what they are in the callers. The idea of a function is that there can be multiple callers, using different actual arguments, thus you should pick one formal argument type for each argument, and stick with it for all callers and the callee. The formal argument type determines how the argument is passed. Now, I understand that with ipa, you will often have only a single caller, and the compiler can change the types with consideration of the passed actual arguments to fit various optimization purposes, but it still has to pick one list of formal parameter types for each specialized callee, and stick to this list at the corresponding call site(s).
[Bug tree-optimization/58253] IPA-SRA creates calls with different arguments that the callee accepts
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58253 --- Comment #6 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- (In reply to Martin Jambor from comment #1) > But again, I am not really sure what the semantics of alignment of scalar PARM_DECL is. The relevance of various type properties will vary from target to target. The only safe way for the callee to receive the arguments as passed is to have caller and callee agree on the types passed. It would seem to me that computing the types once and then storing them somewhere, so that identical argument lists are used when processing caller and callee, is the safest way to make argument lists agree. However, if you can make sure that you compute the same types in both places, I suppose that should work too. From a performance point of view, alignment to the natural alignment of an integral mode is generally better than a lesser alignment, because it allows efficient loads / stores to stack slots, should any become necessary. > Nevertheless, can you please check if the patch indeed fixes the bug? > If so, I'll post it to the mailing list for review/further discussion. > Thanks. The patch gets rid of the gcc.dg/torture/pr52402.c execution failures. The only other difference observed with/without the patch is 8192 vs. 8173 tests being run in the libstdc++-v3 testsuite; the number of tests run there under Fedora 19/20 appears to vary from time to time independently of the compiler under test, so without running a statistically significant number of test runs (which would take a few months), I wouldn't draw any conclusion regarding the compiler from these differences.
[Bug middle-end/59327] New: warning in expand_used_vars
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59327 Bug ID: 59327 Summary: warning in expand_used_vars Product: gcc Version: 4.9.0 Status: UNCONFIRMED Keywords: build Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: amylaar at gcc dot gnu.org CC: jakub at redhat dot com Code added this morning to cfgexpand.c:expand_used_vars causes a warning: g++ -c -g -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libdecnumber -I../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I../../gcc/gcc/../libbacktrace -o cfgexpand.o -MT cfgexpand.o -MMD -MP -MF ./.deps/cfgexpand.TPo ../../gcc/gcc/cfgexpand.c ../../gcc/gcc/cfgexpand.c: In function ‘rtx_def* expand_used_vars()’: ../../gcc/gcc/cfgexpand.c:1836:35: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] sz + ASAN_RED_ZONE_SIZE >= data.asan_alignb) ^ cc1plus: all warnings being treated as errors make: *** [cfgexpand.o] Error 1 Seen for target arc-elf. [amylaar@rowan gcc]$ g++ --version g++ (GCC) 4.9.0 20131126 (experimental) Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. [amylaar@rowan gcc]$ uname -a Linux rowan 3.11.7-200.fc19.i686.PAE #1 SMP Mon Nov 4 14:22:33 UTC 2013 i686 i686 i386 GNU/Linux
[Bug middle-end/59327] warning in expand_used_vars
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59327 --- Comment #1 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- The warning also happens when using g++ (GCC) 4.9.0 20131128 (experimental), and when building gcc for target epiphany-elf.
[Bug middle-end/59327] warning in expand_used_vars
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59327 --- Comment #2 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- sz is HOST_WIDE_INT, ASAN_RED_ZONE_SIZE is an int literal, and data.asan_alignb is an unsigned int. With 32 bit int and HOST_WIDE_INT, this results in a 32 bit signed/unsigned comparison. When building a target with need_64bit_hwint (according to config.gcc), on a host with 32 bit int, the right hand side of the comparison gets sign extended to HOST_WIDE_INT, thus the warning will not show up when testing such a combination / bootstrapping such a host/target.
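The difference between the two host/target combinations can be sketched with fixed-width types. This is an illustration, not the cfgexpand.c code: the functions ge_hwi32 and ge_hwi64 are hypothetical, int32_t and int64_t stand in for a 32 bit and a 64 bit HOST_WIDE_INT, and the sketch assumes a host with 32 bit int. The casts spell out what the usual arithmetic conversions would do implicitly, so that the sketch itself compiles without a sign-compare warning.

```c
#include <stdint.h>

/* Models a 32 bit HOST_WIDE_INT on a host with 32 bit int: the signed
   operand is converted to unsigned, so the comparison is unsigned.  */
int
ge_hwi32 (int32_t sz, uint32_t alignb)
{
  return (uint32_t) sz >= alignb;
}

/* Models a 64 bit HOST_WIDE_INT: the unsigned 32 bit operand is widened
   to the signed 64 bit type, so the comparison is signed.  */
int
ge_hwi64 (int64_t sz, uint32_t alignb)
{
  return sz >= (int64_t) alignb;
}
```

For non-negative sizes both variants agree. They differ only for negative values: a negative 32 bit sz is treated as a very large unsigned number in the first variant, which is the hazard the warning is about.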
[Bug middle-end/59327] [4.9 Regression] warning in expand_used_vars
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59327 --- Comment #4 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- (In reply to Jakub Jelinek from comment #3) > Created attachment 31318 [details] gcc49-pr59327.patch > Untested fix. This allows arc-elf and arc-epiphany configured with --enable-werror-always to build on i686-pc-linux-gnu.
[Bug target/18335] [4.7/4.8/4.9 regression] mmix-knuth-mmixware testsuite failure: gcc.dg/debug/debug-1.c and debug-2 xyzzy
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18335 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added CC||amylaar at gcc dot gnu.org --- Comment #15 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- Looking at the assembly output, this uses conditional execution like the MIPS, so this is a testsuite bug.
[Bug middle-end/59049] Two VOIDmode constant in comparison passed to cstoresi4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59049 Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org changed: What|Removed |Added Keywords||patch --- Comment #5 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- A patch is here: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00931.html
[Bug middle-end/59049] Two VOIDmode constant in comparison passed to cstoresi4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59049 --- Comment #8 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org --- (In reply to Richard Biener from comment #7) > That is, sth like Index: gcc/tree-ssa-ter.c === --- gcc/tree-ssa-ter.c (revision 204664) +++ gcc/tree-ssa-ter.c (working copy) @@ -438,6 +439,12 @@ ter_is_replaceable_p (gimple stmt) && !is_gimple_val (gimple_assign_rhs1 (stmt))) return false; + /* Do not propagate modeless constants - we may end up confusing the RTL + expanders. Leave the optimization to RTL CCP. */ + if (gimple_assign_single_p (stmt) + && CONSTANT_CLASS_P (gimple_assign_rhs1 (stmt))) + return false; + return true; } return false; Constants are often very valuable for rtl expansion, allowing cheaper patterns to be used. And some constant propagations are impossible in rtl because of mode oddities. E.g. when you have a mulsidi3 pattern, you generally have a sign_extend - you can't have a VOIDmode constant inside that. Therefore, I would rather have the middle-end move the constants to registers only when necessary to preserve the mode, and preferably fold instead in the first place when optimizing.
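As a source-level illustration of the widening multiply case mentioned above, the sketch below multiplies a 32 bit value by a constant and produces a 64 bit result. The function scale_wide and the constant SCALE are hypothetical. Whether a particular port expands this through a mulsidi3 pattern, and how the constant operand is represented at that point, depends on the target and is not claimed here.

```c
#include <stdint.h>

/* Hypothetical scale factor, chosen only for illustration.  */
#define SCALE 100000

/* 32 x 32 -> 64 bit multiply: x is sign-extended to 64 bits, and the
   other operand is a constant.  */
int64_t
scale_wide (int32_t x)
{
  return (int64_t) x * SCALE;
}
```

With x equal to 100000 the product no longer fits in 32 bits, so the extension of the variable operand is essential, while the constant operand carries no mode of its own once it becomes a const_int in RTL.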