[Bug target/59710] Nios2: Missing gprel optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59710 --- Comment #4 from Sandra Loosemore sandra at codesourcery dot com --- I'm thinking this isn't an appropriate kind of patch to backport to 4.9 -- it's a fix for a missed optimization and not a serious bug (wrong code or ICE). Maybe I'm being exceptionally dense here, but I can't figure out how to close an issue or adjust milestones in bugzilla -- all I get when I click on Status is a page with an explanation of what the different statuses mean. Maybe I lack appropriate admin powers to do that?
[Bug target/64377] nios2 compile error in options-save.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64377 --- Comment #10 from Sandra Loosemore sandra at codesourcery dot com ---
Test results do not look good with the new patch; over 7000 new failures on -flto tests in the gcc testsuite alone. :-( I see a lot of

lto1: internal compiler error: in operator[], at vec.h:736
0x884aba2 vec<tree_node*, va_heap, vl_embed>::operator[](unsigned int)
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/vec.h:736
0x884aba2 vec<tree_node*, va_heap, vl_ptr>::operator[](unsigned int)
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/vec.h:1202
0x884aba2 streamer_tree_cache_get_tree
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/tree-streamer.h:112
0x884aba2 streamer_get_pickled_tree(lto_input_block*, data_in*)
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/tree-streamer-in.c:1112
0x84fa1fb lto_input_tree_1(lto_input_block*, data_in*, LTO_tags, unsigned int)
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto-streamer-in.c:1303
0x84fa5ef lto_input_tree(lto_input_block*, data_in*)
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto-streamer-in.c:1349
0x8849997 lto_input_ts_common_tree_pointers
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/tree-streamer-in.c:666
0x8849997 streamer_read_tree_body(lto_input_block*, data_in*, tree_node*)
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/tree-streamer-in.c:1044
0x84f9aef lto_read_tree_1
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto-streamer-in.c:1179
0x84fa069 lto_read_tree
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto-streamer-in.c:1213
0x84fa069 lto_input_tree_1
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto-streamer-in.c:1332
0x84fa069 lto_input_tree_1(lto_input_block*, data_in*, LTO_tags, unsigned int)
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto-streamer-in.c:1283
0x84fa501 lto_input_scc(lto_input_block*, data_in*, unsigned int*, unsigned int*)
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto-streamer-in.c:1237
0x81a8336 lto_read_decls
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto/lto.c:1899
0x81aac1e lto_file_finalize
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto/lto.c:2228
0x81aac1e lto_create_files_from_ids
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto/lto.c:2238
0x81aac1e lto_file_read
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto/lto.c:2279
0x81aac1e read_cgraph_and_symbols
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto/lto.c:2980
0x81aac1e lto_main()
        /scratch/sandra/nios2-elf-fsf/src/gcc-mainline/gcc/lto/lto.c:3435
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
lto-wrapper: fatal error: nios2-elf-gcc returned 1 exit status
compilation terminated.
[Bug target/64231] [5 Regression] SIGSEGV building glibc on aarch64-linux-gnu from r217852
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64231 --- Comment #12 from Sandra Loosemore sandra at codesourcery dot com --- I'm using a 4.7.3 based gcc as the host compiler (built from one of our own CodeBench release branches). Regardless of whether the actual failure is reproducible, if you look at the code I pointed at in comment 7, there is clearly a bug here: if force_const_mem returns NULL, GCC will crash, not just here but in several other places as well.
[Bug target/64377] nios2 compile error in options-save.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64377 --- Comment #13 from Sandra Loosemore sandra at codesourcery dot com --- I think the new version of the patch in comment 11 is probably OK. I ran the entire gcc testsuite (but not g++, etc yet) and have a couple hundred regressions compared to my r217010 build, but I don't see a pattern of them being obviously lto-related. (I started skimming over them and they look mostly like broken test cases.)
[Bug target/64377] nios2 compile error in options-save.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64377 --- Comment #15 from Sandra Loosemore sandra at codesourcery dot com --- It looks like the apparent regressions in my test results are actually the result of cascading errors from the test harness (Dejagnu is failing to fully reset state after a test that got an error talking to the board). So I'm sure it's unrelated to the options problem or patch. Martin, thanks a lot for working on this fix! :-)
[Bug target/64377] nios2 compile error in options-save.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64377 --- Comment #9 from Sandra Loosemore sandra at codesourcery dot com --- I've started running nios2-elf regression tests on hardware to compare against a pre-breakage version from early November; it probably will not be done until tomorrow morning. I've heard that someone is working on the Nios II QEMU port again, but there is nothing available to the general public yet, AFAIK, and I'm not sure of the timeline for that.
[Bug target/64377] nios2 compile error in options-save.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64377 --- Comment #5 from Sandra Loosemore sandra at codesourcery dot com --- I think a complete failure to build GCC for the nios2 target due to target-independent changes is a serious regression that needs to be addressed for the GCC 5 release. Can we up the priority?
[Bug target/64377] nios2 compile error in options-save.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64377

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at ucw dot cz
           Assignee|sandra at codesourcery dot  |unassigned at gcc dot gnu.org
                   |com                         |

--- Comment #3 from Sandra Loosemore sandra at codesourcery dot com ---
This has been discussed on the gcc-patches mailing list in this thread: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02616.html
[Bug target/64377] nios2 compile error in options-save.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64377

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

--- Comment #4 from Sandra Loosemore sandra at codesourcery dot com ---
Consensus from the discussion is that the nios2 backend is *not* broken; this is a bug in the option streaming code. Jan Hubicka offered to fix it but I haven't seen a patch go by yet.
[Bug target/64231] [5 Regression] SIGSEGV building glibc on aarch64-linux-gnu from r217852
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64231 --- Comment #6 from Sandra Loosemore sandra at codesourcery dot com ---
This reproduces it for me; my build is at r217852.

$ aarch64-linux-gnu-gcc argp-help.i -c -O2
argp-help.c: In function '_help':
argp-help.c:1684:1: internal compiler error: Segmentation fault
0x874f9b0 crash_signal
        /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/toplev.c:359
0x8406cfb plus_constant(machine_mode, rtx_def*, long long, bool)
        /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/explow.c:120
0x82e2db2 init_alias_analysis()
        /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/alias.c:2966
0x8b59681 cse_main
        /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cse.c:6597
0x8b5a5eb rest_of_handle_cse2
        /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cse.c:7528
0x8b5a5eb execute
        /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cse.c:7581
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.

When I have a chance, I'll see if I can dig up some more information out of the debugger and/or reduce the testcase enough to allow setting some breakpoints.
[Bug target/64231] [5 Regression] SIGSEGV building glibc on aarch64-linux-gnu from r217852
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64231 --- Comment #7 from Sandra Loosemore sandra at codesourcery dot com ---
Hmmm. I'm not sure why there's trouble in reproducing the failure, but looking at this some more, it seems like we have a problem with this code fragment from force_const_mem in varasm.c:

  /* If we're not allowed to drop X into the constant pool, don't.  */
  if (targetm.cannot_force_const_mem (mode, x))
    return NULL_RTX;

and the code at the call site in plus_constant in explow.c:

  tem = force_const_mem (GET_MODE (x), tem);
  if (memory_address_p (GET_MODE (tem), XEXP (tem, 0)))
    return tem;

which is clearly not expecting force_const_mem to return null. Guarding the reference in the conditional like

  if (tem && memory_address_p (GET_MODE (tem), XEXP (tem, 0))) ...

fixes the SEGV, but a quick look shows that there are a lot of other uses of force_const_mem that expect it to return a non-null value, with no checking. So, probably this has nothing to do with the specific change in r217852, but has been a lurking bug for a long time, and it needs more than a band-aid on this one particular call site.
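A minimal, self-contained sketch of the guarded call-site pattern described above. This is not GCC code: `Rtx`, `force_const_mem_sketch`, `memory_address_ok_sketch`, and `try_constant_pool` are hypothetical stand-ins, assumed here only to show why the null check has to come before any use of the returned pointer.

```cpp
// Hypothetical stand-in for an RTL expression; this is not GCC's rtx type.
struct Rtx {
  const Rtx* address;  // models XEXP (tem, 0); null means "no usable address"
};

// Models a function that, like force_const_mem, may legitimately return
// null when the target refuses to place the value in the constant pool.
const Rtx* force_const_mem_sketch(const Rtx* x, bool target_forbids_pool) {
  if (target_forbids_pool) return nullptr;
  return x;
}

// Models a predicate that dereferences its argument, so it must never be
// called with a null pointer.
bool memory_address_ok_sketch(const Rtx* mem) {
  return mem->address != nullptr;
}

// Guarded call site: the null test short-circuits before the dereference.
// Returns true only when the constant-pool reference is usable.
bool try_constant_pool(const Rtx* x, bool target_forbids_pool) {
  const Rtx* tem = force_const_mem_sketch(x, target_forbids_pool);
  return tem != nullptr && memory_address_ok_sketch(tem);
}
```

Every other caller of a function with this contract would need the same kind of check (or the contract itself would need to change), which is why a fix at a single call site is only a band-aid.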
[Bug target/64231] New: SIGSEGV building glibc on aarch64-linux-gnu from r217852
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64231 Bug ID: 64231 Summary: SIGSEGV building glibc on aarch64-linux-gnu from r217852 Product: gcc Version: 5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: sandra at codesourcery dot com CC: belagod at gcc dot gnu.org Host: i686-pc-linux-gnu Target: aarch64-linux-gnu Build: i686-pc-linux-gnu I have an aarch64-linux-gnu build tree with a glibc checkout from about a month ago (revision 1400983e04d7b4b5a92db79ab27b0d0ec7d8bdef) that has started giving a SEGV when building argp/argp-help.c. I tracked it down to this GCC commit: r217852 | belagod | 2014-11-20 05:58:23 -0800 (Thu, 20 Nov 2014) | 17 lines 2014-11-20 Tejas Belagod tejas.bela...@arm.com gcc/ * config/aarch64/aarch64-protos.h (aarch64_classify_symbol): Fixup prototype. * config/aarch64/aarch64.c (aarch64_expand_mov_immediate, aarch64_cannot_force_const_mem, aarch64_classify_address, aarch64_classify_symbolic_expression): Fixup call to aarch64_classify_symbol. (aarch64_classify_symbol): Add range-checking for symbol + offset addressing for tiny and small models. testsuite/ * gcc.target/aarch64/symbol-range.c: New. * gcc.target/aarch64/symbol-range-tiny.c: New. Here's the info from running the debugger on cc1. Program received signal SIGSEGV, Segmentation fault. 
plus_constant (mode=DImode, x=0xf78e1f30, c=144, inplace=optimized out, inplace@entry=false) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/explow.c:120 120 if (memory_address_p (GET_MODE (tem), XEXP (tem, 0))) (gdb) print debug_rtx(x) (mem/u/c:DI (symbol_ref/u:DI (*.LC39) [flags 0x2]) [4 S8 A64]) $6 = void (gdb) print tem $7 = (rtx) 0x0 (gdb) bt #0 plus_constant (mode=DImode, x=0xf78e1f30, c=144, inplace=optimized out, inplace@entry=false) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/explow.c:120 #1 0x082e2db3 in init_alias_analysis () at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/alias.c:2966 #2 0x08b59682 in cse_main (nregs=optimized out, f=optimized out) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cse.c:6597 #3 0x08b5a5ec in rest_of_handle_cse2 () at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cse.c:7528 #4 (anonymous namespace)::pass_cse2::execute (this=0x92c99c8) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cse.c:7581 #5 0x0868ca49 in execute_one_pass (pass=pass@entry=0x92c99c8) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/passes.c:2311 #6 0x0868cf16 in execute_pass_list_1 (pass=0x92c99c8) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/passes.c:2363 #7 0x0868cf26 in execute_pass_list_1 (pass=0x92c93c8, pass@entry=0x92c7288) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/passes.c:2364 #8 0x0868cf72 in execute_pass_list (fn=0xf7933dac, pass=0x92c7288) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/passes.c:2374 #9 0x08367a5d in cgraph_node::expand (this=this@entry=0xf794cd20) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cgraphunit.c:1773 #10 0x083692ff in expand_all_functions () at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cgraphunit.c:1909 #11 symbol_table::compile (this=this@entry=0xf7c3b000) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cgraphunit.c:2263 #12 0x0836ae3d in symbol_table::finalize_compilation_unit (this=0xf7c3b000) at 
/scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/cgraphunit.c:2340 #13 0x081ec3a4 in c_write_global_declarations () at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/c/c-decl.c:10777 #14 0x0874fdce in compile_file () at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/toplev.c:584 #15 0x081d1823 in do_compile () at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/toplev.c:2041 #16 toplev::main (this=this@entry=0xcfff, argc=argc@entry=103, argv=argv@entry=0xd0b4) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/toplev.c:2138 #17 0x081d2175 in main (argc=103, argv=0xd0b4) at /scratch/sandra/aarch64-fsf/src/gcc-mainline/gcc/main.c:38 Hopefully this is enough info to track it down? It seems clear that something in the bad patch started causing force_const_mem to return NULL in this case, and the call site in plus_constant is not expecting that.
[Bug target/64231] SIGSEGV building glibc on aarch64-linux-gnu from r217852
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64231 --- Comment #3 from Sandra Loosemore sandra at codesourcery dot com --- Created attachment 34225 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34225&action=edit preprocessor output (gzipped) Preprocessor output attached.
[Bug target/64231] SIGSEGV building glibc on aarch64-linux-gnu from r217852
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64231 --- Comment #4 from Sandra Loosemore sandra at codesourcery dot com --- In case it's also relevant, my GCC was configured with: Configured with: /scratch/sandra/aarch64-fsf/src/gcc-mainline/configure --build=i686-pc-linux-gnu --host=i686-pc-linux-gnu --target=aarch64-linux-gnu --enable-threads --disable-libmudflap --disable-libssp --disable-libstdcxx-pch --with-march=armv8-a --disable-libsanitizer --with-gnu-as --with-gnu-ld --enable-languages=c,c++ --enable-shared --enable-lto --enable-symvers=gnu --enable-__cxa_atexit --with-glibc-version=2.21 --disable-nls --prefix=/scratch/sandra/aarch64-fsf/install --disable-shared --disable-threads --disable-libssp --disable-libgomp --without-headers --with-newlib --disable-decimal-float --disable-libffi --disable-libquadmath --disable-libitm --disable-libatomic --enable-languages=c --with-sysroot=/scratch/sandra/aarch64-fsf/install/aarch64-linux-gnu/libc --with-gmp=/scratch/sandra/aarch64-fsf/obj/pkg-mainline-0-aarch64-linux-gnu/fsf-mainline-0-aarch64-linux-gnu.extras/host-libs-i686-pc-linux-gnu/usr --with-mpfr=/scratch/sandra/aarch64-fsf/obj/pkg-mainline-0-aarch64-linux-gnu/fsf-mainline-0-aarch64-linux-gnu.extras/host-libs-i686-pc-linux-gnu/usr --with-mpc=/scratch/sandra/aarch64-fsf/obj/pkg-mainline-0-aarch64-linux-gnu/fsf-mainline-0-aarch64-linux-gnu.extras/host-libs-i686-pc-linux-gnu/usr --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-isl=/scratch/sandra/aarch64-fsf/obj/pkg-mainline-0-aarch64-linux-gnu/fsf-mainline-0-aarch64-linux-gnu.extras/host-libs-i686-pc-linux-gnu/usr --with-cloog=/scratch/sandra/aarch64-fsf/obj/pkg-mainline-0-aarch64-linux-gnu/fsf-mainline-0-aarch64-linux-gnu.extras/host-libs-i686-pc-linux-gnu/usr --disable-libgomp --disable-libitm --disable-libatomic --disable-libssp --enable-poison-system-directories --with-build-time-tools=/scratch/sandra/aarch64-fsf/install/aarch64-linux-gnu/bin 
--with-build-time-tools=/scratch/sandra/aarch64-fsf/install/aarch64-linux-gnu/bin SED=sed Thread model: single gcc version 5.0.0 20141120 (experimental) (GCC)
[Bug libstdc++/64203] shared_mutex compile errors on bare-metal targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64203 --- Comment #4 from Sandra Loosemore sandra at codesourcery dot com ---
(In reply to Jonathan Wakely from comment #3)
> How's this one?

Looks better; this version fixes the compile-time errors.
[Bug libstdc++/64203] New: shared_mutex compile errors on bare-metal targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64203 Bug ID: 64203 Summary: shared_mutex compile errors on bare-metal targets Product: gcc Version: 4.9.2 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: sandra at codesourcery dot com CC: 3dw4rd at verizon dot net After merging GCC 4.9.2 onto our local branch, I saw that the new libstdc++ testcase experimental/feat-cxx14.cc is failing on bare-metal targets (arm-none-eabi, mips-sde-elf, powerpc-eabi, possibly others). Here's an excerpt from the test logs: Executing on host: mips-sde-elf-g++ -fdiagnostics-color=never -D_GLIBCXX_ASSERT -fmessage-length=0 -g -O2 -DLOCALEDIR=. -I/scratch/sandra/mips-elf-trunk/src/gcc-trunk-4.9/libstdc++-v3/testsuite/util /scratch/sandra/mips-elf-trunk/src/gcc-trunk-4.9/libstdc++-v3/testsuite/experimental/feat-cxx14.cc -std=gnu++14 -S-mips16 -o feat-cxx14.s(timeout = 600) /scratch/sandra/mips-elf-trunk/src/gcc-trunk-4.9/libstdc++-v3/testsuite/experimental/feat-cxx14.cc:110:4: error: #error __cpp_lib_shared_timed_mutex # error __cpp_lib_shared_timed_mutex ^ In file included from /scratch/sandra/mips-elf-trunk/src/gcc-trunk-4.9/libstdc++-v3/testsuite/experimental/feat-cxx14.cc:13:0: /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:274:36: error: 'defer_lock_t' has not been declared shared_lock(mutex_type __m, defer_lock_t) noexcept ^ /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:277:36: error: 'try_to_lock_t' has not been declared shared_lock(mutex_type __m, try_to_lock_t) ^ /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:277:7: error: 'std::shared_lock_Mutex::shared_lock(std::shared_lock_Mutex::mutex_type, int)' cannot be overloaded shared_lock(mutex_type __m, try_to_lock_t) ^ 
/scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:274:7: error: with 'std::shared_lock_Mutex::shared_lock(std::shared_lock_Mutex::mutex_type, int)' shared_lock(mutex_type __m, defer_lock_t) noexcept ^ /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:280:36: error: 'adopt_lock_t' has not been declared shared_lock(mutex_type __m, adopt_lock_t) ^ /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:280:7: error: 'std::shared_lock_Mutex::shared_lock(std::shared_lock_Mutex::mutex_type, int)' cannot be overloaded shared_lock(mutex_type __m, adopt_lock_t) ^ /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:274:7: error: with 'std::shared_lock_Mutex::shared_lock(std::shared_lock_Mutex::mutex_type, int)' shared_lock(mutex_type __m, defer_lock_t) noexcept ^ /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex: In member function 'void std::shared_lock_Mutex::unlock()': /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:347:29: error: 'errc' was not declared in this scope __throw_system_error(int(errc::resource_deadlock_would_occur)); ^ /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex: In member function 'void std::shared_lock_Mutex::_M_lockable() const': /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:381:29: error: 'errc' was not declared in this scope __throw_system_error(int(errc::operation_not_permitted)); ^ /scratch/sandra/mips-elf-trunk/install/opt/codesourcery/mips-sde-elf/include/c++/4.9.2/shared_mutex:383:29: error: 'errc' was not declared in this scope __throw_system_error(int(errc::resource_deadlock_would_occur)); ^ compiler exited with status 1 I am guessing this part of 
shared_mutex is to blame:

#if defined(_GLIBCXX_HAS_GTHREADS) && defined(_GLIBCXX_USE_C99_STDINT_TR1)
# include <mutex>
# include <condition_variable>
#endif

Because of the guards, I think <mutex> is not being included and nothing else is providing declarations of the missing identifiers. I don't know my way around this code or the C++ library specification at all, though.
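The failure mode can be sketched in isolation: if declarations are only pulled in under a feature macro, every piece of code that uses them has to sit under the same condition. The sketch below is an assumption-laden model, not libstdc++ code; `EXAMPLE_HAS_THREADS`, `defer_tag`, `mutex_sketch`, `shared_lock_sketch`, and `kSharedLockAvailable` are all hypothetical names.

```cpp
// Hypothetical feature macro; define it to model a target with thread support.
// It is left undefined here to model a bare-metal configuration.
// #define EXAMPLE_HAS_THREADS 1

namespace example {

#if defined(EXAMPLE_HAS_THREADS)
// Declarations that only exist when the feature is available.
struct defer_tag {};
struct mutex_sketch {
  void lock() {}
  void unlock() {}
};
#endif

// The user of those declarations is guarded by the SAME condition.
// Leaving this template outside the guard would reproduce the
// "has not been declared" errors on targets without the feature.
#if defined(EXAMPLE_HAS_THREADS)
template <typename Mutex>
class shared_lock_sketch {
 public:
  shared_lock_sketch(Mutex& m, defer_tag) : mutex_(&m) {}
  Mutex* mutex() const { return mutex_; }

 private:
  Mutex* mutex_;
};
constexpr bool kSharedLockAvailable = true;
#else
constexpr bool kSharedLockAvailable = false;
#endif

}  // namespace example
```

A testcase can then key off the availability flag (or the corresponding feature-test macro) instead of assuming the class exists on every target.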
[Bug libstdc++/64203] shared_mutex compile errors on bare-metal targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64203 --- Comment #2 from Sandra Loosemore sandra at codesourcery dot com ---
(In reply to Jonathan Wakely from comment #1)
> Created attachment 34208 [details]
> fix config macros for shared_lock
>
> Does this fix it?

No, with this patch I'm still getting the same undefined symbol errors about defer_lock_t, try_to_lock_t, etc. FAOD it looks like on this target _GLIBCXX_USE_C99_STDINT_TR1 is defined but _GLIBCXX_HAS_GTHREADS is not.
[Bug rtl-optimization/62130] ld.exe: nios2_work.elf: Not enough room for program headers, try linking with -N
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62130

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

--- Comment #1 from Sandra Loosemore sandra at codesourcery dot com ---
This error is from a very old toolchain version from Altera. The GCC and ld versions now in the FSF repository have changed quite a bit since 2006. If you can reproduce this error using a current toolchain, I think your first step should be to ask Altera for help since you are using their linker script and BSP library. If it turns out to be an actual linker bug, it should be reported in the binutils bug tracker (not GCC) with a complete reduced test case so that it can be reproduced and investigated.
[Bug target/59710] Nios2: Missing gprel optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59710

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

--- Comment #1 from Sandra Loosemore sandra at codesourcery dot com ---
I'm working on a patch that will fix this by adding some additional choices for -mgpopt.
[Bug libquadmath/55821] Release tarballs (unconditionally) install libquadmath.info when libquadmath is not supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55821 --- Comment #9 from Sandra Loosemore sandra at codesourcery dot com ---
Yes, that patch (with regenerated Makefile.in) did the trick. Thanks.

config.log says my configure line is:

$ /scratch/sandra/arm-fsf/src/gcc-mainline/libquadmath/configure --srcdir=/scratch/sandra/arm-fsf/src/gcc-mainline/libquadmath --cache-file=./config.cache --enable-multilib --with-cross-host=i686-pc-linux-gnu --enable-threads --disable-libmudflap --disable-libstdcxx-pch --with-gnu-as --with-gnu-ld --enable-shared --enable-lto --enable-symvers=gnu --enable-__cxa_atexit --with-glibc-version=2.19 --disable-nls --prefix=/scratch/sandra/arm-fsf/install --with-sysroot=/scratch/sandra/arm-fsf/install/arm-none-linux-gnueabi/libc --with-host-libstdcxx=-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm --enable-libgomp --enable-libitm --enable-libatomic --disable-libssp --enable-poison-system-directories --with-build-time-tools=/scratch/sandra/arm-fsf/install/arm-none-linux-gnueabi/bin --enable-languages=c,c++,fortran,lto --program-transform-name=s^arm-none-linux-gnueabi- --disable-option-checking --with-target-subdir=arm-none-linux-gnueabi --build=i686-pc-linux-gnu --host=arm-none-linux-gnueabi --target=arm-none-linux-gnueabi
[Bug sanitizer/59009] libsanitizer merge from upstream r191666 breaks bootstrap on powerpc64-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59009

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

--- Comment #43 from Sandra Loosemore sandra at codesourcery dot com ---
I'm seeing the same errors as in Comment 8 (complaints about arrays with negative sizes) when building a cross for aarch64-linux-gnu from mainline head. I'm confused about the status of this issue -- is there an uncommitted patch out there somewhere?
[Bug libquadmath/55821] Release tarballs (unconditionally) install libquadmath.info when libquadmath is not supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55821

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

--- Comment #7 from Sandra Loosemore sandra at codesourcery dot com ---
Trying to build an arm-none-linux-gnueabi cross from mainline head, I'm getting this error now:

/scratch/sandra/arm-fsf/src/gcc-mainline/libquadmath/libquadmath.texi:369: @include `libquadmath-vers.texi': No such file or directory.
/scratch/sandra/arm-fsf/src/gcc-mainline/libquadmath/libquadmath.texi:374: warning: undefined flag: BUGURL.

Reverting the patch from r216027 makes it work again. I don't see anything named BUILD_LIBQUADMATH coming out of configure, but in config.log I do see:

BUILD_LIBQUADMATH_FALSE=''
BUILD_LIBQUADMATH_TRUE='#'

Are you sure that this patch was actually tested with a clean build directory where libquadmath-vers.texi was not already present from a previous build?
[Bug debug/62225] DW_AT_location for local variable is missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62225 --- Comment #5 from Sandra Loosemore sandra at codesourcery dot com --- Thinking about this some more: why doesn't -g always enable -fvar-tracking by default? It's currently only enabled if you specify both -g and -O.
[Bug debug/62225] DW_AT_location for local variable is missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62225

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

--- Comment #4 from Sandra Loosemore sandra at codesourcery dot com ---
I believe this bug is responsible for the GDB gdb.base/restore.exp test failures reported in the GDB bug tracker (issues 16655 and 17019). There are many such failures for arm-none-eabi with -mthumb.
[Bug ipa/61160] [4.9/4.10 Regression] wrong code with -O3 (or ICE: verify_cgraph_node failed: edge points to wrong declaration)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61160

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

--- Comment #17 from Sandra Loosemore sandra at codesourcery dot com ---
I'm seeing segfaults from pr61160-2.C and pr61160-3.C on arm-none-linux-gnueabi with -mthumb -- probably the same trouble reported by Christophe earlier. I believe this is a binutils issue. I reported the details here: https://sourceware.org/bugzilla/show_bug.cgi?id=17444
[Bug lto/61526] relocation R_X86_64_PC32 in shared object with static and extern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61526

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

--- Comment #3 from Sandra Loosemore sandra at codesourcery dot com ---
The testcase added for this bug in GCC 4.9.1 is failing with a link error on an arm-none-eabi target that doesn't support -shared. Should it be restricted to linux targets?
[Bug target/61610] ICE in assign_by_spills, at lra-assigns.c:1335
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61610 --- Comment #1 from Sandra Loosemore sandra at codesourcery dot com --- Hmmm, this looks like a bug in LRA exposed by the change to register alloc order. In particular this comment in the code just above the assertion seems to reflect an incorrect assumption: /* We did not assign hard regs to reload pseudos after two iteration. It means something is wrong with asm insn constraints. Report it. */ since there is no inline asm in the test case.
[Bug middle-end/60102] powerpc fp-bit ices at dwf_regno
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60102 --- Comment #9 from Sandra Loosemore sandra at codesourcery dot com --- I've been looking at this a little bit more. DWARF_FRAME_REGNUM is specifically documented to take a hard register number as its operand, so the assertion in dwf_regno is at least consistent with that. The one in dbx_reg_number is more dubious, since neither LEAF_REG_REMAP nor DBX_REGISTER_NUMBER is documented to require a hard register number. So: either the powerpc backend is broken in using a pseudo in this context, or else the documentation for DWARF_FRAME_REGNUM should be changed to permit this and the assertions (as necessary) moved into the target-specific implementations of these macros.
[Bug libstdc++/60758] Infinite backtrace in __cxa_end_cleanup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60758 Sandra Loosemore sandra at codesourcery dot com changed: What|Removed |Added CC||sandra at codesourcery dot com --- Comment #5 from Sandra Loosemore sandra at codesourcery dot com --- The patch committed as r210215 is broken for -mthumb on arm-none-eabi: libtool: compile: /scratch/sandra/arm-fsf/obj/gcc-mainline-0-arm-none-eabi-i686-pc-linux-gnu/./gcc/xgcc -shared-libgcc -B/scratch/sandra/arm-fsf/obj/gcc-mainline-0-arm-none-eabi-i686-pc-linux-gnu/./gcc -nostdinc++ -L/scratch/sandra/arm-fsf/obj/gcc-mainline-0-arm-none-eabi-i686-pc-linux-gnu/arm-none-eabi/thumb/libstdc++-v3/src -L/scratch/sandra/arm-fsf/obj/gcc-mainline-0-arm-none-eabi-i686-pc-linux-gnu/arm-none-eabi/thumb/libstdc++-v3/src/.libs -L/scratch/sandra/arm-fsf/obj/gcc-mainline-0-arm-none-eabi-i686-pc-linux-gnu/arm-none-eabi/thumb/libstdc++-v3/libsupc++/.libs -B/scratch/sandra/arm-fsf/install/arm-none-eabi/bin/ -B/scratch/sandra/arm-fsf/install/arm-none-eabi/lib/ -isystem /scratch/sandra/arm-fsf/install/arm-none-eabi/include -isystem /scratch/sandra/arm-fsf/install/arm-none-eabi/sys-include -mthumb -I/scratch/sandra/arm-fsf/src/gcc-mainline/libstdc++-v3/../libgcc -I/scratch/sandra/arm-fsf/obj/gcc-mainline-0-arm-none-eabi-i686-pc-linux-gnu/arm-none-eabi/thumb/libstdc++-v3/include/arm-none-eabi -I/scratch/sandra/arm-fsf/obj/gcc-mainline-0-arm-none-eabi-i686-pc-linux-gnu/arm-none-eabi/thumb/libstdc++-v3/include -I/scratch/sandra/arm-fsf/src/gcc-mainline/libstdc++-v3/libsupc++ -fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual -Wabi -fdiagnostics-show-location=once -ffunction-sections -fdata-sections -frandom-seed=eh_arm.lo -g -O2 -mthumb -c /scratch/sandra/arm-fsf/src/gcc-mainline/libstdc++-v3/libsupc++/eh_arm.cc -o eh_arm.o /tmp/cchJLQxH.s: Assembler messages: /tmp/cchJLQxH.s:26: Error: invalid register list to push/pop instruction -- `pop {r1,r2,r3,lr}' It looks to me like lr is valid in the reglist for PUSH but not POP on Thumb. 
Since this breaks builds, please either fix ASAP or revert the broken patch.
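For reference, the register-list restriction behind that assembler error can be modeled in a few lines. This is a sketch under the assumption that the 16-bit Thumb encodings are in use, where PUSH accepts r0-r7 plus lr and POP accepts r0-r7 plus pc; the names `ThumbOp` and `valid_thumb16_reglist` are hypothetical, and the 32-bit Thumb-2 encodings (which have different rules) are not modeled.

```cpp
#include <cstdint>

// Hypothetical model of the 16-bit Thumb PUSH/POP register-list rule.
enum class ThumbOp { Push, Pop };

constexpr unsigned kLinkRegister = 14;    // lr is r14
constexpr unsigned kProgramCounter = 15;  // pc is r15

// Bit n of `mask` set means register rn is in the list.
bool valid_thumb16_reglist(ThumbOp op, std::uint16_t mask) {
  if (mask == 0) return false;  // an empty register list is not encodable

  const unsigned low_regs = 0x00FFu;  // r0-r7 are always allowed
  const unsigned extra = (op == ThumbOp::Push) ? (1u << kLinkRegister)
                                               : (1u << kProgramCounter);
  const unsigned allowed = low_regs | extra;

  // Any bit outside the allowed set makes the list invalid.
  return (mask & ~allowed) == 0;
}
```

Under this model, the list {r1,r2,r3,lr} from the error message is accepted for PUSH and rejected for POP, while {r1,r2,r3,pc} is accepted for POP.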
[Bug target/44557] internal compiler error: in gen_thumb_movhi_clobber, at config/arm/arm.md:5811
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44557

Sandra Loosemore sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

--- Comment #9 from Sandra Loosemore sandra at codesourcery dot com ---
The testcase (as packaged with Chung-Lin's patch) no longer fails on 4.9.0 or mainline. Has the problem gone away due to all the register allocator changes since it was initially reported against GCC 4.5, or is it simply being masked by them?
[Bug sanitizer/61021] [4.9/4.10 regression] libsanitizer fails to build with old glibc
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61021 --- Comment #4 from Sandra Loosemore sandra at codesourcery dot com --- Patch has been committed to llvm libsanitizer trunk: http://llvm.org/viewvc/llvm-project?view=revision&revision=208066
[Bug sanitizer/61021] [4.9 regression] libsanitizer fails to build with old glibc
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61021 --- Comment #3 from Sandra Loosemore sandra at codesourcery dot com --- Patch sent to llvm-commits. For now I can unblock my work by applying the patch locally, but this isn't something we'd want to carry around permanently and have to apply to future versions of GCC, especially if it is the wrong way to solve the problem.
[Bug middle-end/60102] powerpc fp-bit ices at dwf_regno
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60102 Sandra Loosemore sandra at codesourcery dot com changed: What|Removed |Added CC||sandra at codesourcery dot com --- Comment #7 from Sandra Loosemore sandra at codesourcery dot com --- I ran into the same problem compiling fp-bit.c. Cesar's patch isn't enough to fix my build on its own -- I also had to revert the r199132 patch Sebastian pointed at in comment 4 to avoid another assertion failure in dbx_reg_number() later on.
[Bug sanitizer/61021] New: [4.9 regression] libsanitizer fails to build with old glibc
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61021

Bug ID: 61021
Summary: [4.9 regression] libsanitizer fails to build with old glibc
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: sanitizer
Assignee: unassigned at gcc dot gnu.org
Reporter: sandra at codesourcery dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu

Created attachment 32718 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32718&action=edit
patch to conditionalize references

We build a native i686-pc-linux-gnu toolchain against a relatively ancient sysroot (glibc 2.4) so that the resulting binaries will work on a variety of older GNU/Linux distros. GCC 4.9 is now failing to build this configuration due to references to undefined symbols PTRACE_GETSIGINFO and PTRACE_SETSIGINFO in libsanitizer.

I see that in other issues the maintainers have suggested disabling libsanitizer in cases where the kernel/glibc version is too old for it to build, but this looks like a regression to me: it used to work in GCC 4.8. The attached patch is sufficient to get it to at least build again, and it's consistent with the way PTRACE_GETREGSET and PTRACE_SETREGSET are being handled.

libsanitizer/README.gcc says "Trivial and urgent fixes (portability, build fixes, etc.) may go directly to the GCC tree." Does this one qualify under that policy? If not, I'll have to echo what has already been suggested elsewhere: the minimum kernel/glibc requirements for libsanitizer need to be documented and enforced by the configure scripts if possible.
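The attachment itself is not reproduced here, so the following is only a minimal sketch of what "conditionalizing references" can look like in general. The variable names and the -1 fallback value are assumptions of this sketch, not the contents of the actual patch.

```c
#include <assert.h>
#if defined(__linux__)
#include <sys/ptrace.h>
#endif

/* Hypothetical names; -1 marks "request not available in these headers".
   Whether PTRACE_GETSIGINFO is visible to #if depends on the headers in
   use: some C libraries expose it as an enumerator rather than a macro,
   in which case this check takes the fallback branch.  */
#if defined(PTRACE_GETSIGINFO) && defined(PTRACE_SETSIGINFO)
static const int sketch_ptrace_getsiginfo = PTRACE_GETSIGINFO;
static const int sketch_ptrace_setsiginfo = PTRACE_SETSIGINFO;
#else
static const int sketch_ptrace_getsiginfo = -1;
static const int sketch_ptrace_setsiginfo = -1;
#endif

/* Callers test this before using either request value. */
static int sketch_have_ptrace_siginfo(void)
{
    return sketch_ptrace_getsiginfo != -1 && sketch_ptrace_setsiginfo != -1;
}
```

The point of the pattern is that the file still compiles against old headers, and code that needs the requests can check for the fallback value at run time.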
[Bug target/59393] New: [4.8/4.9 regression] mips16 code size
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59393

Bug ID: 59393
Summary: [4.8/4.9 regression] mips16 code size
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: sandra at codesourcery dot com

Created attachment 31383 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31383&action=edit
bf_enc.i

The attached test case (part of blowfish) is compiled for MIPS16 with

    -Os -DNDEBUG -fno-schedule-insns2 -mno-check-zero-division -fno-common -fsection-anchors -fno-shrink-wrap -ffunction-sections -mips16

In GCC 4.7, this produced 2152 bytes of code. In GCC 4.8, it produces 2396 bytes. On mainline head, it's 2384. In both cases, the code growth is coming from reload blowing up.

I tracked the 4.8 regression down to two distinct changes:

(1) This patch for PR54109 http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00617.html removed forward-propagation of constant offsets into address expressions. No subsequent pass is making that optimization, so the code requires one extra register that is live essentially through the whole function by the time it gets to reload.

(2) The hoist pass seems to be significantly underestimating register pressure and adds 9 more pseudos with long lifetimes. (This might be a red herring, but why is it only considering GR_REGS as a pressure class and not M16_REGS?)

By reverting the above patch and disabling the hoist pass, I was able to get the same code size on 4.8 as 4.7. On mainline head, there's something else going on as well, as this brought the code size down only halfway, to 2280 bytes. I haven't yet analyzed where the remaining bad code is coming from, but if I had to make a wild stab in the dark, I'd guess that if the register pressure calculation is wrong in hoist it may be wrong in other places as well.

In any case, reverting a patch that fixes a correctness bug and disabling the hoist pass is clearly not an acceptable solution.
Any suggestions on the right way to fix the two already-identified problems?
[Bug middle-end/23623] volatile keyword changes bitfield access size from 32bit to 8bit
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23623 --- Comment #17 from Sandra Loosemore sandra at codesourcery dot com --- Updated patch series: http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02057.html http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02058.html http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02059.html Unfortunately, it seems that fixing bugs with -fstrict-volatile-bitfields has been blocked by disagreement between global reviewers and target maintainers who can't agree on whether the C/C++11 memory model should take precedence over target-specific ABIs by default. :-(
[Bug middle-end/48784] #pragma pack(1) + -fstrict-volatile-bitfields = bad codegen
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48784 --- Comment #4 from Sandra Loosemore sandra at codesourcery dot com --- Updated patch series: http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02057.html http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02058.html http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02059.html Unfortunately, it seems that fixing bugs with -fstrict-volatile-bitfields has been blocked by disagreement between global reviewers and target maintainers who can't agree on whether the C/C++11 memory model should take precedence over target-specific ABIs by default. :-(
[Bug middle-end/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 --- Comment #13 from Sandra Loosemore sandra at codesourcery dot com --- Updated patch series: http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02057.html http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02058.html http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02059.html Unfortunately, it seems that fixing bugs with -fstrict-volatile-bitfields has been blocked by disagreement between global reviewers and target maintainers who can't agree on whether the C/C++11 memory model should take precedence over target-specific ABIs by default. :-(
[Bug target/56997] Incorrect write to packed field when strict-volatile-bitfields enabled on aarch32
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56997 --- Comment #11 from Sandra Loosemore sandra at codesourcery dot com --- Updated patch series: http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02057.html http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02058.html http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02059.html Unfortunately, it seems that fixing bugs with -fstrict-volatile-bitfields has been blocked by disagreement between global reviewers and target maintainers who can't agree on whether the C/C++11 memory model should take precedence over target-specific ABIs by default. :-(
[Bug target/56997] Incorrect write to packed field when strict-volatile-bitfields enabled on aarch32
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56997 --- Comment #8 from Sandra Loosemore sandra at codesourcery dot com --- Thanks for giving it a try. Do you think that in a case such as this where a single access of the appropriate size cannot be generated due to the struct having unaligned fields we should generate the same code as with -fno-strict-volatile-bitfields, or something else? I agree the behavior of my current patch is problematical here, but we need to decide what this case is supposed to do before I can figure out how to fix the code.
[Bug middle-end/23623] volatile keyword changes bitfield access size from 32bit to 8bit
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23623 --- Comment #16 from Sandra Loosemore sandra at codesourcery dot com --- Patch that fixes regression posted here: http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00750.html
[Bug middle-end/48784] #pragma pack(1) + -fstrict-volatile-bitfields = bad codegen
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48784 --- Comment #3 from Sandra Loosemore sandra at codesourcery dot com --- Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00750.html
[Bug middle-end/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 --- Comment #12 from Sandra Loosemore sandra at codesourcery dot com --- Patch for the first problem posted here: http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00750.html
[Bug target/56997] Incorrect write to packed field when strict-volatile-bitfields enabled on aarch32
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56997 --- Comment #5 from Sandra Loosemore sandra at codesourcery dot com --- Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00750.html
[Bug middle-end/48784] #pragma pack(1) + -fstrict-volatile-bitfields = bad codegen
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48784 --- Comment #2 from Sandra Loosemore sandra at codesourcery dot com --- I'm working on a fix for this.
[Bug middle-end/23623] volatile keyword changes bitfield access size from 32bit to 8bit
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23623 Sandra Loosemore sandra at codesourcery dot com changed: What|Removed |Added CC||sandra at codesourcery dot com --- Comment #15 from Sandra Loosemore sandra at codesourcery dot com --- This bug regressed sometime around GCC 4.7, when the C++ bitfield range support was added. I'm working on a fix that makes it work again in conjunction with -fstrict-volatile-bitfields.
[Bug target/56997] Incorrect write to packed field when strict-volatile-bitfields enabled on aarch32
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56997 Sandra Loosemore sandra at codesourcery dot com changed: What|Removed |Added CC||sandra at codesourcery dot com --- Comment #4 from Sandra Loosemore sandra at codesourcery dot com --- I'm working on a fix for this.
[Bug middle-end/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 --- Comment #10 from Sandra Loosemore sandra at codesourcery dot com --- I'm working on a new patch that addresses the first problem, the failure in test(). I think the second failure is not in test1() at all, and has nothing to do with -fstrict-volatile-bitfields. It looks to me like the problem is that the expression x1->t1 is returning an unaligned pointer due to the packed attribute on struct test2. It should probably not be allowed to take the address of a packed struct field, at least on targets that require strict alignment. That bug is already filed as PR 41809: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41809
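The alignment hazard described in this comment can be illustrated with a minimal sketch. The struct below is hypothetical (it is not the struct test2 from the bug's test case) and uses the GCC/Clang packed attribute; it assumes a target where int has an alignment requirement greater than 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: packing places the int member at offset 1. */
struct sketch_packed {
    char tag;
    int  value;
} __attribute__((packed));

/* Nonzero if the member's offset is a multiple of int's alignment. */
static int sketch_value_is_aligned(void)
{
    return offsetof(struct sketch_packed, value) % _Alignof(int) == 0;
}

/* Safe read: copy the bytes instead of dereferencing an int pointer
   that may be misaligned on strict-alignment targets.  */
static int sketch_read_value(const struct sketch_packed *p)
{
    int v;
    memcpy(&v, (const char *)p + offsetof(struct sketch_packed, value),
           sizeof v);
    return v;
}
```

Accessing the member through the struct lets the compiler pick a suitable access sequence, whereas a plain int pointer formed from the member's address carries no record that the storage is misaligned.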
[Bug target/56564] movdqa on possibly-8-byte-aligned struct with -O3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564 Sandra Loosemore sandra at codesourcery dot com changed: What|Removed |Added CC||sandra at codesourcery dot com --- Comment #11 from Sandra Loosemore sandra at codesourcery dot com --- This affects at least PowerPC, too, which implements DATA_ALIGNMENT to add additional alignment beyond that specified by the ABI. Isn't TYPE_ALIGN already supposed to return the ABI-mandated alignment for objects of a given type? The documentation for DATA_ALIGNMENT already suggests that its purpose is to add additional alignment for optimization purposes and I suspect other targets may be using it that way, too. Perhaps what's needed here is more careful monitoring of the places where DATA_ALIGNMENT is being used, rather than splitting it into two macros or adding an argument to control the two uses. Or at least, we'd have to clarify how the requirements for the ABI-conforming use of DATA_ALIGNMENT differ from what TYPE_ALIGN is supposed to do. It seems to me that DATA_ALIGNMENT's original purpose was to add additional alignment on variable definitions, and IIUC the problem now is either that it is being used in other contexts or that its intended use is not taking into account common, weak, and/or comdat definitions where the linker may substitute a less-aligned definition from another compilation unit. Also, somebody should check whether vect_can_force_dr_alignment_p in tree-vect-data-refs.c is catching all the cases it needs to for ABI conformance.
[Bug middle-end/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 Sandra Loosemore sandra at codesourcery dot com changed: What|Removed |Added CC||sandra at codesourcery dot ||com --- Comment #5 from Sandra Loosemore sandra at codesourcery dot com 2013-02-18 15:30:16 UTC --- The patch linked from the initial message was rejected. I did not (and still do not) have the time to rewrite it; if someone else can figure out how to fix this in a way that's acceptable to the maintainers, that would be great.
[Bug middle-end/48784] #pragma pack(1) + -fstrict-volatile-bitfields = bad codegen
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48784 Sandra Loosemore sandra at codesourcery dot com changed: What|Removed |Added CC||sandra at codesourcery dot ||com --- Comment #1 from Sandra Loosemore sandra at codesourcery dot com 2012-08-18 01:11:12 UTC --- I just checked the test case on mainline head for a couple other builds I have handy. ARM EABI prints 1ff whether or not you compile with -fstrict-volatile-bitfields. MIPS ELF prints 0 with -fstrict-volatile-bitfields and fff without.
[Bug target/53633] __attribute__((naked)) should disable -Wreturn-type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53633 Sandra Loosemore sandra at codesourcery dot com changed: What|Removed |Added CC||sandra at codesourcery dot ||com --- Comment #2 from Sandra Loosemore sandra at codesourcery dot com 2012-07-21 20:06:29 UTC --- Paul Brook previously posted a patch for this, but it was never completed or committed: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg01088.html I'm going to see if I can do anything to address the previous review comments. Also, that patch doesn't address similar problems in the C++ front end.
[Bug c++/51910] [4.7 Regression] -frepo linking failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51910 --- Comment #20 from Sandra Loosemore sandra at codesourcery dot com 2012-02-17 18:51:48 UTC --- Apropos of the complaint that -frepo produces smaller executables than relying just on the linker discarding duplicate COMDAT groups, I finally got around to packing up and submitting this linker patch that's been in my pile for a while: http://sourceware.org/ml/binutils/2012-02/msg00146.html
[Bug c++/51910] [4.7 Regression] -frepo linking failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51910 --- Comment #17 from Sandra Loosemore sandra at codesourcery dot com 2012-01-30 00:12:53 UTC --- Cleaned up version of patch, with Jason's test case. http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01591.html
[Bug c++/51910] [4.7 Regression] -frepo linking failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51910 --- Comment #16 from Sandra Loosemore sandra at codesourcery dot com 2012-01-29 04:50:12 UTC --- Created attachment 26498 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26498 new patch The attached patch seems to DTRT; I tested it also with explicit -Wl,--demangle and -Wl,--no-demangle on the command line, and -Wl,-Map=wa.map. Regression-testing now, and trying to figure out how to wrap up the test case for dejagnu.
[Bug c++/51910] [4.7 Regression] -frepo linking failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51910 --- Comment #11 from Sandra Loosemore sandra at codesourcery dot com 2012-01-27 15:31:14 UTC --- I like the first patch too. Since -frepo seems to depend on telling the linker not to demangle, better to just say so. I'm not familiar with the overall code flow here. Does -frepo end up doing the final link with the demangling setting requested by the user, or does this change mean it always implies --no-demangle? E.g., if I specify both -frepo and -Wl,-Map expecting to get a demangled map file, will I get one?
[Bug c++/51910] [4.7 Regression] -frepo linking failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51910 --- Comment #13 from Sandra Loosemore sandra at codesourcery dot com 2012-01-27 20:14:22 UTC --- Sigh. I think it would be OK to make -frepo imply --no-demangle, and document that this is the case. If my previous patch is reverted, that'll still leave -frepo broken on Windows hosts (because the logic in collect2 to disable demangling in ld was broken on Windows) in addition to re-breaking mapfile demangling on both Windows and Posix in the normal case where you aren't using -frepo. WDYT?
[Bug c++/51910] [4.7 Regression] -frepo linking failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51910 --- Comment #15 from Sandra Loosemore sandra at codesourcery dot com 2012-01-27 23:22:45 UTC --- I've just dug around in the code a bit and I think we can fix this. I don't have a build tree to use for this set up at the moment, but roughly: the loop to attempt relinking after processing repo files is in do_tlink. Move the tlink_execute call at the bottom of the loop to the top and add --no-demangle. Add another tlink_execute call without --no-demangle after the end of the loop (but still in the if (read_repo_files ... condition). That means you'll do two extra link steps when processing repo files, but incurs no extra overhead in the normal case. I'll play with that over the weekend unless somebody points out that it's a dumb idea that won't work. :-P
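The control flow proposed in this comment can be modeled roughly as follows. This is a toy simulation, not collect2 code: every name here is hypothetical, the "linker" simply fails until enough recompile rounds have happened, and the iteration cap is an arbitrary value chosen for the sketch.

```c
#include <assert.h>

#define SKETCH_MAX_ITERATIONS 17   /* arbitrary cap for this sketch */

struct sketch_link_state {
    int recompiles_needed;   /* repo rounds still required before a link succeeds */
    int link_steps;          /* how many times the linker has been run */
    int last_link_demangled; /* 1 if the latest link ran without --no-demangle */
};

/* Stand-in for running the linker: 0 on success, nonzero on failure. */
static int sketch_link(struct sketch_link_state *s, int no_demangle)
{
    s->link_steps++;
    s->last_link_demangled = !no_demangle;
    return s->recompiles_needed > 0;
}

/* Stand-in for scanning linker output and recompiling from repo files. */
static void sketch_recompile(struct sketch_link_state *s)
{
    if (s->recompiles_needed > 0)
        s->recompiles_needed--;
}

static int sketch_do_tlink(struct sketch_link_state *s)
{
    int ret = sketch_link(s, 0);        /* ordinary link, user's settings */

    if (ret != 0) {                     /* stands in for the repo-files check */
        int i = 0;
        do {
            ret = sketch_link(s, 1);    /* link at the top, --no-demangle */
            if (ret == 0)
                break;
            sketch_recompile(s);
        } while (++i < SKETCH_MAX_ITERATIONS);

        if (ret == 0)
            ret = sketch_link(s, 0);    /* final link without --no-demangle */
    }
    return ret;
}
```

In this model a build that needs no repo processing still links once, while a build that needs one recompile round links four times instead of the two it would take with a single link at the bottom of the loop. That is consistent with the "two extra link steps" estimate in the comment.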
[Bug c++/51910] [4.7 Regression] -frepo linking failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51910 --- Comment #7 from Sandra Loosemore sandra at codesourcery dot com 2012-01-23 23:14:18 UTC --- In addition to specifying an explicit command-line option, I think that if you configure GCC with --with-demangler-in-ld=no it'll restore the previous behavior, at least on systems where the previous behavior actually worked. See http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01368.html for the discussion of the various bugs I found here. If proper operation of -frepo depends on particular linker demangling options that ought to be documented, at least.
[Bug rtl-optimization/49936] [4.7 Regression] IRA handles CANNOT_CHANGE_MODE_CLASS poorly, + spills to memory on 4.7
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49936 --- Comment #10 from Sandra Loosemore sandra at codesourcery dot com 2012-01-05 17:31:39 UTC --- My notes are that the unnecessary register moves in the loop have been present since at least GCC 4.3, so it is not a 4.6-4.7 regression, at least.
[Bug rtl-optimization/49936] [4.7 Regression] IRA handles CANNOT_CHANGE_MODE_CLASS poorly, + spills to memory on 4.7
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49936 Sandra Loosemore sandra at codesourcery dot com changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|FIXED | --- Comment #7 from Sandra Loosemore sandra at codesourcery dot com 2011-10-10 14:34:11 UTC --- The additional spills to memory on 4.7 compared to 4.6 were fixed (at least the last time I checked), but there is still a real problem with poor RA decisions resulting from CANNOT_CHANGE_MODE_CLASS. So, let's please not mark this issue resolved.
[Bug rtl-optimization/49936] [4.7 Regression] IRA handles CANNOT_CHANGE_MODE_CLASS poorly, + spills to memory on 4.7
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49936 --- Comment #3 from Sandra Loosemore sandra at codesourcery dot com 2011-08-16 04:13:02 UTC --- Hmmm. Is it possible to make the INT/memory/whatever decision based on move costs? Or use a target hook to supply a hint about what to do?
[Bug rtl-optimization/49936] New: [4.7 regression] IRA handles CANNOT_CHANGE_MODE_CLASS poorly, + spills to memory on 4.7
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49936

Summary: [4.7 regression] IRA handles CANNOT_CHANGE_MODE_CLASS poorly, + spills to memory on 4.7
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: san...@codesourcery.com
CC: vmaka...@redhat.com

Created attachment 24885 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=24885
test case abstest.c

Consider the attached test case, compiled for MIPS with

    mipsisa32r2-sde-elf-gcc -O3 -fno-inline -fno-unroll-loops -march=74kf1_1 -S abstest.c

On MIPS, the hardware floating-point abs and neg instructions aren't usable by default because they do the wrong thing with NaNs. And, the sign-bit-twiddling used by the optabs.c expansions can't be performed in a floating-point register because of CANNOT_CHANGE_MODE_CLASS.

With a GCC 4.6 compiler, this snippet of code from the test1 function

    for (i=0; i<n; i++) {
      accum -= a[i];
    }
    accum = fabs (accum);
    return accum;

produces

    ...
    .L3:
        mtc1    $3,$f2
        ldc1    $f0,0($5)
        addiu   $5,$5,8
        mtc1    $2,$f3
        sub.d   $f2,$f2,$f0
        mfc1    $3,$f2
        bne     $5,$4,.L3
        mfc1    $2,$f3
        ext     $5,$2,0,31
        move    $4,$3
    .L2:
        mtc1    $4,$f0
        j       $31
        mtc1    $5,$f1
    ...

Because it thinks it cannot use a floating-point register, IRA has decided to put accum in a general-purpose register pair $2/$3, and is shuffling it back and forth to $f2/$f3 on every iteration of the loop.

On 4.7 mainline trunk, it's now deciding accum must live in memory instead of a register:

    .L3:
        ldc1    $f0,0($2)
        addiu   $2,$2,8
        sub.d   $f2,$f2,$f0
        bne     $2,$3,.L3
        sdc1    $f2,0($sp)
        lw      $2,0($sp)
        ext     $3,$2,0,31
        lw      $2,4($sp)
    .L2:
        sw      $2,4($sp)
        sw      $3,0($sp)
        lw      $3,4($sp)
        lw      $2,0($sp)
        addiu   $sp,$sp,8
        mtc1    $3,$f0
        j       $31
        mtc1    $2,$f1

I think a big part of the problem here is the code in ira-costs.c that refuses to consider the FP regs at all in computing the costs for where to put the accum variable. Better that it should just add in the move costs for reloading to some other register class, much as it would to satisfy normal insn register constraints.

Naively commenting out all the #ifdef CANNOT_CHANGE_MODE_CLASS ... #endif instances in ira-costs.c gave this code in 4.6:

    .L3:
        ldc1    $f2,0($2)
        addiu   $2,$2,8
        bne     $2,$4,.L3
        sub.d   $f0,$f0,$f2
        mfc1    $2,$f0
        mfc1    $3,$f0
        ext     $5,$2,0,31
        move    $4,$3
    .L2:
        mtc1    $4,$f0
        j       $31
        mtc1    $5,$f1

However, the same change on 4.7 didn't help; it's still preferring to spill to memory. I think there must be some other bug lurking here that's responsible for these additional memory spills on 4.7. I also saw them when experimenting with a patch to the MIPS backend to attack this problem in a target-specific way.

Also, while splitting live ranges might help with the code in the test1 function where the fabs call appears outside the loop, the code for the test2 function (fabs in the body of the loop) suffers from the same problem and is spilling to memory on 4.7.
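For readers unfamiliar with the "sign-bit twiddling" mentioned in this report, here is a minimal C sketch of computing abs and neg on the integer representation of a double. It assumes an IEEE 754 binary64 double; the function names are hypothetical, and this is not the optabs.c expansion itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SIGN_BIT (UINT64_C(1) << 63)

/* Clear the sign bit: the integer-side equivalent of fabs. */
static double sketch_fabs_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret without aliasing issues */
    bits &= ~SKETCH_SIGN_BIT;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Flip the sign bit: the integer-side equivalent of negation. */
static double sketch_neg_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits ^= SKETCH_SIGN_BIT;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Report the sign bit so that -0.0 and 0.0 can be told apart. */
static int sketch_sign_bit(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (int)(bits >> 63);
}
```

Because the operation is done on integer bits, the value has to be in (or moved to) a general-purpose register, which is the source of the register-class tension the report describes.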
[Bug rtl-optimization/49936] [4.7 regression] IRA handles CANNOT_CHANGE_MODE_CLASS poorly, + spills to memory on 4.7
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49936 --- Comment #1 from Sandra Loosemore sandra at codesourcery dot com 2011-08-01 19:44:34 UTC --- Created attachment 24886 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=24886 WIP patch to MIPS backend Here is the WIP patch I referred to earlier. This patch (which handles abs only) produced pretty good code on 4.6 by adding MIPS-specific expansions of abs with explicit register constraints, but didn't fix the memory spills on 4.7. Since this is pretty complicated, I think it would be better to fix IRA to deal better with CANNOT_CHANGE_MODE_CLASS in a target-independent way than to continue further down this route.
[Bug tree-optimization/39604] [4.3/4.4/4.5 Regression] tree-ssa-sink breaks stack layout
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39604 --- Comment #23 from Sandra Loosemore sandra at codesourcery dot com 2011-06-01 17:34:56 UTC --- Draft patch that addresses this bug here: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02029.html
[Bug middle-end/42505] [4.4/4.5/4.6 Regression] loop canonicalization causes a lot of unnecessary temporary variables
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42505 --- Comment #13 from Sandra Loosemore sandra at codesourcery dot com 2010-10-01 15:01:08 UTC --- I think this bug is fixed now.
[Bug middle-end/29256] [4.3/4.4/4.5/4.6 regression] loop performance regression
--- Comment #38 from sandra at codesourcery dot com 2010-07-21 16:08 --- On reading the code again, I think the -7 is coming from the can_autoinc case in determine_use_iv_cost_address. I also think it is correct to prefer autoinc. E.g., here's the generated code for the loop in r161843:

    .L2:
        addi 11,8,9216
        ldx 0,10,9
        stdx 0,11,9
        addi 9,9,8
        bdnz .L2

and in r161844:

    .L2:
        ldu 0,8(11)
        stdu 0,8(9)
        bdnz .L2

I'm no expert on powerpc architecture, but 3 instructions versus 5 looks like a win to me. Bit-rotten test case? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256
[Bug middle-end/29256] [4.3/4.4/4.5/4.6 regression] loop performance regression
--- Comment #35 from sandra at codesourcery dot com 2010-07-21 04:16 --- Created an attachment (id=21274) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21274&action=view) -fdump-tree-ivopts-details output from r161843 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256
[Bug middle-end/29256] [4.3/4.4/4.5/4.6 regression] loop performance regression
--- Comment #36 from sandra at codesourcery dot com 2010-07-21 04:16 --- Created an attachment (id=21275) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21275&action=view) -fdump-tree-ivopts-details output from r161844 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256
[Bug middle-end/29256] [4.3/4.4/4.5/4.6 regression] loop performance regression
--- Comment #37 from sandra at codesourcery dot com 2010-07-21 04:21 --- It seems like the change was introduced by my patch for PR42505 in r161844. But, it is correctly choosing the lower-cost candidate set -- the problem is in the cost model, which was unchanged from r161843. Take a look at the Use-candidate costs section of the dump. Those costs with negative values (like -7) look very suspicious to me. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256
[Bug tree-optimization/39839] [4.3/4.4/4.5/4.6 regression] loop invariant motion causes stack spill
--- Comment #14 from sandra at codesourcery dot com 2010-07-13 16:13 --- There are two patches that made the difference: r158189 (Carrot's patch for PR42601) and r162043 (the second part of my patch for PR42505). I checked that backporting these two changes to the 4.5 branch is sufficient to fix the code size regression on this example there, too. I posted the test case patch here: http://gcc.gnu.org/ml/gcc-patches/2010-07/msg01070.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39839
[Bug rtl-optimization/39837] [4.3/4.4/4.5 regression] extra spills due to RTL LICM
--- Comment #17 from sandra at codesourcery dot com 2010-07-13 17:13 --- As a point of clarification, I am not getting paid to care about this issue either. :-) At this time I have no plans to continue working on it. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39837
[Bug rtl-optimization/39837] [4.3/4.4/4.5 regression] extra spills due to RTL LICM
--- Comment #14 from sandra at codesourcery dot com 2010-07-11 17:47 --- Yes, it looks like the prototype fix for PR 36758 fixes the test case at the top of this issue. The patch needs a little updating, though, and I can't say I grok the changes to the surrounding code sufficiently to be sure I've gotten it right. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39837
[Bug rtl-optimization/39837] [4.3/4.4/4.5 regression] extra spills due to RTL LICM
--- Comment #11 from sandra at codesourcery dot com 2010-07-10 21:07 --- I just checked to see if this is still a problem. As of r162042, the example in comment #1 produces the same (bad) output as GCC 4.4.1. However, the example in comment #4 looks fixed to me, with this output:

    test:
        push {r0, r1, r2, lr}
        mov r3, #0
        str r3, [sp, #4]
    .L2:
        add r0, sp, #4
        bl func
        ldr r3, [sp, #4]
        cmp r3, #12
        ble .L2
        @ sp needed for prologue
        pop {r0, r1, r2, pc}

As it was the latter test case that caused this to be marked as a duplicate of PR 36758, maybe the original test case is tripping over a different problem and needs to be re-examined? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39837
[Bug tree-optimization/39839] [4.3/4.4/4.5/4.6 regression] loop invariant motion causes stack spill
--- Comment #13 from sandra at codesourcery dot com 2010-07-11 01:22 --- Some further analysis: The part of my PR42505 patch that made the difference was the change to estimate_reg_pressure_cost in cfgloopanal.c, to make it exclude the call-clobbered registers. This part was finally committed separately in a revised version as r162043. I'm still looking into what to do about the test case and 4.5 backport. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39839
[Bug middle-end/44838] [4.6 regression] FAIL: gcc.dg/pr39794.c
--- Comment #2 from sandra at codesourcery dot com 2010-07-06 15:57 --- s/caused by/exposed by/ ? The patch to ivopts likely results in it selecting a different/smaller set of loop induction variables, but I don't see how this change by itself could have introduced a wrong-code error. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug middle-end/44838] [4.6 regression] FAIL: gcc.dg/pr39794.c
--- Comment #4 from sandra at codesourcery dot com 2010-07-06 21:10 --- Well, I'm *trying* to investigate but I haven't been able to reproduce the problem yet. I checked out r161844 and built for i686-pc-linux-gnu, and the gcc.dg/pr39794.c execution test passes. If this requires some other target and/or options to trigger the failure, can you be more specific about what they are? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug middle-end/44838] [4.6 regression] FAIL: gcc.dg/pr39794.c
--- Comment #7 from sandra at codesourcery dot com 2010-07-07 00:42 --- Hmmm. It's possible I built my toolchain incorrectly, but I'm seeing that it aborts when compiled with -m64 but not with -m32. The failure mode looks identical to that reported in PR39794:

    (gdb) print a
    $1 = {0, 1, 4, 2, 10, 12, 24, 44, 72, 92, 60, 34, 244, 47, 58, 291}
    (gdb) print ref
    $2 = {0, 1, 4, 2, 10, 12, 24, 44, 72, 136, 232, 416, 736, 1296, 2304, 2032}

This slightly modified version of the test case fails when compiled with -m64 -O2 -funroll-loops -fno-ivopts:

    extern void abort ();

    void
    foo (int *a, int n)
    {
      int *lasta = a + n;
      for (; a != lasta; a++)
        {
          *a *= 2;
          a[1] = a[-1] + a[-2];
        }
    }

    int a[16];
    int ref[16] = { 0, 1, 4, 2, 10, 12, 24, 44, 72, 136, 232, 416, 736, 1296, 2304, 2032 };

    int
    main ()
    {
      int i;
      for (i = 0; i < 16; i++)
        a[i] = i;
      foo (a + 2, 16 - 3);
      for (i = 0; i < 16; i++)
        if (ref[i] != a[i])
          abort ();
      return 0;
    }

So, not an ivopts problem at all? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug middle-end/44838] [4.6 regression] FAIL: gcc.dg/pr39794.c
--- Comment #9 from sandra at codesourcery dot com 2010-07-07 01:09 --- Yes, this is on an Ubuntu system, but one of my co-workers says GCC multilibs work with Ubuntu now; the support is in gcc/config/i386/t-linux64. Me, I'm clueless about anything configury-related. :-( I can try again on another machine, but this being my third try already, I'm not terribly confident I'll get it right the next time, either. Frankly, I do not see what Ubuntu vs non-Ubuntu multilib arrangements would have to do with ivopts behavior anyway. Can you try out my -fno-ivopts example in the configuration you found the original problem in? That would rule out my cluelessness in configuring the toolchain as a source of differing behavior, at least. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44838
[Bug tree-optimization/39839] [4.3/4.4/4.5/4.6 regression] loop invariant motion causes stack spill
--- Comment #10 from sandra at codesourcery dot com 2010-06-22 16:26 --- It looks like this bug has been fixed by my proposed patch for PR42505: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg01920.html Applying that patch to r160755 gives:

    test:
       0: b570       push {r4, r5, r6, lr}
       2: 4d0c       ldr r5, [pc, #48] ; (34 <test+0x34>)
       4: 1c04       adds r4, r0, #0
       6: 6806       ldr r6, [r0, #0]
       8: 447d       add r5, pc
       a: e00f       b.n 2c <test+0x2c>
       c: 6861       ldr r1, [r4, #4]
       e: 1c2b       adds r3, r5, #0
      10: 780a       ldrb r2, [r1, #0]
      12: 2a00       cmp r2, #0
      14: d101       bne.n 1a <test+0x1a>
      16: 4b08       ldr r3, [pc, #32] ; (38 <test+0x38>)
      18: 447b       add r3, pc
      1a: 4808       ldr r0, [pc, #32] ; (3c <test+0x3c>)
      1c: 1989       adds r1, r1, r6
      1e: 4478       add r0, pc
      20: 1c32       adds r2, r6, #0
      22: f7ff fffe  bl 0 <func>
      26: 6823       ldr r3, [r4, #0]
      28: 3b01       subs r3, #1
      2a: 6023       str r3, [r4, #0]
      2c: 6823       ldr r3, [r4, #0]
      2e: 2b00       cmp r3, #0
      30: daec       bge.n c <test+0xc>
      32: bd70       pop {r4, r5, r6, pc}
      34: 0028       .word 0x0028
      38: 001c       .word 0x001c
      3c: 001a       .word 0x001a

So, back down to 64 bytes of code, and no spills to stack. Assuming the PR42505 patch is approved, probably the only thing required to close this issue is checking in the additional test case.

-- sandra at codesourcery dot com changed: What|Removed |Added CC||sandra at codesourcery dot ||com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39839
[Bug tree-optimization/39839] [4.3/4.4/4.5/4.6 regression] loop invariant motion causes stack spill
--- Comment #12 from sandra at codesourcery dot com 2010-06-22 18:02 --- Hrmmm, I was planning to attempt a 4.5 backport of the PR42505 patch for internal use, but if it's not easy or doesn't help, I think I have better things to do with my time than to try to come up with some other fix. ;-) So, let's wait and see. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39839
[Bug target/43703] Unexpected floating point precision loss due to ARM NEON autovectorization
--- Comment #6 from sandra at codesourcery dot com 2010-06-22 01:55 --- Julian's patch overlapped some other NEON changes I was already preparing for submission, so I did some refactoring before posting it for review. Here's the main part of the fix: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02102.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43703
[Bug middle-end/42505] [4.4/4.5/4.6 Regression] loop canonicalization causes a lot of unnecessary temporary variables
--- Comment #10 from sandra at codesourcery dot com 2010-06-19 12:56 --- Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg01920.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42505
[Bug middle-end/42505] [4.4/4.5/4.6 Regression] loop canonicalization causes a lot of unnecessary temporary variables
--- Comment #9 from sandra at codesourcery dot com 2010-06-12 07:42 ---
I now have a specific theory of what is going on here. There are two problems:

(1) estimate_reg_pressure_cost is not accounting for the function call in the loop body. In this case it ought to use call_used_regs instead of fixed_regs to determine how many registers are available for loop invariants. Here the target is Thumb-1 and there are only 4 non-call-clobbered registers available rather than 9, so we are much more constrained than ivopts thinks we are. This is pretty straightforward to fix.

(2) For the test case filed with the issue, there are 4 registers needed for the two candidates and two invariants ivopts is selecting, so even with the fix for (1) ivopts thinks it has enough registers available. But, there are two uses of the form (src + offset) in the ivopts output, although they appear differently in the gimple code. RTL optimizations are combining these and allocating a temporary. Since the two uses span the function call in the loop body, the temporary needs to be assigned to a non-call-clobbered register. This is why there is a spill of the other loop invariant.

Perhaps we could make the RA smarter about recomputing the src + offset value rather than resort to spilling something, but since I am dumb about the RA ;-) I'm planning to keep poking at the ivopts cost model instead.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42505
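The register-counting idea in point (1) can be sketched roughly as follows. This is a minimal standalone illustration, not the actual GCC code: the 16-entry register file, its contents, and the helpers `count_available` and `regs_for_invariants` are all made up for the example, and the arrays are toy stand-ins that merely borrow the names `fixed_regs` and `call_used_regs` from the comment above. The contents were chosen only so that the counts come out as the 9 and 4 quoted there.

```c
#include <assert.h>

#define NUM_REGS 16

/* Toy register file (an assumption, not real target data).
   A 1 in fixed_regs means the register is never allocatable;
   a 1 in call_used_regs means it is fixed or clobbered by calls.  */
static const char fixed_regs[NUM_REGS] =
  { 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1 };
static const char call_used_regs[NUM_REGS] =
  { 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1 };

/* Count the registers not marked as unavailable in the given set.  */
static int
count_available (const char *unavailable)
{
  int n = 0;

  assert (unavailable);
  for (int i = 0; i < NUM_REGS; i++)
    if (!unavailable[i])
      n++;
  return n;
}

/* Hypothetical helper: how many registers can hold values that stay
   live across the whole loop body.  If the body contains a call, only
   registers that survive the call are usable.  */
static int
regs_for_invariants (int loop_has_call)
{
  return count_available (loop_has_call ? call_used_regs : fixed_regs);
}
```

With this toy data, `regs_for_invariants (0)` counts 9 registers and `regs_for_invariants (1)` counts 4, which is the gap between what the cost model assumes and what is really available when the loop contains a call.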
[Bug middle-end/42505] [4.4/4.5/4.6 Regression] loop canonicalization causes a lot of unnecessary temporary variables
--- Comment #8 from sandra at codesourcery dot com 2010-06-10 13:01 --- I was barking up the wrong tree with my last idea -- the signed/unsigned conversion business was a red herring. Here's what I now believe is the problem: the costs computation is underestimating the register pressure costs so that we are in fact spilling when the cost computation thinks it still has free registers. A hack to make get_computation_cost_at add target_reg_cost to the result when it must use a scratch register seemed to have positive overall effects on code size (as well as fixing the test case). But, I don't think that's the real solution, as I can't come up with a good logical justification for putting such a cost there. :-) estimate_reg_pressure_cost already reserves 3 free registers for such things. Anyway, I am continuing to poke at this in hopes of figuring out where the register costs model is really going wrong. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42505
[Bug middle-end/42505] [4.4/4.5/4.6 Regression] loop canonicalization causes a lot of unnecessary temporary variables
--- Comment #7 from sandra at codesourcery dot com 2010-06-05 20:41 --- OK, I'm testing a hack to rewrite_use_compare to make it know that it doesn't have to introduce a temporary just to compare against constant zero. I'm also doing a little tuning of the costs model for -Os, using CSiBE. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42505
[Bug middle-end/42505] [4.4/4.5/4.6 Regression] loop canonicalization causes a lot of unnecessary temporary variables
--- Comment #4 from sandra at codesourcery dot com 2010-06-04 00:09 ---
I've been looking at this problem today. Here's the stupid part coming out of ivopts:

<bb 5>:
  # ivtmp.7_21 = PHI <0(2), ivtmp.7_20(4)>
  # ivtmp.10_22 = PHI <ivtmp.10_24(2), ivtmp.10_23(4)>
  count_25 = (int) ivtmp.10_22;
  if (count_25 != 0)
    goto <bb 3>;
  else
    goto <bb 6>;

No subsequent pass is recognizing that the unsigned-to-signed conversion is useless and count is otherwise dead. If I change the parameter count to have type unsigned int, then ivopts does the obvious replacement itself:

<bb 5>:
  # ivtmp.7_21 = PHI <0(2), ivtmp.7_20(4)>
  # ivtmp.10_22 = PHI <count_7(D)(2), ivtmp.10_23(4)>
  if (ivtmp.10_22 != 0)
    goto <bb 3>;
  else
    goto <bb 6>;

Then count is completely gone from the loop after ivopts and the resulting code looks good.

So, fix this somewhere inside ivopts to make the signed case produce the same code as the unsigned one? Or tell it not to replace count at all if it has to do a type conversion? I'm still trying to find my way around the code for this pass to figure out where things happen, so if this is obvious to someone else I'd appreciate a pointer. :-)

--
sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42505
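To make the signed/unsigned distinction concrete, here is a hypothetical pair of loops of the general kind being discussed. The test case attached to the PR is not reproduced here; `copy_signed`, `copy_unsigned`, and their parameters are invented for illustration, and there is no claim that this exact code triggers the behavior described. The two functions differ only in the type of `count`, which is the one change the comment says lets ivopts drop the conversion.

```c
#include <assert.h>

/* Signed counter: the variant where, per the comment above, ivopts
   keeps an unsigned induction variable and converts it back to int
   for the exit test.  */
void
copy_signed (char *dst, const char *src, int count)
{
  assert (count >= 0);
  for (; count != 0; count--)
    *dst++ = *src++;
}

/* Unsigned counter: the variant where the exit test can use the
   induction variable directly, with no conversion.  */
void
copy_unsigned (char *dst, const char *src, unsigned int count)
{
  for (; count != 0; count--)
    *dst++ = *src++;
}
```

For non-negative counts the two functions behave identically, so any difference in the generated loop comes purely from how the optimizer handles the counter's type.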
[Bug tree-optimization/39874] [4.4 regression] missing VRP (submission)
--- Comment #4 from sandra at codesourcery dot com 2010-06-01 02:22 --- Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg1.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39874
[Bug middle-end/28685] Multiple comparisons are not simplified
--- Comment #15 from sandra at codesourcery dot com 2010-06-01 02:24 --- Proposed patch for PR 39874/comment #5 posted here: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg1.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28685
[Bug tree-optimization/39874] [4.4 regression] missing VRP (submission)
--- Comment #3 from sandra at codesourcery dot com 2010-05-24 13:08 --- I'm testing a fix for this (better comparison combination logic in the ifconvert pass).

--
sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39874
[Bug middle-end/28685] Multiple comparisons are not simplified
--- Comment #13 from sandra at codesourcery dot com 2010-05-24 13:21 --- I'm working on a patch that fixes the test case in comment #5 (originally filed as PR 39874) and some other test cases by improving the comparison combination logic in both tree-ssa-ifcombine and tree-ssa-reassoc. The test case in comment #4 is a somewhat different problem -- maybe it is a VRP failure? The problem is figuring out the right place to attempt to combine the comparisons. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28685
[Bug middle-end/28685] Multiple comparisons are not simplified
--- Comment #11 from sandra at codesourcery dot com 2010-05-08 03:43 --- I've posted the patch to fix the first testcase here: http://gcc.gnu.org/ml/gcc-patches/2010-05/msg00564.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28685
[Bug middle-end/28685] Multiple comparisons are not simplified
--- Comment #10 from sandra at codesourcery dot com 2010-05-07 02:32 --- I've been working on a patch that fixes the original reported problem by adding a little logic to tree-ssa-reassoc.c to make it look for places where it can use combine_comparisons. Note that this test case does not involve an if or require any particular CFA, just straightforward expression simplification. My sense is that the test cases that do involve ifs and/or require flow analysis are in fact different bugs that require different fixes. (In fact, 28691 looks more like an RTL-level optimization to me, maybe even backend-specific.) So, is it really useful to lump them all together as duplicates for tracking purposes? Or am I totally barking up the wrong tree here?

--
sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandra at codesourcery dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28685
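As a rough illustration of what combining two comparisons on the same operands means, the sketch below encodes each comparison as a bitmask of the outcomes it accepts, so that OR-ing or AND-ing two comparisons is a single bitwise operation. This is only a toy model for integer operands: the `enum cmp` encoding and the helpers `combine_or`, `combine_and`, and `eval_cmp` are assumptions made for this example, not GCC's combine_comparisons, which among other things also has to cope with unordered floating-point results.

```c
#include <assert.h>

/* Hypothetical encoding: bit 0 accepts "less", bit 1 accepts "equal",
   bit 2 accepts "greater".  */
enum cmp
{
  CMP_FALSE = 0,
  CMP_LT = 1,
  CMP_EQ = 2,
  CMP_LE = 3,
  CMP_GT = 4,
  CMP_NE = 5,
  CMP_GE = 6,
  CMP_TRUE = 7
};

/* (x A y) || (x B y) accepts the union of the outcomes.  */
static enum cmp
combine_or (enum cmp a, enum cmp b)
{
  return (enum cmp) (a | b);
}

/* (x A y) && (x B y) accepts the intersection of the outcomes.  */
static enum cmp
combine_and (enum cmp a, enum cmp b)
{
  return (enum cmp) (a & b);
}

/* Evaluate comparison C on integers X and Y.  */
static int
eval_cmp (enum cmp c, int x, int y)
{
  int outcome = x < y ? CMP_LT : x == y ? CMP_EQ : CMP_GT;

  assert (c >= CMP_FALSE && c <= CMP_TRUE);
  return (c & outcome) != 0;
}
```

Under this encoding, `(a < b) || (a == b)` collapses to `combine_or (CMP_LT, CMP_EQ)`, which is `CMP_LE`, and `(a <= b) && (a >= b)` collapses to `CMP_EQ`. No control flow is involved, which matches the point above that this is plain expression simplification.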
[Bug target/36223] IV-opt is not optimal for mips
--- Comment #6 from sandra at codesourcery dot com 2009-08-24 22:36 ---
This bug appears to be fixed in mainline HEAD now. Here's an excerpt showing the generated code for the inner loop in the example program now:

        addiu   $21,$28,%gp_rel(AA)
        addiu   $10,$28,%gp_rel(A)
        addiu   $20,$28,%gp_rel(BB)
        addiu   $9,$28,%gp_rel(B)
        li      $19,2044                # 0x7fc
        li      $18,10                  # 0xa
        move    $2,$0
.L3:
        addu    $8,$10,$2
        addu    $3,$9,$2
        lw      $24,0($8)
        addu    $14,$21,$2
        lw      $8,0($3)
        addu    $3,$20,$2
        addiu   $2,$2,4
        sw      $24,0($14)
        bne     $2,$19,.L3
        sw      $8,0($3)

All 4 gp_rel address computations pulled outside the loop, and only 5 adds inside. I'm not sure what fixed this, but it does seem fixed.

--
sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WORKSFORME

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36223
[Bug tree-optimization/39604] [4.3/4.4/4.5 Regression] tree-ssa-sink breaks stack layout
--- Comment #9 from sandra at codesourcery dot com 2009-04-03 12:54 --- After the merge of the alias_improvements branch to trunk, the test case no longer compiles incorrectly at -O1. Is this coincidence, or a real fix that addresses the underlying problem? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39604
[Bug middle-end/39604] New: tree-ssa-sink breaks stack layout
As reported in this thread:

http://gcc.gnu.org/ml/gcc-patches/2009-03/msg01798.html

This problem was reported by an ARM user and reproduced on arm-none-eabi, but is not target-specific.

If the attached test program is compiled with -O1, it fails by incorrectly calling a pure virtual method. What is happening is that tree-ssa-sink is moving code from the inlined destructor for STUFF, in the first nested block, into the second nested block, where it ends up after the code for the inlined constructor for STUFF2. Then, cfgexpand comes along and decides that STUFF and STUFF2 can share stack space because they are in disjoint lexical blocks. Thus the sunk destructor statement for STUFF ends up trashing the vtable of STUFF2. The test program appears to work correctly at -O2 only because -fstrict-aliasing prevents cfgexpand from assigning STUFF and STUFF2 to the same stack offset.

Per further discussion in the thread above, cfgexpand's stack layout should not be using lexical block scoping information to determine when stack variables may share storage, as GIMPLE lowering removes lexical scopes and promotes all locals to function scope, and subsequent middle-end optimizations do not preserve the lexical block structure. Since stack variable sharing is an important optimization for some applications, some other form of lifetime analysis is needed.

Apparently PR middle-end/32327 was another incarnation of this same problem, but was closed without really addressing it.

--
           Summary: tree-ssa-sink breaks stack layout
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: sandra at codesourcery dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: arm-none-eabi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39604
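The failure mode can be modelled in a few lines. The sketch below is a toy, not what cfgexpand does or should do: it represents each variable's lifetime as a range of statement indices (the `struct live_range` type and the helpers `ranges_overlap`, `can_share_slot`, and `extend_to` are all hypothetical) and allows two variables to share a stack slot only if those ranges are disjoint. Sinking a store to a later statement extends the first variable's range, which is exactly the information that lexical block scoping fails to capture.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical live range: the first and last statement index at
   which a stack variable is read or written.  */
struct live_range
{
  int first;
  int last;
};

/* True if the two ranges have at least one statement in common.  */
static bool
ranges_overlap (struct live_range a, struct live_range b)
{
  assert (a.first <= a.last && b.first <= b.last);
  return a.first <= b.last && b.first <= a.last;
}

/* Two variables may share a stack slot only if they are never live
   at the same time.  */
static bool
can_share_slot (struct live_range a, struct live_range b)
{
  return !ranges_overlap (a, b);
}

/* Model code motion: after an access to the variable is moved to
   statement STMT, the range must grow to cover it.  */
static struct live_range
extend_to (struct live_range r, int stmt)
{
  if (stmt < r.first)
    r.first = stmt;
  if (stmt > r.last)
    r.last = stmt;
  return r;
}
```

For example, if STUFF covers statements 0 to 4 and STUFF2 covers 5 to 9, sharing is safe. Once a destructor store for STUFF is sunk to statement 7, its range becomes 0 to 7 and sharing must be refused, even though the two variables still sit in disjoint lexical blocks in the source.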
[Bug middle-end/39604] tree-ssa-sink breaks stack layout
--- Comment #1 from sandra at codesourcery dot com 2009-03-31 22:37 --- Created an attachment (id=17573) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17573&action=view) C++ test case sink-1.C -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39604
[Bug target/36223] IV-opt is not optimal for mips
--- Comment #5 from sandra at codesourcery dot com 2008-06-30 02:05 --- Maybe I'm just being clueless here, but I don't understand why this bug was re-categorized. In my original analysis, I traced the bad code directly to the RA pass un-doing the results of previous optimizations. Andrew, if you think it is going wrong somewhere else, can you provide more details as to where, and what code you think ought to be coming out of that pass that isn't? I could perhaps chew on this some more if I knew what to look for.

--
sandra at codesourcery dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36223
[Bug rtl-optimization/36223] New: bad interaction between PRE/register allocation/reload
This is a missed-optimization bug. The following reduced test case illustrates the problem. It doesn't do anything useful, but just compile it with

  mipsisa32r2-elfoabi-gcc -S -mtune=24kc -G4096 -O2 example4.c

#define N 511
#define M 9

long A[N];
long B[N];
long AA[N];
long BB[N];
long tA;
long tB;

void foo (unsigned iterations)
{
  unsigned loop_cnt;
  static long *aLow;
  static long *bLow;
  static long *aHi;
  static long *bHi;
  static long n1;
  static long n2;
  static long l;
  static long i;
  static long j;
  static long k;

  for (loop_cnt = 0; loop_cnt < iterations; loop_cnt ++)
    {
      /* This is the loop we're interested in. */
      for (i = 0; i < N; i ++)
        {
          AA[i] = A[i];
          BB[i] = B[i];
        }

      /* The rest of this stuff is just here to add some context to
         the outer loop. */
      for (k = 1; k <= M; k++)
        {
          n1 = 1 << k;
          n2 = n1 >> 1;
          for (j = 0; j < n2; j++)
            {
              for (i = j; i < N; i += n1)
                {
                  l = i + n2;
                  aLow = &A[l];
                  bLow = &B[l];
                  aHi = &A[i];
                  bHi = &B[i];
                  A[l] = *aHi - tA;
                  B[l] = *bHi - tB;
                  A[i] += tA;
                  B[i] += tB;
                }
            }
        }
    }
}

The -G option forces the global variables to use GP-relative addressing, which involves an extra addition. Thus the first nested loop should be optimized as if it were written:

{
  long *t1 = AA;
  long *t2 = A;
  long *t3 = BB;
  long *t4 = B;
  for (i = 0; i < N; i++)
    {
      *t1 = *t2;
      *t3 = *t4;
      t1++;
      t2++;
      t3++;
      t4++;
    }
}

In 4.3.1, though, it is producing code with GP-relative addressing inside the loop, so that the loop body has 9 adds instead of 5. Mainline head does a better job and at least pulls out the references to A and B (which also appear in the second nested loop).

PRE is working fine, and pulling the invariant GP-relative addressing of all four variables all the way out of the outer loop. However, this means the lifetimes of the corresponding pseudo-registers span the entire outer loop, and the register allocator is (correctly) giving priority to the more localized pseudos in the more deeply nested loops that follow.
Having failed to allocate a hardware register to span the entire lifetime of the pseudos, reload stupidly re-inserts the previously hoisted GP-relative address computation at the point of reference, inside the first nested loop.

I think what is needed is more smarts to make it understand that it should try allocating a register just around the inner loop if it can't get one for the entire outer loop, before giving up. Any thoughts on where the best place for this to happen would be? Can this be done entirely within the register allocator or do we need another pass to identify places where we can potentially shorten the lifetimes of pseudos?

While this example is specific to MIPS with the GP-relative addressing, I can see that the underlying PRE/register allocation conflict is a more general problem that probably crops up in lots of other code with similar structure of outer-loop-containing-multiple-inner-loops.

-Sandra

--
           Summary: bad interaction between PRE/register allocation/reload
           Product: gcc
           Version: 4.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: sandra at codesourcery dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: mipsisa32r2-elfoabi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36223
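For readers who want to experiment, the two forms of the first nested loop can be put side by side as below. This is a standalone sketch, not part of the original test case: the function names `copy_indexed` and `copy_hoisted` and the `check` helper are invented here, and the loop counter is a plain local rather than the static variable used in the report. One plausible reading of the "9 adds instead of 5" figure is four GP-relative base computations plus four base-plus-index additions plus one counter increment, versus only the last five once the bases are hoisted, but that breakdown is an inference rather than something stated in the report.

```c
#include <assert.h>
#include <string.h>

#define N 511

long A[N];
long B[N];
long AA[N];
long BB[N];

/* The loop as written in the test case: every access names a global
   array, so each address depends on the array's base address.  */
void
copy_indexed (void)
{
  for (long i = 0; i < N; i++)
    {
      AA[i] = A[i];
      BB[i] = B[i];
    }
}

/* The desired form: the four base addresses are computed once,
   outside the loop, and only the pointers advance inside it.  */
void
copy_hoisted (void)
{
  long *t1 = AA;
  long *t2 = A;
  long *t3 = BB;
  long *t4 = B;

  for (long i = 0; i < N; i++)
    {
      *t1++ = *t2++;
      *t3++ = *t4++;
    }
}

/* Fill the sources, clear the destinations, run one variant, and
   confirm that both destination arrays match their sources.  */
static void
check (void (*copy) (void))
{
  for (long i = 0; i < N; i++)
    {
      A[i] = i;
      B[i] = -i;
    }
  memset (AA, 0, sizeof AA);
  memset (BB, 0, sizeof BB);
  copy ();
  assert (memcmp (AA, A, sizeof A) == 0);
  assert (memcmp (BB, B, sizeof B) == 0);
}
```

Calling `check (copy_indexed)` and `check (copy_hoisted)` confirms the two forms are equivalent; compiling each with the command line from the report and comparing the assembly would show whether the hoisting survives register allocation on a given compiler version.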
[Bug target/36223] IV-opt is not optimal for mips
--- Comment #3 from sandra at codesourcery dot com 2008-05-12 19:10 --- One other tidbit: the MIPS SDE 3.4.4-based toolchain produced the desired code for this test case. It's really a 4.* regression, not an enhancement. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36223