[Bug debug/90586] New: [gdb] gdb wrongly set the breakpoint as expected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90586

Bug ID: 90586
Summary: [gdb] gdb wrongly set the breakpoint as expected
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: debug
Assignee: unassigned at gcc dot gnu.org
Reporter: yangyibiao at nju dot edu.cn
Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/10.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --enable-languages=c,c++ --disable-multilib --prefix=/usr/local/gcc-trunk
Thread model: posix
gcc version 10.0.0 20190517 (experimental) (GCC)

$ gdb -v
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.

$ cat small.c
int c()
{
  int b = 1;
f:
  if (b) {
    short g[1];
    for (; b < 0;) {
      goto f;
      return 0; // line 9
    }
    return 0;
  } else
    ;
  return 0;
}

void main() { c(); }

$ gcc -O0 -g small.c; gdb -batch -x cmds a.out
Breakpoint 1 at 0x40049b: file small.c, line 9.

Breakpoint 1, c () at small.c:11
11          return 0;
g = {64}
b = 1
Kill the program being debugged? (y or n) [answered Y; input not from terminal]

$ cat cmds
b 9
r
info locals
kill
q

The breakpoint is set at line 9 ("b 9" in cmds). Line 9 is never executed, so the expected behavior is for the program to exit normally. However, gdb stops at line 11, where no breakpoint was set. I suspect this is a bug in gdb.
[Bug tree-optimization/89479] __restrict on a pointer ignored when a function is passed alongside it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89479

Eric Gallager changed:

What     |Removed |Added
CC       |        |egallager at gcc dot gnu.org
See Also |        |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81009

--- Comment #8 from Eric Gallager ---
(In reply to Eyal Rozenberg from comment #2)
> (In reply to Marc Glisse from comment #1)
> > Seems similar enough.
>
> With respect - this is not about x being a const __restrict pointer; what I
> said (including the clang behavior) applies exactly the same when we remove
> the const. See: https://godbolt.org/z/hH643a (where the const is gone).

OK, but even if it's not a dup, I still think it's related enough to go under "See Also"
[Bug libgomp/90585] libgomp hsa plugin ftbfs in the x32 multilib variant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90585 --- Comment #1 from Matthias Klose --- looks like libgomp/configure.ac always sets -Werror, not respecting the --disable-werror configure option.
[Bug c/88144] remove long-obsolete syntax for designated initializers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88144

--- Comment #6 from Eric Gallager ---
(In reply to Jonathan Wakely from comment #3)
> Maybe -Wdeprecated or -Wdeprecated-declarations

I think clang puts this under -Wgnu-designator:
https://clang.llvm.org/docs/DiagnosticsReference.html#wgnu-designator

Just brainstorming an options entry:

Wgnu-designator
C ObjC C++ ObjC++ Warning Var(warn_gnu_designator) LangEnabledBy(C ObjC C++ ObjC++,Wall || Wextra || Wpedantic || Wdeprecated || Wdeprecated-declarations || Wdesignated-init)
Warn on use of obsolete GNU syntax for designated initializers.
[Bug target/90547] [8/9/10 Regression] ICE in gen_lowpart_general, at rtlhooks.c:63
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90547

--- Comment #5 from uros at gcc dot gnu.org ---
Author: uros
Date: Thu May 23 04:55:40 2019
New Revision: 271537

URL: https://gcc.gnu.org/viewcvs?rev=271537=gcc=rev
Log:
    Backported from mainline
    2019-05-21  Uroš Bizjak

    * config/i386/cpuid.h (__cpuid): For 32bit targets, zero
    %ebx and %ecx before calling cpuid with leaf 1 or
    non-constant leaf argument.

    2019-05-21  Uroš Bizjak

    PR target/90547
    * config/i386/i386.md (anddi_1 to andsi_1_zext splitter):
    Avoid calling gen_lowpart with CONST operand.

testsuite/ChangeLog:

    Backported from mainline
    2019-05-21  Uroš Bizjak

    PR target/90547
    * gcc.target/i386/pr90547.c: New test.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr90547.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/i386/cpuid.h
    branches/gcc-7-branch/gcc/config/i386/i386.md
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
[Bug target/90547] [8/9/10 Regression] ICE in gen_lowpart_general, at rtlhooks.c:63
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90547 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Uroš Bizjak --- Fixed everywhere.
[Bug libgomp/90585] New: libgomp hsa plugin ftbfs in the x32 multilib variant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90585

Bug ID: 90585
Summary: libgomp hsa plugin ftbfs in the x32 multilib variant
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libgomp
Assignee: unassigned at gcc dot gnu.org
Reporter: doko at debian dot org
CC: jakub at gcc dot gnu.org
Target Milestone: ---

seen when configuring with --enable-offload-targets=nvptx-none,hsa --with-multilib-list=m32,m64,mx32

libtool: compile: /home/packages/gcc/9/gcc-9-9.1.0/build/./gcc/xgcc -B/home/packages/gcc/9/gcc-9-9.1.0/build/./gcc/ -B/usr/x86_64-linux-gnu/bin/ -B/usr/x86_64-linux-gnu/lib/ -isystem /usr/x86_64-linux-gnu/include -isystem /usr/x86_64-linux-gnu/sys-include -isystem /home/packages/gcc/9/gcc-9-9.1.0/build/sys-include -DHAVE_CONFIG_H -I. -I../../../../src/libgomp -I../../../../src/libgomp/config/linux/x86 -I../../../../src/libgomp/config/linux -I../../../../src/libgomp/config/posix -I../../../../src/libgomp -I../../../../src/libgomp/../include -D_GNU_SOURCE -Wall -Werror -ftls-model=initial-exec -pthread -DUSING_INITIAL_EXEC_TLS -g -O2 -mx32 -MT libgomp_plugin_hsa_la-plugin-hsa.lo -MD -MP -MF .deps/libgomp_plugin_hsa_la-plugin-hsa.Tpo -c ../../../../src/libgomp/plugin/plugin-hsa.c -fPIC -DPIC -o .libs/libgomp_plugin_hsa_la-plugin-hsa.o
../../../../src/libgomp/plugin/plugin-hsa.c: In function 'release_kernel_dispatch':
../../../../src/libgomp/plugin/plugin-hsa.c:1158:22: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
 1158 |              shadow->debug, (void *) shadow->debug);
      |                             ^
../../../../src/libgomp/plugin/plugin-hsa.c:261:19: note: in definition of macro 'HSA_LOG'
  261 |         fprintf (stderr, __VA_ARGS__); \
      |                          ^~~
../../../../src/libgomp/plugin/plugin-hsa.c:1157:3: note: in expansion of macro 'HSA_DEBUG'
 1157 |   HSA_DEBUG ("Released kernel dispatch: %p has value: %lu (%p)\n", shadow,
      |   ^
../../../../src/libgomp/plugin/plugin-hsa.c:1157:14: error: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'uint64_t' {aka 'long long unsigned int'} [-Werror=format=]
 1157 |   HSA_DEBUG ("Released kernel dispatch: %p has value: %lu (%p)\n", shadow,
      |              ^~~~
 1158 |              shadow->debug, (void *) shadow->debug);
      |              ~
      |              |
      |              uint64_t {aka long long unsigned int}
../../../../src/libgomp/plugin/plugin-hsa.c:261:19: note: in definition of macro 'HSA_LOG'
  261 |         fprintf (stderr, __VA_ARGS__); \
      |                          ^~~
../../../../src/libgomp/plugin/plugin-hsa.c:1157:3: note: in expansion of macro 'HSA_DEBUG'
 1157 |   HSA_DEBUG ("Released kernel dispatch: %p has value: %lu (%p)\n", shadow,
      |   ^
../../../../src/libgomp/plugin/plugin-hsa.c:1157:57: note: format string is defined here
 1157 |   HSA_DEBUG ("Released kernel dispatch: %p has value: %lu (%p)\n", shadow,
      |                                                       ~~^
      |                                                         |
      |                                                         long unsigned int
      |                                                       %llu
../../../../src/libgomp/plugin/plugin-hsa.c: In function 'print_kernel_dispatch':
../../../../src/libgomp/plugin/plugin-hsa.c:1279:31: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'uint64_t' {aka 'long long unsigned int'} [-Werror=format=]
 1279 |   fprintf (stderr, "object: %lu\n", dispatch->object);
      |                             ~~^
      |                               |             |
      |                               |             uint64_t {aka long long unsigned int}
      |                               long unsigned int
      |                             %llu
../../../../src/libgomp/plugin/plugin-hsa.c:1281:31: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'uint64_t' {aka 'long long unsigned int'} [-Werror=format=]
 1281 |   fprintf (stderr, "signal: %lu\n", dispatch->signal);
      |                             ~~^
      |                               |             |
      |                               |             uint64_t {aka long long unsigned int}
      |                               long unsigned int
      |                             %llu
../../../../src/libgomp/plugin/plugin-hsa.c:1289:44: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'uint64_t' {aka 'long long unsigned int'} [-Werror=format=]
 1289 |   fprintf (stderr, "children dispatches: %lu\n",
      |                                          ~~^
      |
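The diagnostics above come from printing a uint64_t with %lu and casting it straight to a pointer on a target where both long and pointers are 32 bits wide. The following is only a sketch of one portable way to write such a call, not the fix applied to plugin-hsa.c; the helper names format_debug_value and check_format_debug_value are made up for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libgomp): formats a 64-bit value as a
   decimal number and as a pointer.  */
int
format_debug_value (char *buf, size_t size, uint64_t value)
{
  /* PRIu64 expands to the conversion that matches uint64_t on the current
     ABI, so -Wformat stays quiet on both LP64 and ILP32 targets.  Going
     through uintptr_t makes the possible narrowing to a 32-bit pointer
     explicit, which avoids -Wint-to-pointer-cast.  */
  return snprintf (buf, size, "value: %" PRIu64 " (%p)",
                   value, (void *) (uintptr_t) value);
}

/* Small self-check: the decimal part does not depend on the ABI.  */
void
check_format_debug_value (void)
{
  char buf[64];
  int n = format_debug_value (buf, sizeof buf, 42);
  assert (n > 0 && (size_t) n < sizeof buf);
  assert (strncmp (buf, "value: 42 (", 11) == 0);
}
```

The %p rendering is implementation-defined, so the self-check only looks at the decimal prefix.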
[Bug c++/78388] Bogus "declaration shadows template parameter" error with parenthesized function-style casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78388 Eric Gallager changed: What|Removed |Added CC||jason at gcc dot gnu.org, ||nathan at gcc dot gnu.org --- Comment #2 from Eric Gallager --- cc-ing C++ FE maintainers
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #9 from Qi Feng ---
And there's another problem. Take `x > y && x != 0 --> x > y' for example; I would also like to do

    x < y && y != 0 --> x < y
    x != 0 && x > y --> x > y
    y != 0 && x < y --> x < y

If the assumption that the constant always comes in as the second operand is incorrect, these would have to be doubled.

I tried to add :c to truth_andif, but got the `operation is not commutative' error. I also tried to make truth_andif commutative by modifying genmatch.c, but again, I don't know it well and am afraid that I would break something.

The patterns I wrote look like:

/* x > y && x != 0 --> x > y
   Only for unsigned x and y.  */
(simplify
 (truth_andif:c (gt@2 @0 @1) (ne @0 integer_zerop))
 (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0))
      && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED (TREE_TYPE(@1)))
  @2))

I have to write four of these, with minor modifications, for a single transformation. If there's a better way to do it, please do leave a comment.
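As a sanity check on the four rewrites listed in this comment, here is a brute-force sketch in plain C. It has nothing to do with how match.pd or genmatch implement the patterns; the function names are invented for this example, and the range 0..255 is an arbitrary choice that happens to include the interesting value 0.

```c
#include <assert.h>

/* Counts (x, y) pairs in 0..255 for which any of the four rewrites
   changes the result when x and y are unsigned.  */
unsigned
count_unsigned_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        bad += ((x > y && x != 0) != (x > y));
        bad += ((x < y && y != 0) != (x < y));
        bad += ((x != 0 && x > y) != (x > y));
        bad += ((y != 0 && x < y) != (x < y));
      }
  return bad;
}

/* Returns 1 when the first rewrite would change the result for signed
   operands, which is why the pattern is restricted to unsigned types.  */
int
rewrite_differs_signed (int x, int y)
{
  return (x > y && x != 0) != (x > y);
}

void
check_rewrites (void)
{
  /* For unsigned operands, x > y already implies x != 0.  */
  assert (count_unsigned_mismatches () == 0);
  /* For signed operands, 0 > -1 holds although x == 0.  */
  assert (rewrite_differs_signed (0, -1) == 1);
}
```

The signed counterexample (x = 0, y = -1) shows why the TYPE_UNSIGNED checks in the pattern are needed.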
[Bug libstdc++/90415] std::is_copy_constructible> is incomplete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90415 --- Comment #3 from Rafael Avila de Espindola --- I see now that the corresponding commit on trunk was 31011b9a94fed33170c009292e82558336d1c4d7 (r261146). At that revision, the test in this bug passes. There was a more recent regression on trunk on revision a9b768f8f4fd471e315623b23c4f9e83463bf92e (r270433).
[Bug debug/90584] New: [gdb] gdb is not stopped at a breakpoint in an executed line of code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90584

Bug ID: 90584
Summary: [gdb] gdb is not stopped at a breakpoint in an executed line of code
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: debug
Assignee: unassigned at gcc dot gnu.org
Reporter: yangyibiao at nju dot edu.cn
Target Milestone: ---

$ gcc --version
gcc (GCC) 10.0.0 20190517 (experimental)
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gdb --version
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.

$ cat small.c
#include <stdio.h>
int main()
{
  int i = 0;
  int j = 0;
  for (; i<=1; i++) {
    for (; j<=1; j++) {
      goto lbl;
    }
  }
lbl: // line 11
  printf("hello\n");
  return 0;
}

$ gcc -O0 -g small.c; ./a.out
hello

$ gdb -batch -x cmds a.out
Breakpoint 1 at 0x40051a: file small.c, line 11.
hello
[Inferior 1 (process 2774) exited normally]
cmds:3: Error in sourced command file:
No frame selected.

$ cat cmds
b 11
r
info locals
kill
q

According to the program output, line 11 is executed. So with a breakpoint set at line 11, gdb should stop there and print the locals. Instead, the program runs to completion and exits.
[Bug c++/90462] Internal compiler error with deprecated-copy and json diagnostics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90462 --- Comment #4 from David Malcolm --- r271535 should fix the ICE on trunk, but it doesn't fix the missing "finish" location for the warning described in comment #2.
[Bug c++/90583] New: Implement DR 1722, lambda to function pointer conversion should be noexcept
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90583

Bug ID: 90583
Summary: Implement DR 1722, lambda to function pointer conversion should be noexcept
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: mpolacek at gcc dot gnu.org
Target Milestone: ---

Cf. http://wg21.link/cwg1722

void
foo ()
{
  auto l = [](int){ return 42; };
  static_assert(noexcept((int (*)(int))(l)), "");
}
[Bug c++/90462] Internal compiler error with deprecated-copy and json diagnostics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90462 --- Comment #3 from David Malcolm --- Author: dmalcolm Date: Thu May 23 00:42:03 2019 New Revision: 271535 URL: https://gcc.gnu.org/viewcvs?rev=271535=gcc=rev Log: Bulletproof -fdiagnostics-format=json against bad locations (PR c++/90462) PR c++/90462 reports an ICE with -fdiagnostics-format=json when attempting to serialize a malformed location to JSON. The compound location_t in question has meaningful "caret" and "start" locations, but has UNKNOWN_LOCATION for its "finish" location, leading to a NULL pointer dereference when attempting to build a JSON string for the filename. This patch bulletproofs the JSON output so that attempts to write a JSON object for a location with a NULL file will lead to an object with no "file" key, and attempts to write a compound location with UNKNOWN_LOCATION for its start or finish will lead to the corresponding JSON child object being omitted. This patch also adds a json::object::get member function, for self-testing the above. gcc/ChangeLog: PR c++/90462 * diagnostic-format-json.cc: Include "selftest.h". (json_from_expanded_location): Only add "file" key for non-NULL file strings. (json_from_location_range): Don't add "start" and "finish" children if they are UNKNOWN_LOCATION. (selftest::test_unknown_location): New selftest. (selftest::test_bad_endpoints): New selftest. (selftest::diagnostic_format_json_cc_tests): New function. * json.cc (json::object::get): New function. (selftest::test_object_get): New selftest. (selftest::json_cc_tests): Call it. * json.h (json::object::get): New decl. * selftest-run-tests.c (selftest::run_tests): Call selftest::diagnostic_format_json_cc_tests. * selftest.h (selftest::diagnostic_format_json_cc_tests): New decl. gcc/testsuite/ChangeLog: PR c++/90462 * g++.dg/pr90462.C: New test. 
Added: trunk/gcc/testsuite/g++.dg/pr90462.C Modified: trunk/gcc/ChangeLog trunk/gcc/diagnostic-format-json.cc trunk/gcc/json.cc trunk/gcc/json.h trunk/gcc/selftest-run-tests.c trunk/gcc/selftest.h trunk/gcc/testsuite/ChangeLog
[Bug fortran/90536] Spurious (?) warning when using -Wconversion with -fno-range-check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90536

--- Comment #14 from Steve Kargl ---
On Wed, May 22, 2019 at 11:21:52PM +, j.ravens.nz at gmail dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90536
>
> --- Comment #13 from Jonathan Ravens ---
> Thanks everyone for your input on this issue. I hadn't realised that
> it could cause such dissent.

There is no dissent.

> As a software developer, my major driver is to manage the users'
> expectations.
> In that respect, declaring a byte and being able to set it to a valid value
> should not raise a warning, especially when an option called no-range-check is
> in use which, intuitively, would suppress range-checking errors instead of
> causing them. I suggest it might only be technically correct from a
> developer's perspective, but not from the user's.

You appear to be conflating 2 issues. Range checking has nothing to do with type conversion. In your original code, you have a BOZ of '89'X (which to be standard conforming should be written as Z'89'). This BOZ is either an INTEGER(8) or INTEGER(16) (depends on the target) because gfortran follows how Fortran 95 handles a BOZ in a DATA statement (the only place a BOZ can appear in valid Fortran 95 code). It has a value of 137. So, you now have 2 problems when you are trying to assign it to a BYTE (aka INTEGER(1)) entity:

1) It is out-of-range.
2) It has a type of INTEGER(8) or INTEGER(16).

-fno-range-check takes care of 1). -Wno-conversion takes care of 2).

Now, when you have '09'X (or correctly Z'09'), this BOZ has a value of 9, but it is still an INTEGER(8) or INTEGER(16) entity. When gfortran performs the range checking for assigning 9 to a BYTE (aka INTEGER(1)) entity, it inhibits the conversion warning because 9 is in range of a BYTE (aka INTEGER(1)). A warning isn't needed because gfortran knows there is no problem.

When you specify -Wall -fno-range-check, the only thing that gfortran knows is that you're assigning an INTEGER(8) or INTEGER(16) entity to a BYTE (aka INTEGER(1)). So, gfortran brings the potential problem to your attention. You specifically requested this behavior via the options!

> If commonly-used constructs such as BYTE are to be removed from gfortran, I'd
> expect that to require a lot of re-coding for people in general, given the
> amount of legacy Fortran code in use. In our case, I think the best option
> would be to phase out usage of gfortran.

No decisions have been made. I'll raise an RFC about deprecation of a number of mistakes in gfortran (when time permits as I am not paid to contribute to gfortran). The plan would be to issue a deprecation notice in the 10.x releases of gfortran with removal of the mistakes in 11.1. A deprecation notice cannot be suppressed by an option, so the user will see the notice every time the user compiles his/her code. So, removal won't happen for 2 or more years. If removal of a mistake such as BYTE causes you to stop using gfortran, oh well.
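The range argument in this reply can be illustrated with a small C analogy. This is only a sketch of the arithmetic, under the assumption that a signed 8-bit integer stands in for BYTE (INTEGER(1)); it says nothing about how gfortran implements its checks, and fits_in_int8 is a name invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when a wide integer value is representable
   in a signed 8-bit integer, the C stand-in used here for BYTE.  */
int
fits_in_int8 (int64_t value)
{
  return value >= INT8_MIN && value <= INT8_MAX;
}

void
check_boz_examples (void)
{
  /* Z'09' is 9, which fits, so no conversion warning is needed.  */
  assert (fits_in_int8 (0x09));
  /* Z'89' is 137, which exceeds 127 and is therefore out of range.  */
  assert (!fits_in_int8 (0x89));
}
```

With the range check switched off, the compiler no longer has the first fact available, which matches the explanation of why the conversion warning then appears for both literals.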
[Bug ipa/88231] aligned functions laid down inefficiently
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88231 --- Comment #8 from Martin Sebor --- (In reply to Martin Liška from comment #7) > Can we do such an optimization without GAS information about size of every > function? My thought was that we could use alignment alone if we didn't know the sizes of instructions on targets like i386 with variable instruction lengths, as a guesstimate, to do better than chance. On RISC targets with fixed instruction length like SPARC it should be possible to get the size just by counting instructions. I don't know this part of GCC so I have no idea what's available.
[Bug fortran/90536] Spurious (?) warning when using -Wconversion with -fno-range-check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90536 --- Comment #13 from Jonathan Ravens --- Thanks everyone for your input on this issue. I hadn't realised that it could cause such dissent. As a software developer, my major driver is to manage the users' expectations. In that respect, declaring a byte and being able to set it to a valid value should not raise a warning, especially when an option called no-range-check is in use which, intuitively, would suppress range-checking errors instead of causing them. I suggest it might only be technically correct from a developer's perspective, but not from the user's. If commonly-used constructs such as BYTE are to be removed from gfortran, I'd expect that to require a lot of re-coding for people in general, given the amount of legacy Fortran code in use. In our case, I think the best option would be to phase out usage of gfortran.
[Bug libstdc++/83237] Values returned by std::poisson_distribution are not distributed correctly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83237 Hans-Peter Nilsson changed: What|Removed |Added CC||hp at gcc dot gnu.org --- Comment #6 from Hans-Peter Nilsson --- A note for the record: (In reply to pa...@gcc.gnu.org from comment #5) > Author: paolo > Date: Sun Dec 24 22:08:52 2017 > New Revision: 255993 > > URL: https://gcc.gnu.org/viewcvs?rev=255993=gcc=rev > Log: > 2017-12-24 Michele Pezzutti > > PR libstdc++/83237 > * include/bits/random.tcc (poisson_distribution<>::operator()): > Fix __x = 1 case - see updated Errata of Devroye's treatise. > * testsuite/26_numerics/random/poisson_distribution/operators/ > values.cc: Add test. Please don't "add test" to an existing file like that, instead put it in a new file. (This method of adding a test can cause side-effects such as a timeout. Example: cris-elf which runs in a simulator, now needs >10 minutes on a "i7-4770K CPU @ 3.50GHz". I intend to split up the test, as has been done in the past.) Also a question: is there a reasonable (much) lower number combination than the "testDiscreteDist<100, 200>" in the test? Perhaps that part of the test can reasonably be disabled for simulator targets? (Also, it seems this PR should be closed as the original issue has been fixed.)
[Bug target/90582] AArch64 stack-protector wastes an instruction on address-generation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90582

--- Comment #1 from Andrew Pinski ---
> I assume EOR / CBNZ is at least as efficient as SUBS / BNE on
> all/most AArch64 microarchitectures, but someone should check.

It is similar to x86 in that respect on some cores (mostly Marvell's). That is, ThunderX, ThunderX 2, OcteonTX and OcteonTX2 all have the ability to do macro-combining of the two instructions into one micro-op.
[Bug target/90547] [8/9/10 Regression] ICE in gen_lowpart_general, at rtlhooks.c:63
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90547

--- Comment #4 from uros at gcc dot gnu.org ---
Author: uros
Date: Wed May 22 22:50:39 2019
New Revision: 271529

URL: https://gcc.gnu.org/viewcvs?rev=271529=gcc=rev
Log:
    Backported from mainline
    2019-05-21  Uroš Bizjak

    * config/i386/cpuid.h (__cpuid): For 32bit targets, zero
    %ebx and %ecx before calling cpuid with leaf 1 or
    non-constant leaf argument.

    2019-05-21  Uroš Bizjak

    PR target/90547
    * config/i386/i386.md (anddi_1 to andsi_1_zext splitter):
    Avoid calling gen_lowpart with CONST operand.

testsuite/ChangeLog:

    Backported from mainline
    2019-05-21  Uroš Bizjak

    PR target/90547
    * gcc.target/i386/pr90547.c: New test.

Added:
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr90547.c
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/config/i386/cpuid.h
    branches/gcc-8-branch/gcc/config/i386/i386.md
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569 --- Comment #6 from Jonathan Wakely --- This bug also affects 32-bit GNU/Linux with older versions of glibc.
[Bug libstdc++/90557] [9/10 Regression] Incorrect std::filesystem::path::operator=(std::filesystem::path const&) in gcc 9.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90557 Jonathan Wakely changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #4 from Jonathan Wakely --- Fixed for GCC 9.2 and trunk. Thanks for the report.
[Bug target/90582] New: AArch64 stack-protector wastes an instruction on address-generation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90582

Bug ID: 90582
Summary: AArch64 stack-protector wastes an instruction on address-generation
Product: gcc
Version: 8.2.1
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: peter at cordes dot ca
Target Milestone: ---

void protect_me() {
    volatile int buf[2];
    buf[1] = 3;
}

https://godbolt.org/z/xdlr5w AArch64 gcc8.2 -O3 -fstack-protector-strong

protect_me:
        stp     x29, x30, [sp, -32]!
        adrp    x0, __stack_chk_guard
        add     x0, x0, :lo12:__stack_chk_guard   ### this instruction
        mov     x29, sp          # frame pointer even though -fomit-frame-pointer is part of -O3. Goes away with explicit -fomit-frame-pointer
        ldr     x1, [x0]         # copy the cookie
        str     x1, [sp, 24]
        mov     x1,0             # and destroy the reg
        mov     w1, 3            # right before it's already destroyed
        str     w1, [sp, 20]     # buf[1] = 3
        ldr     x1, [sp, 24]     # canary
        ldr     x0, [x0]         # key destroys the key pointer
        eor     x0, x1, x0
        cbnz    x0, .L5
        ldp     x29, x30, [sp], 32   # FP and LR save/restore (for some reason?)
        ret
.L5:
        # can the store of the link register go here, for backtracing?
        bl      __stack_chk_fail

A function that returns a global can embed the low 12 bits of the address into the load instruction. AArch64 instructions are fixed-width, so there's no reason (AFAIK) not to do this.

f:
        adrp    x0, foo
        ldr     w0, [x0, #:lo12:foo]
        ret

I'm not an AArch64 performance expert; it's plausible that zero displacements are worth spending an extra instruction on for addresses that are used twice, but unlikely.

So we should be doing

        adrp    x0, __stack_chk_guard
        ldr     x1, [x0, #:lo12:__stack_chk_guard]   # in prologue to copy cookie
        ...
        ldr     x0, [x0, #:lo12:__stack_chk_guard]   # in epilogue to check cookie

This also avoids leaving an exact pointer right to __stack_chk_guard in a register, in case a vulnerable callee or code in the function body can be tricked into dereferencing it and leaking the cookie. (In non-leaf functions, we generate the pointer in a call-preserved register like x19, so yes it will be floating around in a register for callees).

I'd hate to suggest destroying the pointer when copying to the stack, because that would require another adrp later.

Finding a gadget that has exactly the right offset (the low 12 bits of __stack_chk_guard's address) is a lot less likely than finding an ldr from [x0]. Of course this will introduce a lot of LDR instructions with an #:lo12:__stack_chk_guard offset, but hopefully they won't be part of useful gadgets because they lead to writing the stack, or to EOR/CBNZ to __stack_chk_fail.

I don't see a way to optimize canary^key == 0 any further, unlike x86-64 PR 90568. I assume EOR / CBNZ is at least as efficient as SUBS / BNE on all/most AArch64 microarchitectures, but someone should check.

-O3 includes -fomit-frame-pointer according to -fverbose-asm, but functions protected with -fstack-protector-strong still get a frame pointer in x29 (costing a MOV x29, sp instruction, and save/restore with STP/LDP along with x30.)

However, explicitly using -fomit-frame-pointer stops that from happening. Is that a separate bug, or am I missing something?

Without stack-protector, the function is vastly simpler

protect_me:
        sub     sp, sp, #16
        mov     w0, 3
        str     w0, [sp, 12]
        add     sp, sp, 16
        ret

Does stack-protector really need to spill/reload x29/x30 (FP and LR)? Bouncing the return address through memory seems inefficient, even though branch prediction does hide that latency.

Is that just so __stack_chk_fail can backtrace? Can we move the store of the link register into the __stack_chk_fail branch, off the fast path?

Or if we do unconditionally store x30 (the link register), at least don't bother reloading it in a leaf function if register allocation didn't need to clobber it. Unlike x86-64, the return address can't be attacked with buffer overflows if it stays safe in a register the whole function.

Obviously my test-case with a volatile array and no inputs at all is making -fstack-protector-strong look dumb by protecting a perfectly safe function. IDK how common it is to have leaf functions with arrays or structs that just use them for some computation on function args or globals and then return, maybe after copying the array back to somewhere else. A sort function might use a tmp
[Bug libstdc++/90557] [9/10 Regression] Incorrect std::filesystem::path::operator=(std::filesystem::path const&) in gcc 9.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90557 --- Comment #3 from Jonathan Wakely --- Author: redi Date: Wed May 22 22:36:21 2019 New Revision: 271528 URL: https://gcc.gnu.org/viewcvs?rev=271528=gcc=rev Log: PR libstdc++/90557 fix path assignment that alters source Backport from mainline 2019-05-22 Jonathan Wakely PR libstdc++/90557 * src/c++17/fs_path.cc (path::_List::operator=(const _List&)): Fix reversed arguments to uninitialized_copy_n. * testsuite/27_io/filesystem/path/assign/copy.cc: Check that source is unchanged by copy assignment. * testsuite/util/testsuite_fs.h (compare_paths): Use std::equal to compare path components. Modified: branches/gcc-9-branch/libstdc++-v3/ChangeLog branches/gcc-9-branch/libstdc++-v3/src/c++17/fs_path.cc branches/gcc-9-branch/libstdc++-v3/testsuite/27_io/filesystem/path/assign/copy.cc branches/gcc-9-branch/libstdc++-v3/testsuite/util/testsuite_fs.h
[Bug libstdc++/90557] [9/10 Regression] Incorrect std::filesystem::path::operator=(std::filesystem::path const&) in gcc 9.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90557 --- Comment #2 from Jonathan Wakely --- Author: redi Date: Wed May 22 22:14:34 2019 New Revision: 271527 URL: https://gcc.gnu.org/viewcvs?rev=271527=gcc=rev Log: PR libstdc++/90557 fix path assignment that alters source PR libstdc++/90557 * src/c++17/fs_path.cc (path::_List::operator=(const _List&)): Fix reversed arguments to uninitialized_copy_n. * testsuite/27_io/filesystem/path/assign/copy.cc: Check that source is unchanged by copy assignment. * testsuite/util/testsuite_fs.h (compare_paths): Use std::equal to compare path components. Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/src/c++17/fs_path.cc trunk/libstdc++-v3/testsuite/27_io/filesystem/path/assign/copy.cc trunk/libstdc++-v3/testsuite/util/testsuite_fs.h
[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577

--- Comment #16 from dave.anglin at bell dot net ---
On 2019-05-22 5:23 p.m., bugzilla-gcc at thewrittenword dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577
>
> --- Comment #15 from The Written Word ---
> (In reply to dave.anglin from comment #12)
>> It might help to compile stage1 with -O2 or -Os.
> How does one do this? After ./configure, "gmake CFLAGS=-Os"? BOOT_CFLAGS
> applies to stage2/3.

STAGE1_CFLAGS and STAGE1_CXXFLAGS used to work:

make STAGE1_CFLAGS="-O2 -g" STAGE1_CXXFLAGS="-O2 -g" -j2 bootstrap
[Bug preprocessor/90581] provide an option to adjust the maximum depth of nested #include
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90581 Jonathan Wakely changed: What|Removed |Added Severity|normal |enhancement
[Bug middle-end/20408] Unnecessary code generated for empty structs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20408 --- Comment #22 from Jason Merrill --- Author: jason Date: Wed May 22 21:39:08 2019 New Revision: 271523 URL: https://gcc.gnu.org/viewcvs?rev=271523=gcc=rev Log: PR c++/20408 - unnecessary code for empty struct. Here initializing the argument from a TARGET_EXPR isn't an empty class copy even though the type is !TREE_ADDRESSABLE, so we should check simple_empty_class_p. * call.c (build_call_a): Use simple_empty_class_p. Added: trunk/gcc/testsuite/g++.dg/tree-ssa/empty-3.C Modified: trunk/gcc/cp/ChangeLog trunk/gcc/cp/call.c trunk/gcc/cp/cp-gimplify.c trunk/gcc/cp/cp-tree.h
[Bug middle-end/90549] missing -Wreturn-local-addr maybe returning an address of a local array plus offset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90549 Martin Sebor changed: What|Removed |Added Keywords||patch --- Comment #4 from Martin Sebor --- Patch: https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01525.html
[Bug c/71924] missing -Wreturn-local-addr returning alloca result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71924 Martin Sebor changed: What|Removed |Added Keywords||patch --- Comment #5 from Martin Sebor --- Patch: https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01525.html
[Bug preprocessor/90581] New: provide an option to adjust the maximum depth of nested #include
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90581

Bug ID: 90581
Summary: provide an option to adjust the maximum depth of nested #include
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: preprocessor
Assignee: unassigned at gcc dot gnu.org
Reporter: qinzhao at gcc dot gnu.org
Target Milestone: ---

For some large, complicated applications, the depth of nested #include can be very large, exceeding the current hard-coded limit of 200:

directives.c:
  if (pfile->line_table->depth >= CPP_STACK_MAX)
    cpp_error (pfile, CPP_DL_ERROR, "#include nested too deeply");

internal.h:
  #define CPP_STACK_MAX 200

This PR requests a first-class option that lets users adjust this limit at compile time, so that such large applications can be compiled successfully.
[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577 --- Comment #15 from The Written Word --- (In reply to dave.anglin from comment #12) > It might help to compile stage1 with -O2 or -Os. How does one do this? After ./configure, "gmake CFLAGS=-Os"? BOOT_CFLAGS applies to stage2/3.
[Bug libstdc++/90557] [9/10 Regression] Incorrect std::filesystem::path::operator=(std::filesystem::path const&) in gcc 9.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90557 Jonathan Wakely changed: What|Removed |Added Known to work||8.3.0 Target Milestone|--- |9.2 Summary|Incorrect |[9/10 Regression] Incorrect |std::filesystem::path::oper |std::filesystem::path::oper |ator=(std::filesystem::path |ator=(std::filesystem::path |const&) in gcc 9.1.0|const&) in gcc 9.1.0 Known to fail||10.0, 9.1.0
[Bug c/90580] New: error: ‘offsetof’ undeclared when it is declared, but used with the wrong number of arguments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90580 Bug ID: 90580 Summary: error: ‘offsetof’ undeclared when it is declared, but used with the wrong number of arguments Product: gcc Version: 8.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: slandden at gmail dot com Target Milestone: --- test2.c: In function ‘main’: test2.c:113:42: error: macro "offsetof" requires 2 arguments, but only 1 given 113 | printf("offsetof %u", offsetof(key.rounds)); | ^ In file included from test2.c:64: /usr/lib/gcc/powerpc64le-linux-gnu/9/include/stddef.h:406: note: macro "offsetof" defined here 406 | #define offsetof(TYPE, MEMBER) __builtin_offsetof (TYPE, MEMBER) | test2.c:113:23: error: ‘offsetof’ undeclared (first use in this function) 113 | printf("offsetof %u", offsetof(key.rounds)); | ^~~~ test2.c:65:1: note: ‘offsetof’ is defined in header ‘’; did you forget to ‘#include ’? 64 | #include +++ |+#include 65 | /* test2.c:113:23: note: each undeclared identifier is reported only once for each function it appears in 113 | printf("offsetof %u", offsetof(key.rounds)); | ^~~~
[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691

--- Comment #39 from Jonathan Wakely ---
(In reply to dave.anglin from comment #37)
> I believe I changed the glibc value because of the pthread mutex issue.

Aha.

> MALLOC_ABI_ALIGNMENT is defined in pa32-linux.h as follows:
> #define MALLOC_ABI_ALIGNMENT 128
>
> So, the defines are now consistent on linux. The only remaining problem is
> 64-bit hpux where the actual
> malloc alignment is 8 bytes. The resource_adapter.cc test still fails on

I've just committed a change to the resource_adaptor implementation, but I don't expect it to change the FAIL for hpux yet. I hope the FAILs are fixed for Solaris now though, and if so then we make the special case apply to 64-bit hpux too, like so (are these the right macros to check for?):

diff --git a/libstdc++-v3/include/experimental/memory_resource b/libstdc++-v3/include/experimental/memory_resource
index dde3753fab7..dd6f3099a78 100644
--- a/libstdc++-v3/include/experimental/memory_resource
+++ b/libstdc++-v3/include/experimental/memory_resource
@@ -413,7 +413,8 @@ namespace pmr {
       do_allocate(size_t __bytes, size_t __alignment) override
       {
        // Cannot use max_align_t on 32-bit Solaris x86, see PR libstdc++/77691
-#if ! (defined __sun__ && defined __i386__)
+#if ! (defined __sun__ && defined __i386__) \
+  && ! (defined __hpux && defined _LP64)
        if (__alignment == alignof(max_align_t))
          return _M_allocate(__bytes);
 #endif
@@ -439,7 +440,8 @@ namespace pmr {
       do_deallocate(void* __ptr, size_t __bytes, size_t __alignment) noexcept
       override
       {
-#if ! (defined __sun__ && defined __i386__)
+#if ! (defined __sun__ && defined __i386__) \
+  && ! (defined __hpux && defined _LP64)
        if (__alignment == alignof(max_align_t))
          return (void) _M_deallocate(__ptr, __bytes);
 #endif
diff --git a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
index 7dcb408f3f7..d4353ff6464 100644
--- a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
+++ b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
@@ -23,7 +23,8 @@
 #include
 #include

-#if defined __sun__ && defined __i386__
+#if (defined __sun__ && defined __i386__) \
+  || (defined __hpux && defined _LP64)
 // See PR libstdc++/77691
 # define BAD_MAX_ALIGN_T 1
 #endif

> it. Maybe I should change BIGGEST_ALIGNMENT
> and MALLOC_ABI_ALIGNMENT to match the malloc implementation?

I think that makes sense (although it won't change anything until we make the suggestion from PR 90569 as well, so I'll do that this week).
[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691 --- Comment #38 from Jonathan Wakely --- Author: redi Date: Wed May 22 20:29:39 2019 New Revision: 271522 URL: https://gcc.gnu.org/viewcvs?rev=271522=gcc=rev Log: PR libstdc++/77691 fix resource_adaptor failures due to max_align_t bugs Remove the hardcoded whitelist of allocators expected to return memory aligned to alignof(max_align_t), because that doesn't work when the platform's malloc() and GCC's max_align_t do not agree what the largest fundamental alignment is. It's also sub-optimal for user-defined allocators that return memory suitable for any fundamental alignment. Instead use a hardcoded list of alignments that are definitely supported by the platform malloc, and use a copy of the allocator rebound to a POD type with the requested alignment. Only allocate an oversized buffer to use with std::align for alignments larger than any of the hardcoded values. For 32-bit Solaris x86 do not include alignof(max_align_t) in the hardcoded values. PR libstdc++/77691 * include/experimental/memory_resource: Add system header pragma. (__resource_adaptor_common::__guaranteed_alignment): Remove. (__resource_adaptor_common::_Types) (__resource_adaptor_common::__new_list) (__resource_adaptor_common::_New_list) (__resource_adaptor_common::_Alignments) (__resource_adaptor_common::_Fund_align_types): New utilities for creating a list of types with fundamental alignments. (__resource_adaptor_imp::do_allocate): Call new _M_allocate function. (__resource_adaptor_imp::do_deallocate): Call new _M_deallocate function. (__resource_adaptor_imp::_M_allocate): New function that first tries to use an allocator rebound to a type with a fundamental alignment. (__resource_adaptor_imp::_M_deallocate): Likewise for deallocation. * testsuite/experimental/memory_resource/new_delete_resource.cc: Adjust expected allocation sizes. * testsuite/experimental/memory_resource/resource_adaptor.cc: Remove xfail. 
Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/include/experimental/memory_resource trunk/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc trunk/libstdc++-v3/testsuite/experimental/memory_resource/resource_adaptor.cc
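The last step described in the log above (allocate an oversized buffer and align within it when the requested alignment is larger than anything malloc guarantees) can be illustrated with a rough C analogue. This is only a sketch of the general over-allocate-and-align technique, not the committed libstdc++ code, which uses a rebound allocator and std::align. The names sketch_aligned_alloc and sketch_aligned_free are hypothetical, and the sketch always takes the oversized path rather than first trying a list of fundamental alignments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libstdc++: return `bytes` bytes aligned
   to `alignment` (which must be a power of two).  The pointer returned by
   malloc is saved just below the aligned block so it can be freed later. */
void *sketch_aligned_alloc(size_t bytes, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Reject sizes whose padded total would overflow size_t. */
    if (bytes > SIZE_MAX - alignment - sizeof(void *))
        return NULL;

    /* Payload + worst-case padding + room for the saved pointer. */
    unsigned char *raw = malloc(bytes + alignment - 1 + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Round the first usable address up to the requested alignment. */
    uintptr_t start = (uintptr_t)(raw + sizeof(void *));
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

    /* Remember the original pointer immediately before the aligned block. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Hypothetical counterpart: release a block from sketch_aligned_alloc. */
void sketch_aligned_free(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

For example, sketch_aligned_alloc(100, 64) returns either NULL or an address that is a multiple of 64, and the block must be released with sketch_aligned_free rather than free. The cost is the extra padding per allocation, which is presumably why the change described above reserves this path for alignments that the fundamental-alignment list cannot serve.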
[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691 --- Comment #37 from dave.anglin at bell dot net --- On 2019-05-22 3:41 p.m., redi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691 > > --- Comment #36 from Jonathan Wakely --- > Interesting. Yes, definitely similar ideas. It looks like it was solved > differently though, as config/pa/pa.h has > > #define MALLOC_ABI_ALIGNMENT (TARGET_64BIT ? 128 : 64) > > which should get used by the aligned new code, even without my suggested > change > in PR 90569. > > As an aside, the comment on MALLOC_ABI_ALIGNMENT says "The glibc > implementation > currently provides 8-byte alignment." But glibc malloc was changed to 16-byte > alignment a couple of years ago. I believe I changed the glibc value because of the pthread mutex issue. MALLOC_ABI_ALIGNMENT is defined in pa32-linux.h as follows: #define MALLOC_ABI_ALIGNMENT 128 So, the defines are now consistent on linux. The only remaining problem is 64-bit hpux where the actual malloc alignment is 8 bytes. The resource_adapter.cc test still fails on it. Maybe I should change BIGGEST_ALIGNMENT and MALLOC_ABI_ALIGNMENT to match the malloc implementation?
[Bug target/90330] gcc 9.1.0 fails to install on macOS 10.14.4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90330 Iain Sandoe changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2019-05-22 Ever confirmed|0 |1
[Bug target/90330] gcc 9.1.0 fails to install on macOS 10.14.4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90330

--- Comment #14 from Iain Sandoe ---
(In reply to Iain Sandoe from comment #13)
> (In reply to Matt Thompson from comment #12)
> > (In reply to Iain Sandoe from comment #11)
> > > (In reply to Matt Thompson from comment #10)
> > > > (In reply to Iain Sandoe from comment #9)
> > > > > (In reply to Matt Thompson from comment #8)
>
> > Well, 9.1.0 built just fine with 8.2.0 loaded in my environment. This seems
> > to point to clang, which, well, doesn't surprise me as clang and I have had
> > a difficult life together, but then again clang built 5.4.0 up to 8.2.0 just
> > fine for me.
> >
> > I'm ran a 'make check' and got:
> >
> > Fixed: time.h
> > Fixed: tinfo.h
> > Fixed: types/vxTypesBase.h
> > Fixed: unistd.h
> > Newly fixed header: sys/ucred.h
> >
> > There were fixinclude test FAILURES
> > make[2]: *** [Makefile:177: check] Error 1
> > make[2]: Leaving directory
> > '/Users/mathomp4/src/GCC/gcc-9.1.0-BUILD-820loaded/fixincludes'
> > make[1]: *** [Makefile:3829: check-fixincludes] Error 2
> > make[1]: Leaving directory
> > '/Users/mathomp4/src/GCC/gcc-9.1.0-BUILD-820loaded'
> > make: *** [Makefile:2358: do-check] Error 2

This was https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90379, fixed on trunk and for 9.2 (so current snapshots from the branch should have the fix).

Other than that, I can't reproduce the problem locally - it installs for me whether built using the XC10.2 command line tools, or my own (GCC-8.3) toolset.

... is there anything more we need to do on this PR? (very happy to help, but not sure how to make progress without a reproducer for the issue).
[Bug fortran/90539] [10 Regression] 481.wrf slowdown by 25% on Intel Kaby with -Ofast -march=native starting with r271377
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90539

--- Comment #22 from Thomas Koenig ---
I've been trying out some things, and I cannot construct a failing test case. A sane way to build such an interface would be

cat tst.f90
module x
  use, intrinsic :: iso_c_binding, only : c_double
  implicit none
  interface
     subroutine foo(a) bind(c)
       import
       real(kind=c_double) :: a(*)
     end subroutine foo
  end interface
  private
  public :: bar
contains
  subroutine bar(a)
    real(kind=c_double), dimension(:) :: a
    a = 42._c_double
    call foo(a)
  end subroutine bar
end module x

program main
  use, intrinsic :: iso_c_binding, only : c_double
  use x
  implicit none
  real(kind=c_double), dimension(1) :: a
  call bar(a)
end program main
$ cat foo.c
#include
void foo (double *a)
{
  printf("%f\n", *a);
}
$ gfortran -flto -O tst.f90 foo.c
$ ./a.out
42.00

This works as expected. What I do not understand is (comment #17)

(gdb) p debug(fsym)
|| symbol: '_formal_107'
type spec : (REAL 8)
attributes: (VARIABLE DIMENSION DUMMY)
Array spec:(0 [0])

This means that the dummy parameter has rank zero. How, then, is it possible to pass a rank-1 argument to it?

(gdb) p debug(expr)
nf90_put_var_1d_eightbytereal:values(FULL) (REAL 8)
(gdb) p *expr->ref
$8 = { type = REF_ARRAY, u = { ar = { type = AR_FULL, dimen = 1, codimen = 0,

Something very fishy going on here. Please look up the Fortran interface to the C function that is called, nc_put_vara_double. Also, please break on gfc_conv_procedure_call for the call in question and do

$ call debug(sym)
$ p args
$ call debug(args->expr)
$ p args->next
$ call debug(args->next->expr)

... and so on, until args->...->next becomes a null pointer.

I am starting to suspect that this is, in fact, another piece of SPEC bugware where they made some sort of broken interface between C and Fortran, which is exposed by my patch. Hmpf...
[Bug c++/86485] [7/8 Regression] "anonymous" maybe-uninitialized false positive with ternary operator
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86485 --- Comment #7 from Jason Merrill --- Author: jason Date: Wed May 22 19:48:05 2019 New Revision: 271521 URL: https://gcc.gnu.org/viewcvs?rev=271521=gcc=rev Log: PR c++/86485 - simple_empty_class_p Yet another tweak that would have fixed this bug: we should treat INIT_EXPR and MODIFY_EXPR differently for determining whether this is a simple empty class copy, since a TARGET_EXPR on the RHS is direct initialization if INIT_EXPR but copy if MODIFY_EXPR. * cp-gimplify.c (simple_empty_class_p): Also true for MODIFY_EXPR. Modified: trunk/gcc/cp/ChangeLog trunk/gcc/cp/cp-gimplify.c
[Bug target/90568] stack protector should use cmp or sub, not xor, to allow macro-fusion on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568

--- Comment #5 from Peter Cordes ---
And BTW, this only helps if the SUB and JNE are consecutive, which GCC (correctly) doesn't currently optimize for with XOR.

If this sub/jne is different from a normal sub/branch and won't already get optimized for macro-fusion, we may get even more benefit from this change by teaching gcc to keep them adjacent.

GCC currently sometimes splits up the instructions like this:

    xorq %fs:40, %rdx
    movl %ebx, %eax
    jne .L7

from gcc8.3 (but not 9.1 or trunk in this case) on https://godbolt.org/z/nNjQ8u

#include
unsigned int get_random_seed() {
  std::random_device rd;
  return rd();
}

Even with -O3 -march=skylake. That's not wrong because XOR can't macro-fuse, but the point of switching to SUB is that it *can* macro-fuse into a single sub-and-branch uop on Sandybridge-family. So we might need to teach gcc about that.

So when you change this, please make it aware of optimizing for macro-fusion by keeping the sub and jne back to back. Preferably with tune=generic (because Sandybridge-family is fairly widespread and it doesn't hurt on other CPUs), but definitely with -mtune=intel or -mtune=sandybridge or later. Nehalem and earlier can only macro-fuse test/cmp.

The potential downside of putting it adjacent instead of 1 or 2 insns earlier for uarches that can't macro-fuse SUB/JNE should be about zero on average. These branches should predict very well, and there are no in-order x86 CPUs still being sold. So it's mostly just going to be variations in fetch/decode that help sometimes, hurt sometimes, like any code alignment change.
[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691 --- Comment #36 from Jonathan Wakely --- Interesting. Yes, definitely similar ideas. It looks like it was solved differently though, as config/pa/pa.h has #define MALLOC_ABI_ALIGNMENT (TARGET_64BIT ? 128 : 64) which should get used by the aligned new code, even without my suggested change in PR 90569. As an aside, the comment on MALLOC_ABI_ALIGNMENT says "The glibc implementation currently provides 8-byte alignment." But glibc malloc was changed to 16-byte alignment a couple of years ago.
[Bug testsuite/90565] [10 regression] test cases gcc.dg/uninit-18.c and uninit-pr90394-1-gimple.c broken as of r271460
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90565 --- Comment #2 from seurer at gcc dot gnu.org --- Also possibly gcc.dg/pr67512.c
[Bug tree-optimization/90579] New: Huge store forward stall due to vectorizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579 Bug ID: 90579 Summary: Huge store forward stall due to vectorizer Product: gcc Version: 9.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com Target Milestone: --- Target: x86-64 loop/avx256 branch at https://gitlab.com/x86-benchmarks/microbenchmark shows huge store forward stall due to vectorizer in --- extern double r[6]; extern double a[]; double loop (int k, double x) { int i; double t=0; for (i=0;i<6;i++) r[i] = x * a[i + k]; for (i=0;i<6;i++) t+=r[5-i]; return t; } --- when compiled with -O3 -march=skylake: [hjl@gnu-cfl-1 microbenchmark]$ perf stat -e ld_blocks.store_forward ./event loop: 229408 Performance counter stats for './event': 1 ld_blocks.store_forward:u 0.000478529 seconds time elapsed 0.000502000 seconds user 0.0 seconds sys [hjl@gnu-cfl-1 microbenchmark]$ perf stat -e ld_blocks.store_forward ./event-avx128 loop: 191390 Performance counter stats for './event-avx128': 1 ld_blocks.store_forward:u 0.000526154 seconds time elapsed 0.000507000 seconds user 0.0 seconds sys [hjl@gnu-cfl-1 microbenchmark]$ perf stat -e ld_blocks.store_forward ./event-avx256 loop: 1312864 Performance counter stats for './event-avx256': 30,001 ld_blocks.store_forward:u 0.000756643 seconds time elapsed 0.000723000 seconds user 0.0 seconds sys [hjl@gnu-cfl-1 microbenchmark]$
[Bug target/88483] Unnecessary stack alignment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88483 --- Comment #5 from hjl at gcc dot gnu.org --- Author: hjl Date: Wed May 22 18:53:37 2019 New Revision: 271517 URL: https://gcc.gnu.org/viewcvs?rev=271517=gcc=rev Log: x86: Don't allocate stack frame nor align stack if not needed get_frame_size () returns used stack slots during compilation, which may be optimized out later. This patch does the followings: 1. Add stack_frame_required to machine_function to indicate that the function needs a stack frame. 2. Change ix86_find_max_used_stack_alignment to set stack_frame_required. 3. Always call ix86_find_max_used_stack_alignment to check if stack frame is needed. Tested on i686 and x86-64 with --with-arch=native --with-cpu=native Tested on AVX512 machine configured with --with-arch=native --with-cpu=native gcc/ PR target/88483 * config/i386/i386-options.c (ix86_init_machine_status): Set stack_frame_required to true. * config/i386/i386.c (ix86_get_frame_size): New function. (ix86_frame_pointer_required): Replace get_frame_size with ix86_get_frame_size. (ix86_compute_frame_layout): Likewise. (ix86_find_max_used_stack_alignment): Changed to void. Set stack_frame_required. (ix86_finalize_stack_frame_flags): Always call ix86_find_max_used_stack_alignment. Replace get_frame_size with ix86_get_frame_size. * config/i386/i386.h (machine_function): Add stack_frame_required. gcc/testsuite/ PR target/88483 * gcc.target/i386/stackalign/pr88483-1.c: New test. * gcc.target/i386/stackalign/pr88483-2.c: Likewise. Added: trunk/gcc/testsuite/gcc.target/i386/stackalign/pr88483-1.c trunk/gcc/testsuite/gcc.target/i386/stackalign/pr88483-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386-options.c trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.h trunk/gcc/testsuite/ChangeLog
[Bug target/90547] [8/9/10 Regression] ICE in gen_lowpart_general, at rtlhooks.c:63
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90547 --- Comment #3 from uros at gcc dot gnu.org --- Author: uros Date: Wed May 22 18:49:22 2019 New Revision: 271516 URL: https://gcc.gnu.org/viewcvs?rev=271516=gcc=rev Log: Backported from mainline 2019-05-21 Uroš Bizjak * config/i386/cpuid.h (__cpuid): For 32bit targets, zero %ebx and %ecx before calling cpuid with leaf 1 or non-constant leaf argument. 2019-05-21 Uroš Bizjak PR target/90547 * config/i386/i386.md (anddi_1 to andsi_1_zext splitter): Avoid calling gen_lowpart with CONST operand. testsuite/ChangeLog: Backported from mainline 2019-05-21 Uroš Bizjak PR target/90547 * gcc.target/i386/pr90547.c: New test. Added: branches/gcc-9-branch/gcc/testsuite/gcc.target/i386/pr90547.c Modified: branches/gcc-9-branch/gcc/ChangeLog branches/gcc-9-branch/gcc/config/i386/cpuid.h branches/gcc-9-branch/gcc/config/i386/i386.md branches/gcc-9-branch/gcc/testsuite/ChangeLog
[Bug lto/90577] [9/10 Regression] FAIL: gfortran.dg/lrshift_1.f90 with -O(2|3) and -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90577 Iain Sandoe changed: What|Removed |Added Target||x86_64-apple-darwin*, ||x86_64-gnu-linux --- Comment #1 from Iain Sandoe --- this is repeatable on Linux (m32 and m64) FAIL: gfortran.dg/lrshift_1.f90 -O2 execution test FAIL: gfortran.dg/lrshift_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: gfortran.dg/lrshift_1.f90 -O3 -g execution test FAIL: gfortran.dg/lrshift_1.f90 -Os execution test
[Bug libstdc++/90415] std::is_copy_constructible> is incomplete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90415 --- Comment #2 from Rafael Avila de Espindola --- The bug is still present on trunk.
[Bug libstdc++/90415] std::is_copy_constructible> is incomplete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90415 Rafael Avila de Espindola changed: What|Removed |Added CC||jason at redhat dot com --- Comment #1 from Rafael Avila de Espindola --- This bug was present when gcc 8 branched. It was fixed in the gcc 8 branch, but I guess it was never fixed on trunk. On the gcc 8 branch it was fixed by r261463 (d26c6b8b0c6abba9a67b87a1d48f0c3165d021cc).
[Bug target/90568] stack protector should use cmp or sub, not xor, to allow macro-fusion on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-05-22 Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #4 from Jakub Jelinek --- Ok, will change it then. Thanks for the report.
[Bug lto/90577] [9/10 Regression] FAIL: gfortran.dg/lrshift_1.f90 with -O(2|3) and -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90577 Dominique d'Humieres changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-05-22 Ever confirmed|0 |1
[Bug fortran/90578] Wrong code with LSHIFT and optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90578 Dominique d'Humieres changed: What|Removed |Added Keywords||wrong-code Status|UNCONFIRMED |NEW Last reconfirmed||2019-05-22 See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=90577 Ever confirmed|0 |1
[Bug fortran/90578] New: Wrong code with LSHIFT and optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90578

Bug ID: 90578
Summary: Wrong code with LSHIFT and optimization
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: dominiq at lps dot ens.fr
Target Milestone: ---

While experimenting with pr90577, I have found that the run-time output for the following reduced test

program test_rshift_lshift
  implicit none
  integer :: i(15), j, n
  i = (/ -huge(i), -huge(i)/2, -129, -128, -127, -2, -1, 0, &
         1, 2, 127, 128, 129, huge(i)/2, huge(i) /)
  print *, lshift(i(1),-30)
  print *, lshift(i(1),-29)
  if (lshift(i(1),-30) /= 4) STOP 1
end program test_rshift_lshift

depends on the optimization level:

% gfc lrshift_1_red.f90
% ./a.out
4
8
% gfc lrshift_1_red.f90 -O
% ./a.out
2
4
STOP 1

but

gfc lrshift_1_red.f90 -fauto-inc-dec -fbranch-count-reg -fcombine-stack-adjustments -fcompare-elim -fcprop-registers -fdce -fdefer-pop -fdse -fforward-propagate -fguess-branch-probability -fif-conversion -fif-conversion2 -finline-functions-called-once -fipa-profile -fipa-pure-const -fipa-reference -fipa-reference-addressable -fmerge-constants -fmove-loop-invariants -fomit-frame-pointer -freorder-blocks -fshrink-wrap -fshrink-wrap-separate -fsplit-wide-types -fssa-backprop -fssa-phiopt -ftree-bit-ccp -ftree-ccp -ftree-ch -ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre -ftree-phiprop -ftree-pta -ftree-scev-cprop -ftree-sink -ftree-slsr -ftree-sra -ftree-ter -funit-at-a-time

also gives

4
8

I see this behavior from at least 4.8 up to trunk (10.0).
[Bug target/90568] stack protector should use cmp or sub, not xor, to allow macro-fusion on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568 --- Comment #3 from Peter Cordes --- (In reply to Jakub Jelinek from comment #2) > The xor there is intentional, for security reasons we do not want the stack > canary to stay in the register afterwards, because then it could be later > spilled or accessible to some exploit in another way. Ok, so we can't use CMP, therefore we should use SUB, which as I showed does help on Sandybridge-family vs. XOR. x - x = 0 just like x ^ x = 0 Otherwise SUB wouldn't set ZF. SUB is not worse than XOR on any other CPUs; there are no CPUs with better XOR throughput than ADD/SUB. In the canary mismatch case, leaving attacker_value - key in a register seems no worse than leaving attacker_value ^ key in a register. Either value trivially reveals the canary value to an attacker that knows what they overwrote the stack with, if it does somehow leak. We jump to __stack_chk_fail in that case, not relying on the return value on the stack, so a ROP attack wouldn't be sufficient to leak that value anywhere.
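The two claims in the comment above (SUB yields zero exactly when XOR does, and a leaked residue reveals the canary equally in both cases) can be checked with a small C sketch. The function names are hypothetical and this is not GCC's stack-protector code; it only models the arithmetic on 64-bit values, assuming the attacker knows the value they wrote over the canary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "sub; jne": zero exactly when the value read back
   from the stack equals the canary key (unsigned arithmetic wraps). */
uint64_t residue_with_sub(uint64_t stack_value, uint64_t key)
{
    return stack_value - key;
}

/* Hypothetical model of "xor; jne": also zero exactly on a match. */
uint64_t residue_with_xor(uint64_t stack_value, uint64_t key)
{
    return stack_value ^ key;
}

/* If a mismatch residue leaks and the attacker knows what they wrote,
   the key is recovered trivially in both schemes. */
uint64_t key_from_sub_residue(uint64_t residue, uint64_t attacker_value)
{
    return attacker_value - residue;   /* a - (a - k) == k (mod 2^64) */
}

uint64_t key_from_xor_residue(uint64_t residue, uint64_t attacker_value)
{
    return attacker_value ^ residue;   /* a ^ (a ^ k) == k */
}
```

In the matching case both residues are zero, so no canary bits remain in the register either way; in the mismatching case neither residue hides the key better than the other, which is the argument made above for SUB being no worse than XOR.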
[Bug lto/90577] New: [9/10 Regression] FAIL: gfortran.dg/lrshift_1.f90 with -O(2|3) and -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90577

Bug ID: 90577
Summary: [9/10 Regression] FAIL: gfortran.dg/lrshift_1.f90 with -O(2|3) and -flto
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: dominiq at lps dot ens.fr
CC: hubicka at gcc dot gnu.org, iains at gcc dot gnu.org, marxin at gcc dot gnu.org
Target Milestone: ---

Testing fortran with -flto gives

Running /opt/gcc/work/gcc/testsuite/gfortran.dg/dg.exp ...
FAIL: gfortran.dg/lrshift_1.f90 -O2 execution test
FAIL: gfortran.dg/lrshift_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gfortran.dg/lrshift_1.f90 -O3 -g execution test
FAIL: gfortran.dg/lrshift_1.f90 -Os execution test

=== gfortran Summary for unix/-m32/-flto ===

# of expected passes 10
# of unexpected failures 4

The behavior changed between revisions r268729 (2019-02-09, OK) and r269160 (2019-02-23, wrong-code).

With the following change

   do n = 1, size(i)
     do j = -30, 30
+      print *, n, j, lshift(i(n),j)
+      print *, n, j, c_lshift(i(n),j)
       if (lshift(i(n),j) /= c_lshift(i(n),j)) STOP 1
       if (rshift(i(n),j) /= c_rshift(i(n),j)) STOP 2
     end do

the wrong code gives

1 -30 2
1 -30 -2
STOP 1

while the working one gives

1 -30 4
1 -30 4
1 -29 8
1 -29 8
...

I also see

FAIL: gfortran.dg/ISO_Fortran_binding_9.f90 -g -O3 -fwhole-program -flto execution test
[Bug rtl-optimization/64895] RA picks the wrong register for -fipa-ra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64895 --- Comment #16 from Iain Sandoe --- Created attachment 46398 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46398=edit testsuite patch Will post this later, tested on x86_64-linux and x86_64-darwin.
[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569 --- Comment #5 from Jonathan Wakely --- (In reply to Jonathan Wakely from comment #4) > Rainer, the change to gcc/cp/init.c would allow you to do: > > #define MALLOC_ABI_ALIGNMENT 8 Oops, it's in bits not bytes, so that should be #define MALLOC_ABI_ALIGNMENT 64
[Bug target/68485] ICE while building gpsd package on microblaze
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68485 Giulio Benetti changed: What|Removed |Added CC||giulio.benetti@micronovasrl ||.com --- Comment #5 from Giulio Benetti --- This seems to be a duplicate of Bug 69401.
[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691 --- Comment #35 from dave.anglin at bell dot net --- On 2019-05-22 11:03 a.m., redi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691 > > --- Comment #34 from Jonathan Wakely --- > (In reply to Jonathan Wakely from comment #33) >> The correct fix is to adjust the value of __STDCPP_DEFAULT_NEW_ALIGNMENT__ >> on targets where malloc doesn't agree with GCC's alignof(max_align_t). > That only helps for C++17 and later though :-( > > The header is defined for C++14. > Reminds me of this patch: https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00528.html
[Bug tree-optimization/90576] New: [10 regression] SPEC CPU2006 450.soplex miscompiled with -Os -flto after r271413
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90576

Bug ID: 90576
Summary: [10 regression] SPEC CPU2006 450.soplex miscompiled with -Os -flto after r271413
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: mkuvyrkov at gcc dot gnu.org
Target Milestone: ---

After
===
commit ce7b4f267706c23405705d848c1dcf686496f262
Author: hubicka
Date: Mon May 20 12:01:40 2019 +

* tree-ssa-alias.c (compare_sizes): New function.
(sompare_type_sizes): New function
(aliasing_component_refs_p): Use it.
(indirect_ref_may_alias_decl_p): Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@271413 138bc75d-0d04-0410-961f-82ee72b054a4
===
GCC miscompiles 450.soplex with -Os -flto at least on AArch64 and AArch32. The benchmark finishes within seconds with
===
450.soplex: copy 0 non-zero return code (exit code=11, signal=0)
===

FWIW, "-Os -fno-lto" seems to work.

Considering that both AArch64 and AArch32 are affected and the nature of the patch, this likely affects other architectures.

Honza, would you please investigate? Please let me know if it doesn't readily reproduce for you, and I'll help with a testcase.
[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691 --- Comment #34 from Jonathan Wakely --- (In reply to Jonathan Wakely from comment #33) > The correct fix is to adjust the value of __STDCPP_DEFAULT_NEW_ALIGNMENT__ > on targets where malloc doesn't agree with GCC's alignof(max_align_t). That only helps for C++17 and later though :-( The header is defined for C++14.
[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569 --- Comment #4 from Jonathan Wakely --- Rainer, the change to gcc/cp/init.c would allow you to do: #define MALLOC_ABI_ALIGNMENT 8 in gcc/config/i386/sol2.h and that would cause std::allocator to know that it can't rely on malloc for 16-byte alignment. Although that would only help for C++17, because otherwise __cpp_aligned_new isn't defined ... drat. It's better than nothing though. Does that seem acceptable for your target?
[Bug debug/90575] New: -gsplit-dwarf leaves behind .dwo file in cwd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90575 Bug ID: 90575 Summary: -gsplit-dwarf leaves behind .dwo file in cwd Product: gcc Version: 9.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: debug Assignee: unassigned at gcc dot gnu.org Reporter: sbergman at redhat dot com Target Milestone: --- At least with current GCC 9.1.1: > $ mkdir testdir > $ echo 'int main(void) { return 0; }' > testdir/test.c > $ gcc -gsplit-dwarf testdir/test.c -o testdir/test > $ ls > testdir test.dwo I at least wouldn't expect the above to leave behind a test.dwo in the current working dir.
[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569 Jason Merrill changed: What|Removed |Added CC||jason at gcc dot gnu.org --- Comment #3 from Jason Merrill --- (In reply to Jonathan Wakely from comment #0) > unsigned > malloc_alignment () > { > if (MALLOC_ABI_ALIGNMENT != BITS_PER_WORD) > return MALLOC_ABI_ALIGNMENT; > return MAX (max_align_t_align(), MALLOC_ABI_ALIGNMENT); > } The last line can just be return max_align_t_align(); Otherwise looks good to me.
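The quoted function, with the simplified last line suggested above, can be sketched as standalone C. In GCC itself MALLOC_ABI_ALIGNMENT and BITS_PER_WORD are target macros and max_align_t_align() is an internal function; here they are replaced by plain parameters, and the name malloc_alignment_sketch and the sample values mentioned afterwards are assumptions for illustration only. All quantities are in bits, not bytes.

```c
#include <assert.h>

/* Sketch only: parameters stand in for the target macros
   MALLOC_ABI_ALIGNMENT and BITS_PER_WORD and for the result of
   max_align_t_align().  Every value is expressed in bits. */
unsigned malloc_alignment_sketch(unsigned malloc_abi_alignment,
                                 unsigned bits_per_word,
                                 unsigned max_align_t_align)
{
    /* A bit count that is not a whole number of bytes is a likely mistake. */
    assert(malloc_abi_alignment % 8 == 0);

    /* Mirrors the quoted test: a value different from BITS_PER_WORD is
       taken as the malloc alignment. */
    if (malloc_abi_alignment != bits_per_word)
        return malloc_abi_alignment;

    /* Otherwise fall back to the alignment of max_align_t, as in the
       suggested simplification of the last line. */
    return max_align_t_align;
}
```

With illustrative values for a 32-bit target whose malloc only guarantees 8-byte alignment, malloc_alignment_sketch(64, 32, 128) yields 64, while leaving the first argument equal to the word size, as in malloc_alignment_sketch(32, 32, 128), yields 128.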
[Bug fortran/90539] [10 Regression] 481.wrf slowdown by 25% on Intel Kaby with -Ofast -march=native starting with r271377
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90539 --- Comment #21 from Thomas Koenig --- OK, if the callee is a C function... what is its declaration on the Fortran side? Is there any interface, bind(c) or otherwise? I suppose there must be something, otherwise nf_put_vara_double would have a trailing underscore. On the caller side, I see that an array is passed, but the fsym has rank=0. I think this would be flagged otherwise.
[Bug debug/90574] New: [gdb] gdb wrongly stopped at a breakpoint in an unexecuted line of code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90574 Bug ID: 90574 Summary: [gdb] gdb wrongly stopped at a breakpoint in an unexecuted line of code Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: debug Assignee: unassigned at gcc dot gnu.org Reporter: yangyibiao at nju dot edu.cn Target Milestone: --- $ gcc --version gcc (GCC) 10.0.0 20190517 (experimental) Copyright (C) 2019 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ gdb --version GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git Copyright (C) 2018 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later $ cat small.c #include <stdio.h> int main(int argc, char **argv) { if (argc == 0) { int *ptr; label: { } } if (argc == 1) { printf("hello\n"); } return 0; } $ gcc -g small.c; ./a.out hello $ gdb -batch -x cmds a.out Breakpoint 1 at 0x400501: file small.c, line 8. Breakpoint 1, main (argc=1, argv=0x7fffde58) at small.c:8 8 label: ptr = Kill the program being debugged? (y or n) [answered Y; input not from terminal] $ cat cmds b 8 r info locals kill q Line 8, in the body of "if (argc == 0)", is not executed according to the program output. Thus, when we set a breakpoint at line 8, gdb should not stop. However, in this case it stopped and printed the locals. I wonder whether this is a bug in gdb.
[Bug rtl-optimization/64895] RA picks the wrong register for -fipa-ra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64895 Iain Sandoe changed: What|Removed |Added CC||iains at gcc dot gnu.org --- Comment #15 from Iain Sandoe --- (IIUC the thread here) It looks to me that the codegen is now DTRT for both pic and non pic. Darwin is doing pic by default, so sees XPASSes There is no Linux pic test (so the change was not noticed there): (I will produce a patch for the tests on the basis that this is now fixed). Linux x86-64 (r271505): Running target unix/-fpic/-m32 Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target. Using /usr/share/dejagnu/config/unix.exp as generic interface file for target. Using /home/iains/gcc-trunk/src-local/gcc/testsuite/config/default.exp as tool-and-target-specific interface file. Running /home/iains/gcc-trunk/src-local/gcc/testsuite/gcc.target/i386/i386.exp ... XPASS: gcc.target/i386/fuse-caller-save-rec.c scan-assembler-not push XPASS: gcc.target/i386/fuse-caller-save-rec.c scan-assembler-not pop XPASS: gcc.target/i386/fuse-caller-save-rec.c scan-assembler-times addl\t%[re]?dx, %[re]?ax 1 FAIL: gcc.target/i386/fuse-caller-save-xmm.c scan-assembler-times addpd\t\\.?LC0.*, %xmm0 1 XPASS: gcc.target/i386/fuse-caller-save-xmm.c scan-assembler-times addpd\t%xmm1, %xmm0 1 XPASS: gcc.target/i386/fuse-caller-save.c scan-assembler-not push XPASS: gcc.target/i386/fuse-caller-save.c scan-assembler-not pop XPASS: gcc.target/i386/fuse-caller-save.c scan-assembler-times addl\t%[re]?d[ix], %[re]?ax 1 === gcc Summary for unix/-fpic/-m64 === # of expected passes18 = code output below - Darwin produces the same (or equivalent, for m32/pic). 
= (extraneous lines snipped for clarity) fuse-caller-save-rec.c -O2 -fipa-ra -fomit-frame-pointer -fno-optimize-sibling-calls -mregparm=1 -m32 -S {,-fpic} bar: cmpl $4, %eax jg .L9 xorl %eax, %eax ret .L9: subl $12, %esp subl $3, %eax call bar addl $12, %esp ret foo: subl $12, %esp movl %eax, %edx call bar addl $12, %esp addl %edx, %eax ret = fuse-caller-save.c -O2 -fipa-ra -fomit-frame-pointer -mregparm=1 -m32 -S {,-fpic} bar: addl $3, %eax ret foo: movl %eax, %edx call bar addl %edx, %eax ret = fuse-caller-save-xmm.c -O2 -fipa-ra -fomit-frame-pointer -msse2 -mno-avx -m32 -S bar: addpd .LC0, %xmm0 ret foo: subl $12, %esp movapd %xmm0, %xmm1 call bar addl $12, %esp addpd %xmm1, %xmm0 ret fuse-caller-save-xmm.c -O2 -fipa-ra -fomit-frame-pointer -msse2 -mno-avx -m32 -S -fpic bar: call __x86.get_pc_thunk.ax addl $_GLOBAL_OFFSET_TABLE_, %eax movapd .LC0@GOTOFF(%eax), %xmm1 addpd %xmm1, %xmm0 ret foo: subl $12, %esp movapd %xmm0, %xmm2 call bar addl $12, %esp addpd %xmm2, %xmm0 ret
[Bug tree-optimization/90573] Avoid unnecessary data transfer into OMP construct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90573 --- Comment #1 from Thomas Schwinge --- Probably some of these transformation should come with compiler diagnostics, especially for explicit clauses. For example, need to relate this to 'OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT': PR70550 (r234779, r234824, r234826). PR72781? Or "the other way round", PR69876?
[Bug c++/68476] microblaze: compilation of btSoftBody.cpp doesn't terminate with optimisation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68476 Giulio Benetti changed: What|Removed |Added CC||giulio.benetti@micronovasrl ||.com --- Comment #8 from Giulio Benetti --- Duplicate. It turns out that this bug behaves like 85180: - hang on gcc version < 8.x with -O1/2/3 *** This bug has been marked as a duplicate of bug 85180 ***
[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-05-22 Ever confirmed|0 |1 --- Comment #2 from Jonathan Wakely --- I thought we could workaround this in libstdc++ like so: diff --git a/libstdc++-v3/libsupc++/Makefile.am b/libstdc++-v3/libsupc++/Makefile.am index eec7b953514..a50a9848461 100644 --- a/libstdc++-v3/libsupc++/Makefile.am +++ b/libstdc++-v3/libsupc++/Makefile.am @@ -129,6 +129,8 @@ cp-demangle.o: cp-demangle.c # Use special rules for the C++17 sources so that the proper flags are passed. +new_op.lo: new_op.cc + $(LTCXXCOMPILE) -std=gnu++1z -c $< new_opa.lo: new_opa.cc $(LTCXXCOMPILE) -std=gnu++1z -c $< new_opant.lo: new_opant.cc diff --git a/libstdc++-v3/libsupc++/Makefile.in b/libstdc++-v3/libsupc++/Makefile.in index 5d8ac5ca0ba..0e3cbff0055 100644 --- a/libstdc++-v3/libsupc++/Makefile.in +++ b/libstdc++-v3/libsupc++/Makefile.in @@ -956,6 +956,8 @@ cp-demangle.o: cp-demangle.c $(C_COMPILE) -DIN_GLIBCPP_V3 -Wno-error -c $< # Use special rules for the C++17 sources so that the proper flags are passed. 
+new_op.lo: new_op.cc + $(LTCXXCOMPILE) -std=gnu++1z -c $< new_opa.lo: new_opa.cc $(LTCXXCOMPILE) -std=gnu++1z -c $< new_opant.lo: new_opant.cc diff --git a/libstdc++-v3/libsupc++/new_op.cc b/libstdc++-v3/libsupc++/new_op.cc index 863530b7564..203c57d9171 100644 --- a/libstdc++-v3/libsupc++/new_op.cc +++ b/libstdc++-v3/libsupc++/new_op.cc @@ -27,6 +27,9 @@ #include #include #include "new" +#if defined __sun__ || defined __i386__ +# include +#endif using std::new_handler; using std::bad_alloc; @@ -41,6 +44,14 @@ extern "C" void *malloc (std::size_t); _GLIBCXX_WEAK_DEFINITION void * operator new (std::size_t sz) _GLIBCXX_THROW (std::bad_alloc) { +#if defined __sun__ || defined __i386__ + if (sz >= alignof(std::max_align_t)) +{ + std::align_val_t al{alignof(std::max_align_t)}; + return ::operator new(sz, al); +} +#endif + void *p; /* malloc (0) is unpredictable; avoid it. */ This would force operator new to use aligned_alloc instead of malloc for allocations that might be for objects large enough to require greater alignment than malloc guarantees. But since Solaris 11 doesn't appear to define aligned_alloc, this would use the fallback implementation in libsupc++/new_opa.cc which is much less efficient than plain malloc.
[Bug tree-optimization/90573] New: Avoid unnecessary data transfer into OMP construct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90573 Bug ID: 90573 Summary: Avoid unnecessary data transfer into OMP construct Product: gcc Version: 10.0 Status: UNCONFIRMED Keywords: openacc, openmp Severity: enhancement Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tschwinge at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- As mentioned in PR90067: "it might generally be beneficial to have a pass promoting 'firstprivate(x)' with a dominating write operation on 'x' to 'private(x)'". This will avoid unnecessary data transfer for (all too common!) code like: int i; #pragma acc parallel loop // implicit 'firstprivate(i)' for (i = 0; i < N; ++i) [...] Similarly, there are cases where 'copy(x)' can be optimized to 'copyout(x)', or 'copyin(x)' to 'create(x)'. This need not apply to implicit clauses only, but also to explicit ones, when the user can't observe any difference. The same applies to certain OpenMP clauses too, I suppose.
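The "dominating write" condition can be illustrated without an offloading compiler. The following is only a sketch: the function name and the `initial_i` parameter are hypothetical, and the pragma is left as a comment so the snippet builds with any C++ compiler. Because the loop header assigns `i` before any read, the value `i` had on entry is unobservable, which is what would make copying it in (firstprivate) unnecessary.

```cpp
#include <cassert>

// Sums 0 .. n-1. 'i' is declared outside the loop, so a compute construct
// around the loop would treat it as implicitly 'firstprivate(i)'.
// 'initial_i' (hypothetical) models whatever value 'i' holds on entry.
long sum_indices(int n, int initial_i) {
  assert(n >= 0);
  int i = initial_i;  // never read: 'i = 0' below dominates every use of 'i'
  long total = 0;
  // #pragma acc parallel loop reduction(+:total)  // implicit 'firstprivate(i)'
  for (i = 0; i < n; ++i)
    total += i;
  return total;
}
```

Calling `sum_indices(10, 0)` and `sum_indices(10, 12345)` gives the same result, so treating `i` as `private` would not change what the user can observe.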
[Bug middle-end/34678] Optimization generates incorrect code with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #39 from Jakub Jelinek --- (In reply to Richard Biener from comment #36) > Created attachment 46396 [details] > poor mans solution^Whack > > So this is what a hack looks like, basically sprinkling those asm()s > throughout the code automatically. > > Note I need to protect inputs, not outputs, otherwise the last > testcase isn't fixed. > > Improving this poor-mans solution by writing in some flow-sensitivity > like tracking which values are already protected and if there's a possibly > harmful FENV access inbetween maybe in a similar way tree-complex.c tracks > complex components might work. > > Note that the FENV pragma does _not_ enable -frounding-math (it really has > no effect!) so you need to supply -frounding-math yourself (or fix the > frontends to do that). > > It's a hack of course. > > But it fixes the testcase: > > > ./xgcc -B. t.c -O3 -lm > > ./a.out > 1/0.2: down = 4.999 near = 4.999 up = > 4.999 > a.out: t.c:32: main: Assertion `5.0 <= up' failed. > Aborted > > ./xgcc -B. t.c -O3 -lm -frounding-math > > ./a.out > 1/0.2: down = 4.999 near = 5 up = 5 > > IL after the lowering: > > main () > { > static const char __PRETTY_FUNCTION__[5] = "main"; > double near; > double up; > double down; > double op; > int D.3058; > > op = atof ("0.2"); > fesetround (1024); > __asm__ __volatile__("" : "=g" op : "0" op); > down = 1.0e+0 / op; > fesetround (2048); > __asm__ __volatile__("" : "=g" op : "0" op); > up = 1.0e+0 / op; > fesetround (0); > __asm__ __volatile__("" : "=g" op : "0" op); > near = 1.0e+0 / op; > printf ("1/%.16g: down = %.16g near = %.16g up = %.16g\n", op, down, near, > up); > ... How does this work if op is a SSA_NAME?
[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570 Martin Liška changed: What|Removed |Added Keywords||patch --- Comment #5 from Martin Liška --- I've got a patch candidate.
[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570 --- Comment #4 from Martin Liška --- (In reply to Jakub Jelinek from comment #3) > Given the TREE_STATIC on: > static const int C.0[2] = {1, 2}; > I don't understand why there is ASAN_UNPOISON/ASAN_POISON for C.0, shouldn't > that be applied solely to automatic variables, not block scope locals? Ah, you are right. We shouldn't do it for static variables.
[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570 --- Comment #3 from Jakub Jelinek --- Given the TREE_STATIC on: static const int C.0[2] = {1, 2}; I don't understand why there is ASAN_UNPOISON/ASAN_POISON for C.0, shouldn't that be applied solely to automatic variables, not block scope locals?
[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570 Martin Liška changed: What|Removed |Added CC||jason at gcc dot gnu.org --- Comment #2 from Martin Liška --- Started with r260969 where Jason emit initializer list initialization as automatic variable instead of a const int variable. Difference: BEFORE: stru::stru (struct stru * const this) { struct initializer_list D.17010; const int D.16442[2]; struct allocator_type D.16443; _1 = >v; D.16442[0] = 1; D.16442[1] = 2; D.17010._M_array = D.17010._M_len = 2; .ASAN_MARK (UNPOISON, , 1); std::allocator::allocator (); try { try { std::vector::vector (_1, D.17010, ); } finally { std::allocator::~allocator (); } } finally { .ASAN_MARK (POISON, , 1); } try { this->i = 5; } catch { _2 = >v; std::vector::~vector (_2); } } AFTER: stru::stru (struct stru * const this) { struct initializer_list D.17010; static const int C.0[2] = {1, 2}; struct allocator_type D.16443; _1 = >v; .ASAN_MARK (UNPOISON, , 8); try { D.17010._M_array = D.17010._M_len = 2; .ASAN_MARK (UNPOISON, , 1); std::allocator::allocator (); try { try { std::vector::vector (_1, D.17010, ); } finally { std::allocator::~allocator (); } } finally { .ASAN_MARK (POISON, , 1); } } finally { .ASAN_MARK (POISON, , 8); } try { this->i = 5; } catch { _2 = >v; std::vector::~vector (_2); } } I believe we're doing good and the code is really invalid. Jason?
[Bug tree-optimization/90571] Missed optimization opportunity when returning function pointers based on run-time boolean
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90571 --- Comment #3 from Richard Biener --- Turning indirect calls into direct ones might be important enough to also handle cases like int x, y; int f() { return x; } int g() { return y; } int t0(bool b) { int (*i)() = b ? &f : &g; x = 1; return i(); } int main(int ac, char**) { return t0(ac & 1); } where there are statements before the indirect call that prevent it from being simply duplicated into the predecessor blocks. The transformation "primitive" would then be to duplicate the joiner up to the call and the "interesting" part of it is creating all required PHI nodes (unless you want to make SSA rewrite deal with this somehow). Sinking stmts below the call and limiting the amount of copying is important. Note there are related PRs about us failing to sink/hoist stmts through PHI nodes when that reduces the number of PHI nodes. The transform would likely split the block, insert "block-closed" PHI nodes in the tail part for all SSA names defined in the first half and live over the new edge and then duplicate the first half re-wiring edges as needed. This is as opposed to the original testcase where a simpler pattern-matching scheme could be invented. I wonder what the "original" testcase looked like - the one in this bug is probably simplified from real-world code?
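A source-level analogue of "duplicating the joiner up to the call" may help here. This is a hand-written sketch, not compiler output: `t0_indirect` is the testcase from this comment (with the function-pointer initializer assumed to be `b ? &f : &g`), and `t0_duplicated` is a hypothetical name for what the code would look like after the store and the call are copied into both predecessors.

```cpp
#include <cassert>

int x, y;
int f() { return x; }
int g() { return y; }

// Testcase shape: the store to 'x' sits between the pointer selection and
// the indirect call, so the call cannot simply be moved into the branches.
int t0_indirect(bool b) {
  int (*i)() = b ? &f : &g;
  x = 1;
  return i();
}

// After duplicating the joiner (store + call) into each predecessor, both
// calls are direct. The store is copied too, which is why limiting the
// amount of duplicated code matters.
int t0_duplicated(bool b) {
  if (b) {
    x = 1;
    return f();
  } else {
    x = 1;
    return g();
  }
}

// Resets the globals and checks that both forms agree for a given 'b'.
void check_same(bool b) {
  x = 0; y = 7;
  const int expected = t0_indirect(b);
  x = 0; y = 7;
  assert(t0_duplicated(b) == expected);
}
```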
[Bug c++/90572] Wrong disambiguation in friend declaration as implicit typename context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90572 Marek Polacek changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-05-22 CC||mpolacek at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |mpolacek at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Marek Polacek --- Thanks for the bug report; mine.
[Bug target/71124] Compiler enters infinite loop on Microblaze with -O1/-O2/-O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71124 --- Comment #4 from Giulio Benetti --- Previous Comment was wrong. This duplicates bug: *** This bug has been marked as a duplicate of bug 85180 ***
[Bug target/71124] Compiler enters infinite loop on Microblaze with -O1/-O2/-O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71124 Giulio Benetti changed: What|Removed |Added CC||giulio.benetti@micronovasrl ||.com --- Comment #3 from Giulio Benetti --- Duplicate then. *** This bug has been marked as a duplicate of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85180 ***
[Bug middle-end/34678] Optimization generates incorrect code with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678 --- Comment #38 from Marc Glisse --- (In reply to Marc Glisse from comment #37) > If you protect even constants, the current effects of -frounding-math become > redundant. Oops, forget that, the hack is too late for this sentence to be true, some constant propagation has already happened by that time.
[Bug tree-optimization/90571] Missed optimization opportunity when returning function pointers based on run-time boolean
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90571 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-05-22 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Richard Biener --- Let me try.
[Bug tree-optimization/90571] Missed optimization opportunity when returning function pointers based on run-time boolean
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90571 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Component|c++ |tree-optimization --- Comment #1 from Richard Biener --- I think there's a dup for this somewhere. Basically we fail to optimize if (_2 != 0) goto ; [50.00%] else goto ; [50.00%] : : # iftmp.0_10 = PHI _11 = iftmp.0_10 (); on the GIMPLE level. It might be tempting to enable tree-ssa-phiprop.c to transform this into if (_2 != 0) goto ; [50.00%] else goto ; [50.00%] : tem1 = f(); goto bb4; : tem2 = g(); : # iftmp.0_10 = PHI _11 = iftmp.0_10;
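At the source level, the GIMPLE rewrite sketched above corresponds roughly to the following. This is an illustration only; `call_indirect` and `call_direct` are hypothetical names, and `f` and `g` are the two functions from the original report.

```cpp
#include <cassert>

int f() { return 0; }
int g() { return 1; }

// Before: the PHI merges two function addresses and the call is indirect.
int call_indirect(bool b) {
  int (*fn)() = b ? &f : &g;
  return fn();
}

// After: each predecessor makes a direct call (tem1, tem2 in the GIMPLE)
// and the PHI merges the two results instead of the two addresses.
int call_direct(bool b) {
  int result;
  if (b)
    result = f();
  else
    result = g();
  return result;
}

// Checks that both forms return the same value for either input.
void check_equivalent() {
  assert(call_indirect(true) == call_direct(true));
  assert(call_indirect(false) == call_direct(false));
}
```

Once the calls are direct, the usual inlining applies, which is the form in which `t1` from the report was already being optimized.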
[Bug middle-end/34678] Optimization generates incorrect code with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678 --- Comment #37 from Marc Glisse --- (In reply to Richard Biener from comment #36) > Created attachment 46396 [details] > poor mans solution^Whack > > So this is what a hack looks like, basically sprinkling those asm()s > throughout the code automatically. > > Note I need to protect inputs, not outputs, otherwise the last > testcase isn't fixed. Actually, you need to protect both inputs *and* outputs... > Improving this poor-mans solution by writing in some flow-sensitivity > like tracking which values are already protected At least if you use "=x" (or whatever the right constraint is on each target) it doesn't really hurt to have a dozen protections on the same variable. > and if there's a possibly > harmful FENV access in between maybe in a similar way tree-complex.c tracks > complex components might work. > > Note that the FENV pragma does _not_ enable -frounding-math (it really has > no effect!) so you need to supply -frounding-math yourself (or fix the > frontends to do that). If you protect even constants, the current effects of -frounding-math become redundant.
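For readers who want to try the idea by hand, here is a minimal sketch of protecting both inputs and outputs around a rounding-mode switch. It is not the attached patch: `protect` and `divide_rounded` are hypothetical helpers, the empty asm is a GCC/Clang extension, and the "+m" constraint is chosen here only for portability across targets (the thread discusses "=g" and "=x").

```cpp
#include <cassert>
#include <cfenv>

// Empty asm that makes the compiler treat 'v' as read and rewritten at this
// point, so values computed earlier cannot be reused or folded across it.
inline void protect(double &v) { __asm__ __volatile__("" : "+m"(v)); }

// Computes num / den under rounding mode 'mode' (e.g. FE_DOWNWARD) and then
// restores the previous mode.
double divide_rounded(double num, double den, int mode) {
  const int saved = std::fegetround();
  const int rc = std::fesetround(mode);
  assert(rc == 0);
  (void)rc;
  protect(num);  // inputs: the division must use values "seen" after the switch
  protect(den);
  double q = num / den;
  protect(q);    // output: the division must be finished before the restore
  std::fesetround(saved);
  return q;
}
```

As noted in the thread, `-frounding-math` may still be needed, since some constant propagation can happen before such barriers take effect.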
[Bug fortran/90539] [10 Regression] 481.wrf slowdown by 25% on Intel Kaby with -Ofast -march=native starting with r271377
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90539 --- Comment #20 from Martin Liška --- (In reply to Thomas Koenig from comment #19) > Thanks. > > A bit more: > > What are the declarations of the actual srgument, > of the dummy argument (on the callee side), > and what is the argument in the call list? > > > Ill try to construct a test case tonight then. So the callee is actually a C function: ;; Function nf_put_vara_double_ (null) ;; enabled by -tree-original { size_t B3[512]; size_t B4[512]; int A0; # DEBUG BEGIN STMT; size_t B3[512]; # DEBUG BEGIN STMT; size_t B4[512]; # DEBUG BEGIN STMT; int A0; # DEBUG BEGIN STMT; A0 = nc_put_vara_double (*fncid, *fvarid + -1, (const size_t *) f2c_coords (*fncid, *fvarid + -1, (const int *) A3, (size_t *) ), (const size_t *) f2c_counts (*fncid, *fvarid + -1, (const int *) A4, (size_t *) ), A5); # DEBUG BEGIN STMT; return A0; } where nc_put_vara_double is defined as: int nc_put_vara_double(int ncid, int varid, const size_t *start, const size_t *edges, const double *value) { int status = NC_NOERR; NC *ncp; const NC_var *varp; int ii; size_t iocount; status = NC_check_id(ncid, ); if(status != NC_NOERR) return status; if(NC_readonly(ncp)) return NC_EPERM; if(NC_indef(ncp)) return NC_EINDEFINE; varp = NC_lookupvar(ncp, varid); if(varp == NULL) return NC_ENOTVAR; /* TODO: lost NC_EGLOBAL */ if(varp->type == NC_CHAR) return NC_ECHAR; status = NCcoordck(ncp, varp, start); if(status != NC_NOERR) return status; status = NCedgeck(ncp, varp, start, edges); if(status != NC_NOERR) return status; if(varp->ndims == 0) /* scalar variable */ { return( putNCv_double(ncp, varp, start, 1, value) ); } if(IS_RECVAR(varp)) { status = NCvnrecs(ncp, *start + *edges); if(status != NC_NOERR) return status; if(varp->ndims == 1 && ncp->recsize <= varp->len) { /* one dimensional && the only record variable */ return( putNCv_double(ncp, varp, start, *edges, value) ); } } /* * find max contiguous * and accumulate max count for a single io operation */ ii = 
NCiocount(ncp, varp, edges, ); if(ii == -1) { return( putNCv_double(ncp, varp, start, iocount, value) ); } assert(ii >= 0); { /* inline */ ALLOC_ONSTACK(coord, size_t, varp->ndims); ALLOC_ONSTACK(upper, size_t, varp->ndims); const size_t index = ii; /* copy in starting indices */ (void) memcpy(coord, start, varp->ndims * sizeof(size_t)); /* set up in maximum indices */ set_upper(upper, start, edges, [varp->ndims]); /* ripple counter */ while(*coord < *upper) { const int lstatus = putNCv_double(ncp, varp, coord, iocount, value); if(lstatus != NC_NOERR) { if(lstatus != NC_ERANGE) { status = lstatus; /* fatal for the loop */ break; } /* else NC_ERANGE, not fatal for the loop */ if(status == NC_NOERR) status = lstatus; } value += iocount; odo1(start, upper, coord, [index], [index]); } FREE_ONSTACK(upper); FREE_ONSTACK(coord); } /* end inline */ return status; } that calls:
[Bug c++/90572] New: Wrong disambiguation in friend declaration as implicit typename context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90572 Bug ID: 90572 Summary: Wrong disambiguation in friend declaration as implicit typename context Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: blitzrakete at gmail dot com Target Milestone: --- template struct C { friend C(T::fn)(); // not implicit typename context, declarator-id of friend // declaration }; Courtesy of rsmith. gcc fails to compile this with -std=c++2a, but accepts it in C++17 mode. :11:19: error: ISO C++ forbids declaration of 'C' with no type [-fpermissive] 11 | friend C(T::fn)(); // not implicit typename context, declarator-id of friend | ^ :11:19: error: 'C' declared as function returning a function gcc interprets this as a function taking a T::fn and returning a function, while it should be a function returning C taking no parameters with the (qualified) name T::fn.
[Bug fortran/90539] [10 Regression] 481.wrf slowdown by 25% on Intel Kaby with -Ofast -march=native starting with r271377
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90539 --- Comment #19 from Thomas Koenig --- Thanks. A bit more: What are the declarations of the actual argument, of the dummy argument (on the callee side), and what is the argument in the call list? I'll try to construct a test case tonight then.
[Bug tree-optimization/88440] size optimization of memcpy-like code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440 --- Comment #22 from Richard Biener --- The code in question was originally added with r202721 by Vlad and likely became more costly after making the target macro a hook (no inlining anymore).
[Bug c++/90571] New: Missed optimization opportunity when returning function pointers based on run-time boolean
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90571 Bug ID: 90571 Summary: Missed optimization opportunity when returning function pointers based on run-time boolean Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vittorio.romeo at outlook dot com Target Milestone: --- Given the following two functions: int f() { return 0; } int g() { return 1; } And the following code to invoke one of them depending on a boolean `b`: int t0(bool b) { return (b ? &f : &g)(); } int t1(bool b) { return b ? f() : g(); } int t2(bool b) { return b ? t0(true) : t0(false); } Both `g++ (trunk)` and `clang++ (trunk)` with `-std=c++2a -Ofast -march=native` fail to optimize the following code: int main(int ac, char**) { return t0(ac & 1); } Producing the following assembly: > main: > and edi, 1 > mov eax, OFFSET FLAT:f() > mov edx, OFFSET FLAT:g() > cmove rax, rdx > jmp rax > Invoking `t1` or `t2` (instead of `t0`) produces the following optimized assembly: > main: > mov eax, edi > not eax > and eax, 1 > ret Everything can be reproduced live on **gcc.godbolt.org**: https://godbolt.org/z/gh7270
[Bug ipa/88231] aligned functions laid down inefficiently
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88231 Martin Liška changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|marxin at gcc dot gnu.org |unassigned at gcc dot gnu.org --- Comment #7 from Martin Liška --- (In reply to Martin Sebor from comment #5) > The feature already exists at -Os by default (i.e., all functions are by > default minimally aligned). The suggestion here is only to let GCC minimize > the amount of padding it adds to functions in order to align the explicitly > overaligned ones that follow by changing the order it emits them in. > > Outside -Os, functions would continue to be optimally aligned unless > overridden by the attribute. When their alignment is explicitly reduced by > the attribute GCC could still be smart about ordering them so as to minimize > wasted space. Consider: > > __attribute__ ((aligned (4))) int f4 (int i) { return 2 * i; } > double f (double x) { return x * x * x; } > __attribute__ ((aligned (4))) int g4 (int i) { return i; } > > for which GCC for x86_64 emits: > > :;; unnecessarily overaligned > 0: 8d 04 3flea(%rdi,%rdi,1),%eax > 3: c3 retq > 4: 66 90 xchg %ax,%ax > 6: 66 2e 0f 1f 84 00 00nopw %cs:0x0(%rax,%rax,1) > d: 00 00 00 > > 0010 : ;; optimally aligned > 10: 66 0f 28 c8 movapd %xmm0,%xmm1 > 14: f2 0f 59 c8 mulsd %xmm0,%xmm1 > 18: f2 0f 59 c1 mulsd %xmm1,%xmm0 > 1c: c3 retq > 1d: 0f 1f 00nopl (%rax) > > 0020 :;; also unnecessarily overaligned > 20: 89 f8 mov%edi,%eax > 22: c3 retq > > If it laid down f first instead it would be able to avoid padding f4: > > : >0: 66 0f 28 c8 movapd %xmm0,%xmm1 >4: f2 0f 59 c8 mulsd %xmm0,%xmm1 >8: f2 0f 59 c1 mulsd %xmm1,%xmm0 >c: c3 retq >d: 0f 1f 00nopl (%rax) > > 0010 : ;; unavoidably overaligned > 10: 8d 04 3flea(%rdi,%rdi,1),%eax > 13: c3 retq > > 0014 : ;; aligned exactly as requested > 14: 89 f8 mov%edi,%eax > 16: c3 retq > Can we do such an optimization without GAS information about size of every function?
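The effect of emission order on padding can be checked with a little arithmetic. The sketch below is only a model: `Func` and `layout_end` are hypothetical names, and it assumes exactly what the question above asks about, namely that every function's size is known before the order is chosen. Sizes are read off the quoted listing (f4: 4 bytes, f: 13 bytes, g4: 3 bytes), with f aligned to 16 and f4, g4 aligned to 4 as requested by the attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One function to be laid out: code size and required alignment, in bytes.
struct Func {
  std::size_t size;
  std::size_t align;
};

// Places the functions in the given order, padding each start address up to
// its alignment, and returns the offset just past the last function.
std::size_t layout_end(const std::vector<Func> &order) {
  std::size_t offset = 0;
  for (const Func &fn : order) {
    assert(fn.align != 0);
    offset = (offset + fn.align - 1) / fn.align * fn.align;  // round up
    offset += fn.size;
  }
  return offset;
}
```

Under this model the order f4, f, g4 ends at offset 35 (0x23), while f, f4, g4 ends at offset 23 (0x17), matching the second listing in the quoted comment.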
[Bug debug/86964] [7/8 Regression] Too many debug symbols included, especially for extern globals
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86964 --- Comment #18 from Thomas De Schampheleire --- Second version of patch, fixing testsuite failures, was posted: https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01403.html
[Bug ipa/88231] aligned functions laid down inefficiently
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88231 --- Comment #6 from Martin Liška --- (In reply to Andi Kleen from comment #4) > I'm not sure it's a good idea to do this. Often the goal is not to get the > absolute smallest code, but to get code that minimizes cache line usage. > This is important for "frontend bound" code like gcc itself often is. > > It would be rather better to use an algorithm like Pettis-Hansen or the one > in hfsort (see > https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1. > pdf) to lay out the code based on expected call order to minimize foot > print. For best result would need profile feedback of course, but it might > already do a reasonable job based on static call frequencies. I'm planning to implement that for GCC 10 with LTO and PGO. So far, we've been ordering functions with LTO by their first call. We can definitely do better.
[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570 Martin Liška changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-05-22 Known to work||8.3.0 Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org Target Milestone|--- |9.2 Summary|AddressSanitizer: |[9/10 Regression] |stack-use-after-scope |AddressSanitizer: ||stack-use-after-scope Ever confirmed|0 |1 Known to fail||10.0, 9.1.0 --- Comment #1 from Martin Liška --- Let me take a look..
[Bug sanitizer/90570] New: AddressSanitizer: stack-use-after-scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570 Bug ID: 90570 Summary: AddressSanitizer: stack-use-after-scope Product: gcc Version: 9.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: sanitizer Assignee: unassigned at gcc dot gnu.org Reporter: mtekieli at gmail dot com CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at gcc dot gnu.org Target Milestone: --- root@marcin:~# cat main.cpp #include <vector> struct stru { std::vector<int> v{1,2,3,4}; int i{5}; }; int main() { stru s1; stru s2; return 0; } root@marcin:~# g++-9 -fsanitize=address main.cpp -o main root@marcin:~# ./main = ==1656==ERROR: AddressSanitizer: stack-use-after-scope on address 0x55fd2cb681c0 at pc 0x7f1c3d1a7b90 bp 0x7fff14bed7c0 sp 0x7fff14becf68 The result is the same if vector is changed to set or map. Works OK with GCC 8.3 and Clang 8.
[Bug bootstrap/90543] Build failure on MINGW for gcc-9.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90543 --- Comment #8 from Jakub Jelinek --- (In reply to Jonathan Wakely from comment #6) > Neither uintptr_t nor PRIxPTR (nor long long nor uint64_t) is part of C++98, > which GCC still requires. I do see existing uses of intptr_t and uintptr_t > in gcc/cp/*.c though. For intptr_t and uintptr_t configure arranges to have those defined: -- Macro: AC_TYPE_UINTPTR_T If `stdint.h' or `inttypes.h' defines the type `uintptr_t', define `HAVE_UINTPTR_T'. Otherwise, define `uintptr_t' to an unsigned integer type wide enough to hold a pointer, if such a type exists. -- Macro: AC_TYPE_INTPTR_T If `stdint.h' or `inttypes.h' defines the type `intptr_t', define `HAVE_INTPTR_T'. Otherwise, define `intptr_t' to a signed integer type wide enough to hold a pointer, if such a type exists. and all we require is that such a type exists, so hosts where pointers don't have size of unsigned int, unsigned long or unsigned long long and don't provide stdint.h or inttypes.h defining those are unsupported. Are there any?
[Bug fortran/89100] Default widths for i, f and g format specifiers in format strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89100

--- Comment #12 from Janne Blomqvist ---
Author: jb
Date: Wed May 22 11:56:01 2019
New Revision: 271511

URL: https://gcc.gnu.org/viewcvs?rev=271511&root=gcc&view=rev
Log:
fortran/89100: Default widths with -fdec-format-defaults

gcc/fortran ChangeLog:

2019-05-22  Jeff Law
            Mark Eggleston

    PR fortran/89100
    * gfortran.texi: Add Default widths for F, G and I format
    descriptors to Extensions section.
    * invoke.texi: Add -fdec-format-defaults
    * io.c (check_format): Use default widths for i, f and g when
    flag_dec_format_defaults is enabled.
    * lang.opt: Add new option.
    * options.c (set_dec_flags): Add SET_BITFLAG for
    flag_dec_format_defaults.

gcc/testsuite ChangeLog:

2019-05-22  Mark Eggleston

    PR fortran/89100
    * gfortran.dg/fmt_f_default_field_width_1.f90: New test.
    * gfortran.dg/fmt_f_default_field_width_2.f90: New test.
    * gfortran.dg/fmt_f_default_field_width_3.f90: New test.
    * gfortran.dg/fmt_g_default_field_width_1.f90: New test.
    * gfortran.dg/fmt_g_default_field_width_2.f90: New test.
    * gfortran.dg/fmt_g_default_field_width_3.f90: New test.
    * gfortran.dg/fmt_i_default_field_width_1.f90: New test.
    * gfortran.dg/fmt_i_default_field_width_2.f90: New test.
    * gfortran.dg/fmt_i_default_field_width_3.f90: New test.

libgfortran ChangeLog:

2019-05-22  Jeff Law

    PR fortran/89100
    * io/format.c (parse_format_list): set default width when the
    IOPARM_DT_DEC_EXT flag is set for i, f and g.
    * io/io.h: add default_width_for_integer, default_width_for_float
    and default_precision_for_float.
    * io/write.c (write_boz): extra parameter giving length of data
    corresponding to the type's kind.
    (write_b): pass data length as extra parameter in calls to
    write_boz.
    (write_o): pass data length as extra parameter in calls to
    write_boz.
    (write_z): pass data length as extra parameter in calls to
    write_boz.
    (size_from_kind): also set size if default width is set.
    * io/write_float.def (build_float_string): new parameter inserted
    before result parameter. If default width use values passed
    instead of the values in fnode.
    (FORMAT_FLOAT): macro modified to check for default width and
    calls to build_float_string to pass in default width.
    (get_float_string): set width and precision to defaults when
    needed.

Added:
    trunk/gcc/testsuite/gfortran.dg/fmt_f_default_field_width_1.f90
    trunk/gcc/testsuite/gfortran.dg/fmt_f_default_field_width_2.f90
    trunk/gcc/testsuite/gfortran.dg/fmt_f_default_field_width_3.f90
    trunk/gcc/testsuite/gfortran.dg/fmt_g_default_field_width_1.f90
    trunk/gcc/testsuite/gfortran.dg/fmt_g_default_field_width_2.f90
    trunk/gcc/testsuite/gfortran.dg/fmt_g_default_field_width_3.f90
    trunk/gcc/testsuite/gfortran.dg/fmt_i_default_field_width_1.f90
    trunk/gcc/testsuite/gfortran.dg/fmt_i_default_field_width_2.f90
    trunk/gcc/testsuite/gfortran.dg/fmt_i_default_field_width_3.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/gfortran.texi
    trunk/gcc/fortran/invoke.texi
    trunk/gcc/fortran/io.c
    trunk/gcc/fortran/lang.opt
    trunk/gcc/fortran/options.c
    trunk/gcc/testsuite/ChangeLog
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/io/format.c
    trunk/libgfortran/io/io.h
    trunk/libgfortran/io/read.c
    trunk/libgfortran/io/write.c
    trunk/libgfortran/io/write_float.def
[Bug bootstrap/90543] Build failure on MINGW for gcc-9.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90543

Jakub Jelinek changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek ---
http://gcc.gnu.org/ml/gcc-patches/2019-05/msg01492.html
[Bug bootstrap/90543] Build failure on MINGW for gcc-9.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90543 --- Comment #6 from Jonathan Wakely --- Neither uintptr_t nor PRIxPTR (nor long long nor uint64_t) is part of C++98, which GCC still requires. I do see existing uses of intptr_t and uintptr_t in gcc/cp/*.c though.
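For readers unfamiliar with the names in the comment above, here is a minimal sketch of what uintptr_t and PRIxPTR provide in C99 (and C++11): a pointer-sized integer type and a matching printf length modifier. The helper names format_ptr_hex and parse_ptr_hex are hypothetical and are not taken from the GCC sources or from the proposed patch.

```c
#include <inttypes.h>  /* PRIxPTR, SCNxPTR (C99; not part of C++98) */
#include <stddef.h>
#include <stdint.h>    /* uintptr_t (C99; not part of C++98) */
#include <stdio.h>

/* Hypothetical helper: write PTR as lowercase hex into BUF.
   Returns the snprintf result (characters that would be written).  */
static int
format_ptr_hex (char *buf, size_t len, const void *ptr)
{
  /* uintptr_t is wide enough for a pointer on every data model,
     including those where long is narrower than a pointer.  */
  return snprintf (buf, len, "%" PRIxPTR, (uintptr_t) ptr);
}

/* Hypothetical helper: parse the hex text produced above back into a
   pointer.  Returns NULL if the text cannot be parsed.  */
static const void *
parse_ptr_hex (const char *text)
{
  uintptr_t value;

  if (sscanf (text, "%" SCNxPTR, &value) != 1)
    return NULL;
  return (const void *) value;
}
```

A buffer of 2 * sizeof (uintptr_t) + 1 bytes is always large enough for the hex text. Code that has to stay within C++98 cannot rely on either header, which is the constraint the comment points out.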
[Bug tree-optimization/88440] size optimization of memcpy-like code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440

--- Comment #21 from Richard Biener ---
Ick.

static inline void
check_pseudos_live_through_calls (int regno,
                                  HARD_REG_SET last_call_used_reg_set,
                                  rtx_insn *call_insn)
{
...
  for (hr = 0; HARD_REGISTER_NUM_P (hr); hr++)
    if (targetm.hard_regno_call_part_clobbered (call_insn, hr,
                                                PSEUDO_REGNO_MODE (regno)))
      add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
                           PSEUDO_REGNO_MODE (regno), hr);

this loop is repeatedly computing an implicit hard-reg set for which
hard-regs are partly clobbered by the call for the _same_ actual
instruction since check_pseudos_live_through_calls is called via

      /* Mark each defined value as live.  We need to do this for
         unused values because they still conflict with quantities
         that are live at the time of the definition.  */
      for (reg = curr_id->regs; reg != NULL; reg = reg->next)
        {
          if (reg->type != OP_IN)
            {
              update_pseudo_point (reg->regno, curr_point, USE_POINT);
              mark_regno_live (reg->regno, reg->biggest_mode);
              check_pseudos_live_through_calls (reg->regno,
                                                last_call_used_reg_set,
                                                call_insn);
...
            }

and

          EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j)
            {
              IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set,
                                this_call_used_reg_set);
              if (flush)
                check_pseudos_live_through_calls (j,
                                                  last_call_used_reg_set,
                                                  last_call_insn);
            }

and

      /* Mark each used value as live.  */
      for (reg = curr_id->regs; reg != NULL; reg = reg->next)
        if (reg->type != OP_OUT)
          {
            if (reg->type == OP_IN)
              update_pseudo_point (reg->regno, curr_point, USE_POINT);
            mark_regno_live (reg->regno, reg->biggest_mode);
            check_pseudos_live_through_calls (reg->regno,
                                              last_call_used_reg_set,
                                              call_insn);
          }

and

  EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), FIRST_PSEUDO_REGISTER, j, bi)
    {
      if (sparseset_cardinality (pseudos_live_through_calls) == 0)
        break;
      if (sparseset_bit_p (pseudos_live_through_calls, j))
        check_pseudos_live_through_calls (j, last_call_used_reg_set,
                                          call_insn);
    }

the pseudo's mode may change but I guess usually it doesn't. I also
wonder why the target hook doesn't return a hard-reg-set ...

That said, the above code doesn't scale well with functions with a lot
of calls at least, also the passed call_insn isn't the current insn and
might even be NULL. All but aarch64 do not even look at the actual
instruction (even more an argument for re-designing the hook with its
use in mind).

I guess an artificial testcase with a lot of calls and a lot of live
pseudos (even single-BB) should show this issue easily.

Samples: 579  of event 'cycles:ppp', Event count (approx.): 257134187434191
Overhead  Command  Shared Object  Symbol
  22.26%  f951     f951           [.] process_bb_lives
  15.06%  f951     f951           [.] ix86_hard_regno_call_part_clobbered
   8.55%  f951     f951           [.] concat
   6.88%  f951     f951           [.] find_base_term
   3.60%  f951     f951           [.] get_ref_base_and_extent
   3.27%  f951     f951           [.] find_base_term
   2.95%  f951     f951           [.] make_hard_regno_dead
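To make the scaling argument concrete, here is a toy model in C, not GCC code. It contrasts querying a per-register predicate for every pseudo with computing the part-clobbered set once per mode and reusing it. Everything in it is an assumption made for illustration: the register and mode counts, the predicate toy_part_clobbered (a stand-in for targetm.hard_regno_call_part_clobbered that ignores the call insn), and the clobber_cache structure. A real cache would also have to be reset whenever the call insn changes, since aarch64 does look at the instruction.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_HARD_REGS 64  /* hypothetical; fits in one 64-bit mask */
#define NUM_MODES 4       /* hypothetical number of machine modes */

/* Counts invocations of the toy hook so the cost can be compared.  */
static unsigned long hook_calls;

/* Toy stand-in for the target hook: pretend hard regs 16..31 are
   partly clobbered by calls for the two widest modes.  */
static int
toy_part_clobbered (int hard_regno, int mode)
{
  hook_calls++;
  return mode >= 2 && hard_regno >= 16 && hard_regno < 32;
}

/* Shape of the quoted loop: one hook query per hard register, repeated
   for every pseudo that is checked.  */
static uint64_t
conflicts_naive (int mode)
{
  uint64_t set = 0;

  for (int hr = 0; hr < NUM_HARD_REGS; hr++)
    if (toy_part_clobbered (hr, mode))
      set |= UINT64_C (1) << hr;
  return set;
}

/* One cached set per mode, filled lazily on first use.  */
struct clobber_cache
{
  uint64_t set[NUM_MODES];
  unsigned char valid[NUM_MODES];
};

/* Same result as conflicts_naive, but the hook is queried at most
   NUM_HARD_REGS times per mode for the lifetime of the cache.  */
static uint64_t
conflicts_cached (struct clobber_cache *cache, int mode)
{
  assert (mode >= 0 && mode < NUM_MODES);
  if (!cache->valid[mode])
    {
      cache->set[mode] = conflicts_naive (mode);
      cache->valid[mode] = 1;
    }
  return cache->set[mode];
}
```

With 100 pseudos of the same mode, the naive version calls the hook 100 * NUM_HARD_REGS times while the cached version calls it NUM_HARD_REGS times. A hook that returned the whole set directly would remove even that inner loop from the caller.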