[Bug middle-end/90549] missing -Wreturn-local-addr maybe returning an address of a local array plus offset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90549

Martin Sebor changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |10.0

--- Comment #6 from Martin Sebor ---
Fixed via r273261. Both functions in the test case are now diagnosed:

pr90549.c: In function ‘f’:
pr90549.c:7:10: warning: function may return address of local variable [-Wreturn-local-addr]
    7 |   return p;// -Wreturn-local-addr (good)
      |          ^
pr90549.c:5:7: note: declared here
    5 |   int b[2];
      |       ^
pr90549.c: In function ‘g’:
pr90549.c:15:12: warning: function may return address of local variable [-Wreturn-local-addr]
   15 |   return p + 1;// missing -Wreturn-local-addr
      |          ~~^~~
pr90549.c:12:7: note: declared here
   12 |   int b[2];
      |       ^
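For readers without the attachment, here is a minimal sketch of the kind of code these diagnostics describe. It is not the test case from the bug: the function bodies, the parameter names `a` and `i`, and the helper `demo_return_paths` are assumptions, reconstructed only from the quoted source lines (`int b[2];`, `return p;`, `return p + 1;`).

```c
#include <assert.h>

/* Hypothetical reconstruction, NOT the actual pr90549.c test case. */

/* May return the address of the local array b when i is zero. */
int *f (int *a, int i)
{
  int b[2];
  int *p = i ? a : b;
  return p;        /* a compiler with the fix is expected to warn here */
}

/* Same shape, but the returned pointer carries an offset; this is
   the form the bug reported as undiagnosed. */
int *g (int *a, int i)
{
  int b[2];
  int *p = i ? a : b;
  return p + 1;
}

/* Exercises only the well-defined paths (i != 0), where the result
   points into the caller's array rather than into the dead frame. */
void demo_return_paths (void)
{
  int x[2] = { 0, 0 };
  assert (f (x, 1) == x);
  assert (g (x, 1) == x + 1);
}
```

Calling either function with `i == 0` and then using the result would be undefined behavior, which is why the sketch only exercises the other path.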
[Bug other/90556] [meta-bug] bogus/missing -Wreturn-local-addr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90556 Bug 90556 depends on bug 90549, which changed state. Bug 90549 Summary: missing -Wreturn-local-addr maybe returning an address of a local array plus offset https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90549 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug other/90556] [meta-bug] bogus/missing -Wreturn-local-addr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90556 Bug 90556 depends on bug 71924, which changed state. Bug 71924 Summary: missing -Wreturn-local-addr returning alloca result https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71924 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug c/71924] missing -Wreturn-local-addr returning alloca result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71924 Martin Sebor changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |10.0 --- Comment #7 from Martin Sebor --- Patch committed in r273261.
[Bug c++/64867] split warning for passing non-POD to varargs function from -Wconditionally-supported into new warning flag, -Wnon-pod-varargs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64867 Eric Gallager changed: What|Removed |Added CC||msebor at gcc dot gnu.org --- Comment #26 from Eric Gallager --- Martin Sebor has been doing stuff related to warnings about POD-ness lately; cc-ing him
[Bug middle-end/90549] missing -Wreturn-local-addr maybe returning an address of a local array plus offset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90549 --- Comment #5 from Martin Sebor --- Author: msebor Date: Tue Jul 9 04:15:42 2019 New Revision: 273261 URL: https://gcc.gnu.org/viewcvs?rev=273261=gcc=rev Log: PR middle-end/71924 - missing -Wreturn-local-addr returning alloca result PR middle-end/90549 - missing -Wreturn-local-addr maybe returning an address of a local array plus offset gcc/ChangeLog: PR middle-end/71924 PR middle-end/90549 * gimple-ssa-isolate-paths.c (isolate_path): Add attribute. Update comment. (args_loc_t): New type. (args_loc_t, locmap_t): same. (diag_returned_locals): New function. (is_addr_local): Same. (handle_return_addr_local_phi_arg, warn_return_addr_local): Same. (find_implicit_erroneous_behavior): Call warn_return_addr_local_phi_arg. (find_explicit_erroneous_behavior): Call warn_return_addr_local. gcc/testsuite/ChangeLog: PR middle-end/71924 PR middle-end/90549 * gcc.c-torture/execute/return-addr.c: New test. * gcc.dg/Wreturn-local-addr-2.c: New test. * gcc.dg/Wreturn-local-addr-4.c: New test. * gcc.dg/Wreturn-local-addr-5.c: New test. * gcc.dg/Wreturn-local-addr-6.c: New test. * gcc.dg/Wreturn-local-addr-7.c: New test. * gcc.dg/Wreturn-local-addr-8.c: New test. * gcc.dg/Wreturn-local-addr-9.c: New test. * gcc.dg/Wreturn-local-addr-10.c: New test. * gcc.dg/Walloca-4.c: Handle expected warnings. * gcc.dg/pr41551.c: Same. * gcc.dg/pr59523.c: Same. * gcc.dg/tree-ssa/pr88775-2.c: Same. * gcc.dg/tree-ssa/alias-37.c: Same. * gcc.dg/winline-7.c: Same. 
Added: trunk/gcc/testsuite/gcc.c-torture/execute/return-addr.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-10.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-2.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-3.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-4.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-5.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-6.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-7.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-8.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-9.c Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-ssa-isolate-paths.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/Walloca-4.c trunk/gcc/testsuite/gcc.dg/pr41551.c trunk/gcc/testsuite/gcc.dg/pr59523.c trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-37.c trunk/gcc/testsuite/gcc.dg/tree-ssa/pr88775-2.c trunk/gcc/testsuite/gcc.dg/winline-7.c trunk/libgcc/generic-morestack.c
[Bug c/71924] missing -Wreturn-local-addr returning alloca result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71924 --- Comment #6 from Martin Sebor --- Author: msebor Date: Tue Jul 9 04:15:42 2019 New Revision: 273261 URL: https://gcc.gnu.org/viewcvs?rev=273261=gcc=rev Log: PR middle-end/71924 - missing -Wreturn-local-addr returning alloca result PR middle-end/90549 - missing -Wreturn-local-addr maybe returning an address of a local array plus offset gcc/ChangeLog: PR middle-end/71924 PR middle-end/90549 * gimple-ssa-isolate-paths.c (isolate_path): Add attribute. Update comment. (args_loc_t): New type. (args_loc_t, locmap_t): same. (diag_returned_locals): New function. (is_addr_local): Same. (handle_return_addr_local_phi_arg, warn_return_addr_local): Same. (find_implicit_erroneous_behavior): Call warn_return_addr_local_phi_arg. (find_explicit_erroneous_behavior): Call warn_return_addr_local. gcc/testsuite/ChangeLog: PR middle-end/71924 PR middle-end/90549 * gcc.c-torture/execute/return-addr.c: New test. * gcc.dg/Wreturn-local-addr-2.c: New test. * gcc.dg/Wreturn-local-addr-4.c: New test. * gcc.dg/Wreturn-local-addr-5.c: New test. * gcc.dg/Wreturn-local-addr-6.c: New test. * gcc.dg/Wreturn-local-addr-7.c: New test. * gcc.dg/Wreturn-local-addr-8.c: New test. * gcc.dg/Wreturn-local-addr-9.c: New test. * gcc.dg/Wreturn-local-addr-10.c: New test. * gcc.dg/Walloca-4.c: Handle expected warnings. * gcc.dg/pr41551.c: Same. * gcc.dg/pr59523.c: Same. * gcc.dg/tree-ssa/pr88775-2.c: Same. * gcc.dg/tree-ssa/alias-37.c: Same. * gcc.dg/winline-7.c: Same. 
Added: trunk/gcc/testsuite/gcc.c-torture/execute/return-addr.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-10.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-2.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-3.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-4.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-5.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-6.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-7.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-8.c trunk/gcc/testsuite/gcc.dg/Wreturn-local-addr-9.c Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-ssa-isolate-paths.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/Walloca-4.c trunk/gcc/testsuite/gcc.dg/pr41551.c trunk/gcc/testsuite/gcc.dg/pr59523.c trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-37.c trunk/gcc/testsuite/gcc.dg/tree-ssa/pr88775-2.c trunk/gcc/testsuite/gcc.dg/winline-7.c trunk/libgcc/generic-morestack.c
[Bug target/91103] AVX512 vector element extract uses more than 1 shuffle instruction; VALIGND can grab any element
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91103

--- Comment #4 from Peter Cordes ---
We should not put any stock in what ICC does for GNU C native vector indexing. I think it doesn't know how to optimize that because it *always* spills/reloads even for `vec[0]` which could be a no-op. And it's always a full-width spill (ZMM), not just the low XMM/YMM part that contains the desired element. I mainly mentioned ICC in my initial post to suggest the store/reload strategy in general as an *option*.

ICC also doesn't optimize intrinsics: it pretty much always faithfully transliterates them to asm. e.g. v = _mm_add_epi32(v, _mm_set1_epi32(1)); twice compiles to two separate paddd instructions, instead of one with a constant of set1(2).

If we want to see ICC's strided-store strategy, we'd need to write some pure C that auto-vectorizes.

That said, store/reload is certainly a valid option when we want all the elements, and gets *more* attractive with wider vectors, where the one extra store amortizes over more elements.

Strided stores will typically bottleneck on cache/memory bandwidth unless the destination lines are already hot in L1d. But if there's other work in the loop, we care about OoO exec of that work with the stores, so uop throughput could be a factor.

If we're tuning for Intel Haswell/Skylake with 1 per clock shuffles but 2 loads + 1 store per clock throughput (if we avoid indexed addressing modes for stores), then it's very attractive and unlikely to be a bottleneck.

There's typically spare load execution-unit cycles in a loop that's also doing stores + other work. You need every other uop to be (or include) a load to bottleneck on that at 4 uops per clock, unless you have indexed stores (which can't run on the simple store-AGU on port 7 and need to run on port 2/3, taking a cycle from a load). Cache-split loads do get replayed to grab the 2nd half, so it costs extra execution-unit pressure as well as extra cache-read cycles.

Intel says Ice Lake will have 2 load + 2 store pipes, and a 2nd shuffle unit. A mixed strategy there might be interesting: extract the high 256 bits to memory with vextractf32x8 and reload it, but shuffle the low 128/256 bits. That strategy might be good on earlier CPUs, too. At least with movss + extractps stores from the low XMM where we can do that directly.

AMD before Ryzen 2 has only 2 AGUs, so only 2 memory ops per clock, up to one of which can be a store. It's definitely worth considering extracting the high 128-bit half of a YMM and using movss then shuffles like vextractps: 2 uops on Ryzen or AMD.

If the stride is small enough (so more than 1 element fits in a vector), we should consider shuffle + vmaskmovps masked stores, or with AVX512 then AVX512 masked stores.

But for larger strides, AVX512 scatter may get better in the future. It's currently (SKX) 43 uops for VSCATTERDPS or ...DD ZMM, so not very friendly to surrounding code. It sustains one per 17 clock throughput, slightly worse than 1 element stored per clock cycle. Same throughput on KNL, but only 4 uops so it can overlap much better with surrounding code.

For qword elements, we have efficient stores of the high or low half of an XMM. A MOVHPS store doesn't need a shuffle uop on most Intel CPUs. So we only need 1 (YMM) or 3 (ZMM) shuffles to get each of the high 128-bit lanes down to an XMM register.

Unfortunately on Ryzen, MOVHPS [mem], xmm costs a shuffle+store. But Ryzen has shuffle EUs on multiple ports.
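To make the two strategies discussed above concrete, here is a small sketch in GNU C. It assumes the GCC/Clang `vector_size` extension and uses a 16-byte vector for portability (the comment itself is about 256/512-bit registers); the names `v4si`, `extract_one`, `strided_store` and `demo_vec` are made up for illustration. Which instructions a compiler actually emits for either function depends on the target and tuning, so treat this as a source-level illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GNU C native vector of four ints (a GCC/Clang extension). */
typedef int v4si __attribute__ ((vector_size (16)));

/* Single-element extract via native vector indexing; the compiler
   decides whether to shuffle or to spill and reload. */
static int extract_one (v4si v, int i)
{
  return v[i];
}

/* Store/reload written out by hand: one full-width store to a
   temporary, then scalar loads feeding the strided stores. */
static void strided_store (int *dst, size_t stride, v4si v)
{
  int tmp[4];
  memcpy (tmp, &v, sizeof tmp);
  for (size_t i = 0; i < 4; i++)
    dst[i * stride] = tmp[i];
}

/* Small self-check of both helpers. */
static void demo_vec (void)
{
  v4si v = { 10, 20, 30, 40 };
  int dst[8] = { 0 };

  assert (extract_one (v, 2) == 30);

  strided_store (dst, 2, v);
  assert (dst[0] == 10 && dst[2] == 20 && dst[4] == 30 && dst[6] == 40);
  assert (dst[1] == 0 && dst[7] == 0);
}
```

The single store in `strided_store` is the cost that amortizes over more elements as the vector gets wider, which is the trade-off the comment describes.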
[Bug c++/91118] New: ubsan does not work with openmp default (none) directive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91118

            Bug ID: 91118
           Summary: ubsan does not work with openmp default (none) directive
           Product: gcc
           Version: 9.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: alan.avbs at rocketmail dot com
  Target Milestone: ---

Program fails to compile when using -fsanitize=undefined and using the directive default(none) in a parallel region. For instance:

#include <iostream>
int main()
{
#pragma omp parallel default(none) shared(std::cerr)
  {
    std::cerr<<"hello"
[Bug target/91117] New: _mm_movpi64_epi64/_mm_movepi64_pi64 generating store+load instead of using MOVQ2DQ/MOVDQ2Q
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91117

            Bug ID: 91117
           Summary: _mm_movpi64_epi64/_mm_movepi64_pi64 generating store+load instead of using MOVQ2DQ/MOVDQ2Q
           Product: gcc
           Version: 9.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wolfwings+gcc at gmail dot com
  Target Milestone: ---

_mm_movpi64_epi64 never uses MOVQ2DQ (and _mm_movepi64_pi64 never uses MOVDQ2Q), even though the documentation says it should in mixed MMX -> SSE situations and these are in fact the intrinsics to use when the Q2DQ/DQ2Q opcodes are wanted.

This appears to be due to the header defining them in a way that falls back to a memory write then read, except in (technically invalid) SSE -> SSE cases where a MOVD is used.

Tested on GCC 7.4 + 9.1 locally, with additional testing on Godbolt all showing identical code being generated all the way back to the 4.x series.

Compiled with -O1:

#include <immintrin.h>

__m128i test( __m128i input ) {
    __m64 x = _mm_movepi64_pi64( input );
    return _mm_movpi64_epi64( _mm_mullo_pi16( x, x ) );
}

Generated assembly on GCC 9.1:

    movq    %xmm0, -16(%rsp)
    movq    -16(%rsp), %mm0
    movq    %mm0, %mm1
    pmullw  %mm0, %mm1
    movq    %mm1, -16(%rsp)
    movq    -16(%rsp), %xmm0
    ret

A version that makes explicit calls to movq2dq/movdq2q works and outputs the expected assembly sequence:

#include <immintrin.h>

static inline __m64 _my_movepi64_pi64( __m128i input ) {
    __m64 result;
    asm( "movdq2q %1, %0" : "=y" (result) : "x" (input) : );
    return result;
}

static inline __m128i _my_movpi64_epi64( __m64 input ) {
    __m128i result;
    asm( "movq2dq %1, %0" : "=x" (result) : "y" (input) : );
    return result;
}

__m128i test( __m128i input ) {
    __m64 x = _my_movepi64_pi64( input );
    return _my_movpi64_epi64( _mm_mullo_pi16( x, x ) );
}

Generated assembly on GCC 7.4, 9.1, and others via Godbolt, again with -O1 (-O2 and -O3 make no difference):

    movdq2q %xmm0, %mm0
    pmullw  %mm0, %mm0
    movq2dq %mm0, %xmm0
    ret

For completeness, ICC generates the 'short' code form on all available versions without needing the inline assembly workaround.
[Bug c++/91110] [10 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in cp_omp_mappable_type_1, at cp/decl2.c:1421
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91110

--- Comment #3 from Jakub Jelinek ---
Author: jakub
Date: Mon Jul 8 22:08:27 2019
New Revision: 273248

URL: https://gcc.gnu.org/viewcvs?rev=273248&root=gcc&view=rev
Log:
	PR c++/91110
	* decl2.c (cp_omp_mappable_type_1): Don't emit any note for
	error_mark_node type.

	* g++.dg/gomp/pr91110.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/gomp/pr91110.C
Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/decl2.c
    trunk/gcc/testsuite/ChangeLog
[Bug c++/61339] add mismatch between struct and class [-Wmismatched-tags] to non-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61339 Martin Sebor changed: What|Removed |Added Keywords||patch --- Comment #11 from Martin Sebor --- Patch: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00621.html
[Bug target/91116] New: bad register choices for rs6000 -m32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91116

            Bug ID: 91116
           Summary: bad register choices for rs6000 -m32
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: segher at gcc dot gnu.org
  Target Milestone: ---

In the new testcase pr88233.c, which is

typedef struct { double a[2]; } A;
A
foo (const A *a)
{
  return *a;
}

we currently get as generated code for -m32

        addi 10,4,4
        lfiwzx 10,0,4
        addi 9,3,12
        lfiwzx 11,0,10
        addi 10,4,8
        lfiwzx 12,0,10
        addi 10,4,12
        stfiwx 10,0,3
        lfiwzx 0,0,10
        addi 10,3,4
        stfiwx 11,0,10
        addi 10,3,8
        stfiwx 12,0,10
        stfiwx 0,0,9
        blr

Expand decides to do this as four SImode copies, which isn't such a great idea, of course; but RA thinks it is cost 0 to put a SImode in an FP or altivec register. That won't fly.
[Bug rtl-optimization/88233] combine fails to merge insns leaving unneeded reg copies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88233

--- Comment #4 from Segher Boessenkool ---
Author: segher
Date: Mon Jul 8 20:38:46 2019
New Revision: 273245

URL: https://gcc.gnu.org/viewcvs?rev=273245&root=gcc&view=rev
Log:
rs6000: Add testcase for PR88233

This testcase tests that with -mcpu=power8 we do not generate any mtvsr* instructions, and we do the copy with {l,st}xvd2x.

gcc/testsuite/
	PR rtl-optimization/88233
	* gcc.target/powerpc/pr88233.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.target/powerpc/pr88233.c
Modified:
    trunk/gcc/testsuite/ChangeLog
[Bug c++/91073] [9/10 Regression] if constexpr no longer works directly with Concepts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91073 --- Comment #2 from Paolo Carlini --- In principle the issue is rather simple. The cp_parser_maybe_commit_to_declaration at the beginning of cp_parser_condition since r260482 thinks erroneously that the just parsed HasInit must be a declaration. In practice, I'm still not sure which is the best way to solve this... well, I'm not even sure we are supposed to actively work now on relatively minor concept-related issues.
[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109 --- Comment #2 from Christophe Lyon --- Removing the test*() calls from the end, the first failing one is testX(). However, if I remove all the preceding ones, the test passes. Using -fwhole-program instead of -flto has no effect: the test still fails. Adding a printf() call in check() also makes the test pass.
[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577

--- Comment #64 from dave.anglin at bell dot net ---
On 2019-07-08 2:51 p.m., elowe at elowe dot com wrote:
> I made a very simple change:
>
> --- ia64.c.orig 2019-07-08 14:43:33 +
> +++ ia64.c 2019-07-05 16:46:24 +
> @@ -1137,7 +1137,7 @@
>      emit_insn (gen_load_fptr (dest, src));
>    else if (sdata_symbolic_operand (src, VOIDmode))
>      emit_insn (gen_load_gprel (dest, src));
> -  else if (local_symbolic_operand64 (src, VOIDmode))
> +  else if (local_symbolic_operand64 (src, VOIDmode) && !TARGET_HPUX)
>      {
>        /* We want to use @gprel rather than @ltoff relocations for local
>          symbols:
>
> Which I think has the same effect as disabling it in predicate. I'm happy with
> either approach.

Okay, I assume we are now at the problem in comment #58. Would you upload the final RTL dump for "IsLower.c" (the "-da" option will generate it)? It would also be useful to find the change which introduced the regression for "IsLower.c".

You could post the above patch with a ChangeLog to gcc-patches. It's small enough that an FSF assignment shouldn't be needed.
[Bug sanitizer/91115] stack-buffer-overflow on memset local variable when creating thread on ARM Linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91115 --- Comment #1 from Fred Hsueh --- Created attachment 46580 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46580=edit Fixup memory location of shadow This shadow location works better than the 32-bit default.
[Bug sanitizer/91115] New: stack-buffer-overflow on memset local variable when creating thread on ARM Linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91115 Bug ID: 91115 Summary: stack-buffer-overflow on memset local variable when creating thread on ARM Linux Product: gcc Version: 8.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: sanitizer Assignee: unassigned at gcc dot gnu.org Reporter: fhsueh at roku dot com CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at gcc dot gnu.org Target Milestone: --- I'm getting a ASAN stack-buffer-overflow when thread is starting on ARM Linux. gcc-8.3 and glibc-2.22. Here's the output, cleaned up a bit: > ==1541==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x9bffebf8 at pc 0xa3585e98 bp 0x9bffebc4 sp 0x9bffe790 WRITE of size 36 at 0x9bffebf8 thread T10 #0 0xa3585e97 in __interceptor_memset gcc-8.3.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:709 #1 0x9f6d378b in __pthread_attr_init_2_1 glibc-2.22/nptl/pthread_attr_init.c:41 #2 0xa3619053 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) gcc-8.3.0/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc:105 #3 0xa361940b in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) gcc-8.3.0/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc:415 #4 0xa360f147 in __asan::AsanThread::SetThreadStackAndTls(__asan::AsanThread::InitOptions const*) gcc-8.3.0/libsanitizer/asan/asan_thread.cc:287 #5 0xa360f237 in __asan::AsanThread::Init(__asan::AsanThread::InitOptions const*) gcc-8.3.0/libsanitizer/asan/asan_thread.cc:224 #6 0xa360f367 in __asan::AsanThread::ThreadStart(unsigned long, __sanitizer::atomic_uintptr_t*) gcc-8.3.0/libsanitizer/asan/asan_thread.cc:241 #7 0x9f6d1d63 in start_thread glibc-2.22/nptl/pthread_create.c:336 Address 0x9bffebf8 is located in stack of thread T9 at offset 664 in frame #0 0x25b6e3f in _M_run arm-roku-linux-gnueabi/include/c++/8.3.0/thread:196 This frame has 13 
object(s): [32, 36) 'bt' [96, 100) 'bt' [160, 168) '' [224, 232) '' [288, 296) '' [352, 360) '' [416, 424) '' [480, 488) '' [544, 552) 'lock' [608, 620) 'cd' [672, 684) 'cd' <== Memory access at offset 664 partially underflows this variable [736, 748) '' [800, 812) '' HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) Thread T9 created by T0 here: #0 0xa35cdc1f in __interceptor_pthread_create gcc-8.3.0/libsanitizer/asan/asan_interceptors.cc:202 #1 0x9f83d543 in std::thread::_M_start_thread(std::unique_ptr >, void (*)()) (/usr/lib/libstdc++.so.6+0x9c543) SUMMARY: AddressSanitizer: stack-buffer-overflow gcc-8.3.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:709 in __interceptor_memset Shadow bytes around the buggy address: 0x437ffd20: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 0x437ffd30: 04 f2 f2 f2 f2 f2 f2 f2 04 f2 f2 f2 f2 f2 f2 f2 0x437ffd40: 00 f2 f2 f2 f2 f2 f2 f2 00 f2 f2 f2 f2 f2 f2 f2 0x437ffd50: 00 f2 f2 f2 f2 f2 f2 f2 00 f2 f2 f2 f2 f2 f2 f2 0x437ffd60: 00 f2 f2 f2 f2 f2 f2 f2 00 f2 f2 f2 f2 f2 f2 f2 =>0x437ffd70: 00 f2 f2 f2 f2 f2 f2 f2 00 04 f2 f2 f2 f2 f2[f2] 0x437ffd80: 00 04 f2 f2 f2 f2 f2 f2 00 04 f2 f2 f2 f2 f2 f2 0x437ffd90: 00 04 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 0x437ffda0: 00 00 00[ 363.983356] grsec: bruteforce prevention initiated for the next 30 minutes or until service restarted, stalling each fork 30 seconds. 
Please investigate the crash report for /bin/Application[Application:1541] uid/euid:0/0 gid/egid:0/0, parent /bin/Application[Application:1480] uid/euid:501/501 gid/egid:501/501 00 00 00 00 00 00 00 00 00 00 00 00 00 0x437ffdb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x437ffdc0: 00 00 00 00 364.068455] ltcore_dump: starting dump m 00 00 00 00 00 00 grsec: From 10.14.24.38: denied resource overstep by requesting 52 for RLIMIT_CORE against limit 0 for /bin/Application[Application:1542] uid/euid:501/501 gid/egid:501/501, parent /bin/busybox[sh:1394] uid/euid:0/0 gid/egid:0/0 [1m 00 grsec: From 10.14.24.38: denied resource overstep by requesting 84 for RLIMIT_CORE against limit 0 for /bin/Application[Application:1542] uid/euid:501/501 gid/egid:501/501, parent /bin/busybox[sh:1394] uid/euid:0/0 gid/egid:0/0 [0m00 00[ 364.128767] grsec: From 10.14.24.38: denied resource overstep by requesting 116 for RLIMIT_CORE against limit 0 for /bin/Application[Application:1542] uid/euid:501/501 gid/egid:501/501, parent /bin/busybox[sh:1394] uid/euid:0/0 gid/egid:0/0 00 00 364.152847] grsec: From 10.14.24.38: denied resource overstep by requesting 148 for RLIMIT_CORE against limit 0 for
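As a reading aid for the shadow-byte dump in this report, here is a small sketch of the address-to-shadow mapping AddressSanitizer documents, `shadow = (addr >> 3) + offset`. The offset value `0x30000000` is not stated anywhere in the report; it is an assumption inferred from the numbers above (the faulting address and the bracketed shadow byte), and the names `shadow_addr` and `demo_shadow` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* One shadow byte describes 8 bytes of application memory. */
static uint32_t shadow_addr (uint32_t addr, uint32_t offset)
{
  return (addr >> 3) + offset;
}

static void demo_shadow (void)
{
  /* ASSUMPTION: offset inferred from this report, not taken from
     the libsanitizer sources or the reporter's configuration. */
  const uint32_t offset = 0x30000000u;

  /* Faulting address from the report: 0x9bffebf8.  The bracketed
     byte [f2] is the 16th byte of the row starting at 0x437ffd70,
     i.e. shadow address 0x437ffd7f. */
  assert (shadow_addr (0x9bffebf8u, offset) == 0x437ffd7fu);
}
```

With that offset the faulting address lands exactly on the bracketed `f2` byte, which is consistent with the report's note that the access "partially underflows" the variable 'cd' at [672, 684).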
[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577

--- Comment #63 from EML ---
Sorry, I didn't undo the patch completely.

I made a very simple change:

--- ia64.c.orig 2019-07-08 14:43:33 +
+++ ia64.c 2019-07-05 16:46:24 +
@@ -1137,7 +1137,7 @@
     emit_insn (gen_load_fptr (dest, src));
   else if (sdata_symbolic_operand (src, VOIDmode))
     emit_insn (gen_load_gprel (dest, src));
-  else if (local_symbolic_operand64 (src, VOIDmode))
+  else if (local_symbolic_operand64 (src, VOIDmode) && !TARGET_HPUX)
     {
       /* We want to use @gprel rather than @ltoff relocations for local
         symbols:

Which I think has the same effect as disabling it in predicate. I'm happy with either approach.
[Bug tree-optimization/91114] New: [10 Regression] ICE in vect_analyze_loop, at tree-vect-loop.c:2415
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91114

            Bug ID: 91114
           Summary: [10 Regression] ICE in vect_analyze_loop, at tree-vect-loop.c:2415
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Keywords: ice-checking, ice-on-valid-code, openmp
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: asolokha at gmx dot com
  Target Milestone: ---
            Target: x86_64-unknown-linux-gnu

gcc-10.0.0-alpha20190707 snapshot (r273184) ICEs when compiling the following testcase w/ -O1 -fopenmp-simd:

void
ne (double *zu)
{
  int h3;

#pragma omp simd simdlen (4)
  for (h3 = 0; h3 < 4; ++h3)
    zu[h3] = 0;
}

% x86_64-unknown-linux-gnu-gcc-10.0.0-alpha20190707 -O1 -fopenmp-simd -c hnkztevu.c
during GIMPLE pass: vect
hnkztevu.c: In function 'ne':
hnkztevu.c:2:1: internal compiler error: in vect_analyze_loop, at tree-vect-loop.c:2415
    2 | ne (double *zu)
      | ^~
0x6fe6b2 vect_analyze_loop(loop*, _loop_vec_info*, vec_info_shared*)
	/var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/tree-vect-loop.c:2415
0xfc5495 try_vectorize_loop_1
	/var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/tree-vectorizer.c:886
0xfc613f vectorize_loops()
	/var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/tree-vectorizer.c:1114
[Bug tree-optimization/91010] ICE: Segmentation fault (in location_wrapper_p)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91010 --- Comment #4 from Arseny Solokha --- Can this PR be closed now?
[Bug rtl-optimization/88233] combine fails to merge insns leaving unneeded reg copies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88233 --- Comment #3 from Segher Boessenkool --- Author: segher Date: Mon Jul 8 17:35:12 2019 New Revision: 273240 URL: https://gcc.gnu.org/viewcvs?rev=273240=gcc=rev Log: subreg: Add -fsplit-wide-types-early (PR88233) Currently the second lower-subreg pass is run right before RA. This is much too late to be very useful. At least for targets that do not have RTL patterns for operations on multi-register modes it is a lot better to split patterns earlier, before combine and all related passes. This adds an option -fsplit-wide-types-early that does that, and enables it by default for rs6000. PR rtl-optimization/88233 * common.opt (fsplit-wide-types-early): New option. * common/config/rs6000/rs6000-common.c (rs6000_option_optimization_table): Add OPT_fsplit_wide_types_early for OPT_LEVELS_ALL. * doc/invoke.texi (Optimization Options): Add -fsplit-wide-types-early. * lower-subreg.c (pass_lower_subreg2::gate): Add test for flag_split_wide_types_early. (pass_data_lower_subreg3): New. (pass_lower_subreg3): New. (make_pass_lower_subreg3): New. * passes.def (pass_lower_subreg2): Move after the loop passes. (pass_lower_subreg3): New, inserted where pass_lower_subreg2 was. * tree-pass.h (make_pass_lower_subreg2): Move up, to its new place in the pass pipeline; its previous place is taken by ... (make_pass_lower_subreg3): ... this. Modified: trunk/gcc/ChangeLog trunk/gcc/common.opt trunk/gcc/common/config/rs6000/rs6000-common.c trunk/gcc/doc/invoke.texi trunk/gcc/lower-subreg.c trunk/gcc/passes.def trunk/gcc/tree-pass.h
[Bug testsuite/78529] gcc.c-torture/execute/builtins/strcat-chk.c failed with lto/O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78529 Wilco changed: What|Removed |Added Status|NEW |RESOLVED CC||wilco at gcc dot gnu.org Resolution|--- |FIXED Assignee|unassigned at gcc dot gnu.org |wilco at gcc dot gnu.org --- Comment #40 from Wilco --- Fixed
[Bug testsuite/78529] gcc.c-torture/execute/builtins/strcat-chk.c failed with lto/O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78529 --- Comment #39 from Wilco --- Author: wilco Date: Mon Jul 8 17:02:35 2019 New Revision: 273238 URL: https://gcc.gnu.org/viewcvs?rev=273238=gcc=rev Log: Turn of ipa-ra in builtins test (PR91059) The gcc.c-torture/execute/builtins/lib directory contains a reimplementation of many C library string functions, which causes non-trivial register allocation bugs with LTO and static linked libraries. To fix this long-standing test issue, turn off ipa-ra which avoids the register corruption across calls. All builtin torture tests now pass on aarch64-none-elf. Committed as obvious. testsuite/ PR testsuite/91059 PR testsuite/78529 * gcc.c-torture/execute/builtins/builtins.exp: Add -fno-ipa-ra. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp
[Bug c/91092] Error on implicit function declarations by default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91092 Segher Boessenkool changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-07-08 Ever confirmed|0 |1 --- Comment #12 from Segher Boessenkool --- Given the above, I don't think it can ever be ready in time for GCC 10. But, confirmed.
[Bug testsuite/91059] [10 regression] gcc.c-torture/execute/builtins/snprintf-chk.c fails on aarch64-elf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91059 --- Comment #5 from Wilco --- Author: wilco Date: Mon Jul 8 17:02:35 2019 New Revision: 273238 URL: https://gcc.gnu.org/viewcvs?rev=273238=gcc=rev Log: Turn of ipa-ra in builtins test (PR91059) The gcc.c-torture/execute/builtins/lib directory contains a reimplementation of many C library string functions, which causes non-trivial register allocation bugs with LTO and static linked libraries. To fix this long-standing test issue, turn off ipa-ra which avoids the register corruption across calls. All builtin torture tests now pass on aarch64-none-elf. Committed as obvious. testsuite/ PR testsuite/91059 PR testsuite/78529 * gcc.c-torture/execute/builtins/builtins.exp: Add -fno-ipa-ra. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp
[Bug testsuite/91059] [10 regression] gcc.c-torture/execute/builtins/snprintf-chk.c fails on aarch64-elf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91059 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #4 from Richard Biener --- duplicate then *** This bug has been marked as a duplicate of bug 78529 ***
[Bug testsuite/78529] gcc.c-torture/execute/builtins/strcat-chk.c failed with lto/O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78529 Richard Biener changed: What|Removed |Added CC||clyon at gcc dot gnu.org --- Comment #38 from Richard Biener --- *** Bug 91059 has been marked as a duplicate of this bug. ***
[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577 --- Comment #62 from dave.anglin at bell dot net --- On 2019-07-08 12:22 p.m., elowe at elowe dot com wrote: > When I remove that gprel patch - the 64bit stage 1 compiler is able to compile > hello world, islower, as well as all the other "conftest" programs > successfully. It can compile libstdc++ as well (some duplicate symbols > however). I doubt removing the gprel patch is an acceptable solution as it fixed a bug on Linux. A better solution is to disable the local_symbolic_operand64 predicate on hpux. That should fix hello world. Then, we can move to other issues.
[Bug testsuite/91059] [10 regression] gcc.c-torture/execute/builtins/snprintf-chk.c fails on aarch64-elf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91059 --- Comment #3 from Wilco --- Confirmed it's the same memset register corruption issue. The fix is trivial: add -fno-ipa-ra.
[Bug c/91092] Error on implicit function declarations by default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91092 Rich Felker changed: What|Removed |Added CC||bugdal at aerifal dot cx --- Comment #11 from Rich Felker --- I'm strongly in favor of fixing this, but the configure situation is a mess. Doing this needs both an active project to fix configure scripts (starting with upstream autoconf/gnulib ones, and at least in the past, the maintainers' misguided opinions that testing for symbol presence with missing or invalid declarations was a valid configure test! see: https://ewontfix.com/13/) and probably one of the workarounds described above (detecting configure use and not erroring out in that case?).
[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577 --- Comment #61 from EML --- Sorry, perhaps I have confused the situation. I have already patched my compiler to remove the gprel in both 32 and 64. That gprel patch breaks things in both 32 and 64. I'm reasonably convinced the patch is wrong for HP-UX, so I'm moving forward with that assumption. When I remove that gprel patch - the 64bit stage 1 compiler is able to compile hello world, islower, as well as all the other "conftest" programs successfully. It can compile libstdc++ as well (some duplicate symbols however). However, the 32-bit compiler does not work which I believe to be a pointer swizzle issue. I've confirmed the binary is 32bit as follows: -bash-5.0$ file islower islower:ELF-32 executable object file - IA64 -bash-5.0$ elfdump -f islower islower: *** ELF Header *** Class: ELF-32 Data:Big-endian OS: HP-UX ABI Version: 1 Type:EXEC Machine: IPF Version: 1 Entry Addr: 0x40008b0 Program Hdr Offset: 0x34 Section Hdr Offset: 0x1104c Flags: trapnil Flags: big-endian PSR Flags: IA-64 Elf Hdr Size:0x34 Program Hdr Size:0x20 Program Hdr Number: 12 Section Hdr Size:0x28 Section Hdr Number: 43 Section Hdr String Idx: 42
[Bug c/89072] -Wall -Werror should be defaults
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89072 Rich Felker changed: What|Removed |Added CC||bugdal at aerifal dot cx --- Comment #2 from Rich Felker --- Just here to second that -Werror should never be the default and that it's pretty much entirely wrong. -Werror is useful for imposing development policy in a development environment you control. It's not at all okay for shipping source that the user will compile in an environment you don't control.
[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577 --- Comment #60 from dave.anglin at bell dot net --- On 2019-07-08 12:07 a.m., elowe at elowe dot com wrote: > If you insert the addp4 r14 = 0,r14 before that command (like gcc 4.9.3 does), > the program compiles and runs correctly It would be useful to do a regression search to determine the revision that changed the above behavior. > > I'll upload the .s for "IsLower.c" - it's definitely a 32 bit executable, so > the correct options are being passed around. I'm not sure why you say the .s for "IsLower.c" is a 32-bit executable and that the correct options are being passed around. You haven't shown the assembler or linker commands used to create the executable. For applications like the hello world program, there is very little difference between the 32 and 64-bit assembler output generated by gcc (cc1). I'm still trying to understand the problem with the gprel relocation. It seems to work in 64-bit but not in 32-bit. While there might be issues with assembler or linker, you are probably correct that we need to swizzle pointer with ILP32. You could try adding something like the following to this hunk after the emit_insn() line: else if (local_symbolic_operand64 (src, VOIDmode)) { /* We want to use @gprel rather than @ltoff relocations for local symbols: - @gprel does not require dynamic linker - and does not use .sdata section https://gcc.gnu.org/bugzilla/60465 */ emit_insn (gen_load_gprel64 (dest, src)); } if (TARGET_ILP32) { rtx tmp; tmp = gen_rtx_REG_offset (dest, ptr_mode, REGNO (dest), byte_lowpart_offset (ptr_mode, GET_MODE (dest))); REG_POINTER (tmp) = 1; emit_insn (gen_ptr_extend (dest, tmp)); } Alternatively, you could try disabling the local_symbolic_operand64 predicate in predicates.md: (define_predicate "local_symbolic_operand64" (match_code "symbol_ref,const") { switch (GET_CODE (op)) { Just add if (TARGET_ILP32) return false; before switch statement.
[Bug c/91113] New: add declare_simd_variant attribute support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91113 Bug ID: 91113 Summary: add declare_simd_variant attribute support Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: nsz at gcc dot gnu.org Target Milestone: --- to declare vector functions on aarch64 for one simd architecture only, support for the openmp 5.0 declare variant syntax is required, but full support for the omp declare variant pragma is excessive. (for the aarch64 use-case, see user defined vector functions in https://developer.arm.com/docs/101129/latest ) I suggest introducing an attribute in gcc that can handle a subset of omp declare variant pragma and works in c and fortran declarations for declare simd functions. I think the syntax and semantics for the attribute should follow the proposal for clang (without the clang_ prefix): http://lists.llvm.org/pipermail/llvm-dev/2019-June/132987.html ``` declare_simd_variant (, {, }) := The name of a function variant that is a base language identifier, or, for C++, a template-id. := , {, } := simdlen() | simdlen("scalable") := inbranch | notinbranch := | | | {,} := linear_ref(,) | linear_var(, ) | linear_uval(, ) | linear(, ) := | := uniform() := align(, ) := Name of a parameter in the scalar function declaration/definition := ... | -2 | -1 | 1 | 2 | ... := 1 | 2 | 3 | ... := {}{,} {} := isa(target-specific-value) := arch(target-specific-value) ``` example usage: ``` __attribute__(declare_simd_variant("vfoo", simdlen(2), notinbranch, isa("simd")) double foo(double x); float64x2_t vfoo(float64x2_t vx); ``` should be equivalent to the openmp 5.0 code ``` #pragma omp declare variant(vfoo) \ match(construct={simd(simdlen(2), notinbranch)}, device={isa("simd")}) double foo(double x); float64x2_t vfoo(float64x2_t vx); ```
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 --- Comment #16 from Jakub Jelinek ---
(In reply to Jakub Jelinek from comment #15)
> Seems systemd abuses compound literals even in cases where they make no
> sense, perhaps one of those in a short function like that is no longer
> optimized away completely and that is why it triggers all the
> __asan_malloc_0 calls in there where formerly it got away without that.
> E.g.
> #define assert_cc(expr) \
> struct CONCATENATE(_assert_struct_, __COUNTER__) { \
> char x[(expr) ? 0 : -1];\
> };
> doesn't make any sense to me, why not say
> do { extern char CONCATENATE(_assert_var_, __COUNTER__) [(expr) ? 0 : -1]; }
> while (0)
> instead?
> The IN_SET macro has another compound literal:
> assert_cc((sizeof((long double[]){__VA_ARGS__})/sizeof(long double)) <= 20);
> It would surprise me if you can't do such counting without resorting to
> compound literals.

As IN_SET is turning the __VA_ARGS__ arguments into case N:, those have to be constant expressions, so you could, say, replace IN_SET's
  assert_cc((sizeof((long double[]){__VA_ARGS__})/sizeof(long double)) <= 20);
with
  static long double __assert_in_set __attribute__((__unused__)) [] = { __VA_ARGS__ };
  assert_cc(sizeof (__assert_in_set)/sizeof(long double) <= 20);
or similar. This is in its own scope, so it doesn't need to use any __COUNTER__ etc. With -O1 and above it would surely be optimized away, and with -O0 it would be much less costly for asan.
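The replacement suggested in this comment can be tried in isolation. The following is only a minimal sketch and not systemd's actual IN_SET: the names CHECK_ARG_COUNT and in_small_set are hypothetical, and C11 static_assert stands in for assert_cc. It shows that the argument count can be checked at compile time with a block-scope static array, so no compound literal ends up on the stack.

```c
#include <assert.h>   /* static_assert (C11) and assert */
#include <stddef.h>

/* Hypothetical macro: checks at compile time that at most 20 constant
   arguments were passed.  The static array lives in its own block
   scope, so no __COUNTER__-style unique name is needed, and nothing
   is allocated in the stack frame.  */
#define CHECK_ARG_COUNT(...)                                        \
  do {                                                              \
    static const long double args_[] = { __VA_ARGS__ };            \
    static_assert (sizeof (args_) / sizeof (args_[0]) <= 20,        \
                   "too many arguments");                           \
    (void) args_; /* silence unused-variable warnings */            \
  } while (0)

/* Hypothetical helper: returns 1 if x is one of three fixed values,
   mimicking how IN_SET expands its arguments into case labels.  */
static int
in_small_set (int x)
{
  CHECK_ARG_COUNT (1, 5, 9);
  switch (x)
    {
    case 1:
    case 5:
    case 9:
      return 1;
    default:
      return 0;
    }
}
```

Calling in_small_set (5) takes the matching case label, while in_small_set (2) falls through to the default; passing more than 20 values to CHECK_ARG_COUNT would fail to compile.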
[Bug target/91059] [10 regression] gcc.c-torture/execute/builtins/snprintf-chk.c fails on aarch64-elf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91059 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #2 from Wilco --- (In reply to Richard Biener from comment #1) > Likely target issue - please aarch64 folks investigate first. I'll have a look, but I bet it's PR78529 again since failures only happen with LTO and static linking with newlib.
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 --- Comment #15 from Jakub Jelinek ---
Seems systemd abuses compound literals even in cases where they make no sense, perhaps one of those in a short function like that is no longer optimized away completely and that is why it triggers all the __asan_malloc_0 calls in there where formerly it got away without that.
E.g.
  #define assert_cc(expr) \
          struct CONCATENATE(_assert_struct_, __COUNTER__) { \
                  char x[(expr) ? 0 : -1];\
          };
doesn't make any sense to me, why not say
  do { extern char CONCATENATE(_assert_var_, __COUNTER__) [(expr) ? 0 : -1]; } while (0)
instead?
The IN_SET macro has another compound literal:
  assert_cc((sizeof((long double[]){__VA_ARGS__})/sizeof(long double)) <= 20);
It would surprise me if you can't do such counting without resorting to compound literals.
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 Martin Liška changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |INVALID --- Comment #14 from Martin Liška --- Ahh, I've got it. systemd is built in a configuration without any optimization level! Please use -O2; that should speed it up significantly.
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 --- Comment #13 from Martin Liška --- And the stack difference is: Before: ;; Function categorize_eol (categorize_eol, funcdef_no=127, decl_uid=8513, cgraph_uid=127, symbol_order=127) categorize_eol (char c, ReadLineFlags flags) { _Bool _found; EndOfLineMarker D.9001; _Bool D.8520; _Bool _1; EndOfLineMarker _3; _Bool _7; EndOfLineMarker _9; EndOfLineMarker _10; EndOfLineMarker _11; EndOfLineMarker _12; : _found_4 = 0; if (flags_5(D) == 1) goto ; [INV] else goto ; [INV] ... After: ;; Function categorize_eol (categorize_eol, funcdef_no=127, decl_uid=8513, cgraph_uid=127, symbol_order=127) categorize_eol (char c, ReadLineFlags flags) { long double D.8516[1] = {1.0e+0}; <--- This stack variable. _Bool _found; EndOfLineMarker D.9001; _Bool D.8520; _Bool _1; EndOfLineMarker _3; _Bool _7; EndOfLineMarker _9; EndOfLineMarker _10; EndOfLineMarker _11; EndOfLineMarker _12; ...
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 --- Comment #12 from Martin Liška ---
So the suspected allocation that happens is:

#0 0x7723abb2 in __asan::FakeStack::Allocate (real_stack=140737488344072, class_id=0, stack_size_log=20, this=0x725f7000) at ../../../../libsanitizer/asan/asan_fake_stack.cc:103
#1 __asan::OnMalloc (size=, class_id=0) at ../../../../libsanitizer/asan/asan_fake_stack.cc:208
#2 __asan_stack_malloc_0 (size=) at ../../../../libsanitizer/asan/asan_fake_stack.cc:234
#3 0x76b5e6c5 in categorize_eol (c=120 'x', flags=(unknown: 0)) at ../src/basic/fileio.c:759
#4 0x76b5eb01 in read_line_full (f=0x61603f80, limit=1048576, flags=(unknown: 0), ret=0x7290b760) at ../src/basic/fileio.c:833
#5 0x76a25be6 in read_line (f=0x61603f80, limit=1048576, ret=0x7290b760) at ../src/basic/fileio.h:90
#6 0x76a2818f in config_parse (unit=0x0, filename=0x7290b320 "/tmp/test-conf-parser.gVYMCp", f=0x61603f80, sections=0x60b300 "Section", lookup=0x4020a0 , table=0x7290b2a0, flags=CONFIG_PARSE_WARN, userdata=0x0) at ../src/shared/conf-parser.c:309
#7 0x00404967 in test_config_parse (i=15, s=0x409d20 "[Section]\nsetting1=", 'x' ...) at ../src/test/test-conf-parser.c:334
#8 0x00404ef0 in main (argc=1, argv=0x7fffdc58) at ../src/test/test-conf-parser.c:392
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 --- Comment #11 from Martin Liška ---
If I apply the following patch:

diff --git a/libsanitizer/asan/asan_fake_stack.cc b/libsanitizer/asan/asan_fake_stack.cc
index 3140f9a2aeb..2034769161e 100644
--- a/libsanitizer/asan/asan_fake_stack.cc
+++ b/libsanitizer/asan/asan_fake_stack.cc
@@ -198,6 +198,9 @@ static FakeStack *GetFakeStackFast() {
 }
 
 ALWAYS_INLINE uptr OnMalloc(uptr class_id, uptr size) {
+  VReport(1, "T%d: OnMalloc called for size: %d\n",
+          GetCurrentTidOrInvalid(), size);
+
   FakeStack *fs = GetFakeStackFast();
   if (!fs) return 0;
   uptr local_stack;

I see a rapid change of calls of the function from 15381->2127789, where the change is an allocation 64B:

OnMalloc called for size: 64
[Bug c++/91112] [8 Regression] Bad error message for virtual function of a template class. Wrong "required from here" line number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91112 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2019-07-08 Ever confirmed|0 |1 --- Comment #1 from Jonathan Wakely --- Please provide the code, not URLs (as https://gcc.gnu.org/bugs/ requests).
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 --- Comment #10 from Martin Liška --- The issue is that __asan_stack_malloc_0 function is very high in perf profile: # Overhead Command Shared Object Symbol # ... # 91.97% test-conf-parse libasan.so.5.0.0 [.] __asan_stack_malloc_0 7.00% test-conf-parse libsystemd-shared-242.so [.] config_parse 0.12% test-conf-parse libsystemd-shared-242.so [.] read_line_full 0.11% test-conf-parse libc-2.29.so [.] _IO_getc 0.10% test-conf-parse libc-2.29.so [.] __strlen_avx2 0.08% test-conf-parse libsystemd-shared-242.so [.] safe_fgetc 0.07% test-conf-parse libsystemd-shared-242.so [.] categorize_eol perf annotate says: : Disassembly of section .text: : : 00034a00 <__asan_stack_malloc_0>: : __asan_stack_malloc_0(): : extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __asan_stack_free_##class_id( \ : uptr ptr, uptr size) { \ : OnFree(ptr, class_id, size); \ : } ... : _ZN6__asan9FakeStack8AllocateEmmm(): : GC(real_stack); 0.00 : 34b30: mov%rbp,%rsi 0.00 : 34b33: mov%rax,%rdi 0.00 : 34b36: callq 34800 <__asan::FakeStack::GC(unsigned long)> 0.00 : 34b3b: jmpq 34a2c <__asan_stack_malloc_0+0x2c> : for (int i = 0; i < num_iter; i++) { 0.00 : 34b40: xor%ecx,%ecx 0.12 : 34b42: add$0x1,%ecx 17.35 : 34b45: cmp%ecx,%r8d 17.95 : 34b48: je 34b70 <__asan_stack_malloc_0+0x170> : uptr pos = ModuloNumberOfFrames(stack_size_log, class_id, hint_position++); 0.00 : 34b4a: mov%rdx,%rax 0.01 : 34b4d: add$0x1,%rdx : _ZN6__asan9FakeStack20ModuloNumberOfFramesEmmm(): : return n & (NumberOfFrames(stack_size_log, class_id) - 1); 0.01 : 34b51: and%rsi,%rax : _ZN6__asan9FakeStack8AllocateEmmm(): 0.03 : 34b54: mov%rdx,(%rbx) : if (flags[pos]) continue; 0.10 : 34b57: lea0x1000(%rbx,%rax,1),%rdi 31.71 : 34b5f: cmpb $0x0,(%rdi) 32.65 : 34b62: je 34a72 <__asan_stack_malloc_0+0x72> 0.00 : 34b68: jmp34b42 <__asan_stack_malloc_0+0x142> 0.00 : 34b6a: nopw 0x0(%rax,%rax,1)
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 --- Comment #9 from Martin Liška --- Started with r259641.
[Bug target/90712] [10 regression] gcc.dg/rtl/aarch64/subs_adds_sp.c fails with ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90712 Wilco changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||wilco at gcc dot gnu.org Resolution|--- |FIXED --- Comment #2 from Wilco --- Fixed
[Bug libfortran/91030] Poor performance of I/O -fconvert=big-endian
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030 --- Comment #39 from Janne Blomqvist ---
Now, with the fixed benchmark in the previous comment, on a Lustre (version 2.5) system I get:

Test using 25000 bytes
Block size of file system: 4096
bs =     1024, 53.27 MiB/s
bs =     2048, 73.99 MiB/s
bs =     4096, 222.41 MiB/s
bs =     8192, 351.38 MiB/s
bs =    16384, 483.86 MiB/s
bs =    32768, 583.76 MiB/s
bs =    65536, 677.11 MiB/s
bs =   131072, 748.60 MiB/s
bs =   262144, 700.69 MiB/s
bs =   524288, 811.76 MiB/s
bs =  1048576, 1032.99 MiB/s
bs =  2097152, 1034.03 MiB/s
bs =  4194304, 1063.74 MiB/s
bs =  8388608, 1030.15 MiB/s
bs = 16777216, 1084.82 MiB/s
bs = 33554432, 1067.05 MiB/s
bs = 67108864, 1063.79 MiB/s

On the same system, on an NFS filesystem connected with Infiniband I get:

Test using 25000 bytes
Block size of file system: 1048576
bs =     1024, 301.41 MiB/s
bs =     2048, 351.51 MiB/s
bs =     4096, 471.39 MiB/s
bs =     8192, 444.61 MiB/s
bs =    16384, 510.88 MiB/s
bs =    32768, 527.99 MiB/s
bs =    65536, 516.57 MiB/s
bs =   131072, 481.38 MiB/s
bs =   262144, 514.29 MiB/s
bs =   524288, 462.06 MiB/s
bs =  1048576, 528.30 MiB/s
bs =  2097152, 526.76 MiB/s
bs =  4194304, 501.09 MiB/s
bs =  8388608, 493.61 MiB/s
bs = 16777216, 550.24 MiB/s
bs = 33554432, 532.20 MiB/s
bs = 67108864, 532.82 MiB/s

So for Lustre, a buffer size bigger than the current 8 kB at least seems justified. While Lustre sees improvements all the way to a 1 MB buffer size, such large buffers by default seem a bit excessive.
[Bug libfortran/91030] Poor performance of I/O -fconvert=big-endian
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030 --- Comment #38 from Janne Blomqvist ---
First, I think there's a bug in the benchmark in comment #c20. It writes blocksize * sizeof(double), but then advances only blocksize for each iteration of the loop. Fixed version writing just bytes below:

#include
#include
#include
#include
#include
#include
#include
#include

double
walltime (void)
{
  struct timeval TV;
  double elapsed;
  gettimeofday(&TV, NULL);
  elapsed = (double) TV.tv_sec + 1.0e-6*((double) TV.tv_usec);
  return elapsed;
}

#define NAME "out.dat"
#define N 25000

int main()
{
  int fd;
  unsigned char *p, *w;
  long i, size, blocksize, left, to_write;
  int bits;
  double t1, t2;
  struct statvfs buf;

  printf ("Test using %ld bytes\n", (long) N);
  statvfs (".", &buf);
  printf ("Block size of file system: %ld\n", buf.f_bsize);
  p = malloc(N * sizeof (*p));
  for (i=0; i [...] 0)
        {
          if (left >= blocksize)
            to_write = blocksize;
          else
            to_write = left;
          write (fd, w, blocksize);
          w += to_write;
          left -= to_write;
        }
      close (fd);
      t2 = walltime ();
      printf ("%.2f MiB/s\n", N / (t2-t1) / 1048576);
    }
  free (p);
  unlink (NAME);
  return 0;
}
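The inner write loop of the benchmark can be isolated as a small helper. The following is a sketch under assumptions, not the benchmark's actual code: the function name write_in_blocks is hypothetical, and unlike the listing in this comment it passes to_write rather than blocksize to write(), so the final partial block does not read past the end of the buffer, and it advances by the byte count write() actually reports.

```c
#include <assert.h>
#include <unistd.h>   /* write, ssize_t */

/* Hypothetical helper: write n bytes from p to fd in chunks of at
   most blocksize bytes.  Returns the number of bytes written, or -1
   if write() fails or makes no progress.  */
static long
write_in_blocks (int fd, const unsigned char *p, long n, long blocksize)
{
  const unsigned char *w = p;
  long left = n;

  assert (blocksize > 0);
  while (left > 0)
    {
      /* Never hand write() more than what remains in the buffer.  */
      long to_write = left >= blocksize ? blocksize : left;
      ssize_t done = write (fd, w, (size_t) to_write);
      if (done <= 0)
        return -1;
      /* Advance by what was actually written; write() may be short.  */
      w += done;
      left -= done;
    }
  return n - left;
}
```

With a 2500-byte buffer and a block size of 1024, the helper issues writes of 1024, 1024 and 452 bytes, assuming none of them is short.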
[Bug c++/91112] New: Bad error message for virtual function of a template class. Wrong "required from here" line number
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91112 Bug ID: 91112 Summary: Bad error message for virtual function of a template class. Wrong "required from here" line number Product: gcc Version: 8.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: ivan.kharpalev at gmail dot com Target Milestone: --- https://godbolt.org/z/orxMIj I expect to see number of the line that triggers instantiation. gcc-7 and clang show it. P.S. It even does not show bad method invocation line if it was called via base class pointer. https://godbolt.org/z/-SN5n5 It only shows call line for the class itself https://godbolt.org/z/kmaUQt
[Bug lto/90990] [10 Regression] ICE: error: ‘component_ref’ LHS in clobber statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90990 Richard Biener changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #10 from Richard Biener --- Honza installed a patch. 2019-07-02 Jan Hubicka * tree-inline.c (remap_gimple_stmt): Do not subtitute handled components to clobber of return value.
[Bug target/91059] [10 regression] gcc.c-torture/execute/builtins/snprintf-chk.c fails on aarch64-elf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91059 Richard Biener changed: What|Removed |Added Component|tree-optimization |target --- Comment #1 from Richard Biener --- Likely target issue - please aarch64 folks investigate first.
[Bug tree-optimization/83518] [8/9 Regression] Missing optimization: useless instructions should be dropped
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83518 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Known to work||10.0 Resolution|--- |FIXED Summary|[8/9/10 Regression] Missing |[8/9 Regression] Missing |optimization: useless |optimization: useless |instructions should be |instructions should be |dropped |dropped Known to fail||8.3.0, 9.1.0 --- Comment #11 from Richard Biener --- Fixed on trunk, not something for backporting.
[Bug tree-optimization/91108] [8/9/10 Regression] Fails to pun through unions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91108 --- Comment #3 from Richard Biener --- Author: rguenth Date: Mon Jul 8 11:48:48 2019 New Revision: 273233 URL: https://gcc.gnu.org/viewcvs?rev=273233=gcc=rev Log: 2019-07-08 Richard Biener PR tree-optimization/91108 * tree-ssa-sccvn.c: Include builtins.h. (vn_reference_lookup_3): Use only alignment constraints to verify same-valued store disambiguation. * gcc.dg/tree-ssa/pr91091-1.c: New testcase. * gcc.dg/tree-ssa/ssa-fre-78.c: Likewise. Added: branches/gcc-9-branch/gcc/testsuite/gcc.dg/tree-ssa/pr91091-1.c branches/gcc-9-branch/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-78.c Modified: branches/gcc-9-branch/gcc/ChangeLog branches/gcc-9-branch/gcc/testsuite/ChangeLog branches/gcc-9-branch/gcc/tree-ssa-sccvn.c
[Bug tree-optimization/91108] [8/9/10 Regression] Fails to pun through unions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91108 --- Comment #2 from Richard Biener --- Author: rguenth Date: Mon Jul 8 11:46:26 2019 New Revision: 273232 URL: https://gcc.gnu.org/viewcvs?rev=273232=gcc=rev Log: 2019-07-08 Richard Biener PR tree-optimization/91108 * tree-ssa-sccvn.c: Include builtins.h. (vn_reference_lookup_3): Use only alignment constraints to verify same-valued store disambiguation. * gcc.dg/tree-ssa/ssa-fre-61.c: Adjust back. * gcc.dg/tree-ssa/ssa-fre-78.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-78.c Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-61.c trunk/gcc/tree-ssa-sccvn.c
[Bug c/91107] __attribute__((pure)) to function with non-const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91107 --- Comment #2 from Alejandro Colomar --- Technically it can modify globals as long as that doesn't affect the state of the program, but in this case it is affecting the state of the program, so it isn't a pure function. Fair enough, then the bug claim is that GCC shouldn't allow functions accepting non-const pointers. --- Comment #3 from Alejandro Colomar --- Technically it can modify globals as long as that doesn't affect the state of the program, but in this case it is affecting the state of the program, so it isn't a pure function. Fair enough, then the bug claim is that GCC shouldn't allow functions accepting non-const pointers.
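The distinction being argued here can be illustrated with a small example. This is only a sketch with hypothetical function names (sum, sum_and_clear); it relies on the GCC manual's description of the pure attribute as marking functions with no observable effect on program state other than their return value. A function that only reads through a const pointer fits that description, while one that writes through a non-const pointer does not and is left unannotated.

```c
#include <assert.h>
#include <stddef.h>

/* Reads memory through a const pointer and modifies nothing the
   caller can observe, so the pure attribute is appropriate.  */
__attribute__ ((pure))
static int
sum (const int *a, size_t n)
{
  int total = 0;
  for (size_t i = 0; i < n; i++)
    total += a[i];
  return total;
}

/* Writes through its non-const pointer argument, which changes
   program state; marking this pure would be incorrect, so it
   carries no attribute.  */
static int
sum_and_clear (int *a, size_t n)
{
  int total = 0;
  for (size_t i = 0; i < n; i++)
    {
      total += a[i];
      a[i] = 0;
    }
  return total;
}
```

Two consecutive calls to sum on the same unmodified array may be merged by the compiler; the same transformation applied to sum_and_clear would change the program's result, since the second call sees a cleared array.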
[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877 --- Comment #19 from dave.anglin at bell dot net --- On 2019-07-07 8:39 p.m., amylaar at gcc dot gnu.org wrote: > It seems suspicious that PREFERRED_STACK_BOUNDARY is smaller for TARGET_64BIT > ? That's the way HP defined things. The preferred stack boundary for 32-bit code was larger than it needed to be. Possibly, someone thought that making it cache aligned would be good. > > Be this as it may, the problem for the 84877 testcase is not that the stack > has > insufficient alignment, but that the stack slot doesn't have an aligned > offset. > > The alignment gets pruned in function.c:get_stack_local_alignment : > > if (mode == BLKmode) > alignment = BIGGEST_ALIGNMENT; > > I have attached a patch to preserve the alignment of the passed type for the > case that the stack is already sufficiently aligned. > > To test the case where the stack is insufficiently aligned, for hppa we should > use a different testcase with > 512 bit alignment of the type.
[Bug c++/91110] [10 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in cp_omp_mappable_type_1, at cp/decl2.c:1421
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91110 --- Comment #2 from Jakub Jelinek --- Created attachment 46579 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46579=edit gcc10-pr91110.patch error_mark_node type doesn't have TYPE_MAIN_DECL, but more importantly, error_mark_node on a type doesn't mean the type is incomplete, it means the type is invalid, and some diagnostics should have been emitted already why it is invalid. So, IMNSHO we shouldn't emit any clarification messages in that case.
[Bug inline-asm/91111] arm64 Linux kernel panics at boot due to unexpected register assignment in inline asm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111 ktkachov at gcc dot gnu.org changed: What|Removed |Added Target||aarch64 Status|UNCONFIRMED |NEW Known to work||10.0, 9.1.0 Keywords||wrong-code Last reconfirmed||2019-07-08 CC||ktkachov at gcc dot gnu.org Ever confirmed|0 |1 Known to fail||6.5.0, 7.4.1, 8.3.1 --- Comment #1 from ktkachov at gcc dot gnu.org --- Hmm, I see this using x0 properly on GCC 9.1 and trunk, but GCC 8 and earlier use x1.
[Bug c++/91110] [10 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in cp_omp_mappable_type_1, at cp/decl2.c:1421
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91110 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-07-08 CC||jakub at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Target Milestone|--- |10.0 Ever confirmed|0 |1 --- Comment #1 from Jakub Jelinek --- Most likely caused by r273078.
[Bug target/91102] [9/10 Regression] aarch64 ICE on Linux kernel with -Os starting with r270266
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91102 --- Comment #6 from Stefan Kneifel --- It seems to fix the bug - at least the original problem (ICE during compiling Linux kernel for aarch64 with -Os) is solved by this patch.
[Bug inline-asm/91111] New: arm64 Linux kernel panics at boot due to unexpected register assignment in inline asm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
Bug ID: 91111
Summary: arm64 Linux kernel panics at boot due to unexpected register assignment in inline asm
Product: gcc
Version: 8.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: inline-asm
Assignee: unassigned at gcc dot gnu.org
Reporter: will.deacon at arm dot com
Target Milestone: ---

Created attachment 46578 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46578&action=edit
Output of -save-temps

When compiling the Linux kernel for arm64 with CONFIG_OPTIMIZE_INLINING=y (which effectively removes the use of __attribute__((__always_inline__)) for functions marked as inline), the atomic64 selftest fails due to a local register variable being assigned to a different register from the one specified when used in an inline asm block.

While I appreciate that we're treading on thin ice here, my reading of the docs at:

https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables

suggests that this should work. To be more precise, this kernel code:

static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
{
        register long x0 asm ("x0") = (long)v;

        asm volatile(ARM64_LSE_ATOMIC_INSN(
        /* LL/SC */
        __LL_SC_ATOMIC64(dec_if_positive)
        __nops(6),
        /* LSE atomics */
        "1:     ldr     x30, %[v]\n"
        "       subs    %[ret], x30, #1\n"
        "       b.lt    2f\n"
        "       casal   x30, %[ret], %[v]\n"
        "       sub     x30, x30, #1\n"
        "       sub     x30, x30, %[ret]\n"
        "       cbnz    x30, 1b\n"
        "2:")
        : [ret] "+&r" (x0), [v] "+Q" (v->counter)
        :
        : __LL_SC_CLOBBERS, "cc", "memory");

        return x0;
}

requires that %[ret] expands to register x0, whereas it is instead expanding to register x1. You can see this in the assembly code for the function:

        .align  2
        .type   arch_atomic64_dec_if_positive, %function
arch_atomic64_dec_if_positive:
.LVL0:
.LFB244:
        .file 1 "./arch/arm64/include/asm/atomic_lse.h"
        .loc 1 411 1 view -0
        .cfi_startproc
        .loc 1 412 2 view .LVU1
        .loc 1 414 2 view .LVU2
        .loc 1 411 1 is_stmt 0 view .LVU3
        stp     x29, x30, [sp, -16]!
        .cfi_def_cfa_offset 16
        .cfi_offset 29, -16
        .cfi_offset 30, -8
.LVL1:
        .loc 1 414 2 view .LVU4
        mov     x1, x0
        .loc 1 411 1 view .LVU5
        mov     x29, sp
        .loc 1 414 2 view .LVU6
#APP
// 414 "./arch/arm64/include/asm/atomic_lse.h" 1
        .if 1 == 1
661:
        bl      __ll_sc_arch_atomic64_dec_if_positive
        .rept 6
        nop
        .endr
662:
.pushsection .altinstructions,"a"
 .word 661b - .
 .if 0 == 0
 .word 663f - .
 .else
 .word 0- .
 .endif
 .hword 5
 .byte 662b-661b
 .byte 664f-663f
.popsection
 .if 0 == 0
.pushsection .altinstr_replacement, "a"
663:
        1:      ldr     x30, [x0]
        subs    x1, x30, #1
        b.lt    2f
        casal   x30, x1, [x0]
        sub     x30, x30, #1
        sub     x30, x30, x1
        cbnz    x30, 1b
2:
664:
        .popsection
        .org    . - (664b-663b) + (662b-661b)
        .org    . - (662b-661b) + (664b-663b)
.else
        663:
664:
        .endif
.endif

// 0 "" 2
.LVL2:
        .loc 1 414 2 view .LVU7
#NO_APP
        mov     x0, x1
.LVL3:
        .loc 1 431 2 is_stmt 1 view .LVU8
        .loc 1 432 1 is_stmt 0 view .LVU9
        ldp     x29, x30, [sp], 16
        .cfi_restore 30
        .cfi_restore 29
        .cfi_def_cfa_offset 0
        ret
        .cfi_endproc
.LFE244:
        .size   arch_atomic64_dec_if_positive, .-arch_atomic64_dec_if_positive

I've attached the .i/.s files output by:

aarch64-linux-gnu-gcc -save-temps -Wp,-MD,lib/.atomic64_test.o.d -nostdinc -isystem /home/will/system/aarch64/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu/bin/../lib/gcc/aarch64-linux-gnu/8.3.0/include -I./arch/arm64/include -I./arch/arm64/include/generated -I./include -I./arch/arm64/include/uapi -I./arch/arm64/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -mlittle-endian -DKASAN_SHADOW_SCALE_SHIFT=3 -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Wno-format-security -std=gnu89 -mgeneral-regs-only -DCONFIG_AS_LSE=1 -fno-asynchronous-unwind-tables -Wno-psabi -mabi=lp64 -DKASAN_SHADOW_SCALE_SHIFT=3 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -O2 --param=allow-store-data-races=0 -Wframe-larger-than=2048 -fstack-protector-strong -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-var-tracking-assignments -g
[Bug target/91103] AVX512 vector element extract uses more than 1 shuffle instruction; VALIGND can grab any element
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91103 Jakub Jelinek changed: What|Removed |Added CC||hjl.tools at gmail dot com, ||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- For the constant vector element extraction, it can be done say with: --- gcc/config/i386/sse.md.jj 2019-07-06 23:55:51.617641994 +0200 +++ gcc/config/i386/sse.md 2019-07-08 12:23:13.315509840 +0200 @@ -9351,7 +9351,7 @@ (define_insn "avx512f_sgetexp")]) -(define_insn "_align" +(define_insn "_align" [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v") (unspec:VI48_AVX512VL [(match_operand:VI48_AVX512VL 1 "register_operand" "v") (match_operand:VI48_AVX512VL 2 "nonimmediate_operand" "vm") --- gcc/config/i386/i386-expand.c.jj2019-07-04 00:18:37.067010375 +0200 +++ gcc/config/i386/i386-expand.c 2019-07-08 12:37:24.687562956 +0200 @@ -14827,6 +14827,14 @@ ix86_expand_vector_extract (bool mmx_ok, break; case E_V16SFmode: + if (elt > 12) + { + tmp = gen_reg_rtx (V16SImode); + vec = gen_lowpart (V16SImode, vec); + emit_insn (gen_avx512f_alignv16si (tmp, vec, vec, GEN_INT (elt))); + vec = gen_lowpart (V16SFmode, tmp); + elt = 0; + } tmp = gen_reg_rtx (V8SFmode); if (elt < 8) emit_insn (gen_vec_extract_lo_v16sf (tmp, vec)); @@ -14836,6 +14844,14 @@ ix86_expand_vector_extract (bool mmx_ok, return; case E_V8DFmode: + if (elt >= 6) + { + tmp = gen_reg_rtx (V8DImode); + vec = gen_lowpart (V8DImode, vec); + emit_insn (gen_avx512f_alignv8di (tmp, vec, vec, GEN_INT (elt))); + vec = gen_lowpart (V8DFmode, tmp); + elt = 0; + } tmp = gen_reg_rtx (V4DFmode); if (elt < 4) emit_insn (gen_vec_extract_lo_v8df (tmp, vec)); @@ -14845,6 +14861,13 @@ ix86_expand_vector_extract (bool mmx_ok, return; case E_V16SImode: + if (elt > 12) + { + tmp = gen_reg_rtx (V16SImode); + emit_insn (gen_avx512f_alignv16si (tmp, vec, vec, GEN_INT (elt))); + vec = tmp; + elt = 0; + } tmp = gen_reg_rtx (V8SImode); if (elt < 8) emit_insn (gen_vec_extract_lo_v16si (tmp, vec)); @@ -14854,6 +14877,13 @@ 
ix86_expand_vector_extract (bool mmx_ok, return; case E_V8DImode: + if (elt >= 6) + { + tmp = gen_reg_rtx (V8DImode); + emit_insn (gen_avx512f_alignv8di (tmp, vec, vec, GEN_INT (elt))); + vec = tmp; + elt = 0; + } tmp = gen_reg_rtx (V4DImode); if (elt < 4) emit_insn (gen_vec_extract_lo_v8di (tmp, vec)); The question is in which cases it is beneficial, from pure -Os POV the valignd/valignq is one instruction and for integer extractions needs a vmovd afterwards, so for 64-bit extraction might be also useful for double [3] and [5] (for long long it is two insns in both cases), for 32-bit extraction likely also shorter for float [5], [6], [7], [9], [10], [11], [12], but not for int. But I admit I have no idea on how fast what is.
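To make the idea in the patch above concrete, here is a minimal scalar sketch of how a single valignd with both sources equal rotates element `elt` down to lane 0, where an ordinary low-part extract can pick it up. The names `valignd_model` and `extract_via_valign` are hypothetical and exist only for this illustration; this is a plain C model of the instruction's concatenate-and-shift behaviour, not GCC code and not intrinsics.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 16  /* a 512-bit vector holds sixteen 32-bit elements */

/* Scalar model of valignd: concatenate HI:LO, shift right by IMM
   elements and keep the low LANES elements.  Hypothetical helper.  */
static void
valignd_model (uint32_t dst[LANES], const uint32_t hi[LANES],
               const uint32_t lo[LANES], unsigned imm)
{
  uint32_t concat[2 * LANES];

  assert (imm < LANES);
  memcpy (concat, lo, sizeof (uint32_t) * LANES);
  memcpy (concat + LANES, hi, sizeof (uint32_t) * LANES);
  memcpy (dst, concat + imm, sizeof (uint32_t) * LANES);
}

/* With both sources equal, the shift is a rotate, so element ELT
   ends up in lane 0 of the temporary.  */
static uint32_t
extract_via_valign (const uint32_t vec[LANES], unsigned elt)
{
  uint32_t tmp[LANES];

  valignd_model (tmp, vec, vec, elt);
  return tmp[0];
}
```

This mirrors the `gen_avx512f_alignv16si (tmp, vec, vec, GEN_INT (elt))` followed by `elt = 0` pattern in the patch; whether it is faster than the two-shuffle sequence is exactly the open question raised in the comment.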
[Bug c++/91110] New: [10 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in cp_omp_mappable_type_1, at cp/decl2.c:1421
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91110 Bug ID: 91110 Summary: [10 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in cp_omp_mappable_type_1, at cp/decl2.c:1421 Product: gcc Version: 10.0 Status: UNCONFIRMED Keywords: error-recovery, ice-on-invalid-code, openmp Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: asolokha at gmx dot com Target Milestone: --- g++-10.0.0-alpha20190707 snapshot (r273184) ICEs when compiling the following testcase derived from gcc/testsuite/gcc.dg/gomp/_Atomic-5.c w/ -fopenmp: void f1 (void) { X int b[2]; b[0] = 1; #pragma omp target map(to: b) ; } % g++-10.0.0-alpha20190707 -fopenmp -c e8gxtyxe.c e8gxtyxe.c: In function 'void f1()': e8gxtyxe.c:4:3: error: 'X' was not declared in this scope 4 | X int b[2]; | ^ e8gxtyxe.c:5:3: error: 'b' was not declared in this scope 5 | b[0] = 1; | ^ e8gxtyxe.c:6:30: error: 'b' does not have a mappable type in 'map' clause 6 | #pragma omp target map(to: b) | ^ e8gxtyxe.c:6:32: internal compiler error: tree check: expected class 'type', have 'exceptional' (error_mark) in cp_omp_mappable_type_1, at cp/decl2.c:1421 6 | #pragma omp target map(to: b) |^ 0x7d125e tree_class_check_failed(tree_node const*, tree_code_class, char const*, int, char const*) /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/tree.c:9950 0x5f39fa tree_class_check(tree_node*, tree_code_class, char const*, int, char const*) /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/tree.h:3340 0x5f39fa cp_omp_mappable_type_1 /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/decl2.c:1421 0xa43c69 finish_omp_clauses(tree_node*, c_omp_region_type) /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/semantics.c:7241 0x9b3267 cp_parser_omp_all_clauses /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:35735 
0x9c4146 cp_parser_omp_target /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:38918 0x99f583 cp_parser_pragma /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:41352 0x9a76fd cp_parser_statement /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:11279 0x9a8665 cp_parser_statement_seq_opt /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:11667 0x9a8735 cp_parser_compound_statement /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:11621 0x9c0cbc cp_parser_function_body /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:22651 0x9c0cbc cp_parser_ctor_initializer_opt_and_function_body /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:22702 0x9c15ad cp_parser_function_definition_after_declarator /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:28016 0x9c23a3 cp_parser_function_definition_from_specifiers_and_declarator /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:27932 0x9c23a3 cp_parser_init_declarator /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:20288 0x9a4e7d cp_parser_simple_declaration /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:13546 0x9c8822 cp_parser_declaration /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:13243 0x9c8eb8 cp_parser_translation_unit /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:4699 0x9c8eb8 c_parse_file() /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/cp/parser.c:41495 0xad25ec c_common_parse_file() /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190707/work/gcc-10-20190707/gcc/c-family/c-opts.c:1160
[Bug target/91106] internal compiler error: output_operand: invalid use of register 'frame'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91106

--- Comment #2 from Shubham Narlawar ---
(In reply to Richard Biener from comment #1)
> Did you paste the correct reduced testcase?

Here is the original reduced test case obtained from Creduce:

#pragma pack(1)
struct a { int b; char c };
union { struct a b } __attribute__((aligned(32), transparent_union)) d;
e() { f(d); }

I tried to fix the warnings by adding semicolons, data types and function declarations wherever required.
[Bug middle-end/91105] internal compiler error: maximum number of generated reload insns per insn achieved (90)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91105 Uroš Bizjak changed: What|Removed |Added Component|target |middle-end Depends on||91001 --- Comment #2 from Uroš Bizjak --- Not a target problem. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91001 [Bug 91001] internal compiler error: in extract_insn, at recog.c:2310
[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877 Jorn Wolfgang Rennecke changed: What|Removed |Added Attachment #46574|0 |1 is obsolete|| --- Comment #18 from Jorn Wolfgang Rennecke --- Created attachment 46577 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46577=edit patch for aligned stack - but clamping max alignment at MAX_SUPPORTED_STACK_ALIGNMENT (In reply to r...@cebitec.uni-bielefeld.de from comment #17) > > --- Comment #15 from Jorn Wolfgang Rennecke --- > > Created attachment 46574 [details] > > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46574=edit > > patch for the case that the stack is sufficiently aligned > [...] > > I have attached a patch to preserve the alignment of the passed type for the > > case that the stack is already sufficiently aligned. > > This patch breaks sparc-sun-solaris2.11 bootstrap with an ICE while > compiling stage2 function.c: > > during RTL pass: expand > /vol/gcc/src/hg/trunk/local/gcc/function.c: In function 'void > assign_parm_find_data_types(assign_parm_data_all*, tree, > assign_parm_data_one*)': > /vol/gcc/src/hg/trunk/local/gcc/function.c:2426:49: internal compiler error: This location doesn't make much sense to me. Maybe some artefact from optimized compilation and register windows? > in assign_stack_temp_for_type, at function.c:880 > 2426 | else if (targetm.calls.strict_argument_naming (all->args_so_far)) > |~^~ > 0x11bc22f assign_stack_temp_for_type(machine_mode, poly_int<1u, long long>, > tree_node*) > /vol/gcc/src/hg/trunk/local/gcc/function.c:878 > 0x11bc963 assign_temp(tree_node*, int, int) This looks like the modified assert there has triggered. It'd be interesting to know why - i.e. what variable does want more alignment than MAX_SUPPORTED_STACK_ALIGNMENT - during bootstrap? Or is this a BLKmode variable with less alignment than BIGGEST_ALIGNMENT? 
User code could specify silly alignments which we couldn't provide with ordinary allocation (using a fixed offset from sp/fp) and which could also blow up the frame size too much if we tried, so it makes sense to clamp the alignment to MAX_SUPPORTED_STACK_ALIGNMENT in get_stack_local_alignment. The other side is that the code in assign_stack_temp_for_type seems to require BIGGEST_ALIGNMENT for BLKmode; I'm not sure about assign_stack_local_1 slots. It seems a bit wasteful, but trying to reduce waste of space in the stack frame is really a different issue, so I also modified the patch to use at least BIGGEST_ALIGNMENT for BLKmode so that it's (bug-?)compatible in that aspect with the previous code - see attached modified patch.
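The policy described above (raise BLKmode slots to at least BIGGEST_ALIGNMENT, then clamp to MAX_SUPPORTED_STACK_ALIGNMENT) can be sketched as follows. This is only an illustration under stated assumptions: the `SKETCH_*` constants are made-up values in bits (the real macros are target-specific), and `sketch_stack_slot_alignment` is a hypothetical helper, not the code in the attached patch or in get_stack_local_alignment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values in bits; the real macros depend on the target.  */
#define SKETCH_BIGGEST_ALIGNMENT 128u
#define SKETCH_MAX_SUPPORTED_STACK_ALIGNMENT 256u

/* Return the alignment (in bits) a stack slot would get under the
   policy described in the comment.  Hypothetical helper.  */
static unsigned
sketch_stack_slot_alignment (unsigned requested_bits, bool is_blkmode)
{
  unsigned align = requested_bits;

  /* Stay compatible with the previous behaviour for BLKmode.  */
  if (is_blkmode && align < SKETCH_BIGGEST_ALIGNMENT)
    align = SKETCH_BIGGEST_ALIGNMENT;

  /* Never promise more than fixed-offset allocation can deliver.  */
  if (align > SKETCH_MAX_SUPPORTED_STACK_ALIGNMENT)
    align = SKETCH_MAX_SUPPORTED_STACK_ALIGNMENT;

  return align;
}
```

With these made-up limits, an over-aligned request such as 4096 bits is clamped to 256, while a byte-aligned BLKmode object is still given 128.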
[Bug tree-optimization/91109] [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109 Richard Biener changed: What|Removed |Added CC||rguenth at gcc dot gnu.org Target Milestone|--- |10.0 --- Comment #1 from Richard Biener --- Can you help and check which test* () call fails? Also check whether -fwhole-program instead of -flto makes it fail. Does it still fail when you comment all but the failing test* () call?
[Bug c/91107] __attribute__((pure)) to function with non-const pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91107 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Richard Biener --- This function isn't pure. GCC would optimize dest[0] = 0.; array_division (n, dest, src1, src2); return dest[0]; to return 0.0 since pure functions are assumed to not write to (global) memory.
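As a sketch of the distinction drawn above: a function that only reads memory may be declared pure, while one that writes through a pointer argument may not. The signature of `array_division` below is an assumption (the report's actual declaration is not quoted here), and `sum` and `caller` are hypothetical names added for contrast.

```c
#include <assert.h>
#include <stddef.h>

/* Only reads memory and has no side effects, so "pure" is fine.  */
__attribute__ ((pure)) static double
sum (size_t n, const double *src)
{
  double total = 0.0;

  for (size_t i = 0; i < n; i++)
    total += src[i];
  return total;
}

/* Writes through DEST, so it must not be declared pure.
   Assumed signature, for illustration only.  */
static void
array_division (size_t n, double *dest, const double *src1,
                const double *src2)
{
  for (size_t i = 0; i < n; i++)
    dest[i] = src1[i] / src2[i];
}

/* The sequence from the comment: because array_division is not pure
   here, the compiler has to reload dest[0] after the call.  */
static double
caller (size_t n, double *dest, const double *src1, const double *src2)
{
  dest[0] = 0.;
  array_division (n, dest, src1, src2);
  return dest[0];
}
```

If `array_division` carried the pure attribute, the compiler would be entitled to assume `dest[0]` is still 0.0 after the call, which is the miscompilation-looking result the comment describes.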
[Bug target/91106] internal compiler error: output_operand: invalid use of register 'frame'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91106 --- Comment #1 from Richard Biener --- Did you paste the correct reduced testcase?
[Bug c++/66999] Missing comma in lambda capture causes internal compiler error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66999 Paolo Carlini changed: What|Removed |Added CC||paolo.carlini at oracle dot com --- Comment #6 from Paolo Carlini --- Unfortunately we still issue two errors for the original testcase.
[Bug target/91105] internal compiler error: maximum number of generated reload insns per insn achieved (90)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91105 Richard Biener changed: What|Removed |Added Keywords||ice-on-valid-code, ra Status|UNCONFIRMED |NEW Last reconfirmed||2019-07-08 Component|middle-end |target Ever confirmed|0 |1 Known to fail||4.8.5, 7.4.0 --- Comment #1 from Richard Biener --- Never worked it seems.
[Bug c++/65143] [C++11] missing devirtualization for virtual base in "final" classes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65143 Paolo Carlini changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|--- |10.0 --- Comment #11 from Paolo Carlini --- Should be completely fixed.
[Bug c++/65143] [C++11] missing devirtualization for virtual base in "final" classes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65143

--- Comment #10 from paolo at gcc dot gnu.org ---
Author: paolo
Date: Mon Jul 8 09:51:07 2019
New Revision: 273228

URL: https://gcc.gnu.org/viewcvs?rev=273228&root=gcc&view=rev
Log:
2019-07-08 Paolo Carlini

PR c++/65143
* g++.dg/tree-ssa/final2.C: New.
* g++.dg/tree-ssa/final3.C: Likewise.

Added:
trunk/gcc/testsuite/g++.dg/tree-ssa/final2.C
trunk/gcc/testsuite/g++.dg/tree-ssa/final3.C
Modified:
trunk/gcc/testsuite/ChangeLog
[Bug tree-optimization/91109] New: [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91109 Bug ID: 91109 Summary: [10 regression][arm] gcc.c-torture/execute/20040709-1.c fails since r273135 Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: clyon at gcc dot gnu.org Target Milestone: --- Hi, I've noticed that since r273135 (fix for PR91091), there is a regression on arm-none-linux-gnueabi --with-mode arm --with-cpu cortex-a9 FAIL: gcc.c-torture/execute/20040709-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test There's no such regression on arm-none-linux-gnueabihf or if using --with-mode thumb
[Bug target/91103] AVX512 vector element extract uses more than 1 shuffle instruction; VALIGND can grab any element
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91103

--- Comment #2 from Richard Biener ---
(In reply to Richard Biener from comment #1)
> So when the vectorizer has the need to use strided stores it would be
> cheapest
> to spill the vector and do N element loads and stores? I guess we can easily
> get bottle-necked by the load/store op bandwidth here? That is, the
> vectorizer needs
>
> for (lane)
> dest[stride * lane] = vector[lane];
>
> thus store a specific (constant) lane of a vector to memory, for each
> vector lane. (we could use a scatter store here but only AVX512 has that
> and building the index vector could be tricky and not supported for all
> element types)

Indeed ICC seems to spill for AVX and AVX512 for

typedef int vsi __attribute__((vector_size(SIZE)));

void foo (vsi v, int *p, int *o)
{
  for (int i = 0; i < sizeof(vsi)/4; ++i)
    p[o[i]] = v[i];
}
[Bug target/91103] AVX512 vector element extract uses more than 1 shuffle instruction; VALIGND can grab any element
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91103

Richard Biener changed:

What |Removed |Added
CC | |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener ---
So when the vectorizer has the need to use strided stores it would be cheapest to spill the vector and do N element loads and stores? I guess we can easily get bottle-necked by the load/store op bandwidth here? That is, the vectorizer needs

for (lane)
  dest[stride * lane] = vector[lane];

thus store a specific (constant) lane of a vector to memory, for each vector lane. (we could use a scatter store here but only AVX512 has that and building the index vector could be tricky and not supported for all element types)
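Written out in scalar C, the per-lane store loop from the comment looks roughly like the sketch below. The lane count of 8, the `int` element type and the function name `strided_store` are assumptions chosen for illustration; the point is only that each iteration needs one specific, constant lane of the vector.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 8  /* assumed lane count, e.g. a 256-bit vector of int */

/* Store each lane of VECTOR to DEST at a fixed stride.  Every
   iteration extracts one constant lane, which is the operation whose
   cost the comment is asking about.  */
static void
strided_store (int *dest, size_t stride, const int vector[LANES])
{
  for (size_t lane = 0; lane < LANES; lane++)
    dest[stride * lane] = vector[lane];
}
```

Spilling the vector to the stack and reloading scalars turns each lane extract into a plain load, which is the alternative the comment weighs against shuffle-based extraction.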
[Bug c++/80518] -Wsuggest-override does not warn about missing override on destructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80518 --- Comment #7 from Jonathan Wakely --- The guideline might be changing: https://github.com/isocpp/CppCoreGuidelines/pull/1448 If that pull request is merged we might want to change -Wsuggest-override too, without needing a separate option.
[Bug target/91060] [10 regression] gcc.c-torture/execute/scal-to-vec1.c fails on armeb-none-linux-gnueabihf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060

--- Comment #14 from rsandifo at gcc dot gnu.org ---
Created attachment 46576
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46576&action=edit
Candidate patch

I'll test the attached overnight.
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 Martin Liška changed: What|Removed |Added Status|WAITING |ASSIGNED Known to work||8.3.1 Known to fail||9.1.0 --- Comment #8 from Martin Liška --- (In reply to Frantisek Sumsal from comment #7) > (In reply to Martin Liška from comment #6) > > > Do you know how to tell meson to use CC=gcc-8? > > > > $ export CC=gcc-8 CXX=g++-8 > $ meson build ... > > should suffice Great, now I can confirm that!
[Bug target/91060] [10 regression] gcc.c-torture/execute/scal-to-vec1.c fails on armeb-none-linux-gnueabihf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060

--- Comment #13 from Christophe Lyon ---
Indeed, this seems to work:

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 820502a..4f69122 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12471,7 +12471,7 @@ neon_expand_vector_init (rtx target, rtx vals)
   if (n_var == 1)
     {
       rtx copy = copy_rtx (vals);
-      rtx index = GEN_INT (one_var);
+      rtx index = GEN_INT (1 << one_var);
 
       /* Load constant part of vector, substitute neighboring value for
          varying element. */
@@ -12483,31 +12483,40 @@ neon_expand_vector_init (rtx target, rtx vals)
       switch (mode)
         {
         case E_V8QImode:
-          emit_insn (gen_neon_vset_lanev8qi (target, x, target, index));
+          emit_insn (gen_vec_setv8qi_internal (target, x, index, target));
           break;
         case E_V16QImode:
-          emit_insn (gen_neon_vset_lanev16qi (target, x, target, index));
+          emit_insn (gen_vec_setv16qi_internal (target, x, index, target));
           break;
         case E_V4HImode:
-          emit_insn (gen_neon_vset_lanev4hi (target, x, target, index));
+          emit_insn (gen_vec_setv4hi_internal (target, x, index, target));
           break;
         case E_V8HImode:
-          emit_insn (gen_neon_vset_lanev8hi (target, x, target, index));
+          emit_insn (gen_vec_setv8hi_internal (target, x, index, target));
           break;
         case E_V2SImode:
-          emit_insn (gen_neon_vset_lanev2si (target, x, target, index));
+          emit_insn (gen_vec_setv2si_internal (target, x, index, target));
           break;
         case E_V4SImode:
-          emit_insn (gen_neon_vset_lanev4si (target, x, target, index));
+          emit_insn (gen_vec_setv4si_internal (target, x, index, target));
           break;
         case E_V2SFmode:
-          emit_insn (gen_neon_vset_lanev2sf (target, x, target, index));
+          emit_insn (gen_vec_setv2sf_internal (target, x, index, target));
           break;
         case E_V4SFmode:
-          emit_insn (gen_neon_vset_lanev4sf (target, x, target, index));
+          emit_insn (gen_vec_setv4sf_internal (target, x, index, target));
           break;
         case E_V2DImode:
-          emit_insn (gen_neon_vset_lanev2di (target, x, target, index));
+          emit_insn (gen_vec_setv2di_internal (target, x, index, target));
           break;
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101

--- Comment #7 from Frantisek Sumsal ---
(In reply to Martin Liška from comment #6)
> Do you know how to tell meson to use CC=gcc-8?

$ export CC=gcc-8 CXX=g++-8
$ meson build ...

should suffice
[Bug tree-optimization/91108] [8/9/10 Regression] Fails to pun through unions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91108 Richard Biener changed: What|Removed |Added Priority|P3 |P2 Status|UNCONFIRMED |ASSIGNED Known to work||7.4.0 Keywords||alias, wrong-code Last reconfirmed||2019-07-08 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 Target Milestone|--- |8.4 --- Comment #1 from Richard Biener --- Mine.
[Bug tree-optimization/91108] New: [8/9/10 Regression] Fails to pun through unions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91108

Bug ID: 91108
Summary: [8/9/10 Regression] Fails to pun through unions
Product: gcc
Version: 9.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

The following testcase fails to support our promise for punning through union members if the access happens through the union.

/* { dg-do run } */
/* { dg-options "-O3 -fstrict-aliasing" } */

union U {
  struct A { int : 2; int x : 8; } a;
  struct B { int : 6; int x : 8; } b;
};

int __attribute__((noipa))
foo (union U *p, union U *q)
{
  p->a.x = 1;
  q->b.x = 1;
  return p->a.x;
}

int
main()
{
  union U x;
  if (foo (&x, &x) != x.a.x)
    __builtin_abort ();
  return 0;
}
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 --- Comment #6 from Martin Liška --- (In reply to Frantisek Sumsal from comment #5) > (In reply to Martin Liška from comment #4) > > Ok, I was able to make the build: > > > > $ meson build -Db_sanitize=address,undefined -Dxkbcommon=false > > > > with GCC 9.1.1: > > > > real0m2.176s > > user0m2.013s > > sys 0m0.160s > > > > which is probably fast enough. And I can't run the second test-case: > > > > Yes, without any ASAN_OPTIONS the built binary behaves as "expected: > > --- > > $ unset ASAN_OPTIONS > $ time build-gcc-9.1.0-sanitizers/test-conf-parser > <...snip...> > = test_config_parse[16] == > /tmp/test-conf-parser.cvqFVQ:1: Continuation line too long > > real 0m2.972s > user 0m2.680s > sys 0m0.280s > > --- > > The real issue arises with ASAN_OPTIONS=detect_stack_use_after_return=1 Ahh, got it. Now it's really much slower. Do you know how to tell meson to use CC=gcc-8? > > --- > > $ export ASAN_OPTIONS=detect_stack_use_after_return=1 > $ time build-gcc-9.1.0-sanitizers/test-conf-parser > <...snip...> > == test_config_parse[16] == > /tmp/test-conf-parser.WhLgS1:1: Continuation line too long > > real 0m29.637s > user 0m29.321s > sys 0m0.298s > > --- > > > > $ ./test/hwdb-test.sh > > ./systemd-hwdb does not exist, please build first > > For this particular case you have to cd into the build directory first (cd > build && ../test/hwdb-test.sh) Good.
[Bug target/91060] [10 regression] gcc.c-torture/execute/scal-to-vec1.c fails on armeb-none-linux-gnueabihf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060 --- Comment #12 from rguenther at suse dot de --- On Mon, 8 Jul 2019, rsandifo at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060 > > rsandifo at gcc dot gnu.org changed: > >What|Removed |Added > > Component|middle-end |target >Assignee|rguenth at gcc dot gnu.org |rsandifo at gcc dot > gnu.org > > --- Comment #11 from rsandifo at gcc dot gnu.org gnu.org> --- > (In reply to rguent...@suse.de from comment #10) > > On Mon, 8 Jul 2019, clyon at gcc dot gnu.org wrote: > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060 > > > > > > --- Comment #8 from Christophe Lyon --- > > > (In reply to Richard Biener from comment #5) > > > > Hmm, using a cross configured as > > > > > > > > trunk/configure --target=armeb-none-linux-gnueabihf --with-cpu=cortex-a9 > > > > --with-fpu=neon-fp16 --enable-languages=c > > > > > > > > and trimming the testcase to the first line I cannot reproduce the > > > > reported > > > > assembly. I get at -O3 > > > > > > > > .arm > > > > .fpu softvfp > > > > > > For some reason, you are not targeting the right FPU, I have: > > > .arm > > > .fpu neon-fp16 > > > > I noticed that - but it doesn't change even when supplying > > -mpfu=neon-fp16 -mcpu=cortex-a9 > > > > I suppose some configure-time checking disables this feature somehow > > without notifying me :/ (don't have a armeb assembler installed, > > trying a pure cc1 cross) > > > > Anyway, I can't reproduce even after spending 1+ hours on this. > > Yeah, I see the same thing building it that way. I needed to restate > the abi using -mfloat-abi=hard. Even when adding -mfloat-abi=hard I see .fpu softvfp ... > I'm pretty sure it's a target bug though. If a vector constructor > has a single nonconstant element, neon_expand_vector_init uses the > neon_vset_lane* patterns to set that index. But neon_vset_lane* > use the architecture lane numbering while neon_expand_vector_init > uses GCC lane numbering. 
Using the vec_set(_internal) patterns > should fix that. I'm out of here then ;)
[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877 --- Comment #17 from ro at CeBiTec dot Uni-Bielefeld.DE --- > --- Comment #15 from Jorn Wolfgang Rennecke --- > Created attachment 46574 > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46574=edit > patch for the case that the stack is sufficiently aligned [...] > I have attached a patch to preserve the alignment of the passed type for the > case that the stack is already sufficiently aligned. This patch breaks sparc-sun-solaris2.11 bootstrap with an ICE while compiling stage2 function.c: during RTL pass: expand /vol/gcc/src/hg/trunk/local/gcc/function.c: In function 'void assign_parm_find_data_types(assign_parm_data_all*, tree, assign_parm_data_one*)': /vol/gcc/src/hg/trunk/local/gcc/function.c:2426:49: internal compiler error: in assign_stack_temp_for_type, at function.c:880 2426 | else if (targetm.calls.strict_argument_naming (all->args_so_far)) |~^~ 0x11bc22f assign_stack_temp_for_type(machine_mode, poly_int<1u, long long>, tree_node*) /vol/gcc/src/hg/trunk/local/gcc/function.c:878 0x11bc963 assign_temp(tree_node*, int, int) /vol/gcc/src/hg/trunk/local/gcc/function.c:1016 0xeab99b initialize_argument_information /vol/gcc/src/hg/trunk/local/gcc/calls.c:2087 0xeb1957 expand_call(tree_node*, rtx_def*, int) /vol/gcc/src/hg/trunk/local/gcc/calls.c:3605 0x112a247 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) /vol/gcc/src/hg/trunk/local/gcc/expr.c:11044 0x111919b expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) /vol/gcc/src/hg/trunk/local/gcc/expr.c:8286 0x110a06b store_expr(tree_node*, rtx_def*, int, bool, bool) /vol/gcc/src/hg/trunk/local/gcc/expr.c:5685 0x11085bf expand_assignment(tree_node*, tree_node*, bool) /vol/gcc/src/hg/trunk/local/gcc/expr.c:5447 0xed8cb3 expand_call_stmt /vol/gcc/src/hg/trunk/local/gcc/cfgexpand.c:2727 0xedd453 expand_gimple_stmt_1 /vol/gcc/src/hg/trunk/local/gcc/cfgexpand.c:3708 0xedde3b expand_gimple_stmt 
/vol/gcc/src/hg/trunk/local/gcc/cfgexpand.c:3867 0xee937f expand_gimple_basic_block /vol/gcc/src/hg/trunk/local/gcc/cfgexpand.c:5907 0xeeb7c3 execute /vol/gcc/src/hg/trunk/local/gcc/cfgexpand.c:6530
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101

--- Comment #5 from Frantisek Sumsal ---
(In reply to Martin Liška from comment #4)
> Ok, I was able to make the build:
>
> $ meson build -Db_sanitize=address,undefined -Dxkbcommon=false
>
> with GCC 9.1.1:
>
> real 0m2.176s
> user 0m2.013s
> sys 0m0.160s
>
> which is probably fast enough. And I can't run the second test-case:

Yes, without any ASAN_OPTIONS the built binary behaves as expected:

---
$ unset ASAN_OPTIONS
$ time build-gcc-9.1.0-sanitizers/test-conf-parser
<...snip...>
= test_config_parse[16] ==
/tmp/test-conf-parser.cvqFVQ:1: Continuation line too long

real 0m2.972s
user 0m2.680s
sys 0m0.280s
---

The real issue arises with ASAN_OPTIONS=detect_stack_use_after_return=1

---
$ export ASAN_OPTIONS=detect_stack_use_after_return=1
$ time build-gcc-9.1.0-sanitizers/test-conf-parser
<...snip...>
== test_config_parse[16] ==
/tmp/test-conf-parser.WhLgS1:1: Continuation line too long

real 0m29.637s
user 0m29.321s
sys 0m0.298s
---

> $ ./test/hwdb-test.sh
> ./systemd-hwdb does not exist, please build first

For this particular case you have to cd into the build directory first (cd build && ../test/hwdb-test.sh)
[Bug sanitizer/91101] Possible performance regression in libasan with detect_stack_use_after_return=1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91101 Martin Liška changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2019-07-08 Ever confirmed|0 |1 --- Comment #4 from Martin Liška --- Ok, I was able to make the build: $ meson build -Db_sanitize=address,undefined -Dxkbcommon=false with GCC 9.1.1: $ time ./build/test-conf-parser filename:1: lvalue= path is not absolute, ignoring: not_absolute/path filename:1: String is not UTF-8 clean, ignoring assignment: /path/� filename:1: Failed to parse log level, ignoring: garbage filename:1: Failed to parse log facility, ignoring: garbage filename:1: Failed to parse size value '-982', ignoring: Numerical result out of range filename:1: Failed to parse size value '498719873987300G', ignoring: Numerical result out of range filename:1: Failed to parse size value 'garbage', ignoring: Invalid argument filename:1: Failed to parse size value '-982', ignoring: Numerical result out of range filename:1: Failed to parse size value '498719873987300G', ignoring: Numerical result out of range filename:1: Failed to parse size value 'garbage', ignoring: Invalid argument filename:1: Failed to parse int value, ignoring: filename:1: Failed to parse int value, ignoring: - filename:1: Failed to parse int value, ignoring: 1G filename:1: Failed to parse int value, ignoring: garbage filename:1: Failed to parse unsigned value, ignoring: filename:1: Failed to parse unsigned value, ignoring: 1G filename:1: Failed to parse unsigned value, ignoring: garbage filename:1: Failed to parse unsigned value, ignoring: 1000garbage filename:1: Failed to parse mode value, ignoring: -777 filename:1: Failed to parse mode value, ignoring: 999 filename:1: Failed to parse mode value, ignoring: garbage filename:1: Failed to parse mode value, ignoring: 777garbage filename:1: Failed to parse mode value, ignoring: 777 garbage filename:1: Failed to parse sec value, ignoring: -1 filename:1: Failed to parse sec value, ignoring: 10foo filename:1: Failed to 
parse sec value, ignoring: garbage filename:1: Failed to parse nsec value, ignoring: -1 filename:1: Failed to parse nsec value, ignoring: 10foo filename:1: Failed to parse nsec value, ignoring: garbage == test_config_parse[0] == == test_config_parse[1] == == test_config_parse[2] == == test_config_parse[3] == == test_config_parse[4] == == test_config_parse[5] == == test_config_parse[6] == == test_config_parse[7] == == test_config_parse[8] == == test_config_parse[9] == == test_config_parse[10] == == test_config_parse[11] == == test_config_parse[12] == == test_config_parse[13] == == test_config_parse[14] == == test_config_parse[15] == /tmp/test-conf-parser.l7EgI7:1: Line too long == test_config_parse[16] == /tmp/test-conf-parser.2Fj9TE:1: Continuation line too long real0m2.176s user0m2.013s sys 0m0.160s which is probably fast enough. And I can't run the second test-case: $ ./test/hwdb-test.sh ./systemd-hwdb does not exist, please build first
[Bug target/91060] [10 regression] gcc.c-torture/execute/scal-to-vec1.c fails on armeb-none-linux-gnueabihf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060 rsandifo at gcc dot gnu.org changed: What|Removed |Added Component|middle-end |target Assignee|rguenth at gcc dot gnu.org |rsandifo at gcc dot gnu.org --- Comment #11 from rsandifo at gcc dot gnu.org --- (In reply to rguent...@suse.de from comment #10) > On Mon, 8 Jul 2019, clyon at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060 > > > > --- Comment #8 from Christophe Lyon --- > > (In reply to Richard Biener from comment #5) > > > Hmm, using a cross configured as > > > > > > trunk/configure --target=armeb-none-linux-gnueabihf --with-cpu=cortex-a9 > > > --with-fpu=neon-fp16 --enable-languages=c > > > > > > and trimming the testcase to the first line I cannot reproduce the > > > reported > > > assembly. I get at -O3 > > > > > > .arm > > > .fpu softvfp > > > > For some reason, you are not targeting the right FPU, I have: > > .arm > > .fpu neon-fp16 > > I noticed that - but it doesn't change even when supplying > -mpfu=neon-fp16 -mcpu=cortex-a9 > > I suppose some configure-time checking disables this feature somehow > without notifying me :/ (don't have a armeb assembler installed, > trying a pure cc1 cross) > > Anyway, I can't reproduce even after spending 1+ hours on this. Yeah, I see the same thing building it that way. I needed to restate the abi using -mfloat-abi=hard. I'm pretty sure it's a target bug though. If a vector constructor has a single nonconstant element, neon_expand_vector_init uses the neon_vset_lane* patterns to set that index. But neon_vset_lane* use the architecture lane numbering while neon_expand_vector_init uses GCC lane numbering. Using the vec_set(_internal) patterns should fix that.
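The numbering mismatch described above can be illustrated with a small sketch. Two things are modelled: the one-hot lane mask that the candidate change passes as `GEN_INT (1 << one_var)`, and a conversion from GCC lane numbers to architectural lane numbers. The conversion rule used here (a simple mirror image on big-endian) is an assumption made for illustration, and all three function names are hypothetical; the real patterns in the ARM backend may differ in detail.

```c
#include <assert.h>

/* Encode a GCC lane number as a one-hot mask, mirroring the
   GEN_INT (1 << one_var) operand.  Hypothetical helper.  */
static unsigned
lane_to_mask (unsigned lane)
{
  assert (lane < 32);
  return 1u << lane;
}

/* Decode a one-hot mask back to the lane number it selects.  */
static unsigned
mask_to_lane (unsigned mask)
{
  unsigned lane = 0;

  assert (mask != 0 && (mask & (mask - 1)) == 0);
  while ((mask >> lane) != 1u)
    lane++;
  return lane;
}

/* ASSUMPTION for illustration: on big-endian the architectural lane
   of an NUNITS-element vector is the mirror image of the GCC lane.  */
static unsigned
gcc_lane_to_arch_lane (unsigned gcc_lane, unsigned nunits, int big_endian)
{
  assert (gcc_lane < nunits);
  return big_endian ? nunits - 1 - gcc_lane : gcc_lane;
}
```

Under that assumption, passing a GCC lane number straight to a pattern that expects architectural numbering is harmless on little-endian but selects the wrong element on big-endian, which would fit a failure seen only on armeb.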
[Bug tree-optimization/83518] [8/9/10 Regression] Missing optimization: useless instructions should be dropped
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83518

--- Comment #10 from Richard Biener ---
Author: rguenth
Date: Mon Jul 8 07:09:24 2019
New Revision: 273194

URL: https://gcc.gnu.org/viewcvs?rev=273194&root=gcc&view=rev
Log:
2019-07-08 Richard Biener

PR tree-optimization/83518
* tree-ssa-sccvn.c: Include splay-tree.h.
(struct pd_range, struct pd_data): New.
(struct vn_walk_cb_data): Add data to track partial definitions.
(vn_walk_cb_data::~vn_walk_cb_data): New.
(vn_walk_cb_data::push_partial_def): New.
(pd_tree_alloc, pd_tree_dealloc, pd_range_compare): New.
(vn_reference_lookup_2): When partial defs are registered give up.
(vn_reference_lookup_3): Track partial defs for memset and
constructor zeroing and for defs from constants.

* gcc.dg/tree-ssa/ssa-fre-73.c: New testcase.
* gcc.dg/tree-ssa/ssa-fre-74.c: Likewise.
* gcc.dg/tree-ssa/ssa-fre-75.c: Likewise.
* gcc.dg/tree-ssa/ssa-fre-76.c: Likewise.
* g++.dg/tree-ssa/pr83518.C: Likewise.

Added:
trunk/gcc/testsuite/g++.dg/tree-ssa/pr83518.C
trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-73.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-74.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-75.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-76.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-sccvn.c
[Bug middle-end/91060] [10 regression] gcc.c-torture/execute/scal-to-vec1.c fails on armeb-none-linux-gnueabihf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060

--- Comment #10 from rguenther at suse dot de ---
On Mon, 8 Jul 2019, clyon at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060
>
> --- Comment #8 from Christophe Lyon ---
> (In reply to Richard Biener from comment #5)
> > Hmm, using a cross configured as
> >
> > trunk/configure --target=armeb-none-linux-gnueabihf --with-cpu=cortex-a9
> > --with-fpu=neon-fp16 --enable-languages=c
> >
> > and trimming the testcase to the first line I cannot reproduce the
> > reported assembly. I get at -O3
> >
> > .arm
> > .fpu softvfp
>
> For some reason, you are not targeting the right FPU, I have:
> .arm
> .fpu neon-fp16

I noticed that - but it doesn't change even when supplying
-mpfu=neon-fp16 -mcpu=cortex-a9

I suppose some configure-time checking disables this feature somehow
without notifying me :/ (don't have a armeb assembler installed,
trying a pure cc1 cross)

Anyway, I can't reproduce even after spending 1+ hours on this.
[Bug middle-end/91060] [10 regression] gcc.c-torture/execute/scal-to-vec1.c fails on armeb-none-linux-gnueabihf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060

--- Comment #9 from rsandifo at gcc dot gnu.org ---
(In reply to rsand...@gcc.gnu.org from comment #7)
> (In reply to Christophe Lyon from comment #4)
> > Unfortunately, it's still failing as of r273133.
> >
> > It fails at the very first check:
> > v1 = 2 + v0; check (short, 8, v0, v1, 2, +, l);
> >
> > The generated code for main is:
> > main:
> ...
> > vmov.16   d16[0], r0
> > sxth      r1, r0
> > vadd.i16  q0, q8, q9
> > add       ip, r1, #2
> > vmov.s16  r2, d0[3]
>
> Yeah, this looks wrong. We should be adding 2 to a single element
> here, but we're extracting from one index and inserting into another.
> The first quoted instruction should be using [3] as well.
>
> I'd be unsurprised if this was a target bug.

Er, pretend that message never happened, first thing Monday morning :-)
[Bug middle-end/91060] [10 regression] gcc.c-torture/execute/scal-to-vec1.c fails on armeb-none-linux-gnueabihf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060

--- Comment #8 from Christophe Lyon ---
(In reply to Richard Biener from comment #5)
> Hmm, using a cross configured as
>
> trunk/configure --target=armeb-none-linux-gnueabihf --with-cpu=cortex-a9
> --with-fpu=neon-fp16 --enable-languages=c
>
> and trimming the testcase to the first line I cannot reproduce the reported
> assembly. I get at -O3
>
> .arm
> .fpu softvfp

For some reason, you are not targeting the right FPU, I have:
.arm
.fpu neon-fp16
[Bug middle-end/91060] [10 regression] gcc.c-torture/execute/scal-to-vec1.c fails on armeb-none-linux-gnueabihf since r272843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91060

--- Comment #7 from rsandifo at gcc dot gnu.org ---
(In reply to Christophe Lyon from comment #4)
> Unfortunately, it's still failing as of r273133.
>
> It fails at the very first check:
> v1 = 2 + v0; check (short, 8, v0, v1, 2, +, l);
>
> The generated code for main is:
> main:
...
> vmov.16   d16[0], r0
> sxth      r1, r0
> vadd.i16  q0, q8, q9
> add       ip, r1, #2
> vmov.s16  r2, d0[3]

Yeah, this looks wrong. We should be adding 2 to a single element
here, but we're extracting from one index and inserting into another.
The first quoted instruction should be using [3] as well.

I'd be unsurprised if this was a target bug.
[Bug c++/85746] Premature evaluation of __builtin_constant_p?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85746

--- Comment #8 from rsandifo at gcc dot gnu.org ---
(In reply to Marc Glisse from comment #7)
> (In reply to Marc Glisse from comment #6)
> > && xi.val[0] <= (HOST_WIDE_INT) ((unsigned HOST_WIDE_INT)
> > HOST_WIDE_INT_MAX >> shift))
>
> The issue occurs with xi.val[0] == -9223372036854775808 (lshift_large
> returns a result of length 2 for that). I don't know if the code mishandles
> this case, or if such a number is not supposed to exist in the first place,
> but that does seem like a bug.

Yeah, looks like this should have been an unsigned HOST_WIDE_INT
comparison instead, i.e. casting xi.val[0] rather than the shift result.
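The difference between the two comparisons can be seen in isolation with a small sketch. It assumes a 64-bit HOST_WIDE_INT and uses int64_t/uint64_t as stand-ins; the function names are hypothetical and this is not the wide-int code itself. The signed form accepts the most negative value because any negative number is below the positive limit, whereas the unsigned form treats the set top bit as a large magnitude and rejects it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the quoted condition: the shifted limit is cast back to
   signed, so every negative LOW compares as "small enough".  */
static bool
fits_signed_compare (int64_t low, unsigned shift)
{
  assert (shift < 64);
  return low <= (int64_t) ((uint64_t) INT64_MAX >> shift);
}

/* Stand-in for the suggested fix: cast LOW instead, so the comparison
   is unsigned and a negative LOW exceeds the limit.  */
static bool
fits_unsigned_compare (int64_t low, unsigned shift)
{
  assert (shift < 64);
  return (uint64_t) low <= ((uint64_t) INT64_MAX >> shift);
}
```

For low == INT64_MIN (the -9223372036854775808 from comment #7) and shift == 1, fits_signed_compare returns true and fits_unsigned_compare returns false; for a small non-negative value such as 5 the two agree.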